Manual¶
This document goes through every Neat language feature in approximate sequence. The goal is that you should be able to understand the entire language by just reading it top to bottom. But you can also use it as a reference to quickly find how a feature works.
Lexical¶
All Neat source files are encoded as UTF-8.
Comment syntax is as in C, except that block comments may be nested:
/**
* This is a comment. It goes from /* to */.
*/
// This is also a comment. It goes to the end of the line.
/* Comments can be /* nested. */ */
Comments may appear anywhere in a source file except inside identifiers or operators.
An identifier is a letter or underscore, followed by a sequence of letters, digits or underscores.
Modules¶
A Neat source file has the extension .nt. Each source file corresponds to a module.
A module name is a dot-separated list of packages, corresponding to folders, followed by the filename.
Every file must begin with the module declaration. For instance, a file src/hello/world.nt:
module hello.world;
Packages¶
Neat does not use includes, but instead packages. A package is a folder associated with a name:
$ # -P<name>:<folder>[:<dependency>[,<dependency>]*]
$ neat -Proot:src src/hello/world.nt
This defines the folder ./src to be the package root. The file passed will be the module hello.world, because its name will be relative to src.
Packages cannot access modules in other packages. To allow a package access, explicitly list the packages that it has access to:
$ neat -Proot:src:dep1,dep2 -Pdep1:include/dep1 -Pdep2:include/dep2
This allows the modules from package root in src/ to import the modules in dep1 and dep2.
Because dependencies are explicitly listed, accidental import of modules from an unrelated package is impossible.
Module-Level Statements¶
Import¶
A module can import another module:
module hello.world;
import std.stdio;
Import is non-transitive, i.e. symbols from modules imported by std.stdio are invisible.
Modules can be imported transitively:
module first;
public import second;
Now all modules that import first will also see the symbols in second.
Symbols can be imported by name: import std.stdio : print;
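As a minimal sketch, a complete module that imports a single symbol by name could look like the following. It relies only on print from std.stdio and on the main signature described later in this manual; the module name hello.world is reused from the example above.

```neat
module hello.world;

// Import only `print`; the other symbols of std.stdio stay invisible.
import std.stdio : print;

void main(string[] args) {
    print("Hello World");
}
```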
Declaration¶
Structs, classes, templates and functions can be declared by name.
Every declaration can be marked as public or private; they are public by default.
Private declarations cannot be seen when the module is imported.
Extern(C)¶
A function can be declared as extern(C). This will ensure that it matches the calling convention of the platform’s native C compiler.
For example:
extern(C) void* memcpy(void* dest, void* src, size_t n);
Note: instead of declaring lots of extern(C) functions manually, try using the std.macro.cimport built-in macro! (Grep for examples.)
Expressions¶
Literals¶
5 is an integer literal of type int.
Integer literals may be arbitrarily divided by underscores for readability: 1_048_576.
"Hello World" is a string literal. string is the same as char[].
You can interpolate values into a string literal with $name or $(expression). The compiler will select a type-appropriate default representation. For class and struct types, toString will be called. An example:
print("$remaining bottles of beer on the wall.");
As in Python, expressions with a trailing = are quoted in the output:
int a = 2, b = 3;
print("$(a + b =)"); // "a + b = 5"
`Hello World` is a literal string. Unlike regular string literals, escape sequences and format string quotes are not processed.
A backslash in a string starts an escape sequence. These escape sequences are supported in strings:
- \r: carriage return
- \n: newline
- \t: tab
- \": double quote
- \': single quote
- \\: backslash
- \0: null character
- \x##: two-digit hexadecimal character code (e.g., \x0A for newline)
- \$: dollar sign
Additionally, as in Rust, a backslash followed by a newline character (\n or \r\n) indicates a line continuation. The newline and all succeeding whitespace are removed. Note that as opposed to Rust, newline characters beyond the first are not skipped!
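The difference between regular and literal strings can be sketched as follows. This is an illustrative fragment that assumes it sits inside a function body with print imported; the variable remaining is only a placeholder.

```neat
int remaining = 99;
// Regular string: \t becomes a tab, $remaining is interpolated,
// and \$ produces a plain dollar sign.
print("count:\t$remaining, price: \$5");
// Literal string: backslashes and dollar signs are kept exactly as written.
print(`count:\t$remaining, price: \$5`);
```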
1.2 is a double literal. 1.2f is a float literal.
Arithmetic¶
Binary operations can be performed on types. These are:
Operation | Description | Rank |
---|---|---|
a \|\| b | Boolean “or” | 1 |
a && b | Boolean “and” | 2 |
a == b | Comparison | 3 |
a .. b | Range | 4 |
a + b | Addition | 5 |
a - b | Subtraction | 5 |
a ~ b | Concatenation | 5 |
a * b | Multiplication | 6 |
a / b | Division | 6 |
a \| b | Bitwise “or” | 7 |
a << b | Left shift | 8 |
a >> b | Right shift | 8 |
a ^ b | Bitwise “xor” | 9 |
a & b | Bitwise “and” | 10 |
Boolean “or” and “and” are short-circuiting. Comparison operators are >, ==, <, >=, <=, and !=.
Higher-ranked operators take precedence over lower-ranked, with boolean operators being the loosest.
Note that the placement of bitwise operators diverges from C’s order. This is because C’s order is stupid^W a legacy holdover from before it had boolean operators.
Operator precedence can be clarified using parentheses: 2 * (3 + 4) instead of 2 * 3 + 4.
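A short sketch of how the ranks play out in practice, assuming the fragment sits inside a function body. The values are arbitrary and chosen only to make the assertions hold.

```neat
int a = 6;
// Bitwise "and" (rank 10) binds tighter than comparison (rank 3),
// so this reads as (a & 1) == 0. In C it would read as a & (1 == 0).
assert(a & 1 == 0);
// Multiplication (rank 6) binds tighter than addition (rank 5).
assert(2 * 3 + 4 == 10);
// Parentheses override the default grouping.
assert(2 * (3 + 4) == 14);
```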
Ternary If¶
a if t else b has the value of a if t is true, else it has the value of b.
Only the selected expression is evaluated. So if t is true, b is never evaluated.
This operator has a lower rank than any of the binary operators.
The ternary operator can be shortened to a else b. In that case, a is always taken unless the expression branches to b via breakelse.
The ternary operator syntax diverges from C because ? is already used for error propagation.
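For illustration, a minimal use of the full form, assuming it appears inside a function body:

```neat
int a = 5, b = 7;
// The condition a > b is false, so only b is evaluated and used.
int larger = a if a > b else b;
assert(larger == 7);
```

Because the ternary has a lower rank than every binary operator, the comparison a > b needs no parentheses.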
Control flow expressions¶
break is an expression that, when evaluated, transfers control flow to after the current loop.
continue is an expression that, when evaluated, transfers control flow to the next pass of the current loop.
return, or return x, is an expression that, when evaluated, transfers control flow out of the current function. If a parameter x is given, the current function call evaluates to x; else it evaluates to void.
breakelse is an expression that, when evaluated, transfers control flow to the else block of the surrounding if statement, or causes the else expression of the surrounding ternary if to be used. If the if statement has no else block, it continues after the if block.
Since these expressions exit the local scope (they’re “non-local control flow primitives”), they are all typed bottom: their local value is empty.
Error propagation operator¶
x? is the error propagation operator. Its behavior depends on the type of x:
- if x is a subtype of std.error.Error, it is returned from the current function.
- if x is a sumtype:
  - all subclasses of std.error.Error are returned from the current function.
  - all types marked fail are returned from the current function. This is a legacy feature: Error subclasses should be preferred.
  - if it contains an :else type, it is mapped to breakelse.
  - if it contains a nullptr_t type, it is mapped to breakelse.
- if x is a nullable T, it is treated as a sumtype of T | nullptr_t. The nullptr_t is then mapped to breakelse.
The member types nullptr_t and :else are thus interpreted as “not error, not success”: they are “expected failures” that exit the current if test but not the function.
For instance, when reading data from a file, an I/O error class would subclass Error and thus be returned, but reaching the end of the file would be communicated by :else.
Since ? maps certain types to control flow expressions, which are typed bottom, they are removed from the sumtype. As such, ? leaves only successful types behind.
Note that when a sumtype contains both Error/fail types and a nullable class, the first application of ? will only get rid of the Error/fail types: you may require two ?.
Example:
string line = file.readText()?? else die;
nullable Class obj;
if (auto var = obj?.field?) { }
while (true) {
    auto data = file.readBlock? else break;
    ...
}
Functions¶
A function is a series of statements operating on a list of parameters, culminating in a return value:
ReturnType functionName(ParameterType parameterName) {
    statement;
    statement;
    statement;
    return 5;
}
...
ReturnType ret = functionName(foo);
When a function is called with name(arg, arg), the arguments are passed to the parameters and control passes to the function. The statements of the function are then executed, until control returns to the caller when the function exits, by explicit return or reaching its end.
If the return type is auto, it is inferred from the type returned by the return statements in the function body. This is called return type inference.
Call¶
A function, class method or struct method can be called with a comma-separated list of arguments:
print("Hello World");
double d = sin(0.0);
object.method();
When a function does not have any parameters, the empty parens can be left out, and the function will be called implicitly:
doWork;
This also allows struct or class methods that look like properties.
Uniform Function Call Syntax¶
As in D, “uniform function call syntax” (UFCS) may be used. That is, if a call of the form a.method(b) does not find a method a.method to call, it will instead be interpreted as method(a, b).
This allows easily defining global functions that can be called as if they were member functions of a.
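A small sketch of the rewrite, assuming it appears inside a function body; the function add is a hypothetical helper introduced only for this example.

```neat
int add(int a, int b) { return a + b; }

int n = 2;
// int has no method named add, so this call is interpreted as add(n, 3).
assert(n.add(3) == 5);
```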
Named Arguments¶
The value of every parameter on a call may be assigned by name:
int twice(int x) { return x + x; }
assert(twice(x=2) == 4);
This feature does not allow reordering parameters! It is purely intended to improve call readability, and to ensure that arguments are passed to the intended parameter.
Nested functions¶
Functions may be nested inside other functions. They remain valid while the surrounding function is running, and can access variables and parameters of the containing function, that were declared before them:
int doubled(int a) {
    // add can read the parameter a of the surrounding function.
    int add(int b) {
        return a + b;
    }
    return add(a);
}
Note that calling the nested function after the surrounding function has returned will lead to a crash!
main¶
Every program must contain a function with this signature:
void main(string[] args) {
}
This function will be called when the program is executed.
Statements¶
Variable declaration¶
A variable can be declared like so:
int a; // a is 0
int b = 5;
int c, d = 6; // c is 0
mut int e;
Instead of a type, you may write auto:
auto f = 7;
Then the type of the variable is taken from the type of the initializer.
Only mutable variables (mut a;) may be changed later.
Variable extraction declaration¶
When an expression is a sumtype, a subset or a single type may be extracted as such:
(int | Error) foo;
// `Error` will be returned if `foo` is not `int`.
int bar <- foo;
Note: This syntax is disabled pending renovations! The new error propagation syntax foo?.bar has made it superfluous.
Block statement¶
Multiple statements can be combined into one block:
{
print("Hello");
print("World");
}
Variables declared inside the block are not visible outside of it.
Expression statement¶
Expressions can appear as statements. They are terminated with a semicolon:
5;
foo();
Assignment¶
Any reference may be assigned a new value:
mut int a = 3;
a = 5;
assert(a == 5);
Note that only mutable (mut) variables or parameters can be reassigned. As this allows some optimizations to reference counting, non-mutable variables should be preferred.
If block¶
If a condition is true, execute one statement, else the other:
if (2 + 2 == 4)
    print("2 + 2 = 4");
else {
    print("sanity has deserted us");
}
The condition of the if statement may be a variable declaration. In that case, the condition is true if the value of the variable is true. The variable will only be visible inside the if block:
if (Foo foo = getFoo()) {
    // do foo things here
}
nullable Class types are true if the class is non-null. In that case, the type of the tested variable can be Class. This is the only way in which nullable Class types can be converted to Class.
The if let form acts exactly like if, except that the variable does not have to be truthy:
if let(auto bar = getFoo()?.bar) {
    // bar may be false here.
}
The intended meaning is: “The fact that the variable was declared already indicates success.”
As with regular if, breakelse jumps to the else block or past the statement.
This idiom is aimed at code that wants to use the result of a chain of ? expressions, but doesn’t particularly care about its truth value.
With block¶
The with block takes an expression and makes its fields implicitly accessible:
auto s = (foo=2, bar=3);
int baz = 5;
with (s) {
    assert(foo == 2);
    assert(bar == 3);
    // we can still access other variables.
    assert(baz == 5);
    // lookup proceeds lexically, so the
    // variable masks the `with` statement.
    int bar = 8;
    assert(bar == 8);
}
While loop¶
While a condition is true, execute a statement:
mut int i = 0;
while (i < 10) { i += 1; }
For loop¶
You can loop over a range expression:
// prints 2, then 3
for (size_t i in 2 .. 4) {
    print(ltoa(i));
}
The type of the loop variable may be left out.
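Array expressions are ranges. An array can be iterated together with its indexes like so:
for (i, value in array) {
    // Assumes array is a mutable array (T mut[]); elements of a plain T[] cannot be reassigned.
    array[i] = value + 2;
}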
You can also use a C-style for loop:
for (mut int a = 0; a < 10; a += 1) { }
But this is rarely needed.
break, continue¶
While inside any loop, you may immediately abort and continue after the loop with break.
You may immediately jump to the next iteration of the loop with continue.
Types¶
Basic types¶
name | meaning |
---|---|
int | 32-bit signed integer |
short | 16-bit signed integer |
byte | 8-bit signed integer |
char | 8-bit UTF-8 code unit |
long | 64-bit signed integer |
void | 0-bit empty data |
size_t | platform-dependent unsigned word |
float | 32-bit IEEE floating point number |
double | 64-bit IEEE floating point number |
Array¶
The type T[] is an “array of T”, which some languages call a slice. It consists of a pointer, a length and a reference to the array object.
[2] is an array of ints (int[]), allocated on the heap.
array ~ array is the concatenation of two arrays. Concatenation is the only way to add elements to the array. The values in an array cannot be directly modified! In other words, arrays are immutable by default.
array.length is the length of the array.
Appending to an array in a loop will follow a doubling strategy. It should be reasonably efficient.
array[2] is the third element (base-0) of the array.
array.dup creates a copy of the array. The copy will be mutable.
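The array operations above can be sketched together like this, assuming the fragment sits inside a function body. The use of auto for the result of dup is a deliberate choice so that the example does not have to spell out the mutable array type.

```neat
int[] a = [1, 2];
// Concatenation yields a new array value; a still has two elements.
int[] b = a ~ [3];
assert(a.length == 2);
assert(b.length == 3);
assert(b[2] == 3);
// dup yields a mutable copy whose elements can be reassigned.
auto c = b.dup;
c[0] = 10;
assert(c[0] == 10);
// The original array is unaffected by changes to the copy.
assert(b[0] == 1);
```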
Mutable Array¶
The type T mut[] is a “mutable array of T”. It differs from normal arrays in that elements can be freely reassigned.
array.freeze converts array to an immutable array. Unless array has exactly one reference, this operation is forbidden; using the array variable after this expression has been evaluated is forbidden. (The compiler does not at present enforce this, but it will in the future.)
T[] and T mut[] are separated because in my experience these types occupy fundamentally different roles in a program. If you pass an array to a function, you get the assurance that it won’t be modified. Likewise if you are a class, and somebody gives you an array that you store in a class member, the value of that array will not change on you.
Tuple¶
(int, float) is a tuple with two member types, int and float. Each member can have an independent value.
(2, 3.0f) is an expression of type (int, float).
tuple[0] is the first member of the tuple. The index value must be an int literal.
Tuple members can be named: (int i, float f). This allows accessing the member with value.i.
When implicitly converting tuples, tuple fields without names implicitly convert to any name, but tuple fields with names only convert to other fields with the same name.
For example, (2, 3) implicitly converts to (int from, int to), but (min=2, max=3) does not.
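A brief sketch of indexing, naming and implicit conversion, assuming it appears inside a function body. The variable names pair and named are placeholders.

```neat
(int, float) pair = (2, 3.0f);
// Members are accessed with a literal index.
assert(pair[0] == 2);
// Unnamed fields convert implicitly to named fields.
(int i, float f) named = pair;
assert(named.i == 2);
```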
Pointers¶
Don’t use pointers.
Sum type¶
(int | float) is either an int or a float value:
(int | float) a = 4;
return a.case(
    int i: i / 2,
    float f: f / 2.0f);
a.case {
    int i:
        print(itoa(i));
    float f:
        print(ftoa(f));
}
Members of a sumtype can be marked as “fail”, enabling error return:
(int | fail FileNotFound) foo() { return "test".readAll?.atoi; }
int i = foo()?;
If foo returns a FileNotFound, it will be automatically returned at the ?.
Note that this is not required for subtypes of std.error.Error.
Symbol Identifier¶
A symbol identifier takes the form :name.
It is both a type and an expression. The type :name has one value, which is also :name.
This feature can be used to “type-tag” entries in sumtypes, to differentiate identically typed entries, such as (:centimeters, int | :meters, int).
It is also used to construct “value-less” sumtype entries, such as (int | :none).
Struct¶
A struct is a value type that combines various members and methods that operate on them:
struct Foo
{
    int a, b;
    int sum() { return this.a + b; }
}
Foo foo = Foo(2, 3);
assert(foo.sum() == 5);
A method is a function defined in a struct (or class). It takes a reference to the struct value it is called on as a hidden parameter called this.
Class¶
A class is a reference type that combines various members and methods that operate on them:
class Foo
{
    int a, b;
    this(this.a, this.b) { }
    int sum() { return this.a + b; }
}
Foo foo = new Foo(2, 3);
assert(foo.sum() == 5);
Note that, as opposed to C++, the type Foo designates a reference to the class. It is impossible to hold a class by value.
this is a special method without return value that designates the constructor of the class. When instantiating a class with new Class(args), this(args) is called.
The parameter this.a indicates that the argument is directly assigned to the member a, rather than passed to the method as a parameter.
Classes can be inherited with a subclass. An instance of the subclass can be implicitly converted to the parent class. When a method is called on an instance, the function that runs is that of the allocated class, not of the type of the reference:
class Foo
{
    int get() { return 5; }
}
class Bar : Foo
{
    // "override" must be specified, to indicate
    // that a parent method is being redefined.
    override int get() { return 7; }
}
Foo foo = new Bar;
assert(foo.get == 7);
Classes can also inherit from interfaces, which are like “thin classes” that can only contain methods. In exchange, arbitrarily many interfaces can be inherited from:
interface Foo
{
    int get();
}
class Bar : Parent, Foo
{
    override int get() { return 5; }
}
Foo foo = new Bar;
assert(foo.get == 5);
In a subclass constructor, you can use the syntax super() to call the constructor of the parent class.
You can also use the keyword super in the parameter list to insert an implicit super constructor call:
class Bar : Foo
{
    int c;
    this(super, this.c) { }
}
The type of an object can be tested with the instanceOf property:
nullable Bar bar = foo.instanceOf(Bar);
if (Bar bar = foo.instanceOf(Bar)) { }
Return and parameter types follow covariance and contravariance on inheritance.
A class type may be qualified as nullable. In that case, the special value null implicitly converts to a reference to the type. By default, class references are not nullable:
nullable Foo foo = null;
assert(!foo);
Foo bar = foo; // errors
As a special treat, the case expression allows treating a nullable class as a sumtype of a non-nullable class and null:
nullable Foo foo;
Foo bar = foo.case(null: return false);
Function and Delegate¶
You can take the address of a function using the & operator. The type of the expression is R function(T).
When you take the address of a class method, the type will be R delegate(T).
A delegate is a “fat function pointer” that carries a pointer to the context, i.e. the object.
You can also take the address of a nested function with &, but then the type will be R delegate!(T), a “noncopyable delegate”. It cannot be used anywhere where a reference would have to be taken. As the delegate carries a pointer to the stackframe, this is necessary to protect the developer from use-after-return bugs.
A nested function can be heap-allocated using the syntax new &fun. new will make a copy of the surrounding stackframe for the function. In that case, the type will be R delegate(T) and the allocated stackframe will be reference counted.
typeof¶
Given an expression, the type of the expression can be used as a type with typeof:
typeof(a + b) sum = a + b;
Since auto exists, this is mostly used for return and parameter types.
Unittest¶
Unittest blocks will be compiled and run when the compiler is called with -unittest:
int sum(int a, int b) { return a + b; }
unittest
{
    assert(sum(2, 3) == 5);
}
Templates¶
A template is a wrapper around a declaration that allows parameterizing it. The syntax is:
template max(T) {
    T max(T first, T second) {
        if (first > second) return first;
        return second;
    }
}
Here, T is the “template parameter”. Multiple template parameters can be used.
The symbol in the template must be eponymous, i.e. have the same name as the template. To call it, instantiate the template: max!int(2, 3) or max!float(2.5, 3). Here, max!int is “the function max in the version of the template max where T is int.”
Multiple parameters are passed in parentheses: templ!(int, float).
If the template is called directly, without explicitly instantiating it, the compiler will try to unify the arguments passed with the template arguments available in order to infer their types. If only some template arguments are given in the instantiation, the compiler will try to infer the rest.
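Explicit instantiation and inference can be compared side by side. This sketch repeats the max template from above so that it is self-contained; the assertions are assumed to sit inside a function body or unittest block.

```neat
template max(T) {
    T max(T first, T second) {
        if (first > second) return first;
        return second;
    }
}

// Explicit instantiation: T is given as int.
assert(max!int(2, 3) == 3);
// Direct call: T is inferred as int from the arguments.
assert(max(2, 3) == 3);
```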
Ranges¶
If a type T has the properties bool empty, T next and E front, then it is called a “range over E”.
Arrays are an example of such.
Another example is range expressions: from .. to.
If you define these properties in a data type, you can use it as the source of a loop.
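As a rough sketch of a user-defined range, the following struct provides the three properties as parameterless methods. The name Countdown and its field remaining are hypothetical, and the loop is assumed to sit inside a function body with print and itoa available.

```neat
struct Countdown
{
    int remaining;
    // The range is exhausted once the counter reaches zero.
    bool empty() { return remaining == 0; }
    // The current element of the range.
    int front() { return remaining; }
    // The rest of the range, one step further along.
    Countdown next() { return Countdown(remaining - 1); }
}

// Countdown is a "range over int", so it can be the source of a loop.
for (i in Countdown(3)) {
    print(itoa(i));
}
```

Since methods without parameters can be called without parentheses, empty, front and next read like properties here.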
Lambdas¶
A lambda is a templated nested function reference. It can be assigned to a variable. When called, it is implicitly instantiated.
Example:
int a = 5;
auto add = b => a + b;
assert(add(2) == 7);
Every lambda has a unique type. Because of this, they cannot be stored in data structures. Their primary purpose is being passed to templated functions:
auto a = (0 .. 10).filter(a => a & 1 == 0).map(a => a / 2).array;
assert(a == [0, 1, 2, 3, 4]);
The compiler will try to prevent you from returning a lambda from the function where it was defined. To make this check possible, lambdas cannot be assigned to class fields, or in general put in any location where the compiler could lose track of where the lambda is.
Macros¶
Note: For this feature, compiler knowledge is required!
When macro(function) is called, function is loaded into the compiler and executed with a macro state parameter. This allows modifying the macro state of the compiler to add a macro class instance.
Macro classes can extend the compiler with new functionality using a set of hooks:
- calls: a(b, c)
- expressions: 2 ★ 2
- properties: a.b<property goes here>
- statements: macroThing;
- imports: import github("http://github.com/neat-lang/example").module;
Look at std.macro.* for examples.
The entire compiler is available for importing and reuse in macros. However, it is recommended to limit yourself to the functionality in neat.base. This will also keep compile times down.