Manual

This document goes through every Neat language feature in approximate sequence. The goal is that you should be able to understand the entire language by just reading it top to bottom. But you can also use it as a reference to quickly find how a feature works.

Lexical

All Neat source files are encoded as UTF-8.

Comment syntax is as in C, except that block comments can be nested:

/**
 * This is a comment. It goes from /* to */.
 */
// This is also a comment. It goes to the end of the line.
/* Comments can be /* nested. */ */

Comments may appear anywhere in a source file except inside identifiers or operators.

An identifier is a letter or underscore, followed by a sequence of letters, digits or underscores.

Modules

A Neat source file has the extension .nt. Each source file corresponds to a module. A module name is a dot-separated list of packages, corresponding to folders, followed by the filename without its extension.

Every file must begin with the module declaration. For instance, a file src/hello/world.nt:

module hello.world;

Packages

Neat does not use includes, but instead packages. A package is a folder associated with a name:

$ # -P<name>:<folder>[:<dependency>[,<dependency>]*]
$ neat -Proot:src src/hello/world.nt

This defines the folder ./src to be the package root. The file passed will be the module hello.world, because its name will be relative to src.

By default, a package cannot access modules in other packages. To grant access, explicitly list the packages that it depends on:

$ neat -Proot:src:dep1,dep2 -Pdep1:include/dep1 -Pdep2:include/dep2

This allows the modules from package root in src/ to import the modules in dep1 and dep2. Because dependencies are explicitly listed, accidental import of modules from an unrelated package is impossible.

Module-Level Statements

Import

A module can import another module:

module hello.world;

import std.stdio;

Import is non-transitive, i.e. symbols from modules imported by std.stdio are invisible.

Modules can be imported transitively:

module first;

public import second;

Now all modules that import first will also see the symbols in second.

Symbols can be imported by name: import std.stdio : print;.
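As a small illustrative sketch, a complete module using an import by name could look like this. It relies only on the print function from std.stdio and on the main signature described in this manual:

```neat
module hello.world;

// Only `print` becomes visible here; the other symbols
// of std.stdio are not imported.
import std.stdio : print;

void main(string[] args) {
    print("Hello World");
}
```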

Declaration

Structs, classes, templates and functions can be declared by name. Every declaration can be marked as public or private; they are public by default. Private declarations cannot be seen when the module is imported.

Extern(C)

A function can be declared as extern(C). This will ensure that it matches the calling convention of the platform’s native C compiler.

For example:

extern(C) void* memcpy(void* dest, void* src, size_t n);

Note: instead of declaring lots of extern(C) functions manually, try using the std.macro.cimport built-in macro! (Grep for examples.)

Expressions

Literals

5 is an integer literal of type int.

Integer literals may be arbitrarily divided by underscores for readability: 1_048_576.

"Hello World" is a string literal. string is the same as char[].

You can interpolate values into a string literal with $name or $(expression). The compiler will select a type-appropriate default representation. For class and struct types, toString will be called. An example:

print("$remaining bottles of beer on the wall.");

As in Python, expressions with a trailing = are quoted in the output:

int a = 2, b = 3;
print("$(a + b =)"); // "a + b = 5"

`Hello World` is a literal string. Unlike regular string literals, escape sequences and format string quotes are not processed.

A backslash in a string starts an escape sequence. These escape sequences are supported in strings:

  • \r: carriage return

  • \n: newline

  • \t: tab

  • \": double quote

  • \': single quote

  • \\: backslash

  • \0: null character

  • \x##: two-digit hexadecimal character code (e.g., \x0A for newline)

  • \$: dollar sign

Additionally, as in Rust, a backslash followed by a newline character (\n or \r\n) indicates a line continuation. The newline and all succeeding whitespace are removed. Note that as opposed to Rust, newline characters beyond the first are not skipped!
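A sketch of what the line continuation rule implies, derived from the description above rather than from compiler output:

```neat
unittest
{
    // The backslash at the end of the first line removes the newline
    // and the indentation that follows it.
    string s = "Hello \
        World";
    assert(s == "Hello World");
}
```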

1.2 is a double literal. 1.2f is a float literal.

Arithmetic

Binary operations combine two values. The supported operators and their precedence ranks are:

Operation   Description        Rank
a || b      Boolean “or”       1
a && b      Boolean “and”      2
a <= b      Comparison         3
a .. b      Range              4
a + b       Addition           5
a - b       Subtraction        5
a ~ b       Concatenation      5
a * b       Multiplication     6
a / b       Division           6
a | b       Bitwise “or”       7
a << b      Left shift         8
a >> b      Right shift        8
a ^ b       Bitwise “xor”      9
a & b       Bitwise “and”      10

Boolean “or” and “and” are short-circuiting. Comparison operators are >, ==, <, >=, <=, and !=. Higher-ranked operators take precedence over lower-ranked, with boolean operators being the loosest.

Note that the placement of bitwise operators diverges from C’s order. This is because C’s order is stupid^W a legacy holdover from before it had boolean operators.

Operator precedence can be overridden using parentheses: 2 * (3 + 4) evaluates to 14, whereas 2 * 3 + 4 evaluates to 10.
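Going by the rank table, the following sketch shows how a few mixed expressions should group, including cases where the result differs from C. The assertions are worked out from the table by hand:

```neat
unittest
{
    // Multiplication (rank 6) binds tighter than addition (rank 5).
    assert(2 + 3 * 4 == 14);
    // Bitwise "and" (rank 10) binds tighter than comparison (rank 3):
    // this is (6 & 3) == 2, where C would read it as 6 & (3 == 2).
    assert(6 & 3 == 2);
    // Left shift (rank 8) binds tighter than addition (rank 5):
    // this is (1 << 2) + 1, where C would compute 1 << (2 + 1).
    assert(1 << 2 + 1 == 5);
    // Bitwise "or" (rank 7) binds tighter than multiplication (rank 6):
    // this is 2 * (3 | 1), where C would compute (2 * 3) | 1.
    assert(2 * 3 | 1 == 6);
}
```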

Ternary If

a if t else b has the value of a if t is true, else it has the value of b.

Only the selected expression is evaluated. So if t is true, b is never evaluated.

This operator has a lower rank than any of the binary operators.

The ternary operator can be shortened to a else b. In that case, a is always taken unless the expression branches to b via breakelse.

The ternary operator syntax diverges from C because ? is already used for error propagation.
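A brief sketch of the ternary if in use. The variable names are made up for illustration:

```neat
unittest
{
    int a = 5;
    // The condition sits in the middle; only the selected
    // branch is evaluated.
    string size = "big" if a > 3 else "small";
    assert(size == "big");
    // The ternary ranks below every binary operator, so this is
    // (a + 1) if (a > 3) else (a - 1).
    int b = a + 1 if a > 3 else a - 1;
    assert(b == 6);
}
```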

Control flow expressions

break is an expression that, when evaluated, transfers control flow to after the current loop.

continue is an expression that, when evaluated, transfers control flow to the next pass of the current loop.

return, or return x, is an expression that, when evaluated, transfers control flow out of the current function. If a parameter x is given, the current function call evaluates to x; else it evaluates to void.

breakelse is an expression that, when evaluated, transfers control flow to the else block of the surrounding if statement, or causes the else expression of the surrounding ternary if to be used. If the if statement has no else block, it continues after the if block.

Since these expressions exit the local scope (they’re “non-local control flow primitives”), they are all typed bottom - their local value is empty.

Error propagation operator

x? is the error propagation operator. Its behavior depends on the type of x:

  • if x is a subtype of std.error.Error, it is returned from the current function.

  • if x is a sumtype:
    • all subclasses of std.error.Error are returned from the current function.

    • all types marked fail are returned from the current function. This is a legacy feature: Error subclasses should be preferred.

    • if it contains an :else type, it is mapped to breakelse.

    • if it contains a nullptr_t type, it is mapped to breakelse.

  • if x is a nullable T, it is treated as a sumtype of T | nullptr_t. The nullptr_t is then mapped to breakelse.

The member types nullptr_t and :else are thus interpreted as “not error, not success”: they are “expected failures” that exit the current if test but not the function. For instance, when reading data from a file, an I/O error class would subclass Error and thus be returned, but reaching the end of the file would be communicated by :else.

Since ? maps certain types to control flow expressions, which are typed bottom, those types are removed from the sumtype. As such, ? leaves only successful types behind.

Note that when a sumtype contains both Error/fail types and a nullable class, the first application of ? will only get rid of the Error/fail types: you may require two ?.

Example:

string line = file.readText()?? else die;

nullable Class obj;
if (auto var = obj?.field?) { }

while (true) {
    auto data = file.readBlock? else break;
    ...
}

Functions

A function is a series of statements operating on a list of parameters, culminating in a return value:

ReturnType functionName(ParameterType parameterName) {
    statement;
    statement;
    statement;
    return 5;
}
...
    ReturnType ret = functionName(foo);

When a function is called with name(arg, arg), the arguments are passed to the parameters and control passes to the function. The statements of the function are then executed, until control returns to the caller when the function exits, by explicit return or reaching its end.

If the return type is auto, it is inferred from the type returned by the return statements in the function body. This is called return type inference.

Call

A function, class method or struct method can be called with a comma-separated list of arguments:

print("Hello World");

double d = sin(0.0);

class.method();

When a function does not have any parameters, the empty parens can be left out, and the function will be called implicitly:

doWork;

This also allows struct or class methods that look like properties.

Uniform Function Call Syntax

As in D, “uniform function call syntax” (UFCS) may be used. That is, if a call of the form a.method(b) did not find a method a.method to call, it will instead be interpreted as method(a, b). This allows easily defining global functions that can be called as if they are member functions of a.
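For instance, a free function can be called as if it were a method of its first argument. This is a minimal sketch; add is a hypothetical helper, not a library function:

```neat
int add(int a, int b) { return a + b; }

unittest
{
    int a = 2;
    // int has no method named add, so a.add(3) is read as add(a, 3).
    assert(a.add(3) == 5);
    assert(a.add(3) == add(a, 3));
}
```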

Named Arguments

The value of every parameter on a call may be assigned by name:

int twice(int x) { return x + x; }
assert(twice(x=2) == 4);

This feature does not allow reordering parameters! It is purely intended to improve call readability, and to ensure that arguments are passed to the intended parameter.

Nested functions

Functions may be nested inside other functions. They remain valid only while the surrounding function is running, and can access those variables and parameters of the containing function that were declared before them:

int twice(int a) {
    int add(int b) {
        return a + b;
    }
    return add(a);
}

Note that calling the nested function after the surrounding function has returned will lead to a crash!

main

Every program must contain a function with this signature:

void main(string[] args) {
}

This function will be called when the program is executed.

Statements

Variable declaration

A variable can be declared like so:

int a; // a is 0
int b = 5;
int c, d = 6; // c is 0
mut int e;

Instead of a type, you may write auto:

auto f = 7;

Then the type of the variable is taken from the type of the initializer.

Only mutable variables (mut a;) may be changed later.

Variable extraction declaration

When an expression has a sumtype, a subset of its member types or a single type may be extracted like so:

(int | Error) foo;
// `Error` will be returned if `foo` is not `int`.
int bar <- foo;

Note

This syntax is disabled pending renovations! The new error propagation syntax foo?.bar has made it superfluous.

Block statement

Multiple statements can be combined into one block:

{
    print("Hello");
    print("World");
}

Variables declared inside the block are not visible outside of it.

Expression statement

Expressions can appear as statements. They are terminated with a semicolon:

5;
foo();

Assignment

Any mutable reference may be assigned a new value:

mut int a = 3;
a = 5;
assert(a == 5);

Note that only mutable (mut) variables or parameters can be reassigned. As this allows some optimizations to reference counting, non-mutable variables should be preferred.

If block

If a condition is true, execute one statement, else the other:

if (2 + 2 == 4)
    print("2 + 2 = 4");
else {
    print("sanity has deserted us");
}

The condition of the if statement may be a variable declaration. In that case, the condition is true if the value of the variable is true. The variable will only be visible inside the if block:

if (Foo foo = getFoo()) {
    // do foo things here
}

nullable Class types are true if the class is non-null. In that case, the type of the tested variable can be Class. This is the only way in which nullable Class types can be converted to Class.

The if let form acts exactly like if, except that the variable does not have to be truthy:

if let(auto bar = getFoo()?.bar) {
    // bar may be false here.
}

The intended meaning is: “The fact that the variable was declared already indicates success.”

As with regular if, breakelse jumps to the else block or past the statement.

This idiom is aimed at code that wants to use the result of a chain of ? expressions, but doesn’t particularly care about its truth value.

With block

The with block takes an expression and makes its fields implicitly accessible:

auto s = (foo=2, bar=3);

int baz = 5;
with (s) {
    assert(foo == 2);
    assert(bar == 3);
    // we can still access other variables.
    assert(baz == 5);
    // lookup proceeds lexically, so the
    // variable masks the `with` statement.
    int bar = 8;
    assert(bar == 8);
}

While loop

While a condition is true, execute a statement:

mut int i = 0;
while (i < 10) { i += 1; }

For loop

You can loop over a range expression:

// prints 2, then 3
for (size_t i in 2 .. 4) {
    print(ltoa(i));
}

The type of the loop variable may be left out.

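Array expressions are ranges, so they can be looped over as well. To iterate over the index along with each value, name both. Note that assigning to elements, as the loop body here does, requires a mutable array (T mut[]):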

for (i, value in array) {
    array[i] = value + 2;
}

You can also use a C-style for loop:

for (mut int a = 0; a < 10; a += 1) { }

But this is rarely needed.

break, continue

While inside any loop, you may immediately abort and continue after the loop with break.

You may immediately jump to the next iteration of the loop with continue.

Types

Basic types

name      meaning
int       32-bit signed integer
short     16-bit signed integer
byte      8-bit signed integer
char      8-bit UTF-8 code unit
long      64-bit signed integer
void      0-bit empty data
size_t    platform-dependent unsigned word
float     32-bit IEEE floating point number
double    64-bit IEEE floating point number

Array

The type T[] is an “array of T”, which some languages call a slice. It consists of a pointer, a length and a reference to the array object.

[2] is an array of ints (int[]), allocated on the heap.

array ~ array is the concatenation of two arrays.

Concatenation is the only way to add elements to the array. The values in an array cannot be directly modified! In other words, arrays are immutable by default.

array.length is the length of the array.

Appending to an array in a loop will follow a doubling strategy. It should be reasonably efficient.

array[2] is the third element (base-0) of the array.

array.dup creates a copy of the array. The copy will be mutable.

Mutable Array

The type T mut[] is a “mutable array of T”. It differs from normal arrays in that elements can be freely reassigned.

array.freeze converts array to an immutable array. This operation is only permitted if array has exactly one reference, and the array variable must not be used after the expression has been evaluated. (The compiler does not at present enforce this, but it will in the future.)

T[] and T mut[] are separated because in my experience these types occupy fundamentally different roles in a program. If you pass an array to a function, you get the assurance that it won’t be modified. Likewise if you are a class, and somebody gives you an array that you store in a class member, the value of that array will not change on you.
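The interplay of the two array types can be sketched as follows. This is an illustration based on the rules above, using only concatenation, dup and freeze:

```neat
unittest
{
    int[] a = [1, 2];
    // Concatenation creates a new value; a itself is not modified.
    int[] b = a ~ [3];
    assert(a.length == 2);
    assert(b == [1, 2, 3]);

    // dup yields a mutable copy whose elements can be reassigned.
    auto m = b.dup;
    m[0] = 5;
    assert(b[0] == 1);

    // freeze turns the only reference to the mutable array back into
    // an immutable array; m is not used afterwards.
    int[] c = m.freeze;
    assert(c == [5, 2, 3]);
}
```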

Tuple

(int, float) is a tuple with two member types, int and float. Each member can have an independent value.

(2, 3.0f) is an expression of type (int, float).

tuple[0] is the first member of the tuple. The index value must be an int literal.

Tuple members can be named: (int i, float f). This allows accessing the member with value.i.

When implicitly converting tuples, tuple fields without names implicitly convert to any name, but tuple fields with names only convert to other fields with the same name.

For example, (2, 3) implicitly converts to (int from, int to), but (min=2, max=3) does not.
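A short sketch of these tuple rules. The variable names span, pair and named are placeholders:

```neat
unittest
{
    // Unnamed fields convert to any name.
    (int from, int to) span = (2, 3);
    assert(span.from == 2);
    // Named members can still be accessed by literal index.
    assert(span[1] == 3);

    auto pair = (2, 3.0f);
    assert(pair[0] == 2);

    // Named tuple literal: the member names become part of the type.
    auto named = (from=2, to=3);
    assert(named.to == 3);
}
```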

Pointers

Don’t use pointers.

Sum type

(int | float) is either an int or a float value:

(int | float) a = 4;

return a.case(
    int i: i / 2,
    float f: f / 2.0f);

a.case {
    int i:
        print(itoa(i));
    float f:
        print(ftoa(f));
}

Members of a sumtype can be marked as “fail”, enabling error return:

(int | fail FileNotFound) foo() { return "test".readAll?.atoi; }

int i = foo()?;

If foo returns a FileNotFound, it will be automatically returned at the ?.

Note that this is not required for subtypes of std.error.Error.

Symbol Identifier

A symbol identifier takes the form :name.

It is both a type and an expression. The type :name has one value, which is also :name.

This feature can be used to “type-tag” entries in sumtypes, to differentiate identically typed entries, such as (:centimeters, int | :meters, int).

It is also used to construct “value-less” sumtype entries, such as (int | :none).

Struct

A struct is a value type that combines various members and methods that operate on them:

struct Foo
{
    int a, b;
    int sum() { return this.a + b; }
}

Foo foo = Foo(2, 3);

assert(foo.sum() == 5);

A method is a function defined in a struct (or class). It takes a reference to the struct value it is called on as a hidden parameter called this.

Class

A class is a reference type that combines various members and methods that operate on them:

class Foo
{
    int a, b;
    this(this.a, this.b) { }
    int sum() { return this.a + b; }
}

Foo foo = new Foo(2, 3);

assert(foo.sum() == 5);

Note that, as opposed to C++, the type Foo designates a reference to the class. It is impossible to hold a class by value.

this is a special method without return value that designates the constructor of the class. When instantiating a class with new Class(args), this(args) is called.

The parameter this.a indicates that the argument is directly assigned to the member a, rather than passed to the method as a parameter.

A class can be extended by a subclass. An instance of the subclass can be implicitly converted to the parent class. When a method is called on an instance, the function that runs is that of the allocated class, not of the type of the reference:

class Foo
{
    int get() { return 5; }
}

class Bar : Foo
{
    // "override" must be specified, to indicate
    // that a parent method is being redefined.
    override int get() { return 7; }
}

Foo foo = new Bar;
assert(foo.get == 7);

Classes can also inherit from interfaces, which are like “thin classes” that can only contain methods. In exchange, arbitrarily many interfaces can be inherited from:

interface Foo
{
    int get();
}

class Bar : Parent, Foo
{
    override int get() { return 5; }
}

Foo foo = new Bar;
assert(foo.get == 5);

In a subclass constructor, you can use the syntax super() to call the constructor of the parent class.

You can also use the keyword super in the parameter list to insert an implicit super constructor call:

class Bar : Foo
{
    int c;
    this(super, this.c) { }
}

The type of an object can be tested with the instanceOf property:

nullable Bar bar = foo.instanceOf(Bar);

if (Bar bar = foo.instanceOf(Bar)) { }

Return and parameter types follow covariance and contravariance on inheritance.

A class type may be qualified as nullable. In that case, the special value null implicitly converts to a reference to the type. By default, class references are not nullable:

nullable Foo foo = null;
assert(!foo);
Foo bar = foo; // errors

As a special treat, the case expression allows treating a nullable class as a sumtype of a non-nullable class and null:

nullable Foo foo;
Foo bar = foo.case(null: return false);

Function and Delegate

You can take the address of a function using the & operator. The type of the expression is R function(T).

When you take the address of a class method, the type will be R delegate(T). A delegate is a “fat function pointer” that carries a pointer to the context, i.e. the object.

You can also take the address of a nested function with &, but then the type will be R delegate!(T), a “noncopyable delegate”. It cannot be used anywhere where a reference would have to be taken. As the delegate carries a pointer to the stackframe, this is necessary to protect the developer from use-after-return bugs.

A nested function can be heap-allocated using the syntax new &fun. new will make a copy of the surrounding stackframe for the function. In that case, the type will be R delegate(T) and the allocated stackframe will be reference counted.

typeof

Given an expression, the type of the expression can be used as a type with typeof:

typeof(a + b) sum = a + b;

Since auto exists, this is mostly used for return and parameter types.

Unittest

Unittest blocks will be compiled and run when the compiler is called with -unittest:

int sum(int a, int b) { return a + b; }

unittest
{
    assert(sum(2, 3) == 5);
}

Templates

A template is a wrapper around a declaration that allows parameterizing it. The syntax is:

template max(T) {
    T max(T first, T second) {
        if (first > second) return first;
        return second;
    }
}

Here, T is the “template parameter”. Multiple template parameters can be used.

The symbol in the template must be eponymous, i.e. have the same name as the template. To call it, instantiate the template: max!int(2, 3) or max!float(2.5, 3). Here, max!int is “the function max in the version of the template max where T is int.”

Multiple parameters are passed in parentheses: templ!(int, float).

If the template is called directly, without explicitly instantiating it, the compiler will try to unify the arguments passed with the template arguments available in order to infer their types. If only some template arguments are given in the instantiation, the compiler will try to infer the rest.
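Both call styles can be sketched side by side, reusing the max template from above:

```neat
template max(T) {
    T max(T first, T second) {
        if (first > second) return first;
        return second;
    }
}

unittest
{
    // Explicit instantiation: T is int.
    assert(max!int(2, 3) == 3);
    // Direct call: T is inferred from the arguments.
    assert(max(2, 3) == 3);
    assert(max(2.5, 1.5) == 2.5);
}
```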

Ranges

If a type T has the properties bool empty, T next and E front, then it is called a “range over E”.

Arrays are an example of such.

Another example is range expressions: from .. to.

If you define these properties in a data type, you can use it as the source of a loop.
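As a sketch of a user-defined range, the hypothetical struct below counts down to zero. It assumes that parameterless methods satisfy the three properties, as the section on calls suggests:

```neat
struct Countdown
{
    int value;
    // The range is exhausted once the counter reaches zero.
    bool empty() { return value == 0; }
    // The current element.
    int front() { return value; }
    // The range advanced by one step.
    Countdown next() { return Countdown(value - 1); }
}

unittest
{
    mut int sum = 0;
    // Visits 3, 2 and 1.
    for (i in Countdown(3)) sum += i;
    assert(sum == 6);
}
```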

Lambdas

A lambda is a templated nested function reference. It can be assigned to a variable. When called, it is implicitly instantiated.

Example:

int a = 5;
auto add = b => a + b;
assert(add(2) == 7);

Every lambda has a unique type. Because of this, they cannot be stored in data structures. Their primary purpose is being passed to templated functions:

auto a = (0 .. 10).filter(a => a & 1 == 0).map(a => a / 2).array;

assert(a == [0, 1, 2, 3, 4]);

The compiler will try to prevent you from returning a lambda from the function where it was defined. To enable this, lambdas cannot be assigned to class fields, or in general put in any location where the compiler could lose track of where the lambda is.

Macros

Note

For this feature, compiler knowledge is required!

When macro(function) is called, function is loaded into the compiler and executed with a macro state parameter. This allows modifying the macro state of the compiler to add a macro class instance. Macro classes can extend the compiler with new functionality using a set of hooks:

  • calls: a(b, c)

  • expressions: 2 2

  • properties: a.b<property goes here>

  • statements: macroThing;

  • imports: import github("http://github.com/neat-lang/example").module;

Look at std.macro.* for examples.

The entire compiler is available for importing and reuse in macros. However, it is recommended to limit yourself to the functionality in neat.base. This will also keep compile times down.