.. _manual:

.. highlight:: d

Manual
======

This document goes through every Neat language feature in approximate sequence. The goal is that you should be able to understand the entire language by just reading it top to bottom. But you can also use it as a reference to quickly find how a feature works.

Lexical
-------

All Neat source files are encoded as UTF-8.

Comment syntax is as in C::

    /**
     * This is a comment. It goes from /* to */.
     */

    // This is also a comment. It goes to the end of the line.

    /* Comments can be /* nested. */ */

Comments may appear anywhere in a source file except inside identifiers or operators.

An identifier is a letter or underscore, followed by a sequence of letters, digits or underscores.

Modules
-------

A Neat source file has the extension `.nt`. Each source file corresponds to a module. A module name is a dot-separated list of packages, corresponding to folders, followed by the filename.

Every file must begin with the module declaration. For instance, a file `src/hello/world.nt`::

    module hello.world;

Packages
--------

.. highlight:: none

Neat does not use includes, but instead packages. A package is a folder associated with a name::

    $ # -P:[:[,

=========  ==============  ====
> b`       Right shift     8
`a ^ b`    Bitwise "xor"   9
`a & b`    Bitwise "and"   10
=========  ==============  ====

Boolean "or" and "and" are short-circuiting. Comparison operators are `>`, `==`, `<`, `>=`, `<=`, and `!=`.

Higher-ranked operators take precedence over lower-ranked, with boolean operators being the loosest.

Note that the placement of bitwise operators diverges from C's order. This is because C's order is stupid^W a legacy holdover from before it had boolean operators.

Operator precedence can be clarified using parentheses: `2 * (3 + 4)` instead of `2 * 3 + 4`.

Ternary If
^^^^^^^^^^

`a if t else b` has the value of `a` if `t` is true, else it has the value of `b`.

Only the selected expression is evaluated. So if `t` is true, `b` is never evaluated.
This operator has a lower rank than any of the binary operators.

The ternary operator can be shortened to `a else b`. In that case, `a` is always taken unless the expression branches to `b` via `breakelse`.

The ternary operator syntax diverges from C because `?` is already used for error propagation.

Control flow expressions
^^^^^^^^^^^^^^^^^^^^^^^^

`break` is an expression that, when evaluated, transfers control flow to after the current loop.

`continue` is an expression that, when evaluated, transfers control flow to the next pass of the current loop.

`return`, or `return x`, is an expression that, when evaluated, transfers control flow out of the current function. If a parameter `x` is given, the current function call evaluates to `x`; else it evaluates to `void`.

`breakelse` is an expression that, when evaluated, transfers control flow to the `else` block of the surrounding `if` statement, or causes the `else` expression of the surrounding ternary `if` to be used. If the `if` statement has no `else` block, it continues after the `if` block.

Since these expressions exit the local scope (they're "non-local control flow primitives"), they are all typed `bottom` - their *local* value is empty.

Error propagation operator
^^^^^^^^^^^^^^^^^^^^^^^^^^

`x?` is the error propagation operator. Its behavior depends on the type of `x`:

- if `x` is a subtype of `std.error.Error`, it is returned from the current function.
- if `x` is a sumtype:

  - all subclasses of `std.error.Error` are returned from the current function.
  - all types marked `fail` are returned from the current function. This is a legacy feature: `Error` subclasses should be preferred.
  - if it contains an `:else` type, it is mapped to `breakelse`.
  - if it contains a `nullptr_t` type, it is mapped to `breakelse`.

- if `x` is a `nullable T`, it is treated as a sumtype of `T | nullptr_t`. The `nullptr_t` is then mapped to `breakelse`.

The member types `nullptr_t` and `:else` are thus interpreted as "not error, not success": they are "expected failures" that exit the current `if` test but not the function. For instance, when reading data from a file, an I/O error class would subclass `Error` and thus be returned, but reaching the end of the file would be communicated by `:else`.

Since `?` maps certain types to control flow expressions, which are typed `bottom`, they are removed from the sumtype. As such, `?` leaves only successful types behind.

Note that when a sumtype contains both `Error`/`fail` types *and* a nullable class, the first application of `?` will only get rid of the `Error`/`fail` types: you may require two `?`.

Example::

    string line = file.readText()?? else die;

    nullable Class obj;
    if (auto var = obj?.field?) { }

    while (true) {
        auto data = file.readBlock? else break;
        ...
    }

Functions
---------

A function is a series of statements operating on a list of parameters, culminating in a return value::

    ReturnType functionName(ParameterType parameterName) {
        statement;
        statement;
        statement;
        return 5;
    }

    ...

    ReturnType ret = functionName(foo);

When a function is called with `name(arg, arg)`, the arguments are passed to the parameters and control passes to the function. The statements of the function are then executed, until control returns to the caller when the function exits, by explicit `return` or reaching its end.

If the return type is `auto`, it is inferred from the type returned by the `return` statements in the function body. This is called return type inference.

Call
^^^^

A function, class method or struct method can be called with a comma-separated list of arguments::

    print("Hello World");

    double d = sin(0.0);

    class.method();

When a function does not have any parameters, the empty parens can be left out, and the function will be called implicitly::

    doWork;

This also allows struct or class methods that look like properties.
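
As a small sketch of such a property-like method: the struct `Rect` and its members are hypothetical names made up for this illustration, and only constructs shown elsewhere in this manual are used::

    struct Rect {
        int width, height;
        // A method without parameters.
        int area() { return width * height; }
    }

    Rect r = Rect(3, 4);
    // No parens: `area` reads like a field, but it is a method call.
    assert(r.area == 12);
    // The explicit call form is equivalent.
    assert(r.area() == 12);
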

Uniform Function Call Syntax
############################

As in D, "uniform function call syntax" (UFCS) may be used. That is, if a call of the form `a.method(b)` did not find a method `a.method` to call, it will instead be interpreted as `method(a, b)`. This allows easily defining global functions that can be called as if they are member functions of `a`.

Named Arguments
###############

The value of every parameter on a call may be assigned by name::

    int twice(int x) { return x + x; }

    assert(twice(x=2) == 4);

This feature does not allow reordering parameters! It is purely intended to improve call readability, and to ensure that arguments are passed to the intended parameter.

Nested functions
^^^^^^^^^^^^^^^^

Functions may be nested inside other functions. They remain valid while the surrounding function is running, and can access variables and parameters of the containing function that were declared before them::

    int doubled(int a) {
        int add(int b) {
            return a + b;
        }
        return add(a);
    }

Note that calling the nested function after the surrounding function has returned will lead to a crash!

main
^^^^

Every program must contain a function with this signature::

    void main(string[] args) {
    }

This function will be called when the program is executed.

Statements
----------

Variable declaration
^^^^^^^^^^^^^^^^^^^^

A variable can be declared like so::

    int a; // a is 0
    int b = 5;
    int c, d = 6; // c is 0
    mut int e;

Instead of a type, you may write `auto`::

    auto f = 7;

Then the type of the variable is taken from the type of the initializer.

Only mutable variables (`mut a;`) may be changed later.

Variable extraction declaration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When an expression is a sumtype, a subset or a single type may be extracted as such::

    (int | Error) foo;
    // `Error` will be returned if `foo` is not `int`.
    int bar <- foo;

.. note::

    This syntax is disabled pending renovations! The new error propagation syntax `foo?.bar` has made it superfluous.

Block statement
^^^^^^^^^^^^^^^

Multiple statements can be combined into one block::

    {
        print("Hello");
        print("World");
    }

Variables declared inside the block are not visible outside of it.

Expression statement
^^^^^^^^^^^^^^^^^^^^

Expressions can appear as statements. They are terminated with a semicolon::

    5;
    foo();

Assignment
^^^^^^^^^^

Any reference may be assigned a new value::

    mut int a = 3;
    a = 5;
    assert(a == 5);

Note that only mutable (`mut`) variables or parameters can be reassigned. As this allows some optimizations to reference counting, non-mutable variables should be preferred.

If block
^^^^^^^^

If a condition is true, execute one statement, else the other::

    if (2 + 2 == 4)
        print("2 + 2 = 4");
    else {
        print("sanity has deserted us");
    }

The condition of the `if` statement may be a variable declaration. In that case, the condition is true if the value of the variable is true. The variable will only be visible inside the `if` block::

    if (Foo foo = getFoo()) {
        // do foo things here
    }

`nullable Class` types are true if the class is non-null. In that case, the type of the tested variable can be `Class`. This is the only way in which `nullable Class` types can be converted to `Class`.

The `if let` form acts exactly like `if`, except that the variable does not have to be truthy::

    if let(auto bar = getFoo()?.bar) {
        // bar may be false here.
    }

The intended meaning is: "The fact that the variable was declared already indicates success." As with regular `if`, `breakelse` jumps to the `else` block or past the statement.

This idiom is aimed at code that wants to use the result of a chain of `?` expressions, but doesn't particularly care about its truth value.

With block
^^^^^^^^^^

The `with` block takes an expression and makes its fields implicitly accessible::

    auto s = (foo=2, bar=3);
    int baz = 5;
    with (s) {
        assert(foo == 2);
        assert(bar == 3);
        // we can still access other variables.
        assert(baz == 5);
        // lookup proceeds lexically, so the
        // variable masks the `with` statement.
        int bar = 8;
        assert(bar == 8);
    }

While loop
^^^^^^^^^^

While a condition is true, execute a statement::

    mut int i = 0;
    while (i < 10) {
        i += 1;
    }

For loop
^^^^^^^^

You can loop over a range expression::

    // prints 2, then 3
    for (size_t i in 2 .. 4) {
        print(ltoa(i));
    }

The type of the loop variable may be left out.

Array expressions are ranges. Array indexes can be iterated like::

    for (i, value in array) {
        array[i] = value + 2;
    }

You can also use a C-style for loop::

    for (mut int a = 0; a < 10; a += 1) { }

But this is rarely needed.

break, continue
^^^^^^^^^^^^^^^

While inside any loop, you may immediately abort and continue after the loop with `break`. You may immediately jump to the next iteration of the loop with `continue`.

Types
-----

Basic types
^^^^^^^^^^^

======  ==================================
name    meaning
======  ==================================
int     32-bit signed integer
short   16-bit signed integer
byte    8-bit signed integer
char    8-bit UTF-8 code unit
long    64-bit signed integer
void    0-bit empty data
size_t  platform-dependent unsigned word
float   32-bit IEEE floating point number
double  64-bit IEEE floating point number
======  ==================================

Array
^^^^^

The type `T[]` is an "array of T", which some languages call a slice. It consists of a pointer, a length and a reference to the array object.

`[2]` is an array of ints (`int[]`), allocated on the heap.

`array ~ array` is the concatenation of two arrays. Concatenation is the only way to add elements to the array. The values in an array cannot be directly modified! In other words, arrays are immutable by default.

`array.length` is the length of the array.

Appending to an array in a loop will follow a doubling strategy. It should be reasonably efficient.

`array[2]` is the third element (base-0) of the array.

`array.dup` creates a copy of the array. The copy will be mutable.
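
The array operations above can be sketched together as follows. This is a minimal illustration that uses only the operations described in this section; the variable names are made up::

    int[] a = [1, 2];
    // Concatenation yields a new array value; `a` still has two elements.
    int[] b = a ~ [3];
    assert(a.length == 2);
    assert(b.length == 3);
    // Indexing is zero-based, so index 2 is the third element.
    assert(b[2] == 3);
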

Mutable Array
^^^^^^^^^^^^^

The type `T mut[]` is a "mutable array of T". It differs from normal arrays in that elements can be freely reassigned.

`array.freeze` converts `array` to an immutable array. Unless `array` has exactly one reference, this operation is forbidden; using the `array` variable after this expression has been evaluated is forbidden. (The compiler does not at present enforce this, but it will in the future.)

`T[]` and `T mut[]` are separated because in my experience these types occupy fundamentally different roles in a program. If you pass an array to a function, you get the assurance that it won't be modified. Likewise if you are a class, and somebody gives you an array that you store in a class member, the value of that array will not change on you.

Tuple
^^^^^

`(int, float)` is a tuple with two member types, `int` and `float`. Each member can have an independent value.

`(2, 3.0f)` is an expression of type `(int, float)`.

`tuple[0]` is the first member of the tuple. The index value must be an int literal.

Tuple members can be named: `(int i, float f)`. This allows accessing the member with `value.i`.

When implicitly converting tuples, tuple fields without names implicitly convert to any name, but tuple fields with names only convert to other fields with the same name. For example, `(2, 3)` implicitly converts to `(int from, int to)`, but `(min=2, max=3)` does not.

Pointers
^^^^^^^^

Don't use pointers.

Sum type
^^^^^^^^

`(int | float)` is either an int or a float value::

    (int | float) a = 4;

    return a.case(
        int i: i / 2,
        float f: f / 2.0f);

    a.case {
        int i: print(itoa(i));
        float f: print(ftoa(f));
    }

Members of a sumtype can be marked as "fail", enabling error return::

    (int | fail FileNotFound) foo() { return "test".readAll?.atoi; }

    int i = foo()?;

If foo returns a `FileNotFound`, it will be automatically returned at the `?`.

Note that this is not required for subtypes of `std.error.Error`.

Symbol Identifier
^^^^^^^^^^^^^^^^^

A symbol identifier takes the form `:name`. It is both a type and an expression. The type `:name` has one value, which is also `:name`.

This feature can be used to "type-tag" entries in sumtypes, to differentiate identically typed entries, such as `(:centimeters, int | :meters, int)`.

It is also used to construct "value-less" sumtype entries, such as `(int | :none)`.

Struct
^^^^^^

A struct is a value type that combines various members and methods that operate on them::

    struct Foo
    {
        int a, b;

        int sum() { return this.a + b; }
    }

    Foo foo = Foo(2, 3);
    assert(foo.sum() == 5);

A method is a function defined in a struct (or class). It takes a reference to the struct value it is called on as a hidden parameter called `this`.

Class
^^^^^

A class is a **reference type** that combines various members and methods that operate on them::

    class Foo
    {
        int a, b;

        this(this.a, this.b) { }

        int sum() { return this.a + b; }
    }

    Foo foo = new Foo(2, 3);
    assert(foo.sum() == 5);

Note that, as opposed to C++, the type `Foo` designates a reference to the class. It is impossible to hold a class by value.

`this` is a special method without return value that designates the constructor of the class. When instantiating a class with `new Class(args)`, `this(args)` is called.

The parameter `this.a` indicates that the argument is directly assigned to the member `a`, rather than passed to the method as a parameter.

Classes can be inherited with a subclass. An instance of the subclass can be implicitly converted to the parent class. When a method is called on an instance, the function that runs is that of the allocated class, not of the type of the reference::

    class Foo
    {
        int get() { return 5; }
    }

    class Bar : Foo
    {
        // "override" must be specified, to indicate
        // that a parent method is being redefined.
        override int get() { return 7; }
    }

    Foo foo = new Bar;
    assert(foo.get == 7);

Classes can also inherit from interfaces, which are like "thin classes" that can only contain methods. In exchange, arbitrarily many interfaces can be inherited from::

    interface Foo
    {
        int get();
    }

    class Bar : Parent, Foo
    {
        override int get() { return 5; }
    }

    Foo foo = new Bar;
    assert(foo.get == 5);

In a subclass constructor, you can use the syntax `super()` to call the constructor of the parent class. You can also use the keyword `super` in the parameter list to insert an implicit super constructor call::

    class Bar : Foo
    {
        int c;

        this(super, this.c) { }
    }

The type of an object can be tested with the `instanceOf` property::

    nullable Bar bar = foo.instanceOf(Bar);

    if (Bar bar = foo.instanceOf(Bar)) { }

Return and parameter types follow `covariance and contravariance`_ on inheritance.

A class type may be qualified as `nullable`. In that case, the special value `null` implicitly converts to a reference to the type. By default, class references are not nullable::

    nullable Foo foo = null;
    assert(!foo);
    Foo bar = foo; // errors

As a special treat, the `case` expression allows treating a nullable class as a sumtype of a non-nullable class and `null`::

    nullable Foo foo;
    Foo bar = foo.case(null: return false);

Function and Delegate
^^^^^^^^^^^^^^^^^^^^^

You can take the address of a function using the `&` operator. The type of the expression is `R function(T)`.

When you take the address of a class method, the type will be `R delegate(T)`. A `delegate` is a "fat function pointer" that carries a pointer to the context, i.e. the object.

You can also take the address of a nested function with `&`, but then the type will be `R delegate!(T)`, a "noncopyable delegate". It cannot be used anywhere where a reference would have to be taken. As the delegate carries a pointer to the stackframe, this is necessary to protect the developer from use-after-return bugs.
A nested function can be heap-allocated using the syntax `new &fun`. `new` will make a copy of the surrounding stackframe for the function. In that case, the type will be `R delegate(T)` and the allocated stackframe will be reference counted.

`typeof`
^^^^^^^^

Given an expression, the type of the expression can be used as a type with `typeof`::

    typeof(a + b) sum = a + b;

Since `auto` exists, this is mostly used for return and parameter types.

.. _covariance and contravariance: https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)

Unittest
--------

Unittest blocks will be compiled and run when the compiler is called with `-unittest`::

    int sum(int a, int b) { return a + b; }

    unittest
    {
        assert(sum(2, 3) == 5);
    }

Templates
---------

A template is a wrapper around a declaration that allows parameterizing it. The syntax is::

    template max(T) {
        T max(T first, T second) {
            if (first > second) return first;
            return second;
        }
    }

Here, `T` is the "template parameter". Multiple template parameters can be used.

The symbol in the template must be *eponymous*, i.e. have the same name as the template. To call it, instantiate the template: `max!int(2, 3)` or `max!float(2.5, 3)`. Here, `max!int` is "the function `max` in the version of the template `max` where `T` is `int`."

Multiple parameters are passed in parentheses: `templ!(int, float)`.

If the template is called directly, without explicitly instantiating it, the compiler will try to unify the arguments passed with the template arguments available in order to infer their types. If only some template arguments are given in the instantiation, the compiler will try to infer the rest.

Ranges
------

If a type `T` has the properties `bool empty`, `T next` and `E front`, then it is called a "range over `E`". Arrays are an example of such. Another example is range expressions: `from .. to`.

If you define these properties in a data type, you can use it as the source of a loop.
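
A minimal sketch of a user-defined range follows. The struct `Countdown` and its field `value` are hypothetical names chosen for illustration, and the sketch assumes that the loop tests `empty`, reads `front`, and then continues with `next`::

    struct Countdown {
        int value;
        // The range is exhausted once it has counted down to zero.
        bool empty() { return value == 0; }
        // The current element.
        int front() { return value; }
        // The rest of the range, as a new value of the same type.
        Countdown next() { return Countdown(value - 1); }
    }

    unittest
    {
        mut int sum = 0;
        // Under the assumption above, this visits 3, 2, 1.
        for (i in Countdown(3)) {
            sum += i;
        }
        assert(sum == 6);
    }

Because `next` returns a fresh value rather than modifying the range in place, the struct itself needs no mutable state.
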

Lambdas
-------

A lambda is a templated nested function reference. They can be assigned to a value. When called, they are implicitly instantiated.

Example::

    int a = 5;
    auto add = b => a + b;
    assert(add(2) == 7);

Every lambda has a unique type. Because of this, they cannot be stored in data structures. Their primary purpose is being passed to templated functions::

    auto a = (0 .. 10).filter(a => a & 1 == 0).map(a => a / 2).array;
    assert(a == [0, 1, 2, 3, 4]);

The compiler will try to prevent you from returning a lambda from the function where it was defined. To enable this, lambdas cannot be assigned to class fields, or in general put in any location where the compiler could lose track of where the lambda is.

Macros
------

.. note::

    For this feature, compiler knowledge is required!

When `macro(function)` is called, `function` is loaded into the compiler and executed with a macro state parameter. This allows modifying the macro state of the compiler to add a macro class instance.

Macro classes can extend the compiler with new functionality using a set of hooks:

- calls: `a(b, c)`
- expressions: `2 ★ 2`
- properties: `a.b`
- statements: `macroThing;`
- imports: `import github("http://github.com/neat-lang/example").module;`

Look at `std.macro.*` for examples.

The entire compiler is available for importing and reuse in macros. However, it is recommended to limit yourself to the functionality in `neat.base`. This will also keep compile times down.