# sx language specification

## 1. Lexical Structure

### Comments

Line comments start with `//` and extend to end of line.

```sx
// this is a comment
```

### Identifiers

- Lowercase or mixed-case for variables, functions: `x`, `compute`, `main`
- UPPER_SNAKE_CASE for constants: `SOME_INT`, `SOME_STR`
- PascalCase for types: `Foo`

### Literals

| Kind | Examples | Type |
|-----------|---------------------|---------|
| Integer | `0`, `42`, `0xFF`, `0b1010` | `s64` |
| Float | `0.3`, `0.9` | `f32` |
| String | `"Hello"`, `"z: {z}"` | `string` (may span multiple lines) |
| Heredoc String | `#string END`...`END` | `string` |
| Boolean | `true`, `false` | `bool` |
| Enum | `.variant1` | inferred from context |
| Undefined | `---` | context-dependent |

String literals support escape sequences (`\n`, `\t`, `\r`, `\\`, `\"`, `\0`) and may span multiple lines directly:

```sx
shader_src := "#version 330 core
void main() {
    gl_Position = vec4(0.0);
}
";
```

**Heredoc strings** use `#string DELIMITER` syntax (inspired by Jai). Content is completely raw — no escape processing. The delimiter is any identifier. Content starts after the newline following the delimiter and ends when the delimiter appears at column 0 of a line.

```sx
vert_src := #string GLSL
#version 330 core
void main() {
    gl_Position = vec4(aPos, 1.0);
}
GLSL;
```

### Keywords

`if`, `else`, `then`, `while`, `for`, `break`, `continue`, `true`, `false`, `enum`, `struct`, `union`, `case`, `return`, `defer`, `push`, `ufcs`, `in`, `xx`, `and`, `or`, `raise`, `try`, `catch`, `onfail`, `error`

> Note: `enum` is used for both payload-less and payload-bearing sum types (tagged unions). `union` is reserved for C-style untagged unions (memory overlays).

> Note: `raise`, `try`, `catch`, `onfail`, and `error` are the error-handling keywords. `or` is reused as the failable-fallback / chain operator. See [§12 Error Handling](#12-error-handling).
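To make the difference between the two string forms under Literals concrete, the sketch below writes similar text as an ordinary literal and as a heredoc. The names `escaped` and `raw` and the `TXT` delimiter are made up for illustration, and the comments only restate the rules given above as we read them; the snippet has not been checked against a compiler.

```sx
// Ordinary literal: escape sequences are processed,
// so \t becomes a tab and \n a newline.
escaped := "col1\tcol2\n";

// Heredoc: content is raw, so the backslash sequences stay as written.
// The indented TXT below should remain part of the content, because
// the delimiter only terminates the string at column 0.
raw := #string TXT
col1\tcol2\n
    TXT is still content here
TXT;
```

A heredoc is the more convenient form for embedded shader or template source, where backslashes and quotes should pass through untouched.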

### Operators

| Operator | Meaning |
|----------|------------------|
| `+` | addition |
| `-` | subtraction / negation |
| `*` | multiplication |
| `/` | division |
| `==` | equality |
| `!=` | inequality |
| `<` | less than |
| `>` | greater than |
| `<=` | less or equal |
| `>=` | greater or equal |
| `&` | bitwise AND |
| `\|` | bitwise OR |
| `^` | bitwise XOR |
| `~` | bitwise NOT (unary) |
| `<<` | left shift |
| `>>` | right shift (arithmetic for signed, logical for unsigned) |
| `and` | logical AND (short-circuit) |
| `or` | logical OR (short-circuit) |
| `in` | membership test (tuples) |
| `\|>` | pipe (function application) |
| `+=` | add-assign |
| `-=` | sub-assign |
| `*=` | mul-assign |
| `/=` | div-assign |
| `&=` | bitwise AND assign |
| `\|=` | bitwise OR assign |
| `^=` | bitwise XOR assign |
| `<<=` | left shift assign |
| `>>=` | right shift assign |

### Delimiters and Punctuation

| Token | Meaning |
|--------|--------------------------------------|
| `::` | constant binding / definition |
| `:=` | variable binding (mutable, inferred) |
| `:` | type annotation |
| `=` | assignment (in typed var decl) |
| `;` | statement terminator |
| `,` | separator (trailing commas allowed) |
| `.` | field access / enum literal prefix |
| `->` | return type annotation |
| `=>` | lambda arrow |
| `$` | generic type parameter introduction |
| `---` | undefined value |
| `()` | grouping / params |
| `{}` | blocks / bodies |

---

## 2. Type System

### Primitive Types

- `s1`..`s64` — signed integers (1 to 64 bits). `s64` is the default for integer literals.
- `u1`..`u64` — unsigned integers (1 to 64 bits).
- `f32` — 32-bit floating point
- `f64` — 64-bit floating point
- `bool` — boolean (`true` / `false`)
- `string` — string of characters
- `Any` — type-erased value, represented as `{ i64, i64 }` (type tag + payload). Used for variadic arguments and runtime type dispatch.
- `Type` — compile-time type value.
  At runtime, represented as an `i64` type tag (same tag space as `Any`).

### Enum Types

User-defined sum types with named variants. Variants may optionally carry typed data (tagged unions).

Internally, payload-less enums are represented as `i64` (variant index). Enums with payloads are represented as `{ i64, [max_payload_size x i8] }` (tag + data).

#### Declaration

```sx
// Payload-less enum
Color :: enum { red; green; blue; }

// Enum with payloads (tagged union)
Shape :: enum {
    circle: f32;    // typed variant
    rect: s32;      // typed variant
    none;           // void variant
}
```

Variants are referenced with dot-prefix syntax: `.variant1`

#### Construction

```sx
c := Color.red;             // payload-less
s :Shape = .circle(3.14);   // inferred from context
s = .none;                  // void variant
s = Shape.rect(42);         // explicit prefix
```

#### Payload Access

```sx
r := s.circle; // load payload as f32 (undefined behavior if wrong variant active)
```

#### Pattern Matching

```sx
if s == {
    case .circle: print("circle\n");
    case .rect: print("rect\n");
    case .none: print("none\n");
}
```

#### Payload Capture

Match arms can capture the variant's payload into a local variable:

```sx
if s == {
    case .circle: (radius) { print("radius: {}\n", radius); }
    case .rect: (size) => print("size: {}\n", size);
}
```

The `(name)` after the colon binds the payload. Two forms:

- Block: `case .variant: (name) { body }`
- Short: `case .variant: (name) => expr;`

#### Enum Interpolation

Payload-less enums print as `.variant`. Enums with payloads print as `.variant(value)` or ``:

```sx
print("{}", s); // .circle(3.140000)
```

### Union Types (Untagged)

C-style untagged unions for zero-cost memory overlays (type punning). All fields share the same memory — no tag, no runtime overhead. The LLVM representation is `[max_field_size x i8]`.

#### Declaration

```sx
Overlay :: union {
    f: f32;
    i: s32;
}
```

All fields must have types (unlike enums, which may have void variants).
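
As a side-by-side illustration of the two declaration forms described so far, the sketch below declares a tagged enum and an untagged union over the same two payload types. `Value` and `Bits` are hypothetical names, and the snippet only recombines syntax shown above; treat it as an unverified sketch.

```sx
// Tagged: the enum records which variant is active.
Value :: enum {
    i: s32;
    f: f32;
    empty;          // void variant is allowed in an enum
}

// Untagged: both fields alias the same bytes and nothing
// records which one was written last.
Bits :: union {
    i: s32;
    f: f32;
    // empty;       // not allowed: every union field needs a type
}

v : Value = .f(1.5);
if v == {
    case .f: (x) => print("f: {}\n", x);
    case .i: (n) => print("i: {}\n", n);
    case .empty: print("empty\n");
}
```

Matching with payload capture avoids the undefined behavior that direct payload access (`v.f`) has when a different variant is active.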

#### Anonymous Struct Fields (Member Promotion)

Anonymous `struct` fields inside a union have their members promoted to the union namespace:

```sx
Vec2 :: union {
    data: [2]f32;
    struct { x, y: f32; };
}
```

Access promoted members directly: `v.x`, `v.y` — these are zero-cost GEPs into the same underlying memory as `v.data[0]`, `v.data[1]`.

#### Initialization

Unions must be initialized with `---` (undefined) and then assigned per-field:

```sx
o :Overlay = ---;
o.f = 3.14;
print("{}\n", o.i); // reinterpret bits as s32
```

#### Restrictions

- Pattern matching (`if x == { case ... }`) is not supported on unions.
- Unions cannot be printed directly via `print("{}", union_val)` — access individual fields instead.

### Struct Types

User-defined product types with named fields.

```sx
Vec4 :: struct {
    x, y, z, w: f32;
}
```

Fields are declared as `name1, name2: type;` (comma-separated names sharing a type, semicolon-terminated).

#### Field Defaults

Fields may have default values. Fields without an explicit default have a zero-value default. `---` marks a field as explicitly undefined.

```sx
Foo :: struct {
    a : u2;         // default is 0
    b : u8 = 42;    // default is 42
    c : u8 = ---;   // default is undefined
}
```

#### Struct Literals

```sx
// Positional (with type annotation — type inferred from annotation)
v1 : Vec4 = .{ 1, 2, 3, 0 };

// Positional (with type prefix)
v2 := Vec4.{ 4, 1, 1, 3 };

// Named fields (any order)
v3 := Vec4.{ w=0, x=2, y=3, z=4 };

// Mixed named + shorthand (bare identifier = field name matches variable name)
z := 5.0;
w := 6.0;
v4 := Vec4.{ y=3, x=9, w, z };

// Trailing commas are allowed in all comma-separated lists
v5 := Vec4.{
    x = 1.0,
    y = 2.0,
    z = 3.0,
    w = 4.0,
};
```

#### Field Access and Assignment

```sx
v1.x        // read field x of struct v1
v1.x = 3.0; // assign to field x of struct v1
```

#### `#using` — Struct Composition

`#using StructName;` inside a struct declaration embeds all fields from `StructName` at that position.
The embedded fields are accessed directly, as if declared inline.

```sx
UBase :: struct { x: s32; y: s32; }
UExt :: struct {
    #using UBase;
    z: s32;
}

e := UExt.{ x = 1, y = 2, z = 3 };
print("{}\n", e.x); // 1
```

`#using` may appear at any field position (beginning, middle, end) and multiple `#using` entries are allowed:

```sx
UPos :: struct { px: s32; py: s32; }
UCol :: struct { r: s32; g: s32; }
USprite :: struct {
    #using UPos;
    #using UCol;
    scale: s32;
}

s := USprite.{ px = 10, py = 20, r = 255, g = 128, scale = 1 };
```

The referenced struct must be declared before use. This is purely a compile-time field expansion — no runtime overhead.

#### Struct Interpolation

Struct values in string interpolation print as `TypeName{field:value, ...}`:

```sx
print("{}", v1); // Vec4{x:1.0, y:2.0, z:3.0, w:0.0}
```

### Struct Methods

Functions declared inside a struct body become methods, registered as `StructName.method`:

```sx
Point :: struct {
    x, y: s32;

    sum :: (self: *Point) -> s32 { self.x + self.y; }
}

p := Point.{ x = 3, y = 4 };
print("{}\n", p.sum()); // 7
```

Methods receive the struct (typically as a pointer) as their first parameter. Dot-call syntax `obj.method(args)` resolves struct methods — it is **not** UFCS for arbitrary free functions. The pipe operator `|>` remains the universal UFCS mechanism.

### Protocol Types

Protocols define a set of method signatures that types can implement. They enable:

- **Static dispatch**: compile-time checked constraints on generic type parameters.
- **Dynamic dispatch**: type-erased protocol values with runtime method dispatch through function pointers.

#### Declaration

```sx
Allocator :: protocol #inline {
    alloc :: (size: s64) -> *void;
    dealloc :: (ptr: *void);
}
```

Protocol methods have an **implicit receiver** — no `self` in the protocol signature. The compiler adds `*Self` automatically. The `#inline` modifier embeds function pointers directly in the protocol value (no vtable indirection).
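
The ownership examples later in this section use a `Sizable` protocol and a `Widget` struct without showing their declarations. One plausible shape is sketched below, purely as an assumption chosen to match how those names are used (`s.size()`, `s.add(5)`, `w.value`); it also shows how the implicit receiver in a protocol signature lines up with the explicit `self` parameter in an `impl` block (described in the following subsections).

```sx
// Assumed declarations; the spec itself does not show them.
Sizable :: protocol {
    size :: () -> s64;      // no `self` here: the receiver is implicit
    add :: (n: s64);
}

Widget :: struct { value: s64; }

impl Sizable for Widget {
    // The impl names the receiver explicitly as the first parameter.
    size :: (self: *Widget) -> s64 { self.value; }
    add :: (self: *Widget, n: s64) { self.value += n; }
}
```

Because `Sizable` is declared without `#inline`, a value of this protocol would use the default vtable layout.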

#### `#inline` vs default layout

| Layout | Declaration | Value layout | Dispatch cost |
|--------|-------------|--------------|---------------|
| `#inline` | `protocol #inline { ... }` | `{ ctx: *void, fn_ptr1, fn_ptr2, ... }` | Zero indirection |
| Default | `protocol { ... }` | `{ ctx: *void, __vtable: *Vtable }` | One pointer chase |

Use `#inline` for protocols with few methods where call overhead matters (e.g., allocators). Use the default layout for protocols with many methods to keep the value size small.

#### `impl` Blocks

```sx
impl Allocator for GPA {
    alloc :: (self: *GPA, size: s64) -> *void {
        self.alloc_count += 1;
        malloc(size);
    }
    dealloc :: (self: *GPA, ptr: *void) {
        self.alloc_count -= 1;
        free(ptr);
    }
}
```

- Top-level declarations (not inside struct bodies)
- Enable retroactive conformance — implement a protocol for types you don't own
- Impl methods are also registered as struct methods (`GPA.alloc`) for direct calls
- Duplicate `{Protocol, Type}` pair in the same compilation unit is a compile error

#### Protocol Values and `xx` Conversion

Convert a concrete type to a protocol value with `xx`:

```sx
gpa := GPA.init();
a : Allocator = xx gpa; // concrete → protocol value
ptr := a.alloc(64);     // dynamic dispatch through fn-ptr
a.dealloc(ptr);
```

`xx` works at assignment, call sites, and return positions:

```sx
use_allocator(xx gpa);                      // at call site
make_alloc :: () -> Allocator { xx gpa; }   // in return position
```

Protocol values can be stored in struct fields, arrays, and passed through function calls:

```sx
Arena :: struct {
    parent: Allocator; // protocol value as struct field
    // ...
}
allocators : [2]Allocator = .[xx gpa, xx arena]; // protocol values in array
```

#### Ownership and Lifetime

Protocol values have two ownership modes. The mode is selected by the shape of the operand to `xx`:

| Operand shape | `ctx` points to | Lifetime | Who frees |
|---|---|---|---|
| `xx <rvalue>` (struct literal, call result, etc.) | Heap-allocated copy | Until `free(p)` | Caller |
| `xx <lvalue>` (identifier, field, index, deref) | The named storage | Tied to that storage's scope | Caller manages the storage |
| `xx <pointer>` / `xx @ptr` | Original pointee | Tied to pointee | Caller manages pointee |

**`xx <rvalue>`** — when the operand has no storage of its own (struct literal, function-call result, arithmetic expression, etc.) the concrete data is heap-copied through `context.allocator` so the protocol value is self-contained. It can be stored in containers, returned from functions, and outlives the scope where it was created. Call `free(p)` to release the backing memory when done:

```sx
s : Sizable = xx Widget.{ value = 42 }; // heap-copies Widget
print("{}\n", s.size());
free(s); // frees the heap-allocated Widget copy
```

**`xx <lvalue>`** — when the operand names existing storage (a local variable, struct field, array element, or dereferenced pointer) the protocol borrows that storage directly. No heap copy, no allocation, no `free` needed; mutations through the protocol are visible to the original. The protocol value is only valid while the named storage is alive:

```sx
w := Widget.{ value = 0 };
s : Sizable = xx w;     // borrows w's storage; no copy
s.add(5);               // modifies w through ctx
print("{}\n", w.value); // 5
// do NOT free(s) — w owns the data
```

**`xx @ptr`** is equivalent to `xx <lvalue>` for the dereferenced pointee — the protocol borrows. It's mostly redundant under the lvalue rule above but stays valid for explicit clarity when the operand is a pointer you want to make obvious is being borrowed:

```sx
w := Widget.{ value = 0 };
s : Sizable = xx @w; // identical to `xx w` — borrows w
```

**Vtables** are global constants — shared across all protocol values of the same `(Protocol, ConcreteType)` pair. They are never allocated or freed at runtime.

#### Default Methods

Protocol methods can have bodies.
`self` dispatches through the vtable (dynamic dispatch):

```sx
Writer :: protocol {
    write :: (data: string) -> s64;         // required
    write_line :: (data: string) -> s64 {   // default
        n := self.write(data);
        n + self.write("\n");
    }
}
```

Default methods are used unless overridden in the impl. Default methods calling `self.method()` dispatch through the vtable, so they work correctly with any concrete type.

#### `Self` Type

`Self` is a contextual keyword in protocol declarations — resolves to the concrete type in impls:

```sx
Eq :: protocol {
    eq :: (other: Self) -> bool;
}

impl Eq for Point {
    eq :: (self: *Point, other: Point) -> bool {
        self.x == other.x and self.y == other.y;
    }
}

// Static dispatch:
p1.eq(p2); // calls Point.eq directly

// Dynamic dispatch:
e : Eq = xx p1;
e.eq(p2); // dispatches through vtable, Self params erased to *void
```

For dynamic dispatch, `Self` parameters are erased to `*void` — the caller passes a pointer to the argument, and the thunk loads the concrete value.

#### Generic Constraints

`$T/Protocol` syntax validates that a type parameter implements the required protocol(s):

```sx
are_equal :: (a: $T/Eq, b: T) -> bool {
    a.eq(b);
}

// Multiple constraints:
eq_and_hash :: (a: $T/Eq/Hashable, b: T) -> bool { ... }
```

Constraints produce clear errors at monomorphization: `"s64 does not implement Hashable"`. Dispatch is static — same as unconstrained generics but with compile-time validation.

Constraints also work on struct type parameters:

```sx
SortedPair :: struct ($T: Type/Comparable) {
    lo: T;
    hi: T;
}
```

#### Generic Struct Impls

```sx
Pair :: struct ($T: Type) { a: T; b: T; }

impl Summable for Pair($T) {
    sum :: (self: *Pair(T)) -> s32 { xx self.a + xx self.b; }
}
```

The impl is instantiated per concrete type argument, like generic struct methods.
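
Pulling the pieces of this section together, the following sketch declares a protocol, implements it for a struct, and then reaches the same method three ways: directly, through a constrained generic, and through a protocol value. `Measurable`, `Span`, and `total` are hypothetical names introduced only for this illustration, and the snippet is assembled from the syntax shown above rather than compiled.

```sx
// Hypothetical protocol with a single required method.
Measurable :: protocol {
    measure :: () -> s32;
}

Span :: struct { lo, hi: s32; }

impl Measurable for Span {
    measure :: (self: *Span) -> s32 { self.hi - self.lo; }
}

// Constrained generic: T must implement Measurable; calls are static.
total :: (a: $T/Measurable, b: T) -> s32 {
    a.measure() + b.measure();
}

main :: () {
    s1 := Span.{ lo = 0, hi = 10 };
    s2 := Span.{ lo = 5, hi = 7 };

    print("{}\n", s1.measure());    // direct call on the concrete type
    print("{}\n", total(s1, s2));   // monomorphized for T = Span

    m : Measurable = xx s1;         // lvalue operand: borrows s1, no free
    print("{}\n", m.measure());     // dispatched through the vtable
}
```

Only the last call pays for indirection; the first two resolve to `Span.measure` at compile time.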

#### Dispatch Rules

| Usage | Dispatch | Cost |
|-------|----------|------|
| `gpa.alloc(64)` on `*GPA` | Static — direct call | Zero |
| `$T/Allocator` constraint | Static — monomorphized | Zero |
| `a : Allocator = xx gpa; a.alloc(64)` | Dynamic — fn-ptr / vtable | Indirect call |

Static dispatch is automatic when the concrete type is known. Dynamic dispatch only when explicitly type-erased via `xx` into a protocol value.

#### Parameterised Protocols (compile-time only)

A protocol with type parameters is compile-time only — it has no vtable and no boxed instance shape. Each `impl` is monomorphised per `(ProtocolArgs, Source)` pair.

The canonical example is `Into`, declared in `modules/std.sx`:

```sx
Into :: protocol(Target: Type) {
    convert :: () -> Target;
}
```

A user can then add conversions for any `(Source, Target)` pair:

```sx
MyString :: struct { tag: s64 = 0; }

impl Into(MyString) for s64 {
    convert :: (self: s64) -> MyString { .{ tag = self }; }
}

main :: () -> s32 {
    x : MyString = xx 42; // direct call to monomorphised convert
    0;
}
```

The `xx` operator hooks into this mechanism: when an explicit target type is provided and the built-in coercion ladder doesn't apply, `xx val : T` lowers to `val.convert()` where `convert` comes from the visible `impl Into(T) for typeof(val)`. The call is a direct call — no vtable, no runtime dispatch.

**Source side is a TypeExpr.** Unlike nullary `impl P for SomeStruct`, the `for`-side of a parameterised impl accepts any type expression, including closure and function types:

```sx
impl Into(Block) for Closure() -> void { ... }
impl Into(MyBuf) for []u8 { ... }
```

**Lookup rules:**

- **Built-ins win.** The user-space fallback only fires when `coerceToType` made no progress (numeric narrow/widen, ptr↔int, etc. take priority).
- **Only at explicit `xx`.** Implicit conversions (assignment, parameter passing) never trigger user-space coercions.
- **Explicit target required.** `xx val` with no surrounding type context still defaults to `s64` for legacy reasons; the user-space fallback only fires when the target was named explicitly.
- **Import-scoped visibility.** An `impl` is visible from a file only if the file transitively imports the impl's defining module. An impl in an imported-but-not-directly-related module produces a clean diagnostic (`no visible xx conversion …`).
- **Duplicate impls error.** If two impls for the same `(Source, Target)` pair are both visible, the compiler emits a diagnostic naming both source modules. Same-file duplicates are caught at registration time. Cross-module duplicates are caught at the `xx` site.
- **No recursion.** A `convert` body that re-enters `xx self : Target` for the same `(Source, Target)` pair produces a "recursive xx conversion" diagnostic; the compiler does not try to monomorphise the convert into itself.

### Tuple Types

Anonymous product types with optional field names. Tuples are first-class values — they can be stored in variables, passed to functions, and returned. Tuples also support **spread** (`..tuple` / `(..tuple)`) and **field projection** (`tuple.field` across all elements) — see "Variadic Heterogeneous Type Packs".

#### Construction

```sx
pair := (40, 2);            // positional tuple: (s64, s64)
named := (x: 10, y: 20);    // named tuple: (x: s64, y: s64)
single := (42,);            // 1-tuple (trailing comma in value position)
zeroed : (s32, s32) = ---;  // zero-initialized tuple
```

Note: In value position, `(expr)` without a comma is a grouping expression, not a tuple. Use `(expr,)` for a 1-tuple value.

#### Type Syntax

In type position, `(T)` is always a tuple type — no trailing comma needed.
The `->` arrow disambiguates function types from tuple types:

```sx
(s64)               // tuple type with one field
(s64, s64)          // tuple type with two fields
(s64) -> s64        // function type: takes s64, returns s64
(s64, s64) -> s64   // function type: takes two s64, returns s64
```

#### Field Access

```sx
pair.0;     // 40 — numeric index
pair.1;     // 2
named.x;    // 10 — named field
named.0;    // 10 — numeric index also works on named tuples
```

#### As Return Type

```sx
swap :: (a: s64, b: s64) -> (s64, s64) {
    (b, a);
}

wrap :: (x: s64) -> (s64) {
    (x,);
}

s := swap(1, 2);    // s.0 = 2, s.1 = 1
t := wrap(42);      // t.0 = 42
```

#### Representation

Tuples are represented as anonymous LLVM struct types (same layout as named structs). A tuple `(s64, s64)` has LLVM type `{ i64, i64 }`.

#### Tuple Operators

**Equality and inequality** — element-wise comparison, both sides must have the same field count:

```sx
(1, 2) == (1, 2) // true
(1, 2) != (1, 3) // true
```

**Concatenation** (`+`) — creates a new tuple with fields from both sides:

```sx
c := (1, 2) + (3, 4);   // c : (s64, s64, s64, s64)
c.0;                    // 1
c.3;                    // 4
```

**Repetition** (`*`) — repeats a tuple N times (N must be a compile-time integer literal):

```sx
r := (1, 2) * 3;    // r : (s64, s64, s64, s64, s64, s64)
r.0;                // 1
r.5;                // 2
```

**Lexicographic comparison** (`<`, `<=`, `>`, `>=`) — compares element-by-element left to right:

```sx
(1, 2) < (1, 3)     // true (first fields equal, 2 < 3)
(2, 0) > (1, 9)     // true (2 > 1, rest ignored)
(1, 2) <= (1, 2)    // true (all equal, <= allows tie)
```

**Membership** (`in`) — checks if a value exists in a tuple:

```sx
3 in (1, 2, 3) // true
5 in (1, 2, 3) // false
```

### Array Types

Fixed-size arrays with element type and length.
```sx
buffer : [5]f32 = .[0, 2, 3.5, 4, 0];
val := buffer[2];   // 3.5
buffer.len          // 5 (compile-time constant, s64)
```

Arrays can also be constructed programmatically with the `Array` builtin:

```sx
MyArr :: Array(5, s32); // equivalent to [5]s32
```

### Slice Types

A slice `[]T` is a fat pointer `{ptr, i64}` referencing a contiguous sequence of `T` elements. Same runtime layout as `string`.

```sx
// Arrays implicitly coerce to slices at call sites
arr : [5]s32 = .[3, 1, 4, 1, 5];
sortSlice(arr); // [5]s32 → []s32 coercion

// Slice operations
items[i]            // read element at index
items[i] = val;     // write element at index
items.len           // length (s64)
items.ptr           // raw pointer
```

Slices support generic type parameters: `[]$T` introduces type parameter `T` inferred from the element type of the argument (array or slice).

### Subslicing

Arrays, slices, and strings support subslice syntax to create zero-copy views:

```sx
arr : [5]s32 = .[3, 1, 4, 1, 5];
sub := arr[1..4];   // []s32 → [1, 4, 1]
head := arr[..3];   // []s32 → [3, 1, 4]
tail := arr[2..];   // []s32 → [4, 1, 5]

msg := "hello world";
word := msg[6..11]; // string → "world"
```

- `expr[start..end]` — elements from `start` (inclusive) to `end` (exclusive)
- `expr[start..]` — elements from `start` to end
- `expr[..end]` — elements from beginning to `end`
- Result type: `[]T` for arrays/slices, `string` for strings
- No memory allocation — the result points into the original backing storage

### Pointer Types

| Syntax | Meaning | `.len` | `[i]` |
|--------|---------|--------|-------|
| `*T` | pointer to one T | no | no |
| `[*]T` | many-pointer (buffer) | no | yes |
| `*[N]T` | pointer to array of N T | yes | yes |
| `*[]T` | pointer to slice | yes | yes |

**Address-of**: `@x` returns a pointer to the variable.

```sx
v := Vec2.{ 1.0, 2.0 };
ptr := @v; // *Vec2
```

**Dereference**: `p.*` loads the value through the pointer.

```sx
copy := ptr.*; // Vec2
```

**Auto-deref**: `p.field` is sugar for `p.*.field`.
```sx
set_x :: (p: *Vec2, val: f32) {
    p.x = val; // auto-deref: p.*.x = val
}
set_x(@v, 99.0);
```

**Null**: Pointer types are currently nullable by default. `null` is the null pointer literal.

```sx
np : *Vec2 = null;
```

**Many-pointer**: `[*]T` supports indexing for buffers of unknown size.

```sx
arr : [5]s32 = .[10, 20, 30, 40, 50];
mp : [*]s32 = @arr[0];  // *s32 → [*]s32 implicit
val := mp[2];           // 30
```

**Implicit conversions**:

- `*T` → `[*]T` (pointer to element → many-pointer)
- `*[N]T` → `[*]T` (pointer to array → many-pointer)
- `[N]T` → `[*]T` at call sites (array decays to many-pointer)
- `[]T` → `[*]T` (slice decays to many-pointer, extracts `.ptr`)
- `T` → `*T` at call sites (implicit address-of)
- `null` (`*void`) → any `*T`

**Fat pointer layout**: `[:0]u8`, `string`, and `[]T` are `{ptr, i64}` structs. The raw pointer is always the first field at offset 0. This means `*[:0]u8` works as C's `char**` — a C function dereferences through the outer pointer and reads the raw `char*` from offset 0.

### Optional Types

Optional types represent values that may or may not be present.

#### Type Syntax

```sx
x: ?s32 = 42;   // optional s32, has value
y: ?s32 = null; // optional s32, no value
```

Any type `T` can be made optional: `?s32`, `?string`, `?Point`, `?*T`, `?[]T`.

#### LLVM Representation

- Non-pointer optionals (`?s32`, `?Point`): `{ T, i1 }` struct — payload + has_value flag
- Pointer optionals (`?*T`): bare pointer — null represents absence

#### Implicit Wrapping

A value of type `T` implicitly converts to `?T`:

```sx
wrap :: (n: s32) -> ?s32 {
    if n > 0 { return n; }  // s32 → ?s32 (wraps)
    return null;            // null → ?s32
}
```

#### Force Unwrap (`!`)

Extracts the payload, traps at runtime if null:

```sx
x: ?s32 = 42;
val := x!; // val : s32 = 42
```

#### Null Coalescing (`??`)

Returns the payload if present, otherwise evaluates the right-hand side:

```sx
x: ?s32 = 42;
y: ?s32 = null;
a := x ?? 0;    // 42
b := y ?? 99;   // 99
```

#### Safe Unwrap (`if val := expr`)

Binds the payload to a variable if present:

```sx
x: ?s32 = 42;
if val := x {
    print("{}\n", val); // val : s32 = 42
} else {
    print("none\n");
}
```

#### While-Optional Binding

```sx
while val := get_next() {
    // val is the unwrapped value
}
```

#### Pattern Matching

Optionals support `.some` and `.none` virtual enum variants:

```sx
result := if opt == {
    case .some: (val) { val * 2; }
    case .none: { 0; }
};
```

#### Optional Chaining (`?.`)

Short-circuits field access on optionals:

```sx
x: ?Point = Point.{ x = 1, y = 2 };
y: ?Point = null;
a := x?.x ?? 0; // 1
b := y?.x ?? 0; // 0
```

Result type of `x?.field` is always `?FieldType`.

#### Flow-Sensitive Narrowing

The compiler narrows `?T` to `T` in control flow branches:

```sx
x: ?s32 = 42;
if x != null {
    print("{}\n", x); // x is s32 here (narrowed)
}

if x == null { return; }
print("{}\n", x); // x is s32 here (guard narrowing)
```

Compound conditions:

```sx
if a != null and b != null {
    // both a and b are narrowed to their inner types
}

if a == null or b == null { return; }
// both a and b are narrowed after the guard
```

Reassignment kills narrowing.

#### Struct Field Defaults

Optional fields in structs default to `null`:

```sx
Node :: struct {
    value: s32;
    next: ?s32;
}
n := Node.{ value = 10 }; // n.next is null
```

#### Printing

`print("{}", opt)` prints the payload value if present, or `"null"`.

#### Comptime

Optionals work in `#run` blocks — `??`, `!`, `if val :=`, null checks all supported.
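
The optional operators described in this section are meant to be combined. The sketch below runs one value-producing function through null coalescing, safe unwrap, and guard narrowing. `checked_div` is a hypothetical helper written for this example, not part of any sx library, and the code has not been compiled.

```sx
// Hypothetical helper: null stands for "no result" when dividing by zero.
checked_div :: (a: s32, b: s32) -> ?s32 {
    if b == 0 { return null; }
    return a / b;               // s32 wraps implicitly to ?s32
}

main :: () {
    // Null coalescing supplies a fallback when the result is null.
    q := checked_div(10, 0) ?? 0;
    print("{}\n", q);

    // Safe unwrap binds the payload only when one is present.
    if val := checked_div(10, 2) {
        print("{}\n", val);
    } else {
        print("none\n");
    }

    // Guard narrowing: after the early return, r is treated as s32.
    r := checked_div(9, 3);
    if r == null { return; }
    print("{}\n", r);
}
```

Force unwrap (`!`) is deliberately left out: it traps on null, so it fits best where absence would indicate a bug rather than an expected case.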

### Foreign Function Interface (C Interop)

To call C functions, declare a library constant with `#library` and bind functions with `#foreign`:

```sx
// Declare a named library constant
libc :: #library "c";
sdl :: #library "SDL3";

// Bind foreign functions — library ref is required
socket :: (domain: s32, type: s32, protocol: s32) -> s32 #foreign libc;
SDL_Init :: (flags: u32) -> bool #foreign sdl;

// Symbol renaming — optional second argument gives the C symbol name
write_fd :: (fd: s32, buf: [*]u8, count: u64) -> s64 #foreign libc "write";
```

- `#library "name"` must be assigned to a named constant. The library is passed to the linker (`-lname` on Unix, `name.lib` on Windows).
- `#foreign lib_ref` declares a function as external C. The library reference is mandatory.
- `#foreign lib_ref "c_symbol"` renames the binding: the sx function name differs from the C symbol. This avoids name collisions (e.g. POSIX `write` vs an sx builtin).

### C Interop Type Mapping

| C type | sx type | Notes |
|--------|---------|-------|
| `const char*` (input) | `[:0]u8` | compiler extracts `.ptr` at call site |
| `char*` (output buffer) | `[*]u8` | raw buffer, no length |
| `const char**` | `*[:0]u8` | address of `[:0]u8` — `.ptr` at offset 0 |
| `int*` (single out) | `*s32` | |
| `unsigned*` (single out) | `*u32` | |
| `float*` (buffer) | `[*]f32` | |
| `void*` (generic) | `*void` | only for truly opaque/generic data |

### Vector Types (SIMD)

LLVM SIMD vectors, parameterized by length and element type.

```sx
v := vec3(1, 3, 2); // Vector(3, f32)
```

**Arithmetic**: Element-wise `+`, `-`, `*`, `/` on vectors of same dimensions.

```sx
add := v1 + v2; // element-wise addition
```

**Scalar broadcast**: Scalar operands are broadcast to match the vector.

```sx
scaled := v * 2.0; // [2.0, 6.0, 4.0]
```

**Negation**: Unary `-` negates each element.
```sx
neg := -v; // [-1.0, -3.0, -2.0]
```

**Element access**: `.x`, `.y`, `.z`, `.w` (aliases `.r`, `.g`, `.b`, `.a`) extract single components.

```sx
v.x // first element
v.z // third element
```

**Index access**: `v[i]` extracts by index.

```sx
v[0] // first element
```

**Built-in `sqrt`**: Calls LLVM `llvm.sqrt.f32`/`.f64` intrinsic.

```sx
s := sqrt(9.0); // 3.0
```

### Function Types

Expressed as `(param_types) -> return_type`. A function with no return type annotation returns void.

```sx
// type is (s32) -> s32
compute :: (x: s32) -> s32 { x * x; }

// type is () -> void
main :: () { }
```

### Type Aliases

A name bound to an existing type.

```sx
SOME_TYPE :: f64;
```

### Generic Functions (Monomorphization)

Functions can be parameterized over types using `$T` syntax. The `$` prefix introduces a type parameter; subsequent uses of the name reference it.

```sx
sum :: (a: $T, b: T) -> T {
    return a + b;
}
```

- `$T` in a parameter type **introduces** type parameter `T`
- Bare `T` (without `$`) **references** the introduced type parameter
- At call sites, type arguments are **inferred** from actual argument types:

  ```sx
  sum(40, 2)      // T = s64 (integer literals default to s64)
  sum(1.5, 2.5)   // T = f32
  ```

- Each unique set of concrete types produces a **separate specialized function** (monomorphization)
- Multiple type parameters are supported: `(a: $T, b: $U) -> T`

### Variadic Functions

Functions can accept a variable number of arguments using `..name: []Type` syntax:

```sx
print :: (fmt: string, ..args: []Any) { ... }
path_join :: (..parts: []string) -> string { ... }
```

- The leading `..` marks the parameter as variadic; the declared type is the slice the body sees (so `..parts: []string` makes `parts` a `[]string` inside).
- The variadic parameter must be the last positional parameter.
- For homogeneous element types (`[]s32`, `[]string`, ...), the call site packs the trailing args into a stack-allocated `[N x T]` and passes a slice over it.
- For `[]Any`, each trailing arg is boxed into `Any` (type tag + payload) before packing; `args[i]` reads back the boxed value.
- For `[]Protocol` (the element type is a protocol, e.g. `..xs: []Show`), each trailing arg is `xx`-erased to a protocol value `{ctx, vtable}` (impl-driven, like `xx`) and packed into a runtime `[N]Protocol`. `xs[runtime_i].method()` then dispatches through the protocol — this is the **runtime** counterpart to the comptime heterogeneous pack `..xs: Protocol`.
- A `..` spread at the call site unpacks an existing slice/array into the variadic tail: `sum(..arr)`.
- The heterogeneous comptime-pack form `..$args: []Type` binds per-position comptime types — see "Variadic Heterogeneous Type Packs" below.

### Variadic Heterogeneous Type Packs

A **pack** is a comptime sequence of per-position-typed arguments. Unlike a slice variadic (`..xs: []T`, one uniform element type, a runtime slice), a pack binds a *distinct* type to each position and exists only at compile time.

The full family of variadic/pack forms and how they differ:

| Form | Element types | Lives at | `xs[i]` index | `xs[i]` yields | `xs.len` |
|---|---|---|---|---|---|
| `..xs: []T` | one uniform `T` | **runtime** (slice) | runtime or comptime | `T` | runtime |
| `..xs: []Any` | mixed, **boxed** to `Any` | **runtime** (slice) | runtime or comptime | `Any` (match/unwrap to use) | runtime |
| `..xs: []P` *(P a protocol)* | mixed, **erased** to `P` `{ctx,vtable}` | **runtime** (slice) | runtime or comptime | `P` (call protocol methods) | runtime |
| `..xs: P` *(pack)* | per-position **concrete**, each conforms to `P` | **comptime** (no runtime value) | comptime only (literal / `inline for` cursor) | the concrete element, **viewed through `P`** | comptime int |
| `..$args` / `..$xs: []Type` | per-position comptime **types** | **comptime** | comptime only | element value/type (reflection) | comptime int |

Key axis — **concrete vs erased, comptime vs runtime**:

- `..xs: P` (pack) keeps each
  element's *concrete* type but is **comptime-only**: `xs[i]` needs a compile-time index (a literal or an `inline for` cursor); a runtime index is an error (a pack has no runtime representation). Use it when you need per-position types (monomorphization, `xs.T` / `xs.value` projection).
- `..xs: []P` (slice of protocol) **erases** each element to the protocol value but is **runtime**: `xs[runtime_i].method()` works in an ordinary loop. Use it when you need to iterate the args at runtime and only the protocol interface matters. It is the runtime counterpart to the pack.

The heterogeneous pack (`..xs: P`) is what powers `map :: (mapper: ..., ..sources: ValueListenable) -> ...`: it accepts any number of trailing args, each some `ValueListenable(T)` for a possibly-different `T`.

A pack is **not a runtime value** — it lowers to N typed positional parameters (zero overhead). The body refers to elements only through the comptime forms below; using the pack name where a runtime value is required is an error (see "Pack as value").

**Element access is through the protocol, not the concrete type.** Although the pack monomorphizes per call shape and each element has a known concrete type, `xs[i]` is viewed **through the constraint protocol**: only the protocol's own interface (its methods, and the projections `xs.T` / `xs.value`) is accessible. Reaching a concrete member that isn't part of the protocol — e.g. `xs[i].v` where `v` is a field of the concrete `IntBox` but not declared on `Show` — is an error, exactly as it would be for a constrained generic `T: Show`. The protocol constraint is enforced (each trailing arg must conform) and bounds what the body may do, regardless of the concrete arg types at any particular call site.

#### Pack operations

| Use | Spelling | Meaning |
|---|---|---|
| Length | `xs.len` | comptime int (field-style, not `len(xs)`) |
| Index | `xs[i]` | i-th element; `i` must be comptime |
| Comptime unroll (index) | `inline for 0..xs.len (i) { ... }` | unrolled loop; cursor `i` is a comptime constant per iteration; not `#for` |
| Projection | `xs.field` | see "Pack projection" |
| Spread → call args | `..xs` / `..xs.field` | expands to N positional args |
| Spread → tuple value | `(..xs)` / `(..xs.field)` | materializes a tuple |
| Spread → tuple type | `(..F(Ts))` / `(..F(Ts.Arg))` | tuple type with per-element type application |
| Spread → callable sig | `Closure(..Ts) -> R` / `Closure(..Ts.Arg) -> R` | positional params of the callable |

#### Pack projection

`xs.field` projects the same member out of every element, preserving order. Resolution is **position-driven** (no cross-namespace shadowing):

- In **type** position, `..xs.field` looks `field` up in the pack constraint's **type-arg** namespace. `ValueListenable :: protocol($T: Type) { ... }` declares type-arg `T`, so `..xs.T` is the pack of element value-types.
- In **value** position, `xs.field` looks `field` up in the constraint's **runtime-field** namespace and yields a *tuple* of the projected values (e.g. `xs.value` → `(xs[0].value, xs[1].value, ...)`).

A protocol that declares a type-arg and a runtime field with the **same name** compiles, but emits a soft warning at the protocol declaration (the human is alerted; resolution still proceeds by position).

#### Tuple parallels

The same spread/projection syntax applies to a **tuple value** whose source is a tuple rather than a pack:

- `..tuple` / `..tuple.field` spreads a tuple's fields into call args.
- `tuple.field` projects `field` out of every element (when all elements have a same-named field), returning a tuple of the projected values.

This lets a pack be materialized once (`stored := (..xs)`) and later re-spread (`f(..stored)`) or re-projected (`stored.value`).

#### Pack of zero (N = 0)

`xs.len == 0` is valid: `inline for` over an empty range doesn't execute, spreads are no-ops, and `(..xs)` is the empty tuple. A library built on packs (e.g.
`map`) must handle N=0 — typically by producing a constant result that never changes. #### Pack as value Because a pack has no runtime representation, using the **bare pack name** where a runtime value is required is a compile error with a context-tailored suggestion: - storing/binding it (`x := xs;`, `self.f = xs;`) → materialize a tuple `(..xs)`; - passing it to a runtime call (`f(xs)`) → declare the parameter as a *slice* variadic `..xs: []P` (a runtime slice) instead of a pack `..xs: P`; - returning it (`return xs;`) → return a tuple `(..xs)` (and make the return type that tuple); - iterating it (`for xs : (x)`, `xs[runtime_i]`) → `inline for 0..xs.len (i)` for a comptime unroll, or take `..xs: []P` for a runtime loop. The recurring runtime escape hatch is the **slice-of-protocol variadic** `..xs: []P` (see "Variadic Functions"): it is the runtime, protocol-erased counterpart to the comptime pack. A pack indexed/iterated/forwarded at runtime is almost always better expressed by declaring `xs` as `..xs: []P` in the first place. #### Storage and protocol conformance To **store** a pack, materialize a tuple: a pack-shaped struct field is tuple-typed, `sources: (..ValueListenable(Ts))`, assigned `self.sources = (..sources)`. To **return** a struct as a protocol value, `xx` requires an explicit impl (protocol erasure is impl-driven, not structural) — e.g. `impl ValueListenable($R) for Combined($R, ..$Ts) { ... }`. 
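The "Pack as value" rules and the tuple-materialization pattern can be sketched together. This is a minimal illustration, not library code: `Show`, `log_all`, and `forward` are hypothetical names, and a protocol `Show` is assumed to be declared elsewhere with impls for the argument types.

```sx
// Hypothetical: `Show` is assumed to be a protocol declared elsewhere.

// Runtime iteration: take a slice-of-protocol variadic, not a pack.
log_all :: (..xs: []Show) {
    for xs: (_, ix) {              // ordinary runtime loop over the erased slice
        print("arg {}\n", ix);
    }
}

// Comptime pack: materialize a tuple before storing or forwarding it.
forward :: (..xs: Show) {
    // stored := xs;               // error: a pack has no runtime representation
    stored := (..xs);              // pack-to-tuple materialization
    log_all(..stored);             // re-spread the tuple into call args
}
```

The pack form keeps per-position concrete types until the tuple is built; the slice form is what the body should declare when all it needs is a runtime loop.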
#### Canonical example ```sx Combined :: struct($R: Type, ..$Ts: []Type) { sources: (..ValueListenable(Ts)); // pack-spread in tuple type position mapper: Closure(..Ts) -> $R; // pack-spread in callable sig value: $R; own_allocator: Allocator; recompute :: (self: *Combined) { new_val := self.mapper(..self.sources.value); // tuple projection + spread if new_val == self.value return; self.value = new_val; } } map :: (mapper: Closure(..sources.T) -> $R, ..sources: ValueListenable) -> ValueListenable($R) { c := context.allocator.alloc(Combined($R, ..sources.T)); c.own_allocator = context.allocator; c.mapper = mapper; c.sources = (..sources); // pack-to-tuple materialization inline for 0..sources.len (i) { // comptime unroll over the pack sources[i].addListener((_) => c.recompute()); } c.value = mapper(..sources.value); // pack spread + projection in a call return xx c; // needs impl ValueListenable for Combined } isReady : ValueListenable(bool) = map( (va, vb, vc) => va and vb > 10 and vc == "cool", a, b, c); // a,b,c : ValueListenable(bool/s32/string) ``` ### Type Inference - `::` bindings infer type from the right-hand side - `:=` bindings infer type from the right-hand side - Explicit annotation overrides inference: `NAME : f64 : 0.9;` - Integer literals default to `s64` - Float literals default to `f32` - Enum literals (`.variant`) infer their enum type from context (expected type) ### Type Conversions **Implicit (widening)** — allowed without annotation: - Integer to wider integer of same signedness (`u8` → `u16`, `s8` → `s32`) - Unsigned to strictly wider signed (`u8` → `s16`) - Any integer to any float (`u8` → `f32`, `s32` → `f64`) - Float to wider float (`f32` → `f64`) - Integer and float literals can convert to any numeric type implicitly **Explicit (narrowing)** — requires `xx` prefix: - Integer to narrower integer (`s32` → `u8`) - Signed to unsigned (`s32` → `u32`) - Float to narrower float (`f64` → `f32`) - Float to any integer (`f64` → `u16`) - Unsigned to 
signed of same or narrower width (`u8` → `s8`)

The `xx` prefix operator marks an expression for auto-conversion to the expected type from context (assignment, declaration, argument, return):

```sx
large: f64 = 5999.5;
x : u16 = xx large;              // f64 → u16
d : u8 = #run xx resolve(5);     // s32 → u8 at compile time
```

Using `xx` where no target type is known from context is a compile error.

---

## 3. Declarations

### Constant Binding (immutable)

```sx
// inferred type
NAME :: value;

// explicit type
NAME : type : value;
```

The `::` operator creates an immutable binding. The value is evaluated at compile time when possible.

Examples:

```sx
SOME_INT    :: 0;              // s64 (integer literals default to s64)
SOME_STR    :: "Hello";        // string
SOME_FLOAT  :: 0.3;            // f32
SOME_DOUBLE : f64 : 0.9;       // f64 (explicit)
SOME_FUNC   :: () => 42;       // () -> s64
SOME_TYPE   :: f64;            // type alias
```

### Variable Binding (mutable)

```sx
// inferred type
name := value;

// explicit type
name : type = value;

// default-initialized (type required)
name : type;

// undefined (type required)
name : type = ---;
```

The `:=` operator creates a mutable binding. The type is inferred unless explicitly annotated.

`name : type;` initializes using the type's defaults: zero for primitives, per-field defaults for structs (see Field Defaults).

`name : type = ---;` leaves the value undefined (uninitialized memory). Reading before writing is undefined behavior.

Examples:

```sx
x := 42;                      // s64, mutable
x := if true then 1 else 2;
z : Foo = .variant2;          // Foo, mutable, explicit type
a : Foo;                      // Foo, default-initialized (a=0, b=42, c=undef)
b : Foo = ---;                // Foo, entirely undefined
```

### Function Definition

```sx
name :: (params) -> return_type {
    body
}
```

- Parameters: `name: type` separated by commas
- Return type: `-> type` (omit for void). A multi-value return is a tuple: `-> (T1, T2)`.
- Body: block of statements; last expression is the implicit return value - No `return` keyword needed (last expression = return value) A trailing `!` in the return type marks the function **failable** — it adds a separate error channel alongside the normal returns (`-> (T, !)`, `-> !`, `-> (T1, T2, !)`). The `!` is not a wrapper around the value; it is one more return slot. See [§12 Error Handling](#12-error-handling). Examples: ```sx compute :: (x: s32) -> s32 { x * x; } main :: () { // void return, no -> annotation } // No-arg void function: main :: () { // ... } ``` #### Default Parameter Values A parameter can declare a default value with `name: type = expr`. When a caller omits the trailing positional argument, the compiler substitutes the default expression at the call site: ```sx greet :: (name: string, prefix: string = "Hello") { print("{} {}!\n", prefix, name); } greet("world"); // prints "Hello world!" greet("world", "Good morning"); // prints "Good morning world!" ``` The default expression is captured as an AST node at parse time and re-lowered fresh at each call site, so runtime expressions like `context.allocator` resolve in the **caller's** scope, not the callee's definition site. This is the mechanism that lets stdlib containers like `List(T)` expose an optional allocator argument that defaults to `context.allocator` without requiring callers to thread one through: ```sx // In std.sx: List :: struct ($T: Type) { append :: (list: *List(T), item: T, alloc: Allocator = context.allocator) { // ... grows via `alloc.alloc(...)` ... } } // Call sites: list.append(42); // alloc = current context.allocator list.append(42, self.parent_allocator); // alloc = the named long-lived owner ``` Defaults are only consulted for **trailing** missing positional args; once a position is provided, all earlier positions must also be provided. There is no named-argument syntax for skipping middle defaults. 
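The trailing-only rule is easiest to see with two defaults. A small sketch, where `connect` and its parameters are hypothetical and exist only for illustration:

```sx
// Hypothetical function with two defaulted trailing parameters.
connect :: (host: string, port: s64 = 8080, retries: s64 = 3) {
    print("{}:{} ({} retries)\n", host, port, retries);
}

connect("localhost");            // port and retries both take their defaults
connect("localhost", 9000);      // only retries takes its default
connect("localhost", 9000, 5);   // no defaults used
// There is no call form that passes `retries` while omitting `port`.
```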
### Enum Definition ```sx Name :: enum { variant1; variant2; } ``` Defines a new enum type with the given variants. Trailing comma is allowed. ### Enum Backing Type An optional backing type can be specified after the `enum` keyword (Jai-style): ```sx Color :: enum u8 { red; green; blue; } Status :: enum s16 { ok; error; timeout; } ``` Syntax: `Name :: enum [flags] [type] { ... }` The backing type must be an integer type (`u8`, `u16`, `u32`, `s8`, `s16`, `s32`, `s64`, etc.). When omitted, the default is `s64`. This is useful for C interop (matching C enum sizes) and memory efficiency. ### Enum Layout Struct For C interop with tagged unions (e.g. SDL_Event), a struct can be used as the backing type to specify the exact memory layout: ```sx // Inline layout SDL_Event :: enum struct { tag: u32; _: u32; payload: [30]u32; } { quit :: 0x100; key_down :: 0x300: SDL_KeyData; key_up :: 0x301: SDL_KeyData; } // Named layout EventLayout :: struct { tag: u32; _: u32; payload: [30]u32; } SDL_Event :: enum EventLayout { quit :: 0x100; key_down :: 0x300: SDL_KeyData; } ``` The layout struct must have: - A field named `tag` — integer type, the discriminant. Its type becomes the enum's backing type. - A field named `payload` — array type, the variant data area. Its size determines the maximum payload capacity. - Any other fields are treated as padding/reserved and positioned by the struct layout. This gives explicit control over the memory layout instead of relying on automatic alignment. The total size equals the struct size. Without a layout struct, tagged enums use `{ tag, [max_payload_size x i8] }` with no padding. ### Enum Flags ```sx Perms :: enum flags { read; // 1 write; // 2 execute; // 4 } ``` Flags can also specify a backing type: ```sx SDL_InitFlags :: enum flags u32 { video :: 0x20; audio :: 0x10; } ``` The `flags` modifier assigns auto power-of-2 values (1, 2, 4, 8, ...) instead of sequential indices (0, 1, 2, ...). 
Flags can be combined with `|` and tested with `&`:

```sx
p : Perms = .read | .write;
if p & .execute { ... }
print("{}\n", p);    // .read | .write
```

Explicit values use `::` syntax (Jai-style):

```sx
WindowFlags :: enum flags {
    vsync :: 64;
    resizable :: 4;
    hidden :: 128;
}
```

Restrictions:

- Flags enum variants cannot have payloads
- `flags` is a contextual identifier, not a keyword

### Bitwise Operators

All bitwise operators work on integer types. `>>` is arithmetic (sign-extending) for signed types and logical (zero-filling) for unsigned types.

```sx
x := 0xFF & 0x0F;   // 15  — AND
y := 1 | 2 | 4;     // 7   — OR
z := 0xFF ^ 0x0F;   // 240 — XOR
w := ~0;            // -1  — NOT
a := 1 << 4;        // 16  — left shift
b := 256 >> 4;      // 16  — right shift
```

Compound assignment forms: `&=`, `|=`, `^=`, `<<=`, `>>=`.

```sx
x := 0xFF;
x &= 0x0F;   // 15
x |= 0xF0;   // 255
x ^= 0x0F;   // 240
y := 1;
y <<= 8;     // 256
y >>= 4;     // 16
```

---

## 4. Expressions

Everything in `sx` is expression-oriented where possible.

### Operator Precedence

| Prec | Operators | Notes |
|------|-----------|-------|
| 9 (highest) | `*`, `/`, `%` | multiplication, division, modulo |
| 8 | `+`, `-` | addition, subtraction |
| 7 | `<<`, `>>` | shifts |
| 6 | `<`, `<=`, `>`, `>=`, `==`, `!=` | comparisons (chainable) |
| 5 | `&` | bitwise AND |
| 4 | `^` | bitwise XOR |
| 3 | `\|` | bitwise OR |
| 2 | `and` | logical AND (short-circuit) |
| 1 (lowest) | `or` | logical OR (short-circuit) / failable fallback (§12) |

`try` is a unary prefix in the same tier as `xx` / `@` / `-` / `!` / `~` (tighter than every binary operator, including `or`); `catch` is a postfix attached to a failable expression. So `try foo() or try boo()` parses as `(try foo()) or (try boo())`. See [§12 Error Handling](#12-error-handling).

### Arithmetic

Standard infix: `+`, `-`, `*`, `/` with usual precedence (`*`/`/` before `+`/`-`).

```sx
x * x
x + 2
```

### Chained Comparisons

Comparison operators can be chained. Each operand is evaluated exactly once.
```sx
0 <= x <= 100       // equivalent to: 0 <= x and x <= 100
1000 > x >= -100    // equivalent to: 1000 > x and x >= -100
a == b == c         // equivalent to: a == b and b == c
```

Mixed operators are allowed: `a < b <= c > d` means `a < b and b <= c and c > d`.

### Logical Operators

`and` and `or` are short-circuit boolean operators. The right operand is not evaluated if the left operand determines the result.

```sx
if 0 <= x <= 100 and 0 <= y <= 100 {
    print("contained");
}
```

### If Expression (inline form)

```sx
if condition then consequent else alternate
```

Both branches are single expressions. The whole form produces a value.

```sx
x := if true then 1 else 2;
```

The `else` branch is optional. Without it, the form is a statement (no value):

```sx
if i == 2 then continue;
if done then break;
if err then return;
```

### If Expression (block form)

```sx
if condition {
    stmts
} else {
    stmts
}
```

Each branch is a block. The last expression in each block is the branch's value. Can be used inline within other expressions:

```sx
y := x + if false { 7; } else { 12; };
```

### Pattern Matching

```sx
if subject == {
    case pattern: body
    case pattern: body
    else: body          // optional default arm
}
```

Matches `subject` against each `case`. Patterns can be:

- **Enum literals**: `.variant` — matches a specific enum variant.
- **Integer/bool literals**: `42`, `true` — matches a specific value.
- **Type categories**: `int`, `struct`, `enum`, and the other category keywords listed under "Type Category Matching" — matches all types in that category (used with `type_of` values). C-style untagged unions cannot be matched by category.

`break` exits a case arm without producing a value. The optional `else:` arm matches when no `case` pattern matches.
```sx if z == { case .variant1: break; case .variant2: print("z: {z}"); else: print("unknown"); } ``` #### Type Category Matching When switching on a `Type` value (from `type_of`), category keywords match all registered types of that category: ```sx type := type_of(val); if type == { case int: result = int_to_string(xx val); case struct: result = struct_to_string(cast(type) val); case enum: result = enum_to_string(cast(type) val); } ``` Available categories: `int`, `float`, `bool`, `string`, `struct`, `enum`, `vector`, `array`, `slice`, `pointer`, `type`. > Note: `case enum:` matches both payload-less enums and tagged enums (enums with payloads). C-style untagged unions are not registered with the Any type system and cannot be matched by category. Inside a category arm, `cast(type) val` performs **runtime generic dispatch**: the compiler generates a switch over all types in the category, monomorphizing the callee for each concrete type. ### While Loop ```sx while condition { body } ``` Repeats `body` as long as `condition` is true. `break;` exits the loop. `continue;` skips to the next iteration. ```sx i := 0; while i < 10 { i += 1; if i == 5 { continue; } if i == 8 { break; } print("{i}\n"); } ``` ### For Loop #### Range form ```sx for start..end: (i) { } // counting loop, cursor `i` (s64), `end` exclusive for start..end { } // no cursor — body runs `end - start` times inline for start..end: (i) { } // comptime-unrolled; `i` is a comptime constant per iteration ``` `start` and `end` are `s64` expressions; the loop counts `start, start+1, …, end-1`. The cursor is optional — omit `: (i)` entirely when the body doesn't need the index (`for 0..n { … }`). When present it is introduced by `:`, matching the collection form (`for xs: (x)`). The `inline` variant requires comptime-known bounds and unrolls the body once per value, binding the cursor as a compile-time constant (so it can index a pack: `inline for 0..xs.len: (i) { xs[i].m() }`). 
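A short sketch of the two runtime range forms described above; the variable names are illustrative only:

```sx
sum := 0;
for 0..5: (i) {        // i takes the values 0, 1, 2, 3, 4 (`end` is exclusive)
    sum += i;
}
// sum is now 10

for 0..3 {             // no cursor: the body runs 3 times
    print("tick\n");
}
```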
`break;` / `continue;` work in the runtime form. #### Collection form ```sx for iterable: (elem) { } // element alias (no copy) for iterable: (elem, ix) { } // element + index for iterable: (_, ix) { } // index only for iterable: (*elem) { } // element pointer (*T) — by-reference for iterable: (*elem, ix) { } // element pointer + index ``` Iterates over arrays and slices. The capture clause after `:` binds loop variables: - The first name is the element capture (non-reassignable alias into the array/slice) - The optional second name is the index (s64, starting at 0, also non-reassignable) - Use `_` to discard a capture The element capture is a direct alias — reads and field writes go to the original array element. Direct reassignment of the capture (`elem = x`) is a compile error. **By-reference capture (`*elem`)** binds the element to a *pointer* into the collection (`*T`) instead of a value — no per-element copy. It GEPs straight into the array/slice backing, so: - Passing it onward is zero-copy — `f(elem)` where `f` takes `*T` hands over the pointer, not a copy. - Writes through it land in the original: `elem.* = v` (or `elem.field = v`). - In a value position the pointer auto-derefs to the element: `elem + 1` reads the value, and `if elem == { … }` matches the pointee (a pointer subject matches through the deref). Where a `*T` is expected, the pointer is passed as-is. ```sx events := plat.poll_events(); // []Event for events: (*ev) { // ev : *Event — no copy pipeline.dispatch_event(ev); // passes the pointer } ``` `break;` exits the loop. `continue;` skips to the next iteration. ```sx arr : [5]s32 = .[1, 2, 3, 4, 5]; for arr: (val, ix) { if ix == 2 { continue; } print("{}\n", val); } ``` ### Lambda ```sx (params) => expr (params) -> return_type => expr ``` Anonymous function. Produces a function value. Supports the same parameter features as named functions: `$` generic type params, `..` variadic params, and optional return type annotation. 
```sx
SOME_FUNC :: () => 42;               // () -> s64
double :: (x: $T) -> T => x + x;     // generic lambda with return type
```

### Closures

A **closure** is a function bundled with captured state. It is represented as a fat pointer `{ fn_ptr, env }` (16 bytes), unlike a bare function pointer, which is 8 bytes.

#### Closure Type

```sx
Closure(param_types) -> R      // e.g. Closure(s32, s32) -> s32
Closure(param_types)           // void return: Closure(s64) means Closure(s64) -> void
?Closure(s32) -> s32           // optional closure (null = none)
Closure(..Ts) -> R             // pack-expanded params (see Variadic Heterogeneous Type Packs)
```

#### Creating Closures — `closure()` intrinsic

```sx
offset := 50;
f := closure((x: s32) -> s32 => x + offset);     // expression body
g := closure((x: s32) -> s32 {                   // block body
    if x < 0 { return 0; }
    return x + offset;
});
```

The `closure()` intrinsic:

1. Analyzes the lambda body for free variables (variables from outer scope)
2. Allocates an env struct on the heap containing the captured values, via `context.allocator` (by default a malloc/free-backed GPA; see "Memory")
3. Generates a trampoline function with signature `(env: *void, params...) -> R`
4. Returns a `Closure` value `{ trampoline, env_ptr }`

**Capture semantics**: capture by value (snapshot at creation time). Mutating the original variable after creating the closure does not affect the captured value.

```sx
n := 10;
f := closure((x: s64) -> s64 => x + n);
n = 999;
print("{}\n", f(5));   // 15, not 1004
```

#### Calling Closures

Closures are called with normal function call syntax:

```sx
result := f(10);
```

The compiler prepends the env pointer to the argument list and does an indirect call through the fn_ptr.

#### Auto-Promotion

A bare function can be implicitly promoted to a `Closure` where one is expected. The compiler generates a static thunk that ignores the env parameter and pairs it with a null env pointer.
```sx double :: (x: s32) -> s32 { return x * 2; } apply :: (f: Closure(s32) -> s32, x: s32) -> s32 { return f(x); } apply(double, 10); // double auto-promoted to Closure ``` #### Factory Functions Functions can return closures, enabling the factory pattern: ```sx make_adder :: (n: s32) -> Closure(s32) -> s32 { return closure((x: s32) -> s32 => x + n); } add5 := make_adder(5); print("{}\n", add5(100)); // 105 ``` #### Optional Closures `?Closure` is supported for nullable callbacks. Uses `fn_ptr == null` as the none sentinel (zero overhead — same layout as `Closure`). ```sx Button :: struct { label: string; on_click: ?Closure(s64) -> void; } btn := Button.{ label = "OK", on_click = null }; if handler := btn.on_click { handler(1); } ``` #### Memory Closure env is allocated via `context.allocator`. The compiler auto-initializes `context` with a default GPA (malloc/free wrapper) at the start of `main()`. Use `push Context` to override with a custom allocator. Auto-promoted closures have a null env and require no allocation. ```sx f := closure((x: s64) -> s64 => x + 10); // env allocated via default GPA print("{}\n", f(5)); ``` ### Function Call ```sx callee(args) ``` ```sx compute(6) print("hello") ``` ### UFCS (Uniform Function Call Syntax) ```sx object.func(args) // equivalent to func(object, args) ``` When `object.func(args)` is encountered and `func` is not a field of `object`'s type, the compiler rewrites the call to `func(object, args)`. This enables method-like syntax without dedicated method declarations. ```sx Point :: struct { x: s32; y: s32; } point_sum :: (p: Point) -> s32 { p.x + p.y; } p := Point.{3, 4}; print("{}\n", p.point_sum()); // calls point_sum(p) → 7 ``` UFCS works with pointer receivers (auto-deref applies) and generic functions. If the field name exists as both a struct field and a free function, the struct field takes priority. 
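The field-priority rule matters mostly when a struct stores a callable under the same name as a free function. A hedged sketch, where `Box`, `factor`, and `scale` are hypothetical names chosen for illustration:

```sx
Box :: struct {
    factor: s64;
    scale: Closure(s64) -> s64;      // a field that happens to be callable
}

// Free function with the same name as the field.
scale :: (b: Box, x: s64) -> s64 { x * b.factor; }

b := Box.{ factor = 3, scale = closure((x: s64) -> s64 => x + 1) };

b.scale(10);     // field lookup wins: calls the stored closure
scale(b, 10);    // a direct call still reaches the free function
```

Under the rule as stated, UFCS rewriting is only attempted when the name is not a field, so the free function stays reachable through the direct call form.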
#### UFCS Aliases The `ufcs` keyword creates a name alias for a function, decoupling the method name from the function name: ```sx arena_alloc :: (arena: *Arena, size: s64) -> *void { ... } alloc :: ufcs arena_alloc; myArena.alloc(42); // calls arena_alloc(myArena, 42) alloc(myArena, 42); // also works as a direct call ``` This avoids the naming redundancy of `myArena.arena_alloc(42)`. #### Tuple UFCS Splatting When a tuple is used as the receiver of a UFCS call, its elements are unpacked as leading arguments: ```sx num_add :: (a: s64, b: s64) -> s64 { a + b; } add :: ufcs num_add; (40, 2).add(); // splats to num_add(40, 2) → 42 (40,).add(2); // partial: num_add(40, 2) → 42 40.add(2); // normal UFCS: num_add(40, 2) → 42 ``` With more arguments: ```sx compute :: (a: s64, b: s64, c: s64, d: s64) -> s64 { a + b * c - d; } calc :: ufcs compute; (1, 2, 3, 4).calc(); // full splat → compute(1, 2, 3, 4) (1, 2).calc(3, 4); // partial splat → compute(1, 2, 3, 4) 1.calc(2, 3, 4); // normal UFCS → compute(1, 2, 3, 4) ``` ### Pipe Operator The pipe operator `|>` inserts the left-hand side as the first argument of the right-hand side call. It is desugared at parse time. ```sx a |> f(b, c) // → f(a, b, c) a |> f // → f(a) a |> f(b) |> g(c) // → g(f(a, b), c) ``` The pipe is left-associative with the lowest precedence of all binary operators, so expressions like `x + 1 |> f(2)` are parsed as `f(x + 1, 2)`. This is especially useful with namespaced imports: ```sx pkg :: #import "modules/math"; 3 |> pkg.add(4) // → pkg.add(3, 4) → 7 3 |> pkg.add(4) |> pkg.mul(2) // → pkg.mul(pkg.add(3, 4), 2) → 14 ``` ### Field Access ```sx object.field ``` Used for module access (`std.print`) and struct member access. ### Enum Literal ```sx .variant_name ``` The enum type is inferred from context (expected type from declaration or parameter). --- ## 5. Statements Statements are terminated by `;`. 
- **Declaration**: `name :: value;` / `name := value;`
- **Assignment**: `name = value;` / `name += value;` (and other compound assignments). Also supports field targets: `obj.field = value;`
- **Multi-target assignment**: `a, b = b, a;` — all RHS values are evaluated before any stores, enabling swaps without temporaries. Target count must equal value count. Only plain `=` is supported (no compound operators). Each target must be a valid lvalue (variable, field, index, dereference).
- **Expression statement**: `expr;` — evaluates the expression (last in a block = return value)
- **Return**: `return expr;` — returns from the enclosing function with the given value. `return;` returns void.
- **Break**: `break;` — exits a match arm or the enclosing `while` / `for` loop
- **Continue**: `continue;` — skips to the next iteration of the enclosing `while` / `for` loop
- **Defer**: `defer expr;` — defers execution of `expr` until the enclosing block exits (LIFO order)
- **Push**: `push expr { body }` — scoped context override (see below)

### `push` Statement and Implicit `context`

The `push` statement temporarily overrides the global `context` variable for the duration of a block. The previous context is saved before the block and restored after it exits.

```sx
push Context.{ allocator = arena.allocator(), data = xx @logger } {
    handle(client);
    // inside here, `context` has the new value
}
// context is restored to its previous value here
```

**`Context` struct** — defined in `std.sx`:

```sx
Context :: struct {
    allocator: Allocator;   // active allocator for dynamic allocation
    data: *void;            // opaque pointer for application-specific data
}

context : Context = ---;    // global mutable variable
```

The compiler auto-initializes `context` with a default GPA (malloc/free wrapper) at the start of `main()`.

Inside the pushed block, any code (including called functions) can read `context.allocator` and `context.data`. The standard library's `cstring()`, `alloc_slice()`, and `closure()` all allocate via `context.allocator`.
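Because `closure()` allocates through `context.allocator`, a `push` block changes where a closure's env lives. A minimal sketch, assuming an existing `arena` value that exposes `allocator()` as in the example above; the other names are illustrative:

```sx
n := 1;
f := closure((x: s64) -> s64 => x + n);          // env allocated from the current context.allocator

push Context.{ allocator = arena.allocator(), data = context.data } {
    g := closure((x: s64) -> s64 => x + n);      // env allocated from the arena
}
// `context` is back to its previous value here
```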
`push` requires a global mutable variable named `context` to be in scope (provided by `std.sx`). --- ## 6. Blocks, Scoping, and Implicit Returns A block `{ ... }` contains zero or more statements. The last expression in a block is its value (implicit return). In function bodies, the last expression becomes the return value: ```sx compute :: (x: s32) -> s32 { x * x; // this is returned } ``` ### Scope Blocks Bare blocks can be used as statements to introduce a new lexical scope. Variables declared inside a scope block are local to that block. No trailing `;` is required. ```sx main :: () { x := 42; { x := 6; // shadows outer x print("inner: {x}"); // prints 6 } print("outer: {x}"); // prints 42 } ``` ### Variable Shadowing A variable declaration (`name :=`) inside an inner scope shadows any variable with the same name from outer scopes. The outer variable is restored when the inner scope exits. ### Defer `defer expr;` schedules `expr` to execute when the enclosing scope block exits. Multiple defers in the same scope execute in reverse order (LIFO). ```sx { defer print("second"); defer print("first"); } // prints: first, then second ``` --- ## 7. Built-in Functions Built-in functions are declared in `std.sx` with the `#builtin` suffix, which tells the compiler to generate the implementation internally rather than looking for a function body. ### I/O - `out(str: string) -> void` — write a string to standard output - `print(fmt: string, ..args: []Any)` — formatted print. Parses `{}` placeholders in the format string and substitutes arguments. When all argument types are statically known, the compiler specializes the call at compile time (no `Any` boxing). 
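A brief usage sketch of the two I/O builtins, assuming `std.sx` is imported flat so that `print` and `out` are in scope; the variable names are illustrative:

```sx
name  := "sx";
count := 3;
print("{} has {} items\n", name, count);   // two `{}` placeholders, two arguments
out("done\n");                              // plain string, no formatting
```

Both argument types are statically known here, so per the description above the call would be specialized at compile time rather than boxing into `Any`.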
### Math

- `sqrt(x: $T) -> T` — square root (maps to LLVM intrinsic)
- `sin(x: $T) -> T` — sine (maps to LLVM intrinsic)
- `cos(x: $T) -> T` — cosine (maps to LLVM intrinsic)

### Memory

- `malloc(size: s64) -> *void` — allocate `size` bytes of heap memory
- `free(ptr: *void) -> void` — free previously allocated memory
- `memcpy(dst: *void, src: *void, size: s64) -> *void` — copy `size` bytes from `src` to `dst`
- `memset(dst: *void, val: s64, size: s64) -> void` — fill `size` bytes at `dst` with `val`
- `size_of($T: Type) -> s64` — size of type `T` in bytes

### Type Introspection

- `type_of(val: $T) -> Type` — returns the runtime type tag of a value
- `type_name($T: Type) -> string` — returns the name of type `T` as a string (e.g., `"Point"`)
- `field_count($T: Type) -> s64` — returns the number of fields (struct), variants (enum), or elements (vector) in type `T`
- `field_name($T: Type, idx: s64) -> string` — returns the name of the `idx`-th field (struct) or variant (enum) of type `T`
- `field_value(s: $T, idx: s64) -> Any` — returns the `idx`-th field (struct) or element (vector) of `s`, boxed as `Any`
- `field_value_int($T: Type, idx: s64) -> s64` — returns the integer value of the `idx`-th enum variant
- `field_index($T: Type, val: T) -> s64` — returns the sequential variant index for an explicit enum value (reverse of `field_value_int`). Returns `-1` if no variant matches.
- `is_flags($T: Type) -> bool` — returns `true` if `T` is a flags enum (declared with `enum flags`)

### Type Conversion

- `cast(Type) expr` — prefix operator that converts `expr` to `Type`. Examples: `cast(s32) 3.14`, `cast(f64) n`. When `Type` is a runtime `Type` value inside a type-category match arm, the compiler generates a dispatch switch over all types in the category, monomorphizing the callee for each concrete type.

### Vectors

- `Vector($N: int, $T: Type) -> Type` — returns an LLVM vector type of `N` elements of type `T`

---

## 8.
Compile-time Evaluation ### `#run` Directive `#run expr` evaluates `expr` at compile time using lazy JIT execution. It can appear in two contexts: **Compile-time constants** — bind a compile-time value to a name: ```sx compute :: (x: s32) -> s32 { x * x; } x :: #run compute(5); // x = 25, evaluated at compile time ``` Comptime globals are resolved lazily: the JIT executes only when the value is first referenced during code generation. Chained dependencies are resolved automatically. **Side effects** — execute code at compile time for its side effects: ```sx #run print("compiling..."); ``` ### `#insert` Directive `#insert expr;` evaluates `expr` at compile time to obtain a string, then parses and compiles that string as inline code at the insertion point. ```sx generate :: () -> string { return "print(\"hello from the other side\");"; } main :: () { #insert #run generate(); // equivalent to: print("hello from the other side"); } ``` The inserted string must contain valid `sx` statements (including semicolons). The statements are parsed and compiled in the same scope as the `#insert` site. Variables created by one `#insert` are visible to subsequent `#insert` directives in the same function. ### Comptime Call Evaluation When a `::` constant binding is initialized with a function call and all arguments are comptime-known (literals or other `::` constants), the compiler attempts to evaluate the entire call at compile time using the bytecode VM. If evaluation succeeds, the result is baked into the binary as a static constant with zero runtime overhead. ```sx body :: "

Hello

"; response :: format("HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}", body.len, body); // response is a static string constant — no runtime allocation ``` This works for any function, not just `format`. The mechanism is general: the VM compiles the function body (including `#insert` directives, variadic `..args: []Any` args, and calls to other functions) and executes it entirely at compile time. If the VM encounters something it cannot evaluate (e.g., foreign function calls, unsupported operations), it silently falls through to runtime codegen. ### Build Configuration The `BuildOptions` struct (from `modules/compiler.sx`) provides compile-time build configuration via `#run`. Methods on `BuildOptions` are compiler builtins intercepted during compilation — they have no runtime cost. ```sx #import "modules/compiler.sx"; configure_build :: () { opts := build_options(); opts.add_link_flag("-lm"); opts.set_output_path("out/my_program"); inline if OS == .wasm { opts.set_output_path("sx-out/wasm/app.html"); opts.add_link_flag("-sUSE_SDL=3"); opts.add_link_flag("-sALLOW_MEMORY_GROWTH=1"); } } #run configure_build(); ``` **API:** | Method | Description | |--------|-------------| | `build_options()` | Returns a `BuildOptions` value for the current compilation | | `opts.add_link_flag(flag)` | Appends a linker flag (merged with CLI flags) | | `opts.set_output_path(path)` | Sets the output binary path (overridden by CLI `-o`) | Build flags from `add_link_flag` are merged with any flags passed on the command line. Duplicate library flags (e.g., `-lSDL3` from multiple imports) are automatically deduplicated. 
### Compiler Constants

The `modules/compiler.sx` module provides compile-time constants set by the compiler based on the target:

| Constant | Type | Description |
|----------|------|-------------|
| `OS` | `OperatingSystem` | Target OS: `.macos`, `.linux`, `.windows`, `.wasm`, `.unknown` |
| `ARCH` | `Architecture` | Target arch: `.aarch64`, `.x86_64`, `.wasm32`, `.unknown` |
| `POINTER_SIZE` | `s64` | Pointer width in bytes (8 for 64-bit, 4 for wasm32) |

These are used with `inline if` for compile-time conditional compilation:

```sx
inline if OS == .wasm {
    // Only compiled when targeting wasm
}

inline if POINTER_SIZE == 8 {
    // Only compiled on 64-bit platforms
}
```

---

## 9. Modules / Imports

### `#import` Directive

The `#import` directive brings declarations from another `.sx` file or directory into the current file.

**Flat import** — splices all declarations from the imported file into the current scope:

```sx
#import "modules/std/math.sx";
```

**Namespaced import** — wraps all declarations under a namespace name:

```sx
std :: #import "modules/std.sx";
```

**Directory import** — when the path refers to a directory, all `.sx` files in that directory are aggregated into a single module:

```sx
pkg :: #import "modules/testpkg";   // namespaced — all .sx files merged under pkg
#import "modules/testpkg";          // flat — all declarations spliced into scope
```

Directory imports scan only the top level of the specified directory (non-recursive). Files are processed in alphabetical order for deterministic builds. Files within the directory may `#import` each other or external files.

Namespaced declarations are accessed with dot notation:

```sx
std.print("hello");
```

### Import Resolution

- Imports are resolved after parsing and before code generation.
- Paths are first resolved relative to the directory of the file containing the `#import`. If not found, they fall back to the working directory (cwd).
  This allows modules in subdirectories to import shared modules using the same paths as the root file.
- If the path resolves to a file, it is imported directly. If it resolves to a directory, all `.sx` files in that directory are aggregated.
- Nested imports are supported (imported files may themselves contain `#import`).
- Circular imports are detected and silently skipped (each file is imported at most once).
- Generic functions in namespaced imports are supported (e.g., `std.mul(5, 2)` where `mul` is generic).

**Example:** Given this project layout:

```
project/
  modules/std.sx
  modules/math/
    math.sx
    vector3.sx   ← contains: #import "modules/std.sx";
  main.sx        ← contains: #import "modules/std.sx";
```

When compiling from `project/`, both `main.sx` and `modules/math/vector3.sx` can use `#import "modules/std.sx"` — the root file resolves it relative to its own directory, and the nested file falls back to resolving relative to cwd.

### Intra-module References

Functions within a namespaced import can call each other without the namespace prefix. When generating code for a namespaced module, unresolved function names are automatically tried with the namespace prefix.

### Example

```sx
// modules/std/math.sx
mul :: (base: $T, exp: T) -> T { base * exp; }

// modules/std.sx
out :: (str: string) -> void #builtin;

// main.sx
std :: #import "modules/std.sx";
#import "modules/std/math.sx";

main :: () -> s32 {
    std.out("hello there");
    mul(5, 2);
}
```

---

## 10. CLI & Cross-Compilation

### Commands

```
sx run      Compile and run
sx build    Compile to binary
sx lsp      Start language server (LSP)
```

### Options

| Flag | Description |
|------|-------------|
| `--target ` | Target triple or shorthand (default: host) |
| `--cpu ` | CPU name (default: generic) |
| `--opt ` | Optimization: `none`/`0`, `less`/`1`, `default`/`2`, `aggressive`/`3` |
| `-o ` | Output path (overrides `set_output_path`) |

### Target Shorthands

The `--target` flag accepts shorthand aliases for common targets:

| Shorthand | Expands to |
|-----------|-----------|
| `wasm`, `emscripten` | `wasm32-unknown-emscripten` |
| `macos`, `macos-arm` | `aarch64-apple-macos` |
| `macos-x86` | `x86_64-apple-macos` |
| `linux`, `linux-x86` | `x86_64-unknown-linux-gnu` |
| `linux-arm` | `aarch64-unknown-linux-gnu` |
| `windows` | `x86_64-windows-msvc` |

Full triples are also accepted and passed through as-is.

---

## 10.5 Bundling and Post-Link Callbacks

Platform-specific bundling (Apple `.app`, Android `.apk`) lives in [library/modules/platform/bundle.sx](library/modules/platform/bundle.sx). The compiler shrinks to: parse → IR → codegen → link → invoke an sx function. Bundling, codesigning, manifest generation, Java compilation (via `javac` + `d8`), etc. are all sx code running in the IR interpreter post-link.

### Discovery

Users opt in **explicitly** from their own `#run` block:

```sx
#import "modules/compiler.sx";
#import "modules/platform/bundle.sx";

#run {
    opts := build_options();
    opts.set_bundle_path("MyApp.app");
    opts.set_bundle_id("com.example.app");
    opts.set_post_link_callback(bundle_main);
}
```

Programs that don't register a callback simply don't bundle — the linked binary is produced and nothing further runs. There is no stdlib default and no implicit prelude.

Two registration forms:

| Setter | Behavior |
|--------|----------|
| `BuildOptions.set_post_link_callback(cb: () -> bool)` | First-class function value. Preferred. |
| `BuildOptions.set_post_link_module(name: [:0]u8)` | Name-based fallback; compiler resolves `.bundle_main` post-link. |

CLI `--bundle ` / `--apk ` are transitional aliases: if `bundle_path` is set and no callback was registered, the compiler auto-falls-back to `post_link_module = "platform.bundle"`. The sx bundler reads `bundle_path()` regardless of which flag the user used.

The callback returns `false` to fail the build.

### BuildOptions surface

`BuildOptions` is a `#compiler` struct in [library/modules/compiler.sx](library/modules/compiler.sx). Setters accumulate config in the compiler's `BuildConfig`; accessors read it back inside the post-link callback.

| Method | Read / write | Purpose |
|--------|--------------|---------|
| `add_link_flag(flag)` | write | extra linker flag |
| `add_framework(name)` | write | `-framework ` (Apple) |
| `set_output_path(path)` | write | linked binary path |
| `set_wasm_shell(path)` | write | custom WASM shell template |
| `add_asset_dir(src, dest)` | write | bundle a directory of runtime assets |
| `set_post_link_callback(cb)` | write | first-class callback (preferred) |
| `set_post_link_module(name)` | write | name-based callback fallback |
| `set_bundle_path(path)` | write | `.app` / `.apk` output |
| `set_bundle_id(id)` | write | iOS `CFBundleIdentifier` / Android package |
| `set_codesign_identity(name)` | write | Apple signing identity (`-` = ad-hoc) |
| `set_provisioning_profile(path)` | write | iOS device `.mobileprovision` |
| `set_manifest_path(path)` | write | Android AndroidManifest.xml override |
| `set_keystore_path(path)` | write | Android keystore override |
| `binary_path()` | read | path of the freshly-linked binary |
| `bundle_path() / bundle_id()` | read | mirror of the setters |
| `codesign_identity() / provisioning_profile()` | read | Apple codesign params |
| `manifest_path() / keystore_path()` | read | Android overrides |
| `target_triple()` | read | canonicalized target triple |
| `is_macos() / is_ios() / is_ios_device() / is_ios_simulator() / is_android()` | read | per-target predicates |
| `framework_count() / framework_at(i)` | read | linker `-framework` names (for `Frameworks/` embed) |
| `framework_path_count() / framework_path_at(i)` | read | linker `-F` search paths |
| `jni_main_count() / jni_main_foreign_path_at(i) / jni_main_java_source_at(i)` | read | `#jni_main` emissions for the APK bundler |
| `asset_dir_count() / asset_dir_src_at(i) / asset_dir_dest_at(i)` | read | iterate registered asset trees |

Returned strings are `""` when unset; integer counts are `0`. Accessors that read after-the-fact (`binary_path`, `bundle_path`, etc.) return the value that was either set in `#run` or forwarded from a CLI flag.

### `fs.sx` and `process.sx` stdlib modules

The bundler is implemented in sx; its calls into `fs.sx` / `process.sx` work both at runtime through the dynamic linker and at `#run` / post-link through the host-FFI dispatch in [src/ir/host_ffi.zig](src/ir/host_ffi.zig) (a `dlsym(RTLD_DEFAULT)` + arity-switched cdecl trampoline).

[library/modules/fs.sx](library/modules/fs.sx) (POSIX backend):

| Function | Purpose |
|----------|---------|
| `open_file(path, mode) -> ?File` | open a handle |
| `read_file(path) -> ?string` | one-shot slurp |
| `write_file(path, data) -> bool` | create / truncate / write |
| `append_file(path, data) -> bool` | append |
| `copy_file(src, dst) -> bool` | byte copy (streamed through 64 KB buffer) |
| `delete_file(path) -> bool` | `unlink` |
| `delete_dir(path) -> bool` | `rmdir` (empty only) |
| `create_dir(path) -> bool` / `create_dir_all(path) -> bool` | `mkdir` / `mkdir -p` |
| `move(old, new) -> bool` | `rename` |
| `set_mode(path, mode) -> bool` | `chmod` |
| `exists(path) -> bool` | `access(F_OK)` |
| `basename(p) -> string` / `dirname(p) -> string` | text-only path split |

`File` is a small value-typed handle wrapping a POSIX fd, with methods `is_valid / close / read / write / seek`.
Higher-level helpers (`read_file`, `write_file`, `copy_file`) bypass `*File` methods and call libc directly so they remain callable from the post-link IR interpreter (which doesn't yet handle `*Self` method dispatch on locally-unwrapped optionals).

[library/modules/process.sx](library/modules/process.sx) (POSIX backend):

| Function | Purpose |
|----------|---------|
| `run(cmd: [:0]u8) -> ?ProcessResult` | `popen` shell command, capture stdout + exit |
| `env(name: [:0]u8) -> ?string` | `getenv` (null if unset) |
| `find_executable(name) -> ?string` | `command -v ` via shell |

`ProcessResult` is `{ exit_code: s32, stdout: string }`. The post-link bundler invokes `codesign`, `plutil`, `security`, `aapt2`, `javac`, `d8`, `keytool`, `apksigner`, etc. through `run`.

### Apple `.app` flow (`bundle.sx::bundle_main`)

`bundle_main` branches on `is_android()` first; the remaining body is the Apple path. Per target:

| Step | macOS | iOS sim | iOS device |
|------|-------|---------|------------|
| Stage `` (rm-rf + mkdir + copy binary + set exe bit) | ✓ | ✓ | ✓ |
| Write `Info.plist` | minimal `CFBundle*` | + `UIDeviceFamily` + `LSRequiresIPhoneOS` + `UIApplicationSceneManifest` + `DTPlatformName=iPhoneSimulator` | + same with `DTPlatformName=iPhoneOS` |
| Embed provisioning profile to `/embedded.mobileprovision` | — | — | when `provisioning_profile()` set |
| Embed `Frameworks/.framework/` (recursive `cp -R` per `-F` search path) | — | when present | when present |
| Extract entitlements (`security cms -D` + `plutil -extract Entitlements` + `plutil -extract ApplicationIdentifierPrefix.0` + `plutil -replace application-identifier` resolving `.*` → `.`) | — | — | when `provisioning_profile()` set |
| Codesign | ad-hoc (`-`) | ad-hoc | `--sign --entitlements ` |

### Android `.apk` flow (`bundle.sx::android_bundle_main`)

The Android branch:

1. **Discover SDK** — `$ANDROID_HOME` → `$ANDROID_SDK_ROOT` → `$HOME/Library/Android/sdk`.
2. **Find highest `build-tools` / `platforms` subdir** — `process.run("ls -1 | sort -V | tail -1")`.
3. **Stage `.stage/lib/arm64-v8a/`** — `copy_file` from the linked output.
4. **Manifest** — user-supplied via `set_manifest_path()`, or synthesized:
   - `NativeActivity` shape when no `#jni_main` is declared.
   - `#jni_main` Activity shape with `android:name=""` + `android:hasCode="true"` otherwise.
5. **Compile `#jni_main` Java sources** — write each entry's `java_source` to `/java//.java`, run `javac --release 11 -classpath ` to `/classes/`, run `d8 --release --lib --output ` to produce `/classes.dex`. `javac` discovered via `$JAVA_HOME/bin/javac` then `command -v javac`.
6. **`aapt2 link -I --manifest -o `**.
7. **Append archives** — `zip -q -r lib/`, then `zip -q classes.dex` (if dex was produced), then `zip` each registered asset dir at its `dest` path.
8. **`zipalign -f 4 `**.
9. **Debug keystore** — `keytool -genkeypair -keystore ` on first use; defaults match Android Studio (`androiddebugkey` alias, password `android`).
10. **`apksigner sign --ks --ks-pass pass:android --key-pass pass:android --ks-key-alias androiddebugkey --out `**.
11. Clean intermediates (keep `.stage/` for inspection if it lasts the build).

---

## 11. Program Structure

A program is a sequence of top-level declarations and `#import` directives. Execution begins at `main`.

```sx
main :: () {
    // entry point
}
```

`main` takes no arguments. Its return type may be any of: void (`()`, `-> ()`, `-> void`, or no annotation), an integer type (POSIX exit code), `-> !` (pure failable), or `-> (int_type, !)` (value-carrying failable). The exit code is `0` for void / `-> !` success, the integer return truncated to `u8` otherwise. An error that escapes a failable `main` prints the unhandled-error header + return trace to stderr and exits `1`. See [§12 Error Handling](#12-error-handling).

---

## 12. Error Handling

sx models recoverable errors as a **separate return channel**, not a wrapped result type.
A trailing `!` in a function's return type adds one extra return slot — a `u32` error tag — alongside the normal value slots. This keeps sx's native multi-return ergonomics: `-> (s32, s64, !)` is a function returning two values *and* an error, with no tuple-in-a-wrapper.

This section is the canonical surface reference. The design rationale, trade-offs, and implementation breakdown live in `current/PLAN-ERR.md`.

### Failable signatures

```sx
parse_digit :: (s: string) -> (s32, !) { ... }              // one value + error
parse :: (s: string) -> (s32, s64, !) { ... }               // multi-value + error
must_init :: () -> ! { ... }                                // pure failable, no value
divide :: (a: s32, b: s32) -> (s32, !MathErr) { ... }       // named set
```

The `!` is always the **last** slot. `0` in the error slot means "no error"; non-zero is an interned global tag id.

### Error sets

Two forms of error set:

```sx
// Named set — declared once, referenced by name from signatures.
ParseErr :: error { BadDigit, Overflow, Empty };

// Inferred set — bare `!` collects whatever tags the body raises.
quick :: () -> (s32, !) {
    if cond raise error.SomeAdHocTag;   // mints into the inferred set
    return 0;
}
```

- An `error { ... }` set is an opaque type; tags are referenced as `error.X`.
- A declared empty set `error { }` is **rejected**.
- **Inferred sets are whole-program.** The compiler runs an SCC fix-point pass over the entire call graph to converge each bare-`!` function's set (matching sx's whole-program compilation model). Callers see the converged union, not bare `!`.
- A top-level (non-`main`) function declared `!` that never errors warns ("declared `!` but never errors — drop the `!`"). Closures and function-type slots with an empty `!` do **not** warn.

**Tag identity is the name, globally (Zig-style).** Two sets that both list `NotFound` reference the *same* tag id; `if e == error.NotFound` matches every `NotFound` regardless of which set raised it.
Use distinct names (`FsNotFound` / `HttpNotFound`) when subsystems must be distinguishable.

### `raise`

Statement form. Terminates the immediately enclosing failable function (like `return`), setting the error slot; value slots are left undefined.

```sx
if bad raise error.BadDigit;            // literal tag

v := foo() catch e {
    if e == error.Specific return default;
    raise e;                            // variable tag — re-raise
};
```

`raise EXPR` accepts any tag-typed expression. EXPR's set must be ⊆ the enclosing function's error set (for a named set), or is absorbed into the inferred set (for bare `!`). `raise` inside an inline expression is rejected (`v := if cond raise error.X else 0;` — compile error). A closure body is its own function boundary: `raise` inside a closure terminates the *closure*.

### `try`

Expression form. `try X` requires `X` to be failable; on `X`'s failure it routes control to the nearest enclosing fallback target:

- inside an `or` chain → the next `or` operand;
- otherwise → the function's error return (propagation, like Zig's `try`).

```sx
v := try parse_digit(s);                // propagate on failure
v2, n := try parse(s);                  // multi-value
try must_init();                        // statement form, discard values
v3 := try foo() or try bar();           // chain: foo fails → try bar
return try transform(try parse(s));     // nests in any value position
```

`try` works in any value-producing position (argument, struct/array literal, `if`-condition); evaluation is left-to-right and short-circuits on the first failure, so no partial aggregate is ever built. `try`'s body never binds the tag — use `catch` for that.

### `catch`

Expression form. Handles the error inline. The binding is a **bare name, no parens** (`catch e`), and is **optional**. Four shapes, disambiguated by the token after `catch`:

| Form | Binding | Body |
|---|---|---|
| `catch { ... }` | none (tag ignored) | block — braces required |
| `catch e { ... }` | `e` | block |
| `catch e EXPR` | `e` | bare expression (no braces) |
| `catch e == { case ... }` | `e` | match over `e` (sugar for `{ if e == { ... } }`) |

```sx
v := parse_digit(s) catch e {
    log.warn("bad input: {}", e);
    return default;                     // noreturn body
};

v := parse_digit(s) catch e compute_fallback(e);    // value-producing body

v, n := parse(s) catch e {
    log.warn("parse failed: {}", e);
    (0, 0)                              // tuple body for a multi-value failable
};

v := parse(s) catch e == {              // match-body form
    case .Empty: 0;
    case .BadDigit: -1;
    else: raise e;
};

v := (try foo() or try boo()) catch e { return 0; };    // catch over an `or` chain
```

**Body type rule.** The body (block-as-expression) must produce the failable's success tuple type, or be `noreturn` (the `noreturn` arm subsumes `return` / `raise` / `break` / `continue` / `unreachable` / noreturn calls). For a multi-value failable the body must produce a tuple of matching arity and element types. A non-diverging body that produces no value is a compile error.

### `or` (fallback / chain)

Expression form (the same operator as optional-unwrap). LHS must be failable; the RHS shape decides the result:

- **plain value of the success type** — terminate; the chain becomes non-failable; on LHS failure the result is the RHS value (LHS tag discarded);
- **`try EXPR`** — chain; on LHS failure, attempt the RHS (its `try` defines the next fallback target);
- **bare failable** — allowed only when its error path hits a marker downstream (see the path-marker rule).

`or` is **left-associative**, evaluated left-to-right with short-circuit.

```sx
v := parse_digit(s) or 0;               // value terminator → non-failable
v := try foo() or try boo();            // chain, propagate if both fail
v := foo() or boo() or 0;               // bare operands, 0 absorbs all
v, n := parse_pair(s) or (0, 0);        // tuple terminator (multi-value)
```

A **void** failable (`-> !`) rejects a plain-value RHS (no success type to fall back to); `must_init() or must_other()` (chain) and `must_init() catch {}` (absorb) are the legal forms.
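A minimal sketch of the void-failable forms, built only from constructs shown in this section. The functions `init_gpu` and `init_cpu` and their tags are hypothetical placeholders, and the chain is written with explicit `try` markers on both operands so that it does not depend on how the path-marker rule treats bare operands.

```sx
// Hypothetical void failables; the bodies exist only to exercise the error path.
init_gpu :: () -> ! { raise error.NoGpu; }
init_cpu :: () -> ! { raise error.NoCpu; }

setup :: () -> ! {
    // Chain: if init_gpu fails, attempt init_cpu; propagate if both fail.
    try init_gpu() or try init_cpu();
}

setup_best_effort :: () {
    // Absorb: the tag is ignored and there is no value to fall back to.
    init_gpu() catch {};

    // Rejected per the rule above: a void failable has no success type,
    // so a plain-value right-hand side is not accepted.
    // init_gpu() or 0;
}
```

The commented-out line is the case the rule forbids; it is left in only to show the shape that would be rejected.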
### Path-marker rule

A failable expression `X` may appear **bare** (no `try`) iff its error path passes through at least one explicit marker before reaching the function boundary. The markers are: a `try` keyword, a `catch` handler, an `or` value terminator, or a destructure binding (`v, err := X`). Otherwise `try` (or one of the other markers directly on `X`) is required.

```sx
a := parse(s) or 0;                 // OK — terminator on the path
a := parse(s) catch e {...};        // OK — catch marks
v, err := failable();               // OK — destructure marks
a := try foo() or try boo();        // OK — each try marks its own exit
a := foo() or boo();                // ERROR — no marker on the way to the function
a := foo();                         // ERROR — bare, no marker downstream
```

### Set widening

Widening is checked **only at subexpressions whose failure escapes to the function** (propagation). For a **named** caller `!CallerErr`, the escape set must be ⊆ `CallerErr` (no auto-widening). For an **inferred** caller `!`, the escape set is absorbed into the converged union. Failures absorbed by a downstream chain operand / `catch` / terminator / destructure don't contribute.

### `error.X` as a value

`error.X` is a first-class value outside `raise`:

```sx
default_err : ParseErr = error.BadDigit;    // typed as the named set
tag_id : u32 = error.BadDigit;              // untyped context → global tag id
if e == error.Empty { ... }                 // compare against a literal
```

- Against a **named-set** destination, `error.X` is valid only if `X ∈` the set (typo-checked). A comparison to a literal not in the set is a compile error (it could never be true). For **inferred** sets this check is skipped.
- An error-set value compares (`==` / `!=`) only with an `error.X` literal or another error-set value — **never a raw integer** (`e == 42` is rejected). Coerce explicitly (`(xx e) == id`) to use the raw id.
- **Interpolation renders the tag name.** `{}` on an error-set value prints the tag name (`BadDigit`), never the raw id, via a tag-name table that is **always linked, even in release builds**.

### Discard rejection & flow-check

Dropping the error slot is a compile error:

```sx
v, _ := failable();     // ERROR: the error slot cannot be dropped — handle it
```

Value slots may be discarded (`_, n := parse(s) catch e { return; }`). The statement form `try foo();` is the explicit "propagate, use no value." On a value-carrying failable, the value slot is live only where the compiler can prove the error slot is null (path-sensitive flow-check).

### `onfail` (error-path cleanup)

Statement form. Block-rooted (Zig-aligned): legal in any block inside a failable function. **Fires when an error propagates out of its enclosing block**, regardless of whether an outer `catch` / terminator later absorbs it. On success exit (fall-through, `return`, `break` / `continue` without an error) it is skipped — only `defer` runs.

```sx
make_handle :: () -> (Handle, !) {
    h := try open();
    onfail close(h);        // close ONLY on a subsequent failure
    try configure(h);       // fails → onfail runs → close(h)
    return h;               // success → onfail skipped; caller owns h
}

open :: (path: string) -> (Handle, !) {
    h := try sys_open(path);
    onfail e {
        log.warn("init failed for {}: {}", path, e);
        sys_close(h);
    }
    ...
}
```

**Ordering with `defer`.** Both run in reverse declaration order, interleaved. On block-error exit both kinds run (newest-first); on block-success exit only `defer`s run.

**Restrictions.** `raise` / `try` / `return` / `break` / `continue` are rejected inside an `onfail` (and a `defer`) body — a cleanup body has no control-transfer target. A failable call in cleanup must be absorbed locally (`close(h) catch {};` or `flush(buf) or 0`). `onfail` outside a failable function, or at top level, is rejected.
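The interleaving of `defer` and `onfail` can be sketched with three cleanups declared in one block. The functions `always_fails` and `demo` and the tag `Boom` are hypothetical, and `print` is assumed to be non-failable here, so it needs no local absorption inside the cleanup bodies.

```sx
// Hypothetical helper that always fails, so the error path is exercised.
always_fails :: () -> ! { raise error.Boom; }

demo :: () -> ! {
    defer print("defer A");     // declared first
    onfail print("onfail B");   // declared second
    defer print("defer C");     // declared third

    try always_fails();         // the error propagates out of this block
}

main :: () {
    demo() catch {};            // absorb the error so main stays non-failable
}
```

Reading the ordering rule literally, the error exit of `demo` would run the cleanups newest-first as `defer C`, `onfail B`, `defer A`. If the `try` line were removed, the block would exit on success and only `defer C` then `defer A` would run.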
### Closures with `!`

- **Explicit annotation required.** A closure literal's value type is inferred as today, but if its body raises or `try`-escapes, the `!` channel is **not** inferred — declare it (`closure((x: s32) -> (s32, !) { ... })`). This keeps adding a `raise` from silently changing a lambda's type.
- **Program-wide union per shape.** All `Closure() -> (T, !)` occurrences with the same signature share one inferred-set node; the SCC pass unions every closure flowing into any matching slot.
- **FFI boundary.** A failable closure cannot be assigned to a non-failable function-type slot — foreign code can't observe the error channel. Wrap and absorb the error instead.
- **Non-failable → failable widening is allowed** (∅ ⊆ any set). A non-failable closure assigned to a failable slot contributes ∅; a single coalesced adapter thunk `(v) → (v, 0)` reconciles the 1-slot vs 2-slot ABI at the crossing point.

### Return traces

A failable that reaches the function boundary unhandled carries a **return trace** — the chain of `raise` / `try` sites the error passed through.

- **Storage:** a thread-local fixed-cap ring (32 frames; newest survive on overflow). `raise` and each failing `try` push a frame; every absorbing site (`catch`, a succeeding chain attempt, a value terminator, a destructure) clears the buffer.
- **Resolution is in-process — no DWARF, no OS symbolizer.** A runtime frame is a pointer to a compile-time-interned `Frame { file, line, col, func, line_text }` stamped at the push site; the formatter reads it directly (deterministic, identical across OS/target, works under the JIT and a signed iOS `.app`). A comptime frame is `(func_id, ir_offset)` resolved via the interpreter's in-memory IR/source tables.
- **Mode.** On by default in debug; release no-ops the push points (opt back in with `--release-traces`).
  **Comptime (`#run`) is always traced.**
- **Formatting** lives in `library/modules/trace.sx` (`trace.print_current()`), rendering `func at file:line:col` per frame plus the source line and a `^` caret.

DWARF line-info is still emitted (debug, strippable) so `lldb` / `gdb` can step sx source — that is a debugger artifact, separate from trace resolution.

### ABI

The error slot is a `u32`, always the last slot of the multi-return tuple, in both register- and stack-return conventions. `0` = no error; non-zero = an interned global tag id (pool capacity ~4.3 billion; fixed 32-bit, no dynamic widening across builds). Errors are a pure value channel — no coupling to the implicit `context`.

---

## 13. Grammar (informal)

```
program         = top_level*
top_level       = decl | import_decl
import_decl     = '#import' STRING ';'
                | IDENT '::' '#import' STRING ';'
decl            = const_decl | var_decl | fn_decl | enum_decl | struct_decl | error_decl
error_decl      = IDENT '::' 'error' '{' IDENT (',' IDENT)* ','? '}' ';'
const_decl      = IDENT '::' expr ';'
                | IDENT ':' type ':' expr ';'
var_decl        = IDENT ':=' expr ';'
                | IDENT ':' type '=' expr ';'
                | IDENT ':' type ';'
fn_decl         = IDENT '::' '(' params? ')' ('->' type)? block
                | IDENT '::' block
enum_decl       = IDENT '::' 'enum' '{' (IDENT ';')* '}'
struct_decl     = IDENT '::' 'struct' '{' struct_member* '}'
struct_member   = field_group | '#using' IDENT ';'
field_group     = IDENT (',' IDENT)* ':' type ('=' expr)? ';'
params          = param (',' param)* ','?
param           = IDENT ':' type ('=' expr)?
block           = '{' stmt* '}'
stmt            = decl | assignment ';' | multi_assign ';' | return_stmt | defer_stmt
                | insert_stmt | push_stmt | break_stmt | continue_stmt
                | raise_stmt | onfail_stmt | expr ';'
return_stmt     = 'return' expr? ';'
break_stmt      = 'break' ';'
continue_stmt   = 'continue' ';'
raise_stmt      = 'raise' expr ';'
onfail_stmt     = 'onfail' IDENT? block
defer_stmt      = 'defer' expr ';'
insert_stmt     = '#insert' expr ';'
push_stmt       = 'push' expr block
assignment      = lvalue ('=' | '+=' | '-=' | '*=' | '/=') expr
multi_assign    = lvalue (',' lvalue)+ '=' expr (',' expr)+
lvalue          = IDENT | postfix '.' IDENT
expr            = if_expr | match_expr | while_expr | for_expr | lambda | binary
while_expr      = 'while' expr block
for_expr        = 'for' expr ':' '(' IDENT [',' IDENT] ')' block
binary          = catch_expr (binop catch_expr)*   // binop includes `or` (fallback / chain)
catch_expr      = unary ('catch' IDENT? (block | '==' '{' case_arm* else_arm? '}' | unary))?
unary           = ('-' | '!' | 'xx' | 'try' | 'cast' '(' type ')') postfix | postfix
postfix         = primary ('(' args? ')' | '.' IDENT | '.{' field_init_list '}')*
primary         = INT | HEX_INT | BIN_INT | FLOAT | STRING | BOOL | IDENT
                | '---' | '.' IDENT | '.' '{' field_init_list '}'
                | '(' expr ')' | block | '#run' expr
field_init_list = field_init (',' field_init)* ','?
field_init      = IDENT '=' expr | IDENT | expr
if_expr         = 'if' expr 'then' expr ('else' expr)?
                | 'if' expr block ('else' block)?
match_expr      = 'if' expr '==' '{' case_arm* else_arm? '}'
case_arm        = 'case' pattern ':' (stmt* | 'break' ';')
else_arm        = 'else' ':' stmt*
pattern         = '.' IDENT | INT | BOOL | IDENT
lambda          = '(' params? ')' ('->' type)? '=>' expr
args            = expr (',' expr)* ','?
type            = '$' IDENT | 's32' | 'f32' | 'f64' | 'bool' | 'string' | 'Any' | 'Type'
                | '..' type
                | '[' expr ']' type
                | IDENT
                | '(' type (',' type)* ',' '!' IDENT? ')'   // value-carrying failable
                | '!' IDENT?                                // pure failable (`!` / `!Named`)
```

---

## 14. Open Questions

- **Nested functions**: Can functions be defined inside other functions?
- **Operator overloading**: Not shown — presumably no.
- **Top-level expressions**: Are bare expressions allowed at the top level or only declarations?