# sx language specification

## 1. Lexical Structure

### Comments

Line comments start with `//` and extend to end of line.

```sx
// this is a comment
```

### Identifiers

- Lowercase or mixed-case for variables, functions: `x`, `compute`, `main`
- UPPER_SNAKE_CASE for constants: `SOME_INT`, `SOME_STR`
- PascalCase for types: `Foo`

### Literals

| Kind | Examples | Type |
|------|----------|------|
| Integer | `0`, `42`, `0xFF`, `0b1010` | `s64` |
| Float | `0.3`, `0.9` | `f32` |
| String | `"Hello"`, `"z: {z}"` | `string` (may span multiple lines) |
| Heredoc String | `#string END`...`END` | `string` |
| Boolean | `true`, `false` | `bool` |
| Enum | `.variant1` | inferred from context |
| Undefined | `---` | context-dependent |

String literals support escape sequences (`\n`, `\t`, `\r`, `\\`, `\"`, `\0`) and may span multiple lines directly:

```sx
shader_src := "#version 330 core
void main() {
    gl_Position = vec4(0.0);
}
";
```

**Heredoc strings** use `#string DELIMITER` syntax (inspired by Jai). Content is completely raw — no escape processing. The delimiter is any identifier. Content starts after the newline following the delimiter and ends when the delimiter appears at column 0 of a line.

```sx
vert_src := #string GLSL
#version 330 core
void main() {
    gl_Position = vec4(aPos, 1.0);
}
GLSL;
```

### Keywords

`if`, `else`, `then`, `while`, `break`, `continue`, `true`, `false`, `enum`, `struct`, `union`, `case`, `return`, `defer`, `xx`, `and`, `or`

> Note: `enum` is used for both payload-less and payload-bearing sum types (tagged unions). `union` is reserved for C-style untagged unions (memory overlays).
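As a quick illustration of the identifier and literal rules in this section, the following sketch binds one value of several literal kinds. All names are hypothetical, and the type comments simply restate the defaults from the Literals table rather than verified compiler output.

```sx
// Hypothetical names; constants use UPPER_SNAKE_CASE, variables use lowercase.
MAX_BYTE :: 0xFF;           // hexadecimal integer literal, s64 by default
mask := 0b1010;             // binary integer literal, s64 by default
ratio := 0.3;               // float literal, f32 by default
greeting := "Hello\n";      // string literal with an escape sequence
ready := true;              // boolean literal
scratch : s64 = ---;        // explicitly undefined value
```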
### Operators

| Operator | Meaning |
|----------|---------|
| `+` | addition |
| `-` | subtraction / negation |
| `*` | multiplication |
| `/` | division |
| `%` | modulo |
| `==` | equality |
| `!=` | inequality |
| `<` | less than |
| `>` | greater than |
| `<=` | less or equal |
| `>=` | greater or equal |
| `&` | bitwise AND |
| `\|` | bitwise OR |
| `and` | logical AND (short-circuit) |
| `or` | logical OR (short-circuit) |
| `+=` | add-assign |
| `-=` | sub-assign |
| `*=` | mul-assign |
| `/=` | div-assign |

### Delimiters and Punctuation

| Token | Meaning |
|-------|---------|
| `::` | constant binding / definition |
| `:=` | variable binding (mutable, inferred) |
| `:` | type annotation |
| `=` | assignment (in typed var decl) |
| `;` | statement terminator |
| `,` | separator |
| `.` | field access / enum literal prefix |
| `->` | return type annotation |
| `=>` | lambda arrow |
| `$` | generic type parameter introduction |
| `---` | undefined value |
| `()` | grouping / params |
| `{}` | blocks / bodies |

---

## 2. Type System

### Primitive Types

- `s1`..`s64` — signed integers (1 to 64 bits). `s64` is the default for integer literals.
- `u1`..`u64` — unsigned integers (1 to 64 bits).
- `f32` — 32-bit floating point
- `f64` — 64-bit floating point
- `bool` — boolean (`true` / `false`)
- `string` — string of characters
- `Any` — type-erased value, represented as `{ i64, i64 }` (type tag + payload). Used for variadic arguments and runtime type dispatch.
- `Type` — compile-time type value. At runtime, represented as an `i64` type tag (same tag space as `Any`).

### Enum Types

User-defined sum types with named variants. Variants may optionally carry typed data (tagged unions).

Internally, payload-less enums are represented as `i64` (variant index). Enums with payloads are represented as `{ i64, [max_payload_size x i8] }` (tag + data).
#### Declaration

```sx
// Payload-less enum
Color :: enum { red; green; blue; }

// Enum with payloads (tagged union)
Shape :: enum {
    circle: f32;    // typed variant
    rect: s32;      // typed variant
    none;           // void variant
}
```

Variants are referenced with dot-prefix syntax: `.variant1`

#### Construction

```sx
c := Color.red;             // payload-less
s :Shape = .circle(3.14);   // inferred from context
s = .none;                  // void variant
s = Shape.rect(42);         // explicit prefix
```

#### Payload Access

```sx
r := s.circle;  // load payload as f32 (undefined behavior if wrong variant active)
```

#### Pattern Matching

```sx
if s == {
    case .circle: print("circle\n");
    case .rect:   print("rect\n");
    case .none:   print("none\n");
}
```

#### Payload Capture

Match arms can capture the variant's payload into a local variable:

```sx
if s == {
    case .circle: (radius) {
        print("radius: {}\n", radius);
    }
    case .rect: (size) => print("size: {}\n", size);
}
```

The `(name)` after the colon binds the payload. Two forms:

- Block: `case .variant: (name) { body }`
- Short: `case .variant: (name) => expr;`

#### Enum Interpolation

Payload-less enums print as `.variant`. Enums with payloads print as `.variant(value)`:

```sx
print("{}", s);  // .circle(3.140000)
```

### Union Types (Untagged)

C-style untagged unions for zero-cost memory overlays (type punning). All fields share the same memory — no tag, no runtime overhead. The LLVM representation is `[max_field_size x i8]`.

#### Declaration

```sx
Overlay :: union {
    f: f32;
    i: s32;
}
```

All fields must have types (unlike enums, which may have void variants).

#### Anonymous Struct Fields (Member Promotion)

Anonymous `struct` fields inside a union have their members promoted to the union namespace:

```sx
Vec2 :: union {
    data: [2]f32;
    struct { x, y: f32; };
}
```

Access promoted members directly: `v.x`, `v.y` — these are zero-cost GEPs into the same underlying memory as `v.data[0]`, `v.data[1]`.
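To make the aliasing concrete, here is a minimal sketch that writes through the promoted members and reads back through `data`. It repeats the `Vec2` declaration so it is self-contained, and it assumes `print` is available as in the other examples. No output is shown because float formatting is not pinned down here.

```sx
Vec2 :: union {
    data: [2]f32;
    struct { x, y: f32; };
}

main :: () {
    v : Vec2 = ---;     // unions start out undefined and are filled per field
    v.x = 1.0;          // writes the storage shared with v.data[0]
    v.y = 2.0;          // writes the storage shared with v.data[1]
    print("{} {}\n", v.data[0], v.data[1]);  // reads back the values written via x and y
}
```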
#### Initialization

Unions must be initialized with `---` (undefined) and then assigned per-field:

```sx
o :Overlay = ---;
o.f = 3.14;
print("{}\n", o.i);  // reinterpret bits as s32
```

#### Restrictions

- Pattern matching (`if x == { case ... }`) is not supported on unions.
- Unions cannot be printed directly via `print("{}", union_val)` — access individual fields instead.

### Struct Types

User-defined product types with named fields.

```sx
Vec4 :: struct {
    x, y, z, w: f32;
}
```

Fields are declared as `name1, name2: type;` (comma-separated names sharing a type, semicolon-terminated).

#### Field Defaults

Fields may have default values. Fields without an explicit default have a zero-value default. `---` marks a field as explicitly undefined.

```sx
Foo :: struct {
    a : u2;         // default is 0
    b : u8 = 42;    // default is 42
    c : u8 = ---;   // default is undefined
}
```

#### Struct Literals

```sx
// Positional (with type annotation — type inferred from annotation)
v1 : Vec4 = .{ 1, 2, 3, 0 };

// Positional (with type prefix)
v2 := Vec4.{ 4, 1, 1, 3 };

// Named fields (any order)
v3 := Vec4.{ w=0, x=2, y=3, z=4 };

// Mixed named + shorthand (bare identifier = field name matches variable name)
z := 5.0;
w := 6.0;
v4 := Vec4.{ y=3, x=9, w, z };
```

#### Field Access and Assignment

```sx
v1.x          // read field x of struct v1
v1.x = 3.0;   // assign to field x of struct v1
```

#### Struct Interpolation

Struct values in string interpolation print as `TypeName{field:value, ...}`:

```sx
print("{}", v1);  // Vec4{x:1.0, y:2.0, z:3.0, w:0.0}
```

### Array Types

Fixed-size arrays with element type and length.

```sx
buffer : [5]f32 = .[0, 2, 3.5, 4, 0];
val := buffer[2];   // 3.5
buffer.len          // 5 (compile-time constant, s64)
```

Arrays can also be constructed programmatically with the `Array` builtin:

```sx
MyArr :: Array(5, s32);  // equivalent to [5]s32
```

### Slice Types

A slice `[]T` is a fat pointer `{ptr, i64}` referencing a contiguous sequence of `T` elements.
Same runtime layout as `string`.

```sx
// Arrays implicitly coerce to slices at call sites
arr : [5]s32 = .[3, 1, 4, 1, 5];
sortSlice(arr);     // [5]s32 → []s32 coercion

// Slice operations
items[i]            // read element at index
items[i] = val;     // write element at index
items.len           // length (s64)
items.ptr           // raw pointer
```

Slices support generic type parameters: `[]$T` introduces type parameter `T` inferred from the element type of the argument (array or slice).

### Subslicing

Arrays, slices, and strings support subslice syntax to create zero-copy views:

```sx
arr : [5]s32 = .[3, 1, 4, 1, 5];
sub := arr[1..4];    // []s32 → [1, 4, 1]
head := arr[..3];    // []s32 → [3, 1, 4]
tail := arr[2..];    // []s32 → [4, 1, 5]

msg := "hello world";
word := msg[6..11];  // string → "world"
```

- `expr[start..end]` — elements from `start` (inclusive) to `end` (exclusive)
- `expr[start..]` — elements from `start` to end
- `expr[..end]` — elements from beginning to `end`
- Result type: `[]T` for arrays/slices, `string` for strings
- No memory allocation — the result points into the original backing storage

### Pointer Types

| Syntax | Meaning | `.len` | `[i]` |
|--------|---------|--------|-------|
| `*T` | pointer to one T | no | no |
| `[*]T` | many-pointer (buffer) | no | yes |
| `*[N]T` | pointer to array of N T | yes | yes |
| `*[]T` | pointer to slice | yes | yes |

**Address-of**: `@x` returns a pointer to the variable.

```sx
v := Vec2.{ 1.0, 2.0 };
ptr := @v;  // *Vec2
```

**Dereference**: `p.*` loads the value through the pointer.

```sx
copy := ptr.*;  // Vec2
```

**Auto-deref**: `p.field` is sugar for `p.*.field`.

```sx
set_x :: (p: *Vec2, val: f32) {
    p.x = val;  // auto-deref: p.*.x = val
}
set_x(@v, 99.0);
```

**Null**: All pointer types are nullable. `null` is the null pointer literal.

```sx
np : *Vec2 = null;
```

**Many-pointer**: `[*]T` supports indexing for buffers of unknown size.
```sx
arr : [5]s32 = .[10, 20, 30, 40, 50];
mp : [*]s32 = @arr[0];  // *s32 → [*]s32 implicit
val := mp[2];           // 30
```

**Implicit conversions**:

- `*T` → `[*]T` (pointer to element → many-pointer)
- `*[N]T` → `[*]T` (pointer to array → many-pointer)
- `[N]T` → `[*]T` at call sites (array decays to many-pointer)
- `[]T` → `[*]T` (slice decays to many-pointer, extracts `.ptr`)
- `T` → `*T` at call sites (implicit address-of)
- `null` (`*void`) → any `*T`

**Fat pointer layout**: `[:0]u8`, `string`, and `[]T` are `{ptr, i64}` structs. The raw pointer is always the first field at offset 0. This means `*[:0]u8` works as C's `char**` — a C function dereferences through the outer pointer and reads the raw `char*` from offset 0.

### C Interop Type Mapping

| C type | sx type | Notes |
|--------|---------|-------|
| `const char*` (input) | `[:0]u8` | compiler extracts `.ptr` at call site |
| `char*` (output buffer) | `[*]u8` | raw buffer, no length |
| `const char**` | `*[:0]u8` | address of `[:0]u8` — `.ptr` at offset 0 |
| `int*` (single out) | `*s32` | |
| `unsigned*` (single out) | `*u32` | |
| `float*` (buffer) | `[*]f32` | |
| `void*` (generic) | `*void` | only for truly opaque/generic data |

### Vector Types (SIMD)

LLVM SIMD vectors, parameterized by length and element type.

```sx
v := vec3(1, 3, 2);  // Vector(3, f32)
```

**Arithmetic**: Element-wise `+`, `-`, `*`, `/` on vectors of same dimensions.

```sx
add := v1 + v2;  // element-wise addition
```

**Scalar broadcast**: Scalar operands are broadcast to match the vector.

```sx
scaled := v * 2.0;  // [2.0, 6.0, 4.0]
```

**Negation**: Unary `-` negates each element.

```sx
neg := -v;  // [-1.0, -3.0, -2.0]
```

**Element access**: `.x`, `.y`, `.z`, `.w` (aliases `.r`, `.g`, `.b`, `.a`) extract single components.

```sx
v.x  // first element
v.z  // third element
```

**Index access**: `v[i]` extracts by index.

```sx
v[0]  // first element
```

**Built-in `sqrt`**: Calls LLVM `llvm.sqrt.f32`/`.f64` intrinsic.
```sx
s := sqrt(9.0);  // 3.0
```

### Function Types

Expressed as `(param_types) -> return_type`. A function with no return type annotation returns void.

```sx
// type is (s32) -> s32
compute :: (x: s32) -> s32 { x * x; }

// type is () -> void
main :: () { }
```

### Type Aliases

A name bound to an existing type.

```sx
SOME_TYPE :: f64;
```

### Generic Functions (Monomorphization)

Functions can be parameterized over types using `$T` syntax. The `$` prefix introduces a type parameter; subsequent uses of the name reference it.

```sx
sum :: (a: $T, b: T) -> T {
    return a + b;
}
```

- `$T` in a parameter type **introduces** type parameter `T`
- Bare `T` (without `$`) **references** the introduced type parameter
- At call sites, type arguments are **inferred** from actual argument types:

```sx
sum(40, 2)      // T = s64 (integer literals default to s64)
sum(1.5, 2.5)   // T = f32
```

- Each unique set of concrete types produces a **separate specialized function** (monomorphization)
- Multiple type parameters are supported: `(a: $T, b: $U) -> T`

### Variadic Functions

Functions can accept a variable number of arguments using `..Type` syntax:

```sx
print :: (fmt: string, args: ..Any) {
    ...
}
```

- `..Any` means zero or more arguments, each boxed into `Any` (type tag + payload)
- The variadic parameter must be the last parameter
- At call sites, variadic arguments are automatically boxed: `print("x={}, y={}\n", x, y)`
- Inside the function body, `args` is accessed as a slice-like sequence

### Type Inference

- `::` bindings infer type from the right-hand side
- `:=` bindings infer type from the right-hand side
- Explicit annotation overrides inference: `NAME : f64 : 0.9;`
- Integer literals default to `s64`
- Float literals default to `f32`
- Enum literals (`.variant`) infer their enum type from context (expected type)

### Type Conversions

**Implicit (widening)** — allowed without annotation:

- Integer to wider integer of same signedness (`u8` → `u16`, `s8` → `s32`)
- Unsigned to strictly wider signed (`u8` → `s16`)
- Any integer to any float (`u8` → `f32`, `s32` → `f64`)
- Float to wider float (`f32` → `f64`)
- Integer and float literals can convert to any numeric type implicitly

**Explicit (narrowing)** — requires `xx` prefix:

- Integer to narrower integer (`s32` → `u8`)
- Signed to unsigned (`s32` → `u32`)
- Float to narrower float (`f64` → `f32`)
- Float to any integer (`f64` → `u16`)
- Unsigned to signed of same or narrower width (`u8` → `s8`)

The `xx` prefix operator marks an expression for auto-conversion to the expected type from context (assignment, declaration, argument, return):

```sx
large: f64 = 5999.5;
x : u16 = xx large;             // f64 → u16
d : u8 = #run xx resolve(5);    // s32 → u8 at compile time
```

Using `xx` where no target type is known from context is a compile error.

---

## 3. Declarations

### Constant Binding (immutable)

```sx
// inferred type
NAME :: value;

// explicit type
NAME : type : value;
```

The `::` operator creates an immutable binding. The value is evaluated at compile time when possible.
Examples:

```sx
SOME_INT    :: 0;           // s64
SOME_STR    :: "Hello";     // string
SOME_FLOAT  :: 0.3;         // f32
SOME_DOUBLE : f64 : 0.9;    // f64 (explicit)
SOME_FUNC   :: () => 42;    // () -> s64
SOME_TYPE   :: f64;         // type alias
```

### Variable Binding (mutable)

```sx
// inferred type
name := value;

// explicit type
name : type = value;

// default-initialized (type required)
name : type;

// undefined (type required)
name : type = ---;
```

The `:=` operator creates a mutable binding. The type is inferred unless explicitly annotated.

`name : type;` initializes using the type's defaults: zero for primitives, per-field defaults for structs (see Field Defaults).

`name : type = ---;` leaves the value undefined (uninitialized memory). Reading before writing is undefined behavior.

Examples:

```sx
x := 42;                        // s64, mutable
x := if true then 1 else 2;
z : Foo = .variant2;            // Foo, mutable, explicit type
a : Foo;                        // Foo, default-initialized (a=0, b=42, c=undef)
b : Foo = ---;                  // Foo, entirely undefined
```

### Function Definition

```sx
name :: (params) -> return_type {
    body
}
```

- Parameters: `name: type` separated by commas
- Return type: `-> type` (omit for void)
- Body: block of statements; last expression is the implicit return value
- No `return` keyword needed (last expression = return value)

Examples:

```sx
compute :: (x: s32) -> s32 {
    x * x;
}

main :: () {
    // void return, no -> annotation
}

// Bare-block shorthand (equivalent to no-arg void function):
main :: {
    // same as main :: () { ... }
}
```

### Enum Definition

```sx
Name :: enum { variant1; variant2; }
```

Defines a new enum type with the given variants. Trailing comma is allowed.

### Enum Backing Type

An optional backing type can be specified after the `enum` keyword (Jai-style):

```sx
Color :: enum u8 { red; green; blue; }
Status :: enum s16 { ok; error; timeout; }
```

Syntax: `Name :: enum [flags] [type] { ... }`

The backing type must be an integer type (`u8`, `u16`, `u32`, `s8`, `s16`, `s32`, `s64`, etc.).
When omitted, the default is `s64`. This is useful for C interop (matching C enum sizes) and memory efficiency.

### Enum Layout Struct

For C interop with tagged unions (e.g. SDL_Event), a struct can be used as the backing type to specify the exact memory layout:

```sx
// Inline layout
SDL_Event :: enum struct { tag: u32; _: u32; payload: [30]u32; } {
    quit     :: 0x100;
    key_down :: 0x300: SDL_KeyData;
    key_up   :: 0x301: SDL_KeyData;
}

// Named layout
EventLayout :: struct { tag: u32; _: u32; payload: [30]u32; }
SDL_Event :: enum EventLayout {
    quit     :: 0x100;
    key_down :: 0x300: SDL_KeyData;
}
```

The layout struct must have:

- A field named `tag` — integer type, the discriminant. Its type becomes the enum's backing type.
- A field named `payload` — array type, the variant data area. Its size determines the maximum payload capacity.
- Any other fields are treated as padding/reserved and positioned by the struct layout.

This gives explicit control over the memory layout instead of relying on automatic alignment. The total size equals the struct size. Without a layout struct, tagged enums use `{ tag, [max_payload_size x i8] }` with no padding.

### Enum Flags

```sx
Perms :: enum flags {
    read;       // 1
    write;      // 2
    execute;    // 4
}
```

Flags can also specify a backing type:

```sx
SDL_InitFlags :: enum flags u32 {
    video :: 0x20;
    audio :: 0x10;
}
```

The `flags` modifier assigns auto power-of-2 values (1, 2, 4, 8, ...) instead of sequential indices (0, 1, 2, ...). Flags can be combined with `|` and tested with `&`:

```sx
p :Perms = .read | .write;
if p & .execute { ... }
print("{}\n", p);  // .read | .write
```

Explicit values use `::` syntax (Jai-style):

```sx
WindowFlags :: enum flags {
    vsync     :: 64;
    resizable :: 4;
    hidden    :: 128;
}
```

Restrictions:

- Flags enum variants cannot have payloads
- `flags` is a contextual identifier, not a keyword

### Bitwise Operators

`&` (bitwise AND) and `|` (bitwise OR) work on all integer types, not just flags.
They sit at precedence level 3, between comparisons and logical operators.

```sx
x := 0xFF & 0x0F;  // 15
y := 1 | 2 | 4;    // 7
```

---

## 4. Expressions

Everything in `sx` is expression-oriented where possible.

### Operator Precedence

| Prec | Operators | Notes |
|------|-----------|-------|
| 6 (highest) | `*`, `/`, `%` | multiplication, division, modulo |
| 5 | `+`, `-` | addition, subtraction |
| 4 | `<`, `<=`, `>`, `>=`, `==`, `!=` | comparisons (chainable) |
| 3 | `&`, `\|` | bitwise AND, bitwise OR |
| 2 | `and` | logical AND (short-circuit) |
| 1 (lowest) | `or` | logical OR (short-circuit) |

### Arithmetic

Standard infix: `+`, `-`, `*`, `/` with usual precedence (`*`/`/` before `+`/`-`).

```sx
x * x
x + 2
```

### Chained Comparisons

Comparison operators can be chained. Each operand is evaluated exactly once.

```sx
0 <= x <= 100       // equivalent to: 0 <= x and x <= 100
1000 > x >= -100    // equivalent to: 1000 > x and x >= -100
a == b == c         // equivalent to: a == b and b == c
```

Mixed operators are allowed: `a < b <= c > d` means `a < b and b <= c and c > d`.

### Logical Operators

`and` and `or` are short-circuit boolean operators. The right operand is not evaluated if the left operand determines the result.

```sx
if 0 <= x <= 100 and 0 <= y <= 100 {
    print("contained");
}
```

### If Expression (inline form)

```sx
if condition then consequent else alternate
```

Both branches are single expressions. The whole form produces a value.

```sx
x := if true then 1 else 2;
```

The `else` branch is optional. Without it, the form is a statement (no value):

```sx
if i == 2 then continue;
if done then break;
if err then return;
```

### If Expression (block form)

```sx
if condition {
    stmts
} else {
    stmts
}
```

Each branch is a block. The last expression in each block is the branch's value.
Can be used inline within other expressions:

```sx
y := x + if false { 7; } else { 12; };
```

### Pattern Matching

```sx
if subject == {
    case pattern: body
    case pattern: body
    else: body          // optional default arm
}
```

Matches `subject` against each `case`. Patterns can be:

- **Enum literals**: `.variant` — matches a specific enum variant.
- **Integer/bool literals**: `42`, `true` — matches a specific value.
- **Type categories**: `int`, `struct`, `enum`, and the other categories listed under Type Category Matching — matches all types in that category (used with `type_of` values).

`break` exits a case arm without producing a value. The optional `else:` arm matches when no `case` pattern matches.

```sx
if z == {
    case .variant1: break;
    case .variant2: print("z: {z}");
    else: print("unknown");
}
```

#### Type Category Matching

When switching on a `Type` value (from `type_of`), category keywords match all registered types of that category:

```sx
type := type_of(val);
if type == {
    case int:    result = int_to_string(xx val);
    case struct: result = struct_to_string(cast(type) val);
    case enum:   result = enum_to_string(cast(type) val);
}
```

Available categories: `int`, `float`, `bool`, `string`, `struct`, `enum`, `vector`, `array`, `slice`, `pointer`, `type`.

> Note: `case enum:` matches both payload-less enums and tagged enums (enums with payloads). C-style untagged unions are not registered with the Any type system and cannot be matched by category.

Inside a category arm, `cast(type) val` performs **runtime generic dispatch**: the compiler generates a switch over all types in the category, monomorphizing the callee for each concrete type.

### While Loop

```sx
while condition {
    body
}
```

Repeats `body` as long as `condition` is true. `break;` exits the loop. `continue;` skips to the next iteration.
```sx
i := 0;
while i < 10 {
    i += 1;
    if i == 5 { continue; }
    if i == 8 { break; }
    print("{i}\n");
}
```

### For Loop

```sx
for iterable: (elem) { }        // element alias (no copy)
for iterable: (elem, ix) { }    // element + index
for iterable: (_, ix) { }       // index only
```

Iterates over arrays and slices. The capture clause after `:` binds loop variables:

- The first name is the element capture (non-reassignable alias into the array/slice)
- The optional second name is the index (s64, starting at 0, also non-reassignable)
- Use `_` to discard a capture

The element capture is a direct alias — reads and field writes go to the original array element. Direct reassignment of the capture (`elem = x`) is a compile error.

`break;` exits the loop. `continue;` skips to the next iteration.

```sx
arr : [5]s32 = .[1, 2, 3, 4, 5];
for arr: (val, ix) {
    if ix == 2 { continue; }
    print("{}\n", val);
}
```

### Lambda

```sx
(params) => expr
(params) -> return_type => expr
```

Anonymous function. Produces a function value. Supports the same parameter features as named functions: `$` generic type params, `..` variadic params, and optional return type annotation.

```sx
SOME_FUNC :: () => 42;              // () -> s64
double :: (x: $T) -> T => x + x;    // generic lambda with return type
```

### Function Call

```sx
callee(args)
```

```sx
compute(6)
print("hello")
```

### Field Access

```sx
object.field
```

Used for module access (`std.print`) and struct member access.

### Enum Literal

```sx
.variant_name
```

The enum type is inferred from context (expected type from declaration or parameter).

---

## 5. Statements

Statements are terminated by `;`.

- **Declaration**: `name :: value;` / `name := value;`
- **Assignment**: `name = value;` / `name += value;` (and other compound assignments). Also supports field targets: `obj.field = value;`
- **Multi-target assignment**: `a, b = b, a;` — all RHS values are evaluated before any stores, enabling swaps without temporaries. Target count must equal value count.
  Only plain `=` is supported (no compound operators). Each target must be a valid lvalue (variable, field, index, dereference).
- **Expression statement**: `expr;` — evaluates the expression (last in a block = return value)
- **Return**: `return expr;` — returns from the enclosing function with the given value. `return;` returns void.
- **Break**: `break;` — exits a match arm or the enclosing `while` or `for` loop
- **Continue**: `continue;` — skips to the next iteration of the enclosing `while` or `for` loop
- **Defer**: `defer expr;` — defers execution of `expr` until the enclosing block exits (LIFO order)

---

## 6. Blocks, Scoping, and Implicit Returns

A block `{ ... }` contains zero or more statements. The last expression in a block is its value (implicit return).

In function bodies, the last expression becomes the return value:

```sx
compute :: (x: s32) -> s32 {
    x * x;  // this is returned
}
```

### Scope Blocks

Bare blocks can be used as statements to introduce a new lexical scope. Variables declared inside a scope block are local to that block. No trailing `;` is required.

```sx
main :: {
    x := 42;
    {
        x := 6;                 // shadows outer x
        print("inner: {x}");    // prints 6
    }
    print("outer: {x}");        // prints 42
}
```

### Variable Shadowing

A variable declaration (`name :=`) inside an inner scope shadows any variable with the same name from outer scopes. The outer variable is restored when the inner scope exits.

### Defer

`defer expr;` schedules `expr` to execute when the enclosing scope block exits. Multiple defers in the same scope execute in reverse order (LIFO).

```sx
{
    defer print("second");
    defer print("first");
}
// prints: first, then second
```

---

## 7. Built-in Functions

Built-in functions are declared in `std.sx` with the `#builtin` suffix, which tells the compiler to generate the implementation internally rather than looking for a function body.

### I/O

- `write(str: string) -> void` — write a string to standard output
- `print(fmt: string, args: ..Any)` — formatted print.
  Parses `{}` placeholders in the format string and substitutes arguments. When all argument types are statically known, the compiler specializes the call at compile time (no `Any` boxing).

### Math

- `sqrt(x: $T) -> T` — square root (maps to LLVM intrinsic)

### Memory

- `alloc(size: s64) -> string` — allocate `size` bytes of memory, returned as a string slice
- `size_of($T: Type) -> s64` — size of type `T` in bytes

### Type Introspection

- `type_of(val: $T) -> Type` — returns the runtime type tag of a value
- `type_name($T: Type) -> string` — returns the name of type `T` as a string (e.g., `"Point"`)
- `field_count($T: Type) -> s64` — returns the number of fields (struct), variants (enum), or elements (vector) in type `T`
- `field_name($T: Type, idx: s64) -> string` — returns the name of the `idx`-th field (struct) or variant (enum) of type `T`
- `field_value(s: $T, idx: s64) -> Any` — returns the `idx`-th field (struct) or element (vector) of `s`, boxed as `Any`
- `field_index($T: Type, val: T) -> s64` — returns the sequential variant index for an explicit enum value (reverse of `field_value_int`). Returns `-1` if no variant matches.

### Type Conversion

- `cast(Type) expr` — prefix operator that converts `expr` to `Type`. Examples: `cast(s32) 3.14`, `cast(f64) n`. When `Type` is a runtime `Type` value inside a type-category match arm, the compiler generates a dispatch switch over all types in the category, monomorphizing the callee for each concrete type.

### Vectors

- `Vector($N: int, $T: Type) -> Type` — returns an LLVM vector type of `N` elements of type `T`

---

## 8. Compile-time Evaluation

### `#run` Directive

`#run expr` evaluates `expr` at compile time using lazy JIT execution.
It can appear in two contexts:

**Compile-time constants** — bind a compile-time value to a name:

```sx
compute :: (x: s32) -> s32 { x * x; }
x :: #run compute(5);  // x = 25, evaluated at compile time
```

Comptime globals are resolved lazily: the JIT executes only when the value is first referenced during code generation. Chained dependencies are resolved automatically.

**Side effects** — execute code at compile time for its side effects:

```sx
#run print("compiling...");
```

### `#insert` Directive

`#insert expr;` evaluates `expr` at compile time to obtain a string, then parses and compiles that string as inline code at the insertion point.

```sx
generate :: () -> string {
    return "print(\"hello from the other side\");";
}

main :: () {
    #insert #run generate();
    // equivalent to: print("hello from the other side");
}
```

The inserted string must contain valid `sx` statements (including semicolons). The statements are parsed and compiled in the same scope as the `#insert` site.

---

## 9. Modules / Imports

### `#import` Directive

The `#import` directive brings declarations from another `.sx` file into the current file. Paths are resolved relative to the importing file's directory.

**Flat import** — splices all declarations from the imported file into the current scope:

```sx
#import "modules/std/math.sx";
```

**Namespaced import** — wraps all declarations under a namespace name:

```sx
std :: #import "modules/std.sx";
```

Namespaced declarations are accessed with dot notation:

```sx
std.print("hello");
```

### Import Resolution

- Imports are resolved after parsing and before code generation.
- Paths are relative to the directory of the file containing the `#import`.
- Nested imports are supported (imported files may themselves contain `#import`).
- Circular imports are detected and silently skipped (each file is imported at most once).
- Generic functions in namespaced imports are supported (e.g., `std.mul(5, 2)` where `mul` is generic).
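The nesting and cycle rules above can be sketched with three hypothetical files in one directory. The file names and constants are illustrative only. The sketch also assumes that a flat import makes the imported constants usable in the importing file's own constant expressions.

```sx
// a.sx (hypothetical file)
#import "b.sx";             // nested import: b.sx is loaded while importing a.sx
A_VALUE :: B_VALUE + 1;     // assumes B_VALUE is in scope via the flat import

// b.sx (hypothetical file)
#import "a.sx";             // circular: per the rules above, this import is skipped
B_VALUE :: 2;

// main.sx (hypothetical file)
#import "a.sx";             // path is relative to main.sx's directory

main :: () {
    print("{}\n", A_VALUE);
}
```

Because each file is imported at most once, the cycle between `a.sx` and `b.sx` should not cause repeated loading or duplicate declarations.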
### Intra-module References

Functions within a namespaced import can call each other without the namespace prefix. When generating code for a namespaced module, unresolved function names are automatically tried with the namespace prefix.

### Example

```sx
// modules/std/math.sx
mul :: (base: $T, exp: T) -> T { base * exp; }

// modules/std.sx
print :: (str: string) -> void #builtin;

// main.sx
std :: #import "modules/std.sx";
#import "modules/std/math.sx";

main :: () -> s32 {
    std.print("hello there");
    mul(5, 2);
}
```

---

## 10. Program Structure

A program is a sequence of top-level declarations and `#import` directives. Execution begins at `main`.

```sx
main :: () {
    // entry point
}
```

`main` takes no arguments and returns void. The process exit code is 0 unless otherwise specified.

---

## 11. Grammar (informal)

```
program        = top_level*
top_level      = decl | import_decl
import_decl    = '#import' STRING ';'
               | IDENT '::' '#import' STRING ';'
decl           = const_decl | var_decl | fn_decl | enum_decl | struct_decl
const_decl     = IDENT '::' expr ';'
               | IDENT ':' type ':' expr ';'
var_decl       = IDENT ':=' expr ';'
               | IDENT ':' type '=' expr ';'
               | IDENT ':' type ';'
fn_decl        = IDENT '::' '(' params? ')' ('->' type)? block
               | IDENT '::' block
enum_decl      = IDENT '::' 'enum' '{' (IDENT ';')* '}'
struct_decl    = IDENT '::' 'struct' '{' field_group* '}'
field_group    = IDENT (',' IDENT)* ':' type ('=' expr)? ';'
params         = param (',' param)*
param          = IDENT ':' type
block          = '{' stmt* '}'
stmt           = decl | assignment ';' | multi_assign ';' | return_stmt | defer_stmt
               | insert_stmt | break_stmt | continue_stmt | expr ';'
return_stmt    = 'return' expr? ';'
break_stmt     = 'break' ';'
continue_stmt  = 'continue' ';'
defer_stmt     = 'defer' expr ';'
insert_stmt    = '#insert' expr ';'
assignment     = lvalue ('=' | '+=' | '-=' | '*=' | '/=') expr
multi_assign   = lvalue (',' lvalue)+ '=' expr (',' expr)+
lvalue         = IDENT | postfix '.' IDENT
expr           = if_expr | match_expr | while_expr | for_expr | lambda | binary
while_expr     = 'while' expr block
for_expr       = 'for' expr ':' '(' IDENT [',' IDENT] ')' block
binary         = unary (binop unary)*
unary          = ('-' | '!' | 'xx' | 'cast' '(' type ')') postfix | postfix
postfix        = primary ('(' args? ')' | '.' IDENT | '.{' field_init_list '}')*
primary        = INT | HEX_INT | BIN_INT | FLOAT | STRING | BOOL | IDENT | '---'
               | '.' IDENT | '.' '{' field_init_list '}'
               | '(' expr ')' | block | '#run' expr
field_init_list = field_init (',' field_init)*
field_init     = IDENT '=' expr | IDENT | expr
if_expr        = 'if' expr 'then' expr ('else' expr)?
               | 'if' expr block ('else' block)?
match_expr     = 'if' expr '==' '{' case_arm* else_arm? '}'
case_arm       = 'case' pattern ':' (stmt* | 'break' ';')
else_arm       = 'else' ':' stmt*
pattern        = '.' IDENT | INT | BOOL | IDENT
lambda         = '(' params? ')' ('->' type)? '=>' expr
args           = expr (',' expr)*
type           = '$' IDENT | 's32' | 'f32' | 'f64' | 'bool' | 'string' | 'Any' | 'Type'
               | '..' type | '[' expr ']' type | IDENT
```

---

## 12. Open Questions

These are inferred gaps — things not shown in the readme that need decisions:

- **`return`**: Both `return expr;` and implicit return (last expression) are supported.
- **Else in match**: Resolved — an optional `else:` arm is supported (see Pattern Matching in Section 4).
- **Nested functions**: Can functions be defined inside other functions?
- **Mutability of params**: Are function parameters immutable by default?
- **Array/list types**: Implemented — fixed-size arrays and slices (see Array Types and Slice Types in Section 2).
- **Struct types**: Implemented — named struct types with positional/named/shorthand literals.
- **Imports/modules**: `#import` directive supports flat and namespaced imports (see Section 9).
- **Operator overloading**: Not shown — presumably no.
- **Semicolons**: Required on all statements? What about the last expression in a block?
- **Top-level expressions**: Are bare expressions allowed at the top level or only declarations?