`xx <struct-typed local>` used to heap-copy the value through context.allocator.
The protocol value's `ctx` pointed at the heap copy; the original local was
left behind, untouched. Mutations through the protocol never reached the
original, and direct reads of the original never saw protocol mutations.
Two-fork bug, silent, easy to write by mistake.
New rule (Option 3 in the discussion):
- `xx <lvalue>` — identifier, field access, index expression, deref —
borrows the operand's storage. No heap copy, no `free` needed.
- `xx <rvalue>` — struct literal, function-call result, arithmetic, etc. —
heap-copies through context.allocator. Unchanged from today.
- `xx @ptr` and `xx <pointer-typed value>` — borrows the pointee. Unchanged.
Single switch in `buildProtocolErasure` ([lower.zig:10334](src/ir/lower.zig#L10334))
gated by a new `isLvalueExpr` helper ([lower.zig:10322](src/ir/lower.zig#L10322)).
Struct-typed operand: if the AST shape is identifier/field/index/deref,
emit `lowerExprAsPtr(operand_node)` and skip the heap-copy; otherwise
keep the alloca-store-heap_copy path.
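The gating decision above can be modelled outside the compiler. The Python sketch below is illustrative only: the node-kind strings and the `choose_erasure_path` name are hypothetical stand-ins, not the identifiers used in `lower.zig`.

```python
# Hypothetical AST node kinds; the real compiler's names may differ.
LVALUE_KINDS = {"identifier", "field_access", "index", "deref"}

def is_lvalue_expr(node_kind: str) -> bool:
    """Return True when the operand names existing storage."""
    return node_kind in LVALUE_KINDS

def choose_erasure_path(node_kind: str) -> str:
    """Pick the lowering strategy for a struct-typed `xx` operand."""
    if is_lvalue_expr(node_kind):
        return "borrow"      # take the operand's address, no heap copy
    return "heap_copy"       # alloca, store, then copy via the allocator

for kind in ("identifier", "struct_literal", "call"):
    print(kind, "->", choose_erasure_path(kind))
```

Because the check looks only at the syntactic shape of the operand, the same struct type can take either path depending on how the expression is written.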
specs.md §3 ownership table extended to three rows (rvalue, lvalue,
pointer) with examples and rationale per row.
Regressions:
- `examples/130-xx-value-routes-through-context-allocator.sx` — the
Phase 1.1 witness for heap-copy-via-context-allocator. Previous shape
(`xx <local-value>`) is now a borrow under Option 3 and no longer
exercises the heap-copy path. Rewritten to use a struct literal
(`xx ByValue.{...}`) which still heap-copies through context.allocator
— Tracer.count = 1 as before.
- `examples/135-xx-lvalue-borrows.sx` — new test. Dereferences a
TrackingAllocator into a stack value, does `xx tracker` inside a
push Context, and asserts alloc_count/dealloc_count on the LOCAL go
up. Under old semantics this would have stayed at 0 (heap copy got
the increments, local stayed stale).
157/157 example tests pass; chess clean on macOS / iOS sim / Android
(`tools/verify-step.sh` ran green immediately before this work).
sx language specification
1. Lexical Structure
Comments
Line comments start with // and extend to end of line.
// this is a comment
Identifiers
- Lowercase or mixed-case for variables, functions: `x`, `compute`, `main`
- UPPER_SNAKE_CASE for constants: `SOME_INT`, `SOME_STR`
- PascalCase for types: `Foo`
Literals
| Kind | Examples | Type |
|---|---|---|
| Integer | `0`, `42`, `0xFF`, `0b1010` | `s64` |
| Float | `0.3`, `0.9` | `f32` |
| String | `"Hello"`, `"z: {z}"` | `string` (may span multiple lines) |
| Heredoc String | `#string END...END` | `string` |
| Boolean | `true`, `false` | `bool` |
| Enum | `.variant1` | inferred from context |
| Undefined | `---` | context-dependent |
String literals support escape sequences (\n, \t, \r, \\, \", \0) and may span multiple lines directly:
shader_src := "#version 330 core
void main() {
gl_Position = vec4(0.0);
}
";
Heredoc strings use #string DELIMITER syntax (inspired by Jai). Content is completely raw — no escape processing. The delimiter is any identifier. Content starts after the newline following the delimiter and ends when the delimiter appears at column 0 of a line.
vert_src := #string GLSL
#version 330 core
void main() {
gl_Position = vec4(aPos, 1.0);
}
GLSL;
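The termination rule for heredocs can be sketched as a small scanner. This is a Python model under stated assumptions, not the sx lexer: the `read_heredoc` name is hypothetical, the input is assumed to start at `#string`, and the newline before the closing delimiter is assumed not to be part of the content (the text above does not say either way).

```python
def read_heredoc(source: str) -> str:
    """Extract raw heredoc content from text starting at `#string`."""
    header, _, rest = source.partition("\n")
    marker, delimiter = header.split()
    if marker != "#string":
        raise ValueError("expected #string")
    body = []
    for line in rest.split("\n"):
        tail = line[len(delimiter):len(delimiter) + 1]
        # The delimiter only terminates at column 0, as a whole identifier.
        if line.startswith(delimiter) and not (tail.isalnum() or tail == "_"):
            return "\n".join(body)
        body.append(line)
    raise ValueError("unterminated heredoc")

# The backslash-n below stays as two raw characters: no escape processing.
src = "#string GLSL\nline one\\n\n  GLSL indented\nGLSL;"
print(repr(read_heredoc(src)))
```

An indented occurrence of the delimiter is ordinary content, which is why the second body line does not end the string.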
Keywords
if, else, then, while, for, break, continue, true, false, enum, struct, union, case, return, defer, push, ufcs, in, xx, and, or
Note:
`enum` is used for both payload-less and payload-bearing sum types (tagged unions). `union` is reserved for C-style untagged unions (memory overlays).
Operators
| Operator | Meaning |
|---|---|
| `+` | addition |
| `-` | subtraction / negation |
| `*` | multiplication |
| `/` | division |
| `==` | equality |
| `!=` | inequality |
| `<` | less than |
| `>` | greater than |
| `<=` | less or equal |
| `>=` | greater or equal |
| `&` | bitwise AND |
| `\|` | bitwise OR |
| `^` | bitwise XOR |
| `~` | bitwise NOT (unary) |
| `<<` | left shift |
| `>>` | right shift (arithmetic for signed, logical for unsigned) |
| `and` | logical AND (short-circuit) |
| `or` | logical OR (short-circuit) |
| `in` | membership test (tuples) |
| `\|>` | pipe (function application) |
| `+=` | add-assign |
| `-=` | sub-assign |
| `*=` | mul-assign |
| `/=` | div-assign |
| `&=` | bitwise AND assign |
| `\|=` | bitwise OR assign |
| `^=` | bitwise XOR assign |
| `<<=` | left shift assign |
| `>>=` | right shift assign |
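The two behaviours of `>>` listed in the table differ only in what is shifted in at the top. The following Python model is a sketch of that distinction; `shift_right` and its parameters are hypothetical names, and the fixed width is passed explicitly because Python integers are unbounded.

```python
def shift_right(value: int, amount: int, bits: int, signed: bool) -> int:
    """Model `>>` on a fixed-width integer of `bits` bits."""
    mask = (1 << bits) - 1
    if signed:
        # Python's >> on a negative int already sign-extends (arithmetic).
        return value >> amount
    # Unsigned: operate on the raw bit pattern, zeros shift in (logical).
    return (value & mask) >> amount

print(shift_right(-8, 1, 8, signed=True))     # arithmetic shift
print(shift_right(0xF8, 1, 8, signed=False))  # logical shift
```

`-8` as an 8-bit signed value and `0xF8` as an 8-bit unsigned value share the same bit pattern, so the two calls show how the operand's signedness changes the result.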
Delimiters and Punctuation
| Token | Meaning |
|---|---|
| `::` | constant binding / definition |
| `:=` | variable binding (mutable, inferred) |
| `:` | type annotation |
| `=` | assignment (in typed var decl) |
| `;` | statement terminator |
| `,` | separator (trailing commas allowed) |
| `.` | field access / enum literal prefix |
| `->` | return type annotation |
| `=>` | lambda arrow |
| `$` | generic type parameter introduction |
| `---` | undefined value |
| `()` | grouping / params |
| `{}` | blocks / bodies |
2. Type System
Primitive Types
- `s1`..`s64`: signed integers (1 to 64 bits). `s64` is the default for integer literals.
- `u1`..`u64`: unsigned integers (1 to 64 bits).
- `f32`: 32-bit floating point
- `f64`: 64-bit floating point
- `bool`: boolean (`true`/`false`)
- `string`: string of characters
- `Any`: type-erased value, represented as `{ i64, i64 }` (type tag + payload). Used for variadic arguments and runtime type dispatch.
- `Type`: compile-time type value. At runtime, represented as an `i64` type tag (same tag space as `Any`).
Enum Types
User-defined sum types with named variants. Variants may optionally carry typed data (tagged unions). Internally, payload-less enums are represented as i64 (variant index). Enums with payloads are represented as { i64, [max_payload_size x i8] } (tag + data).
Declaration
// Payload-less enum
Color :: enum {
red;
green;
blue;
}
// Enum with payloads (tagged union)
Shape :: enum {
circle: f32; // typed variant
rect: s32; // typed variant
none; // void variant
}
Variants are referenced with dot-prefix syntax: .variant1
Construction
c := Color.red; // payload-less
s :Shape = .circle(3.14); // inferred from context
s = .none; // void variant
s = Shape.rect(42); // explicit prefix
Payload Access
r := s.circle; // load payload as f32 (undefined behavior if wrong variant active)
Pattern Matching
if s == {
case .circle: print("circle\n");
case .rect: print("rect\n");
case .none: print("none\n");
}
Payload Capture
Match arms can capture the variant's payload into a local variable:
if s == {
case .circle: (radius) { print("radius: {}\n", radius); }
case .rect: (size) => print("size: {}\n", size);
}
The (name) after the colon binds the payload. Two forms:
- Block: `case .variant: (name) { body }`
- Short: `case .variant: (name) => expr;`
Enum Interpolation
Payload-less enums print as .variant. Enums with payloads print as .variant(value) or <TypeName tag=N>:
print("{}", s); // .circle(3.140000)
Union Types (Untagged)
C-style untagged unions for zero-cost memory overlays (type punning). All fields share the same memory — no tag, no runtime overhead. The LLVM representation is [max_field_size x i8].
Declaration
Overlay :: union {
f: f32;
i: s32;
}
All fields must have types (unlike enums, which may have void variants).
Anonymous Struct Fields (Member Promotion)
Anonymous struct fields inside a union have their members promoted to the union namespace:
Vec2 :: union {
data: [2]f32;
struct { x, y: f32; };
}
Access promoted members directly: v.x, v.y — these are zero-cost GEPs into the same underlying memory as v.data[0], v.data[1].
Initialization
Unions must be initialized with --- (undefined) and then assigned per-field:
o :Overlay = ---;
o.f = 3.14;
print("{}\n", o.i); // reinterpret bits as s32
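Reading `o.i` after writing `o.f` reinterprets the same four bytes. A rough Python equivalent, assuming a little-endian target and using the standard `struct` module to stand in for the shared storage (`f32_bits_as_s32` is a hypothetical helper name):

```python
import struct

def f32_bits_as_s32(value: float) -> int:
    """Write a 32-bit float, then read the same four bytes as a signed int."""
    raw = struct.pack("<f", value)       # the shared storage
    return struct.unpack("<i", raw)[0]

print(hex(f32_bits_as_s32(1.0)))
```

No conversion takes place: the integer is the IEEE 754 bit pattern of the float, which is why the result bears no numeric resemblance to the value stored.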
Restrictions
- Pattern matching (`if x == { case ... }`) is not supported on unions.
- Unions cannot be printed directly via `print("{}", union_val)`; access individual fields instead.
Struct Types
User-defined product types with named fields.
Vec4 :: struct {
x, y, z, w: f32;
}
Fields are declared as name1, name2: type; (comma-separated names sharing a type, semicolon-terminated).
Field Defaults
Fields may have default values. Fields without an explicit default have a zero-value default. --- marks a field as explicitly undefined.
Foo :: struct {
a : u2; // default is 0
b : u8 = 42; // default is 42
c : u8 = ---; // default is undefined
}
Struct Literals
// Positional (with type annotation — type inferred from annotation)
v1 : Vec4 = .{ 1, 2, 3, 0 };
// Positional (with type prefix)
v2 := Vec4.{ 4, 1, 1, 3 };
// Named fields (any order)
v3 := Vec4.{ w=0, x=2, y=3, z=4 };
// Mixed named + shorthand (bare identifier = field name matches variable name)
z := 5.0;
w := 6.0;
v4 := Vec4.{ y=3, x=9, w, z };
// Trailing commas are allowed in all comma-separated lists
v5 := Vec4.{
x = 1.0,
y = 2.0,
z = 3.0,
w = 4.0,
};
Field Access and Assignment
v1.x // read field x of struct v1
v1.x = 3.0; // assign to field x of struct v1
#using — Struct Composition
#using StructName; inside a struct declaration embeds all fields from StructName at that position. The embedded fields are accessed directly, as if declared inline.
UBase :: struct { x: s32; y: s32; }
UExt :: struct { #using UBase; z: s32; }
e := UExt.{ x = 1, y = 2, z = 3 };
print("{}\n", e.x); // 1
#using may appear at any field position (beginning, middle, end) and multiple #using entries are allowed:
UPos :: struct { px: s32; py: s32; }
UCol :: struct { r: s32; g: s32; }
USprite :: struct { #using UPos; #using UCol; scale: s32; }
s := USprite.{ px = 10, py = 20, r = 255, g = 128, scale = 1 };
The referenced struct must be declared before use. This is purely a compile-time field expansion — no runtime overhead.
Struct Interpolation
Struct values in string interpolation print as TypeName{field:value, ...}:
print("{}", v1); // Vec4{x:1.0, y:2.0, z:3.0, w:0.0}
Struct Methods
Functions declared inside a struct body become methods, registered as StructName.method:
Point :: struct {
x, y: s32;
sum :: (self: *Point) -> s32 { self.x + self.y; }
}
p := Point.{ x = 3, y = 4 };
print("{}\n", p.sum()); // 7
Methods receive the struct (typically as a pointer) as their first parameter. Dot-call syntax obj.method(args) resolves struct methods — it is not UFCS for arbitrary free functions. The pipe operator |> remains the universal UFCS mechanism.
Protocol Types
Protocols define a set of method signatures that types can implement. They enable:
- Static dispatch: compile-time checked constraints on generic type parameters.
- Dynamic dispatch: type-erased protocol values with runtime method dispatch through function pointers.
Declaration
Allocator :: protocol #inline {
alloc :: (size: s64) -> *void;
dealloc :: (ptr: *void);
}
Protocol methods have an implicit receiver — no self in the protocol signature. The compiler adds *Self automatically. The #inline modifier embeds function pointers directly in the protocol value (no vtable indirection).
#inline vs default layout
| Layout | Declaration | Value layout | Dispatch cost |
|---|---|---|---|
| `#inline` | `protocol #inline { ... }` | `{ ctx: *void, fn_ptr1, fn_ptr2, ... }` | Zero indirection |
| Default | `protocol { ... }` | `{ ctx: *void, __vtable: *Vtable }` | One pointer chase |
Use #inline for protocols with few methods where call overhead matters (e.g., allocators). Use the default layout for protocols with many methods to keep the value size small.
impl Blocks
impl Allocator for GPA {
alloc :: (self: *GPA, size: s64) -> *void {
self.alloc_count += 1;
malloc(size);
}
dealloc :: (self: *GPA, ptr: *void) {
self.alloc_count -= 1;
free(ptr);
}
}
- Top-level declarations (not inside struct bodies)
- Enable retroactive conformance: implement a protocol for types you don't own
- Impl methods are also registered as struct methods (`GPA.alloc`) for direct calls
- Duplicate `{Protocol, Type}` pair in the same compilation unit is a compile error
Protocol Values and xx Conversion
Convert a concrete type to a protocol value with xx:
gpa := GPA.init();
a : Allocator = xx gpa; // concrete → protocol value
ptr := a.alloc(64); // dynamic dispatch through fn-ptr
a.dealloc(ptr);
xx works at assignment, call sites, and return positions:
use_allocator(xx gpa); // at call site
make_alloc :: () -> Allocator { xx gpa; } // in return position
Protocol values can be stored in struct fields, arrays, and passed through function calls:
Arena :: struct {
parent: Allocator; // protocol value as struct field
// ...
}
allocators : [2]Allocator = .[xx gpa, xx arena]; // protocol values in array
Ownership and Lifetime
Protocol values have two ownership modes. The mode is selected by the
shape of the operand to xx:
| Operand shape | `ctx` points to | Lifetime | Who frees |
|---|---|---|---|
| `xx <rvalue>` (struct literal, call result, etc.) | Heap-allocated copy | Until `free(p)` | Caller |
| `xx <lvalue>` (identifier, field, index, deref) | The named storage | Tied to that storage's scope | Caller manages the storage |
| `xx <pointer>` / `xx @ptr` | Original pointee | Tied to pointee | Caller manages pointee |
xx <rvalue> — when the operand has no storage of its own (struct
literal, function-call result, arithmetic expression, etc.) the concrete
data is heap-copied through context.allocator so the protocol value is
self-contained. It can be stored in containers, returned from functions,
and outlives the scope where it was created. Call free(p) to release
the backing memory when done:
s : Sizable = xx Widget.{ value = 42 }; // heap-copies Widget
print("{}\n", s.size());
free(s); // frees the heap-allocated Widget copy
xx <lvalue> — when the operand names existing storage (a local
variable, struct field, array element, or dereferenced pointer) the
protocol borrows that storage directly. No heap copy, no allocation,
no free needed; mutations through the protocol are visible to the
original. The protocol value is only valid while the named storage is
alive:
w := Widget.{ value = 0 };
s : Sizable = xx w; // borrows w's storage; no copy
s.add(5); // modifies w through ctx
print("{}\n", w.value); // 5
// do NOT free(s) — w owns the data
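The difference between the borrowing and copying modes comes down to whether `ctx` aliases the original. The Python model below is a sketch of that aliasing only; `Widget`, `Sizable`, `erase_lvalue` and `erase_rvalue` are illustrative stand-ins, and `copy.copy` plays the role of the heap copy.

```python
import copy

class Widget:
    def __init__(self, value):
        self.value = value

class Sizable:
    """Stand-in for a protocol value: `ctx` plus the methods it dispatches."""
    def __init__(self, ctx):
        self.ctx = ctx

    def add(self, n):
        self.ctx.value += n

def erase_lvalue(widget):
    return Sizable(widget)              # borrow: ctx aliases the original

def erase_rvalue(widget):
    return Sizable(copy.copy(widget))   # ctx is an independent copy

w = Widget(0)
erase_lvalue(w).add(5)
borrowed_result = w.value               # mutation reached w
erase_rvalue(w).add(5)
copied_result = w.value                 # mutation stayed in the copy
print(borrowed_result, copied_result)
```

The second call mutates only the copy, which mirrors what happens when a struct is heap-copied before erasure: writes through the protocol value never reach the original.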
xx @ptr is equivalent to xx <lvalue> for the dereferenced
pointee — the protocol borrows. It's mostly redundant under the
lvalue rule above but stays valid for explicit clarity when the
operand is a pointer you want to make obvious is being borrowed:
w := Widget.{ value = 0 };
s : Sizable = xx @w; // identical to `xx w` — borrows w
Vtables are global constants — shared across all protocol values of the same (Protocol, ConcreteType) pair. They are never allocated or freed at runtime.
Default Methods
Protocol methods can have bodies. self dispatches through the vtable (dynamic dispatch):
Writer :: protocol {
write :: (data: string) -> s64; // required
write_line :: (data: string) -> s64 { // default
n := self.write(data);
n + self.write("\n");
}
}
Default methods are used unless overridden in the impl. Default methods calling self.method() dispatch through the vtable, so they work correctly with any concrete type.
Self Type
Self is a contextual keyword in protocol declarations — resolves to the concrete type in impls:
Eq :: protocol { eq :: (other: Self) -> bool; }
impl Eq for Point {
eq :: (self: *Point, other: Point) -> bool {
self.x == other.x and self.y == other.y;
}
}
// Static dispatch:
p1.eq(p2); // calls Point.eq directly
// Dynamic dispatch:
e : Eq = xx p1;
e.eq(p2); // dispatches through vtable, Self params erased to *void
For dynamic dispatch, Self parameters are erased to *void — the caller passes a pointer to the argument, and the thunk loads the concrete value.
Generic Constraints
$T/Protocol syntax validates that a type parameter implements the required protocol(s):
are_equal :: (a: $T/Eq, b: T) -> bool { a.eq(b); }
// Multiple constraints:
eq_and_hash :: (a: $T/Eq/Hashable, b: T) -> bool { ... }
Constraints produce clear errors at monomorphization: "s64 does not implement Hashable". Dispatch is static — same as unconstrained generics but with compile-time validation.
Constraints also work on struct type parameters:
SortedPair :: struct ($T: Type/Comparable) {
lo: T;
hi: T;
}
Generic Struct Impls
Pair :: struct ($T: Type) { a: T; b: T; }
impl Summable for Pair($T) {
sum :: (self: *Pair(T)) -> s32 { xx self.a + xx self.b; }
}
The impl is instantiated per concrete type argument, like generic struct methods.
Dispatch Rules
| Usage | Dispatch | Cost |
|---|---|---|
| `gpa.alloc(64)` on `*GPA` | Static: direct call | Zero |
| `$T/Allocator` constraint | Static: monomorphized | Zero |
| `a : Allocator = xx gpa; a.alloc(64)` | Dynamic: fn-ptr / vtable | Indirect call |
Static dispatch is automatic when the concrete type is known. Dynamic dispatch only when explicitly type-erased via xx into a protocol value.
Parameterised Protocols (compile-time only)
A protocol with type parameters is compile-time only — it has no vtable
and no boxed instance shape. Each impl is monomorphised per
(ProtocolArgs, Source) pair. The canonical example is Into, declared
in modules/std.sx:
Into :: protocol(Target: Type) {
convert :: () -> Target;
}
A user can then add conversions for any (Source, Target) pair:
MyString :: struct { tag: s64 = 0; }
impl Into(MyString) for s64 {
convert :: (self: s64) -> MyString { .{ tag = self }; }
}
main :: () -> s32 {
x : MyString = xx 42; // direct call to monomorphised convert
0;
}
The xx operator hooks into this mechanism: when an explicit target type
is provided and the built-in coercion ladder doesn't apply,
xx val : T lowers to val.convert() where convert comes from the
visible impl Into(T) for typeof(val). The call is a direct call — no
vtable, no runtime dispatch.
Source side is a TypeExpr. Unlike nullary impl P for SomeStruct,
the for-side of a parameterised impl accepts any type expression,
including closure and function types:
impl Into(Block) for Closure() -> void { ... }
impl Into(MyBuf) for []u8 { ... }
Lookup rules:
- Built-ins win. The user-space fallback only fires when `coerceToType` made no progress (numeric narrow/widen, ptr↔int, etc. take priority).
- Only at explicit `xx`. Implicit conversions (assignment, parameter passing) never trigger user-space coercions.
- Explicit target required. `xx val` with no surrounding type context still defaults to `s64` for legacy reasons; the user-space fallback only fires when the target was named explicitly.
- Import-scoped visibility. An `impl` is visible from a file only if the file transitively imports the impl's defining module. An impl in an imported-but-not-directly-related module produces a clean diagnostic (`no visible xx conversion …`).
- Duplicate impls error. If two impls for the same `(Source, Target)` pair are both visible, the compiler emits a diagnostic naming both source modules. Same-file duplicates are caught at registration time. Cross-module duplicates are caught at the `xx` site.
- No recursion. A `convert` body that re-enters `xx self : Target` for the same `(Source, Target)` pair produces a "recursive xx conversion" diagnostic; the compiler does not try to monomorphise the convert into itself.
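A few of the lookup rules above (built-ins first, explicit target required, duplicates rejected) can be sketched as a table keyed by `(Source, Target)`. This Python model is a simplification under assumptions: `ConversionRegistry`, its method names, the sample built-in pairs, and the error strings are all hypothetical, and import-scoped visibility and recursion detection are not modelled.

```python
class ConversionRegistry:
    """Toy model of `(Source, Target)` conversion lookup for explicit `xx`."""
    BUILTIN = {("f64", "u16"), ("s32", "u8")}   # sample built-in coercions

    def __init__(self):
        self.impls = {}

    def register(self, source, target, module):
        key = (source, target)
        if key in self.impls:
            raise ValueError(f"duplicate impl for {key}: "
                             f"{self.impls[key]} and {module}")
        self.impls[key] = module

    def resolve(self, source, target):
        if target is None:
            return "no user-space fallback"      # explicit target required
        if (source, target) in self.BUILTIN:
            return "builtin"                     # built-ins win
        if (source, target) in self.impls:
            return f"convert from {self.impls[(source, target)]}"
        raise LookupError(f"no conversion {source} -> {target}")

reg = ConversionRegistry()
reg.register("s64", "MyString", "app")
try:
    reg.register("s64", "MyString", "other")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(reg.resolve("s64", "MyString"), duplicate_rejected)
```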
Tuple Types
Anonymous product types with optional field names. Tuples are first-class values — they can be stored in variables, passed to functions, and returned.
Construction
pair := (40, 2); // positional tuple: (s64, s64)
named := (x: 10, y: 20); // named tuple: (x: s64, y: s64)
single := (42,); // 1-tuple (trailing comma in value position)
zeroed : (s32, s32) = ---; // zero-initialized tuple
Note: In value position, (expr) without a comma is a grouping expression, not a tuple. Use (expr,) for a 1-tuple value.
Type Syntax
In type position, (T) is always a tuple type — no trailing comma needed. The -> arrow disambiguates function types from tuple types:
(s64) // tuple type with one field
(s64, s64) // tuple type with two fields
(s64) -> s64 // function type: takes s64, returns s64
(s64, s64) -> s64 // function type: takes two s64, returns s64
Field Access
pair.0; // 40 — numeric index
pair.1; // 2
named.x; // 10 — named field
named.0; // 10 — numeric index also works on named tuples
As Return Type
swap :: (a: s64, b: s64) -> (s64, s64) { (b, a); }
wrap :: (x: s64) -> (s64) { (x,); }
s := swap(1, 2); // s.0 = 2, s.1 = 1
t := wrap(42); // t.0 = 42
Representation
Tuples are represented as anonymous LLVM struct types (same layout as named structs). A tuple (s64, s64) has LLVM type { i64, i64 }.
Tuple Operators
Equality and inequality — element-wise comparison, both sides must have the same field count:
(1, 2) == (1, 2) // true
(1, 2) != (1, 3) // true
Concatenation (+) — creates a new tuple with fields from both sides:
c := (1, 2) + (3, 4); // c : (s64, s64, s64, s64)
c.0; // 1
c.3; // 4
Repetition (*) — repeats a tuple N times (N must be a compile-time integer literal):
r := (1, 2) * 3; // r : (s64, s64, s64, s64, s64, s64)
r.0; // 1
r.5; // 2
Lexicographic comparison (<, <=, >, >=) — compares element-by-element left to right:
(1, 2) < (1, 3) // true (first fields equal, 2 < 3)
(2, 0) > (1, 9) // true (2 > 1, rest ignored)
(1, 2) <= (1, 2) // true (all equal, <= allows tie)
Membership (in) — checks if a value exists in a tuple:
3 in (1, 2, 3) // true
5 in (1, 2, 3) // false
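Python tuples happen to share these operator semantics, which makes them a convenient way to check expectations. The sketch below also spells out the lexicographic rule as an explicit loop; `lex_less` is a hypothetical helper and assumes both sides have the same field count, as the spec requires.

```python
def lex_less(a, b):
    """Compare element by element, left to right; first difference decides."""
    for x, y in zip(a, b):
        if x != y:
            return x < y
    return False  # every compared field was equal, so a is not less than b

concat = (1, 2) + (3, 4)        # concatenation
repeated = (1, 2) * 3           # repetition
member = 3 in (1, 2, 3)         # membership
print(concat, repeated, member, lex_less((1, 2), (1, 3)))
```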
Array Types
Fixed-size arrays with element type and length.
buffer : [5]f32 = .[0, 2, 3.5, 4, 0];
val := buffer[2]; // 3.5
buffer.len // 5 (compile-time constant, s64)
Arrays can also be constructed programmatically with the Array builtin:
MyArr :: Array(5, s32); // equivalent to [5]s32
Slice Types
A slice []T is a fat pointer {ptr, i64} referencing a contiguous sequence of T elements. Same runtime layout as string.
// Arrays implicitly coerce to slices at call sites
arr : [5]s32 = .[3, 1, 4, 1, 5];
sortSlice(arr); // [5]s32 → []s32 coercion
// Slice operations
items[i] // read element at index
items[i] = val; // write element at index
items.len // length (s64)
items.ptr // raw pointer
Slices support generic type parameters: []$T introduces type parameter T inferred from the element type of the argument (array or slice).
Subslicing
Arrays, slices, and strings support subslice syntax to create zero-copy views:
arr : [5]s32 = .[3, 1, 4, 1, 5];
sub := arr[1..4]; // []s32 → [1, 4, 1]
head := arr[..3]; // []s32 → [3, 1, 4]
tail := arr[2..]; // []s32 → [4, 1, 5]
msg := "hello world";
word := msg[6..11]; // string → "world"
- `expr[start..end]`: elements from `start` (inclusive) to `end` (exclusive)
- `expr[start..]`: elements from `start` to end
- `expr[..end]`: elements from beginning to `end`
- Result type: `[]T` for arrays/slices, `string` for strings
- No memory allocation: the result points into the original backing storage
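The zero-copy property can be observed with any view type that shares its backing storage. As an analogy only, Python's `memoryview` behaves the same way: bounds are half-open and writes to the backing buffer show through the view.

```python
backing = bytearray([3, 1, 4, 1, 5])
view = memoryview(backing)

sub = view[1:4]     # like arr[1..4]: start inclusive, end exclusive
head = view[:3]     # like arr[..3]
tail = view[2:]     # like arr[2..]

before = list(sub)
backing[2] = 9      # write through the original storage
after = list(sub)   # the view observes the change: no copy was made
print(before, after)
```

The practical consequence is the same as for borrowed protocol values: a subslice is only valid while the storage it points into is alive.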
Pointer Types
| Syntax | Meaning | `.len` | `[i]` |
|---|---|---|---|
| `*T` | pointer to one T | no | no |
| `[*]T` | many-pointer (buffer) | no | yes |
| `*[N]T` | pointer to array of N T | yes | yes |
| `*[]T` | pointer to slice | yes | yes |
Address-of: @x returns a pointer to the variable.
v := Vec2.{ 1.0, 2.0 };
ptr := @v; // *Vec2
Dereference: p.* loads the value through the pointer.
copy := ptr.*; // Vec2
Auto-deref: p.field is sugar for p.*.field.
set_x :: (p: *Vec2, val: f32) {
p.x = val; // auto-deref: p.*.x = val
}
set_x(@v, 99.0);
Null: Pointer types are currently nullable by default. null is the null pointer literal.
np : *Vec2 = null;
Many-pointer: [*]T supports indexing for buffers of unknown size.
arr : [5]s32 = .[10, 20, 30, 40, 50];
mp : [*]s32 = @arr[0]; // *s32 → [*]s32 implicit
val := mp[2]; // 30
Implicit conversions:
- `*T` → `[*]T` (pointer to element → many-pointer)
- `*[N]T` → `[*]T` (pointer to array → many-pointer)
- `[N]T` → `[*]T` at call sites (array decays to many-pointer)
- `[]T` → `[*]T` (slice decays to many-pointer, extracts `.ptr`)
- `T` → `*T` at call sites (implicit address-of)
- `null` (`*void`) → any `*T`
Fat pointer layout: [:0]u8, string, and []T are {ptr, i64} structs. The raw pointer is always the first field at offset 0. This means *[:0]u8 works as C's char** — a C function dereferences through the outer pointer and reads the raw char* from offset 0.
Optional Types
Optional types represent values that may or may not be present.
Type Syntax
x: ?s32 = 42; // optional s32, has value
y: ?s32 = null; // optional s32, no value
Any type T can be made optional: ?s32, ?string, ?Point, ?*T, ?[]T.
LLVM Representation
- Non-pointer optionals (`?s32`, `?Point`): `{ T, i1 }` struct, payload + has_value flag
- Pointer optionals (`?*T`): bare pointer, null represents absence
Implicit Wrapping
A value of type T implicitly converts to ?T:
wrap :: (n: s32) -> ?s32 {
if n > 0 { return n; } // s32 → ?s32 (wraps)
return null; // null → ?s32
}
Force Unwrap (!)
Extracts the payload, traps at runtime if null:
x: ?s32 = 42;
val := x!; // val : s32 = 42
Null Coalescing (??)
Returns the payload if present, otherwise evaluates the right-hand side:
x: ?s32 = 42;
y: ?s32 = null;
a := x ?? 0; // 42
b := y ?? 99; // 99
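The wording "otherwise evaluates the right-hand side" suggests the fallback is only computed when needed. Assuming that reading is correct, the behaviour can be modelled in Python by passing the right-hand side as a callable; `coalesce` and `expensive_default` are hypothetical names, and `None` stands in for `null`.

```python
def coalesce(optional, fallback):
    """Model `a ?? b`: `fallback` is a zero-argument callable, run only if needed."""
    return optional if optional is not None else fallback()

calls = []

def expensive_default():
    calls.append("evaluated")
    return 99

a = coalesce(42, expensive_default)    # payload present, fallback skipped
b = coalesce(None, expensive_default)  # absent, fallback evaluated once
print(a, b, calls)
```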
Safe Unwrap (if val := expr)
Binds the payload to a variable if present:
x: ?s32 = 42;
if val := x {
print("{}\n", val); // val : s32 = 42
} else {
print("none\n");
}
While-Optional Binding
while val := get_next() {
// val is the unwrapped value
}
Pattern Matching
Optionals support .some and .none virtual enum variants:
result := if opt == {
case .some: (val) { val * 2; }
case .none: { 0; }
};
Optional Chaining (?.)
Short-circuits field access on optionals:
x: ?Point = Point.{ x = 1, y = 2 };
y: ?Point = null;
a := x?.x ?? 0; // 1
b := y?.x ?? 0; // 0
Result type of x?.field is always ?FieldType.
Flow-Sensitive Narrowing
The compiler narrows ?T to T in control flow branches:
x: ?s32 = 42;
if x != null {
print("{}\n", x); // x is s32 here (narrowed)
}
if x == null { return; }
print("{}\n", x); // x is s32 here (guard narrowing)
Compound conditions:
if a != null and b != null {
// both a and b are narrowed to their inner types
}
if a == null or b == null { return; }
// both a and b are narrowed after the guard
Reassignment kills narrowing.
Struct Field Defaults
Optional fields in structs default to null:
Node :: struct { value: s32; next: ?s32; }
n := Node.{ value = 10 }; // n.next is null
Printing
print("{}", opt) prints the payload value if present, or "null".
Comptime
Optionals work in #run blocks — ??, !, if val :=, null checks all supported.
Foreign Function Interface (C Interop)
To call C functions, declare a library constant with #library and bind functions with #foreign:
// Declare a named library constant
libc :: #library "c";
sdl :: #library "SDL3";
// Bind foreign functions — library ref is required
socket :: (domain: s32, type: s32, protocol: s32) -> s32 #foreign libc;
SDL_Init :: (flags: u32) -> bool #foreign sdl;
// Symbol renaming — optional second argument gives the C symbol name
write_fd :: (fd: s32, buf: [*]u8, count: u64) -> s64 #foreign libc "write";
- `#library "name"` must be assigned to a named constant. The library is passed to the linker (`-lname` on Unix, `name.lib` on Windows).
- `#foreign lib_ref` declares a function as external C. The library reference is mandatory.
- `#foreign lib_ref "c_symbol"` renames the binding: the sx function name differs from the C symbol. This avoids name collisions (e.g. POSIX `write` vs an sx builtin).
C Interop Type Mapping
| C type | sx type | Notes |
|---|---|---|
| `const char*` (input) | `[:0]u8` | compiler extracts `.ptr` at call site |
| `char*` (output buffer) | `[*]u8` | raw buffer, no length |
| `const char**` | `*[:0]u8` | address of `[:0]u8`; `.ptr` at offset 0 |
| `int*` (single out) | `*s32` | |
| `unsigned*` (single out) | `*u32` | |
| `float*` (buffer) | `[*]f32` | |
| `void*` (generic) | `*void` | only for truly opaque/generic data |
Vector Types (SIMD)
LLVM SIMD vectors, parameterized by length and element type.
v := vec3(1, 3, 2); // Vector(3, f32)
Arithmetic: Element-wise +, -, *, / on vectors of same dimensions.
add := v1 + v2; // element-wise addition
Scalar broadcast: Scalar operands are broadcast to match the vector.
scaled := v * 2.0; // [2.0, 6.0, 4.0]
Negation: Unary - negates each element.
neg := -v; // [-1.0, -3.0, -2.0]
Element access: .x, .y, .z, .w (aliases .r, .g, .b, .a) extract single components.
v.x // first element
v.z // third element
Index access: v[i] extracts by index.
v[0] // first element
Built-in sqrt: Calls LLVM llvm.sqrt.f32/.f64 intrinsic.
s := sqrt(9.0); // 3.0
Function Types
Expressed as (param_types) -> return_type.
A function with no return type annotation returns void.
// type is (s32) -> s32
compute :: (x: s32) -> s32 { x * x; }
// type is () -> void
main :: () { }
Type Aliases
A name bound to an existing type.
SOME_TYPE :: f64;
Generic Functions (Monomorphization)
Functions can be parameterized over types using $T syntax. The $ prefix introduces a type parameter; subsequent uses of the name reference it.
sum :: (a: $T, b: T) -> T {
return a + b;
}
- `$T` in a parameter type introduces type parameter `T`
- Bare `T` (without `$`) references the introduced type parameter
- At call sites, type arguments are inferred from actual argument types: `sum(40, 2)` gives `T = s32`, `sum(1.5, 2.5)` gives `T = f32`
- Each unique set of concrete types produces a separate specialized function (monomorphization)
- Multiple type parameters are supported: `(a: $T, b: $U) -> T`
Variadic Functions
Functions can accept a variable number of arguments using ..Type syntax:
print :: (fmt: string, args: ..Any) { ... }
- `..Any` means zero or more arguments, each boxed into `Any` (type tag + payload)
- The variadic parameter must be the last parameter
- At call sites, variadic arguments are automatically boxed: `print("x={}, y={}\n", x, y)`
- Inside the function body, `args` is accessed as a slice-like sequence
Type Inference
- `::` bindings infer type from the right-hand side
- `:=` bindings infer type from the right-hand side
- Explicit annotation overrides inference: `NAME : f64 : 0.9;`
- Integer literals default to `s64`
- Float literals default to `f32`
- Enum literals (`.variant`) infer their enum type from context (expected type)
Type Conversions
Implicit (widening) — allowed without annotation:
- Integer to wider integer of same signedness (`u8` → `u16`, `s8` → `s32`)
- Unsigned to strictly wider signed (`u8` → `s16`)
- Any integer to any float (`u8` → `f32`, `s32` → `f64`)
- Float to wider float (`f32` → `f64`)
- Integer and float literals can convert to any numeric type implicitly
Explicit (narrowing) — requires xx prefix:
- Integer to narrower integer (`s32` → `u8`)
- Signed to unsigned (`s32` → `u32`)
- Float to narrower float (`f64` → `f32`)
- Float to any integer (`f64` → `u16`)
- Unsigned to signed of same or narrower width (`u8` → `s8`)
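The widening rules listed above can be expressed as a small predicate over type names. The Python sketch below is one possible encoding, not the compiler's implementation: `parse` and `is_implicit` are hypothetical names, literals are not modelled, and a type converted to itself is treated as "not a conversion".

```python
def parse(type_name: str):
    """Split a name such as 's32' into (kind, bit width)."""
    return type_name[0], int(type_name[1:])

def is_implicit(src: str, dst: str) -> bool:
    """True when src converts to dst without `xx` under the widening rules."""
    (sk, sw), (dk, dw) = parse(src), parse(dst)
    if sk in "su" and dk == "f":
        return True                      # any integer to any float
    if sk == "f" and dk == "f":
        return dw > sw                   # float to wider float
    if sk == dk and sk in "su":
        return dw > sw                   # wider integer, same signedness
    if sk == "u" and dk == "s":
        return dw > sw                   # unsigned to strictly wider signed
    return False                         # everything else needs `xx`

print(is_implicit("u8", "s16"), is_implicit("u8", "s8"))
```

Every pair in the narrowing list falls through to the final `return False`, which corresponds to the cases where `xx` is required.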
The xx prefix operator marks an expression for auto-conversion to the expected type from context (assignment, declaration, argument, return):
large: f64 = 5999.5;
x : u16 = xx large; // f64 → u16
d : u8 = #run xx resolve(5); // s32 → u8 at compile time
Using xx outside a typed context (where the target type is known) is a compile error.
3. Declarations
Constant Binding (immutable)
// inferred type
NAME :: value;
// explicit type
NAME : type : value;
The :: operator creates an immutable binding. The value is evaluated at compile time when possible.
Examples:
SOME_INT :: 0; // s64
SOME_STR :: "Hello"; // string
SOME_FLOAT :: 0.3; // f32
SOME_DOUBLE : f64 : 0.9; // f64 (explicit)
SOME_FUNC :: () => 42; // () -> s64
SOME_TYPE :: f64; // type alias
Variable Binding (mutable)
// inferred type
name := value;
// explicit type
name : type = value;
// default-initialized (type required)
name : type;
// undefined (type required)
name : type = ---;
The := operator creates a mutable binding. The type is inferred unless explicitly annotated.
name : type; initializes using the type's defaults: zero for primitives, per-field defaults for structs (see Field Defaults).
name : type = ---; leaves the value undefined (uninitialized memory). Reading before writing is undefined behavior.
Examples:
x := 42; // s64, mutable
x := if true then 1 else 2;
z : Foo = .variant2; // Foo, mutable, explicit type
a : Foo; // Foo, default-initialized (a=0, b=42, c=undef)
b : Foo = ---; // Foo, entirely undefined
Function Definition
name :: (params) -> return_type {
body
}
- Parameters: `name: type` separated by commas
- Return type: `-> type` (omit for void)
- Body: block of statements; last expression is the implicit return value
- No `return` keyword needed (last expression = return value)
Examples:
compute :: (x: s32) -> s32 {
x * x;
}
main :: () {
// void return, no -> annotation
}
// No-arg void function:
main :: () {
// ...
}
Default Parameter Values
A parameter can declare a default value with name: type = expr. When a
caller omits the trailing positional argument, the compiler substitutes
the default expression at the call site:
greet :: (name: string, prefix: string = "Hello") {
print("{} {}!\n", prefix, name);
}
greet("world"); // prints "Hello world!"
greet("world", "Good morning"); // prints "Good morning world!"
The default expression is captured as an AST node at parse time and
re-lowered fresh at each call site, so runtime expressions like
context.allocator resolve in the caller's scope, not the callee's
definition site. This is the mechanism that lets stdlib containers like
List(T) expose an optional allocator argument that defaults to
context.allocator without requiring callers to thread one through:
// In std.sx:
List :: struct ($T: Type) {
append :: (list: *List(T), item: T, alloc: Allocator = context.allocator) {
// ... grows via `alloc.alloc(...)` ...
}
}
// Call sites:
list.append(42); // alloc = current context.allocator
list.append(42, self.parent_allocator); // alloc = the named long-lived owner
Defaults are only consulted for trailing missing positional args; once a position is provided, all earlier positions must also be provided. There is no named-argument syntax for skipping middle defaults.
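As a rough model of call-site default resolution, the Python sketch below looks up a missing allocator argument from the current context on every call. It is not sx: `context`, `append`, `_MISSING`, and the string allocator names are hypothetical stand-ins chosen for illustration.

```python
# Hypothetical model: a global context whose allocator can change over time.
context = {"allocator": "default-gpa"}

_MISSING = object()  # sentinel meaning "argument omitted by the caller"

def append(items, item, alloc=_MISSING):
    # The default is looked up on every call, mirroring a default
    # expression that is re-lowered at each call site.
    if alloc is _MISSING:
        alloc = context["allocator"]
    items.append((item, alloc))
    return alloc

items = []
first = append(items, 42)            # uses the allocator active right now
context["allocator"] = "arena"
second = append(items, 42)           # sees the new allocator
third = append(items, 42, "parent")  # an explicit argument wins
```

A default captured once at definition time would have returned "default-gpa" for the second call as well.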
Enum Definition
Name :: enum {
variant1;
variant2;
}
Defines a new enum type with the given variants. Trailing comma is allowed.
Enum Backing Type
An optional backing type can be specified after the enum keyword (Jai-style):
Color :: enum u8 { red; green; blue; }
Status :: enum s16 { ok; error; timeout; }
Syntax: Name :: enum [flags] [type] { ... }
The backing type must be an integer type (u8, u16, u32, s8, s16, s32, s64, etc.). When omitted, the default is s64. This is useful for C interop (matching C enum sizes) and memory efficiency.
Enum Layout Struct
For C interop with tagged unions (e.g. SDL_Event), a struct can be used as the backing type to specify the exact memory layout:
// Inline layout
SDL_Event :: enum struct { tag: u32; _: u32; payload: [30]u32; } {
quit :: 0x100;
key_down :: 0x300: SDL_KeyData;
key_up :: 0x301: SDL_KeyData;
}
// Named layout
EventLayout :: struct { tag: u32; _: u32; payload: [30]u32; }
SDL_Event :: enum EventLayout {
quit :: 0x100;
key_down :: 0x300: SDL_KeyData;
}
The layout struct must have:
- A field named tag: integer type, the discriminant. Its type becomes the enum's backing type.
- A field named payload: array type, the variant data area. Its size determines the maximum payload capacity.
- Any other fields are treated as padding/reserved and positioned by the struct layout.
This gives explicit control over the memory layout instead of relying on automatic alignment. The total size equals the struct size. Without a layout struct, tagged enums use { tag, [max_payload_size x i8] } with no padding.
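To make the size claim concrete, the layout from the SDL_Event example can be checked with Python's struct module. This is only a sketch: it assumes u32 is 4 bytes and that the three fields pack without padding, which holds here because every field is 4-byte aligned.

```python
import struct

# "=" selects standard sizes with no implicit padding.
# Fields: tag: u32, _: u32, payload: [30]u32
EVENT_LAYOUT = "=II30I"

total_size = struct.calcsize(EVENT_LAYOUT)    # 4 + 4 + 30 * 4
payload_capacity = struct.calcsize("=30I")    # bytes available to variant payloads
```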
Enum Flags
Perms :: enum flags {
read; // 1
write; // 2
execute; // 4
}
Flags can also specify a backing type:
SDL_InitFlags :: enum flags u32 {
video :: 0x20;
audio :: 0x10;
}
The flags modifier assigns auto power-of-2 values (1, 2, 4, 8, ...) instead of sequential indices (0, 1, 2, ...). Flags can be combined with | and tested with &:
p : Perms = .read | .write;
if p & .execute { ... }
print("{}\n", p); // .read | .write
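The value assignment described above can be modeled in a few lines of Python. The helper name `assign_values` is hypothetical; it only mirrors the rule of sequential indices for plain enums and powers of two for flags.

```python
def assign_values(variants, flags=False):
    """Return {name: value}: sequential indices, or powers of two for flags."""
    return {name: (1 << i if flags else i) for i, name in enumerate(variants)}

perms = assign_values(["read", "write", "execute"], flags=True)
combined = perms["read"] | perms["write"]         # models .read | .write
has_execute = bool(combined & perms["execute"])   # models p & .execute
plain = assign_values(["a", "b", "c"])            # non-flags enum for comparison
```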
Explicit values use :: syntax (Jai-style):
WindowFlags :: enum flags {
vsync :: 64;
resizable :: 4;
hidden :: 128;
}
Restrictions:
- Flags enum variants cannot have payloads
- flags is a contextual identifier, not a keyword
Bitwise Operators
All bitwise operators work on integer types. >> is arithmetic (sign-extending) for signed types and logical (zero-filling) for unsigned types.
x := 0xFF & 0x0F; // 15 — AND
y := 1 | 2 | 4; // 7 — OR
z := 0xFF ^ 0x0F; // 240 — XOR
w := ~0; // -1 — NOT
a := 1 << 4; // 16 — left shift
b := 256 >> 4; // 16 — right shift
Compound assignment forms: &=, |=, ^=, <<=, >>=.
x := 0xFF;
x &= 0x0F; // 15
x |= 0xF0; // 255
x ^= 0x0F; // 240
y := 1;
y <<= 8; // 256
y >>= 4; // 16
4. Expressions
Everything in sx is expression-oriented where possible.
Operator Precedence
| Prec | Operators | Notes |
|---|---|---|
| 9 (highest) | *, /, % | multiplication, division, modulo |
| 8 | +, - | addition, subtraction |
| 7 | <<, >> | shifts |
| 6 | <, <=, >, >=, ==, != | comparisons (chainable) |
| 5 | & | bitwise AND |
| 4 | ^ | bitwise XOR |
| 3 | \| | bitwise OR |
| 2 | and | logical AND (short-circuit) |
| 1 (lowest) | or | logical OR (short-circuit) |
Arithmetic
Standard infix: +, -, *, / with the usual precedence (* and / bind tighter than + and -).
x * x
x + 2
Chained Comparisons
Comparison operators can be chained. Each operand is evaluated exactly once.
0 <= x <= 100 // equivalent to: 0 <= x and x <= 100
1000 > x >= -100 // equivalent to: 1000 > x and x >= -100
a == b == c // equivalent to: a == b and b == c
Mixed operators are allowed: a < b <= c > d means a < b and b <= c and c > d.
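The desugaring can be sketched in Python as a conjunction of pairwise comparisons over operands that have already been evaluated. The helper `chained` and the `OPS` table are hypothetical names; this models the documented equivalence, not the compiler's lowering.

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def chained(values, ops):
    """Evaluate v0 op0 v1 op1 v2 ... as (v0 op0 v1) and (v1 op1 v2) and ...

    `values` holds already-evaluated operands, so each one is computed once
    no matter how many comparisons it takes part in.
    """
    assert len(values) == len(ops) + 1
    return all(OPS[op](a, b) for op, a, b in zip(ops, values, values[1:]))

x = 42
in_range = chained([0, x, 100], ["<=", "<="])             # 0 <= x <= 100
mixed = chained([1, 2, 2, 1], ["<", "<=", ">"])           # a < b <= c > d
out_of_range = chained([1000, 5000, -100], [">", ">="])   # 1000 > x >= -100 with x = 5000
```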
Logical Operators
and and or are short-circuit boolean operators. The right operand is not evaluated if the left operand determines the result.
if 0 <= x <= 100 and 0 <= y <= 100 {
print("contained");
}
If Expression (inline form)
if condition then consequent else alternate
Both branches are single expressions. The whole form produces a value.
x := if true then 1 else 2;
The else branch is optional. Without it, the form is a statement (no value):
if i == 2 then continue;
if done then break;
if err then return;
If Expression (block form)
if condition {
stmts
} else {
stmts
}
Each branch is a block. The last expression in each block is the branch's value. Can be used inline within other expressions:
y := x + if false {
7;
} else {
12;
};
Pattern Matching
if subject == {
case pattern: body
case pattern: body
else: body // optional default arm
}
Matches subject against each case. Patterns can be:
- Enum literals: .variant matches a specific enum variant.
- Integer/bool literals: 42, true match a specific value.
- Type categories: struct, enum, union match all types in that category (used with type_of values).
break exits a case arm without producing a value. The optional else: arm matches when no case pattern matches.
if z == {
case .variant1: break;
case .variant2:
print("z: {z}");
else:
print("unknown");
}
Type Category Matching
When switching on a Type value (from type_of), category keywords match all registered types of that category:
type := type_of(val);
if type == {
case int: result = int_to_string(xx val);
case struct: result = struct_to_string(cast(type) val);
case enum: result = enum_to_string(cast(type) val);
}
Available categories: int, float, bool, string, struct, enum, vector, array, slice, pointer, type.
Note: case enum: matches both payload-less enums and tagged enums (enums with payloads). C-style untagged unions are not registered with the Any type system and cannot be matched by category.
Inside a category arm, cast(type) val performs runtime generic dispatch: the compiler generates a switch over all types in the category, monomorphizing the callee for each concrete type.
While Loop
while condition {
body
}
Repeats body as long as condition is true. break; exits the loop. continue; skips to the next iteration.
i := 0;
while i < 10 {
i += 1;
if i == 5 { continue; }
if i == 8 { break; }
print("{i}\n");
}
For Loop
for iterable: (elem) { } // element alias (no copy)
for iterable: (elem, ix) { } // element + index
for iterable: (_, ix) { } // index only
Iterates over arrays and slices. The capture clause after : binds loop variables:
- The first name is the element capture (non-reassignable alias into the array/slice)
- The optional second name is the index (s64, starting at 0, also non-reassignable)
- Use _ to discard a capture
The element capture is a direct alias — reads and field writes go to the original array element. Direct reassignment of the capture (elem = x) is a compile error.
break; exits the loop. continue; skips to the next iteration.
arr : [5]s32 = .[1, 2, 3, 4, 5];
for arr: (val, ix) {
if ix == 2 { continue; }
print("{}\n", val);
}
Lambda
(params) => expr
(params) -> return_type => expr
Anonymous function. Produces a function value. Supports the same parameter features as named functions: $ generic type params, .. variadic params, and optional return type annotation.
SOME_FUNC :: () => 42; // () -> s32
double :: (x: $T) -> T => x + x; // generic lambda with return type
Closures
A closure is a function bundled with captured state. It is represented as a fat pointer { fn_ptr, env } (16 bytes), unlike a bare function pointer which is 8 bytes.
Closure Type
Closure(param_types) -> R // e.g. Closure(s32, s32) -> s32
Closure(param_types) // void return: Closure(s64) -> void
?Closure(s32) -> s32 // optional closure (null = none)
Creating Closures — closure() intrinsic
offset := 50;
f := closure((x: s32) -> s32 => x + offset); // expression body
g := closure((x: s32) -> s32 { // block body
if x < 0 { return 0; }
return x + offset;
});
The closure() intrinsic:
- Analyzes the lambda body for free variables (variables from outer scope)
- Allocates an env struct on the heap via context.allocator (the default GPA wraps malloc) containing the captured values
- Generates a trampoline function with signature (env: *void, params...) -> R
- Returns a Closure value { trampoline, env_ptr }
Capture semantics: capture by value (snapshot at creation time). Mutating the original variable after creating the closure does not affect the captured value.
n := 10;
f := closure((x: s64) -> s64 => x + n);
n = 999;
print("{}\n", f(5)); // 15, not 1004
Calling Closures
Closures are called with normal function call syntax:
result := f(10);
The compiler prepends the env pointer to the argument list and does an indirect call through the fn_ptr.
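The capture and call rules can be modeled in Python, with the caveat that Python's own closures capture by reference, so the sketch snapshots the environment explicitly. `make_closure` and `call_closure` are hypothetical helpers standing in for the closure() intrinsic and the generated call sequence.

```python
def make_closure(fn, **captured):
    """Pair a function with a snapshot of its captured variables.

    Returns (trampoline, env), a stand-in for the { fn_ptr, env } fat pointer.
    """
    env = dict(captured)             # copy taken at creation time

    def trampoline(env, *args):      # env comes first, then the call arguments
        return fn(env, *args)

    return trampoline, env

def call_closure(closure, *args):
    trampoline, env = closure
    return trampoline(env, *args)    # the env is prepended on every call

n = 10
f = make_closure(lambda env, x: x + env["n"], n=n)
n = 999                              # does not affect the snapshot
result = call_closure(f, 5)
```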
Auto-Promotion
A bare function can be implicitly promoted to a Closure where one is expected. The compiler generates a static thunk that ignores the env parameter, with a null env pointer.
double :: (x: s32) -> s32 { return x * 2; }
apply :: (f: Closure(s32) -> s32, x: s32) -> s32 { return f(x); }
apply(double, 10); // double auto-promoted to Closure
Factory Functions
Functions can return closures, enabling the factory pattern:
make_adder :: (n: s32) -> Closure(s32) -> s32 {
return closure((x: s32) -> s32 => x + n);
}
add5 := make_adder(5);
print("{}\n", add5(100)); // 105
Optional Closures
?Closure is supported for nullable callbacks. Uses fn_ptr == null as the none sentinel (zero overhead — same layout as Closure).
Button :: struct {
label: string;
on_click: ?Closure(s64) -> void;
}
btn := Button.{ label = "OK", on_click = null };
if handler := btn.on_click {
handler(1);
}
Memory
Closure env is allocated via context.allocator. The compiler auto-initializes context with a default GPA (malloc/free wrapper) at the start of main(). Use push Context to override with a custom allocator. Auto-promoted closures have a null env and require no allocation.
f := closure((x: s64) -> s64 => x + 10); // env allocated via default GPA
print("{}\n", f(5));
Function Call
callee(args)
compute(6)
print("hello")
UFCS (Uniform Function Call Syntax)
object.func(args) // equivalent to func(object, args)
When object.func(args) is encountered and func is not a field of object's type, the compiler rewrites the call to func(object, args). This enables method-like syntax without dedicated method declarations.
Point :: struct { x: s32; y: s32; }
point_sum :: (p: Point) -> s32 { p.x + p.y; }
p := Point.{3, 4};
print("{}\n", p.point_sum()); // calls point_sum(p) → 7
UFCS works with pointer receivers (auto-deref applies) and generic functions. If the field name exists as both a struct field and a free function, the struct field takes priority.
UFCS Aliases
The ufcs keyword creates a name alias for a function, decoupling the method name from the function name:
arena_alloc :: (arena: *Arena, size: s64) -> *void { ... }
alloc :: ufcs arena_alloc;
myArena.alloc(42); // calls arena_alloc(myArena, 42)
alloc(myArena, 42); // also works as a direct call
This avoids the naming redundancy of myArena.arena_alloc(42).
Tuple UFCS Splatting
When a tuple is used as the receiver of a UFCS call, its elements are unpacked as leading arguments:
num_add :: (a: s64, b: s64) -> s64 { a + b; }
add :: ufcs num_add;
(40, 2).add(); // splats to num_add(40, 2) → 42
(40,).add(2); // partial: num_add(40, 2) → 42
40.add(2); // normal UFCS: num_add(40, 2) → 42
With more arguments:
compute :: (a: s64, b: s64, c: s64, d: s64) -> s64 { a + b * c - d; }
calc :: ufcs compute;
(1, 2, 3, 4).calc(); // full splat → compute(1, 2, 3, 4)
(1, 2).calc(3, 4); // partial splat → compute(1, 2, 3, 4)
1.calc(2, 3, 4); // normal UFCS → compute(1, 2, 3, 4)
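The three call shapes reduce to one rule: a tuple receiver contributes all of its elements as leading arguments, and any other receiver contributes itself. A Python sketch of that rule follows; `ufcs_call` is a hypothetical helper, not part of sx.

```python
def ufcs_call(func, receiver, *args):
    """Model receiver.func(args): splat a tuple receiver, pass anything else whole."""
    leading = receiver if isinstance(receiver, tuple) else (receiver,)
    return func(*leading, *args)

def compute(a, b, c, d):
    return a + b * c - d

full = ufcs_call(compute, (1, 2, 3, 4))      # (1, 2, 3, 4).calc()
partial = ufcs_call(compute, (1, 2), 3, 4)   # (1, 2).calc(3, 4)
normal = ufcs_call(compute, 1, 2, 3, 4)      # 1.calc(2, 3, 4)
```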
Pipe Operator
The pipe operator |> inserts the left-hand side as the first argument of the right-hand side call. It is desugared at parse time.
a |> f(b, c) // → f(a, b, c)
a |> f // → f(a)
a |> f(b) |> g(c) // → g(f(a, b), c)
The pipe is left-associative with the lowest precedence of all binary operators, so expressions like x + 1 |> f(2) are parsed as f(x + 1, 2).
This is especially useful with namespaced imports:
pkg :: #import "modules/math";
3 |> pkg.add(4) // → pkg.add(3, 4) → 7
3 |> pkg.add(4) |> pkg.mul(2) // → pkg.mul(pkg.add(3, 4), 2) → 14
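The desugaring can be modeled as a plain function in Python. `pipe`, `add`, and `mul` are hypothetical names used only for this sketch; nesting the calls from the inside out reproduces the left-associative chain.

```python
def pipe(value, func, *args):
    """Model `value |> func(args)` by inserting value as the first argument."""
    return func(value, *args)

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

single = pipe(3, add, 4)                   # 3 |> add(4)
two_steps = pipe(pipe(3, add, 4), mul, 2)  # 3 |> add(4) |> mul(2)
```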
Field Access
object.field
Used for module access (std.print) and struct member access.
Enum Literal
.variant_name
The enum type is inferred from context (expected type from declaration or parameter).
5. Statements
Statements are terminated by ;.
- Declaration: name :: value; / name := value;
- Assignment: name = value; / name += value; (and other compound assignments). Also supports field targets: obj.field = value;
- Multi-target assignment: a, b = b, a; evaluates all RHS values before any stores, enabling swaps without temporaries. Target count must equal value count. Only plain = is supported (no compound operators). Each target must be a valid lvalue (variable, field, index, dereference).
- Expression statement: expr; evaluates the expression (last in a block = return value)
- Return: return expr; returns from the enclosing function with the given value. return; returns void.
- Break: break; exits a match arm or the enclosing while or for loop
- Continue: continue; skips to the next iteration of the enclosing while or for loop
- Defer: defer expr; defers execution of expr until the enclosing block exits (LIFO order)
- Push: push expr { body } is a scoped context override (see below)
push Statement and Implicit context
The push statement temporarily overrides a global context variable for the duration of a block. The previous context is saved before the block and restored after it exits.
push Context.{ allocator = arena.allocator(), data = xx @logger } {
handle(client); // inside here, `context` has the new value
}
// context is restored to its previous value here
Context struct — defined in std.sx:
Context :: struct {
allocator: Allocator; // active allocator for dynamic allocation
data: *void; // opaque pointer for application-specific data
}
context : Context = ---; // global mutable variable
The compiler auto-initializes context with a default GPA (malloc/free wrapper) at the start of main(). Inside the pushed block, any code (including called functions) can read context.allocator and context.data. The standard library's cstring(), alloc_slice(), and closure() all allocate via context.allocator.
push requires a global mutable variable named context to be in scope (provided by std.sx).
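The save-and-restore behavior can be sketched with a Python context manager. This is a model under assumptions: `context` is a hypothetical module-level dictionary, and `push` replaces it wholesale for the duration of the block.

```python
from contextlib import contextmanager

context = {"allocator": "default-gpa", "data": None}  # hypothetical global

@contextmanager
def push(new_context):
    """Replace the global context for a block, then restore the saved value."""
    global context
    saved = context
    context = new_context
    try:
        yield
    finally:
        context = saved               # restored even if the block raises

def current_allocator():
    return context["allocator"]       # called code sees the pushed value

with push({"allocator": "arena", "data": None}):
    inside = current_allocator()
after = current_allocator()
```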
6. Blocks, Scoping, and Implicit Returns
A block { ... } contains zero or more statements. The last expression in a block is its value (implicit return).
In function bodies, the last expression becomes the return value:
compute :: (x: s32) -> s32 {
x * x; // this is returned
}
Scope Blocks
Bare blocks can be used as statements to introduce a new lexical scope. Variables declared inside a scope block are local to that block. No trailing ; is required.
main :: () {
x := 42;
{
x := 6; // shadows outer x
print("inner: {x}"); // prints 6
}
print("outer: {x}"); // prints 42
}
Variable Shadowing
A variable declaration (name :=) inside an inner scope shadows any variable with the same name from outer scopes. The outer variable is restored when the inner scope exits.
Defer
defer expr; schedules expr to execute when the enclosing scope block exits. Multiple defers in the same scope execute in reverse order (LIFO).
{
defer print("second");
defer print("first");
}
// prints: first, then second
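Python's contextlib.ExitStack runs registered callbacks in reverse order on exit, which makes it a convenient model for the LIFO rule. The sketch records output in a list instead of printing.

```python
from contextlib import ExitStack

output = []

with ExitStack() as scope:                    # models one scope block
    scope.callback(output.append, "second")   # defer print("second");
    scope.callback(output.append, "first")    # defer print("first");
    output.append("body")                     # ordinary statements run first
```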
7. Built-in Functions
Built-in functions are declared in std.sx with the #builtin suffix, which tells the compiler to generate the implementation internally rather than looking for a function body.
I/O
- out(str: string) -> void: write a string to standard output
- print(fmt: string, args: ..Any): formatted print. Parses {} placeholders in the format string and substitutes arguments. When all argument types are statically known, the compiler specializes the call at compile time (no Any boxing).
Math
- sqrt(x: $T) -> T: square root (maps to LLVM intrinsic)
- sin(x: $T) -> T: sine (maps to LLVM intrinsic)
- cos(x: $T) -> T: cosine (maps to LLVM intrinsic)
Memory
- malloc(size: s64) -> *void: allocate size bytes of heap memory
- free(ptr: *void) -> void: free previously allocated memory
- memcpy(dst: *void, src: *void, size: s64) -> *void: copy size bytes from src to dst
- memset(dst: *void, val: s64, size: s64) -> void: fill size bytes at dst with val
- size_of($T: Type) -> s64: size of type T in bytes
Type Introspection
- type_of(val: $T) -> Type: returns the runtime type tag of a value
- type_name($T: Type) -> string: returns the name of type T as a string (e.g., "Point")
- field_count($T: Type) -> s64: returns the number of fields (struct), variants (enum), or elements (vector) in type T
- field_name($T: Type, idx: s64) -> string: returns the name of the idx-th field (struct) or variant (enum) of type T
- field_value(s: $T, idx: s64) -> Any: returns the idx-th field (struct) or element (vector) of s, boxed as Any
- field_value_int($T: Type, idx: s64) -> s64: returns the integer value of the idx-th enum variant
- field_index($T: Type, val: T) -> s64: returns the sequential variant index for an explicit enum value (reverse of field_value_int). Returns -1 if no variant matches.
- is_flags($T: Type) -> bool: returns true if T is a flags enum (declared with enum flags)
Type Conversion
- cast(Type) expr: prefix operator that converts expr to Type. Examples: cast(s32) 3.14, cast(f64) n. When Type is a runtime Type value inside a type-category match arm, the compiler generates a dispatch switch over all types in the category, monomorphizing the callee for each concrete type.
Vectors
- Vector($N: int, $T: Type) -> Type: returns an LLVM vector type of N elements of type T
8. Compile-time Evaluation
#run Directive
#run expr evaluates expr at compile time using lazy JIT execution. It can appear in two contexts:
Compile-time constants — bind a compile-time value to a name:
compute :: (x: s32) -> s32 { x * x; }
x :: #run compute(5); // x = 25, evaluated at compile time
Comptime globals are resolved lazily: the JIT executes only when the value is first referenced during code generation. Chained dependencies are resolved automatically.
Side effects — execute code at compile time for its side effects:
#run print("compiling...");
#insert Directive
#insert expr; evaluates expr at compile time to obtain a string, then parses and compiles that string as inline code at the insertion point.
generate :: () -> string {
return "print(\"hello from the other side\");";
}
main :: () {
#insert #run generate();
// equivalent to: print("hello from the other side");
}
The inserted string must contain valid sx statements (including semicolons). The statements are parsed and compiled in the same scope as the #insert site. Variables created by one #insert are visible to subsequent #insert directives in the same function.
Comptime Call Evaluation
When a :: constant binding is initialized with a function call and all arguments are comptime-known (literals or other :: constants), the compiler attempts to evaluate the entire call at compile time using the bytecode VM. If evaluation succeeds, the result is baked into the binary as a static constant with zero runtime overhead.
body :: "<html><body><h1>Hello</h1></body></html>";
response :: format("HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}", body.len, body);
// response is a static string constant — no runtime allocation
This works for any function, not just format. The mechanism is general: the VM compiles the function body (including #insert directives, variadic ..Any args, and calls to other functions) and executes it entirely at compile time. If the VM encounters something it cannot evaluate (e.g., foreign function calls, unsupported operations), it silently falls through to runtime codegen.
Build Configuration
The BuildOptions struct (from modules/compiler.sx) provides compile-time build configuration via #run. Methods on BuildOptions are compiler builtins intercepted during compilation — they have no runtime cost.
#import "modules/compiler.sx";
configure_build :: () {
opts := build_options();
opts.add_link_flag("-lm");
opts.set_output_path("out/my_program");
inline if OS == .wasm {
opts.set_output_path("sx-out/wasm/app.html");
opts.add_link_flag("-sUSE_SDL=3");
opts.add_link_flag("-sALLOW_MEMORY_GROWTH=1");
}
}
#run configure_build();
API:
| Method | Description |
|---|---|
| build_options() | Returns a BuildOptions value for the current compilation |
| opts.add_link_flag(flag) | Appends a linker flag (merged with CLI flags) |
| opts.set_output_path(path) | Sets the output binary path (overridden by CLI -o) |
Build flags from add_link_flag are merged with any flags passed on the command line. Duplicate library flags (e.g., -lSDL3 from multiple imports) are automatically deduplicated.
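One way to picture the merge is the Python sketch below. It rests on assumptions the text does not settle: flags registered via add_link_flag are placed before CLI flags, only flags starting with -l count as library flags, and the first occurrence is the one kept. `merge_link_flags` is a hypothetical name.

```python
def merge_link_flags(registered, cli):
    """Concatenate both flag lists, keeping only the first copy of each -l flag."""
    merged, seen_libs = [], set()
    for flag in [*registered, *cli]:
        if flag.startswith("-l"):
            if flag in seen_libs:
                continue          # duplicate library flag: drop it
            seen_libs.add(flag)
        merged.append(flag)
    return merged

flags = merge_link_flags(["-lSDL3", "-lm", "-lSDL3"], ["-lm", "-sUSE_SDL=3"])
```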
Compiler Constants
The modules/compiler.sx module provides compile-time constants set by the compiler based on the target:
| Constant | Type | Description |
|---|---|---|
| OS | OperatingSystem | Target OS: .macos, .linux, .windows, .wasm, .unknown |
| ARCH | Architecture | Target arch: .aarch64, .x86_64, .wasm32, .unknown |
| POINTER_SIZE | s64 | Pointer width in bytes (8 for 64-bit, 4 for wasm32) |
These are used with inline if for compile-time conditional compilation:
inline if OS == .wasm {
// Only compiled when targeting wasm
}
inline if POINTER_SIZE == 8 {
// Only compiled on 64-bit platforms
}
9. Modules / Imports
#import Directive
The #import directive brings declarations from another .sx file or directory into the current file.
Flat import — splices all declarations from the imported file into the current scope:
#import "modules/std/math.sx";
Namespaced import — wraps all declarations under a namespace name:
std :: #import "modules/std.sx";
Directory import — when the path refers to a directory, all .sx files in that directory are aggregated into a single module:
pkg :: #import "modules/testpkg"; // namespaced — all .sx files merged under pkg
#import "modules/testpkg"; // flat — all declarations spliced into scope
Directory imports scan only the top level of the specified directory (non-recursive). Files are processed in alphabetical order for deterministic builds. Files within the directory may #import each other or external files.
Namespaced declarations are accessed with dot notation:
std.print("hello");
Import Resolution
- Imports are resolved after parsing and before code generation.
- Paths are first resolved relative to the directory of the file containing the #import. If not found, they fall back to the working directory (cwd). This allows modules in subdirectories to import shared modules using the same paths as the root file.
- If the path resolves to a file, it is imported directly. If it resolves to a directory, all .sx files in that directory are aggregated.
- Nested imports are supported (imported files may themselves contain #import).
- Circular imports are detected and silently skipped (each file is imported at most once).
- Generic functions in namespaced imports are supported (e.g., std.mul(5, 2) where mul is generic).
Example: Given this project layout:
project/
  modules/std.sx
  modules/math/
    math.sx
    vector3.sx   ← contains: #import "modules/std.sx";
  main.sx        ← contains: #import "modules/std.sx";
When compiling from project/, both main.sx and modules/math/vector3.sx can use #import "modules/std.sx" — the root file resolves it relative to its own directory, and the nested file falls back to resolving relative to cwd.
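The two-step lookup can be sketched in Python. The function `resolve_import` and its `exists` predicate are hypothetical; the predicate keeps the sketch independent of a real filesystem, and the set of files mirrors the example layout.

```python
from pathlib import PurePosixPath

def resolve_import(import_path, importing_file, cwd, exists):
    """Return the resolved path for an #import, or None if neither location has it."""
    relative = PurePosixPath(importing_file).parent / import_path
    if exists(relative):
        return relative                    # found next to the importing file
    fallback = PurePosixPath(cwd) / import_path
    return fallback if exists(fallback) else None

# Project layout from the example, compiled from project/.
files = {PurePosixPath("project/modules/std.sx")}
from_root = resolve_import("modules/std.sx", "project/main.sx",
                           "project", files.__contains__)
from_nested = resolve_import("modules/std.sx", "project/modules/math/vector3.sx",
                             "project", files.__contains__)
missing = resolve_import("modules/nope.sx", "project/main.sx",
                         "project", files.__contains__)
```

Here the root file hits on the first step, while the nested file misses next to itself and succeeds through the cwd fallback.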
Intra-module References
Functions within a namespaced import can call each other without the namespace prefix. When generating code for a namespaced module, unresolved function names are automatically tried with the namespace prefix.
Example
// modules/std/math.sx
mul :: (base: $T, exp: T) -> T { base * exp; }
// modules/std.sx
out :: (str: string) -> void #builtin;
// main.sx
std :: #import "modules/std.sx";
#import "modules/std/math.sx";
main :: () -> s32 {
std.out("hello there");
mul(5, 2);
}
10. CLI & Cross-Compilation
Commands
sx run <file.sx> Compile and run
sx build <file.sx> Compile to binary
sx lsp Start language server (LSP)
Options
| Flag | Description |
|---|---|
| --target <target> | Target triple or shorthand (default: host) |
| --cpu <name> | CPU name (default: generic) |
| --opt <level> | Optimization: none/0, less/1, default/2, aggressive/3 |
| -o <path> | Output path (overrides set_output_path) |
Target Shorthands
The --target flag accepts shorthand aliases for common targets:
| Shorthand | Expands to |
|---|---|
| wasm, emscripten | wasm32-unknown-emscripten |
| macos, macos-arm | aarch64-apple-macos |
| macos-x86 | x86_64-apple-macos |
| linux, linux-x86 | x86_64-unknown-linux-gnu |
| linux-arm | aarch64-unknown-linux-gnu |
| windows | x86_64-windows-msvc |
Full triples are also accepted and passed through as-is.
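The expansion amounts to a table lookup with pass-through. A Python sketch, using the shorthand table above and a hypothetical `expand_target` helper:

```python
TARGET_SHORTHANDS = {
    "wasm": "wasm32-unknown-emscripten",
    "emscripten": "wasm32-unknown-emscripten",
    "macos": "aarch64-apple-macos",
    "macos-arm": "aarch64-apple-macos",
    "macos-x86": "x86_64-apple-macos",
    "linux": "x86_64-unknown-linux-gnu",
    "linux-x86": "x86_64-unknown-linux-gnu",
    "linux-arm": "aarch64-unknown-linux-gnu",
    "windows": "x86_64-windows-msvc",
}

def expand_target(target):
    """Expand a shorthand; anything else is treated as a full triple and passed through."""
    return TARGET_SHORTHANDS.get(target, target)
```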
10.5 Bundling and Post-Link Callbacks
Platform-specific bundling (Apple .app, Android .apk) lives in
library/modules/platform/bundle.sx.
The compiler's job shrinks to: parse → IR → codegen → link → invoke an sx
function. Bundling, codesigning, manifest generation, and Java compilation
(via javac + d8) are all sx code running in the IR
interpreter post-link.
Discovery
Users opt in explicitly from their own #run block:
#import "modules/compiler.sx";
#import "modules/platform/bundle.sx";
#run {
opts := build_options();
opts.set_bundle_path("MyApp.app");
opts.set_bundle_id("com.example.app");
opts.set_post_link_callback(bundle_main);
}
Programs that don't register a callback simply don't bundle — the linked binary is produced and nothing further runs. There is no stdlib default and no implicit prelude.
Two registration forms:
| Setter | Behavior |
|---|---|
| BuildOptions.set_post_link_callback(cb: () -> bool) | First-class function value. Preferred. |
| BuildOptions.set_post_link_module(name: [:0]u8) | Name-based fallback; compiler resolves <name>.bundle_main post-link. |
CLI --bundle <path> / --apk <path> are transitional aliases: if
bundle_path is set and no callback was registered, the compiler
auto-falls-back to post_link_module = "platform.bundle". The sx
bundler reads bundle_path() regardless of which flag the user used.
The callback returns false to fail the build.
BuildOptions surface
BuildOptions is a #compiler struct in
library/modules/compiler.sx. Setters
accumulate config in the compiler's BuildConfig; accessors read it
back inside the post-link callback.
| Method | Read / write | Purpose |
|---|---|---|
| add_link_flag(flag) | write | extra linker flag |
| add_framework(name) | write | -framework <name> (Apple) |
| set_output_path(path) | write | linked binary path |
| set_wasm_shell(path) | write | custom WASM shell template |
| add_asset_dir(src, dest) | write | bundle a directory of runtime assets |
| set_post_link_callback(cb) | write | first-class callback (preferred) |
| set_post_link_module(name) | write | name-based callback fallback |
| set_bundle_path(path) | write | .app / .apk output |
| set_bundle_id(id) | write | iOS CFBundleIdentifier / Android package |
| set_codesign_identity(name) | write | Apple signing identity (- = ad-hoc) |
| set_provisioning_profile(path) | write | iOS device .mobileprovision |
| set_manifest_path(path) | write | Android AndroidManifest.xml override |
| set_keystore_path(path) | write | Android keystore override |
| binary_path() | read | path of the freshly-linked binary |
| bundle_path() / bundle_id() | read | mirror of the setters |
| codesign_identity() / provisioning_profile() | read | Apple codesign params |
| manifest_path() / keystore_path() | read | Android overrides |
| target_triple() | read | canonicalized target triple |
| is_macos() / is_ios() / is_ios_device() / is_ios_simulator() / is_android() | read | per-target predicates |
| framework_count() / framework_at(i) | read | linker -framework names (for Frameworks/ embed) |
| framework_path_count() / framework_path_at(i) | read | linker -F search paths |
| jni_main_count() / jni_main_foreign_path_at(i) / jni_main_java_source_at(i) | read | #jni_main emissions for the APK bundler |
| asset_dir_count() / asset_dir_src_at(i) / asset_dir_dest_at(i) | read | iterate registered asset trees |
Returned strings are "" when unset; integer counts are 0. Accessors
that read after-the-fact (binary_path, bundle_path, etc.) return
the value that was either set in #run or forwarded from a CLI flag.
fs.sx and process.sx stdlib modules
The bundler is implemented in sx; its calls into fs.sx / process.sx
work both at runtime through the dynamic linker and at #run / post-link
through the host-FFI dispatch in
src/ir/host_ffi.zig (a dlsym(RTLD_DEFAULT) +
arity-switched cdecl trampoline).
library/modules/fs.sx (POSIX backend):
| Function | Purpose |
|---|---|
| open_file(path, mode) -> ?File | open a handle |
| read_file(path) -> ?string | one-shot slurp |
| write_file(path, data) -> bool | create / truncate / write |
| append_file(path, data) -> bool | append |
| copy_file(src, dst) -> bool | byte copy (streamed through 64 KB buffer) |
| delete_file(path) -> bool | unlink |
| delete_dir(path) -> bool | rmdir (empty only) |
| create_dir(path) -> bool / create_dir_all(path) -> bool | mkdir / mkdir -p |
| move(old, new) -> bool | rename |
| set_mode(path, mode) -> bool | chmod |
| exists(path) -> bool | access(F_OK) |
| basename(p) -> string / dirname(p) -> string | text-only path split |
File is a small value-typed handle wrapping a POSIX fd, with
methods is_valid / close / read / write / seek. Higher-level helpers
(read_file, write_file, copy_file) bypass *File methods and
call libc directly so they remain callable from the post-link IR
interpreter (which doesn't yet handle *Self method dispatch on
locally-unwrapped optionals).
library/modules/process.sx (POSIX backend):
| Function | Purpose |
|---|---|
| run(cmd: [:0]u8) -> ?ProcessResult | popen shell command, capture stdout + exit |
| env(name: [:0]u8) -> ?string | getenv (null if unset) |
| find_executable(name) -> ?string | command -v <name> via shell |
ProcessResult is { exit_code: s32, stdout: string }. The post-link
bundler invokes codesign, plutil, security, aapt2, javac,
d8, keytool, apksigner, etc. through run.
Apple .app flow (bundle.sx::bundle_main)
bundle_main branches on is_android() first; the remaining body is
the Apple path. Per target:
| Step | macOS | iOS sim | iOS device |
|---|---|---|---|
| Stage <bundle> (rm-rf + mkdir + copy binary + set exe bit) | ✓ | ✓ | ✓ |
| Write Info.plist | minimal CFBundle* | + UIDeviceFamily + LSRequiresIPhoneOS + UIApplicationSceneManifest + DTPlatformName=iPhoneSimulator | + same with DTPlatformName=iPhoneOS |
| Embed provisioning profile to <bundle>/embedded.mobileprovision | n/a | n/a | when provisioning_profile() set |
| Embed Frameworks/<Name>.framework/ (recursive cp -R per -F search path) | n/a | when present | when present |
| Extract entitlements (security cms -D + plutil -extract Entitlements + plutil -extract ApplicationIdentifierPrefix.0 + plutil -replace application-identifier resolving <TEAM>.* → <TEAM>.<bundle_id>) | n/a | n/a | when provisioning_profile() set |
| Codesign | ad-hoc (-) | ad-hoc | --sign <identity> --entitlements <ent> |
Android .apk flow (bundle.sx::android_bundle_main)
The Android branch:
1. Discover SDK — `$ANDROID_HOME` → `$ANDROID_SDK_ROOT` → `$HOME/Library/Android/sdk`.
2. Find highest `build-tools`/`platforms` subdir — `process.run("ls -1 <parent> | sort -V | tail -1")`.
3. Stage `<apk>.stage/lib/arm64-v8a/<libfoo.so>` — `copy_file` from the linked output.
4. Manifest — user-supplied via `set_manifest_path()`, or synthesized:
   - `NativeActivity` shape when no `#jni_main` is declared.
   - `#jni_main` Activity shape with `android:name="<foreign_path_with_dots>"` + `android:hasCode="true"` otherwise.
5. Compile `#jni_main` Java sources — write each entry's `java_source` to `<stage>/java/<pkg>/<Cls>.java`, run `javac --release 11 -classpath <android.jar>` to `<stage>/classes/`, run `d8 --release --lib <android.jar> --output <stage>` to produce `<stage>/classes.dex`. `javac` discovered via `$JAVA_HOME/bin/javac` then `command -v javac`.
6. `aapt2 link -I <android.jar> --manifest <m> -o <unaligned>`.
7. Append archives — `zip -q -r <unaligned> lib/`, then `zip -q <unaligned> classes.dex` (if dex was produced), then `zip` each registered asset dir at its `dest` path.
8. `zipalign -f 4 <unaligned> <aligned>`.
9. Debug keystore — `keytool -genkeypair -keystore <path>` on first use; defaults match Android Studio (`androiddebugkey` alias, password `android`).
10. `apksigner sign --ks <ks> --ks-pass pass:android --key-pass pass:android --ks-key-alias androiddebugkey --out <apk> <aligned>`.
11. Clean intermediates (keep `<apk>.stage/` for inspection if it lasts the build).
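The first two steps (SDK discovery order and picking the highest versioned subdirectory) can be sketched in Python. This is an approximation under stated assumptions, not the bundler's code: the function names are hypothetical, empty variables are treated as unset, the fallback path is not checked for existence, and `version_key` only approximates `sort -V` by comparing the numeric runs in each name.

```python
import re
from typing import List, Mapping, Optional, Sequence


def discover_sdk(environ: Mapping[str, str]) -> Optional[str]:
    """Return the first SDK root candidate in the documented order."""
    for name in ("ANDROID_HOME", "ANDROID_SDK_ROOT"):
        value = environ.get(name)
        if value:  # assumption: an empty value counts as unset
            return value
    home = environ.get("HOME")
    if home:
        return home + "/Library/Android/sdk"
    return None


def version_key(name: str) -> List[int]:
    """Numeric runs of a directory name, e.g. '34.0.0' -> [34, 0, 0]."""
    return [int(part) for part in re.findall(r"\d+", name)]


def highest_subdir(names: Sequence[str]) -> Optional[str]:
    """Approximate `ls -1 <parent> | sort -V | tail -1` for numeric names."""
    if not names:
        return None
    return max(names, key=version_key)
```

Comparing integer lists rather than strings is what makes `34.0.0` sort above `9.0.0`; a plain lexicographic sort would pick `9.0.0`.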
11. Program Structure
A program is a sequence of top-level declarations and #import directives. Execution begins at main.
main :: () {
// entry point
}
main takes no arguments and returns void. The process exit code is 0 unless otherwise specified.
12. Grammar (informal)
program = top_level*
top_level = decl | import_decl
import_decl = '#import' STRING ';'
| IDENT '::' '#import' STRING ';'
decl = const_decl | var_decl | fn_decl | enum_decl | struct_decl
const_decl = IDENT '::' expr ';'
| IDENT ':' type ':' expr ';'
var_decl = IDENT ':=' expr ';'
| IDENT ':' type '=' expr ';'
| IDENT ':' type ';'
fn_decl = IDENT '::' '(' params? ')' ('->' type)? block
| IDENT '::' block
enum_decl = IDENT '::' 'enum' '{' (IDENT ';')* '}'
struct_decl = IDENT '::' 'struct' '{' struct_member* '}'
struct_member = field_group | '#using' IDENT ';'
field_group = IDENT (',' IDENT)* ':' type ('=' expr)? ';'
params = param (',' param)* ','?
param = IDENT ':' type ('=' expr)?
block = '{' stmt* '}'
stmt = decl | assignment ';' | multi_assign ';' | return_stmt | defer_stmt | insert_stmt
| push_stmt | break_stmt | continue_stmt | expr ';'
return_stmt = 'return' expr? ';'
break_stmt = 'break' ';'
continue_stmt = 'continue' ';'
defer_stmt = 'defer' expr ';'
insert_stmt = '#insert' expr ';'
push_stmt = 'push' expr block
assignment = lvalue ('=' | '+=' | '-=' | '*=' | '/=') expr
multi_assign = lvalue (',' lvalue)+ '=' expr (',' expr)+
lvalue = IDENT | postfix '.' IDENT
expr = if_expr | match_expr | while_expr | for_expr | lambda | binary
while_expr = 'while' expr block
for_expr = 'for' expr ':' '(' IDENT (',' IDENT)? ')' block
binary = unary (binop unary)*
unary = ('-' | '!' | 'xx' | 'cast' '(' type ')') postfix
| postfix
postfix = primary ('(' args? ')' | '.' IDENT | '.{' field_init_list '}')*
primary = INT | HEX_INT | BIN_INT | FLOAT | STRING | BOOL | IDENT | '---'
| '.' IDENT | '.' '{' field_init_list '}'
| '(' expr ')' | block | '#run' expr
field_init_list = field_init (',' field_init)* ','?
field_init = IDENT '=' expr | IDENT | expr
if_expr = 'if' expr 'then' expr ('else' expr)?
| 'if' expr block ('else' block)?
match_expr = 'if' expr '==' '{' case_arm* else_arm? '}'
case_arm = 'case' pattern ':' (stmt* | 'break' ';')
else_arm = 'else' ':' stmt*
pattern = '.' IDENT | INT | BOOL | IDENT
lambda = '(' params? ')' ('->' type)? '=>' expr
args = expr (',' expr)* ','?
type = '$' IDENT | 's32' | 'f32' | 'f64' | 'bool' | 'string'
| 'Any' | 'Type' | '..' type | '[' expr ']' type | IDENT
13. Open Questions
- Nested functions: Can functions be defined inside other functions?
- Operator overloading: Not shown — presumably no.
- Top-level expressions: Are bare expressions allowed at the top level or only declarations?