# sx

An experimental systems programming language with Jai-inspired syntax, compile-time execution, generics, closures, protocols, and an LLVM backend.

> **Status**: Highly experimental. The language and compiler are under active development.

## At a Glance

```sx
#import "modules/std.sx";

Point :: struct {
    x, y: i32;
    magnitude :: (self: *Point) -> f32 { sqrt(self.x * self.x + self.y * self.y); }
}

main :: () {
    p := Point.{ x = 3, y = 4 };
    print("point: {}, magnitude: {}\n", p, p.magnitude());
}
```

**Key characteristics:**

- Jai-inspired declaration syntax: `name :: value` for constants, `name := value` for variables
- Compiles to native code via LLVM 19
- Compile-time execution with `#run`
- Generics via monomorphization
- First-class closures with value capture
- Protocol-based polymorphism (traits)
- Pattern matching on enums, optionals, and type categories
- C interop via `extern` / `export` and `#import c`
- Targets: macOS (ARM64, x86_64), Linux (x86_64, ARM64), Windows (x86_64), WebAssembly

## Building

Requires **Zig 0.16+** and **LLVM 19+**.

```sh
zig build
```

On macOS with Homebrew LLVM:
```sh
# default path: /opt/homebrew/opt/llvm@22 (Homebrew `llvm@22`)
zig build
```

Custom LLVM path:
```sh
zig build -Dllvm-prefix=/path/to/llvm
```

## Usage

```sh
sx run file.sx              # compile and run
sx build file.sx            # compile to binary
sx build file.sx -o out     # compile with output path
sx ir file.sx               # emit LLVM IR
sx lsp                      # start language server
```

Options:
```
--target <triple>   target platform (shortcuts: macos, linux, windows, wasm)
--opt <level>       optimization: none, less, default, aggressive
--cpu <name>        target CPU
-o <path>           output path
```

## Language Overview

### Types

| Type | Description |
|------|-------------|
| `i8`..`i64`, `u8`..`u64` | Signed/unsigned integers (default: `i64`) |
| `f32`, `f64` | Floating point (default: `f32`) |
| `bool` | `true` / `false` |
| `string` | UTF-8 fat pointer `{ptr, len}` |
| `[N]T` | Fixed-size array |
| `[]T` | Slice (fat pointer) |
| `*T`, `[*]T` | Single / many pointer |
| `?T` | Optional |
| `struct`, `enum`, `union` | Composite types |
| `Closure(args) -> ret` | Closure type |

A `[*]T` many-pointer carries no length, so it does **not** implicitly coerce to
a slice `[]T`; slice it with an explicit length: `ptr[0..len]`. (A fixed array `[N]T`
*does* coerce to `[]T`, because its length is known.)

**Numeric limits.** A field-like access on a builtin integer type name folds to
a compile-time constant of that type: `i64.max` → `9223372036854775807`,
`u8.min` → `0`, `i3.max` → `3`. It works for every width `i1`..`i64` / `u1`..`u64`
plus `usize`/`isize`, and is usable anywhere a constant of that type is, including
array dimensions (`[u8.max]T` is a 255-element array).

The float types `f32`/`f64` expose `.min` / `.max` too, plus float-only accessors:

- `.min`: the most-negative finite value (`-max`), **not** C's `DBL_MIN`
- `.epsilon`: the ULP of 1.0, not C#'s denormal `Epsilon`
- `.min_positive`: the smallest normal value (C's `DBL_MIN`)
- `.true_min`: the smallest subnormal (beware flush-to-zero CPU modes)
- `.inf` and `.nan`

A float-only accessor on an integer (`i32.epsilon`), or any accessor on a
non-numeric type, is a clean compile error.

The fold applies only to a bare type-name receiver. A raw identifier that binds a
value shadowing a type name (`` `f64 := … `` then `` `f64.epsilon ``) reads the
value's field, not the limit, whether the binding is a local, a global, or a module
constant. That access stays an ordinary *runtime* field read even when it flows
into an integer binding or an array dimension: the binding truncates the field's
value and the dimension is a non-constant count, never the builtin limit. See
`specs.md` → Numeric Limits.

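The integer values quoted above follow from two's-complement widths, and the float accessors map onto standard IEEE-754 quantities. As a cross-check, here is a small Python model (not sx source; `int_limits` and the constant names are hypothetical, and Python's `float` is assumed to be an IEEE-754 double standing in for `f64`):

```python
import math
import sys


def int_limits(bits: int, signed: bool) -> tuple[int, int]:
    """Return (min, max) for a two's-complement integer of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# Integer accessors quoted in the text.
I64_MAX = int_limits(64, signed=True)[1]        # i64.max
U8_MIN, U8_MAX = int_limits(8, signed=False)    # u8.min, u8.max
I3_MAX = int_limits(3, signed=True)[1]          # i3.max

# f64 accessors, modeled with Python's float.
F64_MAX = sys.float_info.max            # .max
F64_MIN = -sys.float_info.max           # .min: most-negative finite, i.e. -max
F64_EPSILON = sys.float_info.epsilon    # .epsilon: ULP of 1.0
F64_MIN_POSITIVE = sys.float_info.min   # .min_positive: smallest normal (C's DBL_MIN)
F64_TRUE_MIN = math.ulp(0.0)            # .true_min: smallest subnormal
```

Note how `F64_MIN` and `F64_MIN_POSITIVE` are different quantities, which is the distinction the text draws between sx's `.min` and C's `DBL_MIN`.
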
### Declarations

```sx
// Constants (compile-time when possible)
PI :: 3.14159;
MAX : i32 : 100;

// Variables (mutable)
x := 42;          // inferred type
y : i32 = 0;      // explicit type
z : i32 = ---;    // uninitialized
```

A typed constant's initializer must be compatible with its annotation: an
integer fits any integer or float type, a float fits a float type, a string fits
`string`, and `null` fits a pointer or optional. The check is type-based, so it
covers a literal and a constant expression alike: both `N : string : 4` and
`N : string : M + 2` are a compile-time `type mismatch` error, not a silently
accepted constant. Mixed int+float arithmetic promotes to the float in either
operand order (`n + 0.5` and `0.5 + n` are both `f64`), so `C : i64 : M + 0.5` is
rejected regardless of order, while `F : f64 : M + 0.5` folds to `2.5` (taking `M`
to be the integer constant `2`).

**Aggregate constants.** Array- and struct-typed `::` constants are immutable
globals: each has one storage, reads index into it directly, whole-value uses copy
by value, and unused tables are dropped from the binary. `::` is the one and only
const spelling (`const` is not a keyword):

```sx
K : [4]i64 : .[11, 22, 33, 44];           // typed array const
A :: .[1, 2, 3];                          // untyped, infers [3]i64
M :: .[1, 2.2, 3];                        // numeric mix promotes to [3]f64
LIT :: Color.{ r = 255, g = 0, b = 0 };   // struct const, also one global

N :: K[0] + K[3];                         // 55: const element reads fold at compile time
D : [K.len]u8 = ---;                      // [4]u8: .len and LIT.r fold in dimensions too
K[0] = 5;                                 // error: cannot assign through constant 'K'
```

Writes through any constant's name (element, field, or compound) are compile
errors; a local copy (`k := K`) stays writable. A struct constant whose
initializer calls a function (`CALL :: Color.{ r = bump(), … }`) is re-evaluated
at each use (documented contract); use `NAME :: #run f();` for evaluate-once.

**Float → integer narrowing (unified rule).** A float flowing into an
integer-typed binding *without* a cast follows the same integral-fold rule an
array dimension uses: an **integral** compile-time float folds to its integer, and
a **non-integral** one is a compile error.

The rule holds whether the value is a literal or *any* compile-time-constant float
expression, including:

- a reference to a float-typed const (`F : f64 : 2.5; y : i64 = F + 1.5` → `4`)
- a builtin float numeric-limit accessor (`f64.max - f64.max` → `0`, while `f64.true_min + 0.5` errors)
- a float `%` (`6.0 % 4.0` → `2`, while `5.5 % 2.0` = `1.5` errors)
- a float `/` (`6.0 / 2.0` → `3`, while `5.0 / 2.0` = `2.5` errors); a float `/` is always float division, never integer truncation, even with integral operands

The compile-time float evaluator recognises every leaf shape the integer one does,
so no constant float form escapes the rule at one site while folding at another.
The rule is also uniform across a typed local, a parameter default, a struct field
default, a call argument, a typed constant, **and an array dimension / count**:

- `y : i64 = 4.0`, `K : i64 : 4.0`, `y : i64 = M + 2.0`, and `[F + 1.5]i64` (≡ `[4]i64`, whether written directly, through a const, or via a type alias) all give `4`.
- `y : i64 = 1.5`, `N : i64 : 1.5`, `y : i64 = M + 0.5`, `y : i64 = F + 0.25` (= `2.75`), and `[F + 0.25]i64` all error.

The binding sites share one wording, `cannot implicitly narrow non-integral float …`;
a dimension instead reports
`array dimension must be an integer, but '…' is a non-integral float`, since the
cast escape does not apply in a count position. An explicit `xx` / `cast(i64)`
is the escape hatch and always truncates (`y : i64 = xx 1.5` → `1`,
`y : i64 = xx (M + 0.5)` → `2`); a genuine runtime float is likewise unaffected.

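The rule can be restated as two tiny functions. The following is a Python model of the behaviour described above, not the compiler's implementation: `narrow_implicit` and `narrow_cast` are hypothetical names, and `M = 2` / `F = 2.5` are assumed values chosen to match the examples in the text.

```python
def narrow_implicit(value: float) -> int:
    """Implicit rule: fold an integral compile-time float, reject the rest."""
    if not value.is_integer():
        raise TypeError(f"cannot implicitly narrow non-integral float {value!r}")
    return int(value)


def narrow_cast(value: float) -> int:
    """Explicit `xx` / `cast(i64)` escape hatch: always truncates."""
    return int(value)


M = 2      # assumed integer constant, consistent with the examples
F = 2.5    # F : f64 : 2.5

# Integral results fold to an integer.
folded = [narrow_implicit(v) for v in (4.0, M + 2.0, F + 1.5, 6.0 % 4.0, 6.0 / 2.0)]

# Non-integral results are rejected.
rejected = []
for v in (1.5, M + 0.5, F + 0.25, 5.5 % 2.0, 5.0 / 2.0):
    try:
        narrow_implicit(v)
    except TypeError:
        rejected.append(v)

# The explicit cast truncates instead of erroring.
truncated = [narrow_cast(1.5), narrow_cast(M + 0.5)]
```

Here `folded` corresponds to the accepted forms, `rejected` collects the values that would be compile errors in sx, and `truncated` shows the `xx` escape hatch.
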
Builtin type names (`i2`, `u8`, `bool`, `string`, …) are reserved, and a *bare*
spelling can't be used as an identifier at a **value-binding or declaration-name**
site. Each of the following is an error: a value binding (`:=` / typed local /
parameter), a `::` constant or function declaration, an `impl` method definition,
or a `::` type declaration (`struct` / `enum` / `union` / alias / `protocol` / …).
For example, `i2 :: 5` and `i2 :: (n) { … }` are rejected just like `i2 := 5`.

**Member-name positions are exempt**: a struct *field*, a union *tag*, and a
protocol *method-signature* may be a bare reserved spelling (`struct { i2: i64 }`,
`union { u8: … }`, `protocol { i2 :: (self: *Self) -> i64 }`). They are reached via
`obj.name`, so they never mis-lower. The bare exemption covers only the
identifier-classified reserved names (`i1`..`i64`, `u1`..`u64`, `bool`, `string`,
`void`, `usize`, `isize`, `Any`); `f32` and `f64` are lexer keywords, so even in a
member slot they need the backtick (`` struct { `f32: i64 } ``).

**After a leading `.`**, however, even a full lexer keyword is accepted bare as the
member/variant name: the dot makes the keyword reading impossible, so no backtick
is needed. This covers the enum-literal (`.enum`), field-access (`x.enum`,
`E.struct`), and match-arm (`case .enum:`) positions. A *declaration* of such a
variant still needs the backtick (`` enum { `enum: i64 } ``), since the decl site
has no disambiguating dot:

```sx
TI :: enum { `enum: i64; closed; }    // decl: backtick needed (no dot)
t := TI.enum(7);                      // construct: dot disambiguates, bare `enum`
if t == { case .enum: (v) { … }       // match: likewise bare after `.`
          case .closed: { … } }
```

A leading backtick escapes a reserved name into a **raw identifier**:
`` `name `` is the literal identifier `name` (the backtick drops out of the text).
It is usable in **every** position (value, declaration, and type) and is optional
in the exempt member positions. It is the only way handwritten sx can spell a
reserved name in a binding or declaration site.

```sx
`i2 := 2.5;                  // identifier "i2", distinct from the i2 type
print("{}\n", `i2);          // 2.5 (or bare `i2` in value position)

`i2 :: struct { x: i64; }    // declare a type named with a reserved spelling
v : `i2 = ---;               // and reference it as a type; resolves to the struct
x : i2 = 3;                  // bare `i2` in type position is still the int type
```

A raw identifier works in every identifier position: local, global, parameter,
struct field, union tag, function name, type/alias/import name, a top-level or
struct-body constant, and the control-flow / capture / binding forms (destructure,
`if`/`while` binding, `for` capture, match capture, `catch`/`onfail` tag). A
reserved-spelled function is bare-callable (`i2(10)`). A backtick name used as a
type resolves to a `` `name ``-declared type, including a parameterized template
(`` `i2(i64) ``) and under pointer/optional wrappers; otherwise it is a normal
`unknown type` error.

Extern declarations from `#import c { … }` are exempt automatically: C names that
collide with reserved type names (e.g. `i1`, `i2`) import unedited, and an extern
reserved-name function is bare-callable by its C name.

### Structs

```sx
Vec3 :: struct {
    x, y, z: f32;

    length :: (self: *Vec3) -> f32 {
        sqrt(self.x * self.x + self.y * self.y + self.z * self.z);
    }
}

v := Vec3.{ x = 1, y = 2, z = 3 };
v2 := Vec3.{ 1, 2, 3 };    // positional
print("{}\n", v.length());
```

Structs support field defaults, `#using` for composition, and methods defined in the body.

### Enums (Tagged Unions)

```sx
Shape :: enum {
    circle: f32;
    rect: struct { w, h: f32; };
    none;
}

area :: (s: Shape) -> f32 {
    if s == {
        case .circle: (r) => 3.14159 * r * r;
        case .rect: (r) => r.w * r.h;
        case .none: 0;
    }
}
```

Flag enums with power-of-2 values:
```sx
Perms :: enum flags { read; write; execute; }
rw := Perms.read | Perms.write;
```

Set a variant by construction (`s = .circle(2.0)`), which writes the tag and
payload together. Direct member assignment to a variant (`s.circle = 2.0`) is
rejected, because it would set the payload but not the tag. Mutating a sub-field
of the active variant in place (`s.rect.w = 9.0`) is fine.

### Optionals

```sx
x: ?i32 = 42;
y: ?i32 = null;

val := x ?? 0;      // null coalescing
forced := x!;       // force unwrap (traps on null)

if v := x {         // safe unwrap
    print("{}\n", v);
}

// Optional chaining
node: ?Node = get_node();
name := node?.name ?? "unknown";
```

### Generics

```sx
max :: (a: $T, b: T) -> T {
    if a > b then a else b;
}

List :: struct ($T: Type) {
    items: [*]T;
    len: i64;

    append :: (self: *List(T), item: T) { ... }
}
```

Generic constraints via protocols:
```sx
are_equal :: ($T: Type/Eq, a: T, b: T) -> bool { a.eq(b); }
```

### Closures

```sx
make_adder :: (n: i64) -> Closure(i64) -> i64 {
    closure((x: i64) -> i64 => x + n);
}

add5 := make_adder(5);
print("{}\n", add5(100));    // 105
```

Closures capture by value. Bare functions auto-promote to closures when needed.

### Protocols

```sx
Drawable :: protocol {
    draw :: (self: *Self, x: i32, y: i32);    // receiver is explicit + required
}

impl Drawable for Circle {
    draw :: (self: *Circle, x: i32, y: i32) { ... }
}

shape : Drawable = xx my_circle;    // type erasure via xx
shape.draw(10, 20);                 // dynamic dispatch
```

Every protocol method declares its receiver explicitly as the first parameter
(`self: *Self` or `self: Self`), matching the `impl` signature; the annotation is
erased before dispatch, so the call site is unchanged. A protocol method that
omits the receiver is a parse error.

`#inline` protocols store function pointers directly (no vtable indirection):
```sx
Allocator :: protocol #inline {
    alloc :: (self: *Self, size: i64) -> *void;
    dealloc :: (self: *Self, ptr: *void);
}
```

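The explicit-receiver rule amounts to a validate-then-strip step on each protocol method signature. Below is a minimal Python model of that step, assuming the check runs on a list of `(name, type)` pairs; `strip_receiver` and the error text are hypothetical and are not taken from the sx compiler.

```python
def strip_receiver(method: str, params: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Check the explicit receiver of a protocol method signature, then drop it.

    `params` holds (name, type) pairs exactly as written in the protocol body.
    """
    if not params or params[0][0] != "self" or params[0][1] not in ("Self", "*Self"):
        # Hypothetical wording; the real diagnostic may differ.
        raise SyntaxError(
            f"protocol method '{method}' must declare 'self: *Self' or 'self: Self' first"
        )
    return params[1:]  # only the extra arguments remain for dispatch


# draw :: (self: *Self, x: i32, y: i32);
draw_args = strip_receiver("draw", [("self", "*Self"), ("x", "i32"), ("y", "i32")])
```

Because only the extra arguments remain after the strip, a call such as `shape.draw(10, 20)` passes two arguments, matching the example above.
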
### Pattern Matching

```sx
// On enums
if shape == {
    case .circle: (r) => print("radius: {}\n", r);
    case .rect: (r) => print("{}x{}\n", r.w, r.h);
    case .none: print("nothing\n");
}

// On optionals
if opt == {
    case .some: (val) => use(val);
    case .none: fallback();
}

// On type categories (via Any)
if type_of(val) == {
    case int: print("integer\n");
    case string: print("string\n");
    case struct: print("struct\n");
}
```

### Control Flow

```sx
// Chained comparisons
if 0 <= x <= 100 { ... }

// While
while i < 10 { i += 1; }

// For: collections, ranges, and parallel iteration
for items (val) { print("{}\n", val); }
for items, 0.. (val, idx) { print("[{}] = {}\n", idx, val); }
for 1..=5, 0.. (a, b) { print("{}:{}\n", a, b); }    // a: 1..5, b follows
for items (val) => total += val;                     // arrow body
for 0<..<n (i) { }                                   // bound markers: 1 .. n-1
for 0=..=n (i) { }                                   // 0 .. n
sub := items[1..=3];                                 // slices take them too

// Defer
f := open("file.txt");
defer close(f);

// Multi-target assignment (atomic swap)
a, b = b, a;
```

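The bound markers in the `for` examples can be read as "`=` keeps the bound, `<` excludes it". The Python sketch below models only the marked forms shown in the comments above; `marked_range` is a hypothetical helper, the unmarked lower bound is treated as inclusive based on the `1..=5` comment, and an unmarked upper bound is deliberately not modeled because the text does not define it.

```python
def marked_range(lo: int, hi: int, lo_marker: str = "=", hi_marker: str = "=") -> range:
    """Expand `lo <marker>..<marker> hi`: '=' keeps the bound, '<' excludes it."""
    for marker in (lo_marker, hi_marker):
        if marker not in ("=", "<"):
            raise ValueError(f"unknown bound marker {marker!r}")
    start = lo if lo_marker == "=" else lo + 1
    stop = hi + 1 if hi_marker == "=" else hi
    return range(start, stop)


n = 5
exclusive = list(marked_range(0, n, "<", "<"))   # models 0<..<n
inclusive = list(marked_range(0, n, "=", "="))   # models 0=..=n
one_to_five = list(marked_range(1, 5))           # models 1..=5
```
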
### Pipe Operator

```sx
result := data |> parse() |> transform() |> serialize();
// equivalent to: serialize(transform(parse(data)))
```

### Compile-Time Execution

```sx
// Evaluate at compile time
FIBONACCI_10 :: #run fib(10);

// Generate code at compile time
#insert #run generate_lookup_table();
```

### C Interop

C linkage:
```sx
libc :: #library "c";
printf :: (fmt: [:0]u8, args: ..Any) -> i32 extern libc;
write_fd :: (fd: i32, buf: [*]u8, count: u64) -> i64 extern libc "write";
```

`extern` / `export` are the keyword surface for C linkage. `extern` imports a
symbol defined elsewhere; `export` is its dual: it defines a function in sx and
exposes it under the C ABI so C can call back in. Both imply `abi(.c)` and take
the same optional `[LIB] ["csym"]` rename tail; they also apply to data globals and
to Obj-C / JNI runtime-class aggregates (postfix after the `#objc_class(…)` directive).
```sx
abs :: (x: i32) -> i32 extern;                   // import an external C symbol
sx_square :: (x: i32) -> i32 export { x * x }    // define + expose to C
__stdinp : *void extern;                         // extern data global
NSObject :: #objc_class("NSObject") extern { alloc :: () -> *NSObject; }    // reference a runtime class
```

Direct C header import:
```sx
#import c {
    #include "vendors/mylib/api.h";
    #source "vendors/mylib/impl.c";
};
```

### Inline Assembly

`asm` is an expression. The body is a brace block: a template string first, then
operands and an optional `clobbers(.…)` clause. Each operand is
`[name]? "constraint" <role>`, where the role is `-> Type` (a value output) or
`= expr` (an input). It compiles to an LLVM inline-asm call (AT&T syntax).

```sx
// one value output, two register-class inputs
add :: (a: i64, b: i64) -> i64 {
    return asm { "add %[out], %[a], %[b]", [out] "=r" -> i64, [a] "r" = a, [b] "r" = b };
}
```

The `%[name]` in the template refers to an operand; `%%` is a literal `%`. An
operand pinned to a register (`"={rax}"`, `"{rdi}"`) is **auto-named after that
register**, so an explicit `[name]` is only needed for register-class (`=r`)
operands or to give a value a name distinct from its register. A label that just
echoes its register (`[rax] "={rax}"`) is rejected.

Outputs decide the result: **0** → `void` (and the asm must be `volatile`);
**1** → that type; **N** → a tuple, named by each operand's name.

```sx
// multiple value outputs → a destructurable tuple
split :: (x: u64) -> (lo: u64, hi: u64) {
    return asm {
        #string ASM
        and %[l], %[x], #0xff
        lsr %[h], %[x], #8
        ASM,
        [l] "=r" -> u64, [h] "=r" -> u64, [x] "r" = x,
    };
}
lo, hi := split(0x1234);    // (52, 18)
```

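The `(52, 18)` result in the `split` example can be checked by hand: `and` with `#0xff` keeps the low byte (`0x34` = 52) and `lsr` by 8 shifts the rest down (`0x12` = 18). The same arithmetic as a Python sketch, which mirrors the two instructions only and is not sx code:

```python
def split(x: int) -> tuple[int, int]:
    """Mirror the two instructions of the asm template on a non-negative integer."""
    lo = x & 0xFF   # and %[l], %[x], #0xff  (keep the low byte)
    hi = x >> 8     # lsr %[h], %[x], #8     (logical shift right by 8)
    return lo, hi


lo, hi = split(0x1234)
```
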
`asm volatile { … }` marks side effects (required when there are no outputs). A
multi-instruction template uses the `#string` heredoc (delivered verbatim, with no
escape processing). `clobbers(.cc, .memory, .rax)` lists registers/flags the asm
trashes that aren't operands.

A top-level `asm { … }` block is **global assembly**: template only (no
operands or `volatile`), emitted as module-level asm. Symbols it defines are
reached with a lib-less `extern`:

```sx
asm {
    #string ASM
    .global _my_add
    _my_add:
    add x0, x0, x1
    ret
    ASM,
};
my_add :: (a: i64, b: i64) -> i64 extern;
```

Inline asm is target-specific and never runs at compile time. See
[docs/inline-assembly.md](docs/inline-assembly.md) for the full guide
(place outputs, global asm, the cookbook) and `examples/16xx-platform-asm-*`
for the runnable matrix.

### Modules

```sx
#import "modules/std.sx";           // flat import
math :: #import "modules/math";     // namespaced import (directory: all .sx files merged)
```

A path that matches both a file and a same-named sibling directory
(`modules/std.sx` next to `modules/std/`) is rejected as ambiguous; write the
`.sx` path to import the file.

When two flat-imported modules each define a function of the same name, every
module's own code binds its OWN author: a bare call resolves to the same-name
function in the caller's module (or in its single flat import that provides it).
A bare call to a name that two or more flat imports both provide is ambiguous and
is rejected; qualify it with a namespaced import (`m :: #import …; m.fn()`).

A **namespaced** import only binds its alias: reach the module's members as
`m.name`. Bare-name visibility joins over flat (`#import "…"`) imports, never over
a namespaced alias. The rules for that join are:

- **Non-transitive for every bare member kind (functions, constants, AND types alike).** A flat import of a flat import is NOT bare-visible: when `A` imports `B` and `B` imports `C`, `A` does not see `C`'s top-level names, including its types, so qualify them, or `#import "C"` directly if you reference them.
- **Parameterized type heads are gated too.** A generic struct / parameterized protocol / type-returning function used as `Box(i64)` is gated exactly like a bare leaf type: the constructor head must be reachable over your own or a direct flat import, not two hops away.
- **Namespaced-only members are not bare-visible.** A bare reference to a namespaced-only import's member (function, module constant, or **type**, leaf or generic head) is rejected (`type 'X' is not visible; #import the module that declares it`); qualify it as `m.name`.
- **The type gate holds wherever a bare type name is named**, not just in plain annotations: a value/field annotation, a reflection / type-arg slot (`size_of(T)`, `size_of(*T)`), a typed array-literal head (`T.[…]`), a parameterized head (`Box(i64)`), or a type-as-value / type-match arm.
- **Own-wins** holds at every one of those sites too, exactly like a bare call: when the querying module declares its OWN same-name type, that bare reference resolves to ITS author, never a same-name flat import.
- **Ambiguity** is enforced at every one of those sites as well: a bare type (including a type-returning function head) that two or more flat imports each declare, with no own author to win, is **ambiguous and rejected** (`type 'X' is ambiguous: it is declared in multiple flat-imported modules; qualify the reference or remove the duplicate import`), never a silent pick of one author.

Qualifying the reference is a real escape hatch for a **generic head** too:
`ns.Box(args)` selects the template AUTHORED by `ns`'s module, so two namespaces
each declaring a same-name `Box($T)` with different layouts stay distinct types
(`a.Box(i64)` and `b.Box(i64)` instantiate their own author's fields), never the
global last-wins template. (A library's own *internal* type references still
resolve: a generic struct / pack fn / protocol body is instantiated in the module
that defines it, so e.g. `List(T).append`'s `alloc: Allocator` is visible there
regardless of the call site.)

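The bare-name rules above (own-wins, direct flat imports only, reject on ambiguity) can be summarised as a short lookup. This is a Python model under simplifying assumptions, not the compiler's resolver: the module graph, `resolve_bare`, and `ResolveError` are all hypothetical, and namespaced aliases and carried namespaces are left out.

```python
class ResolveError(LookupError):
    """Raised when a bare name is not visible or is ambiguous."""


# Hypothetical module graph: own top-level declarations plus direct flat imports.
MODULES = {
    "a": {"decls": {"main"}, "flat": ["b", "d"]},
    "b": {"decls": {"helper", "Box"}, "flat": ["c"]},
    "c": {"decls": {"deep"}, "flat": []},
    "d": {"decls": {"Box", "util"}, "flat": []},
    "e": {"decls": {"Box"}, "flat": ["b", "d"]},
}


def resolve_bare(name: str, module: str) -> str:
    """Return the module that authors `name` for a bare reference inside `module`."""
    if name in MODULES[module]["decls"]:
        return module  # own-wins: the querying module's declaration always binds
    # Only direct flat imports are searched; the join is non-transitive.
    authors = [m for m in MODULES[module]["flat"] if name in MODULES[m]["decls"]]
    if len(authors) == 1:
        return authors[0]
    if not authors:
        raise ResolveError(f"'{name}' is not visible")
    raise ResolveError(f"'{name}' is ambiguous: declared in {authors}")
```

In this graph, `a` sees `helper` through `b`, cannot see `deep` (it lives two hops away in `c`), and gets an ambiguity error for `Box`, while `e` resolves `Box` to its own declaration.
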
**Namespace aliases carry one level.** A namespaced import is an ordinary
declaration, and flat-importing the module that declares it makes the alias
usable in the importer; there is no `pub` keyword. The stdlib prelude uses
exactly this: `std.sx` is itself a pure re-export facade (every bare prelude
name is an alias into the `std/core.sx` / `std/fmt.sx` / `std/list.sx`
part-files), and `#import "modules/std.sx"` gives every bare prelude name
(`print`, `List`, `Context`, …) plus the carried namespaces: the
part-files (`core`, `fmt`, `list`) and std's tail
(`mem`, `fs`, `process`, `socket`, `json`, `cli`, `hash`, `xml`, `log`, `test`):

```sx
#import "modules/std.sx";

main :: () {
    gpa := mem.GPA.init();    // mem :: #import, carried from std.sx
    log.warn("count = {}", 3);
    s := xml.escape("<a & b>");
}
```

Carried aliases follow declaration rules: an own declaration shadows a carried
alias, two flat imports carrying the same alias make its use ambiguous, and
carry does not chain through a second flat hop.

**Re-exporting through alias declarations.** Since visibility never chains,
a facade re-exports another module's members as its OWN declarations
(ordinary aliases), which its direct flat importers then see bare. This works
for functions of every kind (plain, generic, comptime-pack like `print`),
plain types, and generic struct heads alike (the generic alias binds the
same template, so instantiation and methods resolve through it), renamed
or same-name:

```sx
// facade.sx
r :: #import "rich.sx";
helper :: r.helper;    // fn re-export
Thing :: r.Thing;      // struct re-export
Box :: r.Box;          // generic head re-export, same template

// consumer.sx
#import "facade.sx";
b := Box(i64).{ item = 3 };    // rich.sx's Box, via the facade
```

### Implicit Context

Every program gets an implicit `context` with a default allocator:

```sx
// No boilerplate needed: context is auto-initialized
main :: () {
    list := List(i64).create();    // uses context.allocator
    list.append(42);
}

// Override allocator for a scope
push Context.{ allocator = my_arena } {
    do_work();    // all allocations use my_arena
}
```

## Quick Sort Example

```sx
#import "modules/std.sx";

quick_sort :: (items: []$T) {
    partition :: (items: []T, lo: i64, hi: i64) -> i64 {
        pivot := items[hi];
        i := lo - 1;
        j := lo;
        while j < hi {
            if items[j] < pivot {
                i += 1;
                items[i], items[j] = items[j], items[i];
            }
            j += 1;
        }
        i += 1;
        items[i], items[hi] = items[hi], items[i];
        i;
    }

    sort :: (items: []T, lo: i64, hi: i64) {
        if lo < hi {
            pi := partition(items, lo, hi);
            sort(items, lo, pi - 1);
            sort(items, pi + 1, hi);
        }
    }

    sort(items, 0, items.len - 1);
}

main :: () {
    arr : []i64 = .[333, 2, 3, 5, 2, 2, 3, 4, 5, 6, 6, 1];
    quick_sort(arr);
    print("{}\n", arr);
    // [1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 6, 333]
}
```

## Standard Library

The standard library (`modules/std.sx`) provides:

- **I/O**: `print(fmt, args...)`, `out(str)`
- **Collections**: `List($T)` (dynamic array)
- **Strings**: `concat`, `substr`, `int_to_string`, `uint_to_string`, `float_to_string`, `cstring`
- **Memory**: `Allocator` protocol, `GPA` (general purpose), `Arena` (bump allocator)
- **Math**: `sqrt`, `sin`, `cos`
- **Introspection**: `type_of`, `type_name`, `type_is_unsigned`, `type_eq`, `field_count`, `field_name`, `field_value`, `size_of`, `align_of`, `is_flags`. The type-only builtins (`size_of`, `align_of`, `field_count`, `type_name`, `type_eq`, `type_is_unsigned`, `is_flags`) require a type argument (a spelled type or a generic `T`); passing a value is a compile-time error. A runtime `Type` value (`type_of(x)`) is currently accepted by `type_name` and `type_is_unsigned` only; the other five are compile-time-only (runtime reflection is deferred).

### Atomics (`modules/std/atomic.sx`)

Opt-in import (not in the universal prelude). `Atomic($T)` is a transparent
wrapper over an integer/pointer-sized `T`; the memory `Ordering` is a **compile-time
value parameter** (`$o`). LLVM atomic ordering is an instruction attribute, so it
must be known at compile time, and it's always explicit (Rust-style, no default):

```sx
#import "modules/std/atomic.sx";

counter : Atomic(i64) = .init(0);
counter.store(0, .relaxed);
n := counter.load(.acquire);
prev := counter.fetch_add(1, .seq_cst);    // + fetch_sub/and/or/xor/min/max
old := counter.swap(42, .acq_rel);

// compare-exchange returns ?T: null = SUCCESS; a present value is the actual
// current value on failure (for a retry loop). `_weak` may fail spuriously.
got := counter.compare_exchange(old, 99, .acq_rel, .acquire);
if got == null { /* swapped */ } else { /* retry with got! */ }

fence(.seq_cst);    // standalone memory fence (.relaxed is rejected)
```

`Ordering` = `relaxed`/`acquire`/`release`/`acq_rel`/`seq_cst`. Invalid combinations
are compile errors (a load can't be `.release`; a store can't be `.acquire`; CAS's
failure ordering can't be stronger than success; a fence can't be `.relaxed`).
RMW/CAS/swap are integer-only. The same operations run at compile time (`#run`)
under single-threaded semantics, matching the runtime result.

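The four invalid combinations listed above can be expressed as a small checker. This Python sketch covers only the combinations the text names; `ordering_error` is a hypothetical helper, and the numeric `STRENGTH` ranking used to compare the CAS success and failure orderings is an assumption, not something the text specifies.

```python
from typing import Optional

# Assumed strength ranking, used only to compare CAS success/failure orderings.
STRENGTH = {"relaxed": 0, "acquire": 1, "release": 1, "acq_rel": 2, "seq_cst": 3}


def ordering_error(op: str, order: str, failure: Optional[str] = None) -> Optional[str]:
    """Return a message for an invalid combination named in the text, else None."""
    if op == "load" and order == "release":
        return "a load can't be .release"
    if op == "store" and order == "acquire":
        return "a store can't be .acquire"
    if op == "fence" and order == "relaxed":
        return "a fence can't be .relaxed"
    if op == "compare_exchange" and failure is not None:
        if STRENGTH[failure] > STRENGTH[order]:
            return "CAS failure ordering can't be stronger than success"
    return None


# The calls from the example above are all accepted by this model.
accepted = [
    ordering_error("store", "relaxed"),
    ordering_error("load", "acquire"),
    ordering_error("compare_exchange", "acq_rel", "acquire"),
    ordering_error("fence", "seq_cst"),
]
```
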
### Command-line interface (`modules/std/cli.sx`)

`std.cli` builds command-line front-ends over an explicit logical argv
(`[]string`): `os_args(buf)` reads the real process argv, and
`parse(args, commands, diag) -> !Parsed` does subcommand dispatch + `--flag`
parsing. On top of that it defines the small **exit-code / `--json` contract**
a CLI program (e.g. `dist`) relies on:

```sx
#import "modules/std/cli.sx";

p, e := parse(args, cmds, @diag);    // (Parsed, !CliError)
if e == error.UnknownCommand {
    log.err("unknown command '{}'", diag.token);    // human text -> stderr
    exit_usage();                                   // usage error -> exit 64
}
if p.json { /* emit ONLY machine output on stdout */ }
```

- **Named exit codes**: `EX_OK` (0), `EX_USAGE` (64, the sysexits.h
  command-line-usage code), `EX_UNAVAILABLE` (70, unsupported platform).
- **Terminators**: `exit_ok()` / `exit_usage()` end the process with the
  matching code; both route through the canonical `process.exit(code: u8)`.
- **`--json` mode**: the reserved global `--json` flag surfaces as
  `parsed.json` (true iff `--json` is in the argv). Convention: in json mode
  stdout carries only the machine result; human diagnostics go to stderr.

## Cross-Compilation

```sh
sx build app.sx --target linux          # Linux x86_64 (glibc, dynamic)
sx build app.sx --target linux-musl     # Linux x86_64 (musl, static)
sx build app.sx --target macos-arm      # macOS ARM64
sx build app.sx --target windows        # Windows x86_64 (MSVC)
sx build app.sx --target windows-gnu    # Windows x86_64 (MinGW)
sx build app.sx --target wasm           # WebAssembly
```

### Self-contained builds (bundled `zig`)

For macOS / Linux / Windows targets, sx can link with a bundled `zig` as
`zig cc` instead of the host's system linker. It supplies lld, the CRT, and
libc (musl/glibc/mingw), so no `cc`/SDK needs to be installed. The default
Linux output is statically-linked musl, which runs on any Linux.

```sh
sx build app.sx --target linux-musl --self-contained    # static, portable ELF
sx build app.sx --self-contained                        # host target, hermetic link
SX_ZIG=/path/to/zig sx build app.sx --self-contained    # pin a specific zig
sx build app.sx --no-self-contained                     # force the system toolchain
```

`--self-contained` uses a `zig` found via `$SX_ZIG`, a bundled copy next to the
`sx` binary, or `zig` on `PATH`. In a packaged distribution (with a bundled
`zig` alongside `sx`) the backend activates automatically; a `PATH`-only `zig`
is used only when `--self-contained` is passed, so native dev builds are never
silently rerouted. Set `SX_DEBUG_ZIG=1` to trace discovery.

## Acknowledgments

- [Jonathan Blow](https://en.wikipedia.org/wiki/Jonathan_Blow) for Jai, the language that inspired this one
- [Andrew Kelley](https://andrewkelley.me) for Zig, which made this compiler a joy to write

## License

MIT