Files
sx/readme.md
agra bb8f7dc5ec fix(stdlib/E4): route reflection/literal/value/match bare-type sites through the non-transitive gate
attempt-3 closed the leaf + parameterized-head leaks but several more
sites still resolved an UNQUALIFIED type name via the global
type_alias_map / findByName / type_bridge.resolveAstType without the
single-hop visibility gate, so a 2-flat-hop bare type leaked through:

  - resolveTypeArg (reflection / size_of / align_of / type_name / type_eq):
    identifier + type_expr leaves now gate via headTypeLeak; the wrapped /
    structural forms (*T, [N]T, []T, ?T, fn-ptr, tuple) route through the
    already-gated resolveTypeWithBindings so each inner leaf recurses the
    source-aware resolveNominalLeaf.
  - resolveTupleLiteralTypeArg: each element leaf is resolved through the
    source-aware resolver before the delegated build, so (COnly, s64) is
    gated.
  - resolveArrayLiteralType (T.[...] typed array/vector-literal head):
    identifier + type_expr leaves gate via headTypeLeak.
  - type-as-value lowerExpr identifier (x: Type = COnly, x == COnly).
  - type-category match arm (case COnly:).

Qualified ns.X / 1-hop / source-pinned library-internal references stay
exempt (the gate falls through for reachable / unauthored names, and
returns the existing "unresolved type" diagnostic for genuinely-undeclared
names). README notes the type gate holds wherever a bare type name is
named. New regressions 0765 (2-hop reject) / 0766 (1-hop pass).
2026-06-08 13:18:51 +03:00

535 lines
18 KiB
Markdown

# sx
An experimental systems programming language with Jai-inspired syntax, compile-time execution, generics, closures, protocols, and an LLVM backend.
> **Status**: Highly experimental. The language and compiler are under active development.
## At a Glance
```sx
#import "modules/std.sx";
Point :: struct {
x, y: s32;
magnitude :: (self: *Point) -> f32 { sqrt(self.x * self.x + self.y * self.y); }
}
main :: () {
p := Point.{ x = 3, y = 4 };
print("point: {}, magnitude: {}\n", p, p.magnitude());
}
```
**Key characteristics:**
- Jai-inspired declaration syntax: `name :: value` for constants, `name := value` for variables
- Compiles to native code via LLVM 19
- Compile-time execution with `#run`
- Generics via monomorphization
- First-class closures with value capture
- Protocol-based polymorphism (traits)
- Pattern matching on enums, optionals, and type categories
- C interop via `#foreign` and `#import c`
- Targets: macOS (ARM64, x86_64), Linux (x86_64, ARM64), Windows (x86_64), WebAssembly
## Building
Requires **Zig 0.16+** and **LLVM 19+**.
```sh
zig build
```
On macOS with Homebrew LLVM:
```sh
# default path: /opt/homebrew/opt/llvm@19
zig build
```
Custom LLVM path:
```sh
zig build -Dllvm-prefix=/path/to/llvm
```
## Usage
```sh
sx run file.sx # compile and run
sx build file.sx # compile to binary
sx build file.sx -o out # compile with output path
sx ir file.sx # emit LLVM IR
sx lsp # start language server
```
Options:
```
--target <triple> target platform (shortcuts: macos, linux, windows, wasm)
--opt <level> optimization: none, less, default, aggressive
--cpu <name> target CPU
-o <path> output path
```
## Language Overview
### Types
| Type | Description |
|------|-------------|
| `s8`..`s64`, `u8`..`u64` | Signed/unsigned integers (default: `s64`) |
| `f32`, `f64` | Floating point (default: `f32`) |
| `bool` | `true` / `false` |
| `string` | UTF-8 fat pointer `{ptr, len}` |
| `[N]T` | Fixed-size array |
| `[]T` | Slice (fat pointer) |
| `*T`, `[*]T` | Single / many pointer |
| `?T` | Optional |
| `struct`, `enum`, `union` | Composite types |
| `Closure(args) -> ret` | Closure type |
**Numeric limits.** A field-like access on a builtin integer type name folds to
a compile-time constant of that type: `s64.max``9223372036854775807`,
`u8.min``0`, `s3.max``3`. It works for every width `s1`..`s64` / `u1`..`u64`
plus `usize`/`isize`, and is usable anywhere a constant of that type is — including
array dimensions (`[u8.max]T` is a 255-element array). The float types `f32`/`f64`
expose `.min` / `.max` too (with `.min` = most-negative finite = `-max`, **not**
C's `DBL_MIN`) plus the float-only `.epsilon` (ULP of 1.0, not C#'s denormal
`Epsilon`), `.min_positive` (smallest normal = C `DBL_MIN`), `.true_min` (smallest
subnormal — beware flush-to-zero CPU modes), `.inf`, and `.nan`. A float-only
accessor on an integer (`s32.epsilon`), or any accessor on a non-numeric type, is
a clean compile error. The fold applies only to a bare type-name receiver: a raw
identifier that binds a value shadowing a type name (`` `f64 := … `` then
`` `f64.epsilon ``) reads the value's field, not the limit — for a local, global,
or module-constant binding alike. This stays an ordinary *runtime* field read
even when it flows into an integer binding or an array dimension, so it truncates
(its field value) / is a non-constant count — never the builtin limit. See
`specs.md` → Numeric Limits.
### Declarations
```sx
// Constants (compile-time when possible)
PI :: 3.14159;
MAX : s32 : 100;
// Variables (mutable)
x := 42; // inferred type
y : s32 = 0; // explicit type
z : s32 = ---; // uninitialized
```
A typed constant's initializer must be compatible with its annotation — an
integer fits any integer or float, a float a float type, a string `string`,
`null` a pointer/optional. The check is type-based, so it covers a literal and a
constant expression alike: both `N : string : 4` and `N : string : M + 2` are a
compile-time `type mismatch` error, not a silently-accepted constant. Mixed
int+float arithmetic promotes to the float in either operand order (`n + 0.5` and
`0.5 + n` are both `f64`), so `C : s64 : M + 0.5` is rejected regardless of order
while `F : f64 : M + 0.5` folds to `2.5`.
**Float → integer narrowing (unified rule).** A float flowing into an
integer-typed binding *without* a cast follows the same integral-fold rule an
array dimension uses: an **integral** compile-time float folds to its integer, a
**non-integral** one is a compile error. It holds whether the value is a literal
or *any* compile-time-constant float expression — including one that references a
float-typed const (`F : f64 : 2.5; y : s64 = F + 1.5` → `4`), a builtin float
numeric-limit accessor (`f64.max - f64.max` → `0`, while `f64.true_min + 0.5`
errors), a float `%` (`6.0 % 4.0` → `2`, while `5.5 % 2.0` = `1.5` errors), or a
float `/` (`6.0 / 2.0` → `3`, while `5.0 / 2.0` = `2.5` errors — a float `/` is
always float division, never integer truncation, even with integral operands):
the compile-time float evaluator recognises every leaf shape the integer one does, so
no constant float form escapes the rule at one site while folding at another — and
is uniform
across a typed local, a parameter default, a struct field default, a call
argument, a typed constant, **and an array dimension / count** — `y : s64 = 4.0`,
`K : s64 : 4.0`, `y : s64 = M + 2.0`, and `[F + 1.5]s64` (≡ `[4]s64`, whether
written directly, through a const, or via a type alias) all give `4`, while
`y : s64 = 1.5`, `N : s64 : 1.5`, `y : s64 = M + 0.5`, `y : s64 = F + 0.25`
(= `2.75`), and `[F + 0.25]s64` all error (one wording at the binding sites:
`cannot implicitly narrow non-integral float …`; a dimension instead reports
`array dimension must be an integer, but '…' is a non-integral float`, since the
cast escape does not apply in a count position). An explicit `xx` / `cast(s64)`
is the escape hatch and always truncates (`y : s64 = xx 1.5` → `1`,
`y : s64 = xx (M + 0.5)` → `2`); a genuine runtime float is likewise unaffected.
Builtin type names (`s2`, `u8`, `bool`, `string`, …) are reserved and a *bare*
spelling can't be used as an identifier at a **value-binding or declaration-name**
site — a value binding (`:=` / typed local / parameter), a `::` constant or
function declaration, an `impl` method definition, or a `::` type declaration
(`struct` / `enum` / `union` / alias / `protocol` / …) — each is an error
(`s2 :: 5` and `s2 :: (n) { … }` are rejected just like `s2 := 5`). **Member-name
positions are exempt**: a struct *field*, a union *tag*, and a protocol
*method-signature* may be a bare reserved spelling (`struct { s2: s64 }`,
`union { u8: … }`, `protocol { s2 :: () -> s64 }`) — they are reached via `obj.name`,
so they never mis-lower. The bare exemption covers only the identifier-classified
reserved names (`s1`..`s64`, `u1`..`u64`, `bool`, `string`, `void`, `usize`,
`isize`, `Any`); `f32` and `f64` are lexer keywords, so even in a member slot they
need the backtick (`` struct { `f32: s64 } ``). A leading backtick escapes one into
a **raw identifier**:
`` `name `` is the literal identifier `name` (the backtick drops out of the text),
usable in **every** position — value, declaration, and type, and optional in the
exempt member positions. It is the only way handwritten sx can spell a reserved
name in a binding or declaration site.
```sx
`s2 := 2.5; // identifier "s2", distinct from the s2 type
print("{}\n", `s2); // 2.5 (or bare `s2` in value position)
`s2 :: struct { x: s64; } // declare a type named with a reserved spelling
v : `s2 = ---; // and reference it as a type — resolves to the struct
x : s2 = 3; // bare `s2` in type position is still the int type
```
It works in every identifier position — local, global, parameter, struct field,
union tag, function name, type/alias/import name, a top-level or struct-body
constant, and the control-flow / capture / binding forms (destructure, `if`/`while`
binding, `for` capture, match capture, `catch`/`onfail` tag) — and a reserved-spelled
function is bare-callable (`s2(10)`). A backtick name used as a type resolves to a
`` `name ``-declared type — including a parameterized template (`` `s2(s64) ``) and
under pointer/optional wrappers — else a normal `unknown type` error.
Foreign declarations from `#import c { … }` are exempt automatically: C names that
collide with reserved type names (e.g. `s1`, `s2`) import unedited, and a foreign
reserved-name function is bare-callable by its C name.
### Structs
```sx
Vec3 :: struct {
x, y, z: f32;
length :: (self: *Vec3) -> f32 {
sqrt(self.x * self.x + self.y * self.y + self.z * self.z);
}
}
v := Vec3.{ x = 1, y = 2, z = 3 };
v2 := Vec3.{ 1, 2, 3 }; // positional
print("{}\n", v.length());
```
Structs support field defaults, `#using` for composition, and methods defined in the body.
### Enums (Tagged Unions)
```sx
Shape :: enum {
circle: f32;
rect: struct { w, h: f32; };
none;
}
area :: (s: Shape) -> f32 {
if s == {
case .circle: (r) => 3.14159 * r * r;
case .rect: (r) => r.w * r.h;
case .none: 0;
}
}
```
Flag enums with power-of-2 values:
```sx
Perms :: enum flags { read; write; execute; }
rw := Perms.read | Perms.write;
```
### Optionals
```sx
x: ?s32 = 42;
y: ?s32 = null;
val := x ?? 0; // null coalescing
forced := x!; // force unwrap (traps on null)
if v := x { // safe unwrap
print("{}\n", v);
}
// Optional chaining
node: ?Node = get_node();
name := node?.name ?? "unknown";
```
### Generics
```sx
max :: (a: $T, b: T) -> T {
if a > b then a else b;
}
List :: struct ($T: Type) {
items: [*]T;
len: s64;
append :: (self: *List(T), item: T) { ... }
}
```
Generic constraints via protocols:
```sx
are_equal :: ($T: Type/Eq, a: T, b: T) -> bool { a.eq(b); }
```
### Closures
```sx
make_adder :: (n: s64) -> Closure(s64) -> s64 {
closure((x: s64) -> s64 => x + n);
}
add5 := make_adder(5);
print("{}\n", add5(100)); // 105
```
Closures capture by value. Bare functions auto-promote to closures when needed.
### Protocols
```sx
Drawable :: protocol {
draw :: (x: s32, y: s32);
}
impl Drawable for Circle {
draw :: (self: *Circle, x: s32, y: s32) { ... }
}
shape : Drawable = xx my_circle; // type erasure via xx
shape.draw(10, 20); // dynamic dispatch
```
`#inline` protocols store function pointers directly (no vtable indirection):
```sx
Allocator :: protocol #inline {
alloc :: (size: s64) -> *void;
dealloc :: (ptr: *void);
}
```
### Pattern Matching
```sx
// On enums
if shape == {
case .circle: (r) => print("radius: {}\n", r);
case .rect: (r) => print("{}x{}\n", r.w, r.h);
case .none: print("nothing\n");
}
// On optionals
if opt == {
case .some: (val) => use(val);
case .none: fallback();
}
// On type categories (via Any)
if type_of(val) == {
case int: print("integer\n");
case string: print("string\n");
case struct: print("struct\n");
}
```
### Control Flow
```sx
// Chained comparisons
if 0 <= x <= 100 { ... }
// While
while i < 10 { i += 1; }
// For (arrays and slices)
for items: (val) { print("{}\n", val); }
for items: (val, idx) { print("[{}] = {}\n", idx, val); }
// Defer
f := open("file.txt");
defer close(f);
// Multi-target assignment (atomic swap)
a, b = b, a;
```
### Pipe Operator
```sx
result := data |> parse() |> transform() |> serialize();
// equivalent to: serialize(transform(parse(data)))
```
### Compile-Time Execution
```sx
// Evaluate at compile time
FIBONACCI_10 :: #run fib(10);
// Generate code at compile time
#insert #run generate_lookup_table();
```
### C Interop
Foreign functions:
```sx
libc :: #library "c";
printf :: (fmt: [:0]u8, args: ..Any) -> s32 #foreign libc;
write_fd :: (fd: s32, buf: [*]u8, count: u64) -> s64 #foreign libc "write";
```
Direct C header import:
```sx
#import c {
#include "vendors/mylib/api.h";
#source "vendors/mylib/impl.c";
};
```
### Modules
```sx
#import "modules/std.sx"; // flat import
math :: #import "modules/math.sx"; // namespaced import
```
When two flat-imported modules each define a function of the same name, every
module's own code binds its OWN author — a bare call resolves to the same-name
function in the caller's module (or in its single flat import that provides it).
A bare call to a name that two or more flat imports both provide is ambiguous and
is rejected; qualify it with a namespaced import (`m :: #import …; m.fn()`).
A **namespaced** import only binds its alias: reach the module's members as
`m.name`. Bare-name visibility joins over flat (`#import "…"`) imports, never over
a namespaced alias. That join is **non-transitive for every bare member kind —
functions, constants, AND types alike**: a flat import of a flat import is NOT
bare-visible (when `A` imports `B` and `B` imports `C`, `A` does not see `C`'s
top-level names — including its types — so qualify them, or `#import "C"` directly
if you reference them). This holds for a *parameterized* type head too: a generic
struct / parameterized protocol / type-returning function used as `Box(s64)` is
gated exactly like a bare leaf type — the constructor head must be reachable over
your own or a direct flat import, not two hops away. A bare reference to a
namespaced-only import's member — function, module constant, or **type** (leaf or
generic head) — is likewise not visible and is rejected (`type 'X' is not visible;
#import the module that declares it`); qualify it as `m.name`. The type gate holds
wherever a bare type name is named — a value/field annotation, a reflection /
type-arg slot (`size_of(T)`, `size_of(*T)`), a typed array-literal head (`T.[…]`),
or a type-as-value / type-match arm — not just plain annotations. (A library's own *internal* type references still resolve: a generic
struct / pack fn / protocol body is instantiated in the module that defines it, so
e.g. `List(T).append`'s `alloc: Allocator` is visible there regardless of the call
site.)
### Implicit Context
Every program gets an implicit `context` with a default allocator:
```sx
// No boilerplate needed — context is auto-initialized
main :: () {
list := List(s64).create(); // uses context.allocator
list.append(42);
}
// Override allocator for a scope
push Context.{ allocator = my_arena } {
do_work(); // all allocations use my_arena
}
```
## Quick Sort Example
```sx
#import "modules/std.sx";
quick_sort :: (items: []$T) {
partition :: (items: []T, lo: s64, hi: s64) -> s64 {
pivot := items[hi];
i := lo - 1;
j := lo;
while j < hi {
if items[j] < pivot {
i += 1;
items[i], items[j] = items[j], items[i];
}
j += 1;
}
i += 1;
items[i], items[hi] = items[hi], items[i];
i;
}
sort :: (items: []T, lo: s64, hi: s64) {
if lo < hi {
pi := partition(items, lo, hi);
sort(items, lo, pi - 1);
sort(items, pi + 1, hi);
}
}
sort(items, 0, items.len - 1);
}
main :: () {
arr : []s64 = .[333, 2, 3, 5, 2, 2, 3, 4, 5, 6, 6, 1];
quick_sort(arr);
print("{}\n", arr);
// [1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 6, 333]
}
```
## Standard Library
The standard library (`modules/std.sx`) provides:
- **I/O**: `print(fmt, args...)`, `out(str)`
- **Collections**: `List($T)` (dynamic array)
- **Strings**: `concat`, `substr`, `int_to_string`, `uint_to_string`, `float_to_string`, `cstring`
- **Memory**: `Allocator` protocol, `GPA` (general purpose), `Arena` (bump allocator)
- **Math**: `sqrt`, `sin`, `cos`
- **Introspection**: `type_of`, `type_name`, `type_is_unsigned`, `type_eq`, `field_count`, `field_name`, `field_value`, `size_of`, `align_of`, `is_flags` — the type-only builtins (`size_of`, `align_of`, `field_count`, `type_name`, `type_eq`, `type_is_unsigned`, `is_flags`) require a type argument (a spelled type or a generic `T`); passing a value is a compile-time error. A runtime `Type` value (`type_of(x)`) is currently accepted by `type_name` and `type_is_unsigned` only — the other five are compile-time-only (runtime reflection is deferred)
### Command-line interface (`modules/std/cli.sx`)
`std.cli` builds command-line front-ends over an explicit logical argv
(`[]string`): `os_args(buf)` reads the real process argv, and
`parse(args, commands, diag) -> !Parsed` does subcommand dispatch + `--flag`
parsing. On top of that it defines the small **exit-code / `--json` contract**
a CLI program (e.g. `dist`) relies on:
```sx
#import "modules/std/cli.sx";
p, e := parse(args, cmds, @diag); // (Parsed, !CliError)
if e == error.UnknownCommand {
log.err("unknown command '{}'", diag.token); // human text -> stderr
exit_usage(); // usage error -> exit 64
}
if p.json { /* emit ONLY machine output on stdout */ }
```
- **Named exit codes** — `EX_OK` (0), `EX_USAGE` (64, the sysexits.h
command-line-usage code), `EX_UNAVAILABLE` (70, unsupported platform).
- **Terminators** — `exit_ok()` / `exit_usage()` end the process with the
matching code; both route through the canonical `process.exit(code: u8)`.
- **`--json` mode** — the reserved global `--json` flag surfaces as
`parsed.json` (true iff `--json` is in the argv). Convention: in json mode
stdout carries only the machine result; human diagnostics go to stderr.
## Cross-Compilation
```sh
sx build app.sx --target linux # Linux x86_64
sx build app.sx --target macos-arm # macOS ARM64
sx build app.sx --target windows # Windows x86_64
sx build app.sx --target wasm # WebAssembly
```
## Acknowledgments
- [Jonathan Blow](https://en.wikipedia.org/wiki/Jonathan_Blow) for Jai, the language that inspired this one
- [Andrew Kelley](https://andrewkelley.me) for Zig, which made this compiler a joy to write
## License
MIT