# sx language specification

## 1. Lexical Structure

### Comments
Line comments start with `//` and extend to end of line.
```sx
// this is a comment
```

### Identifiers
- Lowercase or mixed-case for variables, functions: `x`, `compute`, `main`
- UPPER_SNAKE_CASE for constants: `SOME_INT`, `SOME_STR`
- PascalCase for types: `Foo`

### Literals

| Kind | Examples | Type |
|-----------|---------------------|---------|
| Integer | `0`, `42`, `0xFF`, `0b1010` | `s64` |
| Float | `0.3`, `0.9` | `f32` |
| String | `"Hello"`, `"z: {z}"` | `string` |
| Multi-line String | `` `line1\nline2` `` | `string` |
| Heredoc String | `#string END`...`END` | `string` |
| Boolean | `true`, `false` | `bool` |
| Enum | `.variant1` | inferred from context |
| Undefined | `---` | context-dependent |

**Multi-line strings** use backtick delimiters (`` ` ``). They may span multiple lines and support the same escape sequences as regular strings (`\n`, `\t`, `\r`, `\\`, `\"`, `` \` ``, `\0`). Content between backticks is taken verbatim (no indentation stripping).
```sx
shader_src := `#version 330 core
void main() {
    gl_Position = vec4(0.0);
}
`;
```

**Heredoc strings** use `#string DELIMITER` syntax (inspired by Jai). Content is completely raw — no escape processing. The delimiter is any identifier. Content starts after the newline following the delimiter and ends when the delimiter appears at column 0 of a line.
```sx
vert_src := #string GLSL
#version 330 core
void main() {
    gl_Position = vec4(aPos, 1.0);
}
GLSL;
```

### Keywords
`if`, `else`, `then`, `while`, `for`, `break`, `continue`, `true`, `false`, `enum`, `struct`, `union`, `case`, `return`, `defer`, `xx`, `and`, `or`

> Note: `enum` is used for both payload-less and payload-bearing sum types (tagged unions). `union` is reserved for C-style untagged unions (memory overlays).

### Operators

| Operator | Meaning |
|----------|------------------|
| `+` | addition |
| `-` | subtraction / negation |
| `*` | multiplication |
| `/` | division |
| `%` | modulo |
| `==` | equality |
| `!=` | inequality |
| `<` | less than |
| `>` | greater than |
| `<=` | less or equal |
| `>=` | greater or equal |
| `&` | bitwise AND |
| `\|` | bitwise OR |
| `and` | logical AND (short-circuit) |
| `or` | logical OR (short-circuit) |
| `+=` | add-assign |
| `-=` | sub-assign |
| `*=` | mul-assign |
| `/=` | div-assign |

### Delimiters and Punctuation

| Token | Meaning |
|--------|--------------------------------------|
| `::` | constant binding / definition |
| `:=` | variable binding (mutable, inferred) |
| `:` | type annotation |
| `=` | assignment (in typed var decl) |
| `;` | statement terminator |
| `,` | separator |
| `.` | field access / enum literal prefix |
| `->` | return type annotation |
| `=>` | lambda arrow |
| `$` | generic type parameter introduction |
| `---` | undefined value |
| `()` | grouping / params |
| `{}` | blocks / bodies |

---

## 2. Type System

### Primitive Types
- `s1`..`s64` — signed integers (1 to 64 bits). `s64` is the default for integer literals.
- `u1`..`u64` — unsigned integers (1 to 64 bits).
- `f32` — 32-bit floating point
- `f64` — 64-bit floating point
- `bool` — boolean (`true` / `false`)
- `string` — string of characters
- `Any` — type-erased value, represented as `{ i64, i64 }` (type tag + payload). Used for variadic arguments and runtime type dispatch.
- `Type` — compile-time type value. At runtime, represented as an `i64` type tag (same tag space as `Any`).

### Enum Types
User-defined sum types with named variants. Variants may optionally carry typed data (tagged unions). Internally, payload-less enums are represented as `i64` (variant index). Enums with payloads are represented as `{ i64, [max_payload_size x i8] }` (tag + data).

#### Declaration
```sx
// Payload-less enum
Color :: enum {
    red;
    green;
    blue;
}

// Enum with payloads (tagged union)
Shape :: enum {
    circle: f32;   // typed variant
    rect: s32;     // typed variant
    none;          // void variant
}
```
Variants are referenced with dot-prefix syntax: `.variant1`

#### Construction
```sx
c := Color.red;              // payload-less
s :Shape = .circle(3.14);    // inferred from context
s = .none;                   // void variant
s = Shape.rect(42);          // explicit prefix
```

#### Payload Access
```sx
r := s.circle; // load payload as f32 (undefined behavior if wrong variant active)
```

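
Because reading the payload of an inactive variant is undefined behavior, a payload access is usually guarded by a match on the active variant. The following is a minimal sketch using the `Shape` enum declared above. The function name `describe` is hypothetical, and the sketch assumes that `print` accepts `f32` and `s32` payloads as `{}` arguments.

```sx
// Hypothetical helper: reads a payload only inside the arm for its variant.
describe :: (s: Shape) {
    if s == {
        case .circle: print("circle, payload {}\n", s.circle);
        case .rect:   print("rect, payload {}\n", s.rect);
        case .none:   print("none\n");
    }
}
```
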
#### Pattern Matching
```sx
if s == {
    case .circle: print("circle\n");
    case .rect: print("rect\n");
    case .none: print("none\n");
}
```

#### Enum Interpolation
Payload-less enums print as `.variant`. Enums with payloads print as `.variant(value)` or `<TypeName tag=N>`:
```sx
print("{}", s); // .circle(3.140000)
```

### Union Types (Untagged)
C-style untagged unions for zero-cost memory overlays (type punning). All fields share the same memory — no tag, no runtime overhead. The LLVM representation is `[max_field_size x i8]`.

#### Declaration
```sx
Overlay :: union {
    f: f32;
    i: s32;
}
```
All fields must have types (unlike enums, which may have void variants).

#### Anonymous Struct Fields (Member Promotion)
Anonymous `struct` fields inside a union have their members promoted to the union namespace:
```sx
Vec2 :: union {
    data: [2]f32;
    struct { x, y: f32; };
}
```
Access promoted members directly: `v.x`, `v.y` — these are zero-cost GEPs into the same underlying memory as `v.data[0]`, `v.data[1]`.

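
As a short sketch of member promotion, the snippet below writes through the promoted members and reads the same storage back through the array view. It assumes the `Vec2` union declared above; the variable names are illustrative only.

```sx
v : Vec2 = ---;        // unions start out undefined
v.x = 1.0;             // writes the storage of v.data[0]
v.y = 2.0;             // writes the storage of v.data[1]
second := v.data[1];   // reads back the storage written through v.y
```
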
#### Initialization
Unions must be initialized with `---` (undefined) and then assigned per-field:
```sx
o :Overlay = ---;
o.f = 3.14;
print("{}\n", o.i); // reinterpret bits as s32
```

#### Restrictions
- Pattern matching (`if x == { case ... }`) is not supported on unions.
- Unions cannot be printed directly via `print("{}", union_val)` — access individual fields instead.

### Struct Types
User-defined product types with named fields.
```sx
Vec4 :: struct {
    x, y, z, w: f32;
}
```
Fields are declared as `name1, name2: type;` (comma-separated names sharing a type, semicolon-terminated).

#### Field Defaults
Fields may have default values. Fields without an explicit default have a zero-value default. `---` marks a field as explicitly undefined.
```sx
Foo :: struct {
    a : u2;          // default is 0
    b : u8 = 42;     // default is 42
    c : u8 = ---;    // default is undefined
}
```

#### Struct Literals
```sx
// Positional (with type annotation — type inferred from annotation)
v1 : Vec4 = .{ 1, 2, 3, 0 };

// Positional (with type prefix)
v2 := Vec4.{ 4, 1, 1, 3 };

// Named fields (any order)
v3 := Vec4.{ w=0, x=2, y=3, z=4 };

// Mixed named + shorthand (bare identifier = field name matches variable name)
z := 5.0;
w := 6.0;
v4 := Vec4.{ y=3, x=9, w, z };
```

#### Field Access and Assignment
```sx
v1.x          // read field x of struct v1
v1.x = 3.0;   // assign to field x of struct v1
```

#### Struct Interpolation
Struct values in string interpolation print as `TypeName{field:value, ...}`:
```sx
print("{}", v1); // Vec4{x:1.0, y:2.0, z:3.0, w:0.0}
```

### Array Types
Fixed-size arrays with element type and length.
```sx
buffer : [5]f32 = .[0, 2, 3.5, 4, 0];
val := buffer[2];   // 3.5
buffer.len          // 5 (compile-time constant, s64)
```

Arrays can also be constructed programmatically with the `Array` builtin:
```sx
MyArr :: Array(5, s32); // equivalent to [5]s32
```

### Slice Types
A slice `[]T` is a fat pointer `{ptr, i64}` referencing a contiguous sequence of `T` elements. Same runtime layout as `string`.
```sx
// Arrays implicitly coerce to slices at call sites
arr : [5]s32 = .[3, 1, 4, 1, 5];
sortSlice(arr);     // [5]s32 → []s32 coercion

// Slice operations
items[i]            // read element at index
items[i] = val;     // write element at index
items.len           // length (s64)
items.ptr           // raw pointer
```

Slices support generic type parameters: `[]$T` introduces type parameter `T` inferred from the element type of the argument (array or slice).

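
A generic slice parameter might be used as in the sketch below. The helper `total` is hypothetical, and the sketch assumes that the type parameter `T` can be used as a type inside the function body and that a default-initialized `T` is zero for numeric element types.

```sx
// Hypothetical generic helper: T is inferred from the element type.
total :: (items: []$T) -> T {
    acc : T;                   // default-initialized (zero for primitives)
    for items { acc += it; }   // `it` is the current element
    return acc;
}

main :: {
    arr : [5]s32 = .[3, 1, 4, 1, 5];
    sum := total(arr);         // [5]s32 coerces to []s32, so T = s32
}
```
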
### Subslicing
Arrays, slices, and strings support subslice syntax to create zero-copy views:
```sx
arr : [5]s32 = .[3, 1, 4, 1, 5];
sub := arr[1..4];     // []s32 → [1, 4, 1]
head := arr[..3];     // []s32 → [3, 1, 4]
tail := arr[2..];     // []s32 → [4, 1, 5]

msg := "hello world";
word := msg[6..11];   // string → "world"
```
- `expr[start..end]` — elements from `start` (inclusive) to `end` (exclusive)
- `expr[start..]` — elements from `start` to end
- `expr[..end]` — elements from beginning to `end`
- Result type: `[]T` for arrays/slices, `string` for strings
- No memory allocation — the result points into the original backing storage

### Pointer Types

| Syntax | Meaning | `.len` | `[i]` |
|--------|---------|--------|-------|
| `*T` | pointer to one T | no | no |
| `[*]T` | many-pointer (buffer) | no | yes |
| `*[N]T` | pointer to array of N T | yes | yes |
| `*[]T` | pointer to slice | yes | yes |

**Address-of**: `&x` returns a pointer to the variable.
```sx
v := Vec2.{ 1.0, 2.0 };
ptr := &v; // *Vec2
```

**Dereference**: `p.*` loads the value through the pointer.
```sx
copy := ptr.*; // Vec2
```

**Auto-deref**: `p.field` is sugar for `p.*.field`.
```sx
set_x :: (p: *Vec2, val: f32) {
    p.x = val; // auto-deref: p.*.x = val
}
set_x(&v, 99.0);
```

**Null**: All pointer types are nullable. `null` is the null pointer literal.
```sx
np : *Vec2 = null;
```

**Many-pointer**: `[*]T` supports indexing for buffers of unknown size.
```sx
arr : [5]s32 = .[10, 20, 30, 40, 50];
mp : [*]s32 = &arr[0];   // *s32 → [*]s32 implicit
val := mp[2];            // 30
```

**Implicit conversions**:
- `*T` → `[*]T` (pointer to element → many-pointer)
- `*[N]T` → `[*]T` (pointer to array → many-pointer)
- `[N]T` → `[*]T` at call sites (array decays to many-pointer)
- `[]T` → `[*]T` (slice decays to many-pointer, extracts `.ptr`)
- `T` → `*T` at call sites (implicit address-of)
- `null` (`*void`) → any `*T`

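
The call-site conversions listed above could look like this in practice. This is a sketch only: the function `fill` is hypothetical, and it assumes that a many-pointer supports indexed writes in the same way it supports the indexed reads shown earlier.

```sx
// Hypothetical function taking a many-pointer plus an explicit element count.
fill :: (buf: [*]s32, count: s64, val: s32) {
    i := 0;
    while i < count {
        buf[i] = val;    // assumption: indexed writes work like indexed reads
        i += 1;
    }
}

main :: {
    arr : [4]s32;
    fill(arr, arr.len, 7);     // [4]s32 → [*]s32 (array decays at the call site)
    fill(arr[1..3], 2, 9);     // []s32 → [*]s32 (slice decays, passing its .ptr)
}
```

Since a many-pointer carries no length, the count is passed separately in this sketch.
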
**Fat pointer layout**: `[:0]u8`, `string`, and `[]T` are `{ptr, i64}` structs. The raw pointer is always the first field at offset 0. This means `*[:0]u8` works as C's `char**` — a C function dereferences through the outer pointer and reads the raw `char*` from offset 0.

### C Interop Type Mapping

| C type | sx type | Notes |
|--------|---------|-------|
| `const char*` (input) | `[:0]u8` | compiler extracts `.ptr` at call site |
| `char*` (output buffer) | `[*]u8` | raw buffer, no length |
| `const char**` | `*[:0]u8` | address of `[:0]u8` — `.ptr` at offset 0 |
| `int*` (single out) | `*s32` | |
| `unsigned*` (single out) | `*u32` | |
| `float*` (buffer) | `[*]f32` | |
| `void*` (generic) | `*void` | only for truly opaque/generic data |

### Vector Types (SIMD)
LLVM SIMD vectors, parameterized by length and element type.
```sx
v := vec3(1, 3, 2); // Vector(3, f32)
```

**Arithmetic**: Element-wise `+`, `-`, `*`, `/` on vectors of same dimensions.
```sx
add := v1 + v2; // element-wise addition
```

**Scalar broadcast**: Scalar operands are broadcast to match the vector.
```sx
scaled := v * 2.0; // [2.0, 6.0, 4.0]
```

**Negation**: Unary `-` negates each element.
```sx
neg := -v; // [-1.0, -3.0, -2.0]
```

**Element access**: `.x`, `.y`, `.z`, `.w` (aliases `.r`, `.g`, `.b`, `.a`) extract single components.
```sx
v.x // first element
v.z // third element
```

**Index access**: `v[i]` extracts by index.
```sx
v[0] // first element
```

**Built-in `sqrt`**: Calls LLVM `llvm.sqrt.f32`/`.f64` intrinsic.
```sx
s := sqrt(9.0); // 3.0
```

### Function Types
Expressed as `(param_types) -> return_type`.
A function with no return type annotation returns void.
```sx
// type is (s32) -> s32
compute :: (x: s32) -> s32 { x * x; }

// type is () -> void
main :: () { }
```

### Type Aliases
A name bound to an existing type.
```sx
SOME_TYPE :: f64;
```

### Generic Functions (Monomorphization)
Functions can be parameterized over types using `$T` syntax. The `$` prefix introduces a type parameter; subsequent uses of the name reference it.
```sx
sum :: (a: $T, b: T) -> T {
    return a + b;
}
```
- `$T` in a parameter type **introduces** type parameter `T`
- Bare `T` (without `$`) **references** the introduced type parameter
- At call sites, type arguments are **inferred** from actual argument types:
  ```sx
  sum(40, 2)      // T = s64 (integer literals default to s64)
  sum(1.5, 2.5)   // T = f32
  ```
- Each unique set of concrete types produces a **separate specialized function** (monomorphization)
- Multiple type parameters are supported: `(a: $T, b: $U) -> T`

### Variadic Functions
Functions can accept a variable number of arguments using `..Type` syntax:
```sx
print :: (fmt: string, args: ..Any) { ... }
```
- `..Any` means zero or more arguments, each boxed into `Any` (type tag + payload)
- The variadic parameter must be the last parameter
- At call sites, variadic arguments are automatically boxed: `print("x={}, y={}\n", x, y)`
- Inside the function body, `args` is accessed as a slice-like sequence

### Type Inference
- `::` bindings infer type from the right-hand side
- `:=` bindings infer type from the right-hand side
- Explicit annotation overrides inference: `NAME : f64 : 0.9;`
- Integer literals default to `s64`
- Float literals default to `f32`
- Enum literals (`.variant`) infer their enum type from context (expected type)

### Type Conversions

**Implicit (widening)** — allowed without annotation:
- Integer to wider integer of same signedness (`u8` → `u16`, `s8` → `s32`)
- Unsigned to strictly wider signed (`u8` → `s16`)
- Any integer to any float (`u8` → `f32`, `s32` → `f64`)
- Float to wider float (`f32` → `f64`)
- Integer and float literals can convert to any numeric type implicitly

**Explicit (narrowing)** — requires `xx` prefix:
- Integer to narrower integer (`s32` → `u8`)
- Signed to unsigned (`s32` → `u32`)
- Float to narrower float (`f64` → `f32`)
- Float to any integer (`f64` → `u16`)
- Unsigned to signed of same or narrower width (`u8` → `s8`)

The `xx` prefix operator marks an expression for auto-conversion to the expected type from context (assignment, declaration, argument, return):
```sx
large: f64 = 5999.5;
x : u16 = xx large;             // f64 → u16
d : u8 = #run xx resolve(5);    // s32 → u8 at compile time
```

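
To contrast the two groups of rules, the sketch below places implicit widening and explicit `xx` narrowing side by side. The variable names are illustrative; each line follows one of the rules listed above.

```sx
small : u8  = 200;
wide  : u16 = small;      // implicit: u8 → u16 (wider, same signedness)
sig   : s16 = small;      // implicit: u8 → s16 (unsigned to strictly wider signed)
f     : f32 = small;      // implicit: any integer → any float
back  : u8  = xx wide;    // explicit: u16 → u8 narrows, so `xx` is required
same  : s8  = xx small;   // explicit: u8 → s8 (same width, unsigned to signed)
```
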
Using `xx` outside a typed context (where the target type is known) is a compile error.

---

## 3. Declarations

### Constant Binding (immutable)

```sx
// inferred type
NAME :: value;

// explicit type
NAME : type : value;
```

The `::` operator creates an immutable binding. The value is evaluated at compile time when possible.

Examples:
```sx
SOME_INT :: 0;              // s64 (integer literal default)
SOME_STR :: "Hello";        // string
SOME_FLOAT :: 0.3;          // f32
SOME_DOUBLE : f64 : 0.9;    // f64 (explicit)
SOME_FUNC :: () => 42;      // () -> s64
SOME_TYPE :: f64;           // type alias
```

### Variable Binding (mutable)

```sx
// inferred type
name := value;

// explicit type
name : type = value;

// default-initialized (type required)
name : type;

// undefined (type required)
name : type = ---;
```

The `:=` operator creates a mutable binding. The type is inferred unless explicitly annotated.

`name : type;` initializes using the type's defaults: zero for primitives, per-field defaults for structs (see Field Defaults).

`name : type = ---;` leaves the value undefined (uninitialized memory). Reading before writing is undefined behavior.

Examples:
```sx
x := 42;                       // s64 (integer literal default), mutable
x := if true then 1 else 2;
z : Foo = .variant2;           // Foo, mutable, explicit type
a : Foo;                       // Foo, default-initialized (a=0, b=42, c=undef)
b : Foo = ---;                 // Foo, entirely undefined
```

### Function Definition

```sx
name :: (params) -> return_type {
    body
}
```

- Parameters: `name: type` separated by commas
- Return type: `-> type` (omit for void)
- Body: block of statements; last expression is the implicit return value
- No `return` keyword needed (last expression = return value)

Examples:
```sx
compute :: (x: s32) -> s32 {
    x * x;
}

main :: () {
    // void return, no -> annotation
}

// Bare-block shorthand (equivalent to no-arg void function):
main :: {
    // same as main :: () { ... }
}
```

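
An explicit `return` and the implicit return can be combined in one body. The sketch below is hypothetical (the name `clamp_low` is not part of the specification) and assumes that `return expr;` is accepted in the inline `if ... then` form, in the same way as the bare `return;` shown under If Expression.

```sx
// Hypothetical example: early exit with `return`, implicit return otherwise.
clamp_low :: (x: s32, low: s32) -> s32 {
    if x < low then return low;   // early exit with an explicit value
    x;                            // last expression is the implicit return value
}
```
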
### Enum Definition

```sx
Name :: enum {
    variant1;
    variant2;
}
```

Defines a new enum type with the given variants. Trailing comma is allowed.

### Enum Flags

```sx
Perms :: enum flags {
    read;       // 1
    write;      // 2
    execute;    // 4
}
```

The `flags` modifier assigns auto power-of-2 values (1, 2, 4, 8, ...) instead of sequential indices (0, 1, 2, ...). Flags can be combined with `|` and tested with `&`:

```sx
p :Perms = .read | .write;
if p & .execute { ... }
print("{}\n", p); // .read | .write
```

Explicit values use `::` syntax (Jai-style):

```sx
WindowFlags :: enum flags {
    vsync :: 64;
    resizable :: 4;
    hidden :: 128;
}
```

Restrictions:
- Flags enum variants cannot have payloads
- `flags` is a contextual identifier, not a keyword

### Bitwise Operators

`&` (bitwise AND) and `|` (bitwise OR) work on all integer types, not just flags. They sit at precedence level 3, between comparisons and logical operators.

```sx
x := 0xFF & 0x0F;   // 15
y := 1 | 2 | 4;     // 7
```

---

## 4. Expressions

Everything in `sx` is expression-oriented where possible.

### Operator Precedence

| Prec | Operators | Notes |
|------|-----------|-------|
| 6 (highest) | `*`, `/`, `%` | multiplication, division, modulo |
| 5 | `+`, `-` | addition, subtraction |
| 4 | `<`, `<=`, `>`, `>=`, `==`, `!=` | comparisons (chainable) |
| 3 | `&`, `\|` | bitwise AND, bitwise OR |
| 2 | `and` | logical AND (short-circuit) |
| 1 (lowest) | `or` | logical OR (short-circuit) |

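
Two consequences of this table are easy to miss, so here is a short sketch of how the levels group sub-expressions. The variable names are illustrative; the groupings follow directly from the precedence levels above.

```sx
flags := 6;

// Comparisons (level 4) bind tighter than `&` (level 3), so without
// parentheses `flags & 2 == 2` groups as `flags & (2 == 2)`.
has_bit := (flags & 2) == 2;    // parenthesized: mask first, then compare

// `and` (level 2) binds tighter than `or` (level 1), so
// `a or b and c` groups as `a or (b and c)`.
a := true;
b := false;
c := false;
r := a or b and c;              // true; (a or b) and c would be false
```
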
### Arithmetic
Standard infix: `+`, `-`, `*`, `/` with usual precedence (`*`/`/` before `+`/`-`).
```sx
x * x
x + 2
```

### Chained Comparisons
Comparison operators can be chained. Each operand is evaluated exactly once.
```sx
0 <= x <= 100       // equivalent to: 0 <= x and x <= 100
1000 > x >= -100    // equivalent to: 1000 > x and x >= -100
a == b == c         // equivalent to: a == b and b == c
```
Mixed operators are allowed: `a < b <= c > d` means `a < b and b <= c and c > d`.

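
The single-evaluation rule matters when a middle operand has side effects. In the sketch below, the function `next` is hypothetical and exists only to make the evaluation visible.

```sx
// Hypothetical function with a visible side effect.
next :: () -> s64 {
    print("called\n");
    5;
}

main :: {
    // Behaves like `0 <= t and t <= 10` with t bound to one call of next(),
    // so "called" is printed a single time.
    ok := 0 <= next() <= 10;
}
```

Writing the expanded form `0 <= next() and next() <= 10` by hand would call `next` twice, which is why the chained form is not a purely textual expansion.
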
### Logical Operators
`and` and `or` are short-circuit boolean operators. The right operand is not evaluated if the left operand determines the result.
```sx
if 0 <= x <= 100 and 0 <= y <= 100 {
    print("contained");
}
```

### If Expression (inline form)
```sx
if condition then consequent else alternate
```
Both branches are single expressions. The whole form produces a value.
```sx
x := if true then 1 else 2;
```
The `else` branch is optional. Without it, the form is a statement (no value):
```sx
if i == 2 then continue;
if done then break;
if err then return;
```

### If Expression (block form)
```sx
if condition {
    stmts
} else {
    stmts
}
```
Each branch is a block. The last expression in each block is the branch's value. Can be used inline within other expressions:
```sx
y := x + if false {
    7;
} else {
    12;
};
```

### Pattern Matching
```sx
if subject == {
    case pattern: body
    case pattern: body
    else: body          // optional default arm
}
```
Matches `subject` against each `case`. Patterns can be:
- **Enum literals**: `.variant` — matches a specific enum variant.
- **Integer/bool literals**: `42`, `true` — matches a specific value.
- **Type categories**: `int`, `struct`, `enum`, and the other category keywords listed under Type Category Matching — matches all types in that category (used with `type_of` values).

`break` exits a case arm without producing a value. The optional `else:` arm matches when no `case` pattern matches.
```sx
if z == {
    case .variant1: break;
    case .variant2:
        print("z: {z}");
    else:
        print("unknown");
}
```

#### Type Category Matching
When switching on a `Type` value (from `type_of`), category keywords match all registered types of that category:
```sx
type := type_of(val);
if type == {
    case int: result = int_to_string(xx val);
    case struct: result = struct_to_string(cast(type) val);
    case enum: result = enum_to_string(cast(type) val);
}
```
Available categories: `int`, `float`, `bool`, `string`, `struct`, `enum`, `vector`, `array`, `slice`, `pointer`, `type`.

> Note: `case enum:` matches both payload-less enums and tagged enums (enums with payloads). C-style untagged unions are not registered with the Any type system and cannot be matched by category.

Inside a category arm, `cast(type) val` performs **runtime generic dispatch**: the compiler generates a switch over all types in the category, monomorphizing the callee for each concrete type.

### While Loop
```sx
while condition {
    body
}
```
Repeats `body` as long as `condition` is true. `break;` exits the loop. `continue;` skips to the next iteration.
```sx
i := 0;
while i < 10 {
    i += 1;
    if i == 5 { continue; }
    if i == 8 { break; }
    print("{i}\n");
}
```

### For Loop
```sx
for iterable {
    // `it` is the current element
    // `it_index` is the current index (s64)
    print("{it}\n");
}
```
Iterates over arrays and slices. The loop body has two implicit variables:
- `it` — the current element value
- `it_index` — the current index (s64, starting at 0)

`break;` exits the loop. `continue;` skips to the next iteration.
```sx
arr : [5]s32 = .[1, 2, 3, 4, 5];
for arr {
    if it_index == 2 { continue; }
    print("{it}\n");
}
```

### Lambda
```sx
(params) => expr
(params) -> return_type => expr
```
Anonymous function. Produces a function value. Supports the same parameter features as named functions: `$` generic type params, `..` variadic params, and optional return type annotation.
```sx
SOME_FUNC :: () => 42;              // () -> s64
double :: (x: $T) -> T => x + x;    // generic lambda with return type
```

### Function Call
```sx
callee(args)
```
```sx
compute(6)
print("hello")
```

### Field Access
```sx
object.field
```
Used for module access (`std.print`) and struct member access.

### Enum Literal
```sx
.variant_name
```
The enum type is inferred from context (expected type from declaration or parameter).

---

## 5. Statements

Statements are terminated by `;`.

- **Declaration**: `name :: value;` / `name := value;`
- **Assignment**: `name = value;` / `name += value;` (and other compound assignments). Also supports field targets: `obj.field = value;`
- **Expression statement**: `expr;` — evaluates the expression (last in a block = return value)
- **Return**: `return expr;` — returns from the enclosing function with the given value. `return;` returns void.
- **Break**: `break;` — exits a match arm or the enclosing `while`/`for` loop
- **Continue**: `continue;` — skips to the next iteration of the enclosing `while`/`for` loop
- **Defer**: `defer expr;` — defers execution of `expr` until the enclosing block exits (LIFO order)

---

## 6. Blocks, Scoping, and Implicit Returns

A block `{ ... }` contains zero or more statements. The last expression in a block is its value (implicit return).

In function bodies, the last expression becomes the return value:
```sx
compute :: (x: s32) -> s32 {
    x * x; // this is returned
}
```

### Scope Blocks

Bare blocks can be used as statements to introduce a new lexical scope. Variables declared inside a scope block are local to that block. No trailing `;` is required.

```sx
main :: {
    x := 42;
    {
        x := 6;                 // shadows outer x
        print("inner: {x}");    // prints 6
    }
    print("outer: {x}");        // prints 42
}
```

### Variable Shadowing

A variable declaration (`name :=`) inside an inner scope shadows any variable with the same name from outer scopes. The outer variable is restored when the inner scope exits.

### Defer

`defer expr;` schedules `expr` to execute when the enclosing scope block exits. Multiple defers in the same scope execute in reverse order (LIFO).

```sx
{
    defer print("second");
    defer print("first");
}
// prints: first, then second
```

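
Because a defer is tied to its enclosing scope block, nested blocks run their deferred expressions at different times. The following sketch illustrates that rule with a bare-block `main`; the messages are illustrative only.

```sx
main :: {
    defer print("outer cleanup\n");
    {
        defer print("inner cleanup\n");
        print("inner work\n");
    }   // inner scope exits here, so its defer runs now
    print("outer work\n");
}
// prints: inner work, inner cleanup, outer work, outer cleanup
```
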

---

## 7. Built-in Functions

Built-in functions are declared in `std.sx` with the `#builtin` suffix, which tells the compiler to generate the implementation internally rather than looking for a function body.

### I/O
- `write(str: string) -> void` — write a string to standard output
- `print(fmt: string, args: ..Any)` — formatted print. Parses `{}` placeholders in the format string and substitutes arguments. When all argument types are statically known, the compiler specializes the call at compile time (no `Any` boxing).

### Math
- `sqrt(x: $T) -> T` — square root (maps to LLVM intrinsic)

### Memory
- `alloc(size: s64) -> string` — allocate `size` bytes of memory, returned as a string slice
- `size_of($T: Type) -> s64` — size of type `T` in bytes

### Type Introspection
- `type_of(val: $T) -> Type` — returns the runtime type tag of a value
- `type_name($T: Type) -> string` — returns the name of type `T` as a string (e.g., `"Point"`)
- `field_count($T: Type) -> s64` — returns the number of fields (struct), variants (enum), or elements (vector) in type `T`
- `field_name($T: Type, idx: s64) -> string` — returns the name of the `idx`-th field (struct) or variant (enum) of type `T`
- `field_value(s: $T, idx: s64) -> Any` — returns the `idx`-th field (struct) or element (vector) of `s`, boxed as `Any`

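
The introspection builtins can be combined to walk the fields of a struct. This is a minimal sketch using the signatures listed above; the `Point` struct and the `dump` helper are hypothetical.

```sx
Point :: struct {
    x, y: f32;
}

// Hypothetical helper: prints every field of a Point as "name = value".
dump :: (p: Point) {
    i := 0;
    while i < field_count(Point) {
        // field_value returns Any, which print accepts as a variadic argument
        print("{} = {}\n", field_name(Point, i), field_value(p, i));
        i += 1;
    }
}
```
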
### Type Conversion
- `cast(Type) expr` — prefix operator that converts `expr` to `Type`. Examples: `cast(s32) 3.14`, `cast(f64) n`. When `Type` is a runtime `Type` value inside a type-category match arm, the compiler generates a dispatch switch over all types in the category, monomorphizing the callee for each concrete type.

### Vectors
- `Vector($N: int, $T: Type) -> Type` — returns an LLVM vector type of `N` elements of type `T`

---

## 8. Compile-time Evaluation

### `#run` Directive

`#run expr` evaluates `expr` at compile time using lazy JIT execution. It can appear in two contexts:

**Compile-time constants** — bind a compile-time value to a name:
```sx
compute :: (x: s32) -> s32 { x * x; }
x :: #run compute(5); // x = 25, evaluated at compile time
```

Comptime globals are resolved lazily: the JIT executes only when the value is first referenced during code generation. Chained dependencies are resolved automatically.

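
A chained dependency arises when one comptime constant is computed from another. The sketch below is illustrative only; the names `square`, `A`, and `B` are hypothetical.

```sx
square :: (x: s32) -> s32 { x * x; }

A :: #run square(3);    // 9
B :: #run square(A);    // 81: resolving B first requires resolving A
```

Under the lazy model described above, neither constant is evaluated until something references it during code generation.
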
**Side effects** — execute code at compile time for its side effects:
```sx
#run print("compiling...");
```

### `#insert` Directive

`#insert expr;` evaluates `expr` at compile time to obtain a string, then parses and compiles that string as inline code at the insertion point.

```sx
generate :: () -> string {
    return "print(\"hello from the other side\");";
}

main :: () {
    #insert #run generate();
    // equivalent to: print("hello from the other side");
}
```

The inserted string must contain valid `sx` statements (including semicolons). The statements are parsed and compiled in the same scope as the `#insert` site.

---

## 9. Modules / Imports

### `#import` Directive

The `#import` directive brings declarations from another `.sx` file into the current file. Paths are resolved relative to the importing file's directory.

**Flat import** — splices all declarations from the imported file into the current scope:
```sx
#import "modules/std/math.sx";
```

**Namespaced import** — wraps all declarations under a namespace name:
```sx
std :: #import "modules/std.sx";
```

Namespaced declarations are accessed with dot notation:
```sx
std.print("hello");
```

### Import Resolution

- Imports are resolved after parsing and before code generation.
- Paths are relative to the directory of the file containing the `#import`.
- Nested imports are supported (imported files may themselves contain `#import`).
- Circular imports are detected and silently skipped (each file is imported at most once).
- Generic functions in namespaced imports are supported (e.g., `std.mul(5, 2)` where `mul` is generic).

### Intra-module References

Functions within a namespaced import can call each other without the namespace prefix. When generating code for a namespaced module, unresolved function names are automatically tried with the namespace prefix.

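
As a sketch of that rule, the hypothetical module below calls one of its own functions without a prefix, while the importing file uses the namespace. The file name `modules/geom.sx` and both function names are assumptions made for illustration.

```sx
// modules/geom.sx (hypothetical module)
square :: (x: f32) -> f32 { x * x; }
hyp_sq :: (a: f32, b: f32) -> f32 {
    square(a) + square(b);    // no `geom.` prefix needed inside the module
}

// main.sx
geom :: #import "modules/geom.sx";

main :: () {
    h := geom.hyp_sq(3.0, 4.0);   // callers outside the module use the prefix
}
```
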
### Example

```sx
// modules/std/math.sx
mul :: (base: $T, exp: T) -> T { base * exp; }

// modules/std.sx
print :: (str: string) -> void #builtin;

// main.sx
std :: #import "modules/std.sx";
#import "modules/std/math.sx";

main :: () -> s32 {
    std.print("hello there");
    mul(5, 2);
}
```

---

## 10. Program Structure

A program is a sequence of top-level declarations and `#import` directives. Execution begins at `main`.

```sx
main :: () {
    // entry point
}
```

`main` takes no arguments and returns void. The process exit code is 0 unless otherwise specified.

---

## 11. Grammar (informal)

```
program         = top_level*
top_level       = decl | import_decl
import_decl     = '#import' STRING ';'
                | IDENT '::' '#import' STRING ';'
decl            = const_decl | var_decl | fn_decl | enum_decl | struct_decl
const_decl      = IDENT '::' expr ';'
                | IDENT ':' type ':' expr ';'
var_decl        = IDENT ':=' expr ';'
                | IDENT ':' type '=' expr ';'
                | IDENT ':' type ';'
fn_decl         = IDENT '::' '(' params? ')' ('->' type)? block
                | IDENT '::' block
enum_decl       = IDENT '::' 'enum' '{' (IDENT ';')* '}'
struct_decl     = IDENT '::' 'struct' '{' field_group* '}'
field_group     = IDENT (',' IDENT)* ':' type ('=' expr)? ';'
params          = param (',' param)*
param           = IDENT ':' type
block           = '{' stmt* '}'
stmt            = decl | assignment ';' | return_stmt | defer_stmt | insert_stmt
                | break_stmt | continue_stmt | expr ';'
return_stmt     = 'return' expr? ';'
break_stmt      = 'break' ';'
continue_stmt   = 'continue' ';'
defer_stmt      = 'defer' expr ';'
insert_stmt     = '#insert' expr ';'
assignment      = lvalue ('=' | '+=' | '-=' | '*=' | '/=') expr
lvalue          = IDENT | postfix '.' IDENT
expr            = if_expr | match_expr | while_expr | for_expr | lambda | binary
while_expr      = 'while' expr block
for_expr        = 'for' expr block
binary          = unary (binop unary)*
unary           = ('-' | '!' | 'xx' | 'cast' '(' type ')') postfix
                | postfix
postfix         = primary ('(' args? ')' | '.' IDENT | '.{' field_init_list '}')*
primary         = INT | HEX_INT | BIN_INT | FLOAT | STRING | BOOL | IDENT | '---'
                | '.' IDENT | '.' '{' field_init_list '}'
                | '(' expr ')' | block | '#run' expr
field_init_list = field_init (',' field_init)*
field_init      = IDENT '=' expr | IDENT | expr
if_expr         = 'if' expr 'then' expr ('else' expr)?
                | 'if' expr block ('else' block)?
match_expr      = 'if' expr '==' '{' case_arm* else_arm? '}'
case_arm        = 'case' pattern ':' (stmt* | 'break' ';')
else_arm        = 'else' ':' stmt*
pattern         = '.' IDENT | INT | BOOL | IDENT
lambda          = '(' params? ')' ('->' type)? '=>' expr
args            = expr (',' expr)*
type            = '$' IDENT | 's32' | 'f32' | 'f64' | 'bool' | 'string'
                | 'Any' | 'Type' | '..' type | '[' expr ']' type | IDENT
```

---

## 12. Open Questions

These are inferred gaps — things not shown in the readme that need decisions:

- **`return`**: Both `return expr;` and implicit return (last expression) are supported.
- **Else in match**: Implemented as an optional `else:` default arm (see Section 4).
- **Nested functions**: Can functions be defined inside other functions?
- **Mutability of params**: Are function parameters immutable by default?
- **Array/list types**: Implemented as fixed-size arrays and slices (see Section 2).
- **Struct types**: Implemented — named struct types with positional/named/shorthand literals.
- **Imports/modules**: `#import` directive supports flat and namespaced imports (see Section 9).
- **Operator overloading**: Not shown — presumably no.
- **Semicolons**: Required on all statements? What about the last expression in a block?
- **Top-level expressions**: Are bare expressions allowed at the top level or only declarations?