The stateless alias-registration array-dim path collapsed foldDimU32's
distinct .too_large / .below_min outcomes into null, so an oversized type
alias (Big :: [5000000000]s64) emitted the FALSE 'an array dimension is not
a compile-time integer constant' message while the direct form correctly
reported 'array dimension 5000000000 does not fit in u32'.
Add program_index.reportDimError as the single source of dim-error wording
(the stateful path now emits through it too) and type_bridge.foldArrayDim to
surface the DimU32 reason at the alias-registration site. An oversized/negative
alias dim now routes to reportDimError for the same precise message as the
direct form; a genuinely non-const alias dim keeps the alias-specific message.
Regression: examples/1131-diagnostics-array-dim-oversized-u32-alias.sx
Two remaining siblings in F0.4's comptime-int path.
1. Type-returning function with a value param used as a TYPE annotation
   (`b : Make(N, s64)` where `Make :: ($K: u32, $T: Type) -> Type`):
   - `isValueParamPosition` (semantic_diagnostics) now also skips a value
     param of a `fn_ast_map` type-returning function, so `N` is not walked
     as the type name "N" ("unknown type 'N'").
   - `resolveParameterizedWithBindings` routes a type-returning-function
     name to `instantiateTypeFunction` (the `.call` path already did).
   - `instantiateTypeFunction` resolves a general return-type expression
     (`return [K]T`) with bindings active, not just struct/union returns.
   `Make(N, s64)`, `Make(M + 1, s64)`, `Make(3, s64)` all resolve to one
   `[3]s64`.
2. Oversized dim/lane fold panicked the compiler (0087): an array dim /
   Vector lane folded to a valid i64 (5e9) then narrowed to u32 with an
   unchecked `@intCast`. New single gate `program_index.foldDimU32` folds
   via `evalConstIntExpr` then range-checks `[min, maxInt(u32)]`; the three
   narrowing sites (resolveArrayLen stateful + stateless, resolveVectorLane)
   all route through it and emit a clean diagnostic + halt instead of
   panicking. Value-param args stay i64 until used as a dim/lane, where the
   same gate checks them.
Regressions: examples/0208 (value-param type function), examples/1130
(oversized array dim clean halt), examples/1503 (oversized Vector lane
clean halt). Marks issue 0087 RESOLVED.
Gate: zig build, zig build test, bash tests/run_examples.sh — 398 passed,
0 failed, 0 timed out.
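
A minimal sketch of the range-gate idea, written in Python for illustration
rather than taken from the compiler's Zig source. It assumes the expression
has already been folded in 64-bit space and only shows the classification
that happens before narrowing. The outcome names echo `.too_large` /
`.below_min`; `fold_dim_u32`, `DimResult`, and the per-site `minimum`
argument are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

U32_MAX = 2**32 - 1


@dataclass(frozen=True)
class DimResult:
    kind: str                    # "ok", "too_large", "below_min", or "not_const"
    value: Optional[int] = None  # the folded value, when there is one


def fold_dim_u32(folded: Optional[int], minimum: int) -> DimResult:
    """Classify an already-folded integer before it is narrowed to u32.

    `folded` is None when the expression was not a compile-time constant.
    `minimum` is the smallest value the call site accepts (an assumption:
    the real per-site minimums are not stated in the message above).
    """
    if folded is None:
        return DimResult("not_const")
    if folded > U32_MAX:
        return DimResult("too_large", folded)
    if folded < minimum:
        return DimResult("below_min", folded)
    return DimResult("ok", folded)
```

Keeping the three failure kinds distinct is what lets each call site pick a
precise message instead of collapsing everything into "not a constant".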
While fixing 0083 (attempt 5), noticed a distinct, pre-existing bug:
writing to a Vector component (`v.x = 1.0`) aborts with "unresolved type
reached LLVM emission" in emitStore. Reading a lane works, and the abort
occurs even with a literal lane count, so it is NOT the lane-count class. Confirmed
reproducible on the pristine pre-attempt-5 compiler (not introduced by
the lane-count fix). The standard vector idiom (`.[…]` construction +
component reads / arithmetic, examples/1500) is unaffected. Filed for a
separate session; not worked around here.
Attempts 1–4 fixed the array-dimension paths but the same length-0
fabrication class survived on every other site that resolves a
compile-time integer. Unify them all on the single shared
`program_index.evalConstIntExpr` so they cannot diverge:
- All three Vector lane resolvers (resolveTypeCallWithBindings,
resolveParameterizedWithBindings, resolveArrayLiteralType) and both
generic value-param binders (instantiateGenericStruct,
instantiateTypeFunction) hand-rolled an `else => 0` switch. A
module-const lane `Vector(N, f32)` fabricated a 0-lane `<0 x float>`
(LLVM "huge alignment" abort); a value-param `Vec(N, f32)` fabricated
a 0 binding / wrong mangled name. They now fold through the shared
evaluator and emit a clean diagnostic + `.unresolved` on a non-const
operand (resolveVectorLane / resolveValueParamArg) — never 0.
- evalComptimeInt (inline-for bounds) now delegates to the shared evaluator,
so `inline for 0..M` / `0..(M+1)` fold like array dims. The `<pack>.len`
leaf moved into the shared folder via a new `ctx.lookupPackLen`.
- The unknown-type semantic checker no longer walks a value-param
position (`Vector(N, …)` / `Vec(N, …)`) as a type name (was reporting
"unknown type 'N'").
- The parameterized-type-arg parser and the function-body lookahead
(hasFnBodyAfterArrow) accept a const-EXPRESSION in a value position, so
`Vector(M + 1, f32)` and `[M + 1]T` parse as a return type too (the
latter a pre-existing array-dim sibling that the same heuristic broke).
Regressions: examples/1501 (named-const + const-expr lane, direct +
alias, 3/4-lane reads), 1502 (runtime lane clean-halts, exit 1, no LLVM
crash), 0207 (Vec(N)/Vec(M+1) == Vec(3) instantiation), 0610 (inline-for
const bounds). Shared-evaluator unit test extended with the pack-len arm.
zig build && zig build test && bash tests/run_examples.sh: 395 passed,
0 failed.
A constant-FOLDABLE expression array dimension (`[M + 1]`, `[M * N]`,
`[N - M]`, nested `[M + N - 1]`, parenthesised `[(M + 1) * 2]`, mixing
untyped and typed module consts) was wrongly rejected as "not a
compile-time integer constant" even though every operand is
compile-time-known. Attempts 1-3 resolved only a bare named-const dim or
a literal; an expression dim must be EVALUATED, not rejected.
Fix: the shared dim resolver now routes the dimension through a single
constant integer-expression evaluator (`program_index.evalConstIntExpr`)
that folds integer `+ - * / %` and unary negate over literals and
named/typed module consts, recursively (parentheses carry no AST node).
The leaf-name lookup is delegated via `ctx.lookupDimName`, so the
stateful body-lowering path (`Lowering`, which also sees comptime
constants and generic `$N` values) and the stateless registration path
(`type_bridge.StatelessInner`, module consts only) share the EXACT SAME
folding logic and cannot diverge — an expression dim via a type alias
resolves identically to the direct form.
No-fabrication discipline unchanged: a genuinely non-comptime dimension
(runtime local, non-comptime call, unbound name) or arithmetic that
overflows / divides by zero still yields null -> `.unresolved` -> the
same clean compile-halting diagnostic, never a fabricated length.
- examples/0144-types-const-expr-array-dim.sx: every expression form,
direct vs alias, scalar / string / struct element types (fails on the
pre-fix compiler, passes after).
- examples/1129 re-pointed at a genuinely non-const dimension
(`[get()]s64`, a runtime call) so it still proves the stateless
clean-halt (a foldable expression is no longer an error).
- program_index.test.zig: unit test for evalConstIntExpr folding and
clean-halt-on-non-const.
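
A minimal sketch of a shared constant-integer folder, in Python for
illustration only. The tuple-based AST, the `eval_const_int` name, and the
choice of truncating division are assumptions; what it mirrors from the
description above is the delegated leaf-name lookup and the rule that any
non-constant operand, overflow, or division by zero yields "no value"
rather than a fabricated one.

```python
from typing import Callable, Optional, Union

# Hypothetical AST: an int literal, a name (str), ("neg", e), or (op, lhs, rhs).
Expr = Union[int, str, tuple]

I64_MIN, I64_MAX = -(2**63), 2**63 - 1


def eval_const_int(expr: Expr,
                   lookup: Callable[[str], Optional[int]]) -> Optional[int]:
    """Fold an integer expression; return None if it is not a constant."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return lookup(expr)          # leaf lookup is delegated to the caller
    if expr[0] == "neg":
        a = eval_const_int(expr[1], lookup)
        if a is None:
            return None
        result = -a
    else:
        op, lhs, rhs = expr
        a = eval_const_int(lhs, lookup)
        b = eval_const_int(rhs, lookup)
        if a is None or b is None:
            return None
        if op == "+":
            result = a + b
        elif op == "-":
            result = a - b
        elif op == "*":
            result = a * b
        elif op in ("/", "%"):
            if b == 0:
                return None          # division by zero is never folded
            q = abs(a) // abs(b)     # truncating division (assumed semantics)
            if (a < 0) != (b < 0):
                q = -q
            result = q if op == "/" else a - q * b
        else:
            return None
    # Overflow past the 64-bit range also counts as "not foldable".
    return result if I64_MIN <= result <= I64_MAX else None
```

Because both callers differ only in the `lookup` they pass, the direct form
and the alias form cannot fold the same expression to different values.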
A type alias whose dimension is a named const (`Arr :: [N]T`) resolves its
dimension eagerly during scanDecls pass 1, on the stateless registration path,
which can only read `module_const_map`. Typed consts (`N : s64 : 16`) register
only in pass 2 and a forward-declared untyped const had not registered yet, so
the stateless resolver saw an empty table, printed a non-fatal warning,
fabricated length 0, and continued — yielding a 0-byte alloca, garbage reads,
and a segfault for slice/struct elements.
- scanDecls pass 0 pre-registers every integer-valued module const before any
type alias resolves, so typed, untyped, and forward-referenced consts all
resolve identically.
- Both dim resolvers now share `program_index.moduleConstInt`, so the stateful
body-lowering path and the stateless registration path cannot diverge.
- `resolveArrayLen` returns `?u32`; `resolveCompound` yields `.unresolved` on
null instead of a 0-length array. The stateful path emits a diagnostic; the
alias-registration path surfaces an unresolved alias as a clean compile error
that aborts the build. The Vector lane-count `else => 0` is fixed the same way.
Regressions: examples/0143 (typed-const dim direct + via alias for s64/string/
struct, forward-ref alias, nested) and examples/1129 (an unresolvable computed
dim halts with a clean diagnostic + non-zero exit). Both fail on the pre-fix
compiler (garbage/segfault; warning+exit0) and pass after.
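
The ordering fix can be sketched as follows, in Python for illustration.
The declaration tuples and the `register` function are hypothetical; the
point is only that a dedicated first pass over constants makes a
forward-referenced dimension resolve, and that an unknown name stays
unresolved instead of becoming length 0.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical declaration forms:
#   ("const", name, int_value)
#   ("alias", name, dim_name, elem_type)
Decl = Tuple


def register(decls: List[Decl]) -> Dict[str, Optional[Tuple[int, str]]]:
    consts: Dict[str, int] = {}
    # Pass 0: record every integer-valued const before any alias resolves.
    for d in decls:
        if d[0] == "const":
            consts[d[1]] = d[2]

    # Pass 1: resolve alias dimensions. An unknown name maps to None
    # (unresolved) so the caller can report it; it never becomes 0.
    aliases: Dict[str, Optional[Tuple[int, str]]] = {}
    for d in decls:
        if d[0] == "alias":
            _, name, dim_name, elem = d
            length = consts.get(dim_name)
            aliases[name] = None if length is None else (length, elem)
    return aliases
```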
Makes the F0.4 fixes exhaustive across every resolution / nesting path.
0083 — named-const array dimension, stateless paths. Attempt 1 fixed the
stateful resolver (direct local decls, struct fields, params, returns) but the
binding-free registration-time resolver (`type_bridge`, used for type aliases
`Arr :: [N]T` and inline union/enum field types) still resolved a named dim with
a silent `else 0`, so `Arr :: [N]s64; a : Arr` and `union { a: [N]s64 }` were
still miscompiled (garbage / bus error). Thread the module-global const table
(`ProgramIndex.module_const_map`) into `type_bridge` alongside the alias map, so
`StatelessInner.resolveArrayLen` resolves a named module-const dim to the same
length everywhere. The remaining unresolvable case (a computed/comptime dim on
the binding-free path, which the stateful path hard-errors) now bails LOUDLY
instead of fabricating a 0 length.
0085 — nested slice-literal elements. `lowerArrayLiteral` lowered each element
with the element type as target but appended the raw value. A nested `.[...]`
element at a slice element type (`[][]s64`) still lowers to an aggregate array
`[N]T`, so the outer aggregate held raw arrays where slice {ptr,len} headers
were expected — indexing the inner slice read a garbage pointer and segfaulted.
After lowering each element, coerce a same-element array to the slice element
type via the existing `array_to_slice` op. The coercion recurses with the
nesting, so `[][]T` and deeper materialize at every level — local-bound AND
direct-call-argument forms.
Regressions (fail-before/pass-after demonstrated on the pre-fix compiler):
- examples/0140-types-named-const-array-dim.sx: extended with type-alias,
nested [N][M]T, and union-field named dims (s64 / string / struct elems)
- examples/0142-types-nested-slice-literal-elements.sx: [][]s64 + [][]string,
local-bound vs direct-arg
- src/ir/type_bridge.test.zig: named-const dim resolves to literal length
Gate: zig build, zig build test, bash tests/run_examples.sh (388 passed).
Issues 0083 and 0085 marked RESOLVED.
Two silent-miscompile codegen fixes:
0083 — named-const array dimension. `TypeResolver.resolveCompound`'s array
arm resolved the dimension with `if int_literal ... else 0`, so a named const
(`N :: 16; [N]T`) hit the silent `else 0`: the array became 0-length / 0-byte
and element access ran out of bounds (garbage for scalars, bus error for
slice/pointer/struct elements). The arm now delegates the dimension to
`inner.resolveArrayLen` (symmetric with `inner.resolveInner` for the element).
The stateful `Lowering.resolveArrayLen` evaluates it as a compile-time integer
across the comptime-constant / generic-value / module-global const tables and
emits a diagnostic — no fabricated length — when it isn't one.
0084 — `.[...]` literal passed directly as a call arg. `lowerArrayLiteral`
always yields an aggregate array value; the array→slice conversion is the
caller's job. The local-bound var-decl path did it, but the call-arg coercion
path had no array→slice arm, so `classify([N]T, []T)` returned `.none` and the
raw array was passed where a slice was expected (callee read its {ptr,len}
header off the wrong bytes → 0 / garbage / segfault). `classify` now returns a
new `.array_to_slice` plan for same-element `[N]T → []T`, and `coerceToType`
emits the existing `array_to_slice` op — identical to the local-bound path.
Regressions (fail-before/pass-after demonstrated on the pre-fix compiler):
- examples/0140-types-named-const-array-dim.sx (s64 + string + struct elems)
- examples/0141-types-slice-literal-direct-call-arg.sx (string + []s64)
Gate: zig build, zig build test, bash tests/run_examples.sh (387 passed).
Issues 0083 and 0084 marked RESOLVED.
Example 0717 now asserts the (token, index) Diag for ALL SIX raise sites
in cli.sx, closing the two the reviewer found still unasserted:
- zero-arg UnknownCommand: parse([], ...) -> index -1, token ""
(the args.len == 0 sub-branch of cli.sx:237, distinct from the
one-arg too-few form already covered at index 0 / token args[0]).
- TooManyFlags (cli.sx:256): a command declaring 17 flag specs (> the
inline 16 cap) is rejected, not truncated -> index -1, token command.
The three index==-1 cases (zero-arg, too-many, missing-req) seed their
Diag with a sentinel before parse, so each assertion proves parse WROTE
the -1/"" rather than merely matching the `.{}` default. Verified
non-vacuous: flipping any expected value makes that line FAIL.
Test-only: cli.sx logic and src/ are untouched.
Extend example 0717 to pin the offending token VIEW and its args index
for every failure the parser's Diag populates: unknown-command,
unknown-group, too-few-args, missing-value, value-eats-flag, and the
missing-required index. Closes the test-coverage gap flagged in review;
cli.sx parser logic unchanged.
Extend std/cli.sx with a zero-heap argument parser that the caller drives
over a logical argv ([]string), separate from the F3.1 os_args accessor.
Grammar: <group> <command> [--flag VALUE | --bool]... [--json] [-- rest...]
- (group, command) dispatched against a caller-provided Command table;
no match -> error.UnknownCommand.
- value-taking vs boolean flags fixed by each command's FlagSpec list;
--json is a reserved global boolean surfaced as parsed.json.
- `--` or the first bare operand ends flag parsing; the remainder is
parsed.rest (operand views).
Heap discipline (heap-discipline.md): zero heap, zero copy. group/command/
flag values/rest are all VIEWS into args. Parsed is a by-value stack struct;
flag presence/values live in a fixed [16]FlagValue inline array indexed by
spec position (no per-flag allocation, no context.allocator). The flag-spec
list and command table are caller storage passed as views.
Failure surfacing (no silent skip): unknown command, unknown flag, a
value-flag missing its value, and an absent required flag each raise a
specific CliError variant; a caller-owned Diag records the offending token
(index + view) before each raise, since error tags carry no data.
examples/0717 drives the parser over explicit []string vectors: a valid
group/command/--flag/--bool/--json case (asserting parsed values + that
values are views into argv), subcommand dispatch, `--`/bare-operand
separators, and the five failure variants each asserted via destructure +
Diag. zig build && zig build test && run_examples.sh green (385 passed).
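
The grammar above can be sketched in Python as follows. This only
illustrates the control flow; it is not a port of cli.sx. Python slices
copy where the sx parser hands out views, and there is no fixed 16-slot
array here. Apart from UnknownCommand, the error kind strings are
hypothetical, as are the index/token reported for an unknown group or
command with two or more args and the token reported for a missing
required flag.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


class CliError(Exception):
    def __init__(self, kind: str, index: int, token: str):
        super().__init__(f"{kind} at {index}: {token!r}")
        self.kind, self.index, self.token = kind, index, token


@dataclass
class FlagSpec:
    name: str
    takes_value: bool = False
    required: bool = False


@dataclass
class Parsed:
    group: str
    command: str
    flags: Dict[str, Optional[str]] = field(default_factory=dict)
    json: bool = False
    rest: List[str] = field(default_factory=list)


def parse(args: List[str],
          table: Dict[Tuple[str, str], List[FlagSpec]]) -> Parsed:
    if len(args) < 2 or (args[0], args[1]) not in table:
        raise CliError("UnknownCommand",
                       0 if args else -1, args[0] if args else "")
    specs = {s.name: s for s in table[(args[0], args[1])]}
    out = Parsed(args[0], args[1])
    i = 2
    while i < len(args):
        tok = args[i]
        if tok == "--":                  # explicit separator: the rest follows
            out.rest = args[i + 1:]
            break
        if not tok.startswith("--"):     # first bare operand ends flag parsing
            out.rest = args[i:]
            break
        name = tok[2:]
        if name == "json":               # reserved global boolean
            out.json = True
        elif name not in specs:
            raise CliError("UnknownFlag", i, tok)
        elif specs[name].takes_value:
            if i + 1 >= len(args):
                raise CliError("MissingValue", i, tok)
            i += 1
            out.flags[name] = args[i]
        else:
            out.flags[name] = None       # boolean flag: presence only
        i += 1
    for s in specs.values():
        if s.required and s.name not in out.flags:
            raise CliError("MissingRequired", -1, s.name)
    return out
```

Every failure raises with the offending position attached, which plays the
role of the caller-owned Diag in a language whose error tags carry no data.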
The global-init constant serializers in emit_llvm.zig printed a diagnostic
on an unserializable value and then RETURNED an undef/null placeholder and
CONTINUED emitting. For a comptime `#run` global that yields a function
reference (`fp :: #run pick();` where pick returns a function), the build
fell through to the JIT and segfaulted calling through the undef pointer
(exit 134) — a silent miscompile dressed up as a printed error.
Route every genuine bail in the serialization family through a new
`failGlobalInit` helper: it sets `comptime_failed` (so core.generateCode
aborts with a non-zero exit after emit()) and returns an undef placeholder
that never ships, because the halt fires before object emission / JIT. This
covers the comptime func_ref leaf, the require_resolved aggregate func_ref
leaf, the top-level + vtable func_ref globals, the comptime-init catch, and
the remaining heap-walk / aggregate-shape bails. Unresolved-function
diagnostics now name the function instead of its (stdlib-unstable) IR index.
The require_resolved=false Pass-0 placeholder is unchanged (func_map is
empty until Pass 1; the aggregate is re-emitted with require_resolved=true).
Regression: examples/1128-diagnostics-comptime-global-funcref-rejected.sx —
a `#run` global returning a function ref now exits 1 with the diagnostic
(was: exit 134 segfault). Fail-before/pass-after verified.
A module-global initialized with an enum literal silently zero-initialized
to the first tag (`chosen : Color = .green` read back as `.red`), and an
enum tag inside a global array/struct was rejected as non-constant. The
constant serializer had no enum-literal arm.
Add `Lowering.constEnumLiteral`: serialize an enum literal to a
`ConstantValue.int` holding the variant's tag value, resolved against the
destination enum type and respecting explicit variant values; the global's
type drives the backing width at emit time. Wired into `globalInitValue`
(scalar global) and `constExprValue` (array element / struct field / nested
aggregate). A non-enum destination or unknown variant is diagnosed loudly,
never silently zero-initialized. The compiler-injected OS/ARCH globals now
serialize to their real `.unknown` tag (6 / 4); runtime reads are unchanged
(they resolve through comptime_constants), so only the static initializer in
the pinned .ir snapshots changes.
Remove the silent `func_ref => orelse LLVMConstNull` fallbacks in the LLVM
constant emitters: aggregate func_ref leaves carry a `require_resolved` flag
(transient null in Pass 0, loud diagnostic if still unresolved in the
Pass-1.5 re-emit), a top-level func_ref global is resolved in
initVtableGlobals, and the comptime (#run) path bails loudly instead of
emitting a null function pointer.
Regression: examples/0139-types-global-enum-literal-init.sx (scalar, array,
struct field, explicit-value enum u16 stride, struct-array with enum field);
negative: examples/1127-diagnostics-global-enum-literal-bad-variant.sx.
Mark issue 0082 RESOLVED.
A module-global aggregate initializer rejected a `null` literal in a
pointer (or optional-pointer) field as "must be initialized by a
compile-time constant". `Lowering.constExprValue` had no `.null_literal`
arm, so the null leaf returned no constant and the whole aggregate looked
non-constant — even though `null` is the compile-time zero pointer (a
top-level scalar `p : *s64 = null;` already serialized fine).
Add `.null_literal => .null_val` to constExprValue. While here, make the
two LLVM constant emitters exhaustive: emitConstAggregate and the
top-level init_val switch in emit_llvm.zig previously ended in a silent
`else => LLVMConstNull(...)` catch-all (the silent-arm class CLAUDE.md
mandates rooting out). They now handle every ConstantValue tag explicitly
(.null_val/.zeroinit -> all-zero constant, .undef -> LLVMGetUndef,
.func_ref resolved, nested .vtable is a hard @panic tripwire). The
reject-loud path for genuinely non-constant fields is preserved.
Regression: examples/0138 (array-of-struct null ptr fields, array of
all-null pointers, nested struct-in-struct null ptr) and the negative
examples/1126 (null ptr field beside a non-const field still errors).
Fail-before/pass-after verified.
A module-global array of struct literals (`pairs : [2]Pair = .[ .{...}, .{...} ]`)
was emitted as `zeroinitializer`, silently dropping every declared field — reads
returned 0 with no diagnostic. Global struct literals and struct-with-array
already worked; the gap was struct literals used as ARRAY elements.
Root cause: `Lowering.constExprValue` (the const-aggregate serializer for global
initializers) had no `.struct_literal` arm. `constArrayLiteral` serialized each
element through `constExprValue`, so a struct-literal element returned null,
collapsing the whole array initializer to null; `globalInitValue` then emitted no
payload and the LLVM backend zero-initialized the global — the same silent-zero
class as 0071/0072, one level inside an array literal.
Fix: make `constExprValue` type-aware — thread the destination element/field
TypeId so a struct-literal leaf routes through `constStructLiteral` and a nested
array-literal through `constArrayLiteral` with the correct element type.
`constArrayLiteral` derives its element type from the array TypeId;
`constStructLiteral` passes each field's type. A global aggregate initializer that
still does not fully reduce to a compile-time constant is now rejected loudly
(`diagnoseNonConstGlobal`) instead of silently zeroing. `emitConstAggregate`
already recurses over nested aggregates, so `sx run` (JIT) and `sx build` (AOT)
both materialize the declared values.
Regression: examples/0137-types-global-aggregate-literal-init.sx (global
[N]Struct literal, global struct literal, struct-with-array, nested
array-of-struct-with-array; values read back with no prior store, plus a store on
top). Fails on the pre-fix compiler (array-of-struct fields read 0), passes after.
Marks issues 0079 (already resolved) and 0080 RESOLVED.
A store to a module-global array element (`g[i] = v`) was silently dropped:
a subsequent `g[i]` read the array's initializer, not `v`. Constant index,
variable index, and cross-function stores were all affected, in both `sx run`
and `sx build`. Global scalars and local arrays were fine.
Root cause: `Lowering.lowerExprAsPtr` (the lvalue/address path) handled only
local identifiers. A module-global identifier fell through to the value
fallback `lowerExpr`, which emits `global_get` — loading the whole array by
value. The LLVM backend's `emitIndexGep` then allocas a throwaway temp, copies
the value in, and GEPs into the temp, so the store wrote a discarded copy.
Fix: teach `lowerExprAsPtr`'s identifier arm about globals — emit `global_addr`
(a pointer into the global's live storage), or `global_get` for a pointer-typed
global (mirroring the local pointer case). Route the `address_of(index_expr)`
array base through `lowerExprAsPtr` too so `&g[i]` is likewise an lvalue into
the global. `index_gep` now GEPs directly into the global for const and variable
index, across functions. This also fixes global struct field stores, which
shared the same root cause.
Regression: examples/0136-types-global-array-element-store.sx (const-index,
var-index, cross-function store on a scalar global array; struct-element array
for stride; nested-array global for the recursive lvalue). Fails on the pre-fix
compiler, passes after.
Add library/modules/std/cli.sx: a pure-sx command-line argument accessor
backed by the macOS C runtime (_NSGetArgv/_NSGetArgc), no compiler change.
    os_argc() -> s64
    os_args(buf: []string) -> []string
Zero heap, zero per-arg allocation: os_args fills a caller-provided buffer
(stack array) with string VIEWS over the process's own argv block, which
lives for the whole process. The returned slice header is a by-value stack
return; nothing touches context.allocator.
Documents the `sx run` reality: under `sx run <prog.sx> ...` the process
argv is the interpreter's argv (sx, run, prog.sx, ...), not a program's
logical args. This accessor reports the real process argv truthfully;
mapping to logical args is a later consumer concern (distribution P3.1).
Non-macOS platforms bail loudly (message + _exit) rather than returning a
silent empty.
examples/0716-modules-cli-argv.sx asserts only deterministic structural
invariants (argc >= 1, argv[0] non-empty, os_argc() == filled length).
Add 0715-modules-json-suite as the single comprehensive pinned suite for
std.json (mirrors 0711 for std.hash), alongside the focused 0713/0714 demos:
- ROUND-TRIP build->write->parse->write over a document covering EVERY value
kind (a string with every escape form \" \\ \b \f \n \r \t plus a \u00XX
control, integers 0 / negative / s64 MIN / s64 MAX, bool, null, array,
nested object) with insertion-order assertions, exact writer bytes, and
parse-then-rewrite idempotence.
- DECODE positives: \/, the full named-escape set, \uXXXX (BMP 1- and 2-byte)
plus a surrogate pair, the escaped control forms, and raw multi-byte UTF-8
round-tripping through writer + reader.
- MALFORMED matrix: one assertion per JsonParseError variant and its key
edges (UnexpectedToken, UnexpectedEnd, BadEscape, BadNumber incl. leading
zero / lone '-' / fraction / exponent / overflow, TrailingGarbage,
BadControlChar), each asserted to raise.
Pure test work: src/ and library/ untouched, no json.sx change needed. Every
model is built through an explicit Arena allocator (heap discipline).
parse_string scanned for `"` and `\` but accepted every other byte,
including raw control characters. RFC 8259 §7 requires those bytes to be
escaped inside a string; an unescaped one is invalid JSON and must surface
a parse error, not be silently accepted.
Add `BadControlChar` to JsonParseError and reject any unescaped byte < 0x20
in the string body scan (which gates the decode path too, so escaped forms
like \t and \n still decode correctly; 0x20 and 0x7F are not over-rejected).
Regression test in examples/0714: raw 0x09/0x0A/0x00 each raise
BadControlChar via `?`/`!`; a positive case proves the escaped forms still
decode to the right bytes. All prior assertions kept.
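
A minimal sketch of the body scan, in Python for illustration. The function
name and the Python exception classes are hypothetical stand-ins for the sx
error variants; the rule itself (reject a raw byte below 0x20, accept 0x20
and 0x7F, leave escaped forms to the decoder) is the one described above.

```python
class BadControlChar(ValueError):
    pass


def scan_string_body(buf: bytes, start: int) -> int:
    """Return the index of the closing quote of a JSON string.

    `start` is the index just past the opening quote. Escape sequences are
    skipped here, not validated; decoding them is a separate step.
    """
    i = start
    while i < len(buf):
        b = buf[i]
        if b == 0x22:            # '"' closes the string
            return i
        if b == 0x5C:            # '\\' : skip the escaped byte as well
            i += 2
            continue
        if b < 0x20:             # raw control byte must have been escaped
            raise BadControlChar(f"raw control byte 0x{b:02x} at offset {i}")
        i += 1
    raise ValueError("unterminated string")
```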
Issue 0078 (string == as an and/or operand emitting an invalid PHI) is
resolved on this branch, so the example no longer needs the split that
worked around it. Restore the natural combined assertion
    sub.items[0].key == "k" and sub.items[0].val.str == "v"
(one nested-pair report), and the in_range containment helper to
    return x >= lo and x < hi;
Drop the now-stale issues/0078 references. Re-captured expected stdout
(nested-key/nested-val -> nested-pair). json.sx and src/ untouched.
A string `==`/`!=` used as an operand of a short-circuit `and`/`or` emitted
invalid LLVM (`PHI node entries do not match predecessors!`). String compares
expand into their own memcmp sub-CFG during LLVM emission, so the operand
finishes in a later basic block (`str.merge`) than the one the IR block
started in. `fixupPhiNodes` wired the short-circuit merge PHI's incoming edge
to `block_map[ir_block]` (the block the IR block started as), recording a
stale predecessor (`%entry`/`%and.rhs.0`).
Fix: record the builder's actual insertion block after emitting each IR
block's instructions (`term_block_map`, via `LLVMGetInsertBlock`) and use it
as the PHI predecessor. General — corrects the incoming block for any operand
that emitted intermediate basic blocks (string `==`, value `match`, …), not
just string `==`.
Regression: examples/0045-basic-string-eq-short-circuit.sx (string `==` on
both sides of `and` and of `or`, plus a match-value + enum-payload `==` shape).
Fails (LLVM abort) pre-fix, passes after.
Add the JSON reader (parser) to library/modules/std/json.sx, the inverse
of the F2.1 writer over the same value model: insertion-ordered objects,
arrays, strings (full unescaping incl. \uXXXX + surrogate pairs), s64
integers, bool, null.
Heap discipline (binding): exactly two allocation kinds, both through the
EXPLICIT `alloc` parameter, never the implicit context allocator —
composite backing stores (Array/Object.items via add/put) and decoded
escaped-string buffers (bounded by the raw span). Un-escaped string
values are zero-copy VIEWS into the input buffer (valid only while it
lives); scalars carry no heap.
Failure surfacing (hard contract): malformed input raises a meaningful
JsonParseError variant (UnexpectedToken / UnexpectedEnd / BadEscape /
BadNumber / TrailingGarbage) on the error channel, never a bogus value.
Trailing non-whitespace is TrailingGarbage; fractions/exponents,
out-of-s64 magnitudes, and leading zeros are BadNumber. Number
accumulation runs in negative space so s64 MIN parses exactly.
examples/0714-modules-json-reader.sx asserts the parsed structure
(insertion order, every kind), proves the view-vs-decoded heap split by
pointer containment, round-trips back through the writer byte-for-byte,
decodes a surrogate-pair into 4 UTF-8 bytes, and checks every malformed
variant.
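
Why accumulate in negative space: the magnitude of s64 MIN
(9223372036854775808) does not fit in a positive s64, but every s64 value
fits as a non-positive number. A sketch in Python, where `parse_s64` and
the `BadNumber` exception class are illustrative names. Python integers do
not overflow, so the range check here runs after each step; a fixed-width
implementation would have to check before multiplying.

```python
S64_MIN = -(2**63)
S64_MAX = 2**63 - 1


class BadNumber(ValueError):
    pass


def parse_s64(text: str) -> int:
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if digits == "" or any(c not in "0123456789" for c in digits):
        raise BadNumber(text)            # lone '-', fraction, exponent, ...
    if len(digits) > 1 and digits[0] == "0":
        raise BadNumber(text)            # leading zero
    acc = 0                              # invariant: S64_MIN <= acc <= 0
    for c in digits:
        acc = acc * 10 - (ord(c) - ord("0"))
        if acc < S64_MIN:
            raise BadNumber(text)        # magnitude out of s64 range
    if negative:
        return acc
    if acc == S64_MIN:
        raise BadNumber(text)            # +9223372036854775808 does not fit
    return -acc
```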
Filed issues/0078: a string `==` (or any sub-CFG operand) used in a
short-circuit `and`/`or` emits invalid LLVM IR (stale PHI predecessor),
hit while writing the example's assertions and worked around there by not
combining comparisons with `and`/`or`. src/ untouched.
Close the coverage gap from attempt 1: example 0713 now builds integer
fields holding s64 MIN (-9223372036854775808) and s64 MAX
(9223372036854775807) — plus zero, a small negative, and a small positive —
and asserts the EXACT emitted bytes. This permanently pins the edge that
write_int is specifically engineered for (folding positives into negative
space so MIN's non-representable-positive magnitude serializes correctly).
s64 MIN is expressed as (0 - 9223372036854775807 - 1) because its magnitude
is not a representable positive s64 literal.
Test hygiene: stream to a repo-local, gitignored .sx-tmp/ path (created if
missing) instead of a fixed /tmp name, and unlink it right after read-back
so nothing leaks. Writer/model logic and src/ are untouched.
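
The negative-space formatting that these assertions pin can be sketched as
follows, in Python for illustration (`format_s64` is a hypothetical name,
not the sx `write_int`). The value is folded to a non-positive number and
never negated back, so s64 MIN needs no special case. The 20-byte buffer
matches the longest possible output, "-9223372036854775808".

```python
S64_MIN = -(2**63)
S64_MAX = 2**63 - 1


def format_s64(value: int) -> str:
    assert S64_MIN <= value <= S64_MAX
    buf = bytearray(20)                  # sign + 19 digits at most
    pos = len(buf)
    n = value if value < 0 else -value   # fold into negative space
    while True:
        # Python's divmod floors; adjust to truncation without negating n.
        q, r = divmod(n, 10)
        if r != 0:
            q += 1
            r -= 10                      # now -9 <= r <= 0
        pos -= 1
        buf[pos] = ord("0") - r          # digit is -r
        n = q
        if n == 0:
            break
    if value < 0:
        pos -= 1
        buf[pos] = ord("-")
    return buf[pos:].decode("ascii")
```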
Add library/modules/std/json.sx — the JSON value model and writer
(reader lands in a later step).
Value model: a tagged union over null/bool/integer(s64)/string/array/
object. Objects are an ORDERED list of (key,value) pairs preserving
INSERTION ORDER (no hash map, never sorted/deduped). Integers only — no
fraction/exponent this milestone.
Heap discipline:
- Scalars carry no heap; string values are VIEWS into caller memory
(never copied into the node).
- Composite nodes (Array/Object) own growable child storage, allocated
through an EXPLICIT allocator parameter on the builder methods
(arr.add(v, alloc) / obj.put(key, val, alloc), mirroring List.append)
— never the implicit context allocator.
- The writer adds ZERO output allocations: it emits into a caller-
provided Sink, either a fixed []u8 buffer (overflow raises, never
truncates) or streaming straight to an fs.File through a small caller
staging buffer (no whole-document string; peak memory O(staging)).
Integer digits format in a stack [20]u8; s64 MIN is handled by
formatting in negative space. Sink/IO/overflow surface on the !
error channel.
examples/0713-modules-json-writer.sx builds a nested object + array +
string with every escape kind + negative int + bool + null, then asserts
the EXACT bytes (insertion order, escaping) from both the buffer sink and
the file-streaming sink, plus the overflow-raises path.
Make the SHA-256 digest path allocation-free (foundation heap-discipline):
- final() and sha256_hex() now return the 64-char lowercase hex digest as
a [64]u8 by value on the stack; the cstring(64) heap allocation is gone.
- sha256_file() streams the file in fixed 64KB stack chunks via open_file/
File.read/File.close (defer-closed on every path) instead of slurping it
with read_file; peak memory is O(chunk), not O(filesize).
Tests (compare via a zero-copy string view over the [64]u8):
- 0710 updated to the by-value API (output unchanged).
- 0711 known-answer vectors: "", "abc", NIST-56/112, padding boundaries
{0,55,56,57,63,64,65,119,120}, and 1000 / 1,000,000 'a' repeats, each
pinned to its published digest (cross-checked with shasum -a 256).
- 0712 streaming equivalence (one-shot == byte-at-a-time == split-mid-block
== split-on-boundary) plus sha256_file(temp) == in-memory digest.
src/ untouched. zig build && zig build test && tests/run_examples.sh green.
Add a pure-sx streaming SHA-256 (FIPS 180-4) stdlib module, importable
as `#import "modules/std/hash.sx";`. All 32-bit word arithmetic is done
in s64 and masked back with `& MASK32`, so digests are deterministic and
platform-independent — no shelling out, no native crypto.
API:
- init() -> Sha256 (by-value *self pattern)
- update(*Sha256, string) (multi-block + partial-block buffering)
- final(*Sha256) -> string (32-byte digest as lowercase hex)
- sha256_hex(string) -> string (one-shot)
- sha256_file([:0]u8) -> ?string (digest of a file via fs.read_file)
Verified against FIPS/NIST known-answer vectors and `shasum -a 256`:
"", "abc", the 56- and 112-byte multi-block vectors, 1000×'a', and the
64/65-byte block boundaries; chunked update() matches the one-shot call.
examples/0710-modules-sha256.sx pins the KAT vectors + the streaming
invariant; gate green (zig build, zig build test, run_examples 370/0/0/0).
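
The "wide integer plus mask" technique can be illustrated with a compact
one-shot SHA-256 in Python, where integers are wider than 32 bits just as
s64 is. This is a sketch, not a port of hash.sx: it has no streaming
update(), and it derives the round constants from the FIPS 180-4
definition (fractional parts of square and cube roots of the first primes)
instead of using a literal table. All helper names are illustrative.

```python
from math import isqrt

MASK32 = 0xFFFFFFFF


def icbrt(n: int) -> int:
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def first_primes(count: int) -> list:
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


PRIMES = first_primes(64)
H0 = [isqrt(p << 64) & MASK32 for p in PRIMES[:8]]   # frac(sqrt(p)) * 2^32
K = [icbrt(p << 96) & MASK32 for p in PRIMES]        # frac(cbrt(p)) * 2^32


def rotr(x: int, n: int) -> int:
    # The left shift spills past bit 31, so mask back to a 32-bit word.
    return ((x >> n) | (x << (32 - n))) & MASK32


def sha256_hex(data: bytes) -> str:
    h = list(H0)
    msg = data + b"\x80"
    msg += b"\x00" * ((56 - len(msg) % 64) % 64)
    msg += (len(data) * 8).to_bytes(8, "big")
    for off in range(0, len(msg), 64):
        w = [int.from_bytes(msg[off + 4 * i: off + 4 * i + 4], "big")
             for i in range(16)]
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK32)
        a, b, c, d, e, f, g, hh = h
        for i in range(64):
            big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & MASK32 & g)
            t1 = (hh + big_s1 + ch + K[i] + w[i]) & MASK32
            big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK32
            hh, g, f, e, d, c, b, a = (g, f, e, (d + t1) & MASK32,
                                       c, b, a, (t1 + t2) & MASK32)
        h = [(x + y) & MASK32 for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return "".join(f"{x:08x}" for x in h)
```

Every addition and rotation is followed by `& MASK32`, which is the whole
trick: intermediate sums may exceed 32 bits, but no stored word ever does.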
The reserved-type-name binding diagnostic fired correctly but underlined the
enclosing statement / if / while / for / match / protocol / #objc_class block
because every binding-name check reused the parent `node.span`.
Thread each binding name's own span through the AST and parser, and pass it to
`checkBindingNames`:
- ast: add name spans to VarDecl, DestructureDecl, If/WhileExpr, ForExpr
(capture + index), MatchArm, Catch/OnFailStmt, Protocol/ForeignMethodDecl.
- parser: populate each span at the binding site from the name token's loc;
destructure reuses each target identifier's own span.
- semantic_diagnostics: every checkBindingName call now passes the binding's
own span — no site falls back to node.span. fn/lambda params already used
Param.name_span.
Carets now land on the offending identifier itself. New regression
examples/1125 asserts the protocol default-body and sx-defined #objc_class
method param spans; the expected outputs for 0125/1119-1124 are updated to the
precise carets.
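
The difference between underlining the parent node and underlining the binding name can be shown with a small sketch. The `Span` type, `render_caret`, and the sample source line are hypothetical and only illustrate the idea; they are not the compiler's actual diagnostic renderer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # 0-based column of the first underlined character
    end: int    # exclusive end column


def render_caret(line: str, span: Span) -> str:
    """Return the source line followed by a caret line under the span."""
    width = max(1, span.end - span.start)
    return line + "\n" + " " * span.start + "^" * width


line = "x, s2 := pair()"            # destructure with a reserved name `s2`
node_span = Span(0, len(line))      # old behaviour: the whole statement
name_span = Span(3, 5)              # new behaviour: just the identifier

before = render_caret(line, node_span)
after = render_caret(line, name_span)
```

With `node_span` the caret line covers the entire statement; with `name_span` it covers only `s2`.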
The reserved/builtin-type-name binding diagnostic was a hand-walked subset
of binding-bearing AST nodes with a silent `else => {}`, so each review
found another syntactic binding form that bypassed it and hit the original
LLVM verifier abort: destructure names (`s2, x := …`), `impl` method
params/locals, and `if` / `while` / `for` / match-arm / `catch` / `onfail`
captures.
Rewrite `checkBindingNames` (src/ir/semantic_diagnostics.zig) as an
EXHAUSTIVE `switch` over every `Node.Data` tag with NO `else` arm — a future
binding-bearing node type now fails to compile until it is handled here, so
coverage is enforced by the compiler instead of a hand-maintained list. The
check stays in the pre-lowering semantic pass rather than moving to the
`Scope.put` scope-registration choke point: lowering is lazy, so an
uncalled function's bindings never reach `Scope.put`, yet they must still be
rejected at their declaration (e.g. the never-called `takes_u8` in 1119).
No lowering special-case; `lower.zig` unchanged.
Regression tests (fail-before: LLVM abort or silent accept → pass-after:
clean diagnostic, exit 1):
- 1121 control-flow: destructure, if/while bindings, for capture+index,
match-arm capture
- 1122 impl-block method: reserved param AND reserved local
- 1123 catch + onfail tag bindings
- 1124 destructure name reserved in an imported module
Existing 0125 / 1119 / 0135 / 1120 tests kept; full suite 368 passed.
The issue-0076 reserved-type-name binding diagnostic only ran over main-file
decls, so an imported module (or the stdlib) could still declare `s2 := ...`
and reach lowering, where the address-of family loads the whole aggregate and
passes it by value to a `ptr` param — LLVM verifier abort.
Extend coverage to every compiled module: a dedicated `checkBindingNames` walk
(in semantic_diagnostics.zig) visits every var/`:=`/typed-local binding name and
function/lambda/struct-method parameter at any depth, with NO main-file filter,
descending the `namespace_decl` that a `mod :: #import` wraps so imported-module
decls are reached. It tracks each module's source_file (save/restore per node)
so the diagnostic renders against the imported module's text. Rejection still
defers to the parser's `Type.fromName` classifier; the unknown-type check (0064)
stays main-file-only. No lowering special-case; `.identifier`-only address-of
paths are unchanged.
Stdlib audit: the only reserved-name bindings under library/ were two `u1`
locals in ui/renderer.sx (UV coords) — renamed to u_min/u_max/v_min/v_max.
Regression test: examples/1120-diagnostics-imported-reserved-type-name.sx (+
companion mod.sx) — an imported `s2 := ...` now emits the clean diagnostic at
the import's declaration site (exit 1), not an LLVM abort.
Resolves issues 0076 (coverage extension) and 0077.
A value binding (local/global `var` or a parameter) spelled as a
reserved/builtin type name parses as a `.type_expr` rather than an
`.identifier` (parser.zig, via `Type.fromName`), so the address-of
family in lower.zig never saw a scoped local and mis-lowered it —
loading the aggregate and passing it by value to a `ptr` parameter
(LLVM verifier abort, or a silent `*self`-mutation-losing copy).
Add a declaration-site diagnostic in semantic_diagnostics.zig
(`UnknownTypeChecker.checkBindingName`): reject any parameter name or
`var` binding name (`:=` / typed-local / global forms) whose spelling
collides with a reserved type name. `isReservedTypeName` defers to the
parser's own classifier (`types.Type.fromName`) so the rejected set
never drifts from the set that would parse as a type — the named
builtins (bool/string/void/f32/f64/usize/isize/Any) and `[su]N` over
sx's 1-64 range. Bare value names (`s`, `self`, `index`) are untouched.
No lowering special-case; the `.identifier`-only address-of paths are
correct once type-shaped names can never be bound. The rejected
attempt-1 `bareVarName` approach never landed.
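
As a rough model of the rejected set described above, the classifier can be sketched as follows. This is an assumption-laden illustration: the function names are hypothetical, and the handling of leading zeros (`s01`) is a guess, since the real rule lives in the parser's `Type.fromName`.

```python
import re

# Named builtins listed in the commit message.
NAMED_BUILTINS = frozenset(
    {"bool", "string", "void", "f32", "f64", "usize", "isize", "Any"}
)

# `s` or `u` followed by a width; leading zeros are assumed not to count.
SIZED_INT = re.compile(r"[su]([1-9][0-9]?)")


def is_reserved_type_name(name: str) -> bool:
    """True if `name` would be classified as a builtin type."""
    if name in NAMED_BUILTINS:
        return True
    m = SIZED_INT.fullmatch(name)
    return m is not None and 1 <= int(m.group(1)) <= 64


def check_binding_name(name: str):
    """Return a diagnostic string for a reserved name, else None."""
    if is_reserved_type_name(name):
        return f"'{name}' is a reserved type name and cannot be bound"
    return None
```

Keeping a single classifier that both the parser and the checker call is what prevents the two sets from drifting apart.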
Tests:
- 0125-types-type-named-var-rejected: `:=` form (s2) rejected
(repurposed from the old test that asserted the now-illegal behavior).
- 1119-diagnostics-reserved-type-name-as-identifier: parameter (u8),
typed-local (s64, bool), `:=` (string) forms rejected.
- 0135-types-self-streaming-nonreserved: positive — `*self` streaming
with non-reserved names accumulates correctly via both call styles.
- 0904-optionals: renamed incidental locals s1/s2 -> filled/empty.
The trace docs predated the current formatter. Corrected against the real
output (library/modules/trace.sx to_string + examples/expected/1025-errors-
trace-format.stderr):
- error-handling.md: replace the obsolete trace example ("error trace:" /
"raised error.X" / "at func (file:line)") with the real format —
"error return trace (most recent call last):" + per-frame "func at
file:line:col" + source line + caret.
- debugger.md: drop the stale "(planned)" marker on the trace formatter
(it is implemented); the tag-name table note now cites the failable-main
reporter's "unhandled error reached main: error.X" line, not a
nonexistent "raised error.X" trace line.
The `type_name` / `type_eq` reflection builtins resolved their Type arg's IR
type via `getRefIRType(...) orelse TypeId.s64`, then gated `== .any`. A failed
must-succeed lookup silently became `.s64` (`!= .any`), classifying a boxed
`Any` arg as bare i64 and reading the wrong value with no diagnostic.
Add the sibling classifier `LLVMEmitter.reflectArgRepr`, which routes the
lookup through `argIRTypeOrFail` (the issue-0074 `.unresolved` resolver) and
returns `{ boxed, bare, unresolved }`. The three emit sites in ops.zig
(`type_name` + `type_eq` x2) now switch on it: `.boxed` extracts the Any value
field, `.bare` uses the value directly, `.unresolved` hits a hard `@panic`
tripwire — never silently treated as bare. Real args always resolve, so the
happy path is byte-identical (suite stays 361/0, zero snapshot churn).
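
The fail-before / pass-after difference can be modelled in a few lines. This sketch is in Python rather than Zig, and every name in it is a stand-in: a plain dict plays the role of the IR type table, and the enums mirror only the cases named in this message.

```python
from enum import Enum, auto


class TypeId(Enum):
    ANY = auto()
    S64 = auto()
    UNRESOLVED = auto()  # dedicated sentinel, never a real type


class Repr(Enum):
    BOXED = auto()
    BARE = auto()
    UNRESOLVED = auto()


def arg_ir_type_or_fail(types: dict, ref: int) -> TypeId:
    """A failed lookup yields the sentinel instead of a plausible real type."""
    return types.get(ref, TypeId.UNRESOLVED)


def reflect_arg_repr_old(types: dict, ref: int) -> Repr:
    """Old behaviour: a failed lookup silently becomes S64, hence BARE."""
    ty = types.get(ref, TypeId.S64)
    return Repr.BOXED if ty is TypeId.ANY else Repr.BARE


def reflect_arg_repr(types: dict, ref: int) -> Repr:
    """New behaviour: three-way classification with a loud failure case."""
    ty = arg_ir_type_or_fail(types, ref)
    if ty is TypeId.UNRESOLVED:
        return Repr.UNRESOLVED
    return Repr.BOXED if ty is TypeId.ANY else Repr.BARE


types = {1: TypeId.ANY, 2: TypeId.S64}
```

A caller that switches on the result has to handle `UNRESOLVED` explicitly, so a missing entry can no longer pass as a bare integer.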
The secondary `lower.zig` fallback `null_literal`/`undef_literal => target_type
orelse .void` is confirmed intentional: emitConstNull/emitConstUndef deliberately
handle the typeless-literal default as null-ptr / undef-i64. It is left in place
with an invariant comment rather than routed to the `.unresolved` tripwire.
Regression test in emit_llvm.test.zig asserts the loud path: fail-before with
`orelse .s64` yields `.bare`; pass-after yields `.unresolved`.
Discovered during the 0074 fix and a codebase-wide silent-type-fallback sweep:
`getRefIRType(...) orelse TypeId.s64` at ops.zig:1023/1049/1055 (type_name/type_eq).
Blocker; to be resolved before the arch-refactor stream closes.
Four FFI call-arg lowering sites resolved an argument's IR type via
`getRefIRType(arg_ref) orelse .void` — a silent fallback to the load-bearing
real type `.void`. A failed lookup there is a codegen invariant violation, but
`.void` is treated by downstream `toLLVMType` → `abiCoerceParamType` →
`coerceArg` as a legitimate void-typed foreign argument, corrupting the call
ABI with no diagnostic.
Add one shared resolver `LLVMEmitter.argIRTypeOrFail` that returns the
dedicated `.unresolved` sentinel on a failed lookup — never `.void`/`.s64` — so
the failure cannot masquerade as a real type and trips `toLLVMType`'s existing
hard `@panic` tripwire at the call site. Route all four sites through it:
- src/ir/emit_llvm.zig JNI constructor (NewObject) arg loop
- src/backend/llvm/ops.zig objc_msgSend arg loop
- src/backend/llvm/ops.zig JNI non-virtual call arg loop
- src/backend/llvm/ops.zig JNI Call<Type>Method arg loop
Happy path is byte-identical (every real arg already has a resolved type); FFI
examples stay green with zero snapshot churn.
Regression test (fail-before/pass-after) in src/ir/emit_llvm.test.zig asserts an
unresolvable FFI arg ref now yields `.unresolved`, not the old silent `.void`.