694 Commits

Author SHA1 Message Date
agra
e8cc9d03de fix(ir): precise oversized-dim diagnostic on the alias path (0083)
The stateless alias-registration array-dim path collapsed foldDimU32's
distinct .too_large / .below_min outcomes into null, so an oversized type
alias (Big :: [5000000000]s64) emitted the FALSE 'an array dimension is not
a compile-time integer constant' message while the direct form correctly
reported 'array dimension 5000000000 does not fit in u32'.

Add program_index.reportDimError as the single source of dim-error wording
(the stateful path now emits through it too) and type_bridge.foldArrayDim to
surface the DimU32 reason at the alias-registration site. An oversized/negative
alias dim now routes to reportDimError for the same precise message as the
direct form; a genuinely non-const alias dim keeps the alias-specific message.

Regression: examples/1131-diagnostics-array-dim-oversized-u32-alias.sx
2026-06-04 12:31:24 +03:00
agra
efc09699e8 fix(ir): value-param type functions + range-checked dim/lane fold (0083, 0087)
Two remaining siblings in F0.4's comptime-int path.

1. Type-returning function with a value param used as a TYPE annotation
   (`b : Make(N, s64)` where `Make :: ($K: u32, $T: Type) -> Type`):
   - `isValueParamPosition` (semantic_diagnostics) now also skips a value
     param of a `fn_ast_map` type-returning function, so `N` is not walked
     as the type name "N" ("unknown type 'N'").
   - `resolveParameterizedWithBindings` routes a type-returning-function
     name to `instantiateTypeFunction` (the `.call` path already did).
   - `instantiateTypeFunction` resolves a general return-type expression
     (`return [K]T`) with bindings active — not just struct/union returns.
   `Make(N, s64)`, `Make(M + 1, s64)`, `Make(3, s64)` all resolve to one
   `[3]s64`.

2. Oversized dim/lane fold panicked the compiler (0087): an array dim /
   Vector lane folded to a valid i64 (5e9) then narrowed to u32 with an
   unchecked `@intCast`. New single gate `program_index.foldDimU32` folds
   via `evalConstIntExpr` then range-checks `[min, maxInt(u32)]`; the three
   narrowing sites (resolveArrayLen stateful + stateless, resolveVectorLane)
   all route through it and emit a clean diagnostic + halt instead of
   panicking. Value-param args stay i64 until used as a dim/lane, where the
   same gate checks them.

Regressions: examples/0208 (value-param type function), examples/1130
(oversized array dim clean halt), examples/1503 (oversized Vector lane
clean halt). Marks issue 0087 RESOLVED.

Gate: zig build, zig build test, bash tests/run_examples.sh — 398 passed,
0 failed, 0 timed out.
2026-06-04 12:13:45 +03:00
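Note on efc09699e8: the range-checked gate can be pictured roughly as below. This is a minimal Python sketch, not the compiler's Zig code. The names `fold_dim_u32` and `DimResult` and the `minimum` parameter are hypothetical stand-ins, and the folded value is assumed to arrive as an already-evaluated integer (or None when the operand is not a compile-time integer).

```python
from dataclasses import dataclass
from typing import Optional

U32_MAX = 2**32 - 1  # largest value a u32 can hold


@dataclass(frozen=True)
class DimResult:
    kind: str                    # "ok", "not_const", "below_min" or "too_large"
    value: Optional[int] = None  # the folded integer, when there is one


def fold_dim_u32(folded: Optional[int], minimum: int) -> DimResult:
    """Range-check an already-folded integer before narrowing it to u32.

    Hypothetical sketch: every outcome keeps its own tag, so a caller can
    pick a precise diagnostic instead of treating all failures alike.
    """
    if folded is None:           # operand was not a compile-time integer
        return DimResult("not_const")
    if folded < minimum:
        return DimResult("below_min", folded)
    if folded > U32_MAX:
        return DimResult("too_large", folded)
    return DimResult("ok", folded)
```

Keeping the failure kinds distinct is what lets a later caller report "does not fit in u32" for 5000000000 rather than the generic non-constant message.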
agra
7238eea084 docs(issues): file 0086 — Vector lane store panics (discovered, pre-existing)
While fixing 0083 (attempt 5) noticed a distinct, pre-existing bug:
writing to a Vector component (`v.x = 1.0`) aborts with "unresolved type
reached LLVM emission" in emitStore. Reading a lane works; a literal lane
count triggers it, so it is NOT the lane-count class. Confirmed
reproducible on the pristine pre-attempt-5 compiler (not introduced by
the lane-count fix). The standard vector idiom (`.[…]` construction +
component reads / arithmetic, examples/1500) is unaffected. Filed for a
separate session; not worked around here.
2026-06-04 11:32:31 +03:00
agra
a491a1bf73 fix(ir): route every comptime-int through the shared evaluator (0083)
Attempts 1–4 fixed the array-dimension paths but the same length-0
fabrication class survived on every other site that resolves a
compile-time integer. Unify them all on the single shared
`program_index.evalConstIntExpr` so they cannot diverge:

- All three Vector lane resolvers (resolveTypeCallWithBindings,
  resolveParameterizedWithBindings, resolveArrayLiteralType) and both
  generic value-param binders (instantiateGenericStruct,
  instantiateTypeFunction) hand-rolled an `else => 0` switch. A
  module-const lane `Vector(N, f32)` fabricated a 0-lane `<0 x float>`
  (LLVM "huge alignment" abort); a value-param `Vec(N, f32)` fabricated
  a 0 binding / wrong mangled name. They now fold through the shared
  evaluator and emit a clean diagnostic + `.unresolved` on a non-const
  operand (resolveVectorLane / resolveValueParamArg) — never 0.
- evalComptimeInt (inline-for bounds) delegated to the shared evaluator,
  so `inline for 0..M` / `0..(M+1)` fold like array dims. The `<pack>.len`
  leaf moved into the shared folder via a new `ctx.lookupPackLen`.
- The unknown-type semantic checker no longer walks a value-param
  position (`Vector(N, …)` / `Vec(N, …)`) as a type name (was reporting
  "unknown type 'N'").
- The parameterized-type-arg parser and the function-body lookahead
  (hasFnBodyAfterArrow) accept a const-EXPRESSION in a value position, so
  `Vector(M + 1, f32)` and `[M + 1]T` parse as a return type too (the
  latter a pre-existing array-dim sibling that the same heuristic broke).

Regressions: examples/1501 (named-const + const-expr lane, direct +
alias, 3/4-lane reads), 1502 (runtime lane clean-halts, exit 1, no LLVM
crash), 0207 (Vec(N)/Vec(M+1) == Vec(3) instantiation), 0610 (inline-for
const bounds). Shared-evaluator unit test extended with the pack-len arm.

zig build && zig build test && bash tests/run_examples.sh: 395 passed,
0 failed.
2026-06-04 11:32:25 +03:00
agra
cd39316f5e fix(ir): evaluate constant-expression array dimensions (0083)
A constant-FOLDABLE expression array dimension (`[M + 1]`, `[M * N]`,
`[N - M]`, nested `[M + N - 1]`, parenthesised `[(M + 1) * 2]`, mixing
untyped and typed module consts) was wrongly rejected as "not a
compile-time integer constant" even though every operand is
compile-time-known. Attempts 1-3 resolved only a bare named-const dim or
a literal; an expression dim must be EVALUATED, not rejected.

Fix: the shared dim resolver now routes the dimension through a single
constant integer-expression evaluator (`program_index.evalConstIntExpr`)
that folds integer `+ - * / %` and unary negate over literals and
named/typed module consts, recursively (parentheses carry no AST node).
The leaf-name lookup is delegated via `ctx.lookupDimName`, so the
stateful body-lowering path (`Lowering`, which also sees comptime
constants and generic `$N` values) and the stateless registration path
(`type_bridge.StatelessInner`, module consts only) share the EXACT SAME
folding logic and cannot diverge — an expression dim via a type alias
resolves identically to the direct form.

No-fabrication discipline unchanged: a genuinely non-comptime dimension
(runtime local, non-comptime call, unbound name) or arithmetic that
overflows / divides by zero still yields null -> `.unresolved` -> the
same clean compile-halting diagnostic, never a fabricated length.

- examples/0144-types-const-expr-array-dim.sx: every expression form,
  direct vs alias, scalar / string / struct element types (fails on the
  pre-fix compiler, passes after).
- examples/1129 re-pointed at a genuinely non-const dimension
  (`[get()]s64`, a runtime call) so it still proves the stateless
  clean-halt (a foldable expression is no longer an error).
- program_index.test.zig: unit test for evalConstIntExpr folding and
  clean-halt-on-non-const.
2026-06-04 10:38:21 +03:00
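Note on cd39316f5e: a recursive constant-integer folder with a delegated leaf-name lookup might look like the sketch below. It is written in Python for illustration only. The tuple-based AST, the function name `eval_const_int`, the i64 bounds check, and the choice of truncating division are all assumptions, not taken from the compiler source.

```python
from typing import Callable, Optional

I64_MIN, I64_MAX = -(2**63), 2**63 - 1

# Hypothetical AST: ("int", n) | ("name", s) | ("neg", e) | ("bin", op, lhs, rhs)


def eval_const_int(expr, lookup: Callable[[str], Optional[int]]) -> Optional[int]:
    """Fold an integer expression; None means 'not a compile-time integer'."""
    tag = expr[0]
    if tag == "int":
        value = expr[1]
    elif tag == "name":
        value = lookup(expr[1])          # leaf lookup is delegated to the caller
    elif tag == "neg":
        inner = eval_const_int(expr[1], lookup)
        value = None if inner is None else -inner
    elif tag == "bin":
        _, op, lhs_expr, rhs_expr = expr
        lhs = eval_const_int(lhs_expr, lookup)
        rhs = eval_const_int(rhs_expr, lookup)
        if lhs is None or rhs is None:
            return None
        if op == "+":
            value = lhs + rhs
        elif op == "-":
            value = lhs - rhs
        elif op == "*":
            value = lhs * rhs
        elif op in ("/", "%"):
            if rhs == 0:
                return None              # division by zero never fabricates a value
            quot = abs(lhs) // abs(rhs)  # truncating division (an assumption)
            if (lhs < 0) != (rhs < 0):
                quot = -quot
            value = quot if op == "/" else lhs - quot * rhs
        else:
            return None
    else:
        return None                      # calls, runtime locals, anything else
    if value is None or not (I64_MIN <= value <= I64_MAX):
        return None                      # unbound name or overflow
    return value
```

Because the evaluator only sees names through `lookup`, two callers with different constant tables can share the same folding logic, which is the property the commit message describes.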
agra
d2bf8f3f2d fix(ir): unify named-const array-dim resolution + kill length-0 fabrication (0083)
A type alias whose dimension is a named const (`Arr :: [N]T`) resolves its
dimension eagerly during scanDecls pass 1, on the stateless registration path,
which can only read `module_const_map`. Typed consts (`N : s64 : 16`) register
only in pass 2 and a forward-declared untyped const had not registered yet, so
the stateless resolver saw an empty table, printed a non-fatal warning,
fabricated length 0, and continued — yielding a 0-byte alloca, garbage reads,
and a segfault for slice/struct elements.

- scanDecls pass 0 pre-registers every integer-valued module const before any
  type alias resolves, so typed, untyped, and forward-referenced consts all
  resolve identically.
- Both dim resolvers now share `program_index.moduleConstInt`, so the stateful
  body-lowering path and the stateless registration path cannot diverge.
- `resolveArrayLen` returns `?u32`; `resolveCompound` yields `.unresolved` on
  null instead of a 0-length array. The stateful path emits a diagnostic; the
  alias-registration path surfaces an unresolved alias as a clean compile error
  that aborts the build. The Vector lane-count `else => 0` is fixed the same way.

Regressions: examples/0143 (typed-const dim direct + via alias for s64/string/
struct, forward-ref alias, nested) and examples/1129 (an unresolvable computed
dim halts with a clean diagnostic + non-zero exit). Both fail on the pre-fix
compiler (garbage/segfault; warning + exit 0) and pass after.
2026-06-04 09:39:18 +03:00
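Note on d2bf8f3f2d: the ordering fix (register every integer const first, resolve aliases afterwards, and never fall back to length 0) can be sketched as follows. This is a simplified Python model. The declaration dictionaries and the name `resolve_aliases` are hypothetical, and the real scanDecls handles far more than this.

```python
def resolve_aliases(decls):
    """Resolve array-alias lengths after pre-registering integer consts.

    Hypothetical model: each decl is a dict with a "kind" of "const" or
    "alias". An alias that cannot be resolved maps to None (unresolved)
    instead of a fabricated length of 0.
    """
    consts = {}
    for decl in decls:                      # pass 0: integer consts only
        if decl["kind"] == "const" and type(decl["value"]) is int:
            consts[decl["name"]] = decl["value"]

    aliases = {}
    for decl in decls:                      # pass 1: aliases see every const
        if decl["kind"] != "alias":
            continue
        dim = decl["dim"]
        length = dim if type(dim) is int else consts.get(dim)
        aliases[decl["name"]] = None if length is None else (length, decl["elem"])
    return aliases
```

With this ordering, an alias declared before the const it names resolves the same way as one declared after it.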
agra
1f9f944ca1 fix(ir): exhaustive named-const array dims (0083) + nested slice-literal coercion (0085)
Makes the F0.4 fixes exhaustive across every resolution / nesting path.

0083 — named-const array dimension, stateless paths. Attempt 1 fixed the
stateful resolver (direct local decls, struct fields, params, returns) but the
binding-free registration-time resolver (`type_bridge`, used for type aliases
`Arr :: [N]T` and inline union/enum field types) still resolved a named dim with
a silent `else 0`, so `Arr :: [N]s64; a : Arr` and `union { a: [N]s64 }` were
still miscompiled (garbage / bus error). Thread the module-global const table
(`ProgramIndex.module_const_map`) into `type_bridge` alongside the alias map, so
`StatelessInner.resolveArrayLen` resolves a named module-const dim to the same
length everywhere. The remaining unresolvable case (a computed/comptime dim on
the binding-free path, which the stateful path hard-errors) now bails LOUDLY
instead of fabricating a 0 length.

0085 — nested slice-literal elements. `lowerArrayLiteral` lowered each element
with the element type as target but appended the raw value. A nested `.[...]`
element at a slice element type (`[][]s64`) still lowers to an aggregate array
`[N]T`, so the outer aggregate held raw arrays where slice {ptr,len} headers
were expected — indexing the inner slice read a garbage pointer and segfaulted.
After lowering each element, coerce a same-element array to the slice element
type via the existing `array_to_slice` op. The coercion recurses with the
nesting, so `[][]T` and deeper materialize at every level — local-bound AND
direct-call-argument forms.

Regressions (fail-before/pass-after demonstrated on the pre-fix compiler):
  examples/0140-types-named-const-array-dim.sx — extended with type-alias,
    nested [N][M]T, and union-field named dims (s64 / string / struct elems)
  examples/0142-types-nested-slice-literal-elements.sx — [][]s64 + [][]string,
    local-bound vs direct-arg
  src/ir/type_bridge.test.zig — named-const dim resolves to literal length

Gate: zig build, zig build test, bash tests/run_examples.sh (388 passed).
Issues 0083 and 0085 marked RESOLVED.
2026-06-04 09:06:08 +03:00
agra
12552e125d fix(ir): resolve named-const array dims (0083) + materialize literal slice args (0084)
Two silent-miscompile codegen fixes:

0083 — named-const array dimension. `TypeResolver.resolveCompound`'s array
arm resolved the dimension with `if int_literal ... else 0`, so a named const
(`N :: 16; [N]T`) hit the silent `else 0`: the array became 0-length / 0-byte
and element access ran out of bounds (garbage for scalars, bus error for
slice/pointer/struct elements). The arm now delegates the dimension to
`inner.resolveArrayLen` (symmetric with `inner.resolveInner` for the element).
The stateful `Lowering.resolveArrayLen` evaluates it as a compile-time integer
across the comptime-constant / generic-value / module-global const tables and
emits a diagnostic — no fabricated length — when it isn't one.

0084 — `.[...]` literal passed directly as a call arg. `lowerArrayLiteral`
always yields an aggregate array value; the array→slice conversion is the
caller's job. The local-bound var-decl path did it, but the call-arg coercion
path had no array→slice arm, so `classify([N]T, []T)` returned `.none` and the
raw array was passed where a slice was expected (callee read its {ptr,len}
header off the wrong bytes → 0 / garbage / segfault). `classify` now returns a
new `.array_to_slice` plan for same-element `[N]T → []T`, and `coerceToType`
emits the existing `array_to_slice` op — identical to the local-bound path.

Regressions (fail-before/pass-after demonstrated on the pre-fix compiler):
  examples/0140-types-named-const-array-dim.sx (s64 + string + struct elems)
  examples/0141-types-slice-literal-direct-call-arg.sx (string + []s64)

Gate: zig build, zig build test, bash tests/run_examples.sh (387 passed).
Issues 0083 and 0084 marked RESOLVED.
2026-06-04 08:22:45 +03:00
agra
3b36264e65 Merge branch 'flow/sx-foundation/F3.2' into dist-foundation
2026-06-04 08:03:43 +03:00
agra
9784ff8705 F3.2: assert Diag for the zero-arg and too-many-flags raise sites
Example 0717 now asserts the (token, index) Diag for ALL SIX raise sites
in cli.sx, closing the two the reviewer found still unasserted:

  - zero-arg UnknownCommand: parse([], ...) -> index -1, token ""
    (the args.len == 0 sub-branch of cli.sx:237, distinct from the
    one-arg too-few form already covered at index 0 / token args[0]).
  - TooManyFlags (cli.sx:256): a command declaring 17 flag specs (> the
    inline 16 cap) is rejected, not truncated -> index -1, token command.

The three index==-1 cases (zero-arg, too-many, missing-req) seed their
Diag with a sentinel before parse, so each assertion proves parse WROTE
the -1/"" rather than merely matching the `.{}` default. Verified
non-vacuous: flipping any expected value makes that line FAIL.

Test-only: cli.sx logic and src/ are untouched.
2026-06-04 07:54:20 +03:00
agra
d1e5f10039 F3.2: assert Diag (token,index) for all cli.parse error cases
Extend example 0717 to pin the offending token VIEW and its args index
for every failure the parser's Diag populates: unknown-command,
unknown-group, too-few-args, missing-value, value-eats-flag, and the
missing-required index. Closes the test-coverage gap flagged in review;
cli.sx parser logic unchanged.
2026-06-04 07:38:57 +03:00
agra
17b437ecfb F3.2: std.cli minimal subcommand + flag parser over explicit []string
Extend std/cli.sx with a zero-heap argument parser that the caller drives
over a logical argv ([]string), separate from the F3.1 os_args accessor.

Grammar: <group> <command> [--flag VALUE | --bool]... [--json] [-- rest...]
  - (group, command) dispatched against a caller-provided Command table;
    no match -> error.UnknownCommand.
  - value-taking vs boolean flags fixed by each command's FlagSpec list;
    --json is a reserved global boolean surfaced as parsed.json.
  - `--` or the first bare operand ends flag parsing; the remainder is
    parsed.rest (operand views).

Heap discipline (heap-discipline.md): zero heap, zero copy. group/command/
flag values/rest are all VIEWS into args. Parsed is a by-value stack struct;
flag presence/values live in a fixed [16]FlagValue inline array indexed by
spec position (no per-flag allocation, no context.allocator). The flag-spec
list and command table are caller storage passed as views.

Failure surfacing (no silent skip): unknown command, unknown flag, a
value-flag missing its value, and an absent required flag each raise a
specific CliError variant; a caller-owned Diag records the offending token
(index + view) before each raise, since error tags carry no data.

examples/0717 drives the parser over explicit []string vectors: a valid
group/command/--flag/--bool/--json case (asserting parsed values + that
values are views into argv), subcommand dispatch, `--`/bare-operand
separators, and the five failure variants each asserted via destructure +
Diag. zig build && zig build test && run_examples.sh green (385 passed).
2026-06-04 06:13:09 +03:00
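Note on 17b437ecfb: the grammar and failure surfacing can be approximated by the Python sketch below. It is an illustration under assumptions, not a port of cli.sx. `UnknownCommand` and `TooManyFlags` are named in these commit messages, but `UnknownFlag`, `MissingValue`, `MissingRequired`, the shape of the command table, and which token is blamed for an unknown (group, command) pair are hypothetical. Python list slices also copy, so the zero-copy view property is not modelled.

```python
from dataclasses import dataclass, field

MAX_FLAGS = 16  # inline flag capacity mentioned in the commit messages


class CliError(Exception):
    def __init__(self, kind, token, index):
        super().__init__(kind)
        self.kind, self.token, self.index = kind, token, index


@dataclass
class Parsed:
    group: str
    command: str
    flags: dict = field(default_factory=dict)
    json: bool = False
    rest: list = field(default_factory=list)


def parse(args, commands):
    """commands maps (group, command) -> {flag_name: (takes_value, required)}."""
    if len(args) < 2:
        raise CliError("UnknownCommand", args[0] if args else "", 0 if args else -1)
    group, command = args[0], args[1]
    specs = commands.get((group, command))
    if specs is None:
        raise CliError("UnknownCommand", command, 1)   # blamed token is a guess
    if len(specs) > MAX_FLAGS:
        raise CliError("TooManyFlags", command, -1)    # rejected, not truncated
    parsed = Parsed(group, command)
    i = 2
    while i < len(args):
        tok = args[i]
        if tok == "--":                  # explicit separator: the rest are operands
            parsed.rest = args[i + 1:]
            break
        if not tok.startswith("--"):     # first bare operand ends flag parsing
            parsed.rest = args[i:]
            break
        name = tok[2:]
        if name == "json":               # reserved global boolean
            parsed.json = True
        elif name not in specs:
            raise CliError("UnknownFlag", tok, i)
        elif specs[name][0]:             # value-taking flag
            if i + 1 >= len(args) or args[i + 1].startswith("--"):
                raise CliError("MissingValue", tok, i)
            parsed.flags[name] = args[i + 1]
            i += 1
        else:
            parsed.flags[name] = True
        i += 1
    for name, (_, required) in specs.items():
        if required and name not in parsed.flags:
            raise CliError("MissingRequired", name, -1)
    return parsed
```

The exception carries the offending token and index directly, whereas the commit describes a caller-owned Diag filled in before the raise because sx error tags carry no data.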
agra
8c96290801 Merge branch 'flow/sx-foundation/F0.3' into dist-foundation
2026-06-04 05:38:18 +03:00
agra
d87bad2ec4 fix(ir): halt cleanly when a global initializer can't be serialized
The global-init constant serializers in emit_llvm.zig printed a diagnostic
on an unserializable value and then RETURNED an undef/null placeholder and
CONTINUED emitting. For a comptime `#run` global that yields a function
reference (`fp :: #run pick();` where pick returns a function), the build
fell through to the JIT and segfaulted calling through the undef pointer
(exit 134) — a silent miscompile dressed up as a printed error.

Route every genuine bail in the serialization family through a new
`failGlobalInit` helper: it sets `comptime_failed` (so core.generateCode
aborts with a non-zero exit after emit()) and returns an undef placeholder
that never ships, because the halt fires before object emission / JIT. This
covers the comptime func_ref leaf, the require_resolved aggregate func_ref
leaf, the top-level + vtable func_ref globals, the comptime-init catch, and
the remaining heap-walk / aggregate-shape bails. Unresolved-function
diagnostics now name the function instead of its (stdlib-unstable) IR index.

The require_resolved=false Pass-0 placeholder is unchanged (func_map is
empty until Pass 1; the aggregate is re-emitted with require_resolved=true).

Regression: examples/1128-diagnostics-comptime-global-funcref-rejected.sx —
a `#run` global returning a function ref now exits 1 with the diagnostic
(was: exit 134 segfault). Fail-before/pass-after verified.
2026-06-04 05:25:19 +03:00
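Note on d87bad2ec4: the "record the failure, return a placeholder, halt before anything ships" pattern can be sketched in a few lines of Python. The class and function names below are hypothetical, and a Python callable stands in for an unserializable function reference. The real logic lives in emit_llvm.zig and core.generateCode.

```python
UNDEF = object()  # placeholder that must never reach the final module


class Emitter:
    def __init__(self):
        self.comptime_failed = False
        self.diagnostics = []

    def fail_global_init(self, message):
        """Record the diagnostic, mark the build as failed, return a placeholder."""
        self.diagnostics.append(message)
        self.comptime_failed = True
        return UNDEF

    def serialize(self, name, value):
        if callable(value):  # stand-in for a function reference
            return self.fail_global_init(
                f"global '{name}': function reference cannot be serialized")
        return value


def generate_code(global_inits):
    """Emit every global, then halt with exit code 1 if any serialization bailed."""
    emitter = Emitter()
    module = {name: emitter.serialize(name, v) for name, v in global_inits.items()}
    if emitter.comptime_failed:
        return 1, None, emitter.diagnostics   # the placeholder never ships
    return 0, module, emitter.diagnostics
```

The point of the flag is that emission may continue long enough to collect diagnostics, while the check after emission guarantees the placeholder is never executed.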
agra
263333bd26 fix(ir): serialize enum-literal global initializers (issue 0082)
A module-global initialized with an enum literal silently zero-initialized
to the first tag (`chosen : Color = .green` read back as `.red`), and an
enum tag inside a global array/struct was rejected as non-constant. The
constant serializer had no enum-literal arm.

Add `Lowering.constEnumLiteral`: serialize an enum literal to a
`ConstantValue.int` holding the variant's tag value, resolved against the
destination enum type and respecting explicit variant values; the global's
type drives the backing width at emit time. Wired into `globalInitValue`
(scalar global) and `constExprValue` (array element / struct field / nested
aggregate). A non-enum destination or unknown variant is diagnosed loudly,
never silently zero-initialized. The compiler-injected OS/ARCH globals now
serialize to their real `.unknown` tag (6 / 4); runtime reads are unchanged
(they resolve through comptime_constants), so only the static initializer in
the pinned .ir snapshots changes.

Remove the silent `func_ref => orelse LLVMConstNull` fallbacks in the LLVM
constant emitters: aggregate func_ref leaves carry a `require_resolved` flag
(transient null in Pass 0, loud diagnostic if still unresolved in the
Pass-1.5 re-emit), a top-level func_ref global is resolved in
initVtableGlobals, and the comptime (#run) path bails loudly instead of
emitting a null function pointer.

Regression: examples/0139-types-global-enum-literal-init.sx (scalar, array,
struct field, explicit-value enum u16 stride, struct-array with enum field);
negative: examples/1127-diagnostics-global-enum-literal-bad-variant.sx.
Mark issue 0082 RESOLVED.
2026-06-04 04:52:42 +03:00
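Note on 263333bd26: serializing an enum literal to its tag value, while respecting explicit variant values and refusing unknown variants, might be modelled as below. This Python sketch assumes that implicit tags continue from the previous variant's value, which is a common convention but is not confirmed by the commit message. The function names are hypothetical.

```python
def enum_tags(variants):
    """Map variant names to tag values.

    variants is a list of (name, explicit_value_or_None). Implicit values
    are assumed to continue from the previous tag, starting at 0.
    """
    tags = {}
    next_value = 0
    for name, explicit in variants:
        value = next_value if explicit is None else explicit
        tags[name] = value
        next_value = value + 1
    return tags


def const_enum_literal(variant, enum_variants):
    """Return the integer tag for `.variant`, or fail loudly if it is unknown."""
    tags = enum_tags(enum_variants)
    if variant not in tags:
        raise ValueError(f"unknown enum variant '.{variant}'")
    return tags[variant]
```

The key behaviour is the raised error: an unknown variant never degrades to tag 0, which was the silent zero-initialization the commit removes.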
agra
d680b320f4 fix(ir): serialize null pointer fields in global aggregates (issue 0081)
A module-global aggregate initializer rejected a `null` literal in a
pointer (or optional-pointer) field as "must be initialized by a
compile-time constant". `Lowering.constExprValue` had no `.null_literal`
arm, so the null leaf returned no constant and the whole aggregate looked
non-constant — even though `null` is the compile-time zero pointer (a
top-level scalar `p : *s64 = null;` already serialized fine).

Add `.null_literal => .null_val` to constExprValue. While here, make the
two LLVM constant emitters exhaustive: emitConstAggregate and the
top-level init_val switch in emit_llvm.zig previously ended in a silent
`else => LLVMConstNull(...)` catch-all (the silent-arm class CLAUDE.md
mandates rooting out). They now handle every ConstantValue tag explicitly
(.null_val/.zeroinit -> all-zero constant, .undef -> LLVMGetUndef,
.func_ref resolved, nested .vtable is a hard @panic tripwire). The
reject-loud path for genuinely non-constant fields is preserved.

Regression: examples/0138 (array-of-struct null ptr fields, array of
all-null pointers, nested struct-in-struct null ptr) and the negative
examples/1126 (null ptr field beside a non-const field still errors).
Fail-before/pass-after verified.
2026-06-04 04:22:43 +03:00
agra
e93879816d fix(ir): materialize global aggregate struct-literal initializers (issue 0080)
A module-global array of struct literals (`pairs : [2]Pair = .[ .{...}, .{...} ]`)
was emitted as `zeroinitializer`, silently dropping every declared field — reads
returned 0 with no diagnostic. Global struct literals and struct-with-array
already worked; the gap was struct literals used as ARRAY elements.

Root cause: `Lowering.constExprValue` (the const-aggregate serializer for global
initializers) had no `.struct_literal` arm. `constArrayLiteral` serialized each
element through `constExprValue`, so a struct-literal element returned null,
collapsing the whole array initializer to null; `globalInitValue` then emitted no
payload and the LLVM backend zero-initialized the global — the same silent-zero
class as 0071/0072, one level inside an array literal.

Fix: make `constExprValue` type-aware — thread the destination element/field
TypeId so a struct-literal leaf routes through `constStructLiteral` and a nested
array-literal through `constArrayLiteral` with the correct element type.
`constArrayLiteral` derives its element type from the array TypeId;
`constStructLiteral` passes each field's type. A global aggregate initializer that
still does not fully reduce to a compile-time constant is now rejected loudly
(`diagnoseNonConstGlobal`) instead of silently zeroing. `emitConstAggregate`
already recurses over nested aggregates, so `sx run` (JIT) and `sx build` (AOT)
both materialize the declared values.

Regression: examples/0137-types-global-aggregate-literal-init.sx (global
[N]Struct literal, global struct literal, struct-with-array, nested
array-of-struct-with-array; values read back with no prior store, plus a store on
top). Fails on the pre-fix compiler (array-of-struct fields read 0), passes after.

Marks issues 0079 (already resolved) and 0080 RESOLVED.
2026-06-04 04:04:40 +03:00
agra
7306d37748 fix(ir): store to module-global array element targets live storage (issue 0079)
A store to a module-global array element (`g[i] = v`) was silently dropped:
a subsequent `g[i]` read the array's initializer, not `v`. Constant index,
variable index, and cross-function stores were all affected, in both `sx run`
and `sx build`. Global scalars and local arrays were fine.

Root cause: `Lowering.lowerExprAsPtr` (the lvalue/address path) handled only
local identifiers. A module-global identifier fell through to the value
fallback `lowerExpr`, which emits `global_get` — loading the whole array by
value. The LLVM backend's `emitIndexGep` then allocas a throwaway temp, copies
the value in, and GEPs into the temp, so the store wrote a discarded copy.

Fix: teach `lowerExprAsPtr`'s identifier arm about globals — emit `global_addr`
(a pointer into the global's live storage), or `global_get` for a pointer-typed
global (mirroring the local pointer case). Route the `address_of(index_expr)`
array base through `lowerExprAsPtr` too so `&g[i]` is likewise an lvalue into
the global. `index_gep` now GEPs directly into the global for const and variable
index, across functions. This also fixes global struct field stores, which
shared the same root cause.

Regression: examples/0136-types-global-array-element-store.sx (const-index,
var-index, cross-function store on a scalar global array; struct-element array
for stride; nested-array global for the recursive lvalue). Fails on the pre-fix
compiler, passes after.
2026-06-04 03:44:19 +03:00
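Note on 7306d37748: the difference between storing through a by-value load and storing through the global's address can be shown with a small Python model. The `Module` class and both store helpers are hypothetical. A list copy stands in for the throwaway temp and a list reference stands in for `global_addr`.

```python
class Module:
    """Toy model of module-global storage."""

    def __init__(self):
        self.globals = {"g": [0, 0, 0]}

    def global_get(self, name):
        # By-value load: the caller receives a copy of the whole array.
        return list(self.globals[name])

    def global_addr(self, name):
        # Address of live storage: the caller can mutate the global itself.
        return self.globals[name]


def store_via_value(module, name, index, value):
    """Pre-fix shape: the store lands in a temporary copy and is lost."""
    temp = module.global_get(name)
    temp[index] = value


def store_via_addr(module, name, index, value):
    """Post-fix shape: the store goes straight into the global's storage."""
    module.global_addr(name)[index] = value
```

After `store_via_value` the global is unchanged, which matches the symptom of a later read returning the initializer rather than the stored value.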
agra
483b14015f Merge branch 'flow/sx-foundation/F3.1' into dist-foundation
2026-06-04 03:32:37 +03:00
agra
e7f5bd7aaa F3.1: std.cli os_args — real OS argv accessor via #foreign _NSGetArgv (examples/0716)
Add library/modules/std/cli.sx: a pure-sx command-line argument accessor
backed by the macOS C runtime (_NSGetArgv/_NSGetArgc), no compiler change.

  os_argc() -> s64
  os_args(buf: []string) -> []string

Zero heap, zero per-arg allocation: os_args fills a caller-provided buffer
(stack array) with string VIEWS over the process's own argv block, which
lives for the whole process. The returned slice header is a by-value stack
return; nothing touches context.allocator.

Documents the `sx run` reality: under `sx run <prog.sx> ...` the process
argv is the interpreter's argv (sx, run, prog.sx, ...), not a program's
logical args. This accessor reports the real process argv truthfully;
mapping to logical args is a later consumer concern (distribution P3.1).

Non-macOS platforms bail loudly (message + _exit) rather than returning a
silent empty.

examples/0716-modules-cli-argv.sx asserts only deterministic structural
invariants (argc >= 1, argv[0] non-empty, os_argc() == filled length).
2026-06-04 03:21:41 +03:00
agra
090bdd7cfa Merge branch 'flow/sx-foundation/F2.3' into dist-foundation
2026-06-04 03:04:16 +03:00
agra
1905d35507 F2.3: pin std.json round-trip + malformed-input suite (examples/0715)
Add 0715-modules-json-suite as the single comprehensive pinned suite for
std.json (mirrors 0711 for std.hash), alongside the focused 0713/0714 demos:

- ROUND-TRIP build->write->parse->write over a document covering EVERY value
  kind (a string with every escape form \" \\ \b \f \n \r \t plus a \u00XX
  control, integers 0 / negative / s64 MIN / s64 MAX, bool, null, array,
  nested object) with insertion-order assertions, exact writer bytes, and
  parse-then-rewrite idempotence.
- DECODE positives: \/, the full named-escape set, \uXXXX (BMP 1- and 2-byte)
  plus a surrogate pair, the escaped control forms, and raw multi-byte UTF-8
  round-tripping through writer + reader.
- MALFORMED matrix: one assertion per JsonParseError variant and its key
  edges (UnexpectedToken, UnexpectedEnd, BadEscape, BadNumber incl. leading
  zero / lone '-' / fraction / exponent / overflow, TrailingGarbage,
  BadControlChar), each asserted to raise.

Pure test work: src/ and library/ untouched, no json.sx change needed. Every
model is built through an explicit Arena allocator (heap discipline).
2026-06-04 02:57:32 +03:00
agra
dc2a6a0a87 Merge branch 'flow/sx-foundation/F2.2' into dist-foundation
2026-06-04 02:42:10 +03:00
agra
2871342c0a F2.2: reject raw control bytes (U+0000..U+001F) in JSON strings
parse_string scanned for `"` and `\` but accepted every other byte,
including raw control characters. RFC 8259 §7 requires those bytes to be
escaped inside a string; an unescaped one is invalid JSON and must surface
a parse error, not be silently accepted.

Add `BadControlChar` to JsonParseError and reject any unescaped byte < 0x20
in the string body scan (which gates the decode path too, so escaped forms
like \t/\n/\u0000 still decode correctly; 0x20 and 0x7F are not over-rejected).

Regression test in examples/0714: raw 0x09/0x0A/0x00 each raise
BadControlChar via `?`/`!`; a positive case proves the escaped forms still
decode to the right bytes. All prior assertions kept.
2026-06-04 02:32:32 +03:00
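Note on 2871342c0a: the string-body scan with the added control-byte check could look like this Python sketch. The function name `scan_string_body` and the use of Python exceptions are assumptions. Only the rule itself (an unescaped byte below 0x20 is rejected, while 0x20 and 0x7F pass) comes from the commit message.

```python
class BadControlChar(ValueError):
    pass


def scan_string_body(data: bytes, start: int) -> int:
    """Return the index of the closing quote of a JSON string body.

    `start` points just past the opening quote. Escapes are skipped here,
    not decoded. A raw byte below 0x20 raises BadControlChar.
    """
    i = start
    while i < len(data):
        byte = data[i]
        if byte == 0x22:          # closing double quote
            return i
        if byte == 0x5C:          # backslash: skip the escaped byte as well
            i += 2
            continue
        if byte < 0x20:           # raw control byte must have been escaped
            raise BadControlChar(f"raw control byte 0x{byte:02x} at offset {i}")
        i += 1
    raise ValueError("unexpected end of input inside string")
```

Because the check sits in the scan that finds the end of the string, an escaped form such as a backslash followed by `t` is skipped as a two-byte escape and never reaches the control-byte test.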
agra
301e966bcf F2.2: un-workaround 0714 — combine string == under and/or (0078 fixed)
Issue 0078 (string == as an and/or operand emitting an invalid PHI) is
resolved on this branch, so the example no longer needs the split that
worked around it. Restore the natural combined assertion
  sub.items[0].key == "k" and sub.items[0].val.str == "v"
(one nested-pair report), and the in_range containment helper to
  return x >= lo and x < hi;
Drop the now-stale issues/0078 references. Re-captured expected stdout
(nested-key/nested-val -> nested-pair). json.sx and src/ untouched.
2026-06-04 02:17:22 +03:00
agra
0e7bdc7c11 Merge branch 'dist-foundation' into flow/sx-foundation/F2.2
# Conflicts:
#	issues/0078-string-eq-operand-of-short-circuit-and-invalid-phi.md
2026-06-04 02:10:42 +03:00
agra
1d92046b7c Merge branch 'flow/sx-foundation/F0.2' into dist-foundation
2026-06-04 02:09:45 +03:00
agra
7c1b90519f fix(emit): PHI predecessor for and/or operand that emits sub-CFG (issue 0078)
A string `==`/`!=` used as an operand of a short-circuit `and`/`or` emitted
invalid LLVM (`PHI node entries do not match predecessors!`). String compares
expand into their own memcmp sub-CFG during LLVM emission, so the operand
finishes in a later basic block (`str.merge`) than the one the IR block
started in. `fixupPhiNodes` wired the short-circuit merge PHI's incoming edge
to `block_map[ir_block]` (the block the IR block started as), recording a
stale predecessor (`%entry`/`%and.rhs.0`).

Fix: record the builder's actual insertion block after emitting each IR
block's instructions (`term_block_map`, via `LLVMGetInsertBlock`) and use it
as the PHI predecessor. General — corrects the incoming block for any operand
that emitted intermediate basic blocks (string `==`, value `match`, …), not
just string `==`.

Regression: examples/0045-basic-string-eq-short-circuit.sx (string `==` on
both sides of `and` and of `or`, plus a match-value + enum-payload `==` shape).
Fails (LLVM abort) pre-fix, passes after.
2026-06-04 02:00:13 +03:00
agra
88be541778 F2.2: std/json reader — explicit-alloc parse with error surfacing
Add the JSON reader (parser) to library/modules/std/json.sx, the inverse
of the F2.1 writer over the same value model: insertion-ordered objects,
arrays, strings (full unescaping incl. \uXXXX + surrogate pairs), s64
integers, bool, null.

Heap discipline (binding): exactly two allocation kinds, both through the
EXPLICIT `alloc` parameter, never the implicit context allocator —
composite backing stores (Array/Object.items via add/put) and decoded
escaped-string buffers (bounded by the raw span). Un-escaped string
values are zero-copy VIEWS into the input buffer (valid only while it
lives); scalars carry no heap.

Failure surfacing (hard contract): malformed input raises a meaningful
JsonParseError variant (UnexpectedToken / UnexpectedEnd / BadEscape /
BadNumber / TrailingGarbage) on the error channel, never a bogus value.
Trailing non-whitespace is TrailingGarbage; fractions/exponents,
out-of-s64 magnitudes, and leading zeros are BadNumber. Number
accumulation runs in negative space so s64 MIN parses exactly.

examples/0714-modules-json-reader.sx asserts the parsed structure
(insertion order, every kind), proves the view-vs-decoded heap split by
pointer containment, round-trips back through the writer byte-for-byte,
decodes a surrogate-pair into 4 UTF-8 bytes, and checks every malformed
variant.

Filed issues/0078: a string `==` (or any sub-CFG operand) used in a
short-circuit `and`/`or` emits invalid LLVM IR (stale PHI predecessor),
hit while writing the example's assertions and worked around there by not
combining comparisons with `and`/`or`. src/ untouched.
2026-06-04 01:41:33 +03:00
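Note on 88be541778: accumulating a number in negative space, so that s64 MIN parses exactly, can be sketched in Python as below. The function name `parse_s64` and the exception type are hypothetical. Python integers do not overflow, so the sketch checks the bound explicitly after each step where a fixed-width implementation would check before multiplying.

```python
I64_MIN = -(2**63)


class BadNumber(ValueError):
    pass


def parse_s64(text: str) -> int:
    """Parse a JSON integer into the s64 range, accumulating as a negative."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if not digits or any(c not in "0123456789" for c in digits):
        raise BadNumber("expected digits only (no fraction or exponent)")
    if len(digits) > 1 and digits[0] == "0":
        raise BadNumber("leading zero")
    acc = 0                                  # invariant: acc <= 0
    for c in digits:
        acc = acc * 10 - (ord(c) - ord("0"))
        if acc < I64_MIN:                    # magnitude exceeds s64 MIN
            raise BadNumber("out of s64 range")
    if negative:
        return acc
    if acc == I64_MIN:                       # +9223372036854775808 is not an s64
        raise BadNumber("out of s64 range")
    return -acc
```

The negative range of s64 holds one more value than the positive range, so accumulating downwards lets MIN be represented at every step, and only the final negation for positive inputs needs a guard.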
agra
295d95d51a Merge branch 'flow/sx-foundation/F2.1' into dist-foundation
2026-06-04 01:15:26 +03:00
agra
1d311b871e test(json): pin s64 MIN/MAX writer bytes; move scratch to .sx-tmp
Close the coverage gap from attempt 1: example 0713 now builds integer
fields holding s64 MIN (-9223372036854775808) and s64 MAX
(9223372036854775807) — plus zero, a small negative, and a small positive —
and asserts the EXACT emitted bytes. This permanently pins the edge that
write_int is specifically engineered for (folding positives into negative
space so MIN's non-representable-positive magnitude serializes correctly).

s64 MIN is expressed as (0 - 9223372036854775807 - 1) because its magnitude
is not a representable positive s64 literal.

Test hygiene: stream to a repo-local, gitignored .sx-tmp/ path (created if
missing) instead of a fixed /tmp name, and unlink it right after read-back
so nothing leaks. Writer/model logic and src/ are untouched.
2026-06-04 01:08:14 +03:00
agra
4552ed61f6 std/json: value model + zero-alloc writer with stable key order
Add library/modules/std/json.sx — the JSON value model and writer
(reader lands in a later step).

Value model: a tagged union over null/bool/integer(s64)/string/array/
object. Objects are an ORDERED list of (key,value) pairs preserving
INSERTION ORDER (no hash map, never sorted/deduped). Integers only — no
fraction/exponent this milestone.

Heap discipline:
  - Scalars carry no heap; string values are VIEWS into caller memory
    (never copied into the node).
  - Composite nodes (Array/Object) own growable child storage, allocated
    through an EXPLICIT allocator parameter on the builder methods
    (arr.add(v, alloc) / obj.put(key, val, alloc), mirroring List.append)
    — never the implicit context allocator.
  - The writer adds ZERO output allocations: it emits into a caller-
    provided Sink, either a fixed []u8 buffer (overflow raises, never
    truncates) or streaming straight to an fs.File through a small caller
    staging buffer (no whole-document string; peak memory O(staging)).
    Integer digits format in a stack [20]u8; s64 MIN is handled by
    formatting in negative space. Sink/IO/overflow surface on the !
    error channel.
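
A rough Python sketch of the negative-space digit formatting described in
the list above. It is not the json.sx write_int: the function body is an
assumption that only mirrors the stated idea (a 20-byte scratch buffer,
filled from the right, with the value held as a non-positive number).

```python
def write_int(value: int) -> bytes:
    """Format an s64 as decimal ASCII without ever negating s64 MIN."""
    buf = bytearray(20)   # "-9223372036854775808" is exactly 20 bytes
    pos = len(buf)
    negative = value < 0
    # Fold positives into negative space; negatives are used as-is.
    n = value if negative else -value
    while True:
        q, r = divmod(n, 10)   # Python floors: r is in 0..9
        if r:
            q += 1             # convert to truncation toward zero
            r -= 10            # r is now in -9..0
        pos -= 1
        buf[pos] = ord("0") - r
        n = q
        if n == 0:
            break
    if negative:
        pos -= 1
        buf[pos] = ord("-")
    return bytes(buf[pos:])
```

In a language whose integer division already truncates toward zero, the
`divmod` adjustment is unnecessary; it is only there because Python floors.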

examples/0713-modules-json-writer.sx builds a nested object + array +
string with every escape kind + negative int + bool + null, then asserts
the EXACT bytes (insertion order, escaping) from both the buffer sink and
the file-streaming sink, plus the overflow-raises path.
2026-06-04 00:47:30 +03:00
agra
9bf07e0c5f Merge branch 'flow/sx-foundation/F1.2' into dist-foundation 2026-06-04 00:20:31 +03:00
agra
f9bc593bb8 F1.2: std.hash zero-heap [64]u8 hex API + chunked file + pinned vectors
Make the SHA-256 digest path allocation-free (foundation heap-discipline):

- final() and sha256_hex() now return the 64-char lowercase hex digest as
  a [64]u8 by value on the stack; the cstring(64) heap allocation is gone.
- sha256_file() streams the file in fixed 64KB stack chunks via open_file/
  File.read/File.close (defer-closed on every path) instead of slurping it
  with read_file; peak memory is O(chunk), not O(filesize).

Tests (compare via a zero-copy string view over the [64]u8):
- 0710 updated to the by-value API (output unchanged).
- 0711 known-answer vectors: "", "abc", NIST-56/112, padding boundaries
  {0,55,56,57,63,64,65,119,120}, and 1000 / 1,000,000 'a' repeats, each
  pinned to its published digest (cross-checked with shasum -a 256).
- 0712 streaming equivalence (one-shot == byte-at-a-time == split-mid-block
  == split-on-boundary) plus sha256_file(temp) == in-memory digest.
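
The chunked-file and streaming-equivalence properties can be illustrated
with Python's hashlib. This is an analogue of what 0712 checks, not the sx
test itself; `sha256_streamed` is a hypothetical helper, and `sha256_file`
here only borrows the name and the 64KB chunk size from the message.

```python
import hashlib

CHUNK = 64 * 1024  # fixed chunk size, so peak memory is O(chunk)


def sha256_file(path: str) -> str:
    """Hash a file by streaming it in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:   # closed on every exit path
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


def sha256_streamed(data: bytes, step: int) -> str:
    """Feed data to the hasher `step` bytes at a time."""
    h = hashlib.sha256()
    for i in range(0, len(data), step):
        h.update(data[i:i + step])
    return h.hexdigest()
```

Whatever the split points are (byte-at-a-time, mid-block, or exactly on a
64-byte block boundary), the digest should equal the one-shot digest of the
same bytes.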

src/ untouched. zig build && zig build test && tests/run_examples.sh green.
2026-06-04 00:08:46 +03:00
agra
ee1e097335 Merge branch 'flow/sx-foundation/F1.1' into dist-foundation 2026-06-03 22:47:40 +03:00
agra
8f9691c206 F1.1: std.hash — streaming SHA-256 in library/modules/std/hash.sx
Add a pure-sx streaming SHA-256 (FIPS 180-4) stdlib module, importable
as `#import "modules/std/hash.sx";`. All 32-bit word arithmetic is done
in s64 and masked back with `& MASK32`, so digests are deterministic and
platform-independent — no shelling out, no native crypto.
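
A small Python sketch of the mask-back technique, assuming a wider integer
type than 32 bits (Python's unbounded int stands in for s64 here). The
helper names are hypothetical; `MASK32` is the only name taken from the
message.

```python
MASK32 = 0xFFFFFFFF


def add32(*words: int) -> int:
    """Add 32-bit words in a wider type, then wrap modulo 2**32."""
    return sum(words) & MASK32


def rotr32(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits, for 0 < n < 32."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(x: int) -> int:
    """FIPS 180-4 Sigma0 for SHA-256: ROTR 2 ^ ROTR 13 ^ ROTR 22."""
    return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22)
```

Masking after every addition and rotation keeps each intermediate value in
0..2**32-1, so the result does not depend on the width or signedness of
the host integer type.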

API:
- init() -> Sha256          (by-value *self pattern)
- update(*Sha256, string)   (multi-block + partial-block buffering)
- final(*Sha256) -> string  (32-byte digest as lowercase hex)
- sha256_hex(string) -> string             (one-shot)
- sha256_file([:0]u8) -> ?string           (digest of a file via fs.read_file)

Verified against FIPS/NIST known-answer vectors and `shasum -a 256`:
"", "abc", the 56- and 112-byte multi-block vectors, 1000×'a', and the
64/65-byte block boundaries; chunked update() matches the one-shot call.

examples/0710-modules-sha256.sx pins the KAT vectors + the streaming
invariant; gate green (zig build, zig build test, run_examples 370/0/0/0).
2026-06-03 22:38:58 +03:00
agra
a89a5f8d18 Merge branch 'flow/sx-foundation/F0.1' into dist-foundation 2026-06-03 22:18:43 +03:00
agra
6433eb6155 fix(diagnostics): point reserved-type-name binding errors at the binding (issue 0076)
The reserved-type-name binding diagnostic fired correctly but underlined the
enclosing statement / if / while / for / match / protocol / #objc_class block
because every binding-name check reused the parent `node.span`.

Thread each binding name's own span through the AST and parser, and pass it to
`checkBindingNames`:

- ast: add name spans to VarDecl, DestructureDecl, If/WhileExpr, ForExpr
  (capture + index), MatchArm, Catch/OnFailStmt, Protocol/ForeignMethodDecl.
- parser: populate each span at the binding site from the name token's loc;
  destructure reuses each target identifier's own span.
- semantic_diagnostics: every checkBindingName call now passes the binding's
  own span — no site falls back to node.span. fn/lambda params already used
  Param.name_span.

Carets now land on the offending identifier itself. New regression
examples/1125 asserts the protocol default-body and sx-defined #objc_class
method param spans; expected output for 0125/1119-1124 updated to the
precise carets.
2026-06-03 22:06:56 +03:00
agra
fcc76b9391 fix(diagnostics): make reserved-type-name binding check exhaustive (issue 0076)
The reserved/builtin-type-name binding diagnostic was a hand-walked subset
of binding-bearing AST nodes with a silent `else => {}`, so each review
found another syntactic binding form that bypassed it and hit the original
LLVM verifier abort: destructure names (`s2, x := …`), `impl` method
params/locals, and `if` / `while` / `for` / match-arm / `catch` / `onfail`
captures.

Rewrite `checkBindingNames` (src/ir/semantic_diagnostics.zig) as an
EXHAUSTIVE `switch` over every `Node.Data` tag with NO `else` arm — a future
binding-bearing node type now fails to compile until it is handled here, so
coverage is enforced by the compiler instead of a hand-maintained list. The
check stays in the pre-lowering semantic pass rather than moving to the
`Scope.put` scope-registration choke point: lowering is lazy, so an
uncalled function's bindings never reach `Scope.put`, yet they must still be
rejected at their declaration (e.g. the never-called `takes_u8` in 1119).
No lowering special-case; `lower.zig` unchanged.

Regression tests (fail-before: LLVM abort or silent accept → pass-after:
clean diagnostic, exit 1):
- 1121 control-flow: destructure, if/while bindings, for capture+index,
  match-arm capture
- 1122 impl-block method: reserved param AND reserved local
- 1123 catch + onfail tag bindings
- 1124 destructure name reserved in an imported module
Existing 0125 / 1119 / 0135 / 1120 tests kept; full suite 368 passed.
2026-06-03 20:09:46 +03:00
agra
df6e830bec fix(diagnostics): reject reserved type-name bindings in every module (issue 0077)
The issue-0076 reserved-type-name binding diagnostic only ran over main-file
decls, so an imported module (or the stdlib) could still declare `s2 := ...`
and reach lowering, where the address-of family loads the whole aggregate and
passes it by value to a `ptr` param — LLVM verifier abort.

Extend coverage to every compiled module: a dedicated `checkBindingNames` walk
(in semantic_diagnostics.zig) visits every var/`:=`/typed-local binding name and
function/lambda/struct-method parameter at any depth, with NO main-file filter,
descending the `namespace_decl` that a `mod :: #import` wraps so imported-module
decls are reached. It tracks each module's source_file (save/restore per node)
so the diagnostic renders against the imported module's text. Rejection still
defers to the parser's `Type.fromName` classifier; the unknown-type check (0064)
stays main-file-only. No lowering special-case; `.identifier`-only address-of
paths are unchanged.

Stdlib audit: the only reserved-name bindings under library/ were two `u1`
locals in ui/renderer.sx (UV coords) — renamed to u_min/u_max/v_min/v_max.

Regression test: examples/1120-diagnostics-imported-reserved-type-name.sx (+
companion mod.sx) — an imported `s2 := ...` now emits the clean diagnostic at
the import's declaration site (exit 1), not an LLVM abort.

Resolves issues 0076 (coverage extension) and 0077.
2026-06-03 19:32:49 +03:00
agra
f49a49cd07 fix(diagnostics): reject reserved/builtin type names used as identifiers (issue 0076)
A value binding (local/global `var` or a parameter) spelled as a
reserved/builtin type name parses as a `.type_expr` rather than an
`.identifier` (parser.zig, via `Type.fromName`), so the address-of
family in lower.zig never saw a scoped local and mis-lowered it —
loading the aggregate and passing it by value to a `ptr` parameter
(LLVM verifier abort, or a silent `*self`-mutation-losing copy).

Add a declaration-site diagnostic in semantic_diagnostics.zig
(`UnknownTypeChecker.checkBindingName`): reject any parameter name or
`var` binding name (`:=` / typed-local / global forms) whose spelling
collides with a reserved type name. `isReservedTypeName` defers to the
parser's own classifier (`types.Type.fromName`) so the rejected set
never drifts from the set that would parse as a type — the named
builtins (bool/string/void/f32/f64/usize/isize/Any) and `[su]N` over
sx's 1-64 range. Bare value names (`s`, `self`, `index`) are untouched.
No lowering special-case; the `.identifier`-only address-of paths are
correct once type-shaped names can never be bound. The rejected
attempt-1 `bareVarName` approach never landed.

Tests:
- 0125-types-type-named-var-rejected: `:=` form (s2) rejected
  (repurposed from the old test that asserted the now-illegal behavior).
- 1119-diagnostics-reserved-type-name-as-identifier: parameter (u8),
  typed-local (s64, bool), `:=` (string) forms rejected.
- 0135-types-self-streaming-nonreserved: positive — `*self` streaming
  with non-reserved names accumulates correctly via both call styles.
- 0904-optionals: renamed incidental locals s1/s2 -> filled/empty.
2026-06-03 19:00:39 +03:00
agra
4ab3608f77 Merge branch 'docs/trace-output-repair' 2026-06-03 16:55:00 +03:00
agra
99a5c781a0 docs: fix stale error-trace output format + markers
The trace docs predated the current formatter. Corrected against the real
output (library/modules/trace.sx to_string + examples/expected/1025-errors-
trace-format.stderr):
- error-handling.md: replace the obsolete trace example ("error trace:" /
  "raised error.X" / "at func (file:line)") with the real format —
  "error return trace (most recent call last):" + per-frame "func at
  file:line:col" + source line + caret.
- debugger.md: drop the stale "(planned)" marker on the trace formatter
  (it is implemented); the tag-name table note now cites the failable-main
  reporter's "unhandled error reached main: error.X" line, not a
  nonexistent "raised error.X" trace line.
2026-06-03 16:54:36 +03:00
agra
973543ddf8 Merge branch 'arch-refactor' 2026-06-03 16:34:16 +03:00
agra
1148362353 Merge branch 'flow/sx-plan-arch/fix-0075' into arch-refactor 2026-06-03 16:12:39 +03:00
agra
aca077d720 fix(reflection): replace silent .s64 arg-type fallback with loud .unresolved (issue 0075)
The `type_name` / `type_eq` reflection builtins resolved their Type arg's IR
type via `getRefIRType(...) orelse TypeId.s64`, then gated `== .any`. A failed
must-succeed lookup silently became `.s64` (`!= .any`), classifying a boxed
`Any` arg as bare i64 and reading the wrong value with no diagnostic.

Add the sibling classifier `LLVMEmitter.reflectArgRepr`, which routes the
lookup through `argIRTypeOrFail` (the issue-0074 `.unresolved` resolver) and
returns `{ boxed, bare, unresolved }`. The three emit sites in ops.zig
(`type_name` + `type_eq` x2) now switch on it: `.boxed` extracts the Any value
field, `.bare` uses the value directly, `.unresolved` hits a hard `@panic`
tripwire — never silently treated as bare. Real args always resolve, so the
happy path is byte-identical (suite stays 361/0, zero snapshot churn).

Secondary `lower.zig` `null_literal`/`undef_literal => target_type orelse .void`
confirmed intentional (typeless-literal default deliberately handled by
emitConstNull/emitConstUndef as null-ptr / undef-i64) — left with an invariant
comment, not the `.unresolved` tripwire.

Regression test in emit_llvm.test.zig asserts the loud path: fail-before with
`orelse .s64` yields `.bare`; pass-after yields `.unresolved`.
2026-06-03 16:05:31 +03:00
agra
759e3caa5e Merge branch 'flow/sx-plan-arch/fix-0074' into arch-refactor 2026-06-03 15:55:39 +03:00
agra
633c0a2540 docs(issues): file 0075 — silent .s64 type fallback in reflection builtins
Discovered during the 0074 fix + a codebase-wide silent-type-fallback sweep.
getRefIRType(...) orelse TypeId.s64 at ops.zig:1023/1049/1055 (type_name/type_eq).
Blocker; to be resolved before the arch-refactor stream closes.
2026-06-03 15:55:32 +03:00
agra
4537538bb2 fix(ffi): replace silent .void arg-type fallback with loud .unresolved (issue 0074)
Four FFI call-arg lowering sites resolved an argument's IR type via
`getRefIRType(arg_ref) orelse .void` — a silent fallback to the load-bearing
real type `.void`. A failed lookup there is a codegen invariant violation, but
`.void` is treated by downstream `toLLVMType` → `abiCoerceParamType` →
`coerceArg` as a legitimate void-typed foreign argument, corrupting the call
ABI with no diagnostic.

Add one shared resolver `LLVMEmitter.argIRTypeOrFail` that returns the
dedicated `.unresolved` sentinel on a failed lookup — never `.void`/`.s64` — so
the failure cannot masquerade as a real type and trips `toLLVMType`'s existing
hard `@panic` tripwire at the call site. Route all four sites through it:
  - src/ir/emit_llvm.zig          JNI constructor (NewObject) arg loop
  - src/backend/llvm/ops.zig      objc_msgSend arg loop
  - src/backend/llvm/ops.zig      JNI non-virtual call arg loop
  - src/backend/llvm/ops.zig      JNI Call<Type>Method arg loop
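
The sentinel pattern can be sketched in Python. This is an analogy, not
the Zig code: the functions below are hypothetical counterparts of
`argIRTypeOrFail` and `toLLVMType`, and the type table is a plain dict.

```python
from enum import Enum, auto


class TypeId(Enum):
    VOID = auto()
    S64 = auto()
    UNRESOLVED = auto()   # sentinel: never a real argument type


def arg_ir_type_or_fail(ref_types: dict, arg_ref: int) -> TypeId:
    """Return the recorded type, or the sentinel on a failed lookup."""
    return ref_types.get(arg_ref, TypeId.UNRESOLVED)


def to_llvm_type(type_id: TypeId) -> str:
    """Map an IR type to an LLVM type name; the sentinel is a hard error."""
    if type_id is TypeId.UNRESOLVED:
        raise RuntimeError("unresolved IR type reached codegen")
    return {TypeId.VOID: "void", TypeId.S64: "i64"}[type_id]
```

With a fallback to `VOID`, a failed lookup would pass through
`to_llvm_type` as a legitimate type; with the sentinel it cannot.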

Happy path is byte-identical (every real arg already has a resolved type); FFI
examples stay green with zero snapshot churn.

Regression test (fail-before/pass-after) in src/ir/emit_llvm.test.zig asserts an
unresolvable FFI arg ref now yields `.unresolved`, not the old silent `.void`.
2026-06-03 15:43:27 +03:00
agra
6f4b872254 Merge branch 'flow/sx-plan-arch/A9.2' into arch-refactor 2026-06-03 15:19:02 +03:00