Files
sx/issues/0083-named-const-array-dimension-miscompiled.md
agra a7dcb23b70 fix(ir): poison type-fn binder on failed value-param bind (0083)
A failed value-param bind on a type-returning function (e.g.
`MakeC :: ($K: Count, $T: Type) -> Type { return [K]T; }` with
`a : MakeC(5_000_000_000, s64)`) emitted its correct range diagnostic
but then `instantiateTypeFunction` returned `null`, so
`resolveParameterizedWithBindings` fell through to an empty-struct
placeholder named after the function. The binding `a` got that
placeholder type, so a later `a.len` cascaded a bogus second error
`field 'len' not found on type 'MakeC'`.

The struct binder (`instantiateGenericStruct`) already returns
`.unresolved` here; the type-fn binder now matches it — a failed
value-param bind poisons to `.unresolved` instead of `null`, so the
caller propagates the diagnosed poison and the existing
`emitFieldError` suppression yields one clean diagnostic. Covers
every type-fn value-param failure mode: overflow via an aliased
constraint, a non-const arg, and an unknown type arg.

Regression: examples/1137-diagnostics-value-param-type-fn-no-cascade.sx
2026-06-04 14:38:18 +03:00

243 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 0083 — fixed array with a named-constant dimension is miscompiled
> **RESOLVED.** Root cause: `TypeResolver.resolveCompound`'s array arm resolved
> the dimension with `if (length.data == .int_literal) ... else 0` — a named
> const (`N :: 16`) hit the silent `else 0`, so `[N]T` became a 0-length / 0-byte
> array and element access ran out of bounds (garbage for scalars, bus error for
> slice/pointer/struct elements). Fix: the array arm now delegates the dimension
> to `inner.resolveArrayLen` (symmetric with `inner.resolveInner` for the element
> type). The stateful `Lowering.resolveArrayLen` evaluates the dimension as a
> compile-time integer across the comptime-constant, generic-value, and
> module-global const tables, and emits a diagnostic (no fabricated length) when
> it isn't one.
>
> **Exhaustive follow-up (attempt 2).** The first fix covered every *stateful*
> resolution path (direct local decls, struct fields, function params/returns),
> but the *stateless* registration-time resolver (`type_bridge`, used for type
> aliases `Arr :: [N]T` and inline union/enum field types) still resolved the
> named dim with a silent `else 0` — so `Arr :: [N]s64; a : Arr` and
> `union { a: [N]s64 }` were still miscompiled. Fix: the module-global const
> table (`ProgramIndex.module_const_map`) is now threaded into `type_bridge`
> alongside the alias map, so `StatelessInner.resolveArrayLen` resolves a named
> module-const dim to the same length everywhere. The remaining unresolvable case
> (a computed/comptime dimension on the binding-free path) bails LOUDLY instead of
> fabricating a 0 length. Files: `src/ir/type_resolver.zig`, `src/ir/lower.zig`,
> `src/ir/type_bridge.zig`. Regression: `examples/0140-types-named-const-array-dim.sx`
> (direct + type-alias + nested `[N][M]T` + union-field dims, s64 / string /
> struct element types).
>
> **Root-cause close-out (attempt 3).** Attempt 2 threaded the const map into
> `type_bridge` but the map wasn't fully populated when an alias resolved its
> dimension: type aliases (`Arr :: [N]T`) resolve EAGERLY in scanDecls pass 1,
> while TYPED consts (`N : s64 : 16`) register only in pass 2 and a
> forward-declared untyped const (`Arr :: [N]T; N :: 16`) hadn't registered yet
> either — so the stateless resolver saw an empty table, printed a non-fatal
> warning, fabricated length 0, and CONTINUED to garbage / a segfault. Three
> coordinated fixes: (1) a scanDecls **pass 0** pre-registers every integer-valued
> module const into `module_const_map` BEFORE any alias resolves, so typed,
> untyped, and forward-referenced consts all resolve identically; (2) both the
> stateful and stateless dim resolvers now share one routine
> (`program_index.moduleConstInt`) so they cannot disagree again; (3) the length-0
> fabrications are GONE — `resolveArrayLen` returns `?u32`, `resolveCompound`
> yields the `.unresolved` sentinel on null (never a 0-byte array), the stateful
> path emits a diagnostic, and the registration path surfaces an unresolved alias
> as a clean compile error that aborts the build (the `type_bridge.zig:270`
> Vector-lane `else => 0` is fixed the same way). Files:
> `src/ir/program_index.zig`, `src/ir/lower.zig`, `src/ir/type_bridge.zig`,
> `src/ir/type_resolver.zig`. Regressions:
> `examples/0143-types-typed-const-array-dim.sx` (typed-const dim direct + via
> alias for s64/string/struct, forward-ref alias, nested) and
> `examples/1129-diagnostics-array-dim-not-const.sx` (an unresolvable computed dim
> halts with a clean diagnostic + non-zero exit, not a fabricated 0-length array).
>
> **Const-expression dimensions (attempt 4).** Attempts 13 resolved only a BARE
> named-const dim (`[M]`) or a literal (`[5]`); any constant-FOLDABLE *expression*
> dimension (`[M + 1]`, `[M * N]`, `[N - M]`, nested `[M + N - 1]`, parenthesised
> `[(M + 1) * 2]`) was wrongly rejected as "not a compile-time integer constant"
> even though every operand is compile-time-known. Such a dimension MUST be
> evaluated, not rejected. Fix: the shared dim resolver now routes the dimension
> through a single constant integer-expression evaluator
> (`program_index.evalConstIntExpr`) that folds integer `+ - * / %` and unary
> negate (parentheses carry no AST node) over literals and named/typed module
> consts, recursively. The leaf-name lookup is delegated (`ctx.lookupDimName`) so
> the stateful body-lowering path and the stateless registration path share the
> EXACT SAME folding logic and cannot diverge — an expression dim via a type alias
> resolves identically to the direct form. The no-fabrication discipline is
> unchanged: a genuinely non-comptime dimension (a runtime local, a non-comptime
> call, an unbound name) — or arithmetic that overflows / divides by zero — still
> yields null → `.unresolved` → the same clean compile-halting diagnostic, never a
> fabricated length. Files: `src/ir/program_index.zig` (+`.test.zig`),
> `src/ir/lower.zig`, `src/ir/type_bridge.zig`. Regression:
> `examples/0144-types-const-expr-array-dim.sx` (every expression form, direct vs
> alias, scalar / string / struct element types); `1129` re-pointed at a genuinely
> non-const dimension (`[get()]s64`, a runtime call) so it still proves the
> stateless clean-halt.
>
> **Unified comptime-int evaluator (attempt 5).** Attempts 14 fixed the array
> *dimension* paths but the SAME length-0 fabrication class survived on the
> siblings that resolve a comptime integer elsewhere: the three Vector lane
> resolvers (`resolveTypeCallWithBindings`, `resolveParameterizedWithBindings`,
> `resolveArrayLiteralType`) and the two generic value-param binders
> (`instantiateGenericStruct`, `instantiateTypeFunction`) each hand-rolled an
> `else => 0` switch, so `Vector(N, f32)` / `Vec(N, f32)` (N a module const)
> fabricated a 0-lane `<0 x float>` (LLVM "huge alignment" abort) or a 0 binding
> under a wrong mangled name; and the `inline for` bound folder (`evalComptimeInt`)
> only knew literals / comptime cursors / `<pack>.len`, so `inline for 0..M` failed
> outright. Fix: every one of those sites now routes through the single shared
> `program_index.evalConstIntExpr` — `evalComptimeInt` delegates to it (the pack
> `.len` leaf moved into the shared folder via a new `ctx.lookupPackLen`); the
> Vector lane and value-param resolvers fold through it and emit a clean diagnostic
> + `.unresolved` (never `else => 0`) on a non-const operand. Two enabling fixes
> upstream of resolution: the unknown-type semantic checker no longer walks a
> value-param position (`Vector(N, …)` / `Vec(N, …)`) as a type name (it was
> reporting "unknown type 'N'"); and both the parameterized-type-arg parser and
> the function-body-detection lookahead (`hasFnBodyAfterArrow`) accept a
> const-EXPRESSION in a value position, so `Vector(M + 1, f32)` and `[M + 1]T`
> parse as a return type too (the latter a pre-existing attempt-4 sibling miss).
> Files: `src/ir/program_index.zig` (+`.test.zig`), `src/ir/lower.zig`,
> `src/ir/type_bridge.zig`, `src/ir/semantic_diagnostics.zig`, `src/parser.zig`.
> Regressions: `examples/1501-vectors-const-lane.sx` (named-const + const-expr
> lane, direct + alias, 3- and 4-lane reads), `examples/1502-vectors-runtime-lane-
> not-const.sx` (a runtime lane clean-halts, exit 1, no LLVM crash),
> `examples/0207-generics-value-param-const.sx` (`Vec(N,f32)` / `Vec(M+1,f32)`
> resolve to the same instantiation as `Vec(3,f32)`),
> `examples/0610-comptime-inline-for-const-bound.sx` (`inline for 0..M` and
> `0..(M+1)` unroll).
>
> **Value-param type functions + oversized guard (attempt 6).** Two remaining
> siblings in the comptime-int path. (1) A type-RETURNING function with a value
> param used as a TYPE annotation (`b : Make(N, s64)` where `Make :: ($K: u32,
> $T: Type) -> Type { return [K]T; }`) was rejected "unknown type 'N'" because
> the unknown-type checker walked the value-param position as a type name, AND the
> parameterized-type-annotation path never routed to `instantiateTypeFunction`
> (only the `.call` path did), nor did that binder resolve a non-struct/union
> return shape. Fix: `isValueParamPosition` (semantic_diagnostics.zig) now also
> skips a value param of a `fn_ast_map` type-returning function (mirroring the
> binder's value/type classification); `resolveParameterizedWithBindings` routes
> a type-returning-function name to `instantiateTypeFunction`; and that binder
> resolves a general return-type expression (`return [K]T`) with bindings active.
> `Make(N, s64)`, `Make(M + 1, s64)`, and `Make(3, s64)` now resolve to one
> `[3]s64`. (2) Oversized dim/lane folds (`[5_000_000_000]`) panicked the
> compiler — fixed under issue 0087 via the shared range-checked
> `program_index.foldDimU32` gate. Files: `src/ir/semantic_diagnostics.zig`,
> `src/ir/lower.zig`, `src/ir/program_index.zig`, `src/ir/type_bridge.zig`.
> Regression: `examples/0208-generics-value-param-type-function.sx`.
>
> **Diagnostic-accuracy parity (attempt 7).** The fold + layout were correct, but
> the two paths still DIVERGED on the error MESSAGE for an oversized dim. The
> direct form (`a : [5_000_000_000]s64`) reported the accurate "array dimension
> 5000000000 does not fit in u32" (from the stateful `resolveArrayLen`, which
> branches on `foldDimU32`'s `.too_large` / `.below_min` / `.not_const` variants),
> but the type-ALIAS form (`Big :: [5_000_000_000]s64`) reported a FALSE "an array
> dimension is not a compile-time integer constant" — because the stateless
> `resolveArrayLen` collapsed every non-`.ok` `DimU32` to `null`, so the
> alias-registration site had only one generic message to emit. Fix: a single
> wording source `program_index.reportDimError(diag, span, DimU32)` now owns the
> dim-error text; the stateful path emits through it, and the alias-registration
> site re-folds a top-level array dim via the new `type_bridge.foldArrayDim`
> (same shared `foldDimU32`) and routes a `.too_large` / `.below_min` result to
> `reportDimError` — so an oversized alias dim now reports the SAME precise
> message as the direct form. A genuinely non-const alias dim (`[get()]`) still
> gets the alias-specific "not a compile-time integer constant" message (1129).
> Files: `src/ir/program_index.zig`, `src/ir/type_bridge.zig`, `src/ir/lower.zig`.
> Regression: `examples/1131-diagnostics-array-dim-oversized-u32-alias.sx`
> (oversized dim via alias → "does not fit in u32", matching direct example 1130;
> 1129 still proves the non-const path keeps the generic message).
>
> **Integral-float counts + value-param range gate (attempt 8, Agra ruling).**
> Two finishing items on the shared count path. (1) An *integral* compile-time
> FLOAT used as a count (array dim, Vector lane, value-param, `inline for` bound)
> was wrongly rejected — `N : f64 : 4.0`, `N :: 4.0`, and `[4.0]s64` all said
> "must be a compile-time integer constant". The shared evaluator now folds an
> integral float to its integer at the single leaf
> (`program_index.floatToIntExact`, used by both the `.float_literal` arm of
> `evalConstIntExpr` and `moduleConstInt`), so every consumer accepts `4.0` ≡ `4`
> while a non-integral (`4.5`) or negative value is still rejected by the
> downstream `foldDimU32` gate. (2) A generic value-param bind (`Box($K: u32)`)
> never range-checked the folded arg against its declared type, so
> `Box(5_000_000_000)` compiled and ran; the bind now routes a `u32` count
> through the same `foldDimU32` gate (and any other declared integer type through
> `program_index.intTypeRange`), so an out-of-range arg is a clean compile error
> ("value 5000000000 does not fit in u32 parameter K"). Files:
> `src/ir/program_index.zig` (+`.test.zig`), `src/ir/lower.zig`, `specs.md`.
> Regressions: `examples/0145-types-integral-float-array-dim.sx`,
> `examples/1504-vectors-integral-float-lane.sx`,
> `examples/0611-comptime-integral-float-inline-for.sx`,
> `examples/0209-generics-value-param-integral-float.sx`,
> `examples/1132-diagnostics-array-dim-non-integral-float.sx`,
> `examples/1133-diagnostics-array-dim-negative-float.sx`,
> `examples/1134-diagnostics-value-param-u32-overflow.sx`.
>
> **Convergence — the last three count-surface cells (attempt 9).** Three
> adjacent cells of the SAME shared count surface still diverged. (1) An ALIASED
> integer constraint (`Count :: u32`; `$K: Count`) bypassed the value-param range
> gate — only BUILTIN constraint names matched `intTypeRange`, so
> `Box(5_000_000_000)` with `$K: Count` compiled and bound a truncated value. The
> gate (`Lowering.resolveValueParamArg`, shared by BOTH binders — struct +
> type-fn) now resolves the constraint to its underlying builtin
> (`canonicalIntConstraintName`: `Count` → u32, `Small` → s8) before
> range-checking, so an aliased integer constraint behaves exactly like the
> builtin it names. (2) A named const with an EXPRESSION RHS (`M :: 2; N :: M + 1`)
> did not fold as a count — `program_index.moduleConstInt` read only a LITERAL RHS
> node. It now folds every const's RHS through the shared `evalConstIntExpr`
> (cycle-guarded so `N :: N` / mutual cycles fold to null, not a stack overflow),
> and scanDecls pass-0 pre-registers expression-RHS consts; so `N :: M + 1` == 3
> at every count consumer (dim direct + alias, Vector lane, value-param struct +
> type-fn, `inline for`). (3) The stateful `Lowering.resolveArrayLen` STILL
> fabricated length 0 after a failed fold; it now returns null → the `.unresolved`
> sentinel (no fabrication), and the binding's lowering bails on it cleanly — a
> field access on an already-diagnosed `.unresolved` value stays silent
> (`emitFieldError`), so a failed-fold dim emits ONE clean diagnostic and never
> reaches the `sizeOf` panic. Files: `src/ir/program_index.zig` (+`.test.zig`),
> `src/ir/lower.zig`. Regressions: `examples/0146-types-comptime-count-matrix.sx`
> (the full positive matrix — every consumer × representative leaf form),
> `examples/1135-diagnostics-value-param-alias-constraint-overflow.sx` (aliased
> u32 + s8 overflow), `examples/1136-diagnostics-array-dim-nonconst-direct-no-crash.sx`
> (direct non-const dim halts cleanly, no fabrication / panic); the cascade
> cleanup also tightened `examples/1502`/`1503` to one diagnostic each.
>
> **Final convergence — type-fn binder parity (attempt 10).** One last cell of
> the count surface still diverged from the struct binder. A FAILED value-param
> bind on a type-RETURNING FUNCTION (`MakeC :: ($K: Count, $T: Type) -> Type
> { return [K]T; }`; `a : MakeC(5_000_000_000, s64)`) emitted its correct range
> diagnostic, but `instantiateTypeFunction` then returned `null`, so
> `resolveParameterizedWithBindings` fell through to the empty-struct *placeholder*
> named `MakeC`. The binding `a` got that placeholder type, so a downstream
> `a.len` cascaded a bogus second error `field 'len' not found on type 'MakeC'`.
> The struct binder (`instantiateGenericStruct`) already returned `.unresolved`
> here; the type-fn binder now matches it — the failed value-param bind poisons to
> `.unresolved` instead of `null`, so the caller propagates the diagnosed poison
> and the existing `emitFieldError` suppression yields ONE clean diagnostic. Covers
> every type-fn value-param failure mode (overflow via aliased constraint,
> non-const arg, unknown type arg). Files: `src/ir/lower.zig` (one line in
> `instantiateTypeFunction`). Regression:
> `examples/1137-diagnostics-value-param-type-fn-no-cascade.sx`.
## Symptom
A fixed array whose dimension is a module-global integer constant (`N :: 16;
a : [N]T`) miscompiles element access: reads/writes compute a wrong address.
With `s64` elements `a[0]` returns GARBAGE (silent); with slice/pointer element
types (`[N]string`) it Bus-errors. The identical program with a LITERAL dimension
(`a : [16]T`) is correct. Silent-miscompile class (cf. 00790082).
## Reproduction
```sx
#import "modules/std.sx";
N :: 16;
main :: () { a : [N]s64 = ---; a[0] = 7; print("a0={}\n", a[0]); }
```
`./zig-out/bin/sx run` prints `a0=8472789232` (garbage); want `a0=7`. Replacing
`[N]` with `[16]` prints `7`.
## Investigation prompt
A fixed-array TYPE whose dimension is a named const (`N :: 16; [N]T`) resolves to
a wrong element stride / array length in codegen — element address computation is
wrong (garbage for scalars, bad pointer for slice/pointer elements). Literal
dimensions are correct, so the defect is in resolving the array-type DIMENSION
from a constant expression (vs a literal) — the dim likely resolves to 0/unknown
or the element size is wrong. Look at array-type resolution where the length is a
const-expr (type lowering / sizeof / element-stride computation). Fix so a
named-const dimension yields the same layout as the literal. Verify with the
repro (expect 7) + a `[N]string`/`[N]struct` case (no bus error, correct reads),
and `zig build && zig build test && bash tests/run_examples.sh` green.