sx/issues/0083-named-const-array-dimension-miscompiled.md
agra e8cc9d03de fix(ir): precise oversized-dim diagnostic on the alias path (0083)
The stateless alias-registration array-dim path collapsed foldDimU32's
distinct .too_large / .below_min outcomes into null, so an oversized type
alias (Big :: [5000000000]s64) emitted the FALSE 'an array dimension is not
a compile-time integer constant' message while the direct form correctly
reported 'array dimension 5000000000 does not fit in u32'.

Add program_index.reportDimError as the single source of dim-error wording
(the stateful path now emits through it too) and type_bridge.foldArrayDim to
surface the DimU32 reason at the alias-registration site. An oversized/negative
alias dim now routes to reportDimError for the same precise message as the
direct form; a genuinely non-const alias dim keeps the alias-specific message.

Regression: examples/1131-diagnostics-array-dim-oversized-u32-alias.sx
2026-06-04 12:31:24 +03:00

# 0083 — fixed array with a named-constant dimension is miscompiled
> **RESOLVED.** Root cause: `TypeResolver.resolveCompound`'s array arm resolved
> the dimension with `if (length.data == .int_literal) ... else 0` — a named
> const (`N :: 16`) hit the silent `else 0`, so `[N]T` became a 0-length / 0-byte
> array and element access ran out of bounds (garbage for scalars, bus error for
> slice/pointer/struct elements). Fix: the array arm now delegates the dimension
> to `inner.resolveArrayLen` (symmetric with `inner.resolveInner` for the element
> type). The stateful `Lowering.resolveArrayLen` evaluates the dimension as a
> compile-time integer across the comptime-constant, generic-value, and
> module-global const tables, and emits a diagnostic (no fabricated length) when
> it isn't one.
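
To make the failure mode concrete, here is a minimal Python model of the two behaviours of the array arm. It is a sketch, not the compiler's Zig code: the tuple-shaped nodes, `MODULE_CONSTS`, and the function names are hypothetical and exist only for illustration.

```python
from typing import Optional

# Hypothetical AST node shape: ("int", 16) for a literal, ("name", "N") for an identifier.
MODULE_CONSTS = {"N": 16}

def resolve_dim_buggy(node) -> int:
    """Pre-fix behaviour: anything that is not a literal silently becomes 0."""
    kind, value = node
    return value if kind == "int" else 0

def resolve_dim_fixed(node) -> Optional[int]:
    """Post-fix behaviour: look names up; return None (never 0) when unresolved."""
    kind, value = node
    if kind == "int":
        return value
    if kind == "name":
        return MODULE_CONSTS.get(value)
    return None

def array_size_bytes(length: int, elem_size: int) -> int:
    """Storage reserved for a fixed array of `length` elements."""
    return length * elem_size

# With 8-byte elements, the buggy path reserves 0 bytes for [N]T, so a[0] is out of bounds.
buggy_bytes = array_size_bytes(resolve_dim_buggy(("name", "N")), 8)
fixed_bytes = array_size_bytes(resolve_dim_fixed(("name", "N")), 8)
```

Returning `None` rather than a number forces the caller to handle the unresolved case explicitly, which is the property the fix relies on.
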
>
> **Exhaustive follow-up (attempt 2).** The first fix covered every *stateful*
> resolution path (direct local decls, struct fields, function params/returns),
> but the *stateless* registration-time resolver (`type_bridge`, used for type
> aliases `Arr :: [N]T` and inline union/enum field types) still resolved the
> named dim with a silent `else 0` — so `Arr :: [N]s64; a : Arr` and
> `union { a: [N]s64 }` were still miscompiled. Fix: the module-global const
> table (`ProgramIndex.module_const_map`) is now threaded into `type_bridge`
> alongside the alias map, so `StatelessInner.resolveArrayLen` resolves a named
> module-const dim to the same length everywhere. The remaining unresolvable case
> (a computed/comptime dimension on the binding-free path) bails LOUDLY instead of
> fabricating a 0 length. Files: `src/ir/type_resolver.zig`, `src/ir/lower.zig`,
> `src/ir/type_bridge.zig`. Regression: `examples/0140-types-named-const-array-dim.sx`
> (direct + type-alias + nested `[N][M]T` + union-field dims, s64 / string /
> struct element types).
>
> **Root-cause close-out (attempt 3).** Attempt 2 threaded the const map into
> `type_bridge` but the map wasn't fully populated when an alias resolved its
> dimension: type aliases (`Arr :: [N]T`) resolve EAGERLY in scanDecls pass 1,
> while TYPED consts (`N : s64 : 16`) register only in pass 2 and a
> forward-declared untyped const (`Arr :: [N]T; N :: 16`) hadn't registered yet
> either — so the stateless resolver saw an empty table, printed a non-fatal
> warning, fabricated length 0, and CONTINUED to garbage / a segfault. Three
> coordinated fixes: (1) a scanDecls **pass 0** pre-registers every integer-valued
> module const into `module_const_map` BEFORE any alias resolves, so typed,
> untyped, and forward-referenced consts all resolve identically; (2) both the
> stateful and stateless dim resolvers now share one routine
> (`program_index.moduleConstInt`) so they cannot disagree again; (3) the length-0
> fabrications are GONE — `resolveArrayLen` returns `?u32`, `resolveCompound`
> yields the `.unresolved` sentinel on null (never a 0-byte array), the stateful
> path emits a diagnostic, and the registration path surfaces an unresolved alias
> as a clean compile error that aborts the build (the `type_bridge.zig:270`
> Vector-lane `else => 0` is fixed the same way). Files:
> `src/ir/program_index.zig`, `src/ir/lower.zig`, `src/ir/type_bridge.zig`,
> `src/ir/type_resolver.zig`. Regressions:
> `examples/0143-types-typed-const-array-dim.sx` (typed-const dim direct + via
> alias for s64/string/struct, forward-ref alias, nested) and
> `examples/1129-diagnostics-array-dim-not-const.sx` (an unresolvable computed dim
> halts with a clean diagnostic + non-zero exit, not a fabricated 0-length array).
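
The ordering problem behind attempt 3 can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: declarations are processed in source order, the record shape `(kind, name, payload)` is hypothetical, and `None` stands in for the `.unresolved` sentinel.

```python
# Hypothetical declaration records: (kind, name, payload).
DECLS = [
    ("alias", "Arr", "N"),  # Arr :: [N]T  (the dimension names N)
    ("const", "N", 16),     # N :: 16      (declared after its use)
]

def resolve_aliases(decls, prepass: bool):
    """Return {alias: length or None}; None models the `.unresolved` sentinel."""
    consts = {}
    if prepass:
        # "Pass 0": register every integer-valued const before any alias resolves.
        for kind, name, payload in decls:
            if kind == "const" and isinstance(payload, int):
                consts[name] = payload
    lengths = {}
    for kind, name, payload in decls:
        if kind == "const":
            consts[name] = payload
        elif kind == "alias":
            # Eager resolution: only consts registered so far are visible.
            lengths[name] = consts.get(payload)
    return lengths

without_pass0 = resolve_aliases(DECLS, prepass=False)  # alias sees an empty table
with_pass0 = resolve_aliases(DECLS, prepass=True)      # alias sees N
```

Without the pre-pass the alias resolves before `N` is registered; with it, declaration order no longer matters.
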
>
> **Const-expression dimensions (attempt 4).** Attempts 1–3 resolved only a BARE
> named-const dim (`[M]`) or a literal (`[5]`); any constant-FOLDABLE *expression*
> dimension (`[M + 1]`, `[M * N]`, `[N - M]`, nested `[M + N - 1]`, parenthesised
> `[(M + 1) * 2]`) was wrongly rejected as "not a compile-time integer constant"
> even though every operand is compile-time-known. Such a dimension MUST be
> evaluated, not rejected. Fix: the shared dim resolver now routes the dimension
> through a single constant integer-expression evaluator
> (`program_index.evalConstIntExpr`) that folds integer `+ - * / %` and unary
> negate (parentheses carry no AST node) over literals and named/typed module
> consts, recursively. The leaf-name lookup is delegated (`ctx.lookupDimName`) so
> the stateful body-lowering path and the stateless registration path share the
> EXACT SAME folding logic and cannot diverge — an expression dim via a type alias
> resolves identically to the direct form. The no-fabrication discipline is
> unchanged: a genuinely non-comptime dimension (a runtime local, a non-comptime
> call, an unbound name) — or arithmetic that overflows / divides by zero — still
> yields null → `.unresolved` → the same clean compile-halting diagnostic, never a
> fabricated length. Files: `src/ir/program_index.zig` (+`.test.zig`),
> `src/ir/lower.zig`, `src/ir/type_bridge.zig`. Regression:
> `examples/0144-types-const-expr-array-dim.sx` (every expression form, direct vs
> alias, scalar / string / struct element types); `1129` re-pointed at a genuinely
> non-const dimension (`[get()]s64`, a runtime call) so it still proves the
> stateless clean-halt.
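
The shape of such a folder can be sketched in Python as follows. Everything here is illustrative rather than the actual `evalConstIntExpr`: the tuple AST, the `lookup` callback (standing in for the delegated leaf-name lookup), the 64-bit overflow bound, and truncating division are all assumptions.

```python
from typing import Callable, Optional

I64_MIN, I64_MAX = -(2**63), 2**63 - 1  # assumed evaluation width

def _checked(v: int) -> Optional[int]:
    """Model overflow as an unresolved result instead of wrapping."""
    return v if I64_MIN <= v <= I64_MAX else None

# Hypothetical AST: ("int", v) | ("name", s) | ("neg", e) | ("bin", op, lhs, rhs)
def eval_const_int(node, lookup: Callable[[str], Optional[int]]) -> Optional[int]:
    tag = node[0]
    if tag == "int":
        return _checked(node[1])
    if tag == "name":
        return lookup(node[1])  # leaf lookup is delegated to the caller
    if tag == "neg":
        v = eval_const_int(node[1], lookup)
        return None if v is None else _checked(-v)
    if tag == "bin":
        _, op, lhs, rhs = node
        a = eval_const_int(lhs, lookup)
        b = eval_const_int(rhs, lookup)
        if a is None or b is None:
            return None  # any non-constant operand poisons the whole expression
        if op == "+":
            return _checked(a + b)
        if op == "-":
            return _checked(a - b)
        if op == "*":
            return _checked(a * b)
        if op in ("/", "%"):
            if b == 0:
                return None  # division by zero never fabricates a value
            q = abs(a) // abs(b)
            if (a < 0) != (b < 0):
                q = -q  # truncate toward zero (assumption)
            return _checked(q) if op == "/" else _checked(a - q * b)
    return None

consts = {"M": 2, "N": 3}
# [(M + 1) * 2]: parentheses only shape the tree, they are not a node of their own.
dim = eval_const_int(("bin", "*", ("bin", "+", ("name", "M"), ("int", 1)), ("int", 2)), consts.get)
```

Because the only path-specific piece is `lookup`, two callers that pass different lookups still share the folding rules, which is the divergence-proofing the note describes.
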
>
> **Unified comptime-int evaluator (attempt 5).** Attempts 1–4 fixed the array
> *dimension* paths but the SAME length-0 fabrication class survived on the
> siblings that resolve a comptime integer elsewhere: the three Vector lane
> resolvers (`resolveTypeCallWithBindings`, `resolveParameterizedWithBindings`,
> `resolveArrayLiteralType`) and the two generic value-param binders
> (`instantiateGenericStruct`, `instantiateTypeFunction`) each hand-rolled an
> `else => 0` switch, so `Vector(N, f32)` / `Vec(N, f32)` (N a module const)
> fabricated a 0-lane `<0 x float>` (LLVM "huge alignment" abort) or a 0 binding
> under a wrong mangled name; and the `inline for` bound folder (`evalComptimeInt`)
> only knew literals / comptime cursors / `<pack>.len`, so `inline for 0..M` failed
> outright. Fix: every one of those sites now routes through the single shared
> `program_index.evalConstIntExpr` — `evalComptimeInt` delegates to it (the pack
> `.len` leaf moved into the shared folder via a new `ctx.lookupPackLen`); the
> Vector lane and value-param resolvers fold through it and emit a clean diagnostic
> + `.unresolved` (never `else => 0`) on a non-const operand. Two enabling fixes
> upstream of resolution: the unknown-type semantic checker no longer walks a
> value-param position (`Vector(N, …)` / `Vec(N, …)`) as a type name (it was
> reporting "unknown type 'N'"); and both the parameterized-type-arg parser and
> the function-body-detection lookahead (`hasFnBodyAfterArrow`) accept a
> const-EXPRESSION in a value position, so `Vector(M + 1, f32)` and `[M + 1]T`
> parse as a return type too (the latter a pre-existing attempt-4 sibling miss).
> Files: `src/ir/program_index.zig` (+`.test.zig`), `src/ir/lower.zig`,
> `src/ir/type_bridge.zig`, `src/ir/semantic_diagnostics.zig`, `src/parser.zig`.
> Regressions: `examples/1501-vectors-const-lane.sx` (named-const + const-expr
> lane, direct + alias, 3- and 4-lane reads),
> `examples/1502-vectors-runtime-lane-not-const.sx` (a runtime lane clean-halts,
> exit 1, no LLVM crash),
> `examples/0207-generics-value-param-const.sx` (`Vec(N,f32)` / `Vec(M+1,f32)`
> resolve to the same instantiation as `Vec(3,f32)`),
> `examples/0610-comptime-inline-for-const-bound.sx` (`inline for 0..M` and
> `0..(M+1)` unroll).
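
Why folding before binding matters for instantiation identity can be shown with a small Python model. The mangling scheme (`Vec__3__f32`), the toy `fold` helper, and the `CONSTS` table are hypothetical; only the idea that the name is derived from the folded value reflects the note above.

```python
# Hypothetical module consts and a toy folder for the spellings used here.
CONSTS = {"N": 3, "M": 2}

def fold(arg):
    """Fold an int, a const name, or a (name, '+', int) triple; None if unknown."""
    if isinstance(arg, int):
        return arg
    if isinstance(arg, str):
        return CONSTS.get(arg)
    name, op, rhs = arg
    base = CONSTS.get(name)
    return base + rhs if base is not None and op == "+" else None

def mangle(generic: str, value_arg, type_arg: str):
    """Build the instantiation name from the FOLDED value, or None if not const."""
    value = fold(value_arg)
    if value is None:
        return None  # models the diagnostic + `.unresolved` outcome, never a 0 binding
    return f"{generic}__{value}__{type_arg}"

names = {
    mangle("Vec", "N", "f32"),            # Vec(N, f32)
    mangle("Vec", ("M", "+", 1), "f32"),  # Vec(M + 1, f32)
    mangle("Vec", 3, "f32"),              # Vec(3, f32)
}
```

All three spellings collapse to one name, so they share one instantiation; an unknown operand yields `None` instead of a name built from a fabricated 0.
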
>
> **Value-param type functions + oversized guard (attempt 6).** Two remaining
> siblings in the comptime-int path. (1) A type-RETURNING function with a value
> param used as a TYPE annotation (`b : Make(N, s64)` where `Make :: ($K: u32,
> $T: Type) -> Type { return [K]T; }`) was rejected "unknown type 'N'" because
> the unknown-type checker walked the value-param position as a type name, AND the
> parameterized-type-annotation path never routed to `instantiateTypeFunction`
> (only the `.call` path did), nor did that binder resolve a non-struct/union
> return shape. Fix: `isValueParamPosition` (semantic_diagnostics.zig) now also
> skips a value param of a `fn_ast_map` type-returning function (mirroring the
> binder's value/type classification); `resolveParameterizedWithBindings` routes
> a type-returning-function name to `instantiateTypeFunction`; and that binder
> resolves a general return-type expression (`return [K]T`) with bindings active.
> `Make(N, s64)`, `Make(M + 1, s64)`, and `Make(3, s64)` now resolve to one
> `[3]s64`. (2) Oversized dim/lane folds (`[5_000_000_000]`) panicked the
> compiler — fixed under issue 0087 via the shared range-checked
> `program_index.foldDimU32` gate. Files: `src/ir/semantic_diagnostics.zig`,
> `src/ir/lower.zig`, `src/ir/program_index.zig`, `src/ir/type_bridge.zig`.
> Regression: `examples/0208-generics-value-param-type-function.sx`.
>
> **Diagnostic-accuracy parity (attempt 7).** The fold + layout were correct, but
> the two paths still DIVERGED on the error MESSAGE for an oversized dim. The
> direct form (`a : [5_000_000_000]s64`) reported the accurate "array dimension
> 5000000000 does not fit in u32" (from the stateful `resolveArrayLen`, which
> branches on `foldDimU32`'s `.too_large` / `.below_min` / `.not_const` variants),
> but the type-ALIAS form (`Big :: [5_000_000_000]s64`) reported a FALSE "an array
> dimension is not a compile-time integer constant" — because the stateless
> `resolveArrayLen` collapsed every non-`.ok` `DimU32` to `null`, so the
> alias-registration site had only one generic message to emit. Fix: a single
> wording source `program_index.reportDimError(diag, span, DimU32)` now owns the
> dim-error text; the stateful path emits through it, and the alias-registration
> site re-folds a top-level array dim via the new `type_bridge.foldArrayDim`
> (same shared `foldDimU32`) and routes a `.too_large` / `.below_min` result to
> `reportDimError` — so an oversized alias dim now reports the SAME precise
> message as the direct form. A genuinely non-const alias dim (`[get()]`) still
> gets the alias-specific "not a compile-time integer constant" message (1129).
> Files: `src/ir/program_index.zig`, `src/ir/type_bridge.zig`, `src/ir/lower.zig`.
> Regression: `examples/1131-diagnostics-array-dim-oversized-u32-alias.sx`
> (oversized dim via alias → "does not fit in u32", matching direct example 1130;
> 1129 still proves the non-const path keeps the generic message).
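
The parity fix in attempt 7 amounts to keeping the reason for a failed fold and formatting it in one place. A minimal Python sketch, assuming a default minimum of 0 and using placeholder wording for the below-minimum case (the notes do not quote that message); the names mirror but are not the Zig `foldDimU32` / `reportDimError`:

```python
from enum import Enum
from typing import Optional, Tuple

U32_MAX = 2**32 - 1

class Dim(Enum):
    OK = "ok"
    TOO_LARGE = "too_large"
    BELOW_MIN = "below_min"
    NOT_CONST = "not_const"

def fold_dim_u32(value: Optional[int], minimum: int = 0) -> Tuple[Dim, Optional[int]]:
    """Classify an already-folded dimension; `value is None` means "not constant"."""
    if value is None:
        return (Dim.NOT_CONST, None)
    if value < minimum:
        return (Dim.BELOW_MIN, value)
    if value > U32_MAX:
        return (Dim.TOO_LARGE, value)
    return (Dim.OK, value)

def report_dim_error(kind: Dim, value: Optional[int]) -> Optional[str]:
    """Single source of wording; every caller formats errors through here."""
    if kind is Dim.TOO_LARGE:
        return f"array dimension {value} does not fit in u32"  # wording quoted in the notes
    if kind is Dim.BELOW_MIN:
        return f"array dimension {value} is below the minimum"  # placeholder wording (assumption)
    if kind is Dim.NOT_CONST:
        return "an array dimension is not a compile-time integer constant"
    return None  # Dim.OK: nothing to report

# The pre-fix alias path behaved as if every non-OK kind were NOT_CONST.
collapsed = report_dim_error(Dim.NOT_CONST, None)
precise = report_dim_error(*fold_dim_u32(5_000_000_000))
```

Collapsing the result to `None` before reporting is what produced the misleading alias message; carrying the variant through keeps both paths on the same text.
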
## Symptom
A fixed array whose dimension is a module-global integer constant (`N :: 16;
a : [N]T`) miscompiles element access: reads/writes compute a wrong address.
With `s64` elements `a[0]` returns GARBAGE (silent); with slice/pointer element
types (`[N]string`) it bus-errors. The identical program with a LITERAL dimension
(`a : [16]T`) is correct. Silent-miscompile class (cf. 0079–0082).
## Reproduction
```sx
#import "modules/std.sx";
N :: 16;
main :: () { a : [N]s64 = ---; a[0] = 7; print("a0={}\n", a[0]); }
```
`./zig-out/bin/sx run` prints `a0=8472789232` (garbage); the expected output is
`a0=7`. Replacing `[N]` with `[16]` prints `a0=7`.
## Investigation prompt
A fixed-array TYPE whose dimension is a named const (`N :: 16; [N]T`) resolves to
a wrong element stride / array length in codegen — element address computation is
wrong (garbage for scalars, bad pointer for slice/pointer elements). Literal
dimensions are correct, so the defect is in resolving the array-type DIMENSION
from a constant expression (vs a literal) — the dim likely resolves to 0/unknown
or the element size is wrong. Look at array-type resolution where the length is a
const-expr (type lowering / sizeof / element-stride computation). Fix so a
named-const dimension yields the same layout as the literal. Verify with the
repro (expect 7) + a `[N]string`/`[N]struct` case (no bus error, correct reads),
and `zig build && zig build test && bash tests/run_examples.sh` green.