Two lowering sites materialized a local array as a whole LLVM value;
the legalizer scalarizes each such op into one SelectionDAG node per
element, and at ~64K elements the DAG combiner segfaults
(DAGCombiner::visitMERGE_VALUES → ReplaceAllUsesWith).
- lowerVarDecl: an array-typed `---` initializer emits NO store — the
slot stays uninitialized instead of receiving a whole-array undef
store. The tuple zero-init carve-out stays; non-array `---` keeps
the undef store. The interp is unchanged either way (slots start
.undef).
- lowerIndexExpr: element reads on an array with addressable storage
GEP the storage and load one element — the general-expression
sibling of 0110's lowerFor fix — without value-lowering the object
(a dead whole-array load would still reach the DAG). Storage-less
arrays keep the index_get fallback.
Sibling shape filed as 0125: any_to_string's per-array-type arms still
pass the array by value, so a 64K+ array type + any {} print crashes.
Regression: examples/0055-basic-large-stack-array.sx (sx build
segfaulted pre-fix). 22 .ir snapshots re-pinned: removed undef stores
and ig.tmp spills, in-place gep+load (instruction-shape-only churn,
reviewed).
RESOLVED — 0124: 64K+ stack arrays emit whole-aggregate load/store ops that segfault LLVM
RESOLVED (2026-06-12). Root cause: two lowering sites materialized a local array as a first-class LLVM value, which the legalizer scalarizes into one SelectionDAG node per element. Fix: (1) lowerVarDecl (src/ir/lower/stmt.zig) emits NO store for an array-typed `---` initializer: the slot stays uninitialized instead of receiving a whole-array undef store (tuple zero-init carve-out kept; non-array `---` keeps the undef store); (2) lowerIndexExpr (src/ir/lower/expr.zig) reads elements of an array with addressable storage via index_gep on the storage + a single-element load (the general-expression sibling of 0110's lowerFor fix) without value-lowering the object (a dead whole-array load would still reach the DAG). Storage-less arrays (rvalues, by-value params) keep the index_get fallback. Residual sibling shapes filed as issue 0125 (any_to_string's per-array-type arms pass the array by value, so any 64K+ array type + any {} print still crashes). Regression test: examples/0055-basic-large-stack-array.sx ([65536]u8 write/read loops + [131072]s64 first/last; sx build segfaulted pre-fix). 22 .ir snapshots re-pinned (removed undef stores / ig.tmp spills → in-place gep+load; reviewed, instruction-shape-only). Gates: zig build test 426/426, suite 592/592, distribution repo 14/14.
Symptom
Declaring a large (~64KB+) stack array in a function reachable from
main crashes the compiler during native emission — a segfault inside
libLLVM, not a diagnostic.
- Observed: Segmentation fault at address 0x16b... (a stack address) under sx build, inside DAGCombiner::visitMERGE_VALUES → SelectionDAG::ReplaceAllUsesWith (via LLVMTargetMachineEmitToFile, src/ir/emit_llvm.zig:2894).
- Expected: the program compiles; the array lives in the frame and is accessed in place.
The crash threshold is DAG-shape dependent, not a clean size boundary
([65535]u8 and [65537]u8 compile; [65536]u8, [66000]u8, and
[131072]u8 crash), because the real problem is the SelectionDAG
node count: lowering materializes the array as a FIRST-CLASS LLVM
value, and the legalizer scalarizes each whole-aggregate op into one
node per element. Two emission shapes produce such ops:
- `buf : [N]u8 = ---;` stores a whole-array undef constant (`store [N x i8] undef, ptr %alloca`): a store of nothing, for an explicitly-uninitialized local.
- `buf[i]` reads on a local array lower as index_get on the array VALUE: load the entire array as an SSA value, spill it to an ig.tmp alloca, GEP one element (the general-expression sibling of resolved issue 0110, which fixed only lowerFor's element fetch). Besides the crash, this copies N bytes to read 1.
Each shape crashes llc in isolation on the dumped IR; with both replaced by in-place access the module compiles.
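To try each shape in isolation without the sx front end, the three forms can be written out as small standalone modules. The following is a minimal Python sketch, not the compiler's actual output: the function names (@shape_undef_store and so on), the %buf, %ptr and %elem value names, and the final element load in the second shape are assumptions made for illustration. Only the instruction shapes follow the IR quoted in this issue, and whether these reduced modules hit the same llc crash as the dumped IR is not guaranteed.

```python
# Sketch: build minimal LLVM IR modules for the two whole-aggregate shapes
# and for the in-place replacement. All function and value names here are
# hypothetical; only the instruction shapes follow the issue text.


def undef_store_ir(n: int) -> str:
    """Shape 1: a whole-array undef constant stored into the alloca."""
    ty = f"[{n} x i8]"
    return "\n".join([
        "define void @shape_undef_store() {",
        "entry:",
        f"  %buf = alloca {ty}, align 1",
        f"  store {ty} undef, ptr %buf, align 1",
        "  ret void",
        "}",
        "",
    ])


def value_index_ir(n: int) -> str:
    """Shape 2: load the whole array, spill it to ig.tmp, GEP one element."""
    ty = f"[{n} x i8]"
    return "\n".join([
        "define i8 @shape_value_index(i64 %i) {",
        "entry:",
        f"  %buf = alloca {ty}, align 1",
        f"  %load = load {ty}, ptr %buf, align 1",
        f"  %ig.tmp = alloca {ty}, align 1",
        f"  store {ty} %load, ptr %ig.tmp, align 1",
        f"  %ig.ptr = getelementptr {ty}, ptr %ig.tmp, i64 0, i64 %i",
        "  %elem = load i8, ptr %ig.ptr, align 1",
        "  ret i8 %elem",
        "}",
        "",
    ])


def in_place_ir(n: int) -> str:
    """Replacement shape: GEP the storage directly and load one element."""
    ty = f"[{n} x i8]"
    return "\n".join([
        "define i8 @shape_in_place(i64 %i) {",
        "entry:",
        f"  %buf = alloca {ty}, align 1",
        f"  %ptr = getelementptr {ty}, ptr %buf, i64 0, i64 %i",
        "  %elem = load i8, ptr %ptr, align 1",
        "  ret i8 %elem",
        "}",
        "",
    ])
```

Writing each returned string to its own .ll file and running llc on it gives one test case per shape; the in-place module contains no array-typed load or store at all, only a one-byte load.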
Reproduction
#import "modules/std.sx";
f :: (fd: s32) {
    buf : [65536]u8 = ---;
    if buf[0] > 0 { out("x\n"); }
}
main :: () -> s32 {
    f(1);
    return 0;
}
Observed at master 7f2b8b5: sx build segfaults in libLLVM with the
stack trace above. sx ir shows the two whole-aggregate ops:
%alloca1 = alloca [65536 x i8], align 1
store [65536 x i8] undef, ptr %alloca1, align 1
%load = load [65536 x i8], ptr %alloca1, align 1
%ig.tmp = alloca [65536 x i8], align 1
store [65536 x i8] %load, ptr %ig.tmp, align 1
%ig.ptr = getelementptr [65536 x i8], ptr %ig.tmp, i64 0, i64 0
Investigation prompt
Two lowering sites produce the whole-aggregate ops; fix both:
- src/ir/lower/stmt.zig lowerVarDecl (annotated branch): a .undef_literal initializer falls through to lowerExpr(val) → constUndef(array type) → store. `---` means explicitly uninitialized: emit NO store at all (keep the existing tuple zero-init carve-out above it).
- src/ir/lower/expr.zig lowerIndexExpr: when the indexed object is an array with addressable storage (getExprAlloca hit, same guard as 0110's lowerFor fix), emit index_gep on the storage + a single-element load instead of index_get on the loaded array value. Storage-less arrays (rvalues) keep the index_get fallback. The object must NOT be lowered as a value on the storage path or the dead whole-array load still reaches the DAG.
Verification: the repro builds and runs (prints nothing or x
depending on stack garbage — gate on exit 0 of the build, not the
read); [65535]/[65537]/[131072] variants all build. Pin a
regression example that builds AND deterministically runs (write
before read). zig build && zig build test and
bash tests/run_examples.sh must be green. Expect .ir snapshot churn from
removed undef stores and the new gep+load shape; re-pin and review.
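The size-variant check above could be scripted along these lines. This is a rough Python sketch under stated assumptions: the file names, the helper names, and the idea that the compiler takes the source path as its last argument (sx build followed by the path) are all hypothetical and not taken from the repository, and the build needs to run from a directory where modules/std.sx resolves. It gates only on the exit status of the build, as the verification note asks.

```python
import subprocess
from pathlib import Path

# Sizes named in this issue: the three crashing sizes plus their neighbours.
SIZES = (65535, 65536, 65537, 66000, 131072)

# The repro from this issue with the array length as a parameter.
REPRO_TEMPLATE = """#import "modules/std.sx";
f :: (fd: s32) {{
    buf : [{n}]u8 = ---;
    if buf[0] > 0 {{ out("x\\n"); }}
}}
main :: () -> s32 {{
    f(1);
    return 0;
}}
"""


def repro_source(n: int) -> str:
    """Return the repro program text for an array of n bytes."""
    return REPRO_TEMPLATE.format(n=n)


def write_variants(out_dir: Path, sizes=SIZES) -> list[Path]:
    """Write one repro file per size and return the paths in order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in sizes:
        path = out_dir / f"large_stack_array_{n}.sx"  # hypothetical name
        path.write_text(repro_source(n))
        paths.append(path)
    return paths


def build_ok(path: Path, build_cmd=("sx", "build")) -> bool:
    """True when the build exits 0; a segfault yields a nonzero status.

    Assumption: the compiler accepts the source path as its final argument.
    """
    result = subprocess.run([*build_cmd, str(path)], capture_output=True)
    return result.returncode == 0
```

A driver would call write_variants once and then build_ok on each path, failing if any variant does not build. The program's own output is deliberately ignored, since the read hits uninitialized stack memory.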