issues: file 0124 — 64K+ stack arrays emit whole-aggregate ops that segfault LLVM

agra
2026-06-12 07:55:46 +03:00
parent 0e890b9992
commit 47110b37cf

@@ -0,0 +1,86 @@
# 0124 — 64K+ stack arrays emit whole-aggregate load/store ops that segfault LLVM
## Symptom
Declaring a large (~64KB+) stack array in a function reachable from
`main` crashes the compiler during native emission — a segfault inside
libLLVM, not a diagnostic.
- **Observed**: `Segmentation fault at address 0x16b...` (a stack
address) under `sx build`, inside
`DAGCombiner::visitMERGE_VALUES` → `SelectionDAG::ReplaceAllUsesWith`
(via `LLVMTargetMachineEmitToFile`, src/ir/emit_llvm.zig:2894).
- **Expected**: the program compiles; the array lives in the frame and
is accessed in place.
The crash threshold is DAG-shape dependent, not a clean size boundary
(`[65535]u8` and `[65537]u8` compile; `[65536]u8`, `[66000]u8`, and
`[131072]u8` crash), because the real problem is the SelectionDAG
node count: lowering materializes the array as a FIRST-CLASS LLVM
value, and the legalizer scalarizes each whole-aggregate op into one
node per element. Two emission shapes produce such ops:
1. `buf : [N]u8 = ---;` stores a whole-array undef constant
(`store [N x i8] undef, ptr %alloca`) — a store of nothing, for an
explicitly-uninitialized local.
2. `buf[i]` reads on a local array lower as `index_get` on the array
VALUE: load the entire array as an SSA value, spill it to an
`ig.tmp` alloca, GEP one element (the general-expression sibling of
resolved issue 0110, which fixed only `lowerFor`'s element fetch).
Besides the crash, this copies N bytes to read 1.
Each shape crashes llc in isolation on the dumped IR; with both
replaced by in-place access the module compiles.
## Reproduction
```sx
#import "modules/std.sx";
f :: (fd: s32) {
buf : [65536]u8 = ---;
if buf[0] > 0 { out("x\n"); }
}
main :: () -> s32 {
f(1);
return 0;
}
```
Observed at master 7f2b8b5: `sx build` segfaults in libLLVM with the
stack trace above. `sx ir` shows the two whole-aggregate ops:
```llvm
%alloca1 = alloca [65536 x i8], align 1
store [65536 x i8] undef, ptr %alloca1, align 1
%load = load [65536 x i8], ptr %alloca1, align 1
%ig.tmp = alloca [65536 x i8], align 1
store [65536 x i8] %load, ptr %ig.tmp, align 1
%ig.ptr = getelementptr [65536 x i8], ptr %ig.tmp, i64 0, i64 0
```
## Investigation prompt
Two lowering sites produce the whole-aggregate ops; fix both:
1. `src/ir/lower/stmt.zig` `lowerVarDecl` (annotated branch): a
`.undef_literal` initializer falls through to
`lowerExpr(val)` → `constUndef(array type)` → `store`. `---` means
explicitly uninitialized, so emit NO store at all (keep the existing
tuple zero-init carve-out above it).
2. `src/ir/lower/expr.zig` `lowerIndexExpr`: when the indexed object
is an array with addressable storage (`getExprAlloca` hit, same
guard as 0110's `lowerFor` fix), emit `index_gep` on the storage +
a single-element `load` instead of `index_get` on the loaded array
value. Storage-less arrays (rvalues) keep the `index_get` fallback.
The object must NOT be lowered as a value on the storage path or
the dead whole-array `load` still reaches the DAG.
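The two fixes above can be sketched as a toy model of the intended lowering decisions. This is not the compiler's code: `ArrayObj`, `lower_var_decl`, `lower_index_read`, and the pseudo-op strings are hypothetical names made up for illustration, and only the op names `index_gep`, `index_get`, `load`, and `store` come from the notes above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArrayObj:
    """Hypothetical stand-in for the indexed object of an index expression."""
    name: str
    length: int
    storage: Optional[str]  # alloca name when addressable, None for an rvalue


def lower_var_decl(name: str, length: int, init: str) -> List[str]:
    """Toy lowering of `name : [length]u8 = init;` to pseudo-ops."""
    ops = [f"%{name} = alloca [{length}]u8"]
    if init == "---":
        # Explicitly uninitialized: reserve the slot, emit no store.
        return ops
    ops.append(f"store [{length}]u8 {init}, %{name}")
    return ops


def lower_index_read(obj: ArrayObj, index: int) -> List[str]:
    """Toy lowering of a read `obj[index]` to pseudo-ops."""
    if obj.storage is not None:
        # Storage path: address the element in place. The array itself is
        # never lowered as a value, so no whole-array load is emitted.
        return [
            f"%p = index_gep {obj.storage}, {index}",
            "%v = load u8, %p",
        ]
    # Rvalue fallback: no storage to address, keep index_get on the value.
    return [
        f"%arr = value_of {obj.name}",
        f"%v = index_get %arr, {index}",
    ]


if __name__ == "__main__":
    for op in lower_var_decl("buf", 65536, "---"):
        print(op)
    for op in lower_index_read(ArrayObj("buf", 65536, "%buf"), 0):
        print(op)
```

In this model the storage path never mentions the array as a value, which mirrors the requirement that the object must not be lowered as a value when storage is available.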
Verification: the repro builds and runs. It prints nothing or `x`
depending on stack garbage, so gate on exit 0 of the build, not on the
read. The `[65535]`/`[65537]`/`[131072]` variants all build. Pin a
regression example that builds AND deterministically runs (write
before read). `zig build && zig build test` and
`bash tests/run_examples.sh` must be green; expect `.ir` snapshot
churn from the removed undef stores and the new gep+load shape, so
re-pin and review.
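A size sweep over the repro can be scripted along these lines. This is a minimal sketch, not part of the repository: the helper names, the output file naming, and the assumption that `sx build` accepts a source path as its argument are all illustrative. The size list is the union of the sizes mentioned in this issue.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence

# Sizes mentioned in this issue (both the crashing and non-crashing ones).
SIZES = (65535, 65536, 65537, 66000, 131072)

# The repro from this issue with the array length left as a parameter.
REPRO_TEMPLATE = """#import "modules/std.sx";
f :: (fd: s32) {{
buf : [{n}]u8 = ---;
if buf[0] > 0 {{ out("x\\n"); }}
}}
main :: () -> s32 {{
f(1);
return 0;
}}
"""


def repro_source(n: int) -> str:
    """Return the repro program text for an array of n bytes."""
    return REPRO_TEMPLATE.format(n=n)


def write_variants(out_dir: Path, sizes: Sequence[int] = SIZES) -> List[Path]:
    """Write one repro file per size and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in sizes:
        path = out_dir / f"repro_{n}.sx"  # hypothetical file naming
        path.write_text(repro_source(n))
        paths.append(path)
    return paths


def build_ok(path: Path, cmd: Sequence[str] = ("sx", "build")) -> bool:
    """Gate on the build's exit status only, not on program output.

    Assumes the compiler takes the source path as an argument; a compiler
    crash shows up as a nonzero return code.
    """
    return subprocess.run([*cmd, str(path)]).returncode == 0
```

A caller would run `build_ok` over `write_variants(Path("out"))` and require every result to be true. The deterministic write-before-read regression example still needs to be written by hand in `sx`.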