# RESOLVED — 0109: allocas inside loop bodies accumulate stack per iteration

**Root cause:** `emitAlloca` (and ~18 sibling `LLVMBuildAlloca` temp sites in the LLVM backend) built allocas at the builder's current position. An alloca inside a loop body re-executes per iteration and LLVM reclaims allocas only at `ret`, so the frame grew with the trip count: body locals, nested-loop index slots, and spill temps (`ig.tmp` etc.) all segfaulted long loops on stack exhaustion.

**Fix:** new `LLVMEmitter.buildEntryAlloca` (`src/ir/emit_llvm.zig`) builds every per-instruction alloca in the function's entry block (after existing entry allocas, builder position restored); all `LLVMBuildAlloca` sites reachable during instruction emission in `src/backend/llvm/ops.zig`, `src/backend/llvm/abi.zig` and `src/ir/emit_llvm.zig` route through it. Initialization stores stay at the use site, so per-iteration re-init semantics are unchanged; entry-block slots are also mem2reg-promotable. ~35 `.ir` snapshots churned (pure alloca position moves, verified type-multiset-identical per file).

**Regression test:** `examples/0047-basic-loop-local-stack-reuse.sx` (1M-iteration body-local loop prints `sum=499999500000`; 3M-iteration nested loop prints `n=3000000`; both segfaulted pre-fix).

---

# 0109 — allocas inside loop bodies accumulate stack per iteration → segfault on long loops

**Symptom.** Any `alloca` that lands inside a loop's body block executes anew on every iteration, and LLVM stack allocas are only reclaimed at function return, so the frame grows monotonically with the trip count. Observed: a 1M-iteration loop with a body-local array segfaults (stack overflow, fault address at the guard page); so does a 3M-iteration nested loop with **no user locals at all** (the inner loop's hidden index slot is itself a body-block alloca of the outer loop). Expected: loop-local storage is reused across iterations; stack usage is static per frame regardless of trip count.

This hits three shapes, all confirmed:

1. user locals declared in a loop body (`buf : [128]i64 = ---;`),
2. nested loops (inner `for`'s `idx_slot` alloca sits in the outer body),
3. compiler temporaries spilled in the body (e.g. `index_get`'s `ig.tmp`; see issue 0110 for the for-over-array case specifically).

## Reproduction

Repro A, body local (`issues/0109-loop-body-alloca-stack-growth.sx`):

```sx
#import "modules/std.sx";

main :: () -> i32 {
    sum := 0;
    for 0..1000000: (i) {
        buf : [128]i64 = ---;
        buf[0] = i;
        sum += buf[0];
    }
    print("sum={}\n", sum);
    0
}
```

- **Observed**: `Segmentation fault at address 0x16e70ffd0` (guard page). With `0..1000` instead it prints `sum=499500` and exits 0: the program is correct; only the stack accumulation kills it.
- **Expected**: prints `sum=499999500000`, exit 0, at any trip count.

Repro B, pure nested loops, zero user locals:

```sx
#import "modules/std.sx";

main :: () -> i32 {
    n := 0;
    for 0..3000000: (i) {
        for 0..1: (j) {
            n += 1;
        }
    }
    print("n={}\n", n);
    0
}
```

- **Observed**: segfault.
- **Expected**: `n=3000000`, exit 0.

The emitted IR shows the cause directly (`sx ir`, body of repro A):

```llvm
for.body.1:
  %alloca2 = alloca [128 x i64], align 8   ; fresh 1KB every iteration
  ...
  %ig.tmp = alloca [128 x i64], align 8    ; plus a 1KB spill temp
```

## Root cause (suspected area)

`Builder.alloca` (`src/ir/module.zig` ~474) emits the `.alloca` instruction into the current block, and the LLVM emitter (`src/backend/llvm/ops.zig` `emitAlloca` ~327) builds `LLVMBuildAlloca` at the current insertion point, so loop-body allocas are *executed* per iteration. LLVM only treats entry-block allocas as static frame slots (and mem2reg/SROA only promote those); a non-entry alloca re-executes and grows the stack each time, until `ret`.

The standard fix (what clang does): emit **all** static allocas into the function's entry block.

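
For illustration, entry-block placement would give repro A roughly the shape below. This is a hand-written sketch, not captured `sx ir` output: the value and block names (`%buf`, `%i.slot`, `for.cond`, ...) are invented, `sum` and the loop index are assumed to be `i64`, the `print` call is omitted, and the opaque `ptr` syntax assumes a recent LLVM.

```llvm
; Hand-written sketch of the intended shape; names and types are assumptions.
define i32 @main() {
entry:
  %sum    = alloca i64, align 8
  %i.slot = alloca i64, align 8
  %buf    = alloca [128 x i64], align 8     ; hoisted: one static 1KB slot per frame
  store i64 0, ptr %sum, align 8
  store i64 0, ptr %i.slot, align 8
  br label %for.cond

for.cond:
  %i    = load i64, ptr %i.slot, align 8
  %more = icmp slt i64 %i, 1000000
  br i1 %more, label %for.body, label %for.end

for.body:                                   ; no alloca here, so the frame size is fixed
  %elt0 = getelementptr inbounds [128 x i64], ptr %buf, i64 0, i64 0
  store i64 %i, ptr %elt0, align 8          ; use-site stores stay in the body
  %v      = load i64, ptr %elt0, align 8
  %s      = load i64, ptr %sum, align 8
  %s.next = add i64 %s, %v
  store i64 %s.next, ptr %sum, align 8
  %i.next = add i64 %i, 1
  store i64 %i.next, ptr %i.slot, align 8
  br label %for.cond

for.end:
  ret i32 0
}
```

Only the `alloca` moves; every store and load stays in the block where the source statement was, so per-iteration behaviour is unchanged while stack usage no longer depends on the trip count.
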
Least-invasive locus is the emitter: in `emitAlloca`, save the current insertion point, position the builder at the entry block's first non-alloca instruction (or end of entry if empty), build the alloca there, restore the position, `mapRef` as before. The IR shape and the interpreter are untouched. All sx allocas are statically sized (TypeId), so every one is hoistable.

## Investigation prompt (paste into a fresh session)

> Fix issue 0109: loop-body allocas grow the stack per iteration and long
> loops segfault. In `src/backend/llvm/ops.zig` `emitAlloca` (~327), hoist the
> alloca to the current function's entry block: get the function via the
> current insert block's parent, position the builder before the entry
> block's first non-alloca instruction (`LLVMGetEntryBasicBlock` +
> `LLVMGetFirstInstruction` walk past `LLVMAlloca` opcodes; same positioning
> pattern as `injectCtorIntoMain` in `src/ir/emit_llvm.zig` ~466), build the
> alloca + `mapRef`, then restore the previous insertion point
> (`LLVMGetInsertBlock` before / `LLVMPositionBuilderAtEnd` after). Audit the
> other in-place `LLVMBuildAlloca` temporaries in `src/ir/emit_llvm.zig`
> (`ba.tmp`, `abi.tmp`, `ig.tmp`, etc.; grep `BuildAlloca`) and route the
> ones reachable inside loops through the same hoist helper.
>
> Semantics note: per-iteration re-zeroing must not regress. Initialization
> stores (e.g. `store undef` / `= .{...}` inits) stay where the decl was, in
> the body block; only the `alloca` itself moves to entry.
>
> Verify: both repros in `issues/0109-loop-body-alloca-stack-growth.md` (A is
> `issues/0109-loop-body-alloca-stack-growth.sx`) now print
> `sum=499999500000` / `n=3000000` and exit 0; `sx ir` on repro A shows no
> `alloca` inside `for.body.*`. Then
> `zig build && zig build test && bash tests/run_examples.sh`; any `.ir`
> snapshot churn from alloca placement must be reviewed
> (`git diff examples/expected/`) before `--update`. Promote a
> trip-count-bounded variant (e.g. 200k iterations, small buf) to
> `examples/00xx-basic-loop-local-stack-reuse.sx` as the pinned regression.