sx/issues/0110-for-array-by-value-full-array-spill.md
agra d8076b9333 lang: rename signed integer types sN -> iN
Surface rename of the signed integer family: s1..s64 become i1..i64
(u1..u64, usize, isize unchanged). 'string' keeps the s-prefix arm in
name classification; width parsing moves to the i-prefix arm next to
isize.

Internal TypeId tags follow the surface (.s8/.s16/.s32/.s64 ->
.i8/.i16/.i32/.i64), as do mono-key mangle fragments (ptr_i64,
tu_i64_bool) and all display/diagnostic formatting (i{d}).

Migrated in the same sweep: stdlib + examples + issue repros + FFI C
companions (shared symbol names like ffi_id_i64), expected
stdout/stderr/ir snapshots, specs.md, readme.md, CLAUDE.md/AGENTS.md,
implementation_plan.md, docs/, issue writeups. Vendored stb_image and
historical flow state left untouched.

zig build test: 426/426; examples suite: 595/595.
2026-06-12 09:31:53 +03:00


RESOLVED — 0110: for arr: (x) over an array copies the ENTIRE array per element

Root cause: lowerFor's by-value element fetch emitted index_get on the array value; the emitter realizes that as spill-whole-array-to-temp + GEP one element, per iteration — O(N²) bytes copied (and pre-0109, per-iteration stack growth that segfaulted a [4096]i64 loop).

Fix: in src/ir/lower/control_flow.zig lowerFor, when the iterable is an array with addressable storage, the by-value fetch is now index_gep on that storage plus a single element load, using the same base resolution as the by-ref branch. "Addressable storage" means a getExprAlloca hit where the iterable was not deref'd from a pointer (a deref'd identifier's alloca holds the pointer, not the array). Storage-less arrays and the deref'd-pointer case keep the index_get fallback. Capture semantics are unchanged: the loaded element is still a copy, so mutating it does not write back.
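The path selection described above can be summarized as a small decision function. This is a Python model for illustration only, not the Zig code in lowerFor; the `Iterable` record and its field names are hypothetical stand-ins for the conditions named in the fix (array kind, getExprAlloca hit, deref'd from a pointer).

```python
from dataclasses import dataclass


@dataclass
class Iterable:
    # Hypothetical model of the facts lowerFor looks at; not the real IR types.
    kind: str                # "array", "slice", ...
    has_storage: bool        # models a getExprAlloca hit
    via_deref: bool = False  # iterable was deref'd from a pointer


def by_value_fetch(it: Iterable) -> list[str]:
    """Return the instruction sequence chosen for the by-value element fetch."""
    if it.kind == "array" and it.has_storage and not it.via_deref:
        # Index the existing storage and load exactly one element.
        return ["index_gep", "load"]
    # Storage-less arrays, deref'd pointers, and non-array iterables
    # keep the original path.
    return ["index_get"]


print(by_value_fetch(Iterable("array", has_storage=True)))
print(by_value_fetch(Iterable("array", has_storage=False)))
print(by_value_fetch(Iterable("array", has_storage=True, via_deref=True)))
```

Only the first call takes the new path; an array returned by a call (no storage) and an array reached through a deref'd pointer both stay on index_get.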

Regression test: examples/0048-basic-for-array-large.sx (4096-element sum prints sum=8386560, segfaulted pre-fix; copy-guard confirms by-value capture mutation leaves the array untouched).
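As a cross-check on the values the regression test expects, the loop's observable behavior can be modeled in Python. This is not the sx test itself; `run_loop` is a hypothetical helper that mirrors the repro's shape (fill arr[i] = i, sum by value, mutate the capture).

```python
def run_loop(n: int) -> tuple[int, list[int]]:
    """Model a by-value for loop: sum the elements and mutate only the capture."""
    arr = list(range(n))  # arr[i] = i, as in the repro
    total = 0
    for x in arr:         # x models the by-value capture (a copy of the element)
        total += x
        x += 1            # mutating the capture must not write back to arr
    return total, arr


total, arr = run_loop(4096)
print(f"sum={total}")              # the value the 4096-element test expects
print(arr == list(range(4096)))    # the array is untouched after the loop
```

The sums follow from n(n-1)/2: 8386560 for 4096 elements and 32640 for the 256-element variant.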


0110 — for arr: (x) over an array copies the ENTIRE array per element

Symptom. The collection-form for with by-value capture over an array lowers the element fetch as index_get on the array value, which the LLVM emitter realizes as: load the whole array as an SSA value, spill it to a fresh ig.tmp alloca, GEP one element. Per iteration. Observed: a for over a [4096]i64 array segfaults (4096 iterations × 32KB spill = ~134MB of stack — see issue 0109 for why body allocas never unwind); a [256]i64 version completes but copies 256 × 2KB = 512KB to read 2KB of data. Expected: O(1) stack, O(N) total work — GEP into the array's existing storage and load the single element.
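The figures in the symptom can be reproduced with a few lines of arithmetic. The sketch below assumes 8-byte i64 elements and one full-array spill per iteration; `spill_cost` is a hypothetical helper name, not part of the compiler.

```python
ELEM_BYTES = 8  # size of one i64 element


def spill_cost(n_elems: int, elem_bytes: int = ELEM_BYTES) -> tuple[int, int, int]:
    """Return (temp size, total bytes copied, bytes actually needed) for one loop."""
    array_bytes = n_elems * elem_bytes  # one full-array temp
    copied = n_elems * array_bytes      # one spill per iteration
    useful = array_bytes                # the loop only needs each element once
    return array_bytes, copied, useful


for n in (256, 4096):
    array_bytes, copied, useful = spill_cost(n)
    print(f"N={n}: temp={array_bytes} B, copied={copied} B, "
          f"useful={useful} B, overhead={copied // useful}x")
```

For N=256 this gives a 2048-byte temp and 524288 bytes copied (512KB to read 2KB); for N=4096 it gives a 32768-byte temp and 134217728 bytes, which is the ~134MB of stack growth when each spill gets a fresh alloca. The overhead factor is N, which is the O(N²) behavior in bytes.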

Even after 0109's entry-block hoist (which collapses the spill to one reused slot and fixes the crash), the full-array copy per element remains — O(N²) bytes copied per loop. The by-ref capture path (for arr: (*x)) already does this right: it GEPs into the array's alloca (getExprAlloca) without copying. By-value should reuse the same base and just add a load.

Reproduction

#import "modules/std.sx";

main :: () -> i32 {
    arr : [4096]i64 = ---;
    i := 0;
    while i < 4096 { arr[i] = i; i += 1; }
    sum := 0;
    for arr: (x) { sum += x; }
    print("sum={}\n", sum);
    0
}
  • Observed: Segmentation fault (stack overflow). With 256 instead of 4096 it prints sum=32640 — and sx ir shows the per-iteration spill:
for.body.4:
  %ig.tmp = alloca [256 x i64], align 8        ; fresh full-array temp
  store [256 x i64] %load6, ptr %ig.tmp, align 8  ; copy ALL 256 elements
  %ig.ptr = getelementptr [256 x i64], ptr %ig.tmp, i64 0, i64 %load8
  %ig.val = load i64, ptr %ig.ptr, align 8     ; ...to read ONE
  • Expected: sum=8386560, exit 0; the body GEPs arr's own storage: getelementptr on the original alloca + a single i64 load, no array-sized temp, no copy.

Repro co-located: issues/0110-for-array-by-value-full-array-spill.sx (unpinned until fixed; depends on 0109 only for the crash, not for the copy).

Root cause (suspected area)

src/ir/lower/control_flow.zig lowerFor (~336-343): the by-ref branch builds index_gep on the array's storage (getExprAlloca(fe.iterable) orelse iterable), but the by-value branch emits .index_get = .{ .lhs = iterable, .rhs = idx_val } where iterable is the loaded array value. The emitter's index_get on an aggregate SSA value (src/ir/emit_llvm.zig, ig.tmp site ~2100s) has no address to index, so it spills the whole value first.

Fix shape: in lowerFor, when the iterable is an array with storage (getExprAlloca hit, the same condition the by-ref branch uses), lower the by-value element as index_gep on that storage followed by a load of the element type, instead of index_get on the value. Capture semantics are unchanged: the load still produces a per-iteration copy of the element, so mutating x must not write back. Mutating the array inside the body is already visible to subsequent iterations through the same storage in the by-ref form; for the by-value form, match existing behavior and cover it with a test. Keep the index_get fallback for storage-less iterables (e.g. an array returned by a call).

Investigation prompt (paste into a fresh session)

Fix issue 0110: collection-form for arr: (x) (by-value capture) over an array spills the entire array per iteration. In src/ir/lower/control_flow.zig lowerFor (~336-343), make the by-value element fetch reuse the by-ref branch's base resolution: if the iterable is an .array and getExprAlloca(fe.iterable) (or the deref'd pointer base) yields storage, emit index_gep on that storage + builder.load of the element type; otherwise keep the existing index_get path. Slices/List views are unaffected (their index_get is already pointer-backed).

Verify: ./zig-out/bin/sx run issues/0110-for-array-by-value-full-array-spill.sx prints sum=8386560 and exits 0. This requires 0109 to have landed; before it, verify instead that the 256-element variant's sx ir shows no ig.tmp array spill in for.body.*. Add a semantics guard test: mutate x in the body and confirm the array is unchanged after the loop (by-value capture stays a copy). Then run zig build && zig build test && bash tests/run_examples.sh, review any .ir snapshot diffs, and promote the repro to examples/00xx-basic-for-array-large.sx.
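The "no ig.tmp array spill in for.body.*" check can be scripted against the textual IR. The following is a minimal sketch, assuming the IR has been captured as text and that spills keep the ig.tmp name seen in the reproduction; `find_array_spills` and the value names in the "fixed" sample (%arr, %elem.ptr, %elem) are hypothetical. It flags both an array-typed ig.tmp alloca and an array-typed store into ig.tmp, since after 0109's hoist the alloca may sit in the entry block while the copy stays in the loop body.

```python
import re

LABEL = re.compile(r"^([A-Za-z0-9_.$-]+):")
SPILL = re.compile(
    r"%ig\.tmp[\w.]*\s*=\s*alloca\s+\["              # array-sized temp
    r"|store\s+\[[^\]]*\]\s+\S+,\s*ptr\s+%ig\.tmp"   # whole-array copy into it
)


def find_array_spills(ir_text: str) -> list[tuple[str, str]]:
    """Return (block, instruction) pairs for array spills inside for.body blocks."""
    spills = []
    block = None
    for line in ir_text.splitlines():
        stripped = line.strip()
        label = LABEL.match(stripped)
        if label:
            block = label.group(1)  # entering a new basic block
            continue
        if block and block.startswith("for.body") and SPILL.search(stripped):
            spills.append((block, stripped))
    return spills


BEFORE = """\
for.body.4:
  %ig.tmp = alloca [256 x i64], align 8
  store [256 x i64] %load6, ptr %ig.tmp, align 8
  %ig.ptr = getelementptr [256 x i64], ptr %ig.tmp, i64 0, i64 %load8
  %ig.val = load i64, ptr %ig.ptr, align 8
"""

# Hypothetical shape of the fixed body: GEP on the array's own storage + one load.
AFTER = """\
for.body.4:
  %elem.ptr = getelementptr [256 x i64], ptr %arr, i64 0, i64 %load8
  %elem = load i64, ptr %elem.ptr, align 8
"""

print(len(find_array_spills(BEFORE)))
print(len(find_array_spills(AFTER)))
```

On the pre-fix sample from the reproduction the function reports two hits (the alloca and the store); on a body that only GEPs and loads a single element it reports none.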