Files
sx/issues/0110-for-array-by-value-full-array-spill.md
agra d8076b9333 lang: rename signed integer types sN -> iN
Surface rename of the signed integer family: s1..s64 become i1..i64
(u1..u64, usize, isize unchanged). 'string' keeps the s-prefix arm in
name classification; width parsing moves to the i-prefix arm next to
isize.

Internal TypeId tags follow the surface (.s8/.s16/.s32/.s64 ->
.i8/.i16/.i32/.i64), as do mono-key mangle fragments (ptr_i64,
tu_i64_bool) and all display/diagnostic formatting (i{d}).

Migrated in the same sweep: stdlib + examples + issue repros + FFI C
companions (shared symbol names like ffi_id_i64), expected
stdout/stderr/ir snapshots, specs.md, readme.md, CLAUDE.md/AGENTS.md,
implementation_plan.md, docs/, issue writeups. Vendored stb_image and
historical flow state left untouched.

zig build test: 426/426; examples suite: 595/595.
2026-06-12 09:31:53 +03:00

113 lines
5.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# RESOLVED — 0110: `for arr: (x)` over an array copies the ENTIRE array per element
**Root cause:** `lowerFor`'s by-value element fetch emitted `index_get` on the
array *value*; the emitter realizes that as spill-whole-array-to-temp + GEP one
element, per iteration — O(N²) bytes copied (and pre-0109, per-iteration stack
growth that segfaulted a `[4096]i64` loop).
**Fix:** in `src/ir/lower/control_flow.zig` `lowerFor`, when the iterable is an
array with addressable storage (`getExprAlloca` hit, and the iterable was not
deref'd from a pointer — a deref'd identifier's alloca holds the pointer, not
the array), the by-value fetch is `index_gep` on the storage + a single element
`load`. Storage-less arrays and the deref'd-pointer case keep the `index_get`
fallback. Capture semantics unchanged: the loaded element is still a copy
(mutating it does not write back), matching the by-ref branch's base resolution.
**Regression test:** `examples/0048-basic-for-array-large.sx` (4096-element sum
prints `sum=8386560`, segfaulted pre-fix; copy-guard confirms by-value capture
mutation leaves the array untouched).
---
# 0110 — `for arr: (x)` over an array copies the ENTIRE array per element
**Symptom.** The collection-form `for` with by-value capture over an array
lowers the element fetch as `index_get` on the array *value*, which the LLVM
emitter realizes as: load the whole array as an SSA value, spill it to a
fresh `ig.tmp` alloca, GEP one element. Per iteration. Observed: a `for` over
a `[4096]i64` array segfaults (4096 iterations × 32KB spill = ~134MB of stack
— see issue 0109 for why body allocas never unwind); a `[256]i64` version
completes but copies 256 × 2KB = 512KB to read 2KB of data. Expected: O(1)
stack, O(N) total work — GEP into the array's existing storage and load the
single element.
Even after 0109's entry-block hoist (which collapses the spill to one reused
slot and fixes the crash), the full-array copy per element remains — O(N²)
bytes copied per loop. The by-ref capture path (`for arr: (*x)`) already does
this right: it GEPs into the array's alloca (`getExprAlloca`) without
copying. By-value should reuse the same base and just add a load.
## Reproduction
```sx
#import "modules/std.sx";
main :: () -> i32 {
arr : [4096]i64 = ---;
i := 0;
while i < 4096 { arr[i] = i; i += 1; }
sum := 0;
for arr: (x) { sum += x; }
print("sum={}\n", sum);
0
}
```
- **Observed**: `Segmentation fault` (stack overflow). With `256` instead of
`4096` it prints `sum=32640` — and `sx ir` shows the per-iteration spill:
```llvm
for.body.4:
%ig.tmp = alloca [256 x i64], align 8 ; fresh full-array temp
store [256 x i64] %load6, ptr %ig.tmp, align 8 ; copy ALL 256 elements
%ig.ptr = getelementptr [256 x i64], ptr %ig.tmp, i64 0, i64 %load8
%ig.val = load i64, ptr %ig.ptr, align 8 ; ...to read ONE
```
- **Expected**: `sum=8386560`, exit 0; the body GEPs `arr`'s own storage:
`getelementptr` on the original alloca + a single `i64` load, no array-sized
temp, no copy.
Repro co-located: `issues/0110-for-array-by-value-full-array-spill.sx`
(unpinned until fixed; depends on 0109 only for the crash, not for the copy).
## Root cause (suspected area)
`src/ir/lower/control_flow.zig` `lowerFor` (~336-343): the by-ref branch
builds `index_gep` on the array's storage (`getExprAlloca(fe.iterable) orelse
iterable`), but the by-value branch emits
`.index_get = .{ .lhs = iterable, .rhs = idx_val }` where `iterable` is the
loaded array **value**. The emitter's `index_get` on an aggregate SSA value
(`src/ir/emit_llvm.zig`, `ig.tmp` site ~2100s) has no address to index, so it
spills the whole value first.
Fix shape: in `lowerFor`, when the iterable is an array with storage
(`getExprAlloca` hit — same condition the by-ref branch uses), lower the
by-value element as `index_gep` on that storage followed by a `load` of the
element type, instead of `index_get` on the value. Capture semantics are
unchanged: the load still produces a per-iteration copy of the *element*
(mutating `x` must not write back; mutating the array inside the body is
already visible to subsequent iterations through the same storage in the
by-ref form — match existing behavior with a test). Keep the `index_get`
fallback for storage-less iterables (e.g. an array returned by a call).
## Investigation prompt (paste into a fresh session)
> Fix issue 0110: collection-form `for arr: (x)` (by-value capture) over an
> array spills the entire array per iteration. In
> `src/ir/lower/control_flow.zig` `lowerFor` (~336-343), make the by-value
> element fetch reuse the by-ref branch's base resolution: if the iterable is
> an `.array` and `getExprAlloca(fe.iterable)` (or the deref'd pointer base)
> yields storage, emit `index_gep` on that storage + `builder.load` of the
> element type; otherwise keep the existing `index_get` path. Slices/List
> views are unaffected (their `index_get` is already pointer-backed).
>
> Verify: `./zig-out/bin/sx run issues/0110-for-array-by-value-full-array-spill.sx`
> prints `sum=8386560` exit 0 (after 0109 lands; before it, verify the
> 256-element variant's `sx ir` shows no `ig.tmp` array spill in
> `for.body.*`). Add a semantics guard test: mutate `x` in the body and
> confirm the array is unchanged after the loop (by-value capture stays a
> copy). Then `zig build && zig build test && bash tests/run_examples.sh`,
> review any `.ir` snapshot diffs, and promote the repro to
> `examples/00xx-basic-for-array-large.sx`.