mem: Phase 1.4a — fat-pointer aggregates from #run serialize via host memory

The Phase 1.4 serializer left a silent malformed-const case: when the
interp evaluated a `#run` returning a string (or anything with a fat
pointer inside), the data field came in as a `.int` holding a libc
host address. `LLVMConstInt(ptr_type, addr, 1)` happily emitted `i0 0`
in the static const, and the runtime segfaulted on the first read.

Phase 1.4a closes this for string and slice destinations.
`valueToLLVMConst` now takes the IR `TypeId` (instead of just the
LLVM type) and a borrowed `*Interpreter`. A new helper
`serializeAggregateValue` splits on the IR type:

- `string` / `slice` (fat pointer `{data, len}`): extract `len`, read
  that many bytes from the data field's address (via `interp.heapSlice`
  for `heap_ptr`, via a new `readHostBytes` for `byte_ptr` / `.int`,
  via slice indexing for string literals). Emit the bytes as a private
  global byte array using the existing `emitConstStringGlobal`. The
  fat-pointer aggregate's data ptr resolves to the byte array's address.
- `struct`: walk the IR field types in lockstep with the value's
  fields; recurse with each declared field TypeId. This replaces the
  old LLVM-type-walk via `LLVMStructGetTypeAtIndex` which couldn't tell
  string-typed fields from generic ptr fields.
- `array`: walk with the element TypeId.

The remaining `.int → ptr` trap (a host address landing in a bare ptr
field outside a fat pointer) now bails loudly with a named diagnostic
identifying it as Phase 1.4a heap-walk follow-up territory. No
practical trigger in-tree, so deferred.
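
The dispatch and the bail-out described above can be modelled roughly as
follows. This is a simplified sketch in C, not the Zig code in
`emit_llvm.zig`: `Type`, `Value`, `Sink`, and `serialize_value` are
hypothetical stand-ins, the sink collects captured bytes instead of
emitting LLVM globals, and the host read is a plain `memcpy` from the
address stored in the data field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical IR type kinds; the real TypeId table is not shown in the commit. */
typedef enum { TY_INT, TY_PTR, TY_STRING, TY_STRUCT } TypeKind;

typedef struct Type {
    TypeKind kind;
    const struct Type *const *fields; /* declared field types, TY_STRUCT only */
    size_t n_fields;
} Type;

/* Interpreter value: a scalar integer or an aggregate of child values. */
typedef struct Value {
    int is_agg;
    uint64_t int_val;          /* scalar payload; may be a host address */
    const struct Value *items; /* aggregate children */
    size_t n_items;
} Value;

/* Stand-in for the emitted constant data: captured bytes plus a diagnostic. */
typedef struct {
    unsigned char bytes[256];
    size_t len;
    const char *error;
} Sink;

static int fail(Sink *out, const char *msg) {
    out->error = msg;
    return -1;
}

/* Walk the value guided by the declared IR type, not by its runtime shape. */
static int serialize_value(const Type *ty, const Value *v, Sink *out) {
    assert(ty != NULL && v != NULL && out != NULL);
    switch (ty->kind) {
    case TY_INT:
        return v->is_agg ? fail(out, "aggregate in int slot") : 0;
    case TY_PTR:
        /* Bare pointer slot: nothing says how many bytes to capture. */
        return fail(out, "host address in bare ptr slot (heap-walk follow-up)");
    case TY_STRING: {
        /* Fat pointer {data, len}: copy len bytes from the host address. */
        if (!v->is_agg || v->n_items != 2) return fail(out, "malformed fat pointer");
        uint64_t addr = v->items[0].int_val;
        uint64_t len = v->items[1].int_val;
        if (len > sizeof out->bytes - out->len) return fail(out, "sink full");
        if (len > 0)
            memcpy(out->bytes + out->len, (const void *)(uintptr_t)addr, (size_t)len);
        out->len += (size_t)len;
        return 0;
    }
    case TY_STRUCT:
        /* Lockstep walk: each child is serialized with its declared field type. */
        if (!v->is_agg || v->n_items != ty->n_fields) return fail(out, "field count mismatch");
        for (size_t i = 0; i < ty->n_fields; i++)
            if (serialize_value(ty->fields[i], &v->items[i], out) != 0) return -1;
        return 0;
    }
    return fail(out, "unknown type kind");
}
```

In this model a string-typed field and a bare pointer field both hold an
integer address at run time; only the declared type tells them apart,
which is why the walk follows IR types rather than LLVM types.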

`Interpreter.heapSlice` is promoted from package-private to `pub` so
the serializer can read interp-managed heap data.

Regression: `examples/136-comptime-string-global.sx` —
`GREETING :: #run build_greeting();` where `build_greeting` returns
`concat("hello", " world")`. Runtime prints `greeting = 'hello world'`
and `greeting.len = 11`. Pre-1.4a this segfaulted on the first read.

158/158 example tests pass; chess is clean on macOS / iOS sim / Android
via `tools/verify-step.sh`.
agra
2026-05-25 15:45:33 +03:00
parent da1063f1bb
commit 179310d62b
6 changed files with 231 additions and 60 deletions

@@ -5,6 +5,41 @@ Tracking checkpoint for the mem.sx Zig-aligned implementation
## Last completed step
- **Phase 1.4a — IR `TypeId` threaded through `valueToLLVMConst`;
string/slice fat-pointer aggregates serialize by reading host
memory.** The Phase 1.4 serializer bailed on `heap_ptr` / `byte_ptr`
and silently emitted `i0 0` for the trap-case where a `.int` host
address landed in a ptr-typed slot. Now the call site at
`emit_llvm.zig:676` passes `global.ty` (TypeId) and `&interp_inst`
instead of just the LLVM type. The serializer splits on the IR
type:
- `string` / `slice` (fat pointer `{data, len}`): extract `len`,
read that many bytes from the data field's address (heap_ptr →
`interp.heapSlice`; byte_ptr/int → raw process memory via a new
`readHostBytes` helper; string literal → direct slice). Emit
the bytes as a private global byte array via the existing
`emitConstStringGlobal` and use it as the aggregate's data ptr.
- `struct`: walk the IR field types in lockstep with the value
fields; recurse per field with its declared TypeId. Replaces
the old LLVM-type-walk via `LLVMStructGetTypeAtIndex` which
couldn't tell `string`-typed fields from generic ptr fields.
- `array`: walk elements with the element TypeId.
The `.int → ptr` slot mismatch (a host address landing in a
non-fat-pointer ptr slot) now bails loudly with a named
diagnostic — that's the genuine heap-walk frontier where future
work would need to capture struct content recursively, not the
silent malformed-const we had before.
`Interpreter.heapSlice` was promoted from package-private to
`pub` so the serializer can read the interp heap. Regression at
`examples/136-comptime-string-global.sx`: `GREETING :: #run
build_greeting();` where `build_greeting` returns `concat("hello",
" world")` — runtime prints `greeting = 'hello world' / greeting.len
= 11`. Pre-1.4a this segfaulted. 158/158 example tests + chess
clean on all three platforms via `tools/verify-step.sh`.
- **Allocator `init` returns the state by value.** Building on the
Option 3 lvalue-borrow rule, `GPA.init`, `Arena.init`, and
`TrackingAllocator.init` now return `T` (not `*T`). The caller binds
@@ -230,22 +265,17 @@ allocator).
Open follow-ups, in roughly the order they make sense:
- **Phase 1.4a** — Thread IR `TypeId` (not just LLVM `LLVMTypeRef`)
through `valueToLLVMConst` so `heap_ptr` values from `#run` can be
serialized. Requires walking the struct/slice/primitive children
recursively; cycle detection via `(heap_id, type_id)` visited set.
Practical trigger: a `#run` that builds a `Widget.{}` and
protocol-erases via `xx`, producing a `heap_ptr` to the boxed
payload. None exists in-tree yet — surface it via a focused
regression alongside the implementation.
- **`.int → ptr` heap-walk follow-up.** Phase 1.4a handles the
fat-pointer aggregate case. A `.int` host-address landing in a
bare ptr field (e.g. a struct with a raw `[*]u8` member) still
bails. Requires recursive struct walking with cycle detection on
`(heap_id, type_id)` visited pairs. No practical trigger
in-tree; defer until a real `#run` site surfaces the need.
- **`resolveType(null) -> .s64` audit.** The silent fallback at
`lower.zig:8387` is still in place for every caller other than
`lowerComptimeGlobal`. CLAUDE.md REJECTED PATTERNS forbids this
shape. Survey callers; either make the default an error
diagnostic or thread an inferred type per call site.
- **`tools/verify-step.sh` gate.** Run iOS sim + Android to confirm
this session's GlyphCache + Metal/Gles3 sweeps + Phase 1.4 didn't
regress non-macOS platforms.
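
For the `.int → ptr` heap-walk follow-up listed above, cycle detection on
`(heap_id, type_id)` pairs might look roughly like the sketch below. It
is a C model under stated assumptions, not planned code: `VisitedSet`,
`HeapObject`, `visited_insert`, and `walk_heap` are hypothetical names,
the set is a small fixed array with a linear scan, and each heap object
has at most two outgoing pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: one heap allocation viewed at one declared type. */
typedef struct { uint32_t heap_id; uint32_t type_id; } VisitKey;

typedef struct {
    VisitKey keys[64];
    size_t count;
} VisitedSet;

/* Returns 1 if the pair was newly recorded, 0 if already present, -1 if full. */
static int visited_insert(VisitedSet *set, uint32_t heap_id, uint32_t type_id) {
    assert(set != NULL);
    for (size_t i = 0; i < set->count; i++)
        if (set->keys[i].heap_id == heap_id && set->keys[i].type_id == type_id)
            return 0;
    if (set->count == sizeof set->keys / sizeof set->keys[0]) return -1;
    set->keys[set->count].heap_id = heap_id;
    set->keys[set->count].type_id = type_id;
    set->count++;
    return 1;
}

/* Toy heap object: a type plus up to two pointees (-1 means no pointer). */
typedef struct { uint32_t type_id; int left; int right; } HeapObject;

/* Counts the objects that would be serialized; revisits are skipped,
   so back edges and shared pointees cannot cause unbounded recursion. */
static int walk_heap(const HeapObject *heap, int id, VisitedSet *seen) {
    if (id < 0) return 0;
    if (visited_insert(seen, (uint32_t)id, heap[id].type_id) != 1) return 0;
    return 1 + walk_heap(heap, heap[id].left, seen)
             + walk_heap(heap, heap[id].right, seen);
}
```

Keying on the pair rather than on `heap_id` alone lets the same
allocation be walked once per declared type, which matters if two fields
view the same memory differently. A real implementation would also need
to map each revisited pair back to the global already emitted for it.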
## Phase 0.3 audit findings — chess allocator usage (closed)
@@ -271,7 +301,18 @@ Allocator value naturally.
## Log
- **2026-05-25 (latest)** — Phase 1.4a shipped. `valueToLLVMConst`
takes IR `TypeId` (not LLVM type) + an interpreter handle.
String/slice fat pointers are serialized by capturing the
pointed-to bytes (via `interp.heapSlice` for heap_ptr, raw
process memory via new `readHostBytes` for byte_ptr / .int,
direct slice for a string literal) and emitting a private global
byte array. Struct / array aggregates recurse with declared
field/element TypeIds.
The trap case (`.int` landing in a ptr slot outside a fat
pointer) bails loudly. `Interpreter.heapSlice` promoted to
`pub`. Regression: `examples/136-comptime-string-global.sx`.
158/158 + chess green on all three platforms.
- **2026-05-25 (penultimate)** — Allocator `init` returns the state by
value. GPA / Arena / TrackingAllocator all changed; `Arena.deinit`
no longer self-deallocs. `UIPipeline.arena_a/_b` embedded as values;
`@self.arena_a` at the *Arena use site. `examples/50-smoke.sx`