compiler-API: welded structs by reflection + memory-order validation

Replace the explored byte-layout-override engine (offset-ordered LLVM structs /
weld plans / byte-blobs — all unnecessary) with a much simpler design: a welded
`struct abi(.zig) extern compiler { … }` is a bodied header declaring its fields
in the bound compiler type's MEMORY order. The compiler reflects the real Zig
type (field names via @typeInfo, offsets via @offsetOf, size via @sizeOf —
nothing hand-maintained) and validates the header matches, with loud diagnostics.

On pass it is an ordinary struct whose natural layout already equals the Zig
layout — no reorder, no padding, no index/remap tables, no special LLVM path — so
@ptrCast'ing it to the compiler's own type and dereferencing is byte-identical.
When types.zig shifts, the header stops matching and the developer gets a specific
message to fix it.

- compiler_lib.zig: weldStruct reflects field names and bakes bound_types fields
  in ascending-offset (memory) order; deleted computeWeldPlan/WeldPlan/WeldElement.
- nominal.zig validateWeldedStruct: precise diagnostics — field-not-found,
  wrong-field-order (+ expected memory order), type-layout (size) mismatch,
  total-size mismatch.
- Examples: 0627 (StructInfo in memory order, byte-identical, usable),
  1186 (source-order StructInfo -> wrong-field-order diagnostic); 1183 refreshed.
- Design doc + checkpoint updated.
agra
2026-06-17 15:45:23 +03:00
parent 88c4cbcfa5
commit 40d075ca98
14 changed files with 230 additions and 218 deletions


@@ -7,25 +7,59 @@ Companion to the design-of-record
with ONE welded mechanism. Branch: `reify` (off `master`). Update after every step.
## ⏯ Resume (fresh session)
Phase 1 is COMPLETE and committed (`cd5b958`); Phase 2 (full byte-layout weld)
just started. **Do sub-step 2.2 next** — make `src/backend/llvm/types.zig`'s
`.@"struct"` case build a welded struct's LLVM type from `compiler_lib.computeWeldPlan`
(offset-ordered field elements + `[N x i8]` padding) with a build-time
`LLVMOffsetOfElement == plan offset` + `LLVMABISizeOfType == total_size` assertion;
cache the plan per TypeId for the GEP sites. The plan math (sub-step 2.1) is done,
pure, and unit-tested — see `computeWeldPlan` in `src/ir/compiler_lib.zig`. Full
2.2-2.6 breakdown under **## Next step**. Read order: this file → the design doc →
`src/ir/compiler_lib.zig`. Build/verify: `zig build && zig build test` (green now).
Phase 1 done; Phase 2 **welded structs are working** via a much simpler design than
the original byte-layout-override "GEP engine" (that plan — `computeWeldPlan`,
offset-ordered LLVM structs, byte-blobs — was explored and DROPPED). The locked
design: a welded `Name :: struct abi(.zig) extern compiler { … }` is a bodied
header declaring fields in the compiler type's MEMORY order; the compiler reflects
the bound Zig type (`@typeInfo` names + `@offsetOf` offsets + `@sizeOf`, nothing
maintained by hand) and VALIDATES the header matches, with loud diagnostics. On
pass it's an ordinary byte-identical struct — so `@ptrCast` to the compiler's own
type + deref just works; no index tables, no reorder, no special emit.
> ⚠ Snapshot gotcha: `zig build test -Dupdate-goldens` on this aarch64 host clobbers
> cross-arch examples' CI-captured `.stdout` (1228/1231/1639/1651/1657-1660) with
> host-specific empties. After regenerating, revert those (`git checkout` / `rm`)
> before committing — they are NOT part of this stream.
**Next:** Phase 2 continues — re-express `type_info`/`define` (struct) as sx over
welded `register_struct`/`find_type` (host-call bridge, Phase 2.5/2.6); see
**## Next step**. Read order: this file → `src/ir/compiler_lib.zig` (registry +
reflection) → `src/ir/lower/nominal.zig` `validateWeldedStruct`. Build/verify:
`zig build && zig build test`.
> ⚠ Snapshot workflow: use `-Dname=examples/NNNN-foo.sx[,…] -Dupdate-goldens` to
> regenerate ONLY the named example(s) — a full `-Dupdate-goldens` re-runs all ~690
> and a flaky/host-divergent example (AOT/cross-arch) can clobber good snapshots.
> See CLAUDE.md → Snapshot integrity.
## Last completed step
**Phase 2, sub-step 1 — the weld-plan layout math + `StructInfo` registered.**
The de-risked core of the byte-layout-override ("GEP") engine, pure + unit-tested,
no emit/interp wiring yet (suite trivially green).
**Phase 2 — welded structs by reflection + memory-order validation (byte-identical,
no GEP engine).** A welded `struct abi(.zig) extern compiler { … }` now works
end-to-end as a byte-identical mirror of the bound Zig type.
Design (locked, supersedes the byte-layout-override plan):
- The sx header declares fields in the compiler type's MEMORY order. The compiler
REFLECTS the bound Zig type — field names from `@typeInfo`, offsets from
`@offsetOf`, size from `@sizeOf` — and validates the header matches. Nothing is
maintained by hand; a `types.zig` change re-reflects on the next compiler build.
- On pass it's an ORDINARY struct whose natural layout already equals the Zig
layout → `@ptrCast` to the compiler type + deref is byte-identical. No
byte-blob, no index/remap tables, no reorder, no special LLVM path.
- Loud, precise diagnostics on any drift: *field not found* (+ memory order),
*wrong field order at position N* (+ expected memory order), *type layout
mismatch* (field size), *layout mismatch* (total size / count).
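
Sketch of the validation order described above, modelled in Python `ctypes` rather than the compiler's Zig. Everything here is an assumption for illustration: `BoundType`, its fields, the `reflect`/`validate` helpers, and the message wording are hypothetical stand-ins, not the text `validateWeldedStruct` emits. `ctypes` field descriptors play the role of `@typeInfo`/`@offsetOf`/`@sizeOf`.

```python
import ctypes


class BoundType(ctypes.Structure):
    # Hypothetical stand-in for the bound compiler type.
    _fields_ = [
        ("name", ctypes.c_uint32),
        ("field_count", ctypes.c_uint32),
        ("fields", ctypes.c_void_p),
    ]


class GoodHeader(ctypes.Structure):
    # Hypothetical header that declares the fields in memory order.
    _fields_ = [
        ("name", ctypes.c_uint32),
        ("field_count", ctypes.c_uint32),
        ("fields", ctypes.c_void_p),
    ]


class SourceOrderHeader(ctypes.Structure):
    # Hypothetical header that declares the first two fields swapped.
    _fields_ = [
        ("field_count", ctypes.c_uint32),
        ("name", ctypes.c_uint32),
        ("fields", ctypes.c_void_p),
    ]


def reflect(ty):
    """Return ([(name, offset, size)] sorted by ascending offset, total size)."""
    rows = []
    for name, *_ in ty._fields_:
        desc = getattr(ty, name)  # field descriptor carries .offset and .size
        rows.append((name, desc.offset, desc.size))
    rows.sort(key=lambda row: row[1])
    return rows, ctypes.sizeof(ty)


def validate(header, bound):
    """Return None on a match, else the first diagnostic as a string."""
    got, got_total = reflect(header)
    want, want_total = reflect(bound)
    order = ", ".join(name for name, _, _ in want)
    want_names = {name for name, _, _ in want}
    for name, _, _ in got:
        if name not in want_names:
            return f"field not found: '{name}' (memory order: {order})"
    if len(got) != len(want):
        return f"layout mismatch: {len(got)} fields declared, {len(want)} expected"
    # Positions are 0-based in this sketch.
    for pos, ((g_name, _, g_size), (w_name, _, w_size)) in enumerate(zip(got, want)):
        if g_name != w_name:
            return (f"wrong field order at position {pos}: '{g_name}' "
                    f"(expected memory order: {order})")
        if g_size != w_size:
            return f"type layout mismatch: '{g_name}' is {g_size} bytes, expected {w_size}"
    if got_total != want_total:
        return f"layout mismatch: total size {got_total}, expected {want_total}"
    return None
```

`validate(GoodHeader, BoundType)` returns `None`; `validate(SourceOrderHeader, BoundType)` returns the wrong-field-order message with the expected memory order appended, which is the shape of failure example `1186` exercises.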
What changed from the dropped plan:
- `compiler_lib.zig`: `weldStruct` now REFLECTS field names (`@typeInfo`) and bakes
`bound_types` fields in ascending-OFFSET (memory) order — no hand-listed names.
Deleted `computeWeldPlan`/`WeldPlan`/`WeldElement`. `validateStructLayout` checks
the sx header against the memory-ordered registry.
- `nominal.zig` `validateWeldedStruct`: renders the precise diagnostics
(+ `weldedFieldOrderStr`).
- Examples: `0627` (StructInfo in memory order, byte-identical, usable);
`1186` (source-order StructInfo → wrong-field-order diagnostic). `1183` message
refreshed.
- `zig build` + `zig build test` green (692 corpus, unit tests pass).
### Earlier — Phase 2.1 (weld-plan layout math, now removed)
**The weld-plan offset math + `StructInfo` registered.** Was the core of the
byte-layout-override engine; superseded by the reflection+validation design above.
Decision (locked 2026-06-17): **full byte-layout weld** — a welded sx struct is
laid out byte-identically to the bound Zig type (Zig's `@offsetOf`, reordering +
@@ -204,54 +238,48 @@ What landed:
- **Deferred**: offset-override / LLVM byte-offset GEP for non-natural layouts
(needed by `StructInfo`'s slice field, Phase 2).
## Next step — Phase 2 decomposition (byte-layout weld for `StructInfo`)
## Next step — Phase 2: welded compiler FUNCTIONS over the real types
The weld plan (sub-step 1) is the pure layout math. The remaining sub-steps wire
it through emit + interp so a non-natural welded struct actually works. Each must
stay green; do ONE per session (the IR-stream split rule).
Welded structs are byte-identical mirrors now, so the API surface can grow:
- **2.2 — LLVM type honours the plan.** In `src/backend/llvm/types.zig` `.@"struct"`
case: if the struct's name is in `compiler_lib.findType`, build the LLVM struct
from `computeWeldPlan` — elements in offset order (real field types + `[N x i8]`
padding), and **assert** `LLVMOffsetOfElement(elem) == plan.elements[e].offset`
for every field element + `LLVMABISizeOfType == total_size` (the build-time
layout-equality assertion; mismatch = a loud emit failure). Cache the plan per
TypeId (the GEP sites + interp need the remap). Prove: a welded struct's LLVM
type has the Zig offsets (an emit-level test or an `.ir`/codegen check).
- **2.3 — field access honours the remap.** Every `struct_gep` / field load+store
for a welded struct maps the sx field index → `plan.sx_to_llvm[i]` before
`LLVMBuildStructGEP2` (`src/backend/llvm/ops.zig` — `emitFieldAccess` /
struct-literal init / the `field_ptr` paths). Prove with a REORDERED welded
struct used as runtime data: construct + read each field back correct.
- **2.4 — interp comptime layout.** The comptime interp represents structs as
`Value.aggregate` by logical index — fine for field access. The byte layout
matters at the handler boundary: serialize a welded-struct `Value` into
Zig-layout memory (via the plan's offsets) so a handler can take `*ZigType`,
and read a Zig-layout result back into a `Value`. (Or: keep handlers reading
`Value` aggregates logically — decide when wiring `register_struct`.)
- **2.5 — `register_struct` / `find_type` handlers.** Bind
`register_struct(StructInfo) -> Type` (guarded: dup field names, kind) +
`find_type(StringId) -> ?Type` over the host-call bridge, consuming a welded
`StructInfo`. Prove: build a struct programmatically + round-trip a source one.
- **2.6 — re-express `type_info`/`define` (struct) as sx** over `register_struct`/
- **Bind `register_struct` / `find_type`** over the host-call bridge
(`compiler_lib.zig` `bound_fns`, like `intern`/`text_of`). `register_struct`
takes a welded `StructInfo` and mints a real `TypeId` (guarded: dup field names,
kind well-formedness — the checks `define` does today). Because the welded
`StructInfo` is byte-identical, the handler can read it as the real Zig
`*StructInfo` (cast + deref) rather than marshalling a `Value` field-by-field —
the payoff of the byte-weld. `find_type(StringId) -> ?Type` reads the table.
Prove: build a struct programmatically + round-trip a source one.
- **Re-express `type_info`/`define` (struct) as sx** over `register_struct`/
`find_type`; migrate `examples/0622`; delete the bespoke struct interp arms
(`defineStruct`/`reflectTypeInfo` struct path). Design build-order steps 2-3.
(`defineStruct` / the `reflectTypeInfo` struct path).
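
Sketch of the cast + deref payoff mentioned in the first bullet above, again modelled in Python `ctypes` rather than Zig. `HostInfo`, `WeldedInfo`, and their fields are hypothetical stand-ins, not the real `StructInfo` layout; `ctypes.cast` stands in for `@ptrCast`. The contrast is with `marshal`, which copies field by field.

```python
import ctypes

# Hypothetical field list shared by both sides; identical order and types
# is what validation is assumed to have established already.
_FIELDS = [
    ("name_id", ctypes.c_uint32),
    ("kind", ctypes.c_uint32),
    ("field_count", ctypes.c_uint64),
]


class HostInfo(ctypes.Structure):
    # Stand-in for the compiler's own type.
    _fields_ = _FIELDS


class WeldedInfo(ctypes.Structure):
    # Stand-in for the validated welded header.
    _fields_ = _FIELDS


def as_host(welded):
    """Reinterpret the welded value as the host type without copying."""
    assert ctypes.sizeof(WeldedInfo) == ctypes.sizeof(HostInfo)
    ptr = ctypes.cast(ctypes.pointer(welded), ctypes.POINTER(HostInfo))
    return ptr.contents  # a view over the same bytes


def marshal(welded):
    """The alternative: build a separate host value field by field."""
    return HostInfo(welded.name_id, welded.kind, welded.field_count)
```

A value returned by `as_host` aliases the welded value, so a write through one is visible through the other; a value returned by `marshal` is an independent copy that has to be kept in sync by hand whenever a field is added.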
Then Phase 3+: widen to enum/tuple (`EnumInfo`/`TaggedUnionInfo`/`TupleInfo`,
optional fields → sentinels), migrate `BuildOptions` to `abi(.zig) extern
compiler` (the `#compiler` registry re-homes under the `compiler` lib), delete
Then Phase 3+: widen the welded types to `EnumInfo`/`TaggedUnionInfo`/`TupleInfo`
(optional fields → sentinels) — each just needs an sx header in the compiler
type's memory order + the matching `register_*` fn. Finally migrate `BuildOptions`
to `abi(.zig) extern compiler` (re-home the `#compiler` registry) and delete
`#compiler`.
Note: a welded struct with an `?T` / `union(enum)` field (e.g. `EnumInfo`'s
`backing_type: ?TypeId`, `explicit_values: ?[]const i64`) is the next layout
wrinkle — the sx header must mirror Zig's optional/union representation. Handle
when reached (sentinels or accessor fns; see the design doc Risks).
## Known issues
- None for this stream. (Metatype's deferred enhancement is issue 0141 — comptime
`List` growth; orthogonal, see `current/CHECKPOINT-METATYPE.md`.)
## Log
- **Phase 2.1 — weld-plan layout math + `StructInfo` registered.** Decision:
full byte-layout weld (not logical-field marshalling). `computeWeldPlan`
(offset-order elements + padding + sx→element remap), pure + unit-tested
against `Field` (identity) and `StructInfo` (reordered, remap `[1,0,3,2]`).
No emit/interp wiring yet. Build + suite green.
- **Phase 2 — welded structs by reflection + memory-order validation.** Dropped
the byte-layout-override engine (computeWeldPlan / offset-ordered LLVM struct /
byte-blob — all explored, all unnecessary). Instead: the sx header declares
fields in the compiler type's memory order; the compiler reflects the bound Zig
type (`@typeInfo`/`@offsetOf`/`@sizeOf`) and validates the header matches with
loud diagnostics (field-not-found, wrong-order+expected-order, size mismatch).
On pass it's an ordinary byte-identical struct — cast + deref just works.
Examples 0627 (usable) / 1186 (wrong-order diagnostic). Suite green (692).
- **Phase 2.1 — weld-plan layout math (REMOVED).** The byte-layout-override math;
superseded by the reflection+validation design and deleted.
- **Phase 1 polish — comptime-only enforcement.** A runtime call to a welded fn is
a clean build-gating error (`emitCall` gate, guarded by enclosing-`is_comptime`
so `#run`/`::` uses stay green), not a link failure. Example 1185. Build + suite