comptime-API: strip the byte-weld; pivot to a flat-memory comptime VM

The byte-weld (sx structs whose layout was validated to mirror the
compiler's Zig records) plus the serialization/marshaling bridge was the
wrong direction: it bolted a parallel layout regime and hand-built
byte-copies onto a comptime value model that fundamentally isn't bytes.

Strip the struct-weld machinery:
- compiler_lib.zig loses the type registry (weldStruct / bound_types /
  BoundType / FieldLayout / findType / SxField / LayoutMismatch /
  validateStructLayout); it is now just the intern/text_of function
  host-call bridge (kept as the Phase-3 compiler-call seed).
- nominal.zig loses validateWeldedStruct / weldedFieldOrderStr + the
  sd.abi == .zig validation call.
- Remove the struct-weld unit tests and examples 0625/0627 (welded
  structs) + 1183/1186 (weld-layout diagnostics).
- The #library / abi / extern syntax stays.

Record the new direction: a bytecode VM over flat, byte-addressable
memory so comptime values are native bytes (no weld/validation/marshal),
target-aware (preserves cross-compilation) and sandboxed. See
current/PLAN-COMPILER-VM.md (Phase 0 strip -> Phase 1 flat-memory value
model -> Phase 2 bytecode -> Phase 3 compiler-API on flat memory).
design/comptime-compiler-api.md gets a SUPERSEDED banner. Also drop the
"~500 lines / split the step" rule from CLAUDE.md.
agra
2026-06-17 19:29:36 +03:00
parent 40d075ca98
commit 18af8eb845
23 changed files with 505 additions and 498 deletions

@@ -7,21 +7,34 @@ Companion to the design-of-record
with ONE welded mechanism. Branch: `reify` (off `master`). Update after every step.
## ⏯ Resume (fresh session)
Phase 1 done; Phase 2 **welded structs are working** via a much simpler design than
the original byte-layout-override "GEP engine" (that plan — `computeWeldPlan`,
offset-ordered LLVM structs, byte-blobs — was explored and DROPPED). The locked
design: a welded `Name :: struct abi(.zig) extern compiler { … }` is a bodied
header declaring fields in the compiler type's MEMORY order; the compiler reflects
the bound Zig type (`@typeInfo` names + `@offsetOf` offsets + `@sizeOf`, nothing
maintained by hand) and VALIDATES the header matches, with loud diagnostics. On
pass it's an ordinary byte-identical struct — so `@ptrCast` to the compiler's own
type + deref just works; no index tables, no reorder, no special emit.
**Next:** Phase 2 continues — re-express `type_info`/`define` (struct) as sx over
welded `register_struct`/`find_type` (host-call bridge, Phase 2.5/2.6); see
**## Next step**. Read order: this file → `src/ir/compiler_lib.zig` (registry +
reflection) → `src/ir/lower/nominal.zig` `validateWeldedStruct`. Build/verify:
`zig build && zig build test`.
> **⚠ DIRECTION CHANGED (2026-06-17). The active plan is now
> [`PLAN-COMPILER-VM.md`](PLAN-COMPILER-VM.md), NOT the weld.**
> The **byte-weld + serialization/marshaling** approach is the wrong direction and is
> being **stripped**. New foundation: a **bytecode VM over flat, byte-addressable
> memory** so comptime values are native bytes; then the compiler-API rides on it with
> direct memory access (no weld, no validation, no marshaling). Everything below this
> banner describes the now-superseded weld state (committed on `reify` through
> `40d075c`) and is kept only to scope the Phase 0 strip. Read
> `PLAN-COMPILER-VM.md` first.
>
> **Why the pivot:** the comptime evaluator (`src/ir/interp.zig`) represents values as
> tagged `Value` unions, NOT native bytes — so a comptime `@ptrCast(*StructInfo)`
> reads the `Value` union's memory, not a struct. The weld tried to bridge that with
> hand-marshaling — exactly what the design set out to kill. Flat memory makes comptime
> values real bytes, so the bridge disappears. (JIT-native comptime was rejected: it
> breaks cross-compilation — host vs target layout — and loses the sandbox. A
> flat-memory VM keeps both while getting native bytes + speed.)
>
> **Next action:** execute Phase 0 of `PLAN-COMPILER-VM.md` (strip the weld machinery),
> then Phase 1 (flat-memory value model). Build/verify: `zig build && zig build test`.
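To make the "real bytes" point in the banner concrete, here is a minimal sketch of a flat, byte-addressable memory in which a struct value is nothing more than bytes at an address. It is an illustration only: `FlatMem`, its fixed 256-byte size, and the `alloc`/`writeWord`/`readWord` names are hypothetical stand-ins chosen for this example, not the actual `Machine` API or anything defined in `PLAN-COMPILER-VM.md`.

```zig
const std = @import("std");

// Hypothetical stand-in for a flat comptime memory. `FlatMem`, its fixed size,
// and the method names are illustrative assumptions, not the repo's real API.
const FlatMem = struct {
    bytes: [256]u8 = [_]u8{0} ** 256,
    top: usize = 8, // start past address 0 so that 0 can mean "null"

    // Bump-allocate `size` bytes at the requested alignment; returns an address.
    fn alloc(self: *FlatMem, size: usize, alignment: usize) usize {
        const addr = (self.top + alignment - 1) / alignment * alignment;
        std.debug.assert(addr + size <= self.bytes.len);
        self.top = addr + size;
        return addr;
    }

    // Store the low `width` bytes (1, 2, 4 or 8) of `value`, little-endian.
    fn writeWord(self: *FlatMem, addr: usize, width: usize, value: u64) void {
        var i: usize = 0;
        while (i < width) : (i += 1) {
            const shift = @as(u6, @intCast(8 * i));
            self.bytes[addr + i] = @as(u8, @truncate(value >> shift));
        }
    }

    // Load `width` bytes (1, 2, 4 or 8) starting at `addr`, little-endian.
    fn readWord(self: *const FlatMem, addr: usize, width: usize) u64 {
        var out: u64 = 0;
        var i: usize = 0;
        while (i < width) : (i += 1) {
            const shift = @as(u6, @intCast(8 * i));
            out |= @as(u64, self.bytes[addr + i]) << shift;
        }
        return out;
    }
};

pub fn main() void {
    var mem = FlatMem{};
    // A two-field struct { x: i64 @0, y: i64 @8 } is just 16 bytes at `p`;
    // a field access is base + offset, with no tagged union to unwrap.
    const p = mem.alloc(16, 8);
    mem.writeWord(p + 0, 8, 6);
    mem.writeWord(p + 8, 8, 7);
    const product = mem.readWord(p + 0, 8) * mem.readWord(p + 8, 8);
    std.debug.print("x * y = {d}\n", .{product}); // prints "x * y = 42"
}
```

The sketch fixes the byte order in the memory itself rather than inheriting it from the host. That is one way a flat-memory evaluator could stay independent of host layout, which is the cross-compilation concern the banner raises against JIT-native comptime.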
### (superseded) prior weld resume
Phase 1 done; Phase 2 welded structs were working via reflection + memory-order
validation (the `computeWeldPlan`/byte-blob "GEP engine" was explored + DROPPED even
earlier). A welded `Name :: struct abi(.zig) extern compiler { … }` declared fields in
the compiler type's MEMORY order; the compiler reflected the bound Zig type and
VALIDATED the header. **This whole mechanism is now being stripped — see the banner.**
> ⚠ Snapshot workflow: use `-Dname=examples/NNNN-foo.sx[,…] -Dupdate-goldens` to
> regenerate ONLY the named example(s) — a full `-Dupdate-goldens` re-runs all ~690
@@ -223,6 +236,12 @@ What landed:
`zig build` + `zig build test` green (450/450 unit + 685 corpus).
## Current state
> **Pivoted — see the banner + `PLAN-COMPILER-VM.md`.** The items below are the weld
> machinery as it stands on `reify` HEAD (`40d075c`); they are the **strip list** for
> Phase 0, not the forward direction. The `#library`/`abi`/`extern` *syntax* stays; the
> weld *semantics* (layout reflection/validation, marshaling dispatch) go.
- `compiler :: #library "compiler";` parses + is recognised as the comptime-only
internal surface (never dlopen'd).
- `abi(.zig) extern compiler` STRUCTS: layout-validated against the registry
@@ -238,9 +257,18 @@ What landed:
- **Deferred**: offset-override / LLVM byte-offset GEP for non-natural layouts
(needed by `StructInfo`'s slice field, Phase 2).
## Next step — Phase 2: welded compiler FUNCTIONS over the real types
## Next step — execute `PLAN-COMPILER-VM.md`
Welded structs are byte-identical mirrors now, so the API surface can grow:
> The weld is being stripped. The next step is **Phase 0 of
> [`PLAN-COMPILER-VM.md`](PLAN-COMPILER-VM.md)** — remove the weld / serialize /
> marshal machinery (`compiler_lib.zig` reflection+validation, `nominal.zig`
> `validateWeldedStruct`, the `compiler_welded` dispatch, the weld examples/diagnostics
> 0625/0627/1183/1184/1185/1186), keeping the `#library`/`abi`/`extern` *syntax*. Then
> Phase 1 (flat-memory value model). The weld-era "next step" below is **obsolete** —
> kept only as a record of what the weld surface was about to do.
### (obsolete) weld-era next step
Welded structs were byte-identical mirrors, so the API surface was set to grow:
- **Bind `register_struct` / `find_type`** over the host-call bridge
(`compiler_lib.zig` `bound_fns`, like `intern`/`text_of`). `register_struct`
@@ -270,6 +298,107 @@ when reached (sentinels or accessor fns; see the design doc Risks).
`List` growth; orthogonal, see `current/CHECKPOINT-METATYPE.md`.)
## Log
- **Phase 1.final start (VM plan) — wiring entry point `tryEval` (2026-06-17).**
`comptime_vm.tryEval(gpa, module, func_id) ?Value` runs a comptime function entirely on
the VM, returns a legacy `Value` (deep-copied to `gpa`) or `null` to fall back.
Unit-tested (pure 6*7 → 42; unbox_any → null). NOT yet routed into the host: needs
(1) panic→error hardening of `Machine` accessors so arbitrary funcs bail instead of
crashing, (2) implicit-ctx handling, (3) wiring at `emit_llvm` const-init behind
`SX_COMPTIME_FLAT`, (4) corpus parity run. See `PLAN-COMPILER-VM.md` Phase 1.final.
688 corpus green.
- **Phase 1 sub-step 1.5b (VM plan) — Reg↔Value boundary bridge (2026-06-17).**
Builtin/compiler_call/extern handlers are coupled to the legacy `Interpreter`, so the
wiring will use WHOLE-FUNCTION fallback (VM runs pure functions; bail → legacy re-runs
the whole eval). Built the boundary bridge that enables it: `valueToReg` (Value arg →
Reg, aggregates into flat memory) + `regToValue` (VM result → Value, deep-copied).
Covers scalars/strings/structs; other shapes bail. Transitional. Round-trip
unit-tested. 688 corpus green. Next: the wiring (flag + route a comptime entry through
the VM with legacy fallback).
- **Phase 1 sub-step 1.5 (VM plan) — direct `call` + stack-lifetime change (2026-06-17).**
`Vm` gained `module` (callee resolution) + `depth`/`max_depth` guard. `call` marshals
arg Refs → Reg and recursively runs the callee; aggregates pass as Addrs over shared
flat memory. `Frame` no longer reclaims the machine on exit (else a returned aggregate
Addr dangles) — allocations live to `Vm.deinit`. Extern/builtin callees bail (1.5b).
Unit-tested: direct call (142), recursion sum(0..n) (15/55). 688 corpus green. Next:
1.5b (call_builtin/compiler_call/extern), then hybrid wiring.
- **Phase 1 sub-step 4d (VM plan) — deref/addr_of; pivot decision (2026-06-17).**
Ported `addr_of` (pass-through) + `deref` (readField through pointer), unit-tested
(deref *i64 → 77, addr_of struct + field → 80). DECIDED to stop porting rarer ops
(tagged-union payload/any/closures) blind — their byte semantics are ambiguous without
real call sites — and pivot to CALLS (sub-step 1.5: `call`, then builtin/compiler) +
HYBRID WIRING (`-Dcomptime-flat` → VM with legacy fallback on `error.Unsupported`), so
the VM runs the real corpus and surfaces exactly what's needed. Key design point for
calls: aggregate-return lifetime → drop per-frame stack reclaim (let a comptime eval's
allocations live to `Vm.deinit`). 688 corpus green. See `PLAN-COMPILER-VM.md` decision
block.
- **Phase 1 sub-step 4c (VM plan) — optionals + payloadless enums (2026-06-17).**
`kindOf`: enum → word; `?T` → word (pointer-child, null==0) or `{T@0,i1@sizeof(T)}`
aggregate. Ported optional_wrap/unwrap/has_value/coalesce (`optChildIsPtr`/`optHas`;
const_null reads as none) + payloadless enum_init/enum_tag. Unit-tested (?i64 → 91,
?*i64 null==0 → 99, enum tag → 11). 688 corpus green. Next: 4d (tagged unions, any,
closures).
- **Phase 1 sub-step 4b (VM plan) — slices + strings on flat memory (2026-06-17).**
`{ptr@0(pointer_size), len@8(i64)}` fat pointers (kindOf: string/slice → aggregate).
Ported `const_string` (text+NUL + fat pointer in flat memory), `length`/`data_ptr`,
`array_to_slice`, `subslice`, index-through-slice (`elemAddr` loads `.ptr`), and
`str_eq`/`str_ne` (memcmp). Unit-tested (str length+eq/ne, array→slice index sum=23,
subslice sum=43). 688 corpus green. Next: 4c (optionals/enums/any/closures).
- **Phase 1 sub-step 4a (VM plan) — tuples + arrays on flat memory (2026-06-17).**
`kindOf` widened (tuple/array → aggregate). Ported `tuple_init`/`tuple_get`
(`tupleFieldOffset`), `index_get`/`index_gep` (`elemAddr` = base + idx*elem_size over
array/pointer/many_pointer; slice/string bases bail), `length` on array values.
Unit-tested (mixed tuple, [3]i64 index sum=42, length=3). 688 corpus green. Next:
sub-step 4b (slices/strings, then optionals/enums/any/closures).
- **Phase 1 sub-step 3 (VM plan) — memory + structs on flat memory (2026-06-17).**
`Vm` gained optional `table: *const TypeTable` (target-aware layout). Ported
`alloca`/`load`/`store` + `struct_init`/`struct_get`/`struct_gep`, laying structs out
at the table's natural offsets. Value model: scalar/pointer → register word;
struct → lives in flat memory, its value IS its address (read→addr, write→memcpy), so
nested structs compose and `struct_gep` = base+offset. `kindOf` bails loudly on
not-yet-ported types. Addr-based values survive allocator realloc. Unit-tested
(struct round-trip, alloca+gep+store+load, nested struct). 688 corpus green. Next:
sub-step 4 (arrays/slices/strings/optionals/enums/tuples/any/closures, then calls).
- **Phase 1 sub-step 2 (VM plan) — flat-memory executor: scalars + control flow
(2026-06-17).** Added `Vm` to `comptime_vm.zig`: walks the same IR `Inst` over
flat-memory frames (register `Reg` = scalar bits or `Addr`), mirroring the legacy
interp's scalar semantics (i64 wrapping/signed, f64). Ported constants, arithmetic,
comparison, logical, conversions, terminators (`br`/`cond_br`/`ret`/`ret_void`) and
`block_param`; every other op bails loudly (`error.Unsupported` + op name in
`detail`). Unit-tested on hand-built tiny IR (`Fb` builder): int add, f64 arithmetic,
cond_br selection, a block-param loop, div-by-zero + unsupported-op bails. Corpus
untouched (688 green). Next: sub-step 3 (memory + aggregates on flat memory, where
target-aware layout enters).
- **Phase 1 sub-step 1 (VM plan) — flat-memory machine substrate (2026-06-17).**
New `src/ir/comptime_vm.zig`: `Machine` (linear byte memory + bump/stack allocator
with `mark`/`reset`, scalar `readWord`/`writeWord` 1/2/4/8 LE, `bytes` views, addr 0
reserved as `null_addr`) + `Frame` (Ref-indexed register file, stack reclamation on
deinit). `Reg` = raw u64 (immediate scalar OR `Addr`). Unit-tested
(`comptime_vm.test.zig`), registered in the barrel; standalone — the legacy
interpreter stays live, corpus untouched (688 green). Next: sub-step 2 (executor +
scalar/branch ops over the same IR). Also removed the "~500 lines / split step" rule
from CLAUDE.md per request.
- **Phase 0 (VM plan) — struct-weld stripped; `intern`/`text_of` bridge kept
(2026-06-17).** Removed the struct-weld registry from `compiler_lib.zig`
(`weldStruct`/`bound_types`/`BoundType`/`FieldLayout`/`findType`/`SxField`/
`LayoutMismatch`/`validateStructLayout`), `validateWeldedStruct`/`weldedFieldOrderStr`
+ the `sd.abi == .zig` call from `nominal.zig`, the struct-weld unit tests, and
examples `0625`/`0627`/`1183`/`1186`. KEPT (decision) the `intern`/`text_of` function
host-call bridge — a clean scalar dispatch, not weld/serialize/marshal, the Phase-3
compiler-call seed — so `weldedCompilerFn`, the `compiler_welded` dispatch, the
`emitCall` comptime-only gate, the `#library`/`abi`/`extern` syntax, and examples
`0626`/`1184`/`1185` remain. `zig build test` green (688 corpus, 0 failed). Next:
Phase 1 (flat-memory value model) per `PLAN-COMPILER-VM.md`.
- **DIRECTION CHANGE — pivot off the byte-weld to a flat-memory bytecode VM
(2026-06-17).** Decided the weld + serialization/marshaling bridge is the wrong
direction (it hand-marshals onto a comptime value model that isn't bytes — exactly
what the design set out to kill). New foundation: a bytecode VM over flat memory so
comptime values are native bytes; the compiler-API then rides on it via direct memory
(no weld/validation/marshaling). JIT-native comptime was weighed and rejected (breaks
cross-compilation, loses the sandbox). Wrote `current/PLAN-COMPILER-VM.md` (Phase 0
strip → Phase 1 flat-memory value model → Phase 2 bytecode → Phase 3 compiler-API on
flat memory). Banner added to `design/comptime-compiler-api.md` (superseded). Reverted
the session's uncommitted `register_struct`/`find_type` marshaling experiment back to
`reify` HEAD (40d075c). No code stripped yet — Phase 0 is the next action.
- **Phase 2 — welded structs by reflection + memory-order validation.** Dropped
the byte-layout-override engine (computeWeldPlan / offset-ordered LLVM struct /
byte-blob — all explored, all unnecessary). Instead: the sx header declares