The byte-weld (sx structs whose layout was validated to mirror the compiler's Zig records) plus the serialization/marshaling bridge was the wrong direction: it bolted a parallel layout regime and hand-built byte-copies onto a comptime value model that fundamentally isn't bytes.

Strip the struct-weld machinery:

- compiler_lib.zig loses the type registry (weldStruct / bound_types / BoundType / FieldLayout / findType / SxField / LayoutMismatch / validateStructLayout); it is now just the intern/text_of function host-call bridge (kept as the Phase-3 compiler-call seed).
- nominal.zig loses validateWeldedStruct / weldedFieldOrderStr + the sd.abi == .zig validation call.
- Remove the struct-weld unit tests and examples 0625/0627 (welded structs) + 1183/1186 (weld-layout diagnostics).
- The #library / abi / extern syntax stays.

Record the new direction: a bytecode VM over flat, byte-addressable memory so comptime values are native bytes (no weld/validation/marshal), target-aware (preserves cross-compilation) and sandboxed. See current/PLAN-COMPILER-VM.md (Phase 0 strip -> Phase 1 flat-memory value model -> Phase 2 bytecode -> Phase 3 compiler-API on flat memory). design/comptime-compiler-api.md gets a SUPERSEDED banner.

Also drop the "~500 lines / split the step" rule from CLAUDE.md.
PLAN — Comptime Bytecode VM + flat memory (then re-home the compiler-API on it)
Direction change (2026-06-17). The comptime compiler-API stream pivots off the byte-weld. The weld (sx structs whose layout is validated to mirror the compiler's Zig types) + the serialization / marshaling bridge at the call boundary is the wrong direction — it bolts a parallel layout regime and hand-built byte-copies onto a comptime value model that fundamentally isn't bytes. We strip it and build the right foundation: a bytecode VM over flat, byte-addressable memory, where comptime values ARE native bytes (like runtime). On that base the compiler-API needs no weld, no validation, no marshaling — the compiler's own types are read/built directly as memory and its functions take/return real pointers.
Supersedes the build order in `design/comptime-compiler-api.md` (kept for history). This is the active plan for the stream. Branch: `reify`.
Why
src/ir/interp.zig is a tree-walking interpreter over the SSA IR that represents
every value as a tagged Value union (int, float, aggregate: []const Value,
type_tag, heap_ptr, …). Two consequences:
- Slow. Per-value boxing in a tagged union; per-op `switch` over `Inst`; an aggregate is a heap `[]const Value`, walked element-by-element.
- Not native memory. A struct value is `[]const Value` (tagged unions), NOT the struct's bytes. So a comptime `@ptrCast(*StructInfo)` reads the `Value` union's memory, not a `StructInfo`, which forced the whole weld+marshal detour.
Make comptime values native bytes in a flat memory and both problems dissolve: structs/arrays/slices are their bytes at natural layout (no weld), the compiler's own records are directly addressable (no marshal), and a bytecode loop over flat memory is fast.
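The contrast between the two value models can be sketched in a few lines of Python. This is an illustrative model only, not the Zig code in `src/ir/interp.zig`: the record `{x: i32, y: i64}`, the names `tagged_point` and `native_point`, and the two reader functions are all hypothetical.

```python
import struct

# Hypothetical record {x: i32, y: i64}, used only for illustration.
# Tagged model: every field is a boxed (tag, payload) pair held in a list.
tagged_point = [("int", 7), ("int", 9)]

# Native model: the same record as bytes at natural layout (little-endian):
# i32 at offset 0, 4 bytes of padding, i64 at offset 8.
native_point = struct.pack("<i4xq", 7, 9)


def read_y_tagged(value):
    # Walks the boxed elements and checks the tag before using the payload.
    tag, payload = value[1]
    assert tag == "int"
    return payload


def read_y_native(buf):
    # A pointer cast amounts to "read 8 bytes at offset 8".
    return struct.unpack_from("<q", buf, 8)[0]
```

In the tagged form there is no address at which the record's bytes exist, so host code expecting a pointer to the record has nothing to read; in the native form the 16 bytes are the record.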
End state
- Comptime execution = a bytecode VM over a flat linear memory (real host-allocated bytes; layout is target-aware via the type table's sizes). Values are bytes at addresses plus a scalar register file. No tagged `Value` union.
- The comptime compiler-API: the compiler exposes its real types + functions to comptime sx. sx reads/builds them as native memory and calls compiler functions by pointer. The `abi(.zig)` weld, `validateStructLayout`, and `register_struct` field-by-field marshaling are all gone.
- `declare`/`define`/`type_info` and `#compiler`/`BuildOptions` ride this one mechanism; the bespoke interp arms are deleted.
Principles (hold at every step)
- Green at every step. `zig build && zig build test` pass after each sub-step. The existing tagged-`Value` interpreter stays the live evaluator until the VM reaches corpus parity; swap behind a build flag, then delete the old path.
- Target-aware, not host-baked. Flat-memory layout uses the type table's target sizes (`pointer_size`, `typeSizeBytes`/offsets), NEVER host `@sizeOf`. This is what keeps cross-compilation correct (the JIT-comptime alternative could not).
- Sandboxed. Flat-memory accesses are bounds-checked; step/call-depth budgets remain; an OOB / bad access traps to a build-gating diagnostic with a source span, never a compiler-process crash.
- No silent fallbacks (per CLAUDE.md): an unhandled op / shape bails loudly with a named reason, never a zero/default that looks like success.
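The "target-aware, not host-baked" principle can be illustrated with a small layout calculation. The sketch below is a Python model under stated assumptions: `layout`, `align_up`, and `FIELDS` are hypothetical names, and the rule that a scalar's alignment equals its size is a simplification that real target ABIs do not always follow.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields, pointer_size):
    """Compute field offsets for a struct on a target with the given pointer size.

    fields is a list of (name, kind) with kind in {'u8', 'i32', 'i64', 'ptr'}.
    Returns ({name: offset}, total_size).
    """
    sizes = {"u8": 1, "i32": 4, "i64": 8, "ptr": pointer_size}
    offsets, offset, max_align = {}, 0, 1
    for name, kind in fields:
        size = sizes[kind]
        offset = align_up(offset, size)  # assumption: alignment == size
        offsets[name] = offset
        offset += size
        max_align = max(max_align, size)
    return offsets, align_up(offset, max_align)


# Hypothetical record with two pointers around a one-byte flag.
FIELDS = [("name", "ptr"), ("flag", "u8"), ("next", "ptr")]
```

With `pointer_size=8` the record is 24 bytes with `next` at offset 16; with `pointer_size=4` it is 12 bytes with `next` at offset 8. Baking in the host's sizes would silently produce the wrong one of these when cross-compiling.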
Phases
Phase 0 — Strip the weld / serialize / marshal machinery
Delete the wrong-direction code so the VM builds on a clean base. Pure removal + corpus rebaseline; suite green.
- `src/ir/compiler_lib.zig`: the reflection (`weldStruct`/`bound_types`/`FieldLayout`/`BoundType`), the layout validation (`validateStructLayout`/`LayoutMismatch`/`SxField`). Decide the fate of the `bound_fns` host-call registry (`intern`/`text_of` handlers): it is likely subsumed by the VM's compiler-call path in Phase 3, but `intern`/`text_of` may survive as the first such calls.
- `src/ir/lower/nominal.zig`: `validateWeldedStruct` + `weldedFieldOrderStr` + the `sd.abi == .zig` validation call in `registerStructDecl`.
- `src/ir/interp.zig`: the `compiler_welded` dispatch branch.
- `src/backend/llvm/ops.zig`: the `emitCall` comptime-only gate keyed on `compiler_welded` (re-derive the comptime-only guard from a non-weld signal if still needed).
- Corpus: retire / convert the weld examples + diagnostics: `0625`, `0627` (welded struct), `1183`, `1186` (weld-layout diagnostics), `1184`/`1185` (welded-fn). Keep `0626` (`intern`/`text_of` round-trip) only if it survives the new call path.
- Keep (re-evaluate in Phase 3), independent of the weld semantics: the `#library "compiler"` decl, the `abi(.x)` annotation + `extern <lib>` syntax, and the `callconv → abi` unification. These are surface syntax that may still serve the compiler-API; only the weld semantics are stripped here.
Verification: zig build test green with the weld machinery gone; the surviving
syntax still parses (parser unit tests).
Phase 1 — Flat-memory value model (still IR-walking, no bytecode yet)
Introduce flat memory and move comptime values onto it, decoupled from bytecode so
the value-model change is isolated. Each sub-step ports one op group and keeps the
corpus green; the OLD tagged path stays behind a build flag (-Dcomptime-flat) until
all groups land, then the shim is deleted.
- Machine + scalars. A flat memory region (host `[]u8`) with a stack (frames) + bump-allocated heap, and a scalar register file. Port `int`/`float`/`bool`/`undef` and arithmetic/compare/branch. Aggregates still go through a compat shim to the old representation.
- Aggregates. Structs/arrays/tuples laid out in flat memory at target layout; port `struct_init`/`struct_get`/`array`/`index_gep` to read/write bytes at computed offsets.
- Slices / strings. `{ptr, len}` fat pointers in flat memory.
- Optionals / enums / tagged unions. Tag + payload bytes.
- Pointers. `alloca`/`store`/`load`/ GEP unified onto flat addresses; retire `slot_ptr`/`heap_ptr`/`byte_ptr` in favor of flat-memory addresses.
- Closures. Fn id + captured env materialized in flat memory.
- Extern / host calls. A struct arg is already bytes → pass its address; this removes most of `marshalExternArg`.
- Reflection / minting. `declare`/`define`/`type_info` read flat-memory values; type-table mutation copies escaping data into compiler-owned memory at the boundary (lifetime), as today.
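The first three sub-steps in the list above (machine + scalars, aggregates, slices) can be modelled compactly. The following Python sketch is not the Zig `Machine`; the class, its method names, the 4096-byte size, and the fixed 8-byte pointer width are assumptions made for illustration.

```python
class OutOfBounds(Exception):
    """Raised instead of crashing when an access leaves allocated memory."""


class Machine:
    NULL_ADDR = 0

    def __init__(self, size=4096):
        self.mem = bytearray(size)
        self.top = 8  # address 0 is reserved as null, so allocation starts later

    def alloc(self, size, align=8):
        # Bump allocation: round up to the alignment, then advance the top.
        addr = (self.top + align - 1) // align * align
        if addr + size > len(self.mem):
            raise OutOfBounds("allocation exceeds flat memory")
        self.top = addr + size
        return addr

    def _check(self, addr, width):
        if addr == self.NULL_ADDR or addr < 0 or addr + width > self.top:
            raise OutOfBounds(f"bad access at {addr} width {width}")

    def write_word(self, addr, width, value):
        self._check(addr, width)
        wrapped = value % (1 << (8 * width))  # wrap to the word width
        self.mem[addr:addr + width] = wrapped.to_bytes(width, "little")

    def read_word(self, addr, width):
        self._check(addr, width)
        return int.from_bytes(self.mem[addr:addr + width], "little")


def demo():
    m = Machine()
    # Struct {a: i32 @0, b: i64 @8}: fields are bytes at computed offsets.
    s = m.alloc(16)
    m.write_word(s + 0, 4, 5)
    m.write_word(s + 8, 8, 37)
    # Array of three i64 plus a {ptr@0, len@8} fat pointer describing it.
    arr = m.alloc(24)
    for i, v in enumerate((10, 20, 12)):
        m.write_word(arr + 8 * i, 8, v)
    sl = m.alloc(16)
    m.write_word(sl, 8, arr)
    m.write_word(sl + 8, 8, 3)
    ptr, length = m.read_word(sl, 8), m.read_word(sl + 8, 8)
    total = sum(m.read_word(ptr + 8 * i, 8) for i in range(length))
    return m.read_word(s, 4) + m.read_word(s + 8, 8), total
```

Because a struct's value is just its address, passing it to a host call needs no marshaling step in this model: the callee receives `s` and reads the same bytes.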
Verification: with -Dcomptime-flat the full corpus (currently 692) is byte-for-byte
identical to the tagged path; then make flat the default and delete the shim.
Phase 2 — Bytecode
Compile a comptime function's IR → a compact bytecode and execute the bytecode instead
of walking Inst. Pure encoding/speed; semantics identical to Phase 1. Land at least a
minimal register-bytecode loop (the stream's stated goal is a bytecode VM); a
fragment cache is optional follow-up.
Verification: corpus identical to Phase 1; comptime throughput measurably improved on a heavy-comptime micro-benchmark.
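As a rough picture of what "a minimal register-bytecode loop" could look like, here is a Python sketch. The plan leaves the ISA as an open question, so the opcodes (`CONST`, `ADD`, `SUB`, `JNZ`, `RET`), the fixed four-int instruction encoding, and the function names are all hypothetical.

```python
# Hypothetical opcodes; the real bytecode shape is undecided in the plan.
CONST, ADD, SUB, JNZ, RET = range(5)


def assemble(ir):
    """Flatten (op, a, b, c) tuples into one compact list of ints."""
    code = []
    for op, a, b, c in ir:
        code += [op, a, b, c]
    return code


def run(code, num_regs, max_steps=10_000):
    regs = [0] * num_regs
    pc = 0
    for _ in range(max_steps):  # step budget: the sandbox stays in force
        op, a, b, c = code[pc:pc + 4]
        pc += 4
        if op == CONST:
            regs[a] = b
        elif op == ADD:
            regs[a] = regs[b] + regs[c]
        elif op == SUB:
            regs[a] = regs[b] - regs[c]
        elif op == JNZ:  # jump to instruction index b if regs[a] != 0
            if regs[a] != 0:
                pc = b * 4
        elif op == RET:
            return regs[a]
        else:
            raise ValueError(f"unsupported opcode {op}")  # bail loudly
    raise RuntimeError("step budget exhausted")


def sum_program(n):
    """Bytecode computing n + (n-1) + ... + 1, for n >= 1."""
    return assemble([
        (CONST, 0, n, 0),  # 0: r0 = n (counter)
        (CONST, 1, 0, 0),  # 1: r1 = 0 (accumulator)
        (CONST, 2, 1, 0),  # 2: r2 = 1
        (ADD, 1, 1, 0),    # 3: r1 += r0
        (SUB, 0, 0, 2),    # 4: r0 -= 1
        (JNZ, 0, 3, 0),    # 5: loop back to 3 while r0 != 0
        (RET, 1, 0, 0),    # 6: return r1
    ])
```

The dispatch loop reads a dense int list rather than walking instruction objects, which is the encoding change Phase 2 describes; the semantics of each op stay the same as in the IR-walking executor.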
Phase 1.final — host wiring (the remaining integration)
The wiring ENTRY POINT exists: comptime_vm.tryEval(gpa, module, func_id) ?Value runs a
comptime function entirely on the VM and returns a legacy Value, or null to fall
back. Unit-tested (pure 6*7 → 42; unsupported → null). Remaining to actually route the
host through it:
- Panic→error hardening (prerequisite). `Machine.readWord`/`writeWord`/`bytes` currently `assert` (debug panic) on null/OOB. For arbitrary host functions to be safe, make them return `error.OutOfBounds` so a malformed run BAILS (→ null → legacy) instead of crashing the compiler. Ripples through `readField`/`writeField`/slice helpers (add `try`).
- Implicit context. Host comptime functions may have `has_implicit_ctx` (param 0 = `*Context`); the legacy `run` materializes a default ctx. The VM `run` does not, so either materialize it too, or only route `tryEval` at funcs without implicit ctx.
- Wire one site behind a flag/env (`SX_COMPTIME_FLAT`, → `-Dcomptime-flat` later): the const-init fold in `emit_llvm.zig` `emitGlobals` (`result = tryEval(...) orelse interp.call(...)`). Default off → corpus unaffected.
- Parity + coverage. Run the corpus with the flag ON; results must be byte-identical to legacy. Measure how many comptime evals the VM already handles; the bail details name what to port next (tagged-union payload / any / closures / builtins).
- Grow coverage (port the deferred ops + `call_builtin`/`compiler_call` via the bridge) until the VM is the default and the legacy path is deleted.
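The wiring steps above combine into a "try the VM, otherwise fall back" shape. The Python sketch below models that control flow only; `vm_run`, `legacy_call`, `fold_const_init`, the dictionary representation of a function, and the convention that the variable is enabled by the value `"1"` are assumptions, not the compiler's actual code.

```python
import os


class OutOfBounds(Exception):
    """Models error.OutOfBounds from a hardened memory accessor."""


class Unsupported(Exception):
    """Models a loud bail on an op the VM has not ported yet."""


def vm_run(func):
    # Stand-in VM that only handles functions marked 'pure'.
    kind = func["kind"]
    if kind == "pure":
        return func["body"]()
    if kind == "oob":
        raise OutOfBounds("read past end of flat memory")
    raise Unsupported(kind)


def try_eval(func):
    """Return the VM result, or None to request the legacy fallback."""
    try:
        return vm_run(func)
    except (OutOfBounds, Unsupported):
        return None


def legacy_call(func):
    # Stand-in for the existing tagged-Value interpreter.
    return func["body"]()


def fold_const_init(func, env=os.environ):
    """Route through the VM only when the flag is set (assumed value: "1")."""
    if env.get("SX_COMPTIME_FLAT") == "1":
        result = try_eval(func)
        if result is not None:
            return result, "vm"
    return legacy_call(func), "legacy"
```

Counting how often the second tuple element is `"vm"` versus `"legacy"` over a corpus run is one way to obtain the coverage measurement the parity step asks for.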
Phase 3 — Compiler-API on flat memory (resume the stream — no weld)
With native-byte comptime values, re-home the compiler-API:
- Expose the compiler's real types. Register the actual `types.zig` records (`StructInfo`, `EnumInfo`, `Field`, …) into the comptime type table under sx-visible names, with their real (host) layout. The type IS the compiler's, so there is nothing to validate or keep in sync. (This is the projection that replaces the weld's reflection; it is owned by the compiler, not declared in sx.)
- Expose the compiler's functions. `register_struct`, `find_type`, `intern`, `text_of`, and the reflection readers operate on flat-memory pointers / handles directly (no marshaling, since the bytes already ARE the record).
- Re-express `declare`/`define`/`type_info` as sx over these; delete the bespoke interp arms (`defineStruct`/`defineEnum`/`defineTuple`/`reflectTypeInfo`); migrate `examples/0622` (struct), `0619`/`0620`/`0623` (enum/tuple).
- Migrate `BuildOptions` off `#compiler` onto this mechanism; delete `#compiler`.
Verification: the metatype + #compiler surfaces are gone, re-expressed as sx over
the exposed compiler-API; full corpus green.
Open questions (resolve as reached, record decisions here)
- Host-ABI vs target-ABI split. The compiler runs on the host, so its OWN exposed records are host-laid-out; user comptime types are target-laid-out. The flat-memory model must carry both regimes (a per-type ABI tag on layout queries). Confirm the boundary where a flat-memory pointer to a compiler record is handed to host Zig code uses host layout.
- Exposing compiler types to sx. Mechanism for projecting `types.zig` records into the comptime type table with real offsets (the non-weld replacement): a registry the compiler owns, keyed by sx-visible name → real Zig type's layout + a host-call ABI.
- Bytecode shape. IR-derived vs a fresh ISA; register vs stack; fragment caching.
- Pointer escape / lifetime. Flat-memory pointers stored into the persistent type table must be copied into compiler-owned memory at the boundary (as today).
- Old-path retirement. Keep the tagged interpreter until Phase 1 parity, then delete; confirm no non-comptime consumer depends on `Value`.
File map (current → touched)
| Area | File | Phase |
|---|---|---|
| Comptime evaluator | `src/ir/interp.zig` | 0 (strip weld dispatch), 1–2 (rebuild) |
| Weld registry | `src/ir/compiler_lib.zig` | 0 (strip), 3 (replace with type/fn exposure) |
| Weld validation | `src/ir/lower/nominal.zig` | 0 (strip `validateWeldedStruct`) |
| Comptime-only gate | `src/backend/llvm/ops.zig` | 0 (re-derive without weld signal) |
| Host-FFI marshalling | `src/ir/host_ffi.zig` | 1 (struct-by-pointer trims it) |
| Metatype arms | `src/ir/interp.zig` (`defineStruct`/…/`reflectTypeInfo`) | 3 (delete, re-express in sx) |
| `#compiler` / `BuildOptions` | `library/modules/build.sx`, `src/ir/compiler_hooks.zig` | 3 (migrate, delete `#compiler`) |
| Surface syntax | `src/parser.zig`, `src/ast.zig` (`abi`/`extern`/`#library`) | kept; revisited Phase 3 |
Status
- Phase 0: DONE (2026-06-17). The struct-weld machinery is stripped: `compiler_lib.zig` lost the type registry (`weldStruct`/`bound_types`/`BoundType`/`FieldLayout`/`findType`/`SxField`/`LayoutMismatch`/`validateStructLayout`); `nominal.zig` lost `validateWeldedStruct`/`weldedFieldOrderStr` + the `sd.abi == .zig` call; the struct-weld unit tests + examples `0625`/`0627`/`1183`/`1186` are removed. Decision (recorded): the `intern`/`text_of` function host-call bridge is KEPT. It is a clean scalar dispatch (string→handle), not weld/serialize/marshal, and is the seed Phase 3 grows the compiler-call path from. So the `compiler_welded` dispatch (`interp.callExtern` is unchanged at HEAD; the pre-branch in `call()`), `weldedCompilerFn` (`decl.zig`), the `emitCall` comptime-only gate (`ops.zig`), and examples `0626`/`1184`/`1185` stay. The `#library`/`abi`/`extern` SYNTAX stays. `zig build test` green (688 corpus, 0 failed; unit tests pass).
- Phase 1: in progress.
  - Sub-step 1: DONE. `src/ir/comptime_vm.zig`: the flat-memory `Machine` (linear byte memory + bump/stack allocator with `mark`/`reset` reclamation + scalar `readWord`/`writeWord` (1/2/4/8, little-endian) + `bytes` views; addr 0 reserved as `null_addr`) and `Frame` (register file indexed by Ref + stack reclamation on `deinit`). A register `Reg` is a raw u64: an immediate scalar OR an `Addr`. Standalone + unit-tested (`comptime_vm.test.zig`, in the barrel); does NOT touch the live interpreter, so the corpus stays green (688). No op execution yet.
  - Sub-step 2: DONE. The executor (`Vm` in `comptime_vm.zig`): walks the SAME IR `Inst` over flat-memory frames, mirroring the legacy interp's scalar semantics (i64 wrapping/signed + f64 register words, keyed off the result/operand `TypeId`). Ported: constants (`const_int`/`float`/`bool`/`null`/`undef`), arithmetic (`add`/`sub`/`mul`/`div`/`mod`/`neg`), comparison (`cmp_*`), logical (`bool_and`/`or`/`not`), conversions (`widen`/`narrow`/`bitcast` passthrough, `int_to_float`/`float_to_int`), terminators (`br`/`cond_br`/`ret`/`ret_void`) and `block_param` (branch args passed as Refs; the same frame persists, SSA-safe). Any other op bails loudly (`error.Unsupported` + `detail = @tagName(op)`). Unit-tested on hand-built IR (`Fb` builder): integer add, f64 arithmetic, cond_br branch selection, a block-param loop summing i..1, div-by-zero + unsupported-op bails. Corpus untouched (688 green): the executor is exercised by unit tests only, not yet wired to real comptime eval.
  - Sub-step 3: DONE. Memory + structs on flat memory. `Vm` gained an optional `table: *const TypeTable` (target-aware layout). Ported `alloca`/`load`/`store` (over flat addresses, `Store.val_ty` drives width) and `struct_init`/`struct_get`/`struct_gep` (structs laid out at the table's natural offsets). The value model: a `Kind.word` (scalar/pointer ≤8B) sits in a register; a `Kind.aggregate` (struct) lives in flat memory and its "value" IS its address (read returns the address, write memcpys), so nested structs compose and `struct_gep` is just base+offset (no field-pointer dance). `kindOf` bails loudly on the not-yet-ported types (slice/string/any/optional/enum/array/tuple/…). The Addr-based value model survives allocator realloc (offsets are stable; slices are only materialized transiently). Unit-tested: struct_init+get round-trip, alloca+gep+store+load, nested-struct aggregate copy + nested read. Corpus untouched (688 green).
  - Sub-step 4a: DONE. Tuples + arrays. `kindOf` widened (`tuple`/`array` → aggregate). Ported `tuple_init`/`tuple_get` (positional, `tupleFieldOffset`), `index_get`/`index_gep` (`elemAddr` = base + idx*elem_size over array/pointer/many_pointer bases; slice/string bases bail), and `length` on an array value (static `ArrayInfo.length`). Unit-tested: mixed tuple round-trip, `[3]i64` gep/store + index_get sum (42), array `length` (3). 688 corpus green.
  - Sub-step 4b: DONE. Slices + strings as `{ptr@0 (pointer_size), len@8 (i64)}` fat pointers (`kindOf`: string/slice → aggregate). Ported `const_string` (materializes text+NUL in flat memory + a fat pointer), `length`/`data_ptr` (read len/ptr fields), `array_to_slice`, `subslice`, indexing through a slice/string (`elemAddr` loads `.ptr` first), and `str_eq`/`str_ne` (len+memcmp). Helpers `makeSlice`/`sliceLen`/`sliceData`. Unit-tested: string length + str_eq/ne, array→slice + slice index + slice length (23), array subslice (43). 688 corpus green.
  - Sub-step 4c: DONE (optionals + payloadless enums). `kindOf`: `enum` → word; `?T` → word if pointer-child (null==0) else `{T@0, i1@sizeof(T)}` aggregate. Ported `optional_wrap`/`unwrap`/`has_value`/`coalesce` (with `optChildIsPtr`/`optHas` helpers; `const_null` → `null_addr` reads as none), `enum_init` (payloadless: tag is the value), `enum_tag` (payloadless/word). Unit-tested: non-pointer `?i64` wrap/unwrap/coalesce (91), pointer `?*i64` null==0 (99), payloadless enum tag (11). 688 corpus green.
  - Sub-step 4d: partial (`addr_of`/`deref` DONE). `addr_of` passes through (an aggregate value already IS its address; a pointer is already an address; this mirrors the legacy); `deref` = `readField` through the pointer (`ins.ty` is the pointee). Unit-tested (deref a `*i64` → 77; addr_of a struct value + field read → 80). Deferred to the wiring phase (intentionally, not ported blind): tagged-union payload (`enum_init` w/ payload, `enum_payload`; the legacy stores untyped Values and `field_index` indexes payload sub-fields, not variants, so a byte model's payload type is ambiguous without a real call site), `any` boxing, closures, and the bitwise ops. These have subtleties best resolved against actual corpus cases: the VM's loud `error.Unsupported` + `detail` will name exactly what each real eval needs.
  - Sub-step 1.5: direct `call` DONE. `Vm` gained `module: *const Module` (resolves a callee `FuncId`) + a `depth`/`max_depth` recursion guard. `call` marshals arg Refs → Reg words and recursively `run`s the callee; aggregate args/results pass as their `Addr` over the SHARED flat memory (no copy). Stack-lifetime change: `Frame` no longer reclaims the machine on exit (a returned aggregate's Addr would dangle); a comptime eval's allocations live to `Vm.deinit`; `Machine.mark`/`reset` stay for explicit use. Extern/builtin callees (no blocks) bail loudly (1.5b). Unit-tested: direct call (`add(20,22)+100` → 142) and recursion (`sum(0..n)` → 15/55). 688 corpus green.
  - Sub-step 1.5b: `Reg`↔`Value` boundary bridge DONE. The builtin/`compiler_call`/extern handlers are all coupled to the legacy `Interpreter` (e.g. `compiler_lib` handlers take `*Interpreter`), so the VM can't call them directly. The wiring uses WHOLE-FUNCTION fallback instead (VM runs pure functions; a bail re-runs the whole eval in the legacy). That needs the boundary bridge: `valueToReg` (host `Value` arg → VM `Reg`, materializing aggregates into flat memory) + `regToValue` (VM result → `Value`, deep-copied out). Covers scalars + strings + structs (other aggregate shapes bail loudly; added as wiring surfaces them). Transitional: deleted once the VM owns comptime end-to-end. Unit-tested with round-trips. 688 corpus green.
  - Then the wiring step (below), now unblocked.
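The two optional encodings recorded under sub-step 4c can be pictured with a short Python model. It is a sketch under assumptions: the helper names are hypothetical, pointers are taken to be 8 bytes, the presence flag is stored as one byte, and the non-pointer optional is padded to 16 bytes; the real sizes come from the type table.

```python
import struct

NULL_ADDR = 0


def wrap_ptr_optional(addr):
    # ?*T occupies one word; the null address doubles as "none".
    return struct.pack("<Q", addr)


def has_value_ptr(buf):
    return struct.unpack_from("<Q", buf, 0)[0] != NULL_ADDR


def wrap_i64_optional(value):
    # ?i64 as {T@0, flag@sizeof(T)}: payload at offset 0, flag byte at offset 8,
    # followed by 7 padding bytes (assumed) to keep 8-byte alignment.
    if value is None:
        return struct.pack("<qB7x", 0, 0)
    return struct.pack("<qB7x", value, 1)


def coalesce_i64(buf, default):
    # Read payload and flag; fall back to the default when the flag is clear.
    payload, flag = struct.unpack_from("<qB", buf, 0)
    return payload if flag else default
```

The pointer form needs no extra storage because address 0 is reserved and can never be a valid pointee, whereas every bit pattern of an `i64` is a legal value, so a separate flag is required.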
Decision (2026-06-17): pivot from blind op-porting to CALLS + hybrid wiring
The common leaf ops are ported (scalars, control flow, structs, tuples, arrays, slices, strings, optionals, payloadless enums, deref/addr_of) and unit-tested. Continuing to port the rarer ops (tagged-union payload, any, closures) in isolation risks subtle bugs and has low signal. The higher-value path:
- Calls (sub-step 1.5): `call` (direct), then `call_builtin`/`compiler_call`. The shared flat memory makes aggregate args/results pass naturally (they're Addrs). The one design point is aggregate-return lifetime: a callee's stack-reclaim would dangle a returned struct Addr, so for comptime (bounded) the VM should stop reclaiming per-frame and let the whole eval's allocations live until `Vm.deinit` (keep `Machine.mark`/`reset` for explicit use; drop it from `Frame.deinit`).
- Hybrid wiring: `-Dcomptime-flat` routes a comptime eval through the VM, falling back to the legacy interp on `error.Unsupported`. This makes the VM run the REAL corpus, proving parity incrementally and surfacing exactly which ops each real eval needs, which is far better signal than more isolated unit tests.
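The aggregate-return lifetime problem mentioned under "Calls" is easy to reproduce in a toy model. The Python sketch below assumes a bump allocator with `mark`/`reset`; the class, the `make_pair` callee, and the `reclaim_frame` switch are hypothetical and exist only to show the hazard.

```python
class Machine:
    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.top = 8  # address 0 stays reserved as null

    def alloc(self, size):
        addr = self.top
        self.top += size
        return addr

    def mark(self):
        return self.top

    def reset(self, mark):
        self.top = mark

    def write_i64(self, addr, value):
        self.mem[addr:addr + 8] = value.to_bytes(8, "little", signed=True)

    def read_i64(self, addr):
        return int.from_bytes(self.mem[addr:addr + 8], "little", signed=True)


def make_pair(m, a, b):
    # Callee builds a two-field struct in flat memory and returns its address.
    addr = m.alloc(16)
    m.write_i64(addr, a)
    m.write_i64(addr + 8, b)
    return addr


def call(m, reclaim_frame, a, b):
    mark = m.mark()
    result = make_pair(m, a, b)
    if reclaim_frame:
        m.reset(mark)  # per-frame reclaim: result now points at free space
    return result


def demo(reclaim_frame):
    m = Machine()
    first = call(m, reclaim_frame, 20, 22)
    call(m, reclaim_frame, -1, -1)  # a later call may reuse reclaimed space
    return m.read_i64(first) + m.read_i64(first + 8)
```

With per-frame reclaim the second call overwrites the first result, so the sum reads back as -2 instead of 42; letting allocations live for the whole eval avoids this at the cost of holding memory until teardown.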