sx/current/PLAN-COMPILER-VM.md
agra ba28488d99 P5.5: migrate the 35 BuildOptions accessors off #compiler to VM-native abi(.compiler)
`BuildOptions :: struct #compiler { ...35 methods... }` becomes
`BuildOptions :: struct { }` (an opaque null-sentinel handle) plus 35 free
`ufcs (self: BuildOptions, …) abi(.compiler)` decls in build.sx, each serviced
by a new `comptime_vm.callBuildOptionFn` arm (off `callCompilerFn`). No legacy
`compiler_lib` handler: the names are registered in `bound_fns` with a single
bailing stub only so `weldedCompilerFn` accepts them.

- String lifetime: setters dupe the arg into the persistent `Vm.gpa` (the
  Compilation allocator, threaded into both `tryEval` and `runBuildCallback` —
  not the per-eval VM arena) and write/append to the threaded `BuildConfig`.
  Getters read the field/slice or compute the target predicate from the triple.
- Dispatch routing (Option B): a `#run`/const-init entry that directly calls a
  compiler-domain/welded fn (`emit_llvm.entryNeedsVm`) runs on the VM with no
  legacy fallback regardless of the `-Dcomptime-flat` gate, so gate-OFF stays
  green without a legacy BuildOptions handler (P5.7 retires the legacy interp).
- Mark the 5 `platform/bundle.sx` getter-calling helpers `abi(.compiler)` (they
  are comptime-only bundler code; otherwise their now-welded getter calls trip
  the runtime-call gate).
- 37 `.ir` snapshots regenerated (std transitively imports build.sx → string-
  pool/type-table indices shift); verified `.ir`-only, zero behavior-stream diffs.

BuildOptions `compiler_call` strict bails gone (1609/1614/1615 strict-clean);
1616 now bails on a separate, pre-existing unported bitwise/shift VM gap (`shr`),
to port first in P5.6. 703/0 both gates.

Also sweep the outdated "flat memory" terminology to "comptime/byte-addressable"
across comptime_vm + the plan/checkpoint/CLAUDE docs: the comptime VM is
arena-backed, byte-addressable memory where `Addr` is a real host pointer, not a
flat contiguous address space (flag names `-Dcomptime-flat`/`SX_COMPTIME_FLAT` kept).
2026-06-19 13:21:09 +03:00

# PLAN — Comptime Bytecode VM + comptime memory (then re-home the compiler-API on it)
> **Direction change (2026-06-17).** The comptime compiler-API stream pivots off the
> **byte-weld**. The weld (sx structs whose layout is validated to mirror the
> compiler's Zig types) + the **serialization / marshaling** bridge at the call
> boundary is the wrong direction — it bolts a parallel layout regime and hand-built
> byte-copies onto a comptime value model that fundamentally isn't bytes. We strip it
> and build the right foundation: a **bytecode VM over byte-addressable
> memory**, where comptime values ARE native bytes (like runtime). On that base the
> compiler-API needs no weld, no validation, no marshaling — the compiler's own types
> are read/built directly as memory and its functions take/return real pointers.
>
> Supersedes the build order in `design/comptime-compiler-api.md` (kept for history).
> This is the active plan for the stream. Branch: `reify`.
## Why
`src/ir/interp.zig` is a tree-walking interpreter over the SSA IR that represents
every value as a tagged `Value` union (`int`, `float`, `aggregate: []const Value`,
`type_tag`, `heap_ptr`, …). Two consequences:
1. **Slow.** Per-value boxing in a tagged union; per-op `switch` over `Inst`; an
aggregate is a heap `[]const Value`, walked element-by-element.
2. **Not native memory.** A struct value is `[]const Value` (tagged unions), NOT the
struct's bytes. So a comptime `@ptrCast(*StructInfo)` reads the `Value` union's
memory, not a `StructInfo` — which forced the whole weld+marshal detour.
Make comptime values **native bytes in byte-addressable memory** and both problems dissolve:
structs/arrays/slices are their bytes at natural layout (no weld), the compiler's own
records are directly addressable (no marshal), and a bytecode loop over comptime memory is
fast.
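
To make the contrast concrete, here is a minimal Python sketch (not the compiler's code) of one `Point { x: i32, y: i32 }` value in both representations. The `Tagged` class, the `load_i32` helper, and the little-endian 8-byte layout are illustrative assumptions, not names from `interp.zig`.

```python
import struct
from dataclasses import dataclass

# Legacy shape (illustrative): every value is a boxed, tagged node.
@dataclass
class Tagged:
    tag: str
    payload: object

point_tagged = Tagged("aggregate", [Tagged("int", 3), Tagged("int", 4)])

# Native shape (assumed layout): two little-endian i32 fields at offsets 0 and 4.
POINT = struct.Struct("<ii")
memory = bytearray(16)
POINT.pack_into(memory, 0, 3, 4)

def load_i32(mem: bytearray, addr: int) -> int:
    """Read a signed 32-bit field straight from bytes, as a pointer cast would."""
    return struct.unpack_from("<i", mem, addr)[0]

# Reading field `y` is one load at a known offset; there is no union to unwrap.
y_native = load_i32(memory, 4)
# The tagged form has to walk through the boxes to reach the same number.
y_tagged = point_tagged.payload[1].payload
```

In the tagged form the bytes behind `point_tagged` are box headers, so casting a pointer to them cannot yield a `Point`; in the native form the first 8 bytes of `memory` already are the struct.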
## End state
- Comptime execution = a **bytecode VM** over a **byte-addressable memory** (real
host-allocated bytes; layout is **target-aware** via the type table's sizes). Values
are bytes at addresses plus a scalar register file. No tagged `Value` union.
- The comptime compiler-API: the compiler **exposes its real types + functions** to
comptime sx. sx reads/builds them as native memory and calls compiler functions by
pointer. No `abi(.zig)` weld, no `validateStructLayout`, no `register_struct`
field-by-field marshaling — gone.
- `declare`/`define`/`type_info` and `#compiler`/`BuildOptions` ride this one
mechanism; the bespoke interp arms are deleted.
- **ONE evaluator at the end — non-negotiable.** The legacy tagged-`Value` interpreter
(`interp.zig`) is **DELETED**. We do NOT ship both permanently. "Dual-path"
(a compiler-API fn with both a legacy `compiler_lib` handler AND a VM-native impl) and
the emit-time legacy fallback are **transitional only** — scaffolding while the VM
reaches parity at BOTH comptime sites (emit time AND lowering time). The flag
`-Dcomptime-flat` is the swap mechanism; once the VM runs everywhere with parity, the
flag, the fallback, and `interp.zig` all go. Any "VM-only at emit, legacy at lowering"
split is a waypoint, never the destination.
## Principles (hold at every step)
- **Green at every step.** `zig build && zig build test` pass after each sub-step. The
existing tagged-`Value` interpreter stays the live evaluator until the VM reaches
corpus parity; swap behind a build flag, then delete the old path.
- **Target-aware, not host-baked.** Comptime-memory layout uses the type table's target
sizes (`pointer_size`, `typeSizeBytes`/offsets), NEVER host `@sizeOf`. This is what
keeps cross-compilation correct (the JIT-comptime alternative could not).
- **Sandboxed.** Comptime-memory accesses are bounds-checked; step/call-depth budgets
remain; an OOB / bad access traps to a build-gating diagnostic with a source span —
never a compiler-process crash.
- **No silent fallbacks** (per CLAUDE.md): an unhandled op / shape bails loudly with a
named reason, never a zero/default that looks like success.
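
The target-aware principle can be sketched as a layout routine that takes the pointer size as a parameter instead of asking the host. This is a simplified illustration under stated assumptions: `layout`, `field_size`, and the "natural alignment equals size" rule are hypothetical stand-ins for the type table's `typeSizeBytes`/offset queries.

```python
from typing import List, Tuple

def align_up(offset: int, align: int) -> int:
    return (offset + align - 1) // align * align

def field_size(kind: str, pointer_size: int) -> int:
    sizes = {"i8": 1, "i16": 2, "i32": 4, "i64": 8}
    # A pointer's size comes from the TARGET, never from the host running the compiler.
    return pointer_size if kind == "ptr" else sizes[kind]

def layout(fields: List[str], pointer_size: int) -> Tuple[List[int], int]:
    """Return (field offsets, total size), assuming natural alignment == size."""
    offsets, offset, max_align = [], 0, 1
    for kind in fields:
        size = field_size(kind, pointer_size)
        offset = align_up(offset, size)
        offsets.append(offset)
        offset += size
        max_align = max(max_align, size)
    return offsets, align_up(offset, max_align)

# The same struct laid out for a 64-bit and a 32-bit target.
offsets_64, size_64 = layout(["i8", "ptr", "i32"], pointer_size=8)
offsets_32, size_32 = layout(["i8", "ptr", "i32"], pointer_size=4)
```

The two calls give different offsets for the same field list, which is why a layout baked from the host's sizes would be wrong when cross-compiling.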
## Phases
### Phase 0 — Strip the weld / serialize / marshal machinery
Delete the wrong-direction code so the VM builds on a clean base. Pure removal +
corpus rebaseline; suite green.
- `src/ir/compiler_lib.zig`: the reflection (`weldStruct` / `bound_types` /
`FieldLayout` / `BoundType`), the layout validation (`validateStructLayout` /
`LayoutMismatch` / `SxField`). Decide the fate of the `bound_fns` host-call registry
(`intern`/`text_of` handlers) — it is likely subsumed by the VM's compiler-call path
in Phase 3, but `intern`/`text_of` may survive as the first such calls.
- `src/ir/lower/nominal.zig`: `validateWeldedStruct` + `weldedFieldOrderStr` + the
`sd.abi == .zig` validation call in `registerStructDecl`.
- `src/ir/interp.zig`: the `compiler_welded` dispatch branch.
- `src/backend/llvm/ops.zig`: the `emitCall` comptime-only gate keyed on
`compiler_welded` (re-derive the comptime-only guard from a non-weld signal if still
needed).
- Corpus: retire / convert the weld examples + diagnostics — `0625`, `0627` (welded
struct), `1183`, `1186` (weld-layout diagnostics), `1184`/`1185` (welded-fn). Keep
`0626` (`intern`/`text_of` round-trip) only if it survives the new call path.
- **Keep (re-evaluate in Phase 3), independent of the weld semantics:** the
`#library "compiler"` decl, the `abi(.x)` annotation + `extern <lib>` syntax, and the
`callconv → abi` unification. These are surface syntax that may still serve the
compiler-API; only the *weld semantics* are stripped here.
**Verification:** `zig build test` green with the weld machinery gone; the surviving
syntax still parses (parser unit tests).
### Phase 1 — Flat-memory value model (still IR-walking, no bytecode yet)
Introduce comptime memory and move comptime values onto it, **decoupled from bytecode** so
the value-model change is isolated. Each sub-step ports one op group and keeps the
corpus green; the OLD tagged path stays behind a build flag (`-Dcomptime-flat`) until
all groups land, then the shim is deleted.
1. **Machine + scalars.** A comptime memory region (host `[]u8`) with a stack (frames) +
bump-allocated heap, and a scalar register file. Port `int`/`float`/`bool`/`undef`
and arithmetic/compare/branch. Aggregates still go through a compat shim to the old
representation.
2. **Aggregates.** Structs/arrays/tuples laid out in comptime memory at **target** layout;
port `struct_init` / `struct_get` / `array` / `index_gep` to read/write bytes at
computed offsets.
3. **Slices / strings.** `{ptr, len}` fat pointers in comptime memory.
4. **Optionals / enums / tagged unions.** Tag + payload bytes.
5. **Pointers.** `alloca` / `store` / `load` / GEP unified onto comptime addresses; retire
`slot_ptr` / `heap_ptr` / `byte_ptr` in favor of comptime addresses.
6. **Closures.** Fn id + captured env materialized in comptime memory.
7. **Extern / host calls.** A struct arg is already bytes → pass its address; this
removes most of `marshalExternArg`.
8. **Reflection / minting.** `declare` / `define` / `type_info` read comptime
values; type-table mutation copies escaping data into compiler-owned memory at the
boundary (lifetime), as today.
**Verification:** with `-Dcomptime-flat` the full corpus (currently 692) is byte-for-
byte identical to the tagged path; then make the VM the default and delete the shim.
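
A rough model of sub-steps 1 and 3, written as a Python sketch rather than the real `Machine`: a byte region with a bump allocator, bounds-checked word access, and a `{ptr, len}` fat pointer for strings. The class and method names are hypothetical, frames and the register file are omitted, and the reserved null address is an assumption of this sketch.

```python
import struct

class OutOfBounds(Exception):
    """Raised instead of crashing; a caller can turn it into a bail."""

class Machine:
    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)
        self.heap_top = 8          # address 0 is treated as the null address

    def alloc(self, size: int, align: int = 8) -> int:
        addr = (self.heap_top + align - 1) // align * align
        if size < 0 or addr + size > len(self.mem):
            raise OutOfBounds(f"alloc of {size} bytes")
        self.heap_top = addr + size
        return addr

    def _check(self, addr: int, size: int) -> None:
        if addr <= 0 or addr + size > len(self.mem):
            raise OutOfBounds(f"access of {size} bytes at {addr}")

    def write_word(self, addr: int, value: int) -> None:
        self._check(addr, 8)
        struct.pack_into("<Q", self.mem, addr, value & 0xFFFFFFFFFFFFFFFF)

    def read_word(self, addr: int) -> int:
        self._check(addr, 8)
        return struct.unpack_from("<Q", self.mem, addr)[0]

    def make_string(self, text: bytes) -> int:
        """Store the bytes plus a {ptr@0, len@8} fat pointer; return its address."""
        data = self.alloc(len(text), 1)
        self.mem[data:data + len(text)] = text
        fat = self.alloc(16)
        self.write_word(fat, data)
        self.write_word(fat + 8, len(text))
        return fat

    def read_string(self, fat: int) -> bytes:
        ptr, length = self.read_word(fat), self.read_word(fat + 8)
        self._check(ptr, length)
        return bytes(self.mem[ptr:ptr + length])
```

Every access goes through `_check`, so a null or out-of-range address surfaces as an `OutOfBounds` error that the evaluator can report, not as a crash of the host process.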
### Phase 2 — Bytecode
Compile a comptime function's IR → a compact bytecode and execute the bytecode instead
of walking `Inst`. Pure encoding/speed; semantics identical to Phase 1. Land at least a
minimal register-bytecode loop (the stream's stated goal is a *bytecode* VM); a
fragment cache is optional follow-up.
**Verification:** corpus identical to Phase 1; comptime throughput measurably improved
on a heavy-comptime micro-benchmark.
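
For orientation, a minimal register-bytecode loop might look like the Python sketch below. The opcode set, the `(opcode, dst, a, b)` instruction shape, and the `step_budget` parameter are all invented for illustration; the plan does not specify the real encoding.

```python
from typing import List, Tuple

# Hypothetical opcodes, for illustration only.
LOAD_IMM, MUL, ADD, RET = range(4)
Instr = Tuple[int, int, int, int]   # (opcode, dst, a, b)

def run(code: List[Instr], num_regs: int = 4, step_budget: int = 1000) -> int:
    regs = [0] * num_regs
    pc = 0
    for _ in range(step_budget):
        if pc >= len(code):
            raise RuntimeError("fell off the end of the bytecode")
        op, dst, a, b = code[pc]
        pc += 1
        if op == LOAD_IMM:
            regs[dst] = a
        elif op == MUL:
            regs[dst] = regs[a] * regs[b]
        elif op == ADD:
            regs[dst] = regs[a] + regs[b]
        elif op == RET:
            return regs[dst]
        else:
            # No silent fallback: an unknown op fails with a named reason.
            raise RuntimeError(f"unhandled opcode {op}")
    raise RuntimeError("step budget exhausted")

program = [
    (LOAD_IMM, 0, 6, 0),
    (LOAD_IMM, 1, 7, 0),
    (MUL, 2, 0, 1),
    (RET, 2, 0, 0),
]
result = run(program)
```

The loop dispatches on a small integer per instruction and keeps scalars in a flat register list, which is the encoding and speed change Phase 2 is after; the step budget mirrors the sandboxing principle.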
### Phase 1.final — host wiring (the remaining integration)
The wiring ENTRY POINT exists: `comptime_vm.tryEval(gpa, module, func_id) ?Value` runs a
comptime function entirely on the VM and returns a legacy `Value`, or `null` to fall
back. Unit-tested (pure `6*7` → 42; unsupported → null). Remaining to actually route the
host through it:
1. **Panic→error hardening (prerequisite).** `Machine.readWord`/`writeWord`/`bytes`
currently `assert` (debug panic) on null/OOB. For arbitrary host functions to be
safe, make them return `error.OutOfBounds` so a malformed run BAILS (→ null → legacy)
instead of crashing the compiler. Ripples through `readField`/`writeField`/slice
helpers (add `try`).
2. **Implicit context.** Host comptime functions may have `has_implicit_ctx` (param 0 =
`*Context`); the legacy `run` materializes a default ctx. The VM `run` does not — so
either materialize it too, or only route `tryEval` at funcs without implicit ctx.
3. **Wire one site** behind a flag/env (`SX_COMPTIME_FLAT`, → `-Dcomptime-flat` later):
the const-init fold in `emit_llvm.zig` `emitGlobals` (`result = tryEval(...) orelse
interp.call(...)`). Default off → corpus unaffected.
4. **Parity + coverage.** Run the corpus with the flag ON; results must be byte-identical
to legacy. Measure how many comptime evals the VM already handles; the bail `detail`s
name what to port next (tagged-union payload / any / closures / builtins).
5. Grow coverage (port the deferred ops + `call_builtin`/`compiler_call` via the bridge)
until the VM is the default and the legacy path is deleted.
**Status (2026-06-17): steps 1–4 DONE; step 5 = the next session.**
- **(1) Hardening — DONE.** `Machine.readWord`/`writeWord`/`bytes` return
`error.OutOfBounds` (null / out-of-range / oversized / overflow-safe) instead of
asserting. `OutOfBounds` added to `Vm.Error`; `try` threaded through
`readField`/`writeField`/`optHas`/`makeSlice`/`sliceLen`/`sliceData`/`elemAddr` and
every exec arm + the bridge. New unit tests: hardened-accessor OOB returns, and a
null-deref function → `tryEval` returns `null` (legacy fallback), not a panic.
- **(2) Implicit context — DONE (materialized, 2026-06-17 step 5).** Initially a
conservative skip; now `tryEval` MATERIALIZES the implicit ctx: a comptime entry with
`has_implicit_ctx` (whose sole param is the `*Context`) gets a zeroed `Context` of the
right size/align allocated in comptime memory, its address passed as arg 0. The common
const body never reads the ctx; a body that USES the allocator loads a fn from it and
`call_indirect`s (unported) → bails → legacy. No func-ref materialization was needed:
handled bodies don't read the ctx contents, and gate-ON corpus parity (688, 0 failed)
empirically confirms no divergence. (A body that read+branched on a null allocator fn
could in principle diverge; none does — parity is the guard.)
- **(3) Wire one site — DONE.** Const-init fold in `emitGlobals` is `(if (comptime_flat)
tryEval(...) else null) orelse interp.call(...)`. Gated by env `SX_COMPTIME_FLAT`
(a `LLVMEmitter.comptime_flat` field read once from `std.c.getenv` in `init`).
Default OFF → corpus unaffected (688 green).
- **(4) Parity + coverage — DONE.** Gate ON: full corpus byte-identical (688, 0 failed);
manual `sx run` of 0605/0606/0607/0608 byte-identical to gate-OFF. Coverage-trace
facility in place (`comptime_vm.last_bail_reason` + env `SX_COMPTIME_FLAT_TRACE`,
printing HANDLED / fallback+reason per init).
- **(5) Implicit-context materialization + memory builtins + f32 — DONE; op-porting CONTINUES.**
Coverage climbed **0 → 16 → 27** handled corpus const-inits (fallbacks 22 → 11); parity
stays **688/688** (gate ON and OFF) at every step. Landed, in order: implicit ctx
materialized (→16); `writeField` null-aggregate fix (storing a `null` non-pointer
optional `null_addr` sentinel into an aggregate slot OOB-bailed → now ZEROES the
destination = none/empty; unit-test regression); curated libc MEMORY builtins on comptime
memory (`Vm.callMemBuiltin`: `malloc`/`calloc` → `allocBytes` 16-aligned & 256-MiB-capped,
`free` → no-op, `memcpy`/`memmove`/`memset` on comptime bytes — sandboxed, target-aware,
result byte-identical to legacy; unlocked `0604`'s 11 comptime mallocs); and an **f32
storage fix** (float registers hold f64 bits, but f32 memory is the 4-byte single —
`readField`/`writeField` now `@floatCast` instead of truncating the f64 bits, which had
written zeros for `1.0`; a real latent bug `0604` surfaced; unit tests added).
- **(6) Real default context + call_indirect + func_ref + global_get — DONE.** Coverage
**27 → 31** handled (fallbacks 11 → 7); parity stays **688/688** both gate ON and OFF.
Per the user's direction ("the VM can set up a default context"), `runEntry` now
materializes the REAL default context (not a zeroed one): the implicit-ctx param is an
opaque `*void`, so `materializeDefaultContext` finds the `__sx_default_context` global
and lays its initializer constant (`{ {null, alloc_fn, dealloc_fn}, null }`, carrying
the CAllocator thunk func-refs) into comptime memory via a new recursive `layoutConst`.
With `func_ref` (a function value encoded as `FuncId.index() + 1` so word 0 stays
reserved for the NULL function pointer — `funcRefWord`/`funcRefToId`) and `call_indirect`
(decode the callee word → `FuncId` → dispatch; 0 → bail) ported, a comptime body
that allocates via `context.allocator` now runs ENTIRELY on the VM: `alloc_string` →
`context.allocator.alloc_bytes` → `call_indirect` → thunk → `CAllocator.alloc_bytes` →
`libc_malloc` → the VM's native comptime `malloc`. Unlocked `0606` (string global via
the allocator). Also: `global_get` lazily evaluates a comptime global's `comptime_func`
(memoized in `global_cache`) — unlocked `CT_CHAIN`; struct field access (`fieldOffset`/
`struct_get`) now handles string/slice `{ptr@0,len@8}` fat pointers (needed by
`alloc_string`'s `s.ptr`/`s.len`); and `regToValue` maps a function-typed word back to
`.func_ref` so a func-ref result serializes identically to legacy (kept `1128`'s
rejection diagnostic byte-identical). Unit tests added (global_get, func_ref +
call_indirect). **Note: native `malloc` is still REQUIRED** — the CAllocator thunk
bottoms out at libc `malloc`, and the VM can't use a host pointer with comptime
load/store, so comptime `malloc` must allocate from comptime memory. The default context
lets the allocator PROTOCOL run; native `malloc` is its final step.
- **(7) `is_comptime` + failable/error cluster + the signed-load fix — DONE.** Coverage
**31 → 36** handled (fallbacks 7 → 2); parity stays **688/688** both gate ON and OFF.
- **`is_comptime`** → always 1 on the VM (folds to false in compiled code). Unlocked `1030`.
- **Failable / error-channel cluster** (`1037` escape, `1038` handled): `kindOf(error_set)
→ word` (a u32 tag id); `regToValue` now bridges TUPLES (the failable `(value…, tag)`
shape the host's `checkComptimeFailable` reads); `trace_frame` packs `(func_id<<32 |
span.start)` from a new `call_stack` (pushed by `invoke`/`runEntry`); and `sx_trace_push`
/ `sx_trace_clear` are serviced NATIVELY (the VM calls the real sx_trace.c functions —
linked into the compiler — so the return-trace buffer the host reads is populated
identically to the legacy dlsym path). `raise`/`catch`/`or` all run on the VM now.
- **Signed sub-64-bit load fix (a real GENERAL bug the failable case surfaced):**
`readField` now SIGN-extends `i8`/`i16`/`i32`/`isize` loads (was zero-extending, so a
stored `i32 -1` reloaded as `0xFFFFFFFF` = +4.29e9 and `< 0` was false — which silently
hid `raise error.Bad`). Affects any negative signed sub-64-bit value stored & reloaded;
gate-ON corpus parity confirms it's a strict fix. Unit test added (+ failable tests
pass via 1037/1038 in the corpus).
- **Remaining fallbacks (2, both principled — the VM correctly stays on legacy):**
`intern` (`0626`, the welded compiler-API fn — Phase 3 re-homes it) and the inline-asm
global call (`1654`, never comptime-evaluable). Every other measured corpus const-init
is handled on the VM.
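
The two storage bugs described in steps (5) and (7) are easy to reproduce outside the compiler. The Python sketch below uses hypothetical `write_field`/`read_field` helpers over a `bytearray` (little-endian assumed) to show what the fixes change; it is not the VM's implementation.

```python
import struct

INT_SIZES = {"i8": 1, "i16": 2, "i32": 4}

def write_field(mem: bytearray, addr: int, kind: str, value) -> None:
    if kind == "f32":
        # Convert to single precision; do NOT copy the low 4 bytes of the f64.
        struct.pack_into("<f", mem, addr, value)
    elif kind in INT_SIZES:
        size = INT_SIZES[kind]
        mask = (1 << (8 * size)) - 1
        mem[addr:addr + size] = (value & mask).to_bytes(size, "little")
    else:
        raise ValueError(f"unhandled kind {kind}")

def read_field(mem: bytearray, addr: int, kind: str):
    if kind == "f32":
        return struct.unpack_from("<f", mem, addr)[0]
    if kind in INT_SIZES:
        size = INT_SIZES[kind]
        # signed=True performs the sign extension.
        return int.from_bytes(mem[addr:addr + size], "little", signed=True)
    raise ValueError(f"unhandled kind {kind}")

mem = bytearray(8)

# Signed load: the old zero-extending read turns -1 into a large positive number.
write_field(mem, 0, "i32", -1)
zero_extended = int.from_bytes(mem[0:4], "little", signed=False)
sign_extended = read_field(mem, 0, "i32")

# f32 store: truncating the f64 bit pattern of 1.0 keeps only zero bytes.
truncated_f64 = struct.pack("<d", 1.0)[:4]
write_field(mem, 4, "f32", 1.0)
round_tripped = read_field(mem, 4, "f32")
```

Here `zero_extended` is `0xFFFFFFFF`, so a `< 0` test on it is false, while `sign_extended` is `-1`; likewise `truncated_f64` is four zero bytes, while the converted store round-trips `1.0`.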
At this point the comptime VM handles essentially the entire real comptime corpus
(scalars, control flow, structs/tuples/arrays/slices/strings/optionals/enums, calls +
recursion, the implicit context + allocator protocol, globals, failables + return
traces). Phase 2 (bytecode) and Phase 3 (compiler-API on comptime memory) are the forward
work; flipping the VM to default + deleting the legacy path awaits those.
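
Two of the word encodings mentioned in steps (6) and (7) are small enough to sketch directly: the function-value word (`FuncId.index() + 1`, with 0 reserved for the null function pointer) and the packed trace frame (`func_id<<32 | span.start`). The Python function names below are hypothetical, and the 32-bit range check on the trace fields is an assumption of this sketch.

```python
from typing import Callable, List, Tuple

NULL_FN_WORD = 0   # word 0 stays reserved for the null function pointer

def func_ref_word(func_index: int) -> int:
    return func_index + 1

def func_ref_to_id(word: int) -> int:
    if word == NULL_FN_WORD:
        raise ValueError("call through a null function pointer")   # bail
    return word - 1

def call_indirect(table: List[Callable[[int], int]], word: int, arg: int) -> int:
    """Decode the callee word to a function index, then dispatch."""
    return table[func_ref_to_id(word)](arg)

def pack_trace_frame(func_id: int, span_start: int) -> int:
    if not (0 <= func_id < (1 << 32) and 0 <= span_start < (1 << 32)):
        raise ValueError("trace frame field does not fit in 32 bits")
    return (func_id << 32) | span_start

def unpack_trace_frame(word: int) -> Tuple[int, int]:
    return word >> 32, word & 0xFFFFFFFF

table = [lambda x: x + 1, lambda x: x * 2]
doubled = call_indirect(table, func_ref_word(1), 21)
```

Shifting the index by one means a zeroed slot can never be mistaken for a valid callee, so an uninitialized function pointer fails loudly at the call.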
- **(8) Wire the `#run` side-effect path; trace-clear-on-fallback — DONE.** The second
comptime call site (`emit_llvm.runComptimeSideEffects`, top-level `#run <expr>;`) now
routes through `tryEval` with legacy fallback, like the const-init fold; `tryEval` yields
`.void_val` for a void/noreturn entry. Fixed a trace-corruption the new site exposed
(`1035`): a side-effect that pushes trace frames then bails (on `print`) had the legacy
re-run double-push them — both sites now `sx_trace_clear()` right before the legacy
fallback to discard the VM's partial pushes. Parity **688/688** both gate ON and OFF. All
comptime evaluation now routes through the VM-with-fallback (uniform).
- **(9) `-Dcomptime-flat` build flag — DONE (the "swap behind a build flag" step).** The VM
gate is now a build option (`build.zig` → a `build_opts` module on `mod`; `emit_llvm.init`
reads `build_opts.comptime_flat or SX_COMPTIME_FLAT env`), default OFF. `zig build test
-Dcomptime-flat` runs the FULL corpus on the VM (688/0) — the build-integrated parity
gate. Verified the flag toggles the binary (flag-built `sx` uses the VM with no env var;
default-built does not). This is the prerequisite to eventually making the VM default +
deleting the legacy path (which still awaits Phase 2/3 + broader confidence).
- **(10) Compiler-call path on the VM — `intern`/`text_of` native (Phase 3 SEED) — DONE.**
`invoke` now services a welded `compiler`-library function (the `compiler_welded` flag is
the safety boundary) via `Vm.callCompilerFn` — natively on comptime memory, NO legacy
`Interpreter`: `intern(s: string) -> StringId` reads the string bytes from comptime memory and
`internString`s into the (const-cast) table (pool-only, never touches type layout, so the
VM's cached sizes stay valid); `text_of(id) -> string` materializes the pooled text back
into comptime memory as a fat pointer. Unlocked `0626` — the ONLY remaining const-init fallback
is now the inline-asm global (`1654`, genuinely not comptime-evaluable). Parity **688/688**
both gate ON and OFF; unit test added. This is the mechanism Phase 3 grows: the next
compiler functions (`find_type`, `register_struct`, the reflection readers) are added the
same way — comptime pointer in, handle/pointer out, no marshaling.
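
The `intern`/`text_of` seed from step (10) has a simple shape: read the string's bytes out of comptime memory through its fat pointer, then hand back a small integer handle. A hedged Python sketch follows; `StringPool` and `read_fat_string` are hypothetical names, and `text_of` returns the bytes directly instead of materializing a new fat pointer as the VM does.

```python
import struct
from typing import Dict, List

class StringPool:
    def __init__(self) -> None:
        self._texts: List[bytes] = []
        self._ids: Dict[bytes, int] = {}

    def intern(self, text: bytes) -> int:
        """Return a stable integer id; equal text always yields the same id."""
        existing = self._ids.get(text)
        if existing is not None:
            return existing
        new_id = len(self._texts)
        self._texts.append(text)
        self._ids[text] = new_id
        return new_id

    def text_of(self, string_id: int) -> bytes:
        if not 0 <= string_id < len(self._texts):
            raise IndexError(f"unknown StringId {string_id}")   # bail loudly
        return self._texts[string_id]

def read_fat_string(mem: bytearray, fat: int) -> bytes:
    """Follow a {ptr@0, len@8} fat pointer stored in `mem`."""
    ptr, length = struct.unpack_from("<QQ", mem, fat)
    if ptr + length > len(mem):
        raise IndexError("string out of bounds")
    return bytes(mem[ptr:ptr + length])

# A string "Point" at address 32, described by a fat pointer at address 0.
mem = bytearray(64)
mem[32:37] = b"Point"
struct.pack_into("<QQ", mem, 0, 32, 5)

pool = StringPool()
point_id = pool.intern(read_fat_string(mem, 0))
```

Nothing is marshaled field by field: the call reads bytes that are already in place and returns a scalar handle, which is the pattern the later reflection readers reuse.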
**Phase 3 progress (2026-06-18):**
- **(P3.1) First read-only reflection readers — `find_type` + `type_field_count` (DONE).**
Two more `compiler`-library fns bound the same way as the `intern`/`text_of` seed
(added to `compiler_lib.bound_fns` AND `Vm.callCompilerFn`, native on comptime memory, no
marshaling). A **type handle is a plain `u32` `TypeId`** (exactly like `StringId`), so
both calls keep the seed's clean scalar shape — handle in, scalar out:
`find_type(name: StringId) -> TypeId` (`TypeTable.findByName`) and
`type_field_count(t: TypeId) -> i64` (a new `TypeTable.memberCount` query — struct/union/
tagged-union fields, enum variants, array/vector length — that BOTH the legacy handler
and the VM call, so the two paths can't drift). Example `0628` chains
`intern → find_type → type_field_count` and a not-found lookup, both folded at `#run`,
both VM-HANDLED natively (no fallback). Parity **689/689** (gate ON and OFF); VM unit test
added.
- **Decision (resolves the plan's `find_type → ?Type` sketch):** `find_type` returns a
NON-optional `TypeId`, using the codebase's dedicated `unresolved` (0) sentinel for
not-found — NOT an `?Type`. Rationale: a `Type` value resolves to `.any`
(`type_resolver.zig`), which the comptime VM does not represent; and an optional
return can't cross the legacy↔VM eval boundary (`regToValue` bridges only
word/string/struct/tuple). `unresolved` is the project-blessed unmistakable "no type"
marker (see CLAUDE.md REJECTED PATTERNS — a dedicated sentinel is the required shape),
so the caller checks the handle against 0. This keeps the reader a clean scalar mirror
of `intern`/`text_of` and defers `.any`/optional plumbing to when it's actually needed.
- **(P3.2) Field-level reflection readers — `type_nominal_name` + `type_field_name` +
`type_field_type` (DONE).** Three more readers on the same `TypeId`-handle shape (each
backed by a new `TypeTable` query that BOTH the legacy handler and the VM call, so no
drift): `type_nominal_name(t: TypeId) -> StringId` (`nominalName` — a named type's own
name; loud-bail for unnamed types), `type_field_name(t: TypeId, idx: i64) -> StringId`
(`memberName` — struct/union/tagged-union field, enum variant, named-tuple element), and
`type_field_type(t: TypeId, idx: i64) -> TypeId` (`memberType` — struct/tuple/array/vector
member type). All loud-bail on out-of-range idx / no-member (no silent default). These are
the first MULTI-ARG compiler fns (the VM's `callCompilerFn` now reads arg 1 = idx); added
`Vm.argHandle`/`argTypeId` helpers (range-checked u32/TypeId arg reads). Naming uses the
`type_*` family so nothing collides with the std metatype builtins (`field_name`/`type_name`
exist in `core.sx`). Example `0629` reflects `Pair { lo: Point; hi: Point }` — reads each
field name and the nominal name of a field's type, all folded at `#run`, all VM-HANDLED
natively. Parity **690/690** (gate ON and OFF); VM unit test added.
- **(P3.2b) Kind + enum-value readers — `type_kind` + `type_field_value` (DONE).** The last
two read-only readers the metatype's `type_info(T)` needs, completing the READ side: a
comptime sx fn can now fully reflect a struct/enum/tagged-union/tuple into data with no
`#builtin`. `type_kind(t: TypeId) -> i64` (`TypeTable.kindCode` — a stable, compiler-owned
discriminant: 0 other · 1 struct · 2 enum · 3 tagged_union · 4 tuple · 5 union · 6 array ·
7 vector · 8 error_set; TOTAL — never bails, an unnamed/non-aggregate type reads `other`)
and `type_field_value(t: TypeId, idx: i64) -> i64` (`TypeTable.memberValue` — an enum
variant's explicit value or ordinal; mirrors the `field_value_int` builtin; loud-bail for
a non-enum / out-of-range idx). Example `0630` reflects `Color`/`WindowFlags`(flags)/`Point`.
Parity **691/691** (gate ON and OFF); VM unit test added.
- **READ side now complete:** `find_type` + `type_kind` + `type_field_count` +
`type_field_name` + `type_field_type` + `type_nominal_name` + `type_field_value` cover
everything `reflectTypeInfo` reads.
- **(P3.3) WRITE side — `declare_type` + `pointer_to` + ONE kind-branching `register_type` (DONE).**
The mutating side is a SINGLE `register_type(handle, kind, members)` that branches on `kind`
IN THE COMPILER (subsuming `define`'s `defineStruct`/`defineEnum`/`defineTuple`), plus
`declare_type(name) -> Type` (forward handle) and `pointer_to(t) -> Type` (build `*T`
references). They take/return real `Type` values (matching meta.sx's declare/define).
- **Timing decision (per the user):** mint LAZILY at LOWERING time (single pass, NOT a
pre-emit phase, NOT two-pass) — the existing `runComptimeTypeFunc` path. So the write
side is **legacy-only** (`compiler_lib` handlers); the VM isn't wired at lowering time, so
no VM mirror is needed (the read-side readers stay dual-path for emit-time reflection). A
non-generic `-> Type` builder is now flagged `is_comptime` (`decl.zig`) so its dead body
permits the welded calls (the comptime-only gate).
- **Graph support:** forward `declare_type` handles + `pointer_to` express a
mutually-recursive A↔B graph (`*A`, `*B`, B-by-value) before bodies are filled.
`register_type` is **idempotent** — re-filling a nominal slot (same module reached via two
import edges) re-mints identically instead of erroring (`nominalIdent` reads identity from
any nominal kind). `kind` codes match `type_kind`: 1 struct · 2 enum (actual `.@"enum"`) ·
3 tagged_union · 4 tuple.
- **Two bugs fixed en route** (issue 0142): (a) a fully payloadless comptime-minted enum
was minted as an all-void `tagged_union` → `verifySizes` panic; now mints a real
`.@"enum"` (both `register_type` kind 2 AND the metatype `defineEnum`). (b) bare
`EnumType.variant` qualified construction of a payloadless variant wasn't supported (failed
for hand-written enums too) — added in `lowerFieldAccess` (`isPayloadlessVariant`).
- Examples: `0631` (graph + actual-enum + reflection), `0632` (make_enum all-void),
`0633`/`0634`/`0635` (namespaced / bare / multi-edge import of a minted type), `0187`
(qualified variant construction). Parity 697/697 (gate ON and OFF); unit tests added.
- **Next (P3.4):** re-express `declare`/`define`/`type_info` as sx over the read+write
compiler-API and DELETE the bespoke interp arms — needs the VM hardened against malformed
lowering-time IR first (the metatype runs at lowering time), so either harden + wire the VM
there, or migrate the metatype onto the legacy compiler-API calls first. Decide when reached.
Phase 2 (bytecode) is the orthogonal speed work.
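
The write side in P3.3 can be approximated with the Python sketch below: forward handles from `declare_type`, `pointer_to` for references, and one `register_type` that is idempotent on an identical re-fill. The `TypeTable` class here is a toy stand-in, and rejecting a conflicting re-fill is an assumption of this sketch (the plan only states that an identical re-fill succeeds).

```python
from typing import Dict, List, Tuple

KIND_STRUCT, KIND_ENUM, KIND_TAGGED_UNION, KIND_TUPLE = 1, 2, 3, 4
Member = Tuple[str, int]           # (field name, member type handle)

class TypeTable:
    def __init__(self) -> None:
        # Slot 0 is the `unresolved` sentinel, never a real type.
        self.entries: List[dict] = [{"kind": "unresolved"}]
        self.by_name: Dict[str, int] = {}

    def declare_type(self, name: str) -> int:
        """Forward handle: reserve a nominal slot whose body is filled later."""
        if name in self.by_name:
            return self.by_name[name]
        self.entries.append({"kind": "declared", "name": name, "members": None})
        self.by_name[name] = len(self.entries) - 1
        return self.by_name[name]

    def pointer_to(self, target: int) -> int:
        for type_id, entry in enumerate(self.entries):
            if entry["kind"] == "pointer" and entry["target"] == target:
                return type_id
        self.entries.append({"kind": "pointer", "target": target})
        return len(self.entries) - 1

    def register_type(self, handle: int, kind: int, members: List[Member]) -> None:
        entry = self.entries[handle]
        if "name" not in entry:
            raise ValueError(f"handle {handle} is not a nominal type")
        if entry["members"] is not None:
            # Idempotent: an identical re-fill is accepted (assumed: a conflict is not).
            if (entry["kind"], entry["members"]) != (kind, list(members)):
                raise ValueError(f"conflicting redefinition of {entry['name']}")
            return
        entry["kind"], entry["members"] = kind, list(members)

# A mutually recursive graph: handles exist before either body is filled.
table = TypeTable()
a, b = table.declare_type("A"), table.declare_type("B")
ptr_a, ptr_b = table.pointer_to(a), table.pointer_to(b)
table.register_type(a, KIND_STRUCT, [("peer", ptr_b), ("inner", b)])
table.register_type(b, KIND_STRUCT, [("owner", ptr_a)])
# Reaching the same module through a second import edge repeats the fill.
table.register_type(b, KIND_STRUCT, [("owner", ptr_a)])
```

Because `A` can name `*B` and `B` by value before `B` has a body, the two definitions can be registered in either order.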
### Phase 3 — Compiler-API on comptime memory (resume the stream — no weld)
With native-byte comptime values, re-home the compiler-API:
- **Expose the compiler's real types.** Register the actual `types.zig` records
(`StructInfo`, `EnumInfo`, `Field`, …) into the comptime type table under sx-visible
names, with their **real (host) layout** — the type IS the compiler's, so there is
nothing to validate or keep in sync. (This is the projection that *replaces* the
weld's reflection — owned by the compiler, not declared in sx.)
- **Expose the compiler's functions.** `register_struct`, `find_type`, `intern`,
`text_of`, and the reflection readers operate on comptime pointers / handles
directly (no marshaling — the bytes already ARE the record).
- **Re-express** `declare` / `define` / `type_info` as sx over these; delete the
bespoke interp arms (`defineStruct` / `defineEnum` / `defineTuple` / `reflectTypeInfo`);
migrate `examples/0622` (struct), `0619`/`0620`/`0623` (enum/tuple).
- **Migrate `BuildOptions`** off `#compiler` onto this mechanism; **delete `#compiler`**.
**Verification:** the metatype + `#compiler` surfaces are gone, re-expressed as sx over
the exposed compiler-API; full corpus green.
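
As an illustration of the handle-in, scalar-out reader shape, the following Python sketch models `find_type`, `type_kind`, `type_field_count`, and `type_field_name` over a toy table. The kind codes and the `unresolved` (0) sentinel follow the plan; the `TypeReader` class is hypothetical, and field names are returned as Python strings where the real readers return a `StringId`.

```python
from typing import Dict, List, Tuple

UNRESOLVED = 0   # dedicated "no such type" handle
KIND_CODES = {"other": 0, "struct": 1, "enum": 2, "tagged_union": 3, "tuple": 4,
              "union": 5, "array": 6, "vector": 7, "error_set": 8}

class TypeReader:
    def __init__(self) -> None:
        # Index 0 is the unresolved sentinel.
        self.records: List[Tuple[str, str, List[str]]] = [("", "unresolved", [])]
        self.names: Dict[str, int] = {}

    def add(self, name: str, kind: str, members: List[str]) -> int:
        self.records.append((name, kind, list(members)))
        self.names[name] = len(self.records) - 1
        return self.names[name]

    def find_type(self, name: str) -> int:
        """Non-optional result: the caller compares the handle against 0."""
        return self.names.get(name, UNRESOLVED)

    def type_kind(self, t: int) -> int:
        """Total over valid handles: anything without a listed kind reads `other`."""
        return KIND_CODES.get(self.records[t][1], KIND_CODES["other"])

    def type_field_count(self, t: int) -> int:
        return len(self.records[t][2])

    def type_field_name(self, t: int, idx: int) -> str:
        members = self.records[t][2]
        if not 0 <= idx < len(members):
            raise IndexError(f"field index {idx} out of range")   # loud bail
        return members[idx]

types = TypeReader()
point = types.add("Point", "struct", ["x", "y"])
missing = types.find_type("Missing")
```

Returning a plain sentinel keeps every reader a scalar function, so no optional value has to cross the boundary between evaluators.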
### Phase 4 — Retire the legacy interp (the ONE-evaluator end state)
The metatype CONSTRUCTION + REFLECTION surface is VM-native (steps 7/8: `0614`–`0624`,
`0632` all HANDLED). This phase moves EVERYTHING ELSE off `interp.zig` and deletes it.
**What the legacy interp is still used for (audited 2026-06-18) — five roles:**
| Role | Wired to VM? | Site |
|------|--------------|------|
| **A. Comptime folds** (type-fn / `::` const-init / `#run`) | ✅ VM + legacy fallback | `comptime.zig:530`, `emit_llvm.zig:871`/`971` |
| **B. `#insert` string eval** | ❌ legacy-only (VM wiring reverted — 0737 malformed-IR crash) | `comptime.zig:634` |
| **C. Post-link bundler** (`platform.bundle` — Info.plist/codesign/process/fs) | ❌ legacy-only | `core.zig:invokeByFuncId` ← `main.zig:769` |
| **D. `#compiler` hooks** (`compiler_call` — BuildOptions/bundling) | ❌ legacy-only; `Value`-based ABI | `compiler_hooks.zig`, `interp.zig:1130` |
| **E. Bail diagnostics** (`Interpreter.last_bail_*` statics) | n/a | `main.zig:464` |
Shared substrate everything traffics in: the **`Value`** tagged union (the
`regToValue`/`valueToReg` bridge + the hooks + `core.zig`) and the **host-FFI bridge**
(`host_ffi.zig` + `interp.callExtern` — dlsym + cdecl trampolines for real libc).
**DECISION (2026-06-18, user): UNIFY.** The VM gains a host-FFI escape + real-pointer
translation and runs BOTH sandboxed comptime folds AND the unsandboxed post-link bundler.
`interp.zig` is fully deleted — true ONE evaluator, two modes (sandboxed / host-effects).
**Remaining comptime-fold gaps** (full corpus fallback inventory — 15 examples; 1179/1180
are legitimate negative-test bails that BECOME VM diagnostics, 1145 is a scan artifact):
`box_any`/`unbox_any` (6), `out`/print (2), `global_addr` (1), trace frames (1),
`compiler_call` (2 — role D).
**Sub-phases (dependency order; each its own session, both gates 697/0 after each):**
- **4A — finish comptime ops (small, parity-guarded).** Drive the fold fallback list to
empty except `compiler_call`:
- **4A.1** `box_any`/`unbox_any`. Word case = alloc 16B `{tag@0, value@8}`, tag =
`source_type.index()` (matches legacy comptime; note runtime `anyTag` normalizes
arbitrary-width ints), value via `writeField(source_type)` (so f32 etc. round-trip);
unbox = `readField(addr+8, target)`. Aggregate-Any payload needs the runtime
pointer-in-value-slot shape (`coerceToI64` alloca+ptrtoint) — implement or bail loudly.
- **4A.2** `out`/print → add a VM output buffer; flush through the same path as
`core.flushInterpOutput`.
- **4A.3** `global_addr` (address-of a global in comptime memory).
- **4A.4** trace frames (`sx_trace_*` / `interp_print_frames`).
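
The word case of 4A.1 can be sketched as a 16-byte box with the tag at offset 0 and the value at offset 8. The Python below is an illustration under assumptions: the type indices are made up, only three scalar kinds are handled, and the tag check in `unbox_any` is this sketch's own addition, not something the plan states.

```python
import struct

ANY_SIZE = 16                      # {tag@0, value@8}
FORMATS = {"i64": "<q", "f32": "<f", "f64": "<d"}

def box_any(mem: bytearray, addr: int, type_index: int, kind: str, value) -> None:
    if kind not in FORMATS:
        raise ValueError(f"unhandled Any payload kind {kind}")   # bail loudly
    struct.pack_into("<Q", mem, addr, type_index)
    mem[addr + 8:addr + 16] = bytes(8)          # clear the value slot first
    # Write the value at its own width so that f32 round-trips exactly.
    struct.pack_into(FORMATS[kind], mem, addr + 8, value)

def unbox_any(mem: bytearray, addr: int, type_index: int, kind: str):
    tag = struct.unpack_from("<Q", mem, addr)[0]
    if tag != type_index:          # assumed check, for illustration
        raise TypeError(f"Any holds type {tag}, expected {type_index}")
    return struct.unpack_from(FORMATS[kind], mem, addr + 8)[0]

F32_INDEX, I64_INDEX = 11, 12      # hypothetical type-table indices

mem = bytearray(2 * ANY_SIZE)
box_any(mem, 0, F32_INDEX, "f32", 1.5)
box_any(mem, ANY_SIZE, I64_INDEX, "i64", -7)
```

Aggregate payloads are left out on purpose; per the plan they need the pointer-in-value-slot shape or a loud bail.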
- **4B — VM-native diagnostics (role E). MUST land before deleting legacy.** Today a VM
bail silently falls back; with legacy gone the VM bail IS the user-facing build-gating
diagnostic. Surface the VM's `detail`/span/file into what `main.zig` renders; turn
1179/1180-style bails into proper diagnostics. No diagnostic may regress.
- **4C — `#insert` on the VM (role B).** Re-wire `evalComptimeString` through `tryEval`;
the lowering-time-IR hardening that forced the 0737 revert is already in place. Verify
the `#insert` corpus parity.
- **4D — host FFI on the VM (role D substrate). DONE.** Solved by a better allocator, not a
pin/tag scheme: the comptime memory is now an **arena** of stable host allocations and `Addr`
IS a real host pointer (`4D.0`, `625ba0f`), so a comptime pointer and an FFI-returned host
pointer are the same value — no translation, no realloc hazard. `Vm.callHostExtern`
(`4D.1`, `e7a8708`) dispatches ANY extern via `host_ffi` dlsym + trampolines (args/returns pass
untouched); `4D.2` (`6a7f690`) adds slice/string args (→ NUL-term `char*`) + float guards.
Examples 0636/0637. **(Superseded sub-note:** the earlier "pin the buffer / comptime↔host translate"
hazard is moot — the arena never moves an allocation.)
- **`#compiler` / `compiler_call` — DELETED, replaced by the `abi(.compiler)` ABI (decision 2026-06-18,
REVISED from the earlier `abi(.zig) extern compiler` shape).** A function is *compiler-domain* — it runs in
the comptime evaluator (VM/interp), NEVER in the shipped binary — because its **ABI says so**: `abi(.compiler)`.
No `extern <lib>`, no fake `#library "compiler"`. One annotation covers BOTH roles: (a) the **compiler-API
surface** (`intern`/`find_type`/`build_options`/`set_post_link_callback`/… — bodiless decls whose Zig/VM
handler is the impl, on `compiler_lib`'s export list, dispatched by `Vm.callCompilerFn`); (b) **user
compiler-domain functions** like post-link callbacks (`bundle_main` — BODIED `abi(.compiler)`, lowered for VM
eval but emit-skipped). The `#compiler` struct attribute + the `compiler_call` IR op + the `Value`-based hook
`Registry` (`compiler_hooks.zig`) all **go away**. **Why this is cleaner than the welded-fn approach:** the
former runtime-call enforcement blocker (a `build_options()` call inside an LLVM-emitted callback body) is
MOOT — a compiler-domain function is never emitted, so its compiler-API calls never reach `emitCall`.
**Staged build (each its own step, both gates green):**
- **S1+S2 — DONE (2026-06-18):** introduced `abi(.compiler)`, REMOVED the `.zig` ABI + `abi(.zig) extern
compiler` + `#library "compiler"` (clean cutover, no legacy); migrated all compiler-API examples. The
binding now keys off `fd.abi == .compiler` (`decl.zig` `weldedCompilerFn`); a bodiless `abi(.compiler)`
decl lowers extern-like (declared-not-defined) with no implicit ctx. **700/0 both gates.**
- **S3 — DONE (2026-06-18):** emit_llvm skips BODIED `abi(.compiler)` function bodies. Added an
`is_compiler_domain` flag to the IR `Function`; a bodied `abi(.compiler)` function LOWERS its body (for VM
eval) + is flagged `is_comptime` but is NOT emitted (Pass 2 skip; declared external-linkage so the empty
decl verifies). KEY fix: a call to a comptime-only callee (compiler-API `compiler_welded` OR
`is_compiler_domain`) inside a dead comptime body now emits `undef` instead of a real `call` (`ops.zig`
`emitCall`) — the old `compiler_call` did this; without it an AOT link leaves an undefined `_double`/`_intern`
reference (this also fixed a pre-existing untested AOT breakage of the bodiless compiler-API examples).
`fnIsBodilessCompiler` distinguishes the API surface (declare-only) from a compiler-domain callback (lowered,
emit-skipped). Regression: `examples/0638-comptime-domain-fn-not-emitted` (`double` folds a `#run` const,
absent from the binary, JIT+AOT). **701/0 both gates.**
- **S4 — callback-param propagation: OPTIONAL / DEFERRED (ergonomics only).** Verified 2026-06-18: an
`abi(.compiler)` function is TYPE-compatible with a plain `() -> R` param (the ABI marks the *function* —
`is_compiler_domain` — not its *type*, which stays `() -> R` CC-default). So a callback that needs to be
compiler-domain just declares itself `abi(.compiler)` (S3) and passes to a plain param fine; auto-propagation
from an `abi(.compiler)` PARAM type is a nicety, not a prerequisite for S5. Skipped for now.
- **S5a — DONE (2026-06-18):** the corpus-covered slice. `build_options` + `set_post_link_callback` →
free `abi(.compiler)` functions (VM `callCompilerFn` arms + legacy `compiler_lib` handlers); **`BuildConfig`
threaded into the VM** via a `tryEval` param (the same one `main.zig` forwards — shared with 4E). `build.sx`
extracts `set_post_link_callback` from the `struct #compiler` as a free `ufcs` fn; `bundle_main` + the
platform registrars (`configure`) are `abi(.compiler)`. 37 examples' `.ir` snapshots regen'd (benign:
declaration renumber + `@str` suffix shift — every example imports build.sx via the prelude). Strict
`compiler_call` bails 6→2; 0602/0603/1604/1611 HANDLED. **701/0 both gates.**
- **S5b/S5c (port the ~37 hooks) — SUPERSEDED 2026-06-18 by the sx-driven build pipeline (below).**
Porting each `BuildOptions` accessor to an `abi(.compiler)` function that delegates to a `compiler_hooks`
hook just re-encodes sx-level logic (string setters/getters, `is_macos` triple-matching, list appends) as
compiler hooks. The hooks need NOTHING from the compiler except the `BuildConfig` state. So instead of 37
hooks, **drive the whole build pipeline from sx** (the logical end of "bundling lives in sx"). S5a stays as
a green intermediate; the sx-build-pipeline replaces `build_options`/`set_post_link_callback`/the whole
`#compiler` surface wholesale.
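
The 4A.1 word-case `Any` layout (`{tag@0, value@8}` in 16 bytes, payload written at the source type's width) can be modeled in a few lines. This is a Python sketch, not the VM's Zig code: the `TYPES` table, its indices, and the helper names are hypothetical, and storing the tag as a little-endian 64-bit word is an assumption the plan does not state.

```python
import struct

# Hypothetical type table: index -> (name, struct format of the payload).
# The indices are illustrative, not the compiler's real TypeIds.
TYPES = {
    3: ("i64", "<q"),
    7: ("f32", "<f"),
    9: ("u8", "<B"),
}

ANY_SIZE = 16       # word-case Any: {tag@0, value@8}
TAG_OFFSET = 0
VALUE_OFFSET = 8


def box_any(type_index, value):
    """Allocate a 16-byte Any and write the payload at its source type's width."""
    buf = bytearray(ANY_SIZE)
    # Assumption: the tag occupies a little-endian u64 at offset 0.
    struct.pack_into("<Q", buf, TAG_OFFSET, type_index)
    _, fmt = TYPES[type_index]
    struct.pack_into(fmt, buf, VALUE_OFFSET, value)
    return buf


def tag_of(buf):
    """Read the type tag back from offset 0."""
    (tag,) = struct.unpack_from("<Q", buf, TAG_OFFSET)
    return tag


def unbox_any(buf, target_index):
    """Read the payload at offset 8 using the target type's width."""
    _, fmt = TYPES[target_index]
    (value,) = struct.unpack_from(fmt, buf, VALUE_OFFSET)
    return value
```

Writing the payload with the source type's own format, rather than widening everything to 64 bits, is what lets an `f32` round-trip unchanged in this model.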
### Phase 5 — sx-driven build pipeline (replaces the BuildOptions hooks; decision 2026-06-18, user)
**The build pipeline becomes an sx program.** `BuildConfig` is plain sx data (an ordinary struct, sx-owned
end-to-end — no `#compiler`, no hooks, no shared Zig state, no weld/offset access). The compiler shrinks to
a few `abi(.compiler)` PRIMITIVES that take **explicit args** (so nothing is shared by memory), and an sx
`build()` driver orchestrates configure → emit → link → bundle. **Chosen boundary: Option B** — the compiler
keeps the proven Zig linker as a primitive; sx owns config + orchestration + bundle. (Option A — sx shells
`cc`/`ld` itself — is a later refinement once the per-target link-line logic is ported to sx.)
**File split (user decision 2026-06-19):** the low-level compiler-API PRIMITIVES live in
`library/modules/compiler.sx` (the comptime `compiler` library — renamed from the interim `std/build.sx`); the
default `build` IMPLEMENTATION (`default_build` + the `on_build` slot + the sx `BuildConfig`) lives in
`library/modules/build.sx` alongside the existing `BuildOptions` DSL. So `compiler.sx` = primitives, `build.sx` =
orchestration/default impl. **Build-callback fallibility was DROPPED (user 2026-06-19):** the primitives + the
build callback are NOT `-> !` — a failed action (e.g. `link`) BAILS on the VM (hard build error). So the shapes
below shed their `-> !`.
Shape (build-callback fallibility dropped 2026-06-19):
```sx
// library/modules/compiler.sx (the comptime `compiler` library: PRIMITIVES)
emit_object    :: () -> string abi(.compiler);          // emitted .o path (query)
link           :: (objects: List(string), output: string, libraries: List(string),
                   frameworks: List(string), flags: List(string), target: string) abi(.compiler); // void; bails on failure
c_object_paths :: () -> List(string) abi(.compiler);    // metadata queries
link_libraries :: () -> List(string) abi(.compiler);

// library/modules/build.sx (the build DSL: DEFAULT IMPLEMENTATION + slot)
BuildConfig :: struct {
    output: string; target: string; flags: List(string);
    frameworks: List(string); bundle_path: string; bundle_id: string; ...
}

default_build :: (config: BuildConfig) abi(.compiler) {   // the default pipeline (void)
    obj := emit_object();
    objs := c_object_paths();
    objs.append(obj);
    link(objs, config.output, link_libraries(), config.frameworks, config.flags, config.target);
    if config.bundle_path.len > 0 { bundle_app(config); }  // bundle_app = today's sx bundler
}

on_build : (BuildConfig) abi(.compiler) = default_build;   // the override slot
// user overrides: build :: (config: BuildConfig) abi(.compiler) { ... } #run on_build = build;
```
The compiler's whole post-IR role: codegen → build the CLI-derived `BuildConfig` → read `on_build` → invoke
`on_build(config)` on the VM; a VM bail fails the build (the callback is not `-> !`, so there is no `raise`
path). Plain `sx run` fires none of it.
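
The control flow of the shape above (explicit-argument primitives, a default pipeline, and a replaceable slot that the compiler reads and invokes) can be traced with a small Python model. Everything here is a stand-in: the primitive bodies, the object paths, and the `run_build` helper are hypothetical, and the `calls` list exists only to make the ordering visible.

```python
from dataclasses import dataclass, field

calls = []  # records primitive invocations so the order can be inspected


def emit_object():
    calls.append("emit_object")
    return "main.o"  # hypothetical emitted object path


def c_object_paths():
    calls.append("c_object_paths")
    return ["helper.o"]  # hypothetical C companion object


def link_libraries():
    calls.append("link_libraries")
    return ["c"]


def link(objects, output, libraries, frameworks, flags, target):
    # Every input arrives as an explicit argument; nothing is shared by memory.
    calls.append(("link", tuple(objects), output))


@dataclass
class BuildConfig:
    output: str
    target: str
    flags: list = field(default_factory=list)
    frameworks: list = field(default_factory=list)
    bundle_path: str = ""


def bundle_app(config):
    calls.append(("bundle_app", config.bundle_path))


def default_build(config):
    obj = emit_object()
    objs = c_object_paths()
    objs.append(obj)
    link(objs, config.output, link_libraries(),
         config.frameworks, config.flags, config.target)
    if len(config.bundle_path) > 0:
        bundle_app(config)


on_build = default_build  # the override slot


def run_build(config):
    """Model of the compiler's post-IR step: read the slot, then invoke it."""
    on_build(config)
```

In this model a user override is just a rebinding of `on_build` before `run_build` is called; the primitives stay unchanged because they never read shared state.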
**Steps (each its own green step; depends on 4E first):**
- **P5.1 — 4E prereq — DONE (2026-06-19).** `core.invokeByFuncId` routes the post-link callback through the
**VM** (`comptime_vm.tryEval`), NO fallback (a side-effecting callback can't double-execute): a bail is a hard
build error (`comptime_vm.last_bail_reason` surfaced by `main.printInterpBailDiag`). `BuildConfig` +
`import_sources` threaded in; `flushInterpOutput` deleted (VM `out` writes direct via host-FFI). Smoke test
`examples/1661-platform-post-link-vm-list` (AOT): a post-link callback GROWS a `List` (0141 — works on the VM,
bails on legacy with `struct_get`), so the build succeeds (exit 0) only via the VM. Non-empty callback `args`
rejected loudly (the `on_build(config)` arg-marshaling entry is P5.3). **702/0 both gates.**
- **P5.2 — primitives.** Split: the read-only **metadata queries are DONE (2026-06-19)** — `c_object_paths() ->
List(string)` + `link_libraries() -> List(string)` as `abi(.compiler)` fns (stdlib `library/modules/compiler.sx`),
serviced by `comptime_vm.callCompilerFn` over `BuildConfig` fields `main.zig` forwards; new VM `makeStringList`
builds the `List(string)` in comptime memory from the call's result type (`ins.ty` now threaded through
`invoke`/`callCompilerFn`). Smoke test `1662-platform-build-pipeline-queries` (AOT + C companion). 703/0 both
gates. **`emit_object() -> string` is also DONE (2026-06-19)** as a QUERY (not an action): the Zig driver emits
the object eagerly, so the primitive just returns the path from `BuildConfig.object_path` (no vtable). So all
three QUERY primitives are done. **P5.2b — `link(...)` (the one genuine ACTION) — DONE (2026-06-19).** USER
DECISION: the build callback is NOT fallible, so `link` is plain VOID (no `-> !`) and a failure BAILS (hard
build error) — no failable-tuple construction. It dispatches through a host-installed `compiler_hooks.BuildHooks`
vtable (`comptime_vm.zig` can't depend on the driver); `main.LinkHooksCtx.link` adapts to `target.link`. New VM
readers `readStringList`/`readStringArg` (inverse of `makeStringList`). Smoke test
`1663-platform-build-pipeline-link` (AOT): a post-link callback re-links the build's objects to a temp output —
the relinked binary RUNS; negative-probe verified. The Zig driver still auto-links (removed in P5.4). 704/0.
- **P5.3 — `on_build` registrar — DONE (2026-06-19).** `on_build(cb)` registers the build callback
(`cb: (opt: BuildOptions) -> bool abi(.compiler)`); the compiler force-lowers + auto-invokes the well-known
`default_pipeline` when there is no override. (Implemented as a registrar, not an assignable slot: the opaque
`BuildOptions` handle is one word, so arg-passing needs no struct marshaling.)
- **P5.4 core — DONE (2026-06-19).** `default_pipeline` in `build.sx` drives the whole build; NO Zig
auto-emit/auto-link; `emit_object`/`link` are sx-called actions via the `BuildHooks` vtable;
`set_post_link_callback` deleted (all callers on `on_build`). Build-path auto-imports `modules/build.sx`.
703/0 both gates.
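
The hook-table arrangement described in P5.2b (the VM cannot depend on the driver, so the host installs a table of actions and a failed `link` bails) can be sketched as follows. This is a Python model with hypothetical names (`Bail`, `call_link`, `driver_link`); it is not the `compiler_hooks.BuildHooks` definition, whose real fields are not shown in this plan.

```python
class Bail(Exception):
    """Models a VM bail: a hard build error, not a recoverable result."""


class BuildHooks:
    """Host-installed table of driver actions (model of the vtable idea)."""

    def __init__(self, link):
        self.link = link  # callable(objects, output) -> bool


class Vm:
    def __init__(self):
        # Installed by the host driver before a build callback runs.
        self.hooks = None

    def call_link(self, objects, output):
        # The VM never imports the driver; it only calls through the table.
        if self.hooks is None:
            raise Bail("link: no build hooks installed")
        if not self.hooks.link(objects, output):
            raise Bail(f"link: failed to produce {output}")
        # Void on success: there is no result value to construct.


def driver_link(objects, output):
    """Stand-in for the driver's linker adapter; succeeds if there is input."""
    return len(objects) > 0
```

Because `link` is void and failure is a bail, the caller never has to build or inspect a failable result; the only two outcomes are "returned" and "build error".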
### THE FINAL DIRECTION (user, 2026-06-19): FULL MIGRATION — NO LEGACY LEFT.
**Decision: DROP gate-OFF entirely.** The VM becomes the SOLE comptime evaluator; `-Dcomptime-flat` is made
permanent then removed; `interp.zig` (the legacy tagged-`Value` `Interpreter`) is DELETED. There is no
dual-path, no legacy `compiler_lib` handler, no `regToValue`/`valueToReg` bridge, no VM→legacy fallback. We
migrate the BuildOptions surface DIRECTLY to VM-native `abi(.compiler)` arms (no legacy handler — there is no
legacy to handle). **All bundling + code signing for EVERY target lives in the sx `default_pipeline`.**
- **P5.5 — DONE (2026-06-19).** The 35 `BuildOptions :: struct #compiler` methods migrated to VM-native
`abi(.compiler)`: `BuildOptions :: struct { }` (opaque null-sentinel handle) + 35 free
`ufcs (self: BuildOptions, …) abi(.compiler)` decls in `build.sx`, serviced by a new
`comptime_vm.callBuildOptionFn` arm off `callCompilerFn` — **NO legacy `compiler_lib` handler** (names
registered in `bound_fns` with a single bailing stub only so `weldedCompilerFn` accepts them). Setters dupe the
arg string into the PERSISTENT `Vm.gpa` (the Compilation allocator — threaded into both `tryEval` and
`runBuildCallback` — NOT the per-eval VM arena) and write/append to the threaded `BuildConfig`; string getters
return the field (or `""`); bool getters compute from the triple (`predIsMacOS`/…); count/index getters read the
`BuildConfig` slices. **Dispatch routing (Option B):** a `#run`/const-init entry that directly calls a
compiler-domain/welded fn (`emit_llvm.entryNeedsVm`) runs on the VM with NO legacy fallback regardless of the
`-Dcomptime-flat` gate → gate-OFF stays green without a legacy BuildOptions handler. 5 `platform/bundle.sx`
getter-calling helpers marked `abi(.compiler)` (comptime-only bundler code). 37 `.ir` regenerated (string-pool
churn; behavior-identical, verified `.ir`-only). **703/0 BOTH gates.** BuildOptions `compiler_call` bails GONE
(1609/1614/1615 strict-clean); 1616 now bails on `shr` — a SEPARATE unported bitwise/shift VM gap
(`shl`/`shr`/`bit_and`/`bit_or`/`bit_xor`/`bit_not`), to port FIRST in P5.6 (1616 is unpinned + can't JIT-run on
macOS regardless). Also swept the outdated "flat memory" terminology → "comptime/byte-addressable" (the VM is
arena-backed, `Addr` = real host pointer; flag names `-Dcomptime-flat`/`SX_COMPTIME_FLAT` kept).
- **P5.6 — ALL bundling + code signing in `default_pipeline` (every target).** `default_pipeline` (or a
`bundle()` it calls, in `platform/bundle.sx`) performs, after `link`, the full per-target bundle when
`bundle_path()` is set — branching on `is_macos`/`is_ios_device`/`is_ios_simulator`/`is_android`:
  - **macOS `.app`**: `Contents/{MacOS,Resources,Frameworks}`, `Info.plist`, embed `-framework` dylibs +
    `install_name_tool` fixups, `codesign` (ad-hoc or with `codesign_identity`).
  - **iOS device `.app`**: device slice, embedded `.mobileprovision` (`provisioning_profile`), entitlements,
    `codesign` with the real identity.
  - **iOS simulator `.app`**: sim slice, no provisioning, ad-hoc sign.
  - **Android `.apk`**: `AndroidManifest.xml` (or `manifest_path` override), asset tree (`add_asset_dir`),
    `#jni_main` Java → `javac` → `d8` → `classes.dex`, `aapt2` package, `zipalign`, `apksigner` with the
    debug/`keystore_path` keystore.

  All of it runs on the VM via the migrated `abi(.compiler)` getters + `fs`/`process` host-FFI (the existing
  `platform/bundle.sx` logic, now reading the VM-native accessors instead of `#compiler` hooks). The compiler
  keeps ONLY the linker as a primitive (Option B). Remove the `--bundle`/`post_link_module` Zig shim: bundling
  is `default_pipeline`'s job; CLI flags feed `BuildConfig` and `default_pipeline` branches on it.
- **P5.7 — DELETE all legacy.** Remove the `#compiler` attribute (parse + lower), the `compiler_call` IR op
(`inst.zig` + every switch arm + the `interp.zig:1130` dispatch), `compiler_hooks.zig`
(`HookFn`/`Registry`/all hooks). Make `-Dcomptime-flat` permanent (VM always) and **delete `interp.zig`**
(`Interpreter`/`Value`/`defineEnum`…/`reflectTypeInfo`/`callExtern`/`last_bail_*`); drop the
`regToValue`/`valueToReg` bridge and the VM→legacy fallback in `emit_llvm` (`#run`/const-init) and
`comptime.zig` (type-fn / `#insert`) — a VM bail is now ALWAYS a build-gating diagnostic (4B wiring), never a
fallback. `core.invokeByFuncId` is already VM-only. Re-express `define`/`make_enum` as sx over the
compiler-API. Land the 0141 repro as a corpus test. Reconcile 1654 (asm-global at comptime) to the VM wording.
- **P5.8 — real-project validation (integration).** Build `~/projects/m3te` and `~/projects/distribution` with
the new pipeline end-to-end (their real bundle/codesign/target configs) — these are the acceptance test that
`default_pipeline` covers all targets. Fix gaps surfaced there. Add dedicated bundle smoke tests (min `.app` +
`.apk`) to the corpus (the bundler still has no `zig build test` coverage — the stream's top risk).
**End state:** ONE evaluator (the VM); ZERO legacy; the entire build — emit, link, and all bundling + code
signing for macOS/iOS-device/iOS-sim/Android — is sx in `default_pipeline`, overridable via `#run on_build(...)`.
The compiler is: parse → IR → codegen → invoke `on_build`/`default_pipeline` on the VM (which calls back into
the linker primitive). `m3te` + `distribution` build clean.
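
The P5.5 string-lifetime rule (setters copy their argument into the persistent allocator rather than keeping a pointer into the per-eval arena) is easy to get wrong, so a small model may help. This Python sketch uses hypothetical names (`EvalArena`, `set_output`, `add_framework`); an arena "pointer" is modeled as an `(offset, length)` pair and the persistent copy as an owned `bytes` object.

```python
class EvalArena:
    """Per-eval scratch memory: nothing in it survives reset()."""

    def __init__(self):
        self.buf = bytearray()

    def put(self, text):
        data = text.encode("utf-8")
        start = len(self.buf)
        self.buf += data
        return (start, len(data))  # a "pointer" into this arena

    def read(self, ref):
        start, length = ref
        return bytes(self.buf[start:start + length])

    def reset(self):
        # End of one comptime eval: all arena contents are gone.
        self.buf = bytearray()


class BuildConfig:
    """Persistent build state that outlives any single eval."""

    def __init__(self):
        self.output = b""
        self.frameworks = []


def set_output(config, arena, ref):
    # Dupe into persistent storage: bytes() is an owned copy, not a view.
    config.output = arena.read(ref)


def add_framework(config, arena, ref):
    config.frameworks.append(arena.read(ref))
```

After `arena.reset()`, reading the old reference yields nothing, while the value stored in `BuildConfig` is still intact; that is the property the setters need across separate `#run` evals and the later build callback.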
**Dependencies:** 4A → (4B, 4C independent) ; `abi(.compiler)` S1+S2(done) → S3 → S4 → S5 (BuildOptions) ;
FFI(done)+`BuildConfig`-on-VM → (S5, 4E) → 4F.
**Top risks:** (1) the bundler has no corpus guard (4E needs dedicated tests); (2) deleting
`#compiler`/`compiler_call` + re-expressing `BuildOptions` over the compiler-API (`abi(.compiler)`) touches the
whole build/bundle path — stage it behind real bundle builds; (3) S3's emit-skip relies on DCE dropping the
unreferenced compiler-domain declaration — verify no stray runtime reference keeps it alive (link error).
## Open questions (resolve as reached, record decisions here)
- **Host-ABI vs target-ABI split.** The compiler runs on the host, so its OWN exposed
records are host-laid-out; user comptime types are target-laid-out. The comptime
model must carry both regimes (a per-type ABI tag on layout queries). Confirm the
boundary where a comptime pointer to a compiler record is handed to host Zig code
uses host layout.
- **Exposing compiler types to sx.** Mechanism for projecting `types.zig` records into
the comptime type table with real offsets (the non-weld replacement) — a registry the
compiler owns, keyed by sx-visible name → real Zig type's layout + a host-call ABI.
- **Bytecode shape.** IR-derived vs a fresh ISA; register vs stack; fragment caching.
- **Pointer escape / lifetime.** Comptime-memory pointers stored into the persistent type
table must be copied into compiler-owned memory at the boundary (as today).
- **Old-path retirement.** Keep the tagged interpreter until Phase 1 parity, then
delete — confirm no non-comptime consumer depends on `Value`.
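
To make the host-ABI vs target-ABI question concrete, the sketch below computes a natural-alignment layout for the same record under two pointer sizes. It is a Python illustration only: the `RECORD` fields, the `ABI` table (a 64-bit host and a 32-bit target), and the rule that alignment equals size are all assumptions chosen for the example, not the compiler's layout code.

```python
def layout(fields, pointer_size):
    """Natural-alignment struct layout. fields: list of (name, kind)."""
    sizes = {"u8": 1, "u32": 4, "u64": 8, "ptr": pointer_size}
    offsets = {}
    offset = 0
    max_align = 1
    for name, kind in fields:
        size = sizes[kind]
        align = size  # assumption: alignment equals size for these scalars
        offset = (offset + align - 1) // align * align  # round up to alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align  # tail padding
    return offsets, total


# Hypothetical regimes: a 64-bit host and a 32-bit target.
ABI = {"host": 8, "target": 4}


def layout_for(fields, abi_tag):
    """A layout query that carries an explicit ABI tag."""
    return layout(fields, ABI[abi_tag])


# Hypothetical record with one pointer-sized field.
RECORD = [("kind", "u8"), ("name", "ptr"), ("count", "u32")]
```

With these assumptions the record is 24 bytes with `name` at offset 8 under the host regime and 12 bytes with `name` at offset 4 under the target regime, which is why a layout query without an ABI tag is ambiguous when cross-compiling.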
## File map (current → touched)
| Area | File | Phase |
|------|------|-------|
| Comptime evaluator | `src/ir/interp.zig` | 0 (strip weld dispatch), 1-2 (rebuild) |
| Weld registry | `src/ir/compiler_lib.zig` | 0 (strip), 3 (replace with type/fn exposure) |
| Weld validation | `src/ir/lower/nominal.zig` | 0 (strip `validateWeldedStruct`) |
| Comptime-only gate | `src/backend/llvm/ops.zig` | 0 (re-derive without weld signal) |
| Host-FFI marshalling | `src/ir/host_ffi.zig` | 1 (struct-by-pointer trims it) |
| Metatype arms | `src/ir/interp.zig` (`defineStruct`/…/`reflectTypeInfo`) | 3 (delete, re-express in sx) |
| `#compiler` / BuildOptions | `library/modules/build.sx`, `src/ir/compiler_hooks.zig` | 3 (migrate, delete `#compiler`) |
| Surface syntax | `src/parser.zig`, `src/ast.zig` (`abi`/`extern`/`#library`) | kept; revisited Phase 3 |
## Status
- **Phase 0 — DONE (2026-06-17).** The struct-weld machinery is stripped:
`compiler_lib.zig` lost the type registry (`weldStruct`/`bound_types`/`BoundType`/
`FieldLayout`/`findType`/`SxField`/`LayoutMismatch`/`validateStructLayout`);
`nominal.zig` lost `validateWeldedStruct`/`weldedFieldOrderStr` + the
`sd.abi == .zig` call; the struct-weld unit tests + examples `0625`/`0627`/`1183`/
`1186` are removed. **Decision (recorded):** the `intern`/`text_of` function
host-call bridge is KEPT — it is a clean scalar dispatch (string→handle), not
weld/serialize/marshal, and is the seed Phase 3 grows the compiler-call path from.
So the `compiler_welded` dispatch (`interp.callExtern` is unchanged at HEAD — the
pre-branch in `call()`), `weldedCompilerFn` (decl.zig), the `emitCall` comptime-only
gate (ops.zig), and examples `0626`/`1184`/`1185` stay. The `#library`/`abi`/`extern`
SYNTAX stays. `zig build test` green (688 corpus, 0 failed; unit tests pass).
- **Phase 1 — in progress.**
- **Sub-step 1 — DONE.** `src/ir/comptime_vm.zig`: the comptime `Machine`
(linear byte memory + bump/stack allocator with `mark`/`reset` reclamation +
scalar `readWord`/`writeWord` (1/2/4/8, little-endian) + `bytes` views; addr 0
reserved as `null_addr`) and `Frame` (register file indexed by Ref + stack
reclamation on `deinit`). A register `Reg` is a raw u64 — immediate scalar OR
`Addr`. Standalone + unit-tested (`comptime_vm.test.zig`, in the barrel); does
NOT touch the live interpreter, so the corpus stays green (688). No op execution
yet.
- **Sub-step 2 — DONE.** The executor (`Vm` in `comptime_vm.zig`): walks the SAME
IR `Inst` over comptime frames, mirroring the legacy interp's scalar semantics
(i64 wrapping/signed + f64 register words, keyed off the result/operand `TypeId`).
Ported: constants (`const_int`/`float`/`bool`/`null`/`undef`), arithmetic
(`add`/`sub`/`mul`/`div`/`mod`/`neg`), comparison (`cmp_*`), logical
(`bool_and`/`or`/`not`), conversions (`widen`/`narrow`/`bitcast` passthrough,
`int_to_float`/`float_to_int`), terminators (`br`/`cond_br`/`ret`/`ret_void`) and
`block_param` (branch args passed as Refs — the same frame persists, SSA-safe).
Any other op bails loudly (`error.Unsupported` + `detail = @tagName(op)`).
Unit-tested on hand-built IR (`Fb` builder): integer add, f64 arithmetic, cond_br
branch selection, a block-param loop summing i..1, div-by-zero + unsupported-op
bails. Corpus untouched (688 green) — the executor is exercised by unit tests only,
not yet wired to real comptime eval.
- **Sub-step 3 — DONE.** Memory + structs on comptime memory. `Vm` gained an optional
`table: *const TypeTable` (target-aware layout). Ported `alloca`/`load`/`store`
(over comptime addresses, `Store.val_ty` drives width) and `struct_init`/`struct_get`/
`struct_gep` (structs laid out at the table's natural offsets). The value model: a
`Kind.word` (scalar/pointer ≤8B) sits in a register; a `Kind.aggregate` (struct)
lives in comptime memory and its "value" IS its address (read returns the address,
write memcpys), so nested structs compose and `struct_gep` is just base+offset (no
field-pointer dance). `kindOf` bails loudly on the not-yet-ported types
(slice/string/any/optional/enum/array/tuple/…). The Addr-based value model survives
allocator realloc (offsets are stable; slices are only materialized transiently).
Unit-tested: struct_init+get round-trip, alloca+gep+store+load, nested-struct
aggregate copy + nested read. Corpus untouched (688 green).
- **Sub-step 4a — DONE.** Tuples + arrays. `kindOf` widened (`tuple`/`array` →
aggregate). Ported `tuple_init`/`tuple_get` (positional, `tupleFieldOffset`),
`index_get`/`index_gep` (`elemAddr` = base + idx*elem_size over array/pointer/
many_pointer bases; slice/string bases bail), and `length` on an array value
(static `ArrayInfo.length`). Unit-tested: mixed tuple round-trip, `[3]i64`
gep/store + index_get sum (42), array `length` (3). 688 corpus green.
- **Sub-step 4b — DONE.** Slices + strings as `{ptr@0 (pointer_size), len@8 (i64)}`
fat pointers (`kindOf`: string/slice → aggregate). Ported `const_string` (materializes
text+NUL in comptime memory + a fat pointer), `length`/`data_ptr` (read len/ptr fields),
`array_to_slice`, `subslice`, indexing *through* a slice/string (`elemAddr` loads
`.ptr` first), and `str_eq`/`str_ne` (len+memcmp). Helpers `makeSlice`/`sliceLen`/
`sliceData`. Unit-tested: string length + str_eq/ne, array→slice + slice index +
slice length (23), array subslice (43). 688 corpus green.
- **Sub-step 4c — DONE (optionals + payloadless enums).** `kindOf`: `enum` → word;
`?T` → word if pointer-child (null==0) else `{T@0, i1@sizeof(T)}` aggregate. Ported
`optional_wrap`/`unwrap`/`has_value`/`coalesce` (with `optChildIsPtr`/`optHas`
helpers; `const_null` → `null_addr` reads as none), `enum_init` (payloadless: tag is
the value), `enum_tag` (payloadless/word). Unit-tested: non-pointer `?i64`
wrap/unwrap/coalesce (91), pointer `?*i64` null==0 (99), payloadless enum tag (11).
688 corpus green.
- **Sub-step 4d — partial (`addr_of`/`deref` DONE).** `addr_of` passes through (an
aggregate value already IS its address; a pointer is already an address — mirrors
the legacy); `deref` = `readField` through the pointer (`ins.ty` is the pointee).
Unit-tested (deref a `*i64` → 77; addr_of a struct value + field read → 80).
**Deferred to the wiring phase (intentionally, not ported blind):** tagged-union
payload (`enum_init` w/ payload, `enum_payload` — the legacy stores *untyped* Values
and `field_index` indexes payload sub-fields, not variants, so a byte model's
payload type is ambiguous without a real call site), `any` boxing, closures, and the
bitwise ops. These have subtleties best resolved against actual corpus cases — the
VM's loud `error.Unsupported` + `detail` will name exactly what each real eval needs.
- **Sub-step 1.5 — direct `call` DONE.** `Vm` gained `module: *const Module`
(resolves a callee `FuncId`) + a `depth`/`max_depth` recursion guard. `call`
marshals arg Refs → Reg words and recursively `run`s the callee; aggregate args/
results pass as their `Addr` over the SHARED comptime memory (no copy). **Stack-lifetime
change:** `Frame` no longer reclaims the machine on exit (a returned aggregate's
Addr would dangle) — a comptime eval's allocations live to `Vm.deinit`;
`Machine.mark`/`reset` stay for explicit use. Extern/builtin callees (no blocks)
bail loudly (1.5b). Unit-tested: direct call (`add(20,22)+100` → 142) and recursion
(`sum(0..n)` → 15/55). 688 corpus green.
- **Sub-step 1.5b — `Reg`↔`Value` boundary bridge DONE.** The builtin/`compiler_call`/
extern handlers are all coupled to the legacy `Interpreter` (e.g. `compiler_lib`
handlers take `*Interpreter`), so the VM can't call them directly — the wiring uses
WHOLE-FUNCTION fallback instead (VM runs pure functions; a bail re-runs the whole
eval in the legacy). That needs the boundary bridge: `valueToReg` (host `Value` arg →
VM `Reg`, materializing aggregates into comptime memory) + `regToValue` (VM result →
`Value`, deep-copied out). Covers scalars + strings + structs (other aggregate shapes
bail loudly; added as wiring surfaces them). Transitional — deleted once the VM owns
comptime end-to-end. Unit-tested with round-trips. 688 corpus green.
- **Then the wiring step** (below) — now unblocked.
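
The memory model built up in sub-steps 1 and 4b (byte-addressable memory, address 0 reserved as null, little-endian scalar words, `mark`/`reset` reclamation, and strings as `{ptr@0, len@8}` fat pointers) can be summarized in a short Python model. It is a sketch of the description above, not of `comptime_vm.zig`: the method names are hypothetical, the pointer size is assumed to be 8 bytes, and it models the original linear machine rather than the later arena-backed one.

```python
class Machine:
    """Byte-addressable comptime memory with a bump allocator (model only)."""

    NULL_ADDR = 0

    def __init__(self):
        # Reserve the first 8 bytes so no allocation ever gets address 0.
        self.mem = bytearray(8)

    def alloc(self, size, align=8):
        addr = (len(self.mem) + align - 1) // align * align
        self.mem.extend(bytes(addr + size - len(self.mem)))
        return addr

    def mark(self):
        return len(self.mem)

    def reset(self, mark):
        del self.mem[mark:]  # reclaim everything allocated after the mark

    def write_word(self, addr, value, width):
        assert width in (1, 2, 4, 8)
        mask = (1 << (8 * width)) - 1
        self.mem[addr:addr + width] = (value & mask).to_bytes(width, "little")

    def read_word(self, addr, width):
        assert width in (1, 2, 4, 8)
        return int.from_bytes(self.mem[addr:addr + width], "little")

    def make_string(self, text):
        """Materialize text + NUL, then a {ptr@0, len@8} fat pointer."""
        data = text.encode("utf-8")
        ptr = self.alloc(len(data) + 1, align=1)  # trailing byte stays 0 (NUL)
        self.mem[ptr:ptr + len(data)] = data
        fat = self.alloc(16)
        self.write_word(fat, ptr, 8)              # ptr field at offset 0
        self.write_word(fat + 8, len(data), 8)    # len field at offset 8
        return fat

    def string_of(self, fat):
        ptr = self.read_word(fat, 8)
        length = self.read_word(fat + 8, 8)
        return bytes(self.mem[ptr:ptr + length]).decode("utf-8")
```

The "value of an aggregate is its address" rule falls out of this directly: `make_string` returns the address of the fat pointer, and every reader goes through `read_word` at a fixed offset from it.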
### Decision (2026-06-17): pivot from blind op-porting to CALLS + hybrid wiring
The common leaf ops are ported (scalars, control flow, structs, tuples, arrays, slices,
strings, optionals, payloadless enums, deref/addr_of) and unit-tested. Continuing to
port the rarer ops (tagged-union payload, any, closures) in isolation risks subtle
bugs and has low signal. The higher-value path:
1. **Calls (sub-step 1.5)** — `call` (direct), then `call_builtin`/`compiler_call`. The
shared comptime memory makes aggregate args/results pass naturally (they're Addrs). The
one design point is **aggregate-return lifetime**: a callee's per-frame stack reclaim would
leave a returned struct Addr dangling, so for comptime (bounded) the VM should stop
reclaiming per-frame and let the whole eval's allocations live until `Vm.deinit`
(keep `Machine.mark/reset` for explicit use; drop it from `Frame.deinit`).
2. **Hybrid wiring** — `-Dcomptime-flat` routes a comptime eval through the VM, falling
back to the legacy interp on `error.Unsupported`. This makes the VM run the REAL
corpus, proving parity incrementally and surfacing exactly which ops each real eval
needs — far better signal than more isolated unit tests.
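
The hybrid wiring in point 2 amounts to "try the VM, and on an unsupported op re-run the whole eval on the legacy path while recording what was missing". The Python model below illustrates that whole-function fallback for the 2026-06-17 decision only (the later final direction removes the fallback). The tiny stack-machine program format, the `VM_OPS` set, and all function names are hypothetical.

```python
class Unsupported(Exception):
    """Models error.Unsupported; detail names the op that was not ported."""

    def __init__(self, detail):
        super().__init__(detail)
        self.detail = detail


BINOPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "shr": lambda a, b: a >> b,
}

ALL_OPS = {"const_int"} | set(BINOPS)
VM_OPS = {"const_int", "add", "mul"}  # hypothetical: shr is not ported yet


def _run(program, supported):
    """Evaluate a list of (op, arg) pairs on a small operand stack."""
    stack = []
    for op, arg in program:
        if op not in supported:
            raise Unsupported(op)
        if op == "const_int":
            stack.append(arg)
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(BINOPS[op](a, b))
    return stack.pop()


def vm_eval(program):
    return _run(program, VM_OPS)


def legacy_eval(program):
    return _run(program, ALL_OPS)


def eval_comptime(program, fallbacks):
    """Try the VM first; on a bail, re-run the WHOLE eval on the legacy path."""
    try:
        return vm_eval(program)
    except Unsupported as bail:
        fallbacks.append(bail.detail)  # the signal: which op this eval needs
        return legacy_eval(program)
```

The `fallbacks` list plays the role of the fallback inventory: running a corpus through `eval_comptime` yields both the correct results and the exact set of ops still to port.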