P5.5: migrate the 35 BuildOptions accessors off #compiler to VM-native abi(.compiler)

`BuildOptions :: struct #compiler { ...35 methods... }` becomes
`BuildOptions :: struct { }` (an opaque null-sentinel handle) plus 35 free
`ufcs (self: BuildOptions, …) abi(.compiler)` decls in build.sx, each serviced
by a new `comptime_vm.callBuildOptionFn` arm (dispatched from `callCompilerFn`). No legacy
`compiler_lib` handler: the names are registered in `bound_fns` with a single
bailing stub only so `weldedCompilerFn` accepts them.

- String lifetime: setters dupe the arg into the persistent `Vm.gpa` (the
  Compilation allocator, threaded into both `tryEval` and `runBuildCallback` —
  not the per-eval VM arena) and write/append to the threaded `BuildConfig`.
  Getters read the field/slice or compute the target predicate from the triple.
- Dispatch routing (Option B): a `#run`/const-init entry that directly calls a
  compiler-domain/welded fn (`emit_llvm.entryNeedsVm`) runs on the VM with no
  legacy fallback regardless of the `-Dcomptime-flat` gate, so gate-OFF stays
  green without a legacy BuildOptions handler (P5.7 retires the legacy interp).
- Mark the 5 `platform/bundle.sx` getter-calling helpers `abi(.compiler)` (they
  are comptime-only bundler code; otherwise their now-welded getter calls trip
  the runtime-call gate).
- 37 `.ir` snapshots regenerated (std transitively imports build.sx → string-
  pool/type-table indices shift); verified `.ir`-only, zero behavior-stream diffs.
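
The string-lifetime point above can be sketched in Zig. This is a minimal illustration, not the code in `comptime_vm`: `setOutputPath` is a hypothetical helper, the two `ArenaAllocator`s merely stand in for the persistent `Vm.gpa` and the per-eval VM arena, and `BuildConfig` is reduced to the single `output_path` field. It assumes a reasonably recent Zig standard library.

```zig
const std = @import("std");

// Reduced stand-in for the threaded BuildConfig (only one field shown).
const BuildConfig = struct {
    output_path: []const u8 = "",
};

// Hypothetical setter: copy the argument into the persistent allocator
// before storing it, so the config never points into per-eval memory.
fn setOutputPath(persistent: std.mem.Allocator, cfg: *BuildConfig, arg: []const u8) !void {
    cfg.output_path = try persistent.dupe(u8, arg);
}

test "a path set during one eval survives that eval's arena" {
    // Stands in for the long-lived Compilation allocator (`Vm.gpa`).
    var persistent_arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer persistent_arena.deinit();

    var cfg = BuildConfig{};
    {
        // Stands in for the per-eval VM arena, freed when the eval ends.
        var eval_arena = std.heap.ArenaAllocator.init(std.testing.allocator);
        defer eval_arena.deinit();

        const arg = try eval_arena.allocator().dupe(u8, "out/app");
        try setOutputPath(persistent_arena.allocator(), &cfg, arg);
    }
    // The eval arena is gone; the duped copy is still valid.
    try std.testing.expectEqualStrings("out/app", cfg.output_path);
}
```

Storing `arg` directly instead of the duped copy would leave `cfg.output_path` dangling once the inner block exits, which is the failure mode the bullet describes for a `#run`-set path read at post-link time.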

BuildOptions `compiler_call` strict bails gone (1609/1614/1615 strict-clean);
1616 now bails on a separate, pre-existing unported bitwise/shift VM gap (`shr`),
to port first in P5.6. 703/0 both gates.

Also sweep the outdated "flat memory" terminology to "comptime/byte-addressable"
across comptime_vm + the plan/checkpoint/CLAUDE docs: the comptime VM is
arena-backed, byte-addressable memory where `Addr` is a real host pointer, not a
flat contiguous address space (flag names `-Dcomptime-flat`/`SX_COMPTIME_FLAT` kept).
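
To illustrate the terminology change, here is a small Zig sketch of what "arena-backed, byte-addressable memory where `Addr` is a real host pointer" means. It is an assumption-laden toy, not the VM's `Machine`: `allocWord` and this one-size `readWord` are hypothetical helpers, and only `Addr`, `@intFromPtr`, and the use of `std.heap.ArenaAllocator` come from the notes.

```zig
const std = @import("std");

// An address is the allocation's absolute host pointer, not an offset
// into one contiguous buffer.
const Addr = usize;

// Hypothetical helper: allocate one 8-byte word in the arena and return its Addr.
fn allocWord(arena: std.mem.Allocator, value: u64) !Addr {
    const slot = try arena.create(u64);
    slot.* = value;
    return @intFromPtr(slot);
}

// Hypothetical helper: dereference an Addr directly as a host pointer.
fn readWord(addr: Addr) u64 {
    const p: *const u64 = @ptrFromInt(addr);
    return p.*;
}

test "an Addr stays valid while the arena keeps growing" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    const a = arena.allocator();

    const first = try allocWord(a, 41);
    // Later allocations get fresh arena space; earlier ones never move.
    var i: u64 = 0;
    while (i < 1000) : (i += 1) {
        _ = try allocWord(a, i);
    }
    try std.testing.expectEqual(@as(u64, 41), readWord(first));
}
```

Because every allocation keeps its address until the arena is torn down, an `Addr` produced this way is the same kind of value as any other host pointer, which is why "flat contiguous address space" was the wrong description.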
agra
2026-06-19 13:21:09 +03:00
parent af32c3823c
commit ba28488d99
48 changed files with 13896 additions and 14974 deletions


@@ -11,7 +11,7 @@ with ONE welded mechanism. Branch: `reify` (off `master`). Update after every st
> **⚠ DIRECTION CHANGED (2026-06-17). The active plan is now
> [`PLAN-COMPILER-VM.md`](PLAN-COMPILER-VM.md), NOT the weld.**
> The **byte-weld + serialization/marshaling** approach is the wrong direction and is
> being **stripped**. New foundation: a **bytecode VM over flat, byte-addressable
> being **stripped**. New foundation: a **bytecode VM over byte-addressable
> memory** so comptime values are native bytes; then the compiler-API rides on it with
> direct memory access (no weld, no validation, no marshaling). Everything below this
> banner describes the now-superseded weld state (committed on `reify` through
@@ -21,15 +21,15 @@ with ONE welded mechanism. Branch: `reify` (off `master`). Update after every st
> **Why the pivot:** the comptime evaluator (`src/ir/interp.zig`) represents values as
> tagged `Value` unions, NOT native bytes — so a comptime `@ptrCast(*StructInfo)`
> reads the `Value` union's memory, not a struct. The weld tried to bridge that with
> hand-marshaling — exactly what the design set out to kill. Flat memory makes comptime
> hand-marshaling — exactly what the design set out to kill. Comptime memory makes comptime
> values real bytes, so the bridge disappears. (JIT-native comptime was rejected: it
> breaks cross-compilation — host vs target layout — and loses the sandbox. A
> flat-memory VM keeps both while getting native bytes + speed.)
> comptime VM keeps both while getting native bytes + speed.)
>
> **Next action (2026-06-18) — the WHOLE metatype surface is VM-native (steps 7+8, committed through
> `d0ebc55`; step 8 uncommitted).** `declare`/`define`/`type_info` + tagged-union `enum_init` all run
> NATIVELY on the VM (`.call_builtin` exec arm → `callBuiltinVm`; `defineFromInfo` decodes a
> `TypeInfo` from flat memory, `buildTypeInfo` reflects one INTO flat memory — faithful ports of
> `TypeInfo` from comptime memory, `buildTypeInfo` reflects one INTO comptime memory — faithful ports of
> legacy `defineEnum`/`Struct`/`Tuple`/`reflectTypeInfo`). The ENTIRE metatype range `0614`–`0624` +
> `0632` runs **HANDLED with ZERO fallback** (incl. the `define(declare, type_info(T))` round-trips
> `0619`/`0622`/`0623`); VM output byte-matches legacy. `enum_init`/`define`/`type_info` bail loudly
@@ -52,7 +52,7 @@ with ONE welded mechanism. Branch: `reify` (off `master`). Update after every st
> `0602`/`0603` stay on legacy fallback until the BuildOptions migration lands. **Migration shape**
> (end-state, shares the `BuildConfig`-on-the-VM prerequisite with the bundler 4E): (1) each
> `BuildOptions` setter/getter becomes a `compiler` fn in `compiler_lib.bound_fns` + `Vm.callCompilerFn`,
> reading flat-memory args + a `*BuildConfig` threaded into the `Vm` (the same `BuildConfig`
> reading comptime args + a `*BuildConfig` threaded into the `Vm` (the same `BuildConfig`
> `main.zig` forwards); (2) `library/modules/build.sx` declares them `abi(.zig) extern compiler`
> instead of `struct #compiler`; (3) delete the `compiler_call` op + `compiler_hooks.zig`
> `HookFn`/`Registry` + the `#compiler` parse/lower path. See `PLAN-COMPILER-VM.md` Phase 4.
@@ -90,7 +90,7 @@ with ONE welded mechanism. Branch: `reify` (off `master`). Update after every st
> entry):** `c_object_paths() -> List(string)` + `link_libraries() -> List(string)` are `abi(.compiler)` primitives
> (new stdlib home `library/modules/compiler.sx`), serviced by `comptime_vm.callCompilerFn` reading `BuildConfig`
> fields `main.zig` forwards (`c_object_paths`/`link_libraries`). New reusable VM helper `makeStringList` builds a
> `List(string)` in flat memory (target-aware via the result type's offsets); `invoke`/`callCompilerFn` now thread
> `List(string)` in comptime memory (target-aware via the result type's offsets); `invoke`/`callCompilerFn` now thread
> the call's result type (`ins.ty`). Legacy handlers bail loudly (VM-only by nature — post-link). Smoke test
> `1662-platform-build-pipeline-queries` (AOT, C companion → 1 object): a post-link callback checks the VM-built
> list is well-formed; build exit 0 ONLY if so (negative-probe verified: wrong count → "post-link callback
@@ -106,9 +106,13 @@ with ONE welded mechanism. Branch: `reify` (off `master`). Update after every st
> build is sx-driven via `default_pipeline` (force-lowered + auto-invoked; NO Zig auto-emit/auto-link);
> `on_build(cb)` is the sole callback mechanism; `set_post_link_callback` deleted. **703/0 both gates.**
> **NEXT — the FULL MIGRATION (no legacy left), spec'd as Phase 5 steps P5.5–P5.8 in `PLAN-COMPILER-VM.md`:**
> P5.5 migrate the 36 `BuildOptions` `#compiler` methods → VM-native `abi(.compiler)` arms (NO legacy handler —
> direct migration; thread a persistent allocator for setter strings; kills the 4 strict `compiler_call` bails
> 1609/1614/1615/1616) · P5.6 ALL bundling + code signing for EVERY target (macOS/iOS-device/iOS-sim/Android) in
> **P5.5 DONE (2026-06-19, newest Log entry):** the 35 `BuildOptions` `#compiler` methods → VM-native
> `abi(.compiler)` arms (`comptime_vm.callBuildOptionFn`, NO legacy handler); setter strings duped into the
> persistent `Vm.gpa`; `#run`/const-init compiler-domain entries routed to the VM (`entryNeedsVm`, no fallback)
> so gate-OFF stays green; 5 bundle.sx helpers marked `abi(.compiler)`. BuildOptions `compiler_call` bails GONE
> (1609/1614/1615 strict-clean; 1616 now bails on `shr` — a SEPARATE unported bitwise/shift VM gap, do FIRST in
> P5.6). 37 `.ir` regenerated (string-pool churn, behavior-identical). 703/0 BOTH gates. · P5.6 ALL bundling +
> code signing for EVERY target (macOS/iOS-device/iOS-sim/Android) in
> the sx `default_pipeline` · P5.7 DELETE `#compiler`/`compiler_call`/`compiler_hooks`/`interp.zig` + the
> `regToValue` bridge + VM→legacy fallback (drop gate-OFF; VM is the SOLE evaluator) · P5.8 build
> `~/projects/m3te` + `~/projects/distribution` end-to-end as the acceptance test + add `.app`/`.apk` smoke tests.
@@ -389,7 +393,7 @@ What landed:
> marshal machinery (`compiler_lib.zig` reflection+validation, `nominal.zig`
> `validateWeldedStruct`, the `compiler_welded` dispatch, the weld examples/diagnostics
> 0625/0627/1183/1184/1185/1186), keeping the `#library`/`abi`/`extern` *syntax*. Then
> Phase 1 (flat-memory value model). The weld-era "next step" below is **obsolete** —
> Phase 1 (byte-addressable value model). The weld-era "next step" below is **obsolete** —
> kept only as a record of what the weld surface was about to do.
### (obsolete) weld-era next step
@@ -423,6 +427,38 @@ when reached (sentinels or accessor fns; see the design doc Risks).
`List` growth; orthogonal, see `current/CHECKPOINT-METATYPE.md`.)
## Log
- **P5.5 — the 35 `BuildOptions` accessors migrated off `struct #compiler` onto VM-native `abi(.compiler)` (2026-06-19).**
`BuildOptions :: struct #compiler { ...35 methods... }` → `BuildOptions :: struct { }` (an opaque
null-sentinel handle) + 35 free `ufcs (self: BuildOptions, …) abi(.compiler)` decls in
`library/modules/build.sx`, each serviced by a new `comptime_vm.callBuildOptionFn` arm (dispatched from
`callCompilerFn`). **NO legacy `compiler_lib` handler** (per the full-migration direction): the 35 names are
registered in `compiler_lib.bound_fns` only so `weldedCompilerFn` accepts them, with a single bailing stub
`handleBuildOptionsAccessor` (never reached). **String lifetime:** setters dupe the arg string into the
PERSISTENT `Vm.gpa` (the Compilation allocator threaded into both `tryEval` and `runBuildCallback` — NOT the
per-eval VM arena, whose bytes die at `Vm.deinit`), so a `#run`-set path survives to post-link. Setters
write/append the duped string to the threaded `BuildConfig` (`output_path`/`bundle_path`/…, the `link_flags`/
`frameworks`/`asset_dirs` ArrayLists); string getters return the field (or `""`); bool getters compute from the
triple (`predIsMacOS`/`predIsIOS`/…, mirroring the legacy hooks); count/indexed getters read the `BuildConfig`
slices. **Dispatch routing (Option B, chosen at start):** a `#run` / const-init entry that directly calls a
compiler-domain / compiler-welded fn (`emit_llvm.entryNeedsVm`) is routed through the VM with NO legacy fallback
— regardless of the `-Dcomptime-flat` gate — so gate-OFF stays green without a legacy BuildOptions handler
(P5.7 retires the legacy interp entirely). The 5 `platform/bundle.sx` helpers that call getters
(`build_info_plist`/`embed_framework`/`android_bundle_main`/`build_android_manifest`/`compile_jni_main_sources`)
are marked `abi(.compiler)` too (they're comptime-only bundler code; without it their now-welded getter calls
trip the runtime-call gate). **Snapshots:** 37 `.ir` churned (std transitively imports build.sx → string-pool/
type-table indices shift) — regen scoped via `-Dname`; verified ONLY `.ir` changed (zero behavior-stream diffs).
**703/0 BOTH gates.** Strict sweep: the BuildOptions `compiler_call` bails are GONE (1609/1614/1615 strict-clean);
1616 now bails on `shr` (a pre-existing, separate VM gap — bitwise/shift ops `shl`/`shr`/`bit_and`/`bit_or`/
`bit_xor`/`bit_not` are unported in `comptime_vm`, surfaced now that the iOS-device bundler runs further; 1616 is
unpinned + can't JIT-run on macOS anyway). **Also (per user): swept the outdated "flat memory" terminology** —
the comptime VM is byte-addressable, ARENA-backed memory where `Addr` is a REAL host pointer, NOT a flat
contiguous address space; "flat memory"/"flat-memory" → "comptime memory" / "byte-addressable" across
`comptime_vm.zig` + the plan/checkpoint/CLAUDE docs (flag names `-Dcomptime-flat`/`SX_COMPTIME_FLAT` kept).
> **NEXT — P5.6 (ALL bundling + code signing in `default_pipeline`).** First likely sub-task: port the
> bitwise/shift ops (`shl`/`shr`/`bit_and`/`bit_or`/`bit_xor`/`bit_not`) into `comptime_vm` so the real bundler
> path runs on the VM (the 1616 `shr` gap). Then move `platform/bundle.sx`'s per-target logic to read the
> migrated `abi(.compiler)` getters + `fs`/`process` host-FFI, call `bundle()` from `default_pipeline` after
> `link` when `bundle_path()` is set, and remove the `--bundle`/`post_link_module` Zig shim.
- **P5.4 CORE — the whole build is sx-driven via `default_pipeline`; no Zig auto-emit/auto-link (2026-06-19).**
The compiler's post-IR role is now: codegen → invoke the build callback. **There is NO auto-emit / auto-link.**
Commits (all green): (1) **core** (`d178454`) — `emit_object()` is an ACTION (verify+emit via a host
@@ -497,7 +533,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
`main.LinkHooksCtx` (holds allocator/io/base_config/has_jni_main; its `link` adapter unions the explicit
`flags` with the CLI ones and calls `target.link(objects[0], objects[1..], …)` — the linker treats first-vs-rest
as equal inputs). **New VM readers** (inverse of `makeStringList`): `readStringList` (a `List(string)` arg →
`[][]const u8`, element bytes are views into stable flat-memory arena) + `readStringArg` (a `string` arg).
`[][]const u8`, element bytes are views into stable comptime arena) + `readStringArg` (a `string` arg).
Registered `link` on `bound_fns` (legacy stub bails — VM-only). **Smoke test**
`examples/1663-platform-build-pipeline-link` (AOT): a post-link callback re-links the build's own objects (via
`c_object_paths` + `emit_object`) into a temp output through the sx `link` primitive — and the **relinked binary
@@ -513,7 +549,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
`default_build` grows into) and are serviced by `comptime_vm.callCompilerFn` reading two new `BuildConfig`
fields (`c_object_paths`/`link_libraries`) that `main.zig` forwards before the post-link callback (alongside
`binary_path`/`target_triple`/…). **Reusable new piece:** `Vm.makeStringList(table, list_ty, items)` builds a
`List(string)` in flat memory — backing array of `string` fat pointers + the `{items,len,cap}` struct, all laid
`List(string)` in comptime memory — backing array of `string` fat pointers + the `{items,len,cap}` struct, all laid
out from the RESULT type's field offsets/types (target-aware, no hardcoded layout). To get the result type,
`invoke`/`callCompilerFn` now thread the call instruction's `ins.ty` (the only call-result-type need so far).
Legacy (`compiler_lib`) handlers for these bail loudly (`handleBuildPipelineQuery`) — they're VM-only by nature
@@ -581,7 +617,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
(1) **`trace_resolve`** (1035) — PORTED to the VM (`comptime_vm.zig`): unpack the `(func_id<<32|offset)`
comptime frame, resolve func name + `file:line:col` + source line via a **`source_map` now threaded into the
VM** (new `tryEval` param, `&import_sources` from emit_llvm), build the `{file,line,col,func,line_text}`
`Frame` struct in flat memory (`makeStringValue`/`writeField`/`fieldOffset`). (2) **0522** (bare-pack
`Frame` struct in comptime memory (`makeStringValue`/`writeField`/`fieldOffset`). (2) **0522** (bare-pack
`[]Any`) — was a CRASH (`reflectArgTypeId` `@intCast` of a garbage word) → hardened to a loud bail
(`typeIdxOf` checked cast; the VM must never panic). ROOT CAUSE: after the 0143 fix `$args` materializes as
`[]type_value` (8-byte), but the example declared `describe(args: []Any)` (16-byte) → every element past the
@@ -867,7 +903,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
Replaced the growable `ArrayList(u8)` flat buffer (which reallocs/MOVES on growth) with a
`std.heap.ArenaAllocator`: each `allocBytes` is a separate arena allocation that never moves and
is freed wholesale on `deinit` (no per-object free, no cap, no fixed buffer). **`Addr` is now the
allocation's absolute host pointer** (`@intFromPtr`), not an offset — so a flat-memory pointer and
allocation's absolute host pointer** (`@intFromPtr`), not an offset — so a comptime pointer and
an FFI-returned host pointer are the SAME kind of value, and the FFI bridge (4D.1) can pass them
to/from libc with ZERO translation and no per-call pinning (the original moving-buffer hazard is
gone by construction). `Machine.readWord/writeWord/bytes` deref the absolute pointer directly,
@@ -883,9 +919,9 @@ when reached (sentinels or accessor fns; see the design doc Risks).
- **Phase 4A.1 (VM plan) — `box_any`/`unbox_any` on the VM + `.any` as a 16-byte aggregate (2026-06-18).**
Ported the Any-boxing conversion pair: `box_any` allocates the 16-byte `{ type_tag@0, value@8 }`
box (tag = source TypeId index, matching the legacy comptime interp), writing a word source's
scalar via `writeField(source_type)` (so f32 round-trips) or an aggregate source's flat-memory
scalar via `writeField(source_type)` (so f32 round-trips) or an aggregate source's comptime
ADDR (the runtime pointer-in-value-slot shape); `unbox_any` reads the value slot back (word →
`readField`, aggregate → the stored ADDR). **Required making `.any` a first-class flat-memory
`readField`, aggregate → the stored ADDR). **Required making `.any` a first-class comptime
aggregate** (it was `kindOf → .unsupported`): `kindOf(.any) = .aggregate` (16B, by-address) +
`fieldOffset` special-cases `.any` to the `{@0, @8}` layout (shared with string/slice) — without
the latter, a `struct_get` on an Any panicked (`union field 'struct' while 'any' is active`),
@@ -935,7 +971,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
HANDLED** on the VM (define is the whole eval); `0622`/`0623` run define HANDLED then fall back
cleanly at the still-unported `type_info` reflection. VM output byte-matches legacy for all 7.
**697/0 BOTH gates + all unit tests (added: tagged-union `enum_init` payload layout).** On
`reify`. **Next:** port `type_info` (REFLECT a type → build a `TypeInfo` value in flat memory,
`reify`. **Next:** port `type_info` (REFLECT a type → build a `TypeInfo` value in comptime memory,
the inverse — reuses the tagged-union `enum_init` write) so `0619`/`0622`/`0623` go fully HANDLED;
then the rest of the comptime corpus (drive the SX_COMPTIME_FLAT_TRACE fallback list toward the
genuinely-non-comptime cases) before the VM-default flip + legacy deletion.
@@ -1155,7 +1191,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
- **Phase 3 P3.1 (VM plan) — first read-only reflection readers: `find_type` + `type_field_count` (2026-06-18).**
Two more `compiler`-library fns, bound the same way as the `intern`/`text_of` seed (added
to `compiler_lib.bound_fns` for the legacy handler + the welded-decl export check, AND to
`Vm.callCompilerFn` for the native flat-memory path — NO marshaling). A **type handle is a
`Vm.callCompilerFn` for the native comptime path — NO marshaling). A **type handle is a
plain `u32` `TypeId`** (like `StringId`), so both keep the seed's clean scalar shape:
`find_type(name: StringId) -> TypeId` (`TypeTable.findByName`, `unresolved`/0 if absent) and
`type_field_count(t: TypeId) -> i64` (a NEW `TypeTable.memberCount` query — struct/union/
@@ -1166,7 +1202,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
(`find_type` + `type_field_count`, struct found → 3 fields, missing → `unresolved`).
**Parity 689/689** (gate ON and OFF). **Decision (resolves the plan's `find_type → ?Type`
sketch):** return a NON-optional `TypeId` with the `unresolved` (0) sentinel for not-found,
NOT `?Type` — a `Type` value resolves to `.any` (which the flat-memory VM doesn't represent)
NOT `?Type` — a `Type` value resolves to `.any` (which the comptime VM doesn't represent)
and an optional can't cross the legacy↔VM eval boundary; `unresolved` is the project-blessed
unmistakable "no type" marker. Forward (P3.2): more readers on the same handle shape
(`type_name`/`field_name`/`field_type`/kind), then `register_struct` (first mutating fn).
@@ -1184,14 +1220,14 @@ when reached (sentinels or accessor fns; see the design doc Risks).
(malformed `ret Ref.none` → bail, not crash). Parity **688/688** both ways.
- **Phase 3 SEED (VM plan) — compiler-call path: `intern`/`text_of` native on the VM (2026-06-18).**
`invoke` now dispatches a welded `compiler`-library fn (gated on `compiler_welded`) to
`Vm.callCompilerFn`, serviced NATIVELY on flat memory (no legacy `Interpreter`):
`intern(string)->StringId` reads the flat-memory string bytes and `internString`s into the
`Vm.callCompilerFn`, serviced NATIVELY on comptime memory (no legacy `Interpreter`):
`intern(string)->StringId` reads the comptime string bytes and `internString`s into the
const-cast table (pool-only — doesn't touch type layout, so cached sizes stay valid);
`text_of(StringId)->string` materializes the pooled text back into flat memory. Unlocked
`text_of(StringId)->string` materializes the pooled text back into comptime memory. Unlocked
`0626`; the ONLY remaining const-init fallback is now the inline-asm global (`1654`).
Parity **688/688** (gate ON and OFF); unit test added. This is the mechanism Phase 3 grows
— the next compiler fns (`find_type`, `register_struct`, reflection readers) bind the same
way (flat-memory pointer in, handle/pointer out, no marshaling).
way (comptime pointer in, handle/pointer out, no marshaling).
- **Phase 1.final step 9 (VM plan) — `-Dcomptime-flat` build flag (the "swap behind a build flag" step) (2026-06-18).**
Added the `-Dcomptime-flat` build option (build.zig → a `build_opts` options module on
`mod`; `emit_llvm.init` reads `build_opts.comptime_flat or SX_COMPTIME_FLAT env`). This is
@@ -1199,7 +1235,7 @@ when reached (sentinels or accessor fns; see the design doc Risks).
`zig build test -Dcomptime-flat` runs the FULL corpus on the VM (688/0). Verified the flag
toggles the binary: flag-built `sx` reports VM HANDLED with no env var; default-built does
not. Default OFF — `zig build test` unchanged (688/0). Env var still works for ad-hoc runs.
Next (forward): Phase 2 (bytecode) / Phase 3 (compiler-API on flat memory); eventual
Next (forward): Phase 2 (bytecode) / Phase 3 (compiler-API on comptime memory); eventual
default-flip + legacy deletion.
- **Phase 1.final step 8 (VM plan) — wire the `#run` side-effect path + trace-clear-on-fallback (2026-06-18).**
Wired the SECOND comptime call site (`runComptimeSideEffects`, top-level `#run <expr>;`)
@@ -1226,42 +1262,42 @@ when reached (sentinels or accessor fns; see the design doc Risks).
test added). VM HANDLES **36** corpus const-inits (was 31); **parity 688/688** (gate ON
and OFF). Only **2 fallbacks** remain, both principled: `intern` (`0626`, welded
compiler-API fn — Phase 3) + inline-asm global (`1654`). Forward work: Phase 2 (bytecode),
Phase 3 (compiler-API on flat memory).
Phase 3 (compiler-API on comptime memory).
- **Phase 1.final step 6 (VM plan) — real default context + call_indirect + func_ref + global_get; coverage 27→31 (2026-06-17).**
Per the user's direction ("the VM can set up a default context"), `runEntry` now
materializes the REAL default context instead of a zeroed one. The implicit-ctx param is
an opaque `*void`, so `materializeDefaultContext` finds the `__sx_default_context` global
and lays its initializer (`{ {null, alloc_fn, dealloc_fn}, null }`, the CAllocator thunk
func-refs) into flat memory via a new recursive `layoutConst`. With `func_ref` (function
func-refs) into comptime memory via a new recursive `layoutConst`. With `func_ref` (function
value encoded `FuncId.index()+1`, reserving word 0 for the null fn-ptr) and
`call_indirect` (decode word → FuncId → dispatch; 0 → bail) ported, the whole allocator
protocol runs on the VM:
`context.allocator.alloc_bytes` → call_indirect → thunk → `CAllocator.alloc_bytes` →
`libc_malloc` → native flat malloc. Unlocked `0606` (string global). Also: `global_get`
`libc_malloc` → native comptime malloc. Unlocked `0606` (string global). Also: `global_get`
lazily evaluates a comptime global's `comptime_func` (memoized) — unlocked `CT_CHAIN`;
field access (`fieldOffset`/`struct_get`) handles string/slice `{ptr@0,len@8}` fat
pointers (needed by `alloc_string`); `regToValue` maps function-typed words → `.func_ref`
(kept `1128`'s rejection byte-identical). Native `malloc` is still required (the thunk
bottoms out at it; a host pointer can't be used with flat-memory load/store). VM HANDLES
bottoms out at it; a host pointer can't be used with comptime load/store). VM HANDLES
**31** corpus const-inits (was 27); **parity 688/688** (gate ON and OFF). Unit tests:
global_get, func_ref+call_indirect. Remaining fallbacks (7): `.unsupported` aggregates
(3× — `1037`/`1038`), extern/builtin `intern`+asm (2×), `trace_frame`, `is_comptime`.
- **Phase 1.final step 5 cont. (VM plan) — libc memory builtins + f32 fix; coverage 16→27 (2026-06-17).**
Identified the dominant fallback (`call to extern/builtin`) as **11× `malloc`** (0604) +
1× `intern`. Modeled a curated set of libc MEMORY builtins natively on flat memory
1× `intern`. Modeled a curated set of libc MEMORY builtins natively on comptime memory
(`Vm.callMemBuiltin`): `malloc`/`calloc` → `allocBytes` (16-aligned, 256-MiB cap → bail),
`free` → no-op, `memcpy`/`memmove`/`memset` on flat bytes — sandboxed (no host heap/dlsym),
`free` → no-op, `memcpy`/`memmove`/`memset` on comptime bytes — sandboxed (no host heap/dlsym),
target-aware; the computed result is byte-identical to legacy (which calls real libc).
This surfaced a **real latent f32 bug**: float registers hold f64 bits, but f32 MEMORY is
the 4-byte single — `readField`/`writeField` were truncating the f64 bits (writing zeros
for `1.0`); now they `@floatCast` on f32 load/store (mirrors legacy `storeAtRawPtr`).
Result: VM HANDLES **27** corpus const-inits (was 16); **parity 688/688** (gate ON and
OFF). Unit tests added (f32 round-trip; malloc → usable flat memory). Next: the `kindOf`
OFF). Unit tests added (f32 round-trip; malloc → usable comptime memory). Next: the `kindOf`
`.unsupported` aggregates (3×), `global_get` (2×), the rest.
- **Phase 1.final step 5 (VM plan) — implicit-context materialization; coverage 0→16 (2026-06-17).**
`tryEval` now MATERIALIZES the implicit ctx instead of skipping it: a `has_implicit_ctx`
comptime entry (sole param `*Context`) gets a zeroed `Context` of the right size/align
in flat memory, its address passed as arg 0. Const bodies that ignore the ctx run; a
in comptime memory, its address passed as arg 0. Const bodies that ignore the ctx run; a
body that uses the allocator hits unported `call_indirect` → bails → legacy. No func-ref
materialization needed (handled bodies don't read ctx contents; parity is the guard).
Fixed a real bug surfaced by the coverage pass: storing a `null` non-pointer optional
@@ -1298,14 +1334,14 @@ when reached (sentinels or accessor fns; see the design doc Risks).
Builtin/compiler_call/extern handlers are coupled to the legacy `Interpreter`, so the
wiring will use WHOLE-FUNCTION fallback (VM runs pure functions; bail → legacy re-runs
the whole eval). Built the boundary bridge that enables it: `valueToReg` (Value arg →
Reg, aggregates into flat memory) + `regToValue` (VM result → Value, deep-copied).
Reg, aggregates into comptime memory) + `regToValue` (VM result → Value, deep-copied).
Covers scalars/strings/structs; other shapes bail. Transitional. Round-trip
unit-tested. 688 corpus green. Next: the wiring (flag + route a comptime entry through
the VM with legacy fallback).
- **Phase 1 sub-step 1.5 (VM plan) — direct `call` + stack-lifetime change (2026-06-17).**
`Vm` gained `module` (callee resolution) + `depth`/`max_depth` guard. `call` marshals
arg Refs → Reg and recursively runs the callee; aggregates pass as Addrs over shared
flat memory. `Frame` no longer reclaims the machine on exit (else a returned aggregate
comptime memory. `Frame` no longer reclaims the machine on exit (else a returned aggregate
Addr dangles) — allocations live to `Vm.deinit`. Extern/builtin callees bail (1.5b).
Unit-tested: direct call (142), recursion sum(0..n) (15/55). 688 corpus green. Next:
1.5b (call_builtin/compiler_call/extern), then hybrid wiring.
@@ -1325,38 +1361,38 @@ when reached (sentinels or accessor fns; see the design doc Risks).
const_null reads as none) + payloadless enum_init/enum_tag. Unit-tested (?i64 → 91,
?*i64 null==0 → 99, enum tag → 11). 688 corpus green. Next: 4d (tagged unions, any,
closures).
- **Phase 1 sub-step 4b (VM plan) — slices + strings on flat memory (2026-06-17).**
- **Phase 1 sub-step 4b (VM plan) — slices + strings on comptime memory (2026-06-17).**
`{ptr@0(pointer_size), len@8(i64)}` fat pointers (kindOf: string/slice → aggregate).
Ported `const_string` (text+NUL + fat pointer in flat memory), `length`/`data_ptr`,
Ported `const_string` (text+NUL + fat pointer in comptime memory), `length`/`data_ptr`,
`array_to_slice`, `subslice`, index-through-slice (`elemAddr` loads `.ptr`), and
`str_eq`/`str_ne` (memcmp). Unit-tested (str length+eq/ne, array→slice index sum=23,
subslice sum=43). 688 corpus green. Next: 4c (optionals/enums/any/closures).
- **Phase 1 sub-step 4a (VM plan) — tuples + arrays on flat memory (2026-06-17).**
- **Phase 1 sub-step 4a (VM plan) — tuples + arrays on comptime memory (2026-06-17).**
`kindOf` widened (tuple/array → aggregate). Ported `tuple_init`/`tuple_get`
(`tupleFieldOffset`), `index_get`/`index_gep` (`elemAddr` = base + idx*elem_size over
array/pointer/many_pointer; slice/string bases bail), `length` on array values.
Unit-tested (mixed tuple, [3]i64 index sum=42, length=3). 688 corpus green. Next:
sub-step 4b (slices/strings, then optionals/enums/any/closures).
- **Phase 1 sub-step 3 (VM plan) — memory + structs on flat memory (2026-06-17).**
- **Phase 1 sub-step 3 (VM plan) — memory + structs on comptime memory (2026-06-17).**
`Vm` gained optional `table: *const TypeTable` (target-aware layout). Ported
`alloca`/`load`/`store` + `struct_init`/`struct_get`/`struct_gep`, laying structs out
at the table's natural offsets. Value model: scalar/pointer → register word;
struct → lives in flat memory, its value IS its address (read→addr, write→memcpy), so
struct → lives in comptime memory, its value IS its address (read→addr, write→memcpy), so
nested structs compose and `struct_gep` = base+offset. `kindOf` bails loudly on
not-yet-ported types. Addr-based values survive allocator realloc. Unit-tested
(struct round-trip, alloca+gep+store+load, nested struct). 688 corpus green. Next:
sub-step 4 (arrays/slices/strings/optionals/enums/tuples/any/closures, then calls).
- **Phase 1 sub-step 2 (VM plan) — flat-memory executor: scalars + control flow
- **Phase 1 sub-step 2 (VM plan) — comptime executor: scalars + control flow
(2026-06-17).** Added `Vm` to `comptime_vm.zig`: walks the same IR `Inst` over
flat-memory frames (register `Reg` = scalar bits or `Addr`), mirroring the legacy
comptime frames (register `Reg` = scalar bits or `Addr`), mirroring the legacy
interp's scalar semantics (i64 wrapping/signed, f64). Ported constants, arithmetic,
comparison, logical, conversions, terminators (`br`/`cond_br`/`ret`/`ret_void`) and
`block_param`; every other op bails loudly (`error.Unsupported` + op name in
`detail`). Unit-tested on hand-built tiny IR (`Fb` builder): int add, f64 arithmetic,
cond_br selection, a block-param loop, div-by-zero + unsupported-op bails. Corpus
untouched (688 green). Next: sub-step 3 (memory + aggregates on flat memory, where
untouched (688 green). Next: sub-step 3 (memory + aggregates on comptime memory, where
target-aware layout enters).
- **Phase 1 sub-step 1 (VM plan) — flat-memory machine substrate (2026-06-17).**
- **Phase 1 sub-step 1 (VM plan) — comptime machine substrate (2026-06-17).**
New `src/ir/comptime_vm.zig`: `Machine` (linear byte memory + bump/stack allocator
with `mark`/`reset`, scalar `readWord`/`writeWord` 1/2/4/8 LE, `bytes` views, addr 0
reserved as `null_addr`) + `Frame` (Ref-indexed register file, stack reclamation on
@@ -1375,16 +1411,16 @@ when reached (sentinels or accessor fns; see the design doc Risks).
compiler-call seed — so `weldedCompilerFn`, the `compiler_welded` dispatch, the
`emitCall` comptime-only gate, the `#library`/`abi`/`extern` syntax, and examples
`0626`/`1184`/`1185` remain. `zig build test` green (688 corpus, 0 failed). Next:
Phase 1 (flat-memory value model) per `PLAN-COMPILER-VM.md`.
- **DIRECTION CHANGE — pivot off the byte-weld to a flat-memory bytecode VM
Phase 1 (byte-addressable value model) per `PLAN-COMPILER-VM.md`.
- **DIRECTION CHANGE — pivot off the byte-weld to a byte-addressable bytecode VM
(2026-06-17).** Decided the weld + serialization/marshaling bridge is the wrong
direction (it hand-marshals onto a comptime value model that isn't bytes — exactly
what the design set out to kill). New foundation: a bytecode VM over flat memory so
what the design set out to kill). New foundation: a bytecode VM over comptime memory so
comptime values are native bytes; the compiler-API then rides on it via direct memory
(no weld/validation/marshaling). JIT-native comptime was weighed and rejected (breaks
cross-compilation, loses the sandbox). Wrote `current/PLAN-COMPILER-VM.md` (Phase 0
strip → Phase 1 flat-memory value model → Phase 2 bytecode → Phase 3 compiler-API on
flat memory). Banner added to `design/comptime-compiler-api.md` (superseded). Reverted
strip → Phase 1 byte-addressable value model → Phase 2 bytecode → Phase 3 compiler-API on
comptime memory). Banner added to `design/comptime-compiler-api.md` (superseded). Reverted
the session's uncommitted `register_struct`/`find_type` marshaling experiment back to
`reify` HEAD (40d075c). No code stripped yet — Phase 0 is the next action.
- **Phase 2 — welded structs by reflection + memory-order validation.** Dropped

@@ -1,11 +1,11 @@
# PLAN — Comptime Bytecode VM + flat memory (then re-home the compiler-API on it)
# PLAN — Comptime Bytecode VM + comptime memory (then re-home the compiler-API on it)
> **Direction change (2026-06-17).** The comptime compiler-API stream pivots off the
> **byte-weld**. The weld (sx structs whose layout is validated to mirror the
> compiler's Zig types) + the **serialization / marshaling** bridge at the call
> boundary is the wrong direction — it bolts a parallel layout regime and hand-built
> byte-copies onto a comptime value model that fundamentally isn't bytes. We strip it
> and build the right foundation: a **bytecode VM over flat, byte-addressable
> and build the right foundation: a **bytecode VM over byte-addressable
> memory**, where comptime values ARE native bytes (like runtime). On that base the
> compiler-API needs no weld, no validation, no marshaling — the compiler's own types
> are read/built directly as memory and its functions take/return real pointers.
@@ -25,14 +25,14 @@ every value as a tagged `Value` union (`int`, `float`, `aggregate: []const Value
struct's bytes. So a comptime `@ptrCast(*StructInfo)` reads the `Value` union's
memory, not a `StructInfo` — which forced the whole weld+marshal detour.
Make comptime values **native bytes in a flat memory** and both problems dissolve:
Make comptime values **native bytes in byte-addressable memory** and both problems dissolve:
structs/arrays/slices are their bytes at natural layout (no weld), the compiler's own
records are directly addressable (no marshal), and a bytecode loop over flat memory is
records are directly addressable (no marshal), and a bytecode loop over comptime memory is
fast.
## End state
- Comptime execution = a **bytecode VM** over a **flat linear memory** (real
- Comptime execution = a **bytecode VM** over a **byte-addressable memory** (real
host-allocated bytes; layout is **target-aware** via the type table's sizes). Values
are bytes at addresses plus a scalar register file. No tagged `Value` union.
- The comptime compiler-API: the compiler **exposes its real types + functions** to
@@ -93,31 +93,31 @@ corpus rebaseline; suite green.
syntax still parses (parser unit tests).
### Phase 1 — Flat-memory value model (still IR-walking, no bytecode yet)
Introduce flat memory and move comptime values onto it, **decoupled from bytecode** so
Introduce comptime memory and move comptime values onto it, **decoupled from bytecode** so
the value-model change is isolated. Each sub-step ports one op group and keeps the
corpus green; the OLD tagged path stays behind a build flag (`-Dcomptime-flat`) until
all groups land, then the shim is deleted.
1. **Machine + scalars.** A flat memory region (host `[]u8`) with a stack (frames) +
1. **Machine + scalars.** A comptime memory region (host `[]u8`) with a stack (frames) +
bump-allocated heap, and a scalar register file. Port `int`/`float`/`bool`/`undef`
and arithmetic/compare/branch. Aggregates still go through a compat shim to the old
representation.
2. **Aggregates.** Structs/arrays/tuples laid out in flat memory at **target** layout;
2. **Aggregates.** Structs/arrays/tuples laid out in comptime memory at **target** layout;
port `struct_init` / `struct_get` / `array` / `index_gep` to read/write bytes at
computed offsets.
3. **Slices / strings.** `{ptr, len}` fat pointers in flat memory.
3. **Slices / strings.** `{ptr, len}` fat pointers in comptime memory.
4. **Optionals / enums / tagged unions.** Tag + payload bytes.
5. **Pointers.** `alloca` / `store` / `load` / GEP unified onto flat addresses; retire
`slot_ptr` / `heap_ptr` / `byte_ptr` in favor of flat-memory addresses.
6. **Closures.** Fn id + captured env materialized in flat memory.
5. **Pointers.** `alloca` / `store` / `load` / GEP unified onto comptime addresses; retire
`slot_ptr` / `heap_ptr` / `byte_ptr` in favor of comptime addresses.
6. **Closures.** Fn id + captured env materialized in comptime memory.
7. **Extern / host calls.** A struct arg is already bytes → pass its address; this
removes most of `marshalExternArg`.
8. **Reflection / minting.** `declare` / `define` / `type_info` read flat-memory
8. **Reflection / minting.** `declare` / `define` / `type_info` read comptime
values; type-table mutation copies escaping data into compiler-owned memory at the
boundary (lifetime), as today.
**Verification:** with `-Dcomptime-flat` the full corpus (currently 692) is byte-for-
byte identical to the tagged path; then make flat the default and delete the shim.
byte identical to the tagged path; then make the VM the default and delete the shim.
### Phase 2 — Bytecode
Compile a comptime function's IR → a compact bytecode and execute the bytecode instead
@@ -160,7 +160,7 @@ host through it:
- **(2) Implicit context — DONE (materialized, 2026-06-17 step 5).** Initially a
conservative skip; now `tryEval` MATERIALIZES the implicit ctx: a comptime entry with
`has_implicit_ctx` (whose sole param is the `*Context`) gets a zeroed `Context` of the
right size/align allocated in flat memory, its address passed as arg 0. The common
right size/align allocated in comptime memory, its address passed as arg 0. The common
const body never reads the ctx; a body that USES the allocator loads a fn from it and
`call_indirect`s (unported) → bails → legacy. No func-ref materialization was needed:
handled bodies don't read the ctx contents, and gate-ON corpus parity (688, 0 failed)
@@ -179,9 +179,9 @@ host through it:
stays **688/688** (gate ON and OFF) at every step. Landed, in order: implicit ctx
materialized (→16); `writeField` null-aggregate fix (storing a `null` non-pointer
optional `null_addr` sentinel into an aggregate slot OOB-bailed → now ZEROES the
destination = none/empty; unit-test regression); curated libc MEMORY builtins on flat
destination = none/empty; unit-test regression); curated libc MEMORY builtins on comptime
memory (`Vm.callMemBuiltin`: `malloc`/`calloc` → `allocBytes` 16-aligned & 256-MiB-capped,
`free` → no-op, `memcpy`/`memmove`/`memset` on flat bytes — sandboxed, target-aware,
`free` → no-op, `memcpy`/`memmove`/`memset` on comptime bytes — sandboxed, target-aware,
result byte-identical to legacy; unlocked `0604`'s 11 comptime mallocs); and an **f32
storage fix** (float registers hold f64 bits, but f32 memory is the 4-byte single —
`readField`/`writeField` now `@floatCast` instead of truncating the f64 bits, which had
@@ -192,13 +192,13 @@ host through it:
materializes the REAL default context (not a zeroed one): the implicit-ctx param is an
opaque `*void`, so `materializeDefaultContext` finds the `__sx_default_context` global
and lays its initializer constant (`{ {null, alloc_fn, dealloc_fn}, null }`, carrying
the CAllocator thunk func-refs) into flat memory via a new recursive `layoutConst`.
the CAllocator thunk func-refs) into comptime memory via a new recursive `layoutConst`.
With `func_ref` (a function value encoded as `FuncId.index() + 1` so word 0 stays
reserved for the NULL function pointer — `funcRefWord`/`funcRefToId`) and `call_indirect`
(decode the callee word → `FuncId` → dispatch; 0 → bail) ported, a comptime body
that allocates via `context.allocator` now runs ENTIRELY on the VM: `alloc_string` →
`context.allocator.alloc_bytes` → `call_indirect` → thunk → `CAllocator.alloc_bytes` →
`libc_malloc` → the VM's native flat-memory `malloc`. Unlocked `0606` (string global via
`libc_malloc` → the VM's native comptime `malloc`. Unlocked `0606` (string global via
the allocator). Also: `global_get` lazily evaluates a comptime global's `comptime_func`
(memoized in `global_cache`) — unlocked `CT_CHAIN`; struct field access (`fieldOffset`/
`struct_get`) now handles string/slice `{ptr@0,len@8}` fat pointers (needed by
@@ -206,8 +206,8 @@ host through it:
`.func_ref` so a func-ref result serializes identically to legacy (kept `1128`'s
rejection diagnostic byte-identical). Unit tests added (global_get, func_ref +
call_indirect). **Note: native `malloc` is still REQUIRED** — the CAllocator thunk
bottoms out at libc `malloc`, and the VM can't use a host pointer with flat-memory
load/store, so comptime `malloc` must allocate from flat memory. The default context
bottoms out at libc `malloc`, and the VM can't use a host pointer with comptime
load/store, so comptime `malloc` must allocate from comptime memory. The default context
lets the allocator PROTOCOL run; native `malloc` is its final step.
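The function-value encoding above can be sketched as follows, assuming only what the entry states: a function value is stored as its index plus one so that word 0 remains the null function pointer, and an indirect call through word 0 bails. The Python names mirror `funcRefWord`/`funcRefToId` loosely and are hypothetical; the function table is a stand-in for the module's functions.

```python
NULL_FN = 0  # word 0 is reserved for the null function pointer


def func_ref_word(func_index):
    """Encode a function index as a register word (index + 1)."""
    return func_index + 1


def func_ref_to_id(word):
    """Decode a register word back to a function index; a null word bails."""
    if word == NULL_FN:
        raise ValueError("call_indirect through a null function pointer")
    return word - 1


def call_indirect(functions, word, *args):
    # Decode the callee word, then dispatch to the resolved function.
    return functions[func_ref_to_id(word)](*args)


# Stand-in function table: index 0 and index 1.
functions = [lambda n: n + 1, lambda n: n * 2]

word = func_ref_word(1)
result = call_indirect(functions, word, 21)

# Index 0 encodes to word 1, so it stays distinguishable from null.
first_word = func_ref_word(0)
```

The offset by one is the whole design point of the sketch: without it, the function at index 0 and a null pointer would share the same word.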
- **(7) `is_comptime` + failable/error cluster + the signed-load fix — DONE.** Coverage
**31 → 36** handled (fallbacks 7 → 2); parity stays **688/688** both gate ON and OFF.
@@ -229,10 +229,10 @@ host through it:
`intern` (`0626`, the welded compiler-API fn — Phase 3 re-homes it) and the inline-asm
global call (`1654`, never comptime-evaluable). Every other measured corpus const-init
is handled on the VM.
At this point the flat-memory VM handles essentially the entire real comptime corpus
At this point the comptime VM handles essentially the entire real comptime corpus
(scalars, control flow, structs/tuples/arrays/slices/strings/optionals/enums, calls +
recursion, the implicit context + allocator protocol, globals, failables + return
traces). Phase 2 (bytecode) and Phase 3 (compiler-API on flat memory) are the forward
traces). Phase 2 (bytecode) and Phase 3 (compiler-API on comptime memory) are the forward
work; flipping the VM to default + deleting the legacy path awaits those.
- **(8) Wire the `#run` side-effect path; trace-clear-on-fallback — DONE.** The second
comptime call site (`emit_llvm.runComptimeSideEffects`, top-level `#run <expr>;`) now
@@ -251,20 +251,20 @@ host through it:
deleting the legacy path (which still awaits Phase 2/3 + broader confidence).
- **(10) Compiler-call path on the VM — `intern`/`text_of` native (Phase 3 SEED) — DONE.**
`invoke` now services a welded `compiler`-library function (the `compiler_welded` flag is
the safety boundary) via `Vm.callCompilerFn` — natively on flat memory, NO legacy
`Interpreter`: `intern(s: string) -> StringId` reads the string bytes from flat memory and
the safety boundary) via `Vm.callCompilerFn` — natively on comptime memory, NO legacy
`Interpreter`: `intern(s: string) -> StringId` reads the string bytes from comptime memory and
`internString`s into the (const-cast) table (pool-only, never touches type layout, so the
VM's cached sizes stay valid); `text_of(id) -> string` materializes the pooled text back
into flat memory as a fat pointer. Unlocked `0626` — the ONLY remaining const-init fallback
into comptime memory as a fat pointer. Unlocked `0626` — the ONLY remaining const-init fallback
is now the inline-asm global (`1654`, genuinely not comptime-evaluable). Parity **688/688**
both gate ON and OFF; unit test added. This is the mechanism Phase 3 grows: the next
compiler functions (`find_type`, `register_struct`, the reflection readers) are added the
same way — flat-memory pointer in, handle/pointer out, no marshaling.
same way — comptime pointer in, handle/pointer out, no marshaling.
**Phase 3 progress (2026-06-18):**
- **(P3.1) First read-only reflection readers — `find_type` + `type_field_count` (DONE).**
Two more `compiler`-library fns bound the same way as the `intern`/`text_of` seed
(added to `compiler_lib.bound_fns` AND `Vm.callCompilerFn`, native on flat memory, no
(added to `compiler_lib.bound_fns` AND `Vm.callCompilerFn`, native on comptime memory, no
marshaling). A **type handle is a plain `u32` `TypeId`** (exactly like `StringId`), so
both calls keep the seed's clean scalar shape — handle in, scalar out:
`find_type(name: StringId) -> TypeId` (`TypeTable.findByName`) and
@@ -277,7 +277,7 @@ host through it:
- **Decision (resolves the plan's `find_type → ?Type` sketch):** `find_type` returns a
NON-optional `TypeId`, using the codebase's dedicated `unresolved` (0) sentinel for
not-found — NOT an `?Type`. Rationale: a `Type` value resolves to `.any`
(`type_resolver.zig`), which the flat-memory VM does not represent; and an optional
(`type_resolver.zig`), which the comptime VM does not represent; and an optional
return can't cross the legacy↔VM eval boundary (`regToValue` bridges only
word/string/struct/tuple). `unresolved` is the project-blessed unmistakable "no type"
marker (see CLAUDE.md REJECTED PATTERNS — a dedicated sentinel is the required shape),
@@ -341,7 +341,7 @@ host through it:
there, or migrate the metatype onto the legacy compiler-API calls first. Decide when reached.
Phase 2 (bytecode) is the orthogonal speed work.
### Phase 3 — Compiler-API on flat memory (resume the stream — no weld)
### Phase 3 — Compiler-API on comptime memory (resume the stream — no weld)
With native-byte comptime values, re-home the compiler-API:
- **Expose the compiler's real types.** Register the actual `types.zig` records
@@ -350,7 +350,7 @@ With native-byte comptime values, re-home the compiler-API:
nothing to validate or keep in sync. (This is the projection that *replaces* the
weld's reflection — owned by the compiler, not declared in sx.)
- **Expose the compiler's functions.** `register_struct`, `find_type`, `intern`,
`text_of`, and the reflection readers operate on flat-memory pointers / handles
`text_of`, and the reflection readers operate on comptime pointers / handles
directly (no marshaling — the bytes already ARE the record).
- **Re-express** `declare` / `define` / `type_info` as sx over these; delete the
bespoke interp arms (`defineStruct` / `defineEnum` / `defineTuple` / `reflectTypeInfo`);
@@ -399,7 +399,7 @@ are legitimate negative-test bails that BECOME VM diagnostics, 1145 is a scan ar
pointer-in-value-slot shape (`coerceToI64` alloca+ptrtoint) — implement or bail loudly.
- **4A.2** `out`/print → add a VM output buffer; flush through the same path as
`core.flushInterpOutput`.
- **4A.3** `global_addr` (address-of a global in flat memory).
- **4A.3** `global_addr` (address-of a global in comptime memory).
- **4A.4** trace frames (`sx_trace_*` / `interp_print_frames`).
- **4B — VM-native diagnostics (role E). MUST land before deleting legacy.** Today a VM
bail silently falls back; with legacy gone the VM bail IS the user-facing build-gating
@@ -410,11 +410,11 @@ are legitimate negative-test bails that BECOME VM diagnostics, 1145 is a scan ar
the `#insert` corpus parity.
- **4D — host FFI on the VM (role D substrate). DONE.** Solved by a better allocator, not a
pin/tag scheme: the comptime memory is now an **arena** of stable host allocations and `Addr`
IS a real host pointer (`4D.0`, `625ba0f`), so a flat-memory pointer and an FFI-returned host
IS a real host pointer (`4D.0`, `625ba0f`), so a comptime pointer and an FFI-returned host
pointer are the same value — no translation, no realloc hazard. `Vm.callHostExtern`
(`4D.1`, `e7a8708`) dispatches ANY extern via `host_ffi` dlsym + trampolines (args/returns pass
untouched); `4D.2` (`6a7f690`) adds slice/string args (→ NUL-term `char*`) + float guards.
Examples 0636/0637. **(Superseded sub-note:** the earlier "pin the buffer / flat↔host translate"
Examples 0636/0637. **(Superseded sub-note:** the earlier "pin the buffer / comptime↔host translate"
hazard is moot — the arena never moves an allocation.)
- **`#compiler` / `compiler_call` — DELETED, replaced by the `abi(.compiler)` ABI (decision 2026-06-18,
REVISED from the earlier `abi(.zig) extern compiler` shape).** A function is *compiler-domain* — it runs in
@@ -512,7 +512,7 @@ The compiler's whole post-IR role: codegen → build the CLI-derived `BuildConfi
- **P5.2 — primitives.** Split: the read-only **metadata queries are DONE (2026-06-19)** — `c_object_paths() ->
List(string)` + `link_libraries() -> List(string)` as `abi(.compiler)` fns (stdlib `library/modules/compiler.sx`),
serviced by `comptime_vm.callCompilerFn` over `BuildConfig` fields `main.zig` forwards; new VM `makeStringList`
builds the `List(string)` in flat memory from the call's result type (`ins.ty` now threaded through
builds the `List(string)` in comptime memory from the call's result type (`ins.ty` now threaded through
`invoke`/`callCompilerFn`). Smoke test `1662-platform-build-pipeline-queries` (AOT + C companion). 703/0 both
gates. **`emit_object() -> string` is also DONE (2026-06-19)** as a QUERY (not an action): the Zig driver emits
the object eagerly, so the primitive just returns the path from `BuildConfig.object_path` (no vtable). So all
@@ -540,19 +540,23 @@ dual-path, no legacy `compiler_lib` handler, no `regToValue`/`valueToReg` bridge
migrate the BuildOptions surface DIRECTLY to VM-native `abi(.compiler)` arms (no legacy handler — there is no
legacy to handle). **All bundling + code signing for EVERY target lives in the sx `default_pipeline`.**
- **P5.5 — migrate the 36 `BuildOptions :: struct #compiler` methods → VM-native `abi(.compiler)`.** Each
becomes a free `ufcs (self: BuildOptions, …) abi(.compiler)` decl (so `opt.method(...)` still resolves via
UFCS) with a `comptime_vm.callCompilerFn` arm — and **NO legacy `compiler_lib` handler** (the user's directive;
the legacy interp is going away). Families: string SETTERS (`set_bundle_path`/`set_bundle_id`/
`set_codesign_identity`/`set_provisioning_profile`/`set_manifest_path`/`set_keystore_path`/`add_framework`/
`add_link_flag`/`set_output_path`/`set_wasm_shell`/`set_post_link_module`/`add_asset_dir`) — write/append to the
threaded `BuildConfig`; string GETTERS (`binary_path`/`bundle_path`/`bundle_id`/`codesign_identity`/
`provisioning_profile`/`target_triple`/`manifest_path`/`keystore_path`); BOOL getters (`is_macos`/`is_ios`/
`is_ios_device`/`is_ios_simulator`/`is_android` — compute from the triple); LIST/index getters
(`framework_count`/`framework_at`/`framework_path_*`/`asset_dir_*`/`jni_main_*`, built via `makeStringList`).
**String lifetime:** a setter at `#run` must dupe the flat-memory string into a PERSISTENT allocator (NOT the
per-eval VM arena) — thread `emit_llvm.alloc` into the VM (e.g. `BuildConfig.string_alloc`) so the strings
survive to post-link. This kills the 4 strict `compiler_call` bails (1609/1614/1615/1616).
- **P5.5 — DONE (2026-06-19).** The 35 `BuildOptions :: struct #compiler` methods migrated to VM-native
`abi(.compiler)`: `BuildOptions :: struct { }` (opaque null-sentinel handle) + 35 free
`ufcs (self: BuildOptions, …) abi(.compiler)` decls in `build.sx`, serviced by a new
`comptime_vm.callBuildOptionFn` arm off `callCompilerFn` — **NO legacy `compiler_lib` handler** (names
registered in `bound_fns` with a single bailing stub only so `weldedCompilerFn` accepts them). Setters dupe the
arg string into the PERSISTENT `Vm.gpa` (the Compilation allocator — threaded into both `tryEval` and
`runBuildCallback` — NOT the per-eval VM arena) and write/append to the threaded `BuildConfig`; string getters
return the field (or `""`); bool getters compute from the triple (`predIsMacOS`/…); count/index getters read the
`BuildConfig` slices. **Dispatch routing (Option B):** a `#run`/const-init entry that directly calls a
compiler-domain/welded fn (`emit_llvm.entryNeedsVm`) runs on the VM with NO legacy fallback regardless of the
`-Dcomptime-flat` gate → gate-OFF stays green without a legacy BuildOptions handler. 5 `platform/bundle.sx`
getter-calling helpers marked `abi(.compiler)` (comptime-only bundler code). 37 `.ir` regenerated (string-pool
churn; behavior-identical, verified `.ir`-only). **703/0 BOTH gates.** BuildOptions `compiler_call` bails GONE
(1609/1614/1615 strict-clean); 1616 now bails on `shr` — a SEPARATE unported bitwise/shift VM gap
(`shl`/`shr`/`bit_and`/`bit_or`/`bit_xor`/`bit_not`), to port FIRST in P5.6 (1616 is unpinned + can't JIT-run on
macOS regardless). Also swept the outdated "flat memory" terminology → "comptime/byte-addressable" (the VM is
arena-backed, `Addr` = real host pointer; flag names `-Dcomptime-flat`/`SX_COMPTIME_FLAT` kept).
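A rough sketch of the string-lifetime rule for setters, assuming a per-eval memory that is wiped after each evaluation: the setter copies the argument bytes into storage owned by the build configuration, so the value is still readable after the eval's memory is gone. `EvalArena`, the `BuildConfig` field names, and the Python `set_bundle_id` signature are assumptions for illustration only.

```python
class EvalArena:
    """Hypothetical per-eval memory: contents are wiped when the eval ends."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.top = 0

    def put(self, data):
        if self.top + len(data) > len(self.mem):
            raise MemoryError("eval arena exhausted")
        addr = self.top
        self.mem[addr:addr + len(data)] = data
        self.top += len(data)
        return addr, len(data)  # a (ptr, len) fat pointer into the arena

    def reset(self):
        self.mem[:] = bytes(len(self.mem))
        self.top = 0


class BuildConfig:
    """Hypothetical long-lived config; it owns its strings."""

    def __init__(self):
        self.bundle_id = ""


def set_bundle_id(cfg, arena, ptr, length):
    # Copy the bytes out of the arena instead of keeping (ptr, length):
    # the copy is owned by cfg and outlives the eval.
    cfg.bundle_id = bytes(arena.mem[ptr:ptr + length]).decode()


arena = EvalArena()
cfg = BuildConfig()

ptr, length = arena.put(b"com.example.app")
set_bundle_id(cfg, arena, ptr, length)

arena.reset()  # the eval ends; arena contents are gone

stale = bytes(arena.mem[ptr:ptr + length])  # what a retained (ptr, len) would now see
kept = cfg.bundle_id                         # the duplicated string is intact
```

Had the setter stored only the fat pointer, a post-link reader would see the wiped bytes in `stale` rather than the identifier in `kept`.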
- **P5.6 — ALL bundling + code signing in `default_pipeline` (every target).** `default_pipeline` (or a
`bundle()` it calls, in `platform/bundle.sx`) performs, after `link`, the full per-target bundle when
`bundle_path()` is set — branching on `is_macos`/`is_ios_device`/`is_ios_simulator`/`is_android`:
@@ -595,9 +599,9 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
## Open questions (resolve as reached, record decisions here)
- **Host-ABI vs target-ABI split.** The compiler runs on the host, so its OWN exposed
records are host-laid-out; user comptime types are target-laid-out. The flat-memory
records are host-laid-out; user comptime types are target-laid-out. The comptime
model must carry both regimes (a per-type ABI tag on layout queries). Confirm the
boundary where a flat-memory pointer to a compiler record is handed to host Zig code
boundary where a comptime pointer to a compiler record is handed to host Zig code
uses host layout.
- **Exposing compiler types to sx.** Mechanism for projecting `types.zig` records into
the comptime type table with real offsets (the non-weld replacement) — a registry the
@@ -636,7 +640,7 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
gate (ops.zig), and examples `0626`/`1184`/`1185` stay. The `#library`/`abi`/`extern`
SYNTAX stays. `zig build test` green (688 corpus, 0 failed; unit tests pass).
- **Phase 1 — in progress.**
- **Sub-step 1 — DONE.** `src/ir/comptime_vm.zig`: the flat-memory `Machine`
- **Sub-step 1 — DONE.** `src/ir/comptime_vm.zig`: the comptime `Machine`
(linear byte memory + bump/stack allocator with `mark`/`reset` reclamation +
scalar `readWord`/`writeWord` (1/2/4/8, little-endian) + `bytes` views; addr 0
reserved as `null_addr`) and `Frame` (register file indexed by Ref + stack
@@ -645,7 +649,7 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
NOT touch the live interpreter, so the corpus stays green (688). No op execution
yet.
- **Sub-step 2 — DONE.** The executor (`Vm` in `comptime_vm.zig`): walks the SAME
IR `Inst` over flat-memory frames, mirroring the legacy interp's scalar semantics
IR `Inst` over comptime frames, mirroring the legacy interp's scalar semantics
(i64 wrapping/signed + f64 register words, keyed off the result/operand `TypeId`).
Ported: constants (`const_int`/`float`/`bool`/`null`/`undef`), arithmetic
(`add`/`sub`/`mul`/`div`/`mod`/`neg`), comparison (`cmp_*`), logical
@@ -657,12 +661,12 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
branch selection, a block-param loop summing i..1, div-by-zero + unsupported-op
bails. Corpus untouched (688 green) — the executor is exercised by unit tests only,
not yet wired to real comptime eval.
- **Sub-step 3 — DONE.** Memory + structs on flat memory. `Vm` gained an optional
- **Sub-step 3 — DONE.** Memory + structs on comptime memory. `Vm` gained an optional
`table: *const TypeTable` (target-aware layout). Ported `alloca`/`load`/`store`
(over flat addresses, `Store.val_ty` drives width) and `struct_init`/`struct_get`/
(over comptime addresses, `Store.val_ty` drives width) and `struct_init`/`struct_get`/
`struct_gep` (structs laid out at the table's natural offsets). The value model: a
`Kind.word` (scalar/pointer ≤8B) sits in a register; a `Kind.aggregate` (struct)
lives in flat memory and its "value" IS its address (read returns the address,
lives in comptime memory and its "value" IS its address (read returns the address,
write memcpys), so nested structs compose and `struct_gep` is just base+offset (no
field-pointer dance). `kindOf` bails loudly on the not-yet-ported types
(slice/string/any/optional/enum/array/tuple/…). The Addr-based value model survives
@@ -677,7 +681,7 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
gep/store + index_get sum (42), array `length` (3). 688 corpus green.
- **Sub-step 4b — DONE.** Slices + strings as `{ptr@0 (pointer_size), len@8 (i64)}`
fat pointers (`kindOf`: string/slice → aggregate). Ported `const_string` (materializes
text+NUL in flat memory + a fat pointer), `length`/`data_ptr` (read len/ptr fields),
text+NUL in comptime memory + a fat pointer), `length`/`data_ptr` (read len/ptr fields),
`array_to_slice`, `subslice`, indexing *through* a slice/string (`elemAddr` loads
`.ptr` first), and `str_eq`/`str_ne` (len+memcmp). Helpers `makeSlice`/`sliceLen`/
`sliceData`. Unit-tested: string length + str_eq/ne, array→slice + slice index +
@@ -703,7 +707,7 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
- **Sub-step 1.5 — direct `call` DONE.** `Vm` gained `module: *const Module`
(resolves a callee `FuncId`) + a `depth`/`max_depth` recursion guard. `call`
marshals arg Refs → Reg words and recursively `run`s the callee; aggregate args/
results pass as their `Addr` over the SHARED flat memory (no copy). **Stack-lifetime
results pass as their `Addr` over the SHARED comptime memory (no copy). **Stack-lifetime
change:** `Frame` no longer reclaims the machine on exit (a returned aggregate's
Addr would dangle) — a comptime eval's allocations live to `Vm.deinit`;
`Machine.mark`/`reset` stay for explicit use. Extern/builtin callees (no blocks)
@@ -714,7 +718,7 @@ unreferenced compiler-domain declaration — verify no stray runtime reference k
handlers take `*Interpreter`), so the VM can't call them directly — the wiring uses
WHOLE-FUNCTION fallback instead (VM runs pure functions; a bail re-runs the whole
eval in the legacy). That needs the boundary bridge: `valueToReg` (host `Value` arg →
VM `Reg`, materializing aggregates into flat memory) + `regToValue` (VM result →
VM `Reg`, materializing aggregates into comptime memory) + `regToValue` (VM result →
`Value`, deep-copied out). Covers scalars + strings + structs (other aggregate shapes
bail loudly; added as wiring surfaces them). Transitional — deleted once the VM owns
comptime end-to-end. Unit-tested with round-trips. 688 corpus green.
@@ -726,7 +730,7 @@ strings, optionals, payloadless enums, deref/addr_of) and unit-tested. Continuin
port the rarer ops (tagged-union payload, any, closures) in isolation risks subtle
bugs and has low signal. The higher-value path:
1. **Calls (sub-step 1.5)** — `call` (direct), then `call_builtin`/`compiler_call`. The
shared flat memory makes aggregate args/results pass naturally (they're Addrs). The
shared comptime memory makes aggregate args/results pass naturally (they're Addrs). The
one design point: **aggregate-return lifetime** — a callee's stack-reclaim would
dangle a returned struct Addr, so for comptime (bounded) the VM should stop
reclaiming per-frame and let the whole eval's allocations live until `Vm.deinit`