diff --git a/current/CHECKPOINT-COMPILER-API.md b/current/CHECKPOINT-COMPILER-API.md
index c3ce4399..91ac8171 100644
--- a/current/CHECKPOINT-COMPILER-API.md
+++ b/current/CHECKPOINT-COMPILER-API.md
@@ -26,22 +26,28 @@ with ONE welded mechanism. Branch: `reify` (off `master`). Update after every st
 > breaks cross-compilation — host vs target layout — and loses the sandbox. A
 > flat-memory VM keeps both while getting native bytes + speed.)
 >
-> **Next action (2026-06-18):** Phase 1.final op-porting is essentially COMPLETE — the VM
-> handles **36** real corpus const-inits (0 → 16 → 27 → 31 → 36), with only **2** fallbacks
-> left, both principled (`intern` = the welded compiler-API fn, Phase 3; inline-asm global
-> `1654`, never comptime-evaluable). Parity **688/688** (gate ON and OFF). The VM now covers
-> scalars/control-flow/aggregates/strings/optionals/enums, calls+recursion, the implicit
-> context + full allocator protocol, globals, and failables + return traces. BOTH comptime
-> call sites (const-init + `#run` side-effects) route through the VM with legacy fallback.
-> **The forward work is Phase 2 (bytecode) and Phase 3 (compiler-API on flat memory)**; flipping the VM to
-> default + deleting the legacy path awaits those. See `PLAN-COMPILER-VM.md` Phase 1.final
-> Status steps 7–10 (Phase 3 seed: `intern`/`text_of` native on the VM — `0626` handled).
-> Build/verify: `zig build && zig build test` (688, gate OFF). Run the corpus ON the VM:
+> **Next action (2026-06-18):** **Phase 3 is UNDER WAY.** The VM now hosts the first
+> read-only reflection readers — `find_type(name: StringId) -> TypeId` and
+> `type_field_count(t: TypeId) -> i64` — bound exactly like the `intern`/`text_of` seed
+> (a type handle is a plain `u32` `TypeId`, so the calls stay clean scalar host-calls).
+> Example `0628` chains `intern → find_type → type_field_count`, VM-HANDLED natively.
+> Parity **689/689** (gate ON and OFF), VM unit test added. Phase 1.final op-porting was
+> already complete (the VM covers scalars/control-flow/aggregates/strings/optionals/enums,
+> calls+recursion, the implicit context + full allocator protocol, globals, failables +
+> return traces); both comptime call sites route through the VM with legacy fallback.
+> **Forward (P3.2):** more read-only readers on the same `TypeId`-handle shape
+> (`type_name`, `field_name`, `field_type`, kind queries), then `register_struct` (the
+> first MUTATING fn — mints a `TypeId`; resolve the mutable-table / host-ABI-vs-target-ABI
+> boundary deliberately). Re-expressing `declare`/`define`/`type_info` as sx (the metatype,
+> which runs at LOWERING time) needs the VM hardened against malformed lowering-time IR
+> first — keep it on the legacy path until then. Phase 2 (bytecode) is the orthogonal
+> speed work. **Decision recorded:** `find_type` returns a non-optional `TypeId` using the
+> `unresolved` (0) sentinel, NOT `?Type` (a `Type` value is `.any`-typed, which the VM
+> doesn't represent, and an optional can't cross the eval bridge) — see `PLAN-COMPILER-VM.md`
+> Phase 3 progress note.
+> Build/verify: `zig build && zig build test` (689, gate OFF). Run the corpus ON the VM:
 > `zig build test -Dcomptime-flat` (the build flag) OR env `SX_COMPTIME_FLAT=1`. Coverage
-> trace: `SX_COMPTIME_FLAT_TRACE=1`. **Forward: Phase 3 — grow the compiler-API on the VM**
-> (`find_type` / `register_struct` / reflection readers via `Vm.callCompilerFn`, then
-> re-express `declare`/`define`/`type_info` as sx and delete the bespoke interp arms);
-> Phase 2 (bytecode) is the orthogonal speed work.
+> trace: `SX_COMPTIME_FLAT_TRACE=1`.
 
 ### (superseded) prior weld resume
 Phase 1 done; Phase 2 welded structs were working via reflection + memory-order
@@ -312,6 +318,24 @@ when reached (sentinels or accessor fns; see the design doc Risks).
   `List` growth; orthogonal, see `current/CHECKPOINT-METATYPE.md`.)
 
 ## Log
+- **Phase 3 P3.1 (VM plan) — first read-only reflection readers: `find_type` + `type_field_count` (2026-06-18).**
+  Two more `compiler`-library fns, bound the same way as the `intern`/`text_of` seed (added
+  to `compiler_lib.bound_fns` for the legacy handler + the welded-decl export check, AND to
+  `Vm.callCompilerFn` for the native flat-memory path — NO marshaling). A **type handle is a
+  plain `u32` `TypeId`** (like `StringId`), so both keep the seed's clean scalar shape:
+  `find_type(name: StringId) -> TypeId` (`TypeTable.findByName`, `unresolved`/0 if absent) and
+  `type_field_count(t: TypeId) -> i64` (a NEW `TypeTable.memberCount` query — struct/union/
+  tagged-union fields, enum variants, array/vector length — called by BOTH paths so they
+  can't drift; bails loudly, never a silent 0). New example `0628-comptime-compiler-find-type`
+  chains `intern → find_type → type_field_count` (and a not-found lookup → 0), both folded at
+  `#run`, both VM-HANDLED natively (trace confirms no fallback). VM unit test added
+  (`find_type` + `type_field_count`, struct found → 3 fields, missing → `unresolved`).
+  **Parity 689/689** (gate ON and OFF). **Decision (resolves the plan's `find_type → ?Type`
+  sketch):** return a NON-optional `TypeId` with the `unresolved` (0) sentinel for not-found,
+  NOT `?Type` — a `Type` value resolves to `.any` (which the flat-memory VM doesn't represent)
+  and an optional can't cross the legacy↔VM eval boundary; `unresolved` is the project-blessed
+  unmistakable "no type" marker. Forward (P3.2): more readers on the same handle shape
+  (`type_name`/`field_name`/`field_type`/kind), then `register_struct` (first mutating fn).
 - **VM robustness — `Frame` bounds-check; lowering-time `#insert` wiring explored + reverted (2026-06-18).**
   Explored wiring the VM at the LOWERING-time comptime site (`evalComptimeString`, the `#insert`
   string fold). 12/13 `#insert` examples ran on the VM with parity, but `0737`
diff --git a/current/PLAN-COMPILER-VM.md b/current/PLAN-COMPILER-VM.md
index 0437de2e..3a80b47c 100644
--- a/current/PLAN-COMPILER-VM.md
+++ b/current/PLAN-COMPILER-VM.md
@@ -253,6 +253,36 @@ host through it:
 compiler functions (`find_type`, `register_struct`, the reflection readers) are added the
 same way — flat-memory pointer in, handle/pointer out, no marshaling.
 
+**Phase 3 progress (2026-06-18):**
+- **(P3.1) First read-only reflection readers — `find_type` + `type_field_count` (DONE).**
+  Two more `compiler`-library fns bound the same way as the `intern`/`text_of` seed
+  (added to `compiler_lib.bound_fns` AND `Vm.callCompilerFn`, native on flat memory, no
+  marshaling). A **type handle is a plain `u32` `TypeId`** (exactly like `StringId`), so
+  both calls keep the seed's clean scalar shape — handle in, scalar out:
+  `find_type(name: StringId) -> TypeId` (`TypeTable.findByName`) and
+  `type_field_count(t: TypeId) -> i64` (a new `TypeTable.memberCount` query — struct/union/
+  tagged-union fields, enum variants, array/vector length — that BOTH the legacy handler
+  and the VM call, so the two paths can't drift). Example `0628` chains
+  `intern → find_type → type_field_count` and a not-found lookup, both folded at `#run`,
+  both VM-HANDLED natively (no fallback). Parity **689/689** (gate ON and OFF); VM unit test
+  added.
+  - **Decision (resolves the plan's `find_type → ?Type` sketch):** `find_type` returns a
+    NON-optional `TypeId`, using the codebase's dedicated `unresolved` (0) sentinel for
+    not-found — NOT an `?Type`. Rationale: a `Type` value resolves to `.any`
+    (`type_resolver.zig`), which the flat-memory VM does not represent; and an optional
+    return can't cross the legacy↔VM eval boundary (`regToValue` bridges only
+    word/string/struct/tuple). `unresolved` is the project-blessed unmistakable "no type"
+    marker (see CLAUDE.md REJECTED PATTERNS — a dedicated sentinel is the required shape),
+    so the caller checks the handle against 0. This keeps the reader a clean scalar mirror
+    of `intern`/`text_of` and defers `.any`/optional plumbing to when it's actually needed.
+- **Next (P3.2):** more read-only readers on the same `TypeId`-handle shape (`type_name(t)
+  -> StringId`, `field_name(t, i) -> StringId`, `field_type(t, i) -> TypeId`, kind queries),
+  then `register_struct` (the first MUTATING fn — mints a `TypeId`; resolve the mutable-table
+  / host-ABI-vs-target-ABI boundary deliberately, per the open questions). Re-expressing
+  `declare`/`define`/`type_info` as sx (the metatype, which runs at LOWERING time) still
+  needs the VM hardened against malformed lowering-time IR first — keep that on the legacy
+  path until then (see the resume note in CHECKPOINT-COMPILER-API.md).
+
 ### Phase 3 — Compiler-API on flat memory (resume the stream — no weld)
 
 With native-byte comptime values, re-home the compiler-API:
diff --git a/examples/0628-comptime-compiler-find-type.sx b/examples/0628-comptime-compiler-find-type.sx
new file mode 100644
index 00000000..44074bca
--- /dev/null
+++ b/examples/0628-comptime-compiler-find-type.sx
@@ -0,0 +1,37 @@
+// Comptime compiler API — read-only reflection readers (Phase 3).
+//
+// `find_type` / `type_field_count` are bound to the `compiler` library via
+// `abi(.zig) extern compiler`, joining the `intern` / `text_of` seed. They are
+// the first REFLECTION readers: the compiler exposes its own type table to
+// comptime sx as plain handles (a `TypeId` is a u32, like a `StringId`), so the
+// calls are clean scalar host-calls — handle in, scalar out, no marshaling.
+//
+//   find_type(name)       → the named type's handle (0 / `unresolved` if absent)
+//   type_field_count(t)   → its member count (struct fields here)
+//
+// Comptime-only: they run inside `#run`, folding to plain int constants the
+// runtime `main` prints. Chains `intern` → `find_type` → `type_field_count`.
+
+#import "modules/std.sx";
+
+compiler :: #library "compiler";
+
+StringId :: u32;
+TypeId :: u32;
+
+intern :: (s: string) -> StringId abi(.zig) extern compiler;
+find_type :: (name: StringId) -> TypeId abi(.zig) extern compiler;
+type_field_count :: (t: TypeId) -> i64 abi(.zig) extern compiler;
+
+Point :: struct { x: i64; y: i64; z: i64; }
+
+// Look the struct up by name and count its fields, all at comptime.
+point_fields :: #run type_field_count(find_type(intern("Point")));
+
+// A name with no matching type folds to the `unresolved` sentinel (0).
+missing_id :: #run find_type(intern("NoSuchType"));
+
+main :: () {
+    print("Point has {} fields\n", point_fields);
+    print("missing type id = {}\n", missing_id);
+}
diff --git a/examples/expected/0628-comptime-compiler-find-type.exit b/examples/expected/0628-comptime-compiler-find-type.exit
new file mode 100644
index 00000000..573541ac
--- /dev/null
+++ b/examples/expected/0628-comptime-compiler-find-type.exit
@@ -0,0 +1 @@
+0
diff --git a/examples/expected/0628-comptime-compiler-find-type.stderr b/examples/expected/0628-comptime-compiler-find-type.stderr
new file mode 100644
index 00000000..8b137891
--- /dev/null
+++ b/examples/expected/0628-comptime-compiler-find-type.stderr
@@ -0,0 +1 @@
+
diff --git a/examples/expected/0628-comptime-compiler-find-type.stdout b/examples/expected/0628-comptime-compiler-find-type.stdout
new file mode 100644
index 00000000..e1f81f48
--- /dev/null
+++ b/examples/expected/0628-comptime-compiler-find-type.stdout
@@ -0,0 +1,2 @@
+Point has 3 fields
+missing type id = 0
diff --git a/src/ir/compiler_lib.zig b/src/ir/compiler_lib.zig
index 4c7e9e12..09e52024 100644
--- a/src/ir/compiler_lib.zig
+++ b/src/ir/compiler_lib.zig
@@ -47,6 +47,8 @@ pub const BoundFn = struct {
 pub const bound_fns = [_]BoundFn{
     .{ .sx_name = "intern", .handler = handleIntern },
     .{ .sx_name = "text_of", .handler = handleTextOf },
+    .{ .sx_name = "find_type", .handler = handleFindType },
+    .{ .sx_name = "type_field_count", .handler = handleTypeFieldCount },
 };
 
 /// Look up a compiler function by its sx name. Returns null when the name is not
@@ -82,3 +84,30 @@ fn handleTextOf(interp: *Interpreter, args: []const Value) InterpError!Value {
     const id: StringId = @enumFromInt(@as(u32, @intCast(args[0].int)));
     return Value{ .string = interp.module.types.getString(id) };
 }
+
+/// `find_type(name: StringId) -> TypeId` — look up a named type (struct / enum /
+/// union / tagged-union / error-set) by its interned name and return its handle.
+/// A name with no matching type yields the dedicated `unresolved` sentinel (a
+/// `TypeId` of 0), the codebase-blessed "no type" marker — NOT an `?Type` (a
+/// `Type` value is `.any`-typed, which the flat-memory VM does not represent, and
+/// an optional can't cross the legacy↔VM eval boundary). The caller checks the
+/// handle against 0 / `unresolved`. The VM mirrors this in `comptime_vm.callCompilerFn`.
+fn handleFindType(interp: *Interpreter, args: []const Value) InterpError!Value {
+    if (args.len != 1 or args[0] != .int) return error.TypeError;
+    if (args[0].int < 0 or args[0].int > std.math.maxInt(u32)) return error.TypeError;
+    const name: StringId = @enumFromInt(@as(u32, @intCast(args[0].int)));
+    const tid = interp.module.types.findByName(name) orelse types.TypeId.unresolved;
+    return Value{ .int = tid.index() };
+}
+
+/// `type_field_count(t: TypeId) -> i64` — the member count of an aggregate type
+/// (struct/union/tagged-union fields, enum variants, array/vector length), read
+/// through `TypeTable.memberCount`. A type with no member count (scalar, pointer,
+/// the `unresolved` sentinel, …) is a loud error — never a silent 0.
+fn handleTypeFieldCount(interp: *Interpreter, args: []const Value) InterpError!Value {
+    if (args.len != 1 or args[0] != .int) return error.TypeError;
+    if (args[0].int < 0 or args[0].int > std.math.maxInt(u32)) return error.TypeError;
+    const tid: types.TypeId = @enumFromInt(@as(u32, @intCast(args[0].int)));
+    const count = interp.module.types.memberCount(tid) orelse return error.TypeError;
+    return Value{ .int = count };
+}
diff --git a/src/ir/comptime_vm.test.zig b/src/ir/comptime_vm.test.zig
index 58cb1ca6..a21bd4b7 100644
--- a/src/ir/comptime_vm.test.zig
+++ b/src/ir/comptime_vm.test.zig
@@ -769,6 +769,70 @@ test "comptime_vm exec: compiler-fn intern/text_of round-trip (native, no legacy
     try std.testing.expectEqual(@as(i64, 5), toI64(try v.run(module.getFunction(main_id), &.{})));
 }
 
+test "comptime_vm exec: compiler-fn find_type + type_field_count (native reflection)" {
+    const alloc = std.testing.allocator;
+    var module = Module.init(alloc);
+    defer module.deinit();
+
+    // A struct `Point { x, y, z }` registered in the type table (the thing the
+    // reflection readers look up by name and count the fields of).
+    const point_name = module.types.internString("Point");
+    const pfields = [_]types.TypeInfo.StructInfo.Field{
+        .{ .name = module.types.internString("x"), .ty = .i64 },
+        .{ .name = module.types.internString("y"), .ty = .i64 },
+        .{ .name = module.types.internString("z"), .ty = .i64 },
+    };
+    _ = module.types.intern(.{ .@"struct" = .{ .name = point_name, .fields = &pfields } });
+
+    // extern find_type(name: u32) -> u32 [compiler] (FuncId 0, no body)
+    const fp = [_]Function.Param{.{ .name = module.types.internString("name"), .ty = .u32 }};
+    var ffb = Fb.init(alloc, &fp, .u32);
+    ffb.func.is_extern = true;
+    ffb.func.compiler_welded = true;
+    ffb.func.name = module.types.internString("find_type");
+    const find_id = module.addFunction(ffb.func);
+
+    // extern type_field_count(t: u32) -> i64 [compiler] (FuncId 1, no body)
+    const cp = [_]Function.Param{.{ .name = module.types.internString("t"), .ty = .u32 }};
+    var cfb = Fb.init(alloc, &cp, .i64);
+    cfb.func.is_extern = true;
+    cfb.func.compiler_welded = true;
+    cfb.func.name = module.types.internString("type_field_count");
+    const count_id = module.addFunction(cfb.func);
+
+    // main(): return type_field_count(find_type(intern_id_of("Point"))) → 3
+    // ("Point" is already interned above; pass its StringId directly.)
+    var fb = Fb.init(alloc, &.{}, .i64);
+    const b0 = fb.block(&.{});
+    const nm = fb.add(b0, inst(.{ .const_int = @intFromEnum(point_name) }, .u32));
+    const nargs = [_]Ref{ref(nm)};
+    const tid = fb.add(b0, inst(.{ .call = .{ .callee = find_id, .args = &nargs } }, .u32));
+    const targs = [_]Ref{ref(tid)};
+    const cnt = fb.add(b0, inst(.{ .call = .{ .callee = count_id, .args = &targs } }, .i64));
+    _ = fb.add(b0, inst(.{ .ret = .{ .operand = ref(cnt) } }, .void));
+    const main_id = module.addFunction(fb.func);
+
+    var v = vm.Vm.init(alloc);
+    v.table = &module.types;
+    v.module = &module;
+    defer v.deinit();
+    try std.testing.expectEqual(@as(i64, 3), toI64(try v.run(module.getFunction(main_id), &.{})));
+
+    // A name with no matching type → the `unresolved` (0) sentinel.
+    const missing = module.types.internString("Nope");
+    var mfb = Fb.init(alloc, &.{}, .u32);
+    const mb = mfb.block(&.{});
+    const mnm = mfb.add(mb, inst(.{ .const_int = @intFromEnum(missing) }, .u32));
+    const margs = [_]Ref{ref(mnm)};
+    const mres = mfb.add(mb, inst(.{ .call = .{ .callee = find_id, .args = &margs } }, .u32));
+    _ = mfb.add(mb, inst(.{ .ret = .{ .operand = ref(mres) } }, .void));
+    const missing_main = module.addFunction(mfb.func);
+    try std.testing.expectEqual(
+        @as(i64, @intFromEnum(TypeId.unresolved)),
+        toI64(try v.run(module.getFunction(missing_main), &.{})),
+    );
+}
+
 test "comptime_vm exec: func_ref + call_indirect dispatch" {
     const alloc = std.testing.allocator;
     var module = Module.init(alloc);
diff --git a/src/ir/comptime_vm.zig b/src/ir/comptime_vm.zig
index d49276e4..7b1858d9 100644
--- a/src/ir/comptime_vm.zig
+++ b/src/ir/comptime_vm.zig
@@ -1011,6 +1011,30 @@ pub const Vm = struct {
             if (text.len > 0) @memcpy(try self.machine.bytes(data, text.len), text);
             return try self.makeSlice(table, data, text.len);
         }
+        // ── read-only reflection readers (Phase 3) ──────────────────────────
+        // Type handle = a u32 `TypeId` (a word), exactly like `StringId` — so
+        // these mirror intern/text_of's shape: word in, word out, no marshaling.
+        if (std.mem.eql(u8, name, "find_type")) {
+            if (args.len != 1) return self.failMsg("comptime find_type: expected one StringId arg");
+            const raw = frame.get(args[0].index());
+            if (raw > std.math.maxInt(u32)) return self.failMsg("comptime find_type: StringId out of range");
+            const sid: types.StringId = @enumFromInt(@as(u32, @intCast(raw)));
+            // Not found → the dedicated `unresolved` (0) sentinel, never a real
+            // type id (mirrors `compiler_lib.handleFindType`).
+            const tid = table.findByName(sid) orelse TypeId.unresolved;
+            return @as(Reg, tid.index());
+        }
+        if (std.mem.eql(u8, name, "type_field_count")) {
+            if (args.len != 1) return self.failMsg("comptime type_field_count: expected one TypeId arg");
+            const raw = frame.get(args[0].index());
+            if (raw > std.math.maxInt(u32)) return self.failMsg("comptime type_field_count: TypeId out of range");
+            const tid: TypeId = @enumFromInt(@as(u32, @intCast(raw)));
+            // Same `TypeTable.memberCount` the legacy handler reads → no drift; a
+            // type with no member count bails loudly (no silent 0).
+            const count = table.memberCount(tid) orelse
+                return self.failMsg("comptime type_field_count: type has no field/variant count");
+            return @as(Reg, @bitCast(count));
+        }
         return null; // not a known compiler function → caller bails to legacy
     }
 
diff --git a/src/ir/types.zig b/src/ir/types.zig
index c788d4a4..f929e7d2 100644
--- a/src/ir/types.zig
+++ b/src/ir/types.zig
@@ -480,6 +480,26 @@ pub const TypeTable = struct {
         return null;
     }
 
+    /// Member count of an aggregate type: struct/union/tagged-union fields, enum
+    /// variants, or array/vector length. Returns null for a type that has no
+    /// member count (a scalar, pointer, the `unresolved` sentinel, …) — so a
+    /// caller bails loudly rather than reading a silent 0. The comptime
+    /// compiler-API reflection reader `type_field_count` rides on this (both the
+    /// legacy `compiler_lib` handler and the flat-memory VM call it, so the two
+    /// paths can never drift). Out-of-range ids return null, not a panic.
+    pub fn memberCount(self: *const TypeTable, id: TypeId) ?i64 {
+        if (id.index() >= self.infos.items.len) return null;
+        return switch (self.get(id)) {
+            .@"struct" => |s| @intCast(s.fields.len),
+            .@"union" => |u| @intCast(u.fields.len),
+            .tagged_union => |u| @intCast(u.fields.len),
+            .@"enum" => |e| @intCast(e.variants.len),
+            .array => |a| @intCast(a.length),
+            .vector => |v| @intCast(v.length),
+            else => null,
+        };
+    }
+
     /// Source-sensitive variant of `findByName`: asserts at most one named type
     /// matches, then returns it (or null). Quarantines the global first-match
     /// scan — new resolver code that must not silently pick a first-of-many