sx/current/CHECKPOINT-COMPILER-API.md
agra 40d075ca98 compiler-API: welded structs by reflection + memory-order validation
Replace the explored byte-layout-override engine (offset-ordered LLVM structs /
weld plans / byte-blobs — all unnecessary) with a much simpler design: a welded
`struct abi(.zig) extern compiler { … }` is a bodied header declaring its fields
in the bound compiler type's MEMORY order. The compiler reflects the real Zig
type (field names via @typeInfo, offsets via @offsetOf, size via @sizeOf —
nothing hand-maintained) and validates the header matches, with loud diagnostics.

On pass it is an ordinary struct whose natural layout already equals the Zig
layout — no reorder, no padding, no index/remap tables, no special LLVM path — so
@ptrCast'ing it to the compiler's own type and dereferencing is byte-identical.
When types.zig shifts, the header stops matching and the developer gets a specific
message to fix it.

- compiler_lib.zig: weldStruct reflects field names and bakes bound_types fields
  in ascending-offset (memory) order; deleted computeWeldPlan/WeldPlan/WeldElement.
- nominal.zig validateWeldedStruct: precise diagnostics — field-not-found,
  wrong-field-order (+ expected memory order), type-layout (size) mismatch,
  total-size mismatch.
- Examples: 0627 (StructInfo in memory order, byte-identical, usable),
  1186 (source-order StructInfo -> wrong-field-order diagnostic); 1183 refreshed.
- Design doc + checkpoint updated.
2026-06-17 15:45:23 +03:00


# CHECKPOINT-COMPILER-API — comptime `compiler` library (`#library "compiler"` + `abi(.zig) extern`)
Companion to the design-of-record
[../design/comptime-compiler-api.md](../design/comptime-compiler-api.md) (the plan
+ phased build order live there). This stream supersedes the metatype
`declare`/`define`/`type_info` `#builtin`s and the `#compiler` struct attribute
with ONE welded mechanism. Branch: `reify` (off `master`). Update after every step.
## ⏯ Resume (fresh session)
Phase 1 done; Phase 2 **welded structs are working** via a much simpler design than
the original byte-layout-override "GEP engine" (that plan — `computeWeldPlan`,
offset-ordered LLVM structs, byte-blobs — was explored and DROPPED). The locked
design: a welded `Name :: struct abi(.zig) extern compiler { … }` is a bodied
header declaring fields in the compiler type's MEMORY order; the compiler reflects
the bound Zig type (`@typeInfo` names + `@offsetOf` offsets + `@sizeOf`, nothing
maintained by hand) and VALIDATES the header matches, with loud diagnostics. On
pass it's an ordinary byte-identical struct — so `@ptrCast` to the compiler's own
type + deref just works; no index tables, no reorder, no special emit.
**Next:** Phase 2 continues — re-express `type_info`/`define` (struct) as sx over
welded `register_struct`/`find_type` (host-call bridge, Phase 2.5/2.6); see
**## Next step**. Read order: this file → `src/ir/compiler_lib.zig` (registry +
reflection) → `src/ir/lower/nominal.zig` `validateWeldedStruct`. Build/verify:
`zig build && zig build test`.
> ⚠ Snapshot workflow: use `-Dname=examples/NNNN-foo.sx[,…] -Dupdate-goldens` to
> regenerate ONLY the named example(s) — a full `-Dupdate-goldens` re-runs all ~690
> and a flaky/host-divergent example (AOT/cross-arch) can clobber good snapshots.
> See CLAUDE.md → Snapshot integrity.
## Last completed step
**Phase 2 — welded structs by reflection + memory-order validation (byte-identical,
no GEP engine).** A welded `struct abi(.zig) extern compiler { … }` now works
end-to-end as a byte-identical mirror of the bound Zig type.
Design (locked, supersedes the byte-layout-override plan):
- The sx header declares fields in the compiler type's MEMORY order. The compiler
REFLECTS the bound Zig type — field names from `@typeInfo`, offsets from
`@offsetOf`, size from `@sizeOf` — and validates the header matches. Nothing is
maintained by hand; a `types.zig` change re-reflects on the next compiler build.
- On pass it's an ORDINARY struct whose natural layout already equals the Zig
layout → `@ptrCast` to the compiler type + deref is byte-identical. No
byte-blob, no index/remap tables, no reorder, no special LLVM path.
- Loud, precise diagnostics on any drift: *field not found* (+ memory order),
*wrong field order at position N* (+ expected memory order), *type layout
mismatch* (field size), *layout mismatch* (total size / count).
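
The reflection step described above can be sketched in a few lines of Zig. This is a minimal illustration, not the code in `compiler_lib.zig`: `FieldLayout`, `memoryOrder`, and `Demo` are hypothetical names, and the sketch uses `std.meta.fields` as a convenience wrapper over `@typeInfo`. It reads field names, `@offsetOf`, and `@sizeOf` from a real type and sorts the result by ascending offset, which is the memory order a welded header would have to follow.

```zig
const std = @import("std");

/// One reflected field of the bound type (hypothetical shape for this sketch).
const FieldLayout = struct {
    name: []const u8,
    offset: usize,
    size: usize,
};

/// Reflect `T` and return its fields sorted by ascending byte offset,
/// i.e. in memory order. No field name is listed by hand.
fn memoryOrder(comptime T: type) [std.meta.fields(T).len]FieldLayout {
    const fields = std.meta.fields(T);
    var out: [fields.len]FieldLayout = undefined;
    inline for (fields, 0..) |f, idx| {
        out[idx] = .{
            .name = f.name,
            .offset = @offsetOf(T, f.name),
            .size = @sizeOf(f.type),
        };
    }
    // Insertion sort by offset; field counts are tiny.
    var k: usize = 1;
    while (k < out.len) : (k += 1) {
        var j = k;
        while (j > 0 and out[j - 1].offset > out[j].offset) : (j -= 1) {
            std.mem.swap(FieldLayout, &out[j - 1], &out[j]);
        }
    }
    return out;
}

/// Stand-in for a compiler type. Zig is free to reorder these fields.
const Demo = struct { flag: bool, items: []const u32, id: u32 };

pub fn main() void {
    const order = memoryOrder(Demo);
    for (order) |f| {
        std.debug.print("{s} @ {d} (size {d})\n", .{ f.name, f.offset, f.size });
    }
    std.debug.print("total size {d}\n", .{@sizeOf(Demo)});
}
```

Because the layout of a plain Zig `struct` is chosen by the compiler, the printed order is whatever the toolchain picked; the sketch only guarantees that the returned list is sorted by offset.
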
What changed from the dropped plan:
- `compiler_lib.zig`: `weldStruct` now REFLECTS field names (`@typeInfo`) and bakes
`bound_types` fields in ascending-OFFSET (memory) order — no hand-listed names.
Deleted `computeWeldPlan`/`WeldPlan`/`WeldElement`. `validateStructLayout` checks
the sx header against the memory-ordered registry.
- `nominal.zig` `validateWeldedStruct`: renders the precise diagnostics
(+ `weldedFieldOrderStr`).
- Examples: `0627` (StructInfo in memory order, byte-identical, usable);
`1186` (source-order StructInfo → wrong-field-order diagnostic). `1183` message
refreshed.
- `zig build` + `zig build test` green (692 corpus, unit tests pass).
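
A rough sketch of the header check, under stated assumptions: the types and the `validate` function below are hypothetical and only mirror the diagnostic categories named in this checkpoint (field count, field not found, wrong field order, field size, total size). The `StructInfo` offsets come from the measurements recorded in this file; the per-field sizes and the source order `name, fields, is_protocol, nominal_id` are inferred from those offsets and from the recorded remap, so treat them as assumptions.

```zig
const std = @import("std");

const BoundField = struct { name: []const u8, size: usize };
const SxField = struct { name: []const u8, size: usize };

/// First mismatch found; a `usize` payload is the header position.
const Mismatch = union(enum) {
    field_count,
    field_not_found: usize,
    wrong_field_order: usize,
    field_size: usize,
    total_size,
};

fn isBound(bound: []const BoundField, name: []const u8) bool {
    for (bound) |b| {
        if (std.mem.eql(u8, b.name, name)) return true;
    }
    return false;
}

/// `bound` must already be in memory (ascending-offset) order.
fn validate(
    bound: []const BoundField,
    bound_total: usize,
    sx: []const SxField,
    sx_total: usize,
) ?Mismatch {
    if (sx.len != bound.len) return Mismatch{ .field_count = {} };
    for (sx, bound, 0..) |s, b, pos| {
        if (!std.mem.eql(u8, s.name, b.name)) {
            // A known name in the wrong slot is an ordering error;
            // an unknown name is a missing field.
            if (isBound(bound, s.name)) return Mismatch{ .wrong_field_order = pos };
            return Mismatch{ .field_not_found = pos };
        }
        if (s.size != b.size) return Mismatch{ .field_size = pos };
    }
    if (sx_total != bound_total) return Mismatch{ .total_size = {} };
    return null;
}

// StructInfo in memory order: fields@0, name@16, nominal_id@20, is_protocol@24.
const bound = [_]BoundField{
    .{ .name = "fields", .size = 16 },
    .{ .name = "name", .size = 4 },
    .{ .name = "nominal_id", .size = 4 },
    .{ .name = "is_protocol", .size = 1 },
};
const memory_order_header = [_]SxField{
    .{ .name = "fields", .size = 16 },
    .{ .name = "name", .size = 4 },
    .{ .name = "nominal_id", .size = 4 },
    .{ .name = "is_protocol", .size = 1 },
};
// Assumed source order of the Zig declaration.
const source_order_header = [_]SxField{
    .{ .name = "name", .size = 4 },
    .{ .name = "fields", .size = 16 },
    .{ .name = "is_protocol", .size = 1 },
    .{ .name = "nominal_id", .size = 4 },
};

pub fn main() void {
    const ok = validate(&bound, 32, &memory_order_header, 32);
    const bad = validate(&bound, 32, &source_order_header, 32);
    std.debug.print("memory order: {s}\n", .{if (ok) |m| @tagName(m) else "ok"});
    std.debug.print("source order: {s}\n", .{if (bad) |m| @tagName(m) else "ok"});
}
```

The memory-order header corresponds to the passing case (example `0627`), and the source-order header to the wrong-field-order case (example `1186`).
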
### Earlier — Phase 2.1 (weld-plan layout math, now removed)
**The weld-plan offset math + `StructInfo` registered.** Was the core of the
byte-layout-override engine; superseded by the reflection+validation design above.
Decision (locked 2026-06-17): **full byte-layout weld** — a welded sx struct is
laid out byte-identically to the bound Zig type (Zig's `@offsetOf`, reordering +
padding included), so it passes to a Zig handler as raw memory with zero
marshalling. (The alternative — handlers reading interp `Value` aggregates
logically, no layout override — was rejected; welded types must also be usable as
runtime data, and the design wants the literal byte weld.)
- Measured: Zig reorders `StructInfo` to `fields`@0, `name`@16, `nominal_id`@20,
`is_protocol`@24, size 32 — vs sx-natural `name`@0, `fields`@8, … So the override
is genuinely required (`Field`'s two-u32 natural layout was the easy case).
- `compiler_lib.zig`: registered `StructInfo` (`weldStruct`, the second
`bound_types` entry). Added `WeldElement` / `WeldPlan` + `computeWeldPlan(alloc,
fields, total)` — pure: orders fields by ascending byte offset, inserts padding
elements for gaps + the alignment tail, and builds the sx-field → LLVM-element
remap. This is what the LLVM type builder + struct-GEP sites will consume.
- Unit-tested (`compiler_lib.test.zig`): `Field` → identity plan (2 elems, no pad);
`StructInfo` → 5 elems `[fields@0, name@16, nominal_id@20, is_protocol@24,
pad@25..32]`, remap `[1,0,3,2]`.
- `zig build` + `zig build test` green.
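
For the record, the removed weld-plan math can be reconstructed roughly as follows. This is a sketch of the algorithm as described in the bullets above, not the deleted `computeWeldPlan`: `planWeld`, `Plan`, and `Element` are hypothetical names, it uses fixed-size buffers instead of an allocator, and the source order `name, fields, is_protocol, nominal_id` is inferred from the recorded remap.

```zig
const std = @import("std");

const Field = struct { name: []const u8, offset: usize, size: usize };
const Element = struct { offset: usize, size: usize, is_pad: bool };

const max_fields = 8;
const max_elems = 2 * max_fields + 1; // worst case: a gap before every field + tail

const Plan = struct {
    elems: [max_elems]Element = undefined,
    len: usize = 0,
    /// Source field index -> element index.
    remap: [max_fields]usize = undefined,

    fn push(self: *Plan, e: Element) void {
        self.elems[self.len] = e;
        self.len += 1;
    }
};

fn planWeld(fields: []const Field, total: usize) Plan {
    std.debug.assert(fields.len <= max_fields);

    // Field indices sorted by ascending byte offset.
    var order: [max_fields]usize = undefined;
    for (0..fields.len) |i| {
        order[i] = i;
    }
    var k: usize = 1;
    while (k < fields.len) : (k += 1) {
        var j = k;
        while (j > 0 and fields[order[j - 1]].offset > fields[order[j]].offset) : (j -= 1) {
            std.mem.swap(usize, &order[j - 1], &order[j]);
        }
    }

    var plan = Plan{};
    var cursor: usize = 0;
    for (order[0..fields.len]) |src| {
        const f = fields[src];
        if (f.offset > cursor) {
            // Gap before this field: emit a padding element.
            plan.push(.{ .offset = cursor, .size = f.offset - cursor, .is_pad = true });
        }
        plan.remap[src] = plan.len;
        plan.push(.{ .offset = f.offset, .size = f.size, .is_pad = false });
        cursor = f.offset + f.size;
    }
    if (total > cursor) {
        // Alignment tail.
        plan.push(.{ .offset = cursor, .size = total - cursor, .is_pad = true });
    }
    return plan;
}

// StructInfo in the assumed source order, with the measured offsets.
const struct_info_fields = [_]Field{
    .{ .name = "name", .offset = 16, .size = 4 },
    .{ .name = "fields", .offset = 0, .size = 16 },
    .{ .name = "is_protocol", .offset = 24, .size = 1 },
    .{ .name = "nominal_id", .offset = 20, .size = 4 },
};

pub fn main() void {
    const plan = planWeld(&struct_info_fields, 32);
    for (plan.elems[0..plan.len]) |e| {
        const kind: []const u8 = if (e.is_pad) "pad" else "field";
        std.debug.print("{s} @{d} size {d}\n", .{ kind, e.offset, e.size });
    }
}
```

Under those assumptions the sketch yields five elements with a trailing pad at bytes 25..32 and the remap `[1,0,3,2]`, matching the unit-test expectations listed above.
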
### Earlier — Phase 1 polish (comptime-only enforcement)
**A RUNTIME call to a `fn abi(.zig) extern compiler` is a clean build-gating error
instead of an undefined-symbol link failure.**
- `emitCall` (`src/backend/llvm/ops.zig`): when the callee is `compiler_welded`
AND the ENCLOSING function is not `is_comptime` (i.e. genuine runtime code, not a
`#run`/`::` initializer wrapper whose LLVM body is dead), print a clear
"comptime-only … cannot be called at runtime" error and set
`comptime_failed` (the driver halts before object/JIT emission). The enclosing
`is_comptime` guard is what keeps the legitimate `#run` use (example 0626) green.
- Corpus: `examples/1185-diagnostics-weld-fn-runtime-call.sx` (runtime `intern(…)`
→ clean error, exit 1, no link failure).
- `zig build` + `zig build test` green (458 unit + 690 corpus).
### Earlier — fifth sub-step (host-call bridge)
**A `fn abi(.zig) extern compiler` dispatches, under the comptime interpreter, to
its registered Zig handler instead of dlsym.**
- `compiler_lib.zig`: function registry — `BoundFn { sx_name, handler }`,
`bound_fns` = `intern(string)->StringId` + `text_of(StringId)->string` (the
string-pool round-trip), `findFn`, and `FnHandler` (`*Interpreter, []Value ->
Value`). `intern` mutates via `interp.mint orelse @constCast(&module.types)`
(the same mutable-table access the metatype mint path uses); `text_of` reads the
const pool. Imports `interp.zig` (the compiler_hooks↔interp cycle pattern).
- IR `Function` gained `compiler_welded: bool`. `declareFunction`
(`src/ir/lower/decl.zig`) sets it via `weldedCompilerFn`, which also VALIDATES:
the bound lib must be `compiler` and the name must be on the function-export
list — else a build-gating `.err` (no silent fall-through to dlsym).
- `interp.call()`: before the dlsym/extern path, a `compiler_welded` function
routes to `compiler_lib.findFn(name).handler(self, args)` (clean bail off the
export list).
- Corpus: `examples/0626-comptime-weld-fn-intern-text-of.sx` (`#run
text_of(intern("hello, compiler"))` folds to a string constant → prints it);
`examples/1184-diagnostics-weld-fn-unexported.sx` (unexported welded-fn name →
build error). `findFn` lookup unit-tested.
- **Runtime-call rejection is NOT yet clean** — welded fns are comptime-only; a
RUNTIME call would emit a reference to a non-existent extern symbol → a loud
LINK error (not silent, but not a tidy diagnostic). The examples call welded fns
only inside `#run`. A dedicated "comptime-only symbol" emit diagnostic is the
immediate follow-up.
- `zig build` + `zig build test` green (458 unit tests + 689 corpus).
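
The dispatch shape of the host-call bridge can be illustrated with a small stand-alone sketch. Everything here is a simplified stand-in: `Interp`, `Value`, `callWelded`, and the fixed-size string pool are hypothetical, and the real `FnHandler` takes the compiler's own `*Interpreter` and `[]Value`. The point is only the lookup-then-dispatch flow: a welded name resolves through a registry, and a name off the export list fails cleanly instead of falling through to a symbol lookup.

```zig
const std = @import("std");

/// Simplified stand-in for the interpreter's value type.
const Value = union(enum) { int: i64, string: []const u8 };

/// Simplified stand-in for the interpreter, holding a tiny string pool.
const Interp = struct {
    pool: [8][]const u8 = undefined,
    len: usize = 0,
};

const FnHandler = *const fn (interp: *Interp, args: []const Value) Value;
const BoundFn = struct { sx_name: []const u8, handler: FnHandler };

fn internHandler(interp: *Interp, args: []const Value) Value {
    const text = args[0].string;
    for (interp.pool[0..interp.len], 0..) |s, i| {
        if (std.mem.eql(u8, s, text)) return Value{ .int = @intCast(i) };
    }
    std.debug.assert(interp.len < interp.pool.len);
    interp.pool[interp.len] = text;
    interp.len += 1;
    return Value{ .int = @intCast(interp.len - 1) };
}

fn textOfHandler(interp: *Interp, args: []const Value) Value {
    const idx: usize = @intCast(args[0].int);
    return Value{ .string = interp.pool[idx] };
}

/// The export list: only these names are callable as welded functions.
const bound_fns = [_]BoundFn{
    .{ .sx_name = "intern", .handler = &internHandler },
    .{ .sx_name = "text_of", .handler = &textOfHandler },
};

fn findFn(name: []const u8) ?*const BoundFn {
    for (&bound_fns) |*f| {
        if (std.mem.eql(u8, f.sx_name, name)) return f;
    }
    return null;
}

const CallError = error{NotExported};

/// Route a welded call to its handler, or bail if the name is not exported.
fn callWelded(interp: *Interp, name: []const u8, args: []const Value) CallError!Value {
    const f = findFn(name) orelse return error.NotExported;
    return f.handler(interp, args);
}

pub fn main() !void {
    var interp = Interp{};
    const id = try callWelded(&interp, "intern", &[_]Value{.{ .string = "hello, compiler" }});
    const text = try callWelded(&interp, "text_of", &[_]Value{id});
    std.debug.print("{s}\n", .{text.string});
}
```
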
### Earlier — fourth sub-step (welded-struct layout validation)
**A `struct abi(.zig) extern compiler { … }` is validated against the binding
registry as a *header checked against the implementation*.**
- `compiler_lib.zig`: `validateStructLayout(bt, sx_fields, total)` — pure, returns
the first `LayoutMismatch` (field count / name / size / total) or null. Plus
`lib_name = "compiler"` and `SxField`. Unit-tested (faithful `Field` passes;
each drift flagged as the right variant).
- `registerStructDecl` (`src/ir/lower/nominal.zig`): for `sd.abi == .zig`,
`validateWeldedStruct` checks the bound lib is `compiler`, the name is on the
export list (`findType`), and the sx layout (field names + `typeSizeBytes` +
total) matches the welded type — emitting a build-gating `.err` (good span into
the struct body) on any failure. No silent reinterpretation.
- `#library "compiler"` is the comptime-only internal surface, NOT a dylib —
`src/main.zig`'s dlopen walker skips it (was emitting a spurious `libcompiler.so`
load warning).
- Corpus: `examples/0625-comptime-weld-struct-field.sx` (faithful `Field` welds,
validates, usable as data → `name=7 ty=3`);
`examples/1183-diagnostics-weld-struct-field-count.sx` (one-field `Field` →
build-gating field-count diagnostic).
- **Offset-override / GEP emission for non-natural Zig layouts is NOT here** — it
isn't exercised by `Field` (two u32s = natural layout coincides with the weld).
It arrives with `StructInfo` in Phase 2 (slices/reordering), where the bound
offsets actually differ from the sx-natural ones. The validation already checks
per-field size + total, so a layout drift is caught even before the override
engine exists.
- `zig build` + `zig build test` green (456 unit tests + 687 corpus).
### Earlier — third sub-step (binding registry)
**The binding registry (welded-type lookup, layout baked from the real Zig
type).**
- New `src/ir/compiler_lib.zig` — the `compiler` library's binding registry, the
curated safety boundary. `BoundType { sx_name, size, alignment, fields:
[]FieldLayout{name, offset, size} }`; `weldStruct` bakes the layout from a real
Zig struct via `@sizeOf`/`@alignOf`/`@offsetOf` at compiler-build time (a
sx-field-count mismatch is a `@compileError`, never a silent truncation).
`bound_types` exports `Field` (welded to `types.TypeInfo.StructInfo.Field` —
two `u32`s); `findType(sx_name) ?*const BoundType` is the lookup the welded-decl
resolution path will consult (returns null off the export list — clean boundary,
no silent default).
- Registered in the barrel (`src/ir/ir.zig`): `compiler_lib` + `compiler_lib_tests`.
- Tests (`src/ir/compiler_lib.test.zig`): `findType("Field")` equals the real
`StructInfo.Field` `@sizeOf`/`@alignOf`/`@offsetOf` (8 bytes, two u32s at 0/4);
an unexported name returns null. Break-verified (a wrong size → suite red,
named `ir.compiler_lib.test...`).
- `zig build` + `zig build test` green (454 unit tests).
### Earlier — second sub-step (struct-decl parse)
**`abi(.zig) extern <lib>` PARSES on a STRUCT decl (parse-only, no semantics).**
- `ast.StructDecl` gained `abi: ABI` + `extern_lib: ?[]const u8` binding fields.
- `parseStructDecl` (`src/parser.zig`): after `struct` (and the `#compiler`
check), parse an optional `abi(...)` then optional `extern <lib>` — same slot
order as fn decls — and thread them onto the node. Ordinary structs are
unperturbed (`parseOptionalAbi`/`parseOptionalExternExport` no-op when absent).
- Parser unit tests (`src/parser.test.zig`): `Field :: struct abi(.zig) extern
compiler { name: StringId; ty: Type; }` parses with `abi == .zig`, `extern_lib
== "compiler"`, field list intact; a plain struct leaves `abi == .default` /
`extern_lib == null`. Break-verified (a wrong-sentinel assert turns the suite
red, confirming the test runs).
- `zig build` + `zig build test` green.
### Earlier — first sub-step (fn decls) + the syntax pivot
**`abi(.zig) extern <lib>` PARSES on a fn decl (parse-only).** Plus the syntax
pivot it required.
Syntax decision (locked 2026-06-17, supersedes the doc's original
`extern(.zig) <lib>` single-qualifier form): the ABI/layout selector and the
linkage keyword are two orthogonal annotations.
- `abi(.x)` — ABI / calling-convention annotation in the slot **before**
`extern`/`export`. **Unified replacement for `callconv(...)`, which is removed.**
`ABI = { default, c, zig, pure }`: `.c` (C ABI), `.zig` (Zig-layout weld → the
`compiler` library), `.pure` (naked asm), `.default` (unannotated). Can appear
standalone (no extern) on any fn / fn-type / lambda.
- `extern <lib>` — linkage keyword + binding source (named library).
So a welded binding is `text_of :: (id: StringId) -> string abi(.zig) extern compiler;`.
What landed:
- **AST** (`src/ast.zig`): `CallingConvention` → `ABI { default, c, zig, pure }`;
the `call_conv` field → `abi: ABI` on `FnDecl` / `Lambda` / `FunctionTypeExpr`.
- **Lexer/token** (`src/token.zig`, `src/lexer.zig`): `kw_callconv` → `kw_abi`,
keyword string `"callconv"` → `"abi"`.
- **Parser** (`src/parser.zig`): `parseOptionalCallConv` → `parseOptionalAbi`
(parses `abi(.c|.zig|.pure)`); wired in the fn-decl postfix slot (before
`extern`/`export`), the function-type-expr slot, and the lambda slot;
`isFunctionDef`/`hasFnBodyAfterArrow` recognise `kw_abi`.
- **AST→IR map** (`src/ir/type_resolver.zig`, `src/ir/lower/decl.zig`, `sema.zig`,
`closure.zig`): the AST `.abi == .c` reads kept their C-ABI meaning; the
function-type resolver maps `.zig`/`.pure` → IR `.default` (no fn-pointer-type
CC for those decl-level ABIs; neither occurs in a function-TYPE position yet).
- **CC-mismatch diagnostic** (`src/ir/lower/expr.zig`, `src/sema.zig`): the
user-facing text `callconv(.c)` → `abi(.c)`.
- **sx migration**: 52 `.sx` files `callconv(` → `abi(` (all were function-type
callback annotations — none in the fn-decl postfix slot, so no reordering).
- **Docs**: `readme.md`, `specs.md`, the design doc, snapshots (0114 / 1104 /
1200) regenerated for the rename.
- **Tests**: parser unit tests in `src/parser.test.zig` — `abi(.zig) extern <lib>`
on a fn decl (asserts `abi == .zig`, `extern_export == .extern_`, `extern_lib ==
"compiler"`); bare `extern` leaves `abi == .default`; standalone `abi(.c)` /
`abi(.pure)`. lexer/sema tests updated.
`zig build` + `zig build test` green (450/450 unit + 685 corpus).
## Current state
- `compiler :: #library "compiler";` parses + is recognised as the comptime-only
internal surface (never dlopen'd).
- `abi(.zig) extern compiler` STRUCTS: layout-validated against the registry
(faithful → ok; drift → build-gating diagnostic). `Field` and `StructInfo` weld +
usable.
- `abi(.zig) extern compiler` FUNCTIONS: dispatched under the comptime interp to
their registered Zig handler (`intern`/`text_of` round-trip works); unexported
names rejected at declaration. Comptime-only.
- A RUNTIME call to a welded fn is a clean build-gating error (comptime-only
enforcement at `emitCall`); the legitimate `#run`/`::` use stays green.
- The whole Phase 1 foundation (parse → registry → struct-layout validation →
function host-call bridge → comptime-only enforcement) is in place for the
two-u32 `Field` case + the two string readers.
- Non-natural Zig layouts (e.g. `StructInfo`'s slice field) need no offset-override /
LLVM byte-offset GEP: the sx header declares its fields in the bound type's memory
order and is validated by reflection (example `0627`).
## Next step — Phase 2: welded compiler FUNCTIONS over the real types
Welded structs are byte-identical mirrors now, so the API surface can grow:
- **Bind `register_struct` / `find_type`** over the host-call bridge
(`compiler_lib.zig` `bound_fns`, like `intern`/`text_of`). `register_struct`
takes a welded `StructInfo` and mints a real `TypeId` (guarded: dup field names,
kind well-formedness — the checks `define` does today). Because the welded
`StructInfo` is byte-identical, the handler can read it as the real Zig
`*StructInfo` (cast + deref) rather than marshalling a `Value` field-by-field —
the payoff of the byte-weld. `find_type(StringId) -> ?Type` reads the table.
Prove: build a struct programmatically + round-trip a source one.
- **Re-express `type_info`/`define` (struct) as sx** over `register_struct`/
`find_type`; migrate `examples/0622`; delete the bespoke struct interp arms
(`defineStruct` / the `reflectTypeInfo` struct path).
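
The "cast + deref" payoff can be shown with a toy pair of types. This is only a sketch under assumptions: `RealInfo`, `WeldedInfo`, and `registerStruct` are hypothetical, and both structs are declared `extern struct` so that their layout is well defined in the example, whereas the real compiler types rely on the reflection check to establish that the layouts match.

```zig
const std = @import("std");

/// Stand-in for the compiler's own type.
const RealInfo = extern struct {
    field_count: u64,
    name: u32,
    nominal_id: u32,
    is_protocol: bool,
};

/// Stand-in for the welded mirror declared in the same memory order.
const WeldedInfo = extern struct {
    field_count: u64,
    name: u32,
    nominal_id: u32,
    is_protocol: bool,
};

// Compile-time proof that the mirror is byte-identical to the real type.
comptime {
    std.debug.assert(@sizeOf(WeldedInfo) == @sizeOf(RealInfo));
    std.debug.assert(@alignOf(WeldedInfo) == @alignOf(RealInfo));
    for (std.meta.fields(RealInfo)) |f| {
        std.debug.assert(@offsetOf(WeldedInfo, f.name) == @offsetOf(RealInfo, f.name));
    }
}

/// A handler can view the welded value as the real type with no marshalling.
fn registerStruct(welded: *const WeldedInfo) u32 {
    const real: *const RealInfo = @ptrCast(welded);
    return real.nominal_id;
}

pub fn main() void {
    const w = WeldedInfo{ .field_count = 2, .name = 7, .nominal_id = 42, .is_protocol = false };
    std.debug.print("nominal_id = {d}\n", .{registerStruct(&w)});
}
```
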
Then Phase 3+: widen the welded types to `EnumInfo`/`TaggedUnionInfo`/`TupleInfo`
(optional fields → sentinels) — each just needs an sx header in the compiler
type's memory order + the matching `register_*` fn. Finally migrate `BuildOptions`
to `abi(.zig) extern compiler` (re-home the `#compiler` registry) and delete
`#compiler`.
Note: a welded struct with an `?T` / `union(enum)` field (e.g. `EnumInfo`'s
`backing_type: ?TypeId`, `explicit_values: ?[]const i64`) is the next layout
wrinkle — the sx header must mirror Zig's optional/union representation. Handle
when reached (sentinels or accessor fns; see the design doc Risks).
## Known issues
- None for this stream. (Metatype's deferred enhancement is issue 0141 — comptime
`List` growth; orthogonal, see `current/CHECKPOINT-METATYPE.md`.)
## Log
- **Phase 2 — welded structs by reflection + memory-order validation.** Dropped
the byte-layout-override engine (computeWeldPlan / offset-ordered LLVM struct /
byte-blob — all explored, all unnecessary). Instead: the sx header declares
fields in the compiler type's memory order; the compiler reflects the bound Zig
type (`@typeInfo`/`@offsetOf`/`@sizeOf`) and validates the header matches with
loud diagnostics (field-not-found, wrong-order+expected-order, size mismatch).
On pass it's an ordinary byte-identical struct — cast + deref just works.
Examples 0627 (usable) / 1186 (wrong-order diagnostic). Suite green (692).
- **Phase 2.1 — weld-plan layout math (REMOVED).** The byte-layout-override math;
superseded by the reflection+validation design and deleted.
- **Phase 1 polish — comptime-only enforcement.** A runtime call to a welded fn is
a clean build-gating error (`emitCall` gate, guarded by enclosing-`is_comptime`
so `#run`/`::` uses stay green), not a link failure. Example 1185. Build + suite
green (458 unit, 690 corpus).
- **Phase 1.1 fifth sub-step — host-call bridge (welded functions).**
`compiler_lib` function registry (`intern`/`text_of`) + `findFn`; IR `Function`
`compiler_welded` flag set/validated in `declareFunction` (`weldedCompilerFn`);
`interp.call()` dispatches welded calls to the Zig handler. Examples 0626
(round-trip) + 1184 (unexported-fn diagnostic); `findFn` unit-tested. Runtime-call clean
rejection deferred (loud link error today). Build + suite green (458 unit, 689
corpus).
- **Phase 1.1 fourth sub-step — welded-struct layout validation.**
`validateStructLayout` (pure, unit-tested) + `validateWeldedStruct` wired into
`registerStructDecl`: a `struct abi(.zig) extern compiler` is validated against
the registry (lib == compiler, name exported, layout matches) with build-gating
diagnostics. `#library "compiler"` no longer dlopen'd. Examples 0625 (faithful
Field) + 1183 (field-count mismatch diagnostic). Offset-override/GEP deferred to
Phase 2 (not exercised by Field's natural layout). Build + suite green (456 unit,
687 corpus).
- **Phase 1.1 third sub-step — binding registry.** New `src/ir/compiler_lib.zig`:
the `compiler` lib's welded-type registry; `Field` welded to
`StructInfo.Field` with layout baked from the real Zig type
(`@offsetOf`/`@sizeOf`/`@alignOf`); `findType` lookup proven by unit test
(+ null off the export list). Standalone island — not yet consumed by lowering.
Build + suite green (454 unit tests). Break-verified.
- **Phase 1.1 second sub-step — struct-decl binding parses.** `ast.StructDecl`
gained `abi` + `extern_lib`; `parseStructDecl` parses `abi(.zig) extern <lib>`
after `struct`. Parser unit tests (welded `Field` + plain struct), break-verified.
Build + suite green. Parse-only sub-step (fns + structs) of Phase 1.1 complete.
- **Phase 1.1 first sub-step + `callconv`→`abi` unification.** Parsed `abi(.zig)
extern <lib>` on fn decls; unified `callconv` into `abi(.c|.zig|.pure)` (removed
the `callconv` keyword), migrated 52 sx files + compiler diagnostics + docs +
snapshots. Build + suite green. The original design's `extern(.zig)` single
qualifier was split into `abi(.zig)` (ABI/layout, before extern) + `extern
<lib>` (linkage + source) — recorded in the design doc's syntax-decision note.