sx/current/CHECKPOINT-COMPILER-API.md
agra 40d075ca98 compiler-API: welded structs by reflection + memory-order validation
Replace the explored byte-layout-override engine (offset-ordered LLVM structs /
weld plans / byte-blobs — all unnecessary) with a much simpler design: a welded
`struct abi(.zig) extern compiler { … }` is a bodied header declaring its fields
in the bound compiler type's MEMORY order. The compiler reflects the real Zig
type (field names via @typeInfo, offsets via @offsetOf, size via @sizeOf —
nothing hand-maintained) and validates the header matches, with loud diagnostics.

On pass it is an ordinary struct whose natural layout already equals the Zig
layout — no reorder, no padding, no index/remap tables, no special LLVM path — so
@ptrCast'ing it to the compiler's own type and dereferencing is byte-identical.
When types.zig shifts, the header stops matching and the developer gets a specific
message to fix it.

- compiler_lib.zig: weldStruct reflects field names and bakes bound_types fields
  in ascending-offset (memory) order; deleted computeWeldPlan/WeldPlan/WeldElement.
- nominal.zig validateWeldedStruct: precise diagnostics — field-not-found,
  wrong-field-order (+ expected memory order), type-layout (size) mismatch,
  total-size mismatch.
- Examples: 0627 (StructInfo in memory order, byte-identical, usable),
  1186 (source-order StructInfo -> wrong-field-order diagnostic); 1183 refreshed.
- Design doc + checkpoint updated.
2026-06-17 15:45:23 +03:00


CHECKPOINT-COMPILER-API — comptime compiler library (#library "compiler" + abi(.zig) extern)

Companion to the design-of-record ../design/comptime-compiler-api.md (the plan + phased build order live there). This stream supersedes the metatype declare/define/type_info #builtins and the #compiler struct attribute with ONE welded mechanism. Branch: reify (off master). Update after every step.

⏯ Resume (fresh session)

Phase 1 done; Phase 2 welded structs are working via a much simpler design than the original byte-layout-override "GEP engine" (that plan — computeWeldPlan, offset-ordered LLVM structs, byte-blobs — was explored and DROPPED). The locked design: a welded Name :: struct abi(.zig) extern compiler { … } is a bodied header declaring fields in the compiler type's MEMORY order; the compiler reflects the bound Zig type (@typeInfo names + @offsetOf offsets + @sizeOf, nothing maintained by hand) and VALIDATES the header matches, with loud diagnostics. On pass it's an ordinary byte-identical struct — so @ptrCast to the compiler's own type + deref just works; no index tables, no reorder, no special emit.

Next: Phase 2 continues — re-express type_info/define (struct) as sx over welded register_struct/find_type (host-call bridge, Phase 2.5/2.6); see ## Next step. Read order: this file → src/ir/compiler_lib.zig (registry + reflection) → src/ir/lower/nominal.zig validateWeldedStruct. Build/verify: zig build && zig build test.

⚠ Snapshot workflow: use -Dname=examples/NNNN-foo.sx[,…] -Dupdate-goldens to regenerate ONLY the named example(s) — a full -Dupdate-goldens re-runs all ~690 and a flaky/host-divergent example (AOT/cross-arch) can clobber good snapshots. See CLAUDE.md → Snapshot integrity.

Last completed step

Phase 2 — welded structs by reflection + memory-order validation (byte-identical, no GEP engine). A welded struct abi(.zig) extern compiler { … } now works end-to-end as a byte-identical mirror of the bound Zig type.

Design (locked, supersedes the byte-layout-override plan):

  • The sx header declares fields in the compiler type's MEMORY order. The compiler REFLECTS the bound Zig type — field names from @typeInfo, offsets from @offsetOf, size from @sizeOf — and validates the header matches. Nothing is maintained by hand; a types.zig change re-reflects on the next compiler build.
  • On pass it's an ORDINARY struct whose natural layout already equals the Zig layout → @ptrCast to the compiler type + deref is byte-identical. No byte-blob, no index/remap tables, no reorder, no special LLVM path.
  • Loud, precise diagnostics on any drift: field not found (+ memory order), wrong field order at position N (+ expected memory order), type layout mismatch (field size), layout mismatch (total size / count).
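The reflection step described above can be sketched in a few lines of Zig. This is a minimal illustration under stated assumptions, not the code in compiler_lib.zig: `FieldLayout` follows the shape named later in this checkpoint, while `reflectMemoryOrder` and `Sample` are hypothetical names introduced only for the example. It reads field names through `std.meta.fields`, offsets through `@offsetOf`, sizes through `@sizeOf`, and then sorts by ascending offset so the result is in memory order rather than source order.

```zig
const std = @import("std");

// Shape taken from the checkpoint's BoundType description.
const FieldLayout = struct { name: []const u8, offset: usize, size: usize };

// Hypothetical helper: reflect T's fields and return them in memory
// (ascending byte offset) order. Nothing is listed by hand.
fn reflectMemoryOrder(comptime T: type) [std.meta.fields(T).len]FieldLayout {
    const fields = std.meta.fields(T);
    var out: [fields.len]FieldLayout = undefined;
    inline for (fields, 0..) |f, i| {
        out[i] = .{
            .name = f.name,
            .offset = @offsetOf(T, f.name),
            .size = @sizeOf(f.type),
        };
    }
    // Insertion sort by offset; field counts are tiny.
    var i: usize = 1;
    while (i < out.len) : (i += 1) {
        var j = i;
        while (j > 0 and out[j - 1].offset > out[j].offset) : (j -= 1) {
            const tmp = out[j - 1];
            out[j - 1] = out[j];
            out[j] = tmp;
        }
    }
    return out;
}

test "reflected layout is in ascending offset order" {
    // Illustrative type only; Zig is free to reorder its fields.
    const Sample = struct { flag: bool, items: []const u32, id: u32 };
    const layout = comptime reflectMemoryOrder(Sample);
    var k: usize = 1;
    while (k < layout.len) : (k += 1) {
        try std.testing.expect(layout[k - 1].offset < layout[k].offset);
    }
}
```

Because the order comes from `@offsetOf` rather than from the declaration, a reordering by the Zig compiler (as measured for StructInfo) shows up in the reflected list automatically, which is what lets the sx header be checked instead of trusted.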

What changed from the dropped plan:

  • compiler_lib.zig: weldStruct now REFLECTS field names (@typeInfo) and bakes bound_types fields in ascending-OFFSET (memory) order — no hand-listed names. Deleted computeWeldPlan/WeldPlan/WeldElement. validateStructLayout checks the sx header against the memory-ordered registry.
  • nominal.zig validateWeldedStruct: renders the precise diagnostics (+ weldedFieldOrderStr).
  • Examples: 0627 (StructInfo in memory order, byte-identical, usable); 1186 (source-order StructInfo → wrong-field-order diagnostic). 1183 message refreshed.
  • zig build + zig build test green (692 corpus, unit tests pass).

Earlier — Phase 2.1 (weld-plan layout math, now removed)

The weld-plan offset math + StructInfo registered. Was the core of the byte-layout-override engine; superseded by the reflection+validation design above.

Decision (locked 2026-06-17): full byte-layout weld — a welded sx struct is laid out byte-identically to the bound Zig type (Zig's @offsetOf, reordering + padding included), so it passes to a Zig handler as raw memory with zero marshalling. (The alternative — handlers reading interp Value aggregates logically, no layout override — was rejected; welded types must also be usable as runtime data, and the design wants the literal byte weld.)

  • Measured: Zig reorders StructInfo to fields@0, name@16, nominal_id@20, is_protocol@24, size 32 — vs sx-natural name@0, fields@8, … So the override is genuinely required (Field's two-u32 natural layout was the easy case).
  • compiler_lib.zig: registered StructInfo (weldStruct, the second bound_types entry). Added WeldElement / WeldPlan + computeWeldPlan(alloc, fields, total) — pure: orders fields by ascending byte offset, inserts padding elements for gaps + the alignment tail, and builds the sx-field → LLVM-element remap. This is what the LLVM type builder + struct-GEP sites will consume.
  • Unit-tested (compiler_lib.test.zig): Field → identity plan (2 elems, no pad); StructInfo → 5 elems [fields@0, name@16, nominal_id@20, is_protocol@24, pad@25..32], remap [1,0,3,2].
  • zig build + zig build test green.

Earlier — Phase 1 polish (comptime-only enforcement)

A RUNTIME call to a fn abi(.zig) extern compiler is a clean build-gating error instead of an undefined-symbol link failure.

  • emitCall (src/backend/llvm/ops.zig): when the callee is compiler_welded AND the ENCLOSING function is not is_comptime (i.e. genuine runtime code, not a #run/:: initializer wrapper whose LLVM body is dead), print a clear "comptime-only … cannot be called at runtime" error and set comptime_failed (the driver halts before object/JIT emission). The enclosing is_comptime guard is what keeps the legitimate #run use (example 0626) green.
  • Corpus: examples/1185-diagnostics-weld-fn-runtime-call.sx (runtime intern(…) → clean error, exit 1, no link failure).
  • zig build + zig build test green (458 unit + 690 corpus).
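The gate itself is a small predicate. The sketch below is an assumption-laden model, not the code in src/backend/llvm/ops.zig: `Function` is a trimmed stand-in carrying only the two flags named above, and `Emitter` / `allowCall` are hypothetical names. It shows why the enclosing-function check keeps the `#run` case legal while rejecting genuine runtime calls.

```zig
const std = @import("std");

// Hypothetical, trimmed stand-in for the IR Function.
const Function = struct {
    name: []const u8,
    compiler_welded: bool = false,
    is_comptime: bool = false,
};

// Hypothetical stand-in for emitter state. The checkpoint says the
// driver halts before object/JIT emission once comptime_failed is set.
const Emitter = struct {
    comptime_failed: bool = false,

    // Returns true when the call may be emitted. A welded callee is only
    // legal when the enclosing function is itself comptime.
    fn allowCall(self: *Emitter, callee: Function, enclosing: Function) bool {
        if (callee.compiler_welded and !enclosing.is_comptime) {
            // The real gate also prints the "comptime-only" diagnostic here.
            self.comptime_failed = true;
            return false;
        }
        return true;
    }
};

test "welded call is allowed under #run and rejected at runtime" {
    var em = Emitter{};
    const intern = Function{ .name = "intern", .compiler_welded = true };
    const run_wrapper = Function{ .name = "run_wrapper", .is_comptime = true };
    const main_fn = Function{ .name = "main" };

    try std.testing.expect(em.allowCall(intern, run_wrapper));
    try std.testing.expect(!em.comptime_failed);

    try std.testing.expect(!em.allowCall(intern, main_fn));
    try std.testing.expect(em.comptime_failed);
}
```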

Earlier — fifth sub-step (host-call bridge)

A fn abi(.zig) extern compiler dispatches, under the comptime interpreter, to its registered Zig handler instead of dlsym.

  • compiler_lib.zig: function registry — BoundFn { sx_name, handler }, bound_fns = intern(string)->StringId + text_of(StringId)->string (the string-pool round-trip), findFn, and FnHandler (*Interpreter, []Value -> Value). intern mutates via interp.mint orelse @constCast(&module.types) (the same mutable-table access the metatype mint path uses); text_of reads the const pool. Imports interp.zig (the compiler_hooks↔interp cycle pattern).
  • IR Function gained compiler_welded: bool. declareFunction (src/ir/lower/decl.zig) sets it via weldedCompilerFn, which also VALIDATES: the bound lib must be compiler and the name must be on the function-export list — else a build-gating .err (no silent fall-through to dlsym).
  • interp.call(): before the dlsym/extern path, a compiler_welded function routes to compiler_lib.findFn(name).handler(self, args) (clean bail off the export list).
  • Corpus: examples/0626-comptime-weld-fn-intern-text-of.sx (#run text_of(intern("hello, compiler")) folds to a string constant → prints it); examples/1184-diagnostics-weld-fn-unexported.sx (unexported welded-fn name → build error). findFn lookup unit-tested.
  • Runtime-call rejection is NOT yet clean — welded fns are comptime-only; a RUNTIME call would emit a reference to a non-existent extern symbol → a loud LINK error (not silent, but not a tidy diagnostic). The examples call welded fns only inside #run. A dedicated "comptime-only symbol" emit diagnostic is the immediate follow-up.
  • zig build + zig build test green (458 unit tests + 689 corpus).
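The registry-plus-dispatch shape described in this sub-step can be modelled roughly as follows. Everything here is a simplified assumption: `Value` and `Interpreter` are toy stand-ins (the real ones live in interp.zig), the string pool is a fixed array rather than the module's type table, and the handler bodies are illustrative. Only the names `BoundFn`, `FnHandler`, `bound_fns`, `findFn`, `intern` and `text_of` come from the checkpoint.

```zig
const std = @import("std");

// Toy stand-ins for the interpreter's types (assumptions).
const Value = union(enum) { int: i64, string: []const u8 };
const Interpreter = struct {
    pool: [16][]const u8 = undefined,
    len: usize = 0,
};

const FnHandler = *const fn (*Interpreter, []const Value) Value;
const BoundFn = struct { sx_name: []const u8, handler: FnHandler };

// intern(string) -> StringId, modelled as an index into the toy pool.
fn internHandler(interp: *Interpreter, args: []const Value) Value {
    const text = args[0].string;
    for (interp.pool[0..interp.len], 0..) |s, i| {
        if (std.mem.eql(u8, s, text)) return Value{ .int = @intCast(i) };
    }
    std.debug.assert(interp.len < interp.pool.len);
    interp.pool[interp.len] = text;
    interp.len += 1;
    return Value{ .int = @intCast(interp.len - 1) };
}

// text_of(StringId) -> string, the read side of the round-trip.
fn textOfHandler(interp: *Interpreter, args: []const Value) Value {
    const idx: usize = @intCast(args[0].int);
    return Value{ .string = interp.pool[idx] };
}

const bound_fns = [_]BoundFn{
    .{ .sx_name = "intern", .handler = &internHandler },
    .{ .sx_name = "text_of", .handler = &textOfHandler },
};

// Returns null for names off the export list; the caller reports the error.
fn findFn(name: []const u8) ?*const BoundFn {
    for (&bound_fns) |*f| {
        if (std.mem.eql(u8, f.sx_name, name)) return f;
    }
    return null;
}

test "intern then text_of round-trips through the registry" {
    var interp = Interpreter{};
    const id = findFn("intern").?.handler(&interp, &[_]Value{.{ .string = "hello, compiler" }});
    const back = findFn("text_of").?.handler(&interp, &[_]Value{id});
    try std.testing.expectEqualStrings("hello, compiler", back.string);
}
```

Returning an optional from `findFn` keeps the "no silent fall-through to dlsym" property visible at the call site: the interpreter has to decide what an unexported name means.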

Earlier — fourth sub-step (welded-struct layout validation)

A struct abi(.zig) extern compiler { … } is validated against the binding registry as a header checked against the implementation.

  • compiler_lib.zig: validateStructLayout(bt, sx_fields, total) — pure, returns the first LayoutMismatch (field count / name / size / total) or null. Plus lib_name = "compiler" and SxField. Unit-tested (faithful Field passes; each drift flagged as the right variant).
  • registerStructDecl (src/ir/lower/nominal.zig): for sd.abi == .zig, validateWeldedStruct checks the bound lib is compiler, the name is on the export list (findType), and the sx layout (field names + typeSizeBytes + total) matches the welded type — emitting a build-gating .err (good span into the struct body) on any failure. No silent reinterpretation.
  • #library "compiler" is the comptime-only internal surface, NOT a dylib — src/main.zig's dlopen walker skips it (was emitting a spurious libcompiler.so load warning).
  • Corpus: examples/0625-comptime-weld-struct-field.sx (faithful Field welds, validates, usable as data → name=7 ty=3); examples/1183-diagnostics-weld-struct-field-count.sx (one-field Field → build-gating field-count diagnostic).
  • Offset-override / GEP emission for non-natural Zig layouts is NOT here — it isn't exercised by Field (two u32s = natural layout coincides with the weld). It arrives with StructInfo in Phase 2 (slices/reordering), where the bound offsets actually differ from the sx-natural ones. The validation already checks per-field size + total, so a layout drift is caught even before the override engine exists.
  • zig build + zig build test green (456 unit tests + 687 corpus).
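A pure "first mismatch or null" validator of the kind described in this sub-step might look like the sketch below. It is an illustration under assumptions: the type names `BoundType`, `FieldLayout`, `SxField` and `LayoutMismatch` appear in the checkpoint, but the variant names, payloads, and the function name `validateLayoutSketch` are hypothetical and are not claimed to match the real validateStructLayout.

```zig
const std = @import("std");

const FieldLayout = struct { name: []const u8, offset: usize, size: usize };
const BoundType = struct { sx_name: []const u8, size: usize, fields: []const FieldLayout };
const SxField = struct { name: []const u8, size: usize };

// Hypothetical variant set covering count / name / size / total drift.
const LayoutMismatch = union(enum) {
    field_count: struct { expected: usize, got: usize },
    field_name: struct { index: usize, expected: []const u8 },
    field_size: struct { index: usize, expected: usize, got: usize },
    total_size: struct { expected: usize, got: usize },
};

// Pure check: compares the sx header against the registry entry and
// returns the first drift found, or null when the header is faithful.
fn validateLayoutSketch(bt: BoundType, sx_fields: []const SxField, total: usize) ?LayoutMismatch {
    if (sx_fields.len != bt.fields.len) {
        return LayoutMismatch{ .field_count = .{ .expected = bt.fields.len, .got = sx_fields.len } };
    }
    for (bt.fields, sx_fields, 0..) |want, got, i| {
        if (!std.mem.eql(u8, want.name, got.name)) {
            return LayoutMismatch{ .field_name = .{ .index = i, .expected = want.name } };
        }
        if (want.size != got.size) {
            return LayoutMismatch{ .field_size = .{ .index = i, .expected = want.size, .got = got.size } };
        }
    }
    if (total != bt.size) {
        return LayoutMismatch{ .total_size = .{ .expected = bt.size, .got = total } };
    }
    return null;
}

test "faithful header passes, one-field header is a count mismatch" {
    const layouts = [_]FieldLayout{
        .{ .name = "name", .offset = 0, .size = 4 },
        .{ .name = "ty", .offset = 4, .size = 4 },
    };
    const bt = BoundType{ .sx_name = "Field", .size = 8, .fields = &layouts };

    const faithful = [_]SxField{ .{ .name = "name", .size = 4 }, .{ .name = "ty", .size = 4 } };
    try std.testing.expect(validateLayoutSketch(bt, &faithful, 8) == null);

    const short = [_]SxField{.{ .name = "name", .size = 4 }};
    try std.testing.expect(validateLayoutSketch(bt, &short, 4).? == .field_count);
}
```

Keeping the check pure (no diagnostics, no allocation) is what makes it unit-testable in isolation; rendering the message is left to the caller, which in this stream is validateWeldedStruct.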

Earlier — third sub-step (binding registry)

The binding registry (welded-type lookup, layout baked from the real Zig type).

  • New src/ir/compiler_lib.zig — the compiler library's binding registry, the curated safety boundary. BoundType { sx_name, size, alignment, fields: []FieldLayout{name, offset, size} }; weldStruct bakes the layout from a real Zig struct via @sizeOf/@alignOf/@offsetOf at compiler-build time (a sx-field-count mismatch is a @compileError, never a silent truncation). bound_types exports Field (welded to types.TypeInfo.StructInfo.Field — two u32s); findType(sx_name) ?*const BoundType is the lookup the welded-decl resolution path will consult (returns null off the export list — clean boundary, no silent default).
  • Registered in the barrel (src/ir/ir.zig): compiler_lib + compiler_lib_tests.
  • Tests (src/ir/compiler_lib.test.zig): findType("Field") equals the real StructInfo.Field @sizeOf/@alignOf/@offsetOf (8 bytes, two u32s at 0/4); an unexported name returns null. Break-verified (a wrong size → suite red, named ir.compiler_lib.test...).
  • zig build + zig build test green (454 unit tests).

Earlier — second sub-step (struct-decl parse)

abi(.zig) extern <lib> PARSES on a STRUCT decl (parse-only, no semantics).

  • ast.StructDecl gained abi: ABI + extern_lib: ?[]const u8 binding fields.
  • parseStructDecl (src/parser.zig): after struct (and the #compiler check), parse an optional abi(...) then optional extern <lib> — same slot order as fn decls — and thread them onto the node. Ordinary structs are unperturbed (parseOptionalAbi/parseOptionalExternExport no-op when absent).
  • Parser unit tests (src/parser.test.zig): Field :: struct abi(.zig) extern compiler { name: StringId; ty: Type; } parses with abi == .zig, extern_lib == "compiler", field list intact; a plain struct leaves abi == .default / extern_lib == null. Break-verified (a wrong-sentinel assert turns the suite red, confirming the test runs).
  • zig build + zig build test green.

Earlier — first sub-step (fn decls) + the syntax pivot

abi(.zig) extern <lib> PARSES on a fn decl (parse-only). Plus the syntax pivot it required.

Syntax decision (locked 2026-06-17, supersedes the doc's original extern(.zig) <lib> single-qualifier form): the ABI/layout selector and the linkage keyword are two orthogonal annotations.

  • abi(.x) — ABI / calling-convention annotation in the slot before extern/export. Unified replacement for callconv(...), which is removed. ABI = { default, c, zig, pure }: .c (C ABI), .zig (Zig-layout weld → the compiler library), .pure (naked asm), .default (unannotated). Can appear standalone (no extern) on any fn / fn-type / lambda.
  • extern <lib> — linkage keyword + binding source (named library).

So a welded binding is text_of :: (id: StringId) -> string abi(.zig) extern compiler;.

What landed:

  • AST (src/ast.zig): CallingConvention → ABI { default, c, zig, pure }; the call_conv field → abi: ABI on FnDecl / Lambda / FunctionTypeExpr.
  • Lexer/token (src/token.zig, src/lexer.zig): kw_callconv → kw_abi, keyword string "callconv" → "abi".
  • Parser (src/parser.zig): parseOptionalCallConv → parseOptionalAbi (parses abi(.c|.zig|.pure)); wired in the fn-decl postfix slot (before extern/export), the function-type-expr slot, and the lambda slot; isFunctionDef/hasFnBodyAfterArrow recognise kw_abi.
  • AST→IR map (src/ir/type_resolver.zig, src/ir/lower/decl.zig, sema.zig, closure.zig): the AST .abi == .c reads kept their C-ABI meaning; the function-type resolver maps .zig/.pure → IR .default (no fn-pointer-type CC for those decl-level ABIs; neither occurs in a function-TYPE position yet).
  • CC-mismatch diagnostic (src/ir/lower/expr.zig, src/sema.zig): the user-facing text callconv(.c) → abi(.c).
  • sx migration: 52 .sx files callconv( → abi( (all were function-type callback annotations; none were in the fn-decl postfix slot, so no reordering).
  • Docs: readme.md, specs.md, the design doc, snapshots (0114 / 1104 / 1200) regenerated for the rename.
  • Tests: parser unit tests in src/parser.test.zig: abi(.zig) extern <lib> on a fn decl (asserts abi == .zig, extern_export == .extern_, extern_lib == "compiler"); bare extern leaves abi == .default; standalone abi(.c) / abi(.pure). Lexer/sema tests updated.

zig build + zig build test green (450/450 unit + 685 corpus).

Current state

  • compiler :: #library "compiler"; parses + is recognised as the comptime-only internal surface (never dlopen'd).
  • abi(.zig) extern compiler STRUCTS: layout-validated against the registry (faithful → ok; drift → build-gating diagnostic). Field welds + usable.
  • abi(.zig) extern compiler FUNCTIONS: dispatched under the comptime interp to their registered Zig handler (intern/text_of round-trip works); unexported names rejected at declaration. Comptime-only.
  • A RUNTIME call to a welded fn is a clean build-gating error (comptime-only enforcement at emitCall); the legitimate #run/:: use stays green.
  • The whole Phase 1 foundation (parse → registry → struct-layout validation → function host-call bridge → comptime-only enforcement) is in place for the two-u32 Field case + the two string readers.
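  • No longer deferred: the offset-override / LLVM byte-offset GEP for non-natural layouts (once planned for StructInfo's slice field in Phase 2) was dropped. StructInfo now welds through a memory-order header plus reflection-based validation (example 0627).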

Next step — Phase 2: welded compiler FUNCTIONS over the real types

Welded structs are byte-identical mirrors now, so the API surface can grow:

  • Bind register_struct / find_type over the host-call bridge (compiler_lib.zig bound_fns, like intern/text_of). register_struct takes a welded StructInfo and mints a real TypeId (guarded: dup field names, kind well-formedness — the checks define does today). Because the welded StructInfo is byte-identical, the handler can read it as the real Zig *StructInfo (cast + deref) rather than marshalling a Value field-by-field — the payoff of the byte-weld. find_type(StringId) -> ?Type reads the table. Prove: build a struct programmatically + round-trip a source one.
  • Re-express type_info/define (struct) as sx over register_struct/find_type; migrate examples/0622; delete the bespoke struct interp arms (defineStruct / the reflectTypeInfo struct path).

Then Phase 3+: widen the welded types to EnumInfo/TaggedUnionInfo/TupleInfo (optional fields → sentinels) — each just needs an sx header in the compiler type's memory order + the matching register_* fn. Finally migrate BuildOptions to abi(.zig) extern compiler (re-home the #compiler registry) and delete #compiler.

Note: a welded struct with an ?T / union(enum) field (e.g. EnumInfo's backing_type: ?TypeId, explicit_values: ?[]const i64) is the next layout wrinkle — the sx header must mirror Zig's optional/union representation. Handle when reached (sentinels or accessor fns; see the design doc Risks).

Known issues

  • None for this stream. (Metatype's deferred enhancement is issue 0141 — comptime List growth; orthogonal, see current/CHECKPOINT-METATYPE.md.)

Log

  • Phase 2 — welded structs by reflection + memory-order validation. Dropped the byte-layout-override engine (computeWeldPlan / offset-ordered LLVM struct / byte-blob — all explored, all unnecessary). Instead: the sx header declares fields in the compiler type's memory order; the compiler reflects the bound Zig type (@typeInfo/@offsetOf/@sizeOf) and validates the header matches with loud diagnostics (field-not-found, wrong-order+expected-order, size mismatch). On pass it's an ordinary byte-identical struct — cast + deref just works. Examples 0627 (usable) / 1186 (wrong-order diagnostic). Suite green (692).
  • Phase 2.1 — weld-plan layout math (REMOVED). The byte-layout-override math; superseded by the reflection+validation design and deleted.
  • Phase 1 polish — comptime-only enforcement. A runtime call to a welded fn is a clean build-gating error (emitCall gate, guarded by enclosing-is_comptime so #run/:: uses stay green), not a link failure. Example 1185. Build + suite green (458 unit, 690 corpus).
  • Phase 1.1 fifth sub-step: host-call bridge (welded functions). compiler_lib function registry (intern/text_of) + findFn; IR Function compiler_welded flag set/validated in declareFunction (weldedCompilerFn); interp.call() dispatches welded calls to the Zig handler. Examples 0626 (round-trip) + 1184 (unexported-fn diagnostic); findFn unit-tested. Runtime-call clean rejection deferred (loud link error today). Build + suite green (458 unit, 689 corpus).
  • Phase 1.1 fourth sub-step — welded-struct layout validation. validateStructLayout (pure, unit-tested) + validateWeldedStruct wired into registerStructDecl: a struct abi(.zig) extern compiler is validated against the registry (lib == compiler, name exported, layout matches) with build-gating diagnostics. #library "compiler" no longer dlopen'd. Examples 0625 (faithful Field) + 1183 (field-count mismatch diagnostic). Offset-override/GEP deferred to Phase 2 (not exercised by Field's natural layout). Build + suite green (456 unit, 687 corpus).
  • Phase 1.1 third sub-step — binding registry. New src/ir/compiler_lib.zig: the compiler lib's welded-type registry; Field welded to StructInfo.Field with layout baked from the real Zig type (@offsetOf/@sizeOf/@alignOf); findType lookup proven by unit test (+ null off the export list). Standalone island — not yet consumed by lowering. Build + suite green (454 unit tests). Break-verified.
  • Phase 1.1 second sub-step — struct-decl binding parses. ast.StructDecl gained abi + extern_lib; parseStructDecl parses abi(.zig) extern <lib> after struct. Parser unit tests (welded Field + plain struct), break-verified. Build + suite green. Parse-only sub-step (fns + structs) of Phase 1.1 complete.
  • Phase 1.1 first sub-step + callconv → abi unification. Parsed abi(.zig) extern <lib> on fn decls; unified callconv into abi(.c|.zig|.pure) (removed the callconv keyword), migrated 52 sx files + compiler diagnostics + docs + snapshots. Build + suite green. The original design's extern(.zig) single qualifier was split into abi(.zig) (ABI/layout, before extern) + extern <lib> (linkage + source), as recorded in the design doc's syntax-decision note.