From 08b0a35758dfd6ac5b2bb2fa49fe4bee84f9f74b Mon Sep 17 00:00:00 2001
From: agra
Date: Wed, 17 Jun 2026 09:38:00 +0300
Subject: [PATCH] =?UTF-8?q?design:=20comptime=20compiler=20API=20=E2=80=94?=
 =?UTF-8?q?=20#library=20"compiler"=20+=20extern(.zig)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Unified sx<->compiler binding that subsumes the metatype declare/define
primitives AND the #compiler struct attribute. A named 'compiler' library
exposes the compiler's real types (layout-welded via extern(.zig), offsets
queried from the Zig type at compiler-build time + a build-time equality
assertion) and functions (comptime-only, host-call bridged). declare/
define/type_info become sx library code over register_*/find_type; the
projected meta.sx TypeInfo + hand marshaling are deleted; BuildOptions
migrates onto it and #compiler is removed. Includes the safety boundary
(curated export list, guarded mutators, comptime-only), the honest limit
(the ordering law stays, but stops leaking as 'weird stages' — dissolving
the 0141 class), a phased suite-green build order, and the open risks
(union(enum) welding, optional fields, LLVM offset emission).
---
 design/comptime-compiler-api.md | 183 ++++++++++++++++++++++++++++++++
 1 file changed, 183 insertions(+)
 create mode 100644 design/comptime-compiler-api.md

diff --git a/design/comptime-compiler-api.md b/design/comptime-compiler-api.md
new file mode 100644
index 00000000..c93fb648
--- /dev/null
+++ b/design/comptime-compiler-api.md
+# Comptime Compiler API — `#library "compiler"` + `extern(.zig)`
+
+> **Status: design-of-record (not yet an active stream).** Captures a unified
+> mechanism for sx↔compiler binding that subsumes the metatype `declare`/`define`
+> primitives AND the `#compiler` struct attribute, and exposes the compiler's own
+> type-table API to comptime sx. Supersedes the bespoke `meta.sx` `TypeInfo`
+> projection (the "weld it" decision). Co-designed in conversation 2026-06-17.
+
+## Motivation
+
+Today the compiler↔sx boundary is **two ad-hoc mechanisms**:
+
+- `#compiler` structs (`BuildOptions`) — sx struct whose methods are compiler hooks
+  (registered in `compiler_hooks.zig`). A handle to compiler state, method-bound.
+- The metatype `declare`/`define`/`type_info` `#builtin`s — comptime sx reaching
+  into the type table through a narrow, fixed keyhole, with a *separate, translated*
+  `TypeInfo` data model in `meta.sx` (marshalled by hand in `interp.zig`).
+
+Both are the SAME idea — comptime sx interacting with the compiler — implemented
+twice, differently. And the metatype path carries real costs: a projected data
+model that drifts from `types.zig`, hand-written marshaling, and the staging
+fragility of issue 0141 (constructor bodies lowered at `scanDecls` in a half-built
+world → wrong IR).
+
+**This unifies them.** One mechanism: a named `compiler` library that exposes a
+curated set of the compiler's real types (welded by layout) and functions
+(host-call bridged), reachable from comptime sx. `declare`/`define`/`type_info`
+become sx library code over the real API; `#compiler` is deleted; `BuildOptions`
+migrates onto it.
+
+## The mechanism
+
+### `#library "compiler"`
+
+```sx
+compiler :: #library "compiler";
+```
+
+A named binding target that resolves NOT to a `.dylib` but to the compiler's own
+internal surface (Zig types + functions). Two defining properties:
+
+- **It IS the safety boundary.** The `compiler` library exports exactly the
+  curated set of types + functions the compiler chooses to expose. Anything not on
+  that export list is unreachable from user comptime code — the boundary is the
+  lib's symbol table, not a convention.
+- **It is comptime-only.** The compiler isn't present at runtime, so every function
+  from `compiler` resolves only under the comptime interpreter; calling one at
+  runtime is a clean "comptime-only symbol" error, falling out of the existing
+  `is_comptime` boundary. (Welded *types* are still usable as plain runtime data;
+  only the *functions* are comptime-gated.)
+
+### `extern(.zig) <lib>` — postfix attribute
+
+Slots where `#builtin` / `#compiler` go (postfix, after the return type for fns,
+after `struct` for types), with the library handle following:
+
+```sx
+// functions:
+text_of :: (id: StringId) -> string extern(.zig) compiler;
+intern :: (s: string) -> StringId extern(.zig) compiler;
+register_type :: (info: StructInfo) -> Type extern(.zig) compiler;
+find_type :: (name: StringId) -> ?Type extern(.zig) compiler;
+
+// types (layout-welded to the lib's real Zig type):
+Field :: struct extern(.zig) compiler { name: StringId; ty: Type; };
+StructInfo :: struct extern(.zig) compiler {
+    name: StringId; fields: []Field; is_protocol: bool; nominal_id: u32;
+};
+```
+
+`extern(.zig)` = "Zig ABI / Zig layout"; `<lib>` = the binding source.
+
+### Layout welding — why it's exact, not brittle
+
+The sx compiler is itself a Zig program; `types.zig` is part of it. So at
+**compiler-build time** the real record's layout is available via
+`@offsetOf` / `@sizeOf` / `@alignOf`. An `extern(.zig) compiler` struct is laid out
+to the bound Zig type's EXACT offsets (queried, not guessed), and the compiler
+ASSERTS the sx declaration matches the Zig type byte-for-byte (a mismatch is a
+build error — the sx side is a header checked against the implementation). Because
+the same compiler builds both, they're guaranteed identical, and a `types.zig`
+change re-bakes the offsets on the next build — both sides move together.
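A minimal Zig sketch of the build-time check described above, under stated assumptions. `ZigField` is a hypothetical stand-in for the real record in `types.zig`, and `SxFieldDecl`, `BakedField`, and `bakeLayout` are illustrative names, not the compiler's actual API. The sketch assumes Zig 0.11 or later syntax. Offsets are queried with `@offsetOf` rather than written by hand, and any disagreement between the declared fields and the Zig type surfaces as a compile error.

```zig
const std = @import("std");

// Hypothetical stand-in for the compiler's real `Field` record in types.zig.
const ZigField = struct {
    name: u32, // StringId handle
    ty: u32, // TypeId handle
};

// What an sx-side `struct extern(.zig) compiler { ... }` declaration claims:
// field names and sizes only. Offsets are never written by hand.
const SxFieldDecl = struct { name: []const u8, size: usize };

const sx_field_decl = [_]SxFieldDecl{
    .{ .name = "name", .size = 4 },
    .{ .name = "ty", .size = 4 },
};

// One baked layout entry, with the offset queried from the Zig type.
const BakedField = struct { name: []const u8, offset: usize, size: usize };

// Checks the declaration against T and returns the queried layout.
// Every mismatch is reported through @compileError, i.e. at build time.
fn bakeLayout(comptime T: type, comptime decl: []const SxFieldDecl) [decl.len]BakedField {
    if (std.meta.fields(T).len != decl.len)
        @compileError("field count mismatch for " ++ @typeName(T));
    var out: [decl.len]BakedField = undefined;
    inline for (decl, 0..) |d, i| {
        if (!@hasField(T, d.name))
            @compileError("no such field in Zig type: " ++ d.name);
        const FieldT = @TypeOf(@field(@as(T, undefined), d.name));
        if (@sizeOf(FieldT) != d.size)
            @compileError("size mismatch on field: " ++ d.name);
        out[i] = .{ .name = d.name, .offset = @offsetOf(T, d.name), .size = d.size };
    }
    return out;
}

// Container-level initializers are evaluated at compile time.
const field_layout = bakeLayout(ZigField, &sx_field_decl);

pub fn main() void {
    for (field_layout) |f| {
        std.debug.print("{s}: offset {d}, size {d}\n", .{ f.name, f.offset, f.size });
    }
}
```

Changing a size in `sx_field_decl`, or renaming a field in `ZigField`, should make the build fail instead of producing a silently skewed layout, which is the property the welding scheme relies on.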
+
+Layout welding does what C-ABI `extern` can't: it copies Zig's REAL layout, so
+Zig slices (`{ptr,len}`), field reordering, and `union(enum)` tag placement all
+"just work" — no slice→ptr+len surgery on `types.zig`, no version fragility.
+
+### Host-call bridge (functions)
+
+`compiler` functions dispatch, under the comptime interp, to the registered
+internal Zig function — the generalization of the path that already exists
+(`host_ffi.zig` resolves comptime `extern "c"` via dlsym; `compiler_hooks.zig`
+registers `#compiler` method hooks). The `compiler` lib's registry maps each
+exported sx name → its Zig function + welded signature.
+
+## The exposed surface (curated)
+
+Types (welded): `StringId` (u32 handle), `Type` (≡ `TypeId`, u32), `Field`,
+`StructInfo`, `EnumInfo`, `TaggedUnionInfo`, `TupleInfo`, and a kind-tagged
+`TypeInfo` view (see Risks — the `union(enum)` is the one harder shape).
+
+Functions (comptime-only): `intern(string)->StringId`, `text_of(StringId)->string`,
+`find_type(StringId)->?Type`, guarded mutators
+`register_struct/register_enum/register_tuple(info)->Type`, and the reflection
+readers (`type_of`, field/variant iteration) over the welded records.
+
+`declare`/`define`/`type_info` collapse into thin sx over `register_*`/`find_type`
+— or disappear. The bespoke interp arms (`.declare`/`.define`/`.type_info`,
+`defineEnum`/`defineStruct`/`defineTuple`/`reflectTypeInfo`) are deleted.
+
+## What it buys (and the one honest limit)
+
+Dissolves: the bespoke `declare`/`define` surface, the projected `TypeInfo` model,
+the hand-marshaling, the `#compiler` duplication, and the **0141 class of bugs** —
+registration becomes a direct, guarded API call, not "evaluate an sx stdlib body
+(List/append) at `scanDecls`," so there's no body to mis-lower at a half-built
+stage.
+
+Does NOT repeal: the **ordering law** — a type's layout must exist before code
+that uses it is lowered. That's inherent to the compiler, not machinery. The win
+is that it stops leaking as "weird exposed stages" and becomes an encapsulated
+contract inside the compiler API (the API decides how a registration slots in),
+instead of the user threading `declare`→forward-slot→`define`→eval-timing by hand.
+
+## Safety boundary
+
+- Only the `compiler` export list is reachable — no raw `*TypeTable`.
+- Mutators are **guarded** (`register_*` validate: dup field/variant names, kind
+  changes, well-formedness) — the same checks `define` does today, now at the API.
+- Comptime-only enforcement on functions; runtime use is a clean error.
+- Mirrors Zig's own discipline: comptime builds types through sanctioned doors
+  (`@Type`); it doesn't let user code scribble on the compiler's tables.
+
+## BuildOptions migration
+
+`BuildOptions :: struct #compiler { ... }` + `build_options() #compiler` →
+`extern(.zig) compiler`: the setter/getter hook-methods become `extern(.zig)
+compiler` functions (or methods on a welded/handle `BuildOptions`), backed by the
+same `BuildConfig` state. The `compiler_hooks.zig` registry becomes the `compiler`
+lib's function/type registry. Net: the build DSL and the metatype API ride one
+mechanism.
+
+## `#compiler` removal
+
+After both consumers are migrated, delete the `#compiler` attribute and its
+special paths: lexer/parser token + sema handling (`src/lexer.zig`, `src/parser.zig`,
+`src/sema.zig`, `src/token.zig`, `src/ast.zig`), and the `#compiler`-specific
+registration in `compiler_hooks.zig` (the registry stays, re-homed under `compiler`).
+sx footprint is tiny (2 lines in `library/modules/build.sx`).
+
+## Build order (each phase keeps `zig build test` green)
+
+1. **`extern(.zig)` + `#library` foundation** — lex/parse the postfix attribute and
+   `#library "compiler"`; a binding registry (sx name → Zig type/fn); the layout
+   engine honoring the bound type's `@offsetOf` offsets + LLVM emission that hits
+   them; **build-time layout-equality assertion**. Prove with `Field` (two u32s).
+2. **Weld `StructInfo`** + `StringId` accessors (`intern`/`text_of`) over the
+   host-call bridge.
+3. **Re-express `type_info`/`define` (struct)** as sx over `register_struct`/
+   `find_type`; migrate `examples/0622`; delete the struct interp arms; suite green.
+4. **Widen to enum/tuple** — weld `EnumInfo`/`TaggedUnionInfo`/`TupleInfo`
+   (optional fields → sentinels: `backing_type` `.unresolved`, `explicit_values`
+   len-0); migrate `examples/0619`/`0623`; delete the enum/tuple interp arms.
+5. **Migrate `BuildOptions`** to `extern(.zig) compiler`.
+6. **Delete `#compiler`**; suite green.
+
+## Risks / open questions
+
+- **`union(enum)` welding.** `TypeInfo` is a Zig tagged union; mirroring its tag
+  placement is the one shape harder than plain structs. Start with a `kind`-tagged
+  *view* (weld the payload structs, drive the discriminant via a `kind` accessor),
+  defer full-union welding. `type_info`/`define` mostly traffic in the payload
+  records anyway.
+- **Optional fields in welded records** (`?[]const i64`, `?TypeId`) — represent via
+  sentinels on the sx side, or expose through accessor functions rather than raw
+  fields.
+- **LLVM layout emission** for arbitrary external offsets (padding / byte-offset
+  GEPs) is the meatiest part of phase 1.
+- **Mutation safety** — the guarded-mutator surface must cover every invariant the
+  type table relies on (interning, nominal ids, forward slots).
+- **`@offsetOf` binding for nested/parameterized types** — the registry must map
+  each exported sx type to a concrete Zig type; generic Zig types need a concrete
+  instantiation to bind.
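The sentinel option for optional fields, listed among the risks above, could be sketched in Zig roughly as follows. Everything here is an assumption made for illustration: the `TypeId` definition, the value chosen for `.unresolved`, and the `EnumInfoInternal` / `EnumInfoWelded` / `toWelded` / `fromWelded` names are hypothetical and do not come from the compiler source. The sketch assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Hypothetical handle type; the real TypeId and its `.unresolved` value live
// in the compiler and may be defined differently.
const TypeId = enum(u32) {
    unresolved = std.math.maxInt(u32),
    _,
};

// Hypothetical internal record with optional fields.
const EnumInfoInternal = struct {
    backing_type: ?TypeId,
    explicit_values: ?[]const i64,
};

// Hypothetical welded view: same data, no optionals.
const EnumInfoWelded = struct {
    backing_type: TypeId, // .unresolved stands for "absent"
    explicit_values: []const i64, // len 0 stands for "absent"
};

// Internal -> welded: replace each null with its sentinel.
fn toWelded(info: EnumInfoInternal) EnumInfoWelded {
    return .{
        .backing_type = info.backing_type orelse .unresolved,
        .explicit_values = info.explicit_values orelse &[_]i64{},
    };
}

// Welded -> internal: map each sentinel back to null.
fn fromWelded(w: EnumInfoWelded) EnumInfoInternal {
    return .{
        .backing_type = if (w.backing_type == .unresolved) null else w.backing_type,
        .explicit_values = if (w.explicit_values.len == 0) null else w.explicit_values,
    };
}

pub fn main() void {
    const absent = toWelded(.{ .backing_type = null, .explicit_values = null });
    std.debug.print("absent backing_type is sentinel: {}\n", .{absent.backing_type == .unresolved});
    std.debug.print("absent explicit_values len: {d}\n", .{absent.explicit_values.len});

    const values = [_]i64{ 1, 2, 4 };
    const present = toWelded(.{
        .backing_type = @as(TypeId, @enumFromInt(7)),
        .explicit_values = &values,
    });
    const back = fromWelded(present);
    std.debug.print("round-trip keeps values: {}\n", .{back.explicit_values.?.len == 3});
}
```

One design consequence worth noting: in this scheme an explicitly empty value list and an absent one both map to length 0, so the round trip is only lossless if the internal record never distinguishes the two. If it does, the accessor-function alternative from the same risk item would be the safer choice.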