comptime compiler-API: Phase 1 foundation + Phase 2.1 weld plan

Introduce the welded comptime `compiler` library (`#library "compiler"` +
`abi(.zig) extern compiler`), per design/comptime-compiler-api.md, and unify
`callconv(...)` into the new `abi(...)` annotation.

abi(...) replaces callconv(...):
- New ABI enum { default, c, zig, pure }; `abi(.c|.zig|.pure)` parses in the
  postfix slot before extern/export (and standalone). `kw_callconv` -> `kw_abi`.
- Migrated 52 sx files, the call-convention-mismatch diagnostic, and docs
  (readme/specs) from `callconv(.c)` to `abi(.c)`.

Phase 1 — welded compiler library (parse -> registry -> validation -> bridge):
- `abi(.zig) extern compiler` parses on fn decls (carries abi/extern_lib) and
  struct decls (StructDecl.abi/extern_lib).
- `#library "compiler"` is the comptime-only internal surface — never dlopen'd.
- src/ir/compiler_lib.zig: the binding registry (the safety boundary). `Field`
  welded to StructInfo.Field with layout baked from the real Zig type
  (@offsetOf/@sizeOf); `findType`/`findFn`. Welded structs are layout-validated
  at registration (field set + total size) as a header checked against the impl.
- Host-call bridge: a `fn abi(.zig) extern compiler` dispatches under the
  comptime interp to its registered Zig handler (intern/text_of round-trip),
  never dlsym. IR Function.compiler_welded; validated in declareFunction.
- Comptime-only enforcement: a runtime call to a welded fn is a clean
  build-gating error (emitCall), not an undefined-symbol link failure.

Phase 2.1 — byte-layout weld foundation:
- Decision: full byte-layout weld (sx struct laid out byte-identically to the
  bound Zig type). Registered StructInfo (first non-natural / Zig-reordered
  layout). `computeWeldPlan` — pure offset-ordered element plan + padding +
  sx-field->LLVM-element remap; unit-tested. Emit/interp wiring is the next
  sub-step (2.2+, see current/CHECKPOINT-COMPILER-API.md).

Examples: 0625/0626 (welded struct + fn round-trip), 1183/1184/1185
(layout-mismatch, unexported-fn, runtime-call diagnostics).
agra
2026-06-17 13:31:11 +03:00
parent 3a9b508502
commit cd5b958d19
100 changed files with 1490 additions and 298 deletions

291
src/ir/compiler_lib.zig Normal file

@@ -0,0 +1,291 @@
//! The comptime `compiler` library's binding registry — the curated surface of
//! the compiler's own types (layout-welded) and functions (host-call bridged)
//! reachable from comptime sx via `abi(.zig) extern compiler`. See
//! `design/comptime-compiler-api.md`.
//!
//! **This registry IS the safety boundary.** Only the entries registered here
//! are bindable from user comptime code; anything not on the export list is
//! unreachable. A welded `Name :: struct abi(.zig) extern compiler { … }` (or a
//! welded fn) resolves its layout/dispatch against this table, not the ordinary
//! extern-lib path.
//!
//! **Layout is welded, not guessed.** Because the sx compiler is itself a Zig
//! program, the real internal type's layout is available at compiler-build time:
//! each `BoundType` bakes `@sizeOf`/`@alignOf`/`@offsetOf` from the bound Zig
//! type. A `types.zig` change re-bakes the offsets on the next build, so both
//! sides move together. The sx-side `struct abi(.zig) …` declaration is then a
//! *header* checked against this baked layout: `validateStructLayout` compares
//! field names, field sizes, and the total size.

const std = @import("std");
const types = @import("types.zig");
const interp_mod = @import("interp.zig");
const Value = interp_mod.Value;
const Interpreter = interp_mod.Interpreter;
const InterpError = interp_mod.InterpError;
const StringId = types.StringId;

/// One field of a welded type: its sx-visible name plus the byte offset + size
/// taken from the bound Zig type.
pub const FieldLayout = struct {
    name: []const u8,
    offset: usize,
    size: usize,
};

/// A type exported by the `compiler` library, welded to a real internal Zig
/// type. `size`/`alignment`/`fields` are baked from that Zig type at
/// compiler-build time (so they cannot drift from the implementation).
pub const BoundType = struct {
    /// The sx-side name a welded `struct abi(.zig) extern compiler` uses.
    sx_name: []const u8,
    size: usize,
    alignment: usize,
    fields: []const FieldLayout,
};

/// The real internal Zig type each welded export binds to. Kept as named
/// aliases so the binding sites read as a curated list.
const FieldZig = types.TypeInfo.StructInfo.Field; // { name: StringId, ty: TypeId }: two u32s
const StructInfoZig = types.TypeInfo.StructInfo; // { name, fields: []Field, is_protocol, nominal_id }, Zig-reordered

/// Bake a `BoundType` from a real Zig struct type `T`. Field offsets/sizes come
/// from `@offsetOf`/`@sizeOf` on `T`; `sx_field_names` supplies the sx-visible
/// names positionally, in `T`'s declaration order. A count mismatch is a
/// compile error, never a silent truncation; the order itself is not checked
/// here, so each binding site must list the names in declaration order.
fn weldStruct(
    comptime sx_name: []const u8,
    comptime T: type,
    comptime sx_field_names: []const []const u8,
) BoundType {
    const zig_fields = @typeInfo(T).@"struct".fields;
    if (zig_fields.len != sx_field_names.len)
        @compileError("compiler-lib weld '" ++ sx_name ++ "': sx field count != Zig field count");
    comptime var layouts: [zig_fields.len]FieldLayout = undefined;
    inline for (zig_fields, 0..) |zf, i| {
        layouts[i] = .{
            .name = sx_field_names[i],
            .offset = @offsetOf(T, zf.name),
            .size = @sizeOf(zf.type),
        };
    }
    const frozen = layouts;
    return .{
        .sx_name = sx_name,
        .size = @sizeOf(T),
        .alignment = @alignOf(T),
        .fields = &frozen,
    };
}
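
// Illustrative sketch, not part of the original change: how `weldStruct` bakes
// a layout. `Pair` is a hypothetical stand-in type; it is declared as an
// `extern struct` so that its C layout (two u32s at offsets 0 and 4, 8 bytes
// total) is fixed rather than left to Zig's field reordering.
test "weldStruct bakes offsets and sizes from the Zig type (sketch)" {
    const Pair = extern struct { a: u32, b: u32 };
    const bt = comptime weldStruct("Pair", Pair, &.{ "a", "b" });

    try std.testing.expectEqual(@as(usize, 8), bt.size);
    try std.testing.expectEqual(@as(usize, 2), bt.fields.len);
    try std.testing.expectEqualStrings("b", bt.fields[1].name);
    try std.testing.expectEqual(@as(usize, 4), bt.fields[1].offset);
    try std.testing.expectEqual(@as(usize, 4), bt.fields[1].size);
}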

/// The welded-type export list. `Field` (two u32s, natural layout) proved the
/// weld in Phase 1; `StructInfo` (Phase 2) is the first NON-natural layout:
/// Zig reorders its fields (on a 64-bit host: `fields`@0, `name`@16,
/// `nominal_id`@20, `is_protocol`@24), so it exercises the offset-override
/// engine. `EnumInfo` / `TaggedUnionInfo` / `TupleInfo` join later.
pub const bound_types = [_]BoundType{
    weldStruct("Field", FieldZig, &.{ "name", "ty" }),
    weldStruct("StructInfo", StructInfoZig, &.{ "name", "fields", "is_protocol", "nominal_id" }),
};

/// Look up a welded type by its sx name. Returns null when the name is not on
/// the `compiler` library's export list (the lookup the welded-decl resolution
/// path consults instead of the ordinary extern-lib path).
pub fn findType(sx_name: []const u8) ?*const BoundType {
    for (&bound_types) |*bt| {
        if (std.mem.eql(u8, bt.sx_name, sx_name)) return bt;
    }
    return null;
}
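
// Illustrative sketch, not part of the original change: the registry acts as
// an allow-list, so only exported names resolve. "Module" is a hypothetical
// name chosen because it is not in `bound_types`.
test "findType resolves only names on the export list (sketch)" {
    const field = findType("Field") orelse return error.TestUnexpectedResult;
    try std.testing.expectEqualStrings("Field", field.sx_name);
    // `weldStruct` enforces the count at compile time, so "name" + "ty" = 2.
    try std.testing.expectEqual(@as(usize, 2), field.fields.len);
    try std.testing.expect(findType("Module") == null);
}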

/// The name of the only welded library. A `struct abi(.zig) extern <lib>` with a
/// different `<lib>` is rejected: `compiler` is the sole comptime weld source.
pub const lib_name = "compiler";

/// One field of an sx welded-struct declaration, as the lowering observed it:
/// the field's sx name plus the size the sx type system computed for its type.
pub const SxField = struct {
    name: []const u8,
    size: usize,
};

/// The first way an sx welded-struct declaration fails to faithfully mirror the
/// bound Zig type. The sx declaration is a *header* checked against the real
/// implementation, so any drift is a build error rather than a silent
/// reinterpretation. The caller renders the chosen variant into a diagnostic.
pub const LayoutMismatch = union(enum) {
    /// The sx declaration has a different field count than the welded type.
    field_count: struct { expected: usize, got: usize },
    /// Field `index` carries the wrong sx name (a weld is positional + by-name).
    field_name: struct { index: usize, expected: []const u8, got: []const u8 },
    /// Field `index` (`name`) is a different size than the welded type's field.
    field_size: struct { index: usize, name: []const u8, expected: usize, got: usize },
    /// The total struct size differs (padding / alignment drift).
    total_size: struct { expected: usize, got: usize },
};

/// Check an sx welded-struct declaration against the bound Zig type. Returns the
/// FIRST mismatch, or null if the sx declaration is a faithful header. Fields are
/// checked positionally + by name + by size, and the total size is compared; for
/// a natural (C-like) layout this catches a missing/extra field (count), a rename
/// or reorder (name), a retype (size), and padding drift (total). Explicit
/// per-field OFFSET overrides (for non-natural Zig layouts: slices, reordered or
/// `union(enum)` fields) arrive with `StructInfo` in Phase 2; `Field`'s two-u32
/// natural layout needs none.
pub fn validateStructLayout(
    bt: *const BoundType,
    sx_fields: []const SxField,
    sx_total_size: usize,
) ?LayoutMismatch {
    if (sx_fields.len != bt.fields.len)
        return .{ .field_count = .{ .expected = bt.fields.len, .got = sx_fields.len } };
    for (sx_fields, bt.fields, 0..) |sf, bf, i| {
        if (!std.mem.eql(u8, sf.name, bf.name))
            return .{ .field_name = .{ .index = i, .expected = bf.name, .got = sf.name } };
        if (sf.size != bf.size)
            return .{ .field_size = .{ .index = i, .name = bf.name, .expected = bf.size, .got = sf.size } };
    }
    if (sx_total_size != bt.size)
        return .{ .total_size = .{ .expected = bt.size, .got = sx_total_size } };
    return null;
}
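
// Illustrative sketch, not part of the original change: which drift each check
// reports. `Pair` is a hypothetical two-u32 welded type built by hand, so the
// example does not depend on the real `types.zig` layout.
test "validateStructLayout reports the first drift (sketch)" {
    const bt = BoundType{
        .sx_name = "Pair",
        .size = 8,
        .alignment = 4,
        .fields = &.{
            .{ .name = "a", .offset = 0, .size = 4 },
            .{ .name = "b", .offset = 4, .size = 4 },
        },
    };

    // A faithful header: same names, same sizes, same total.
    const ok = [_]SxField{ .{ .name = "a", .size = 4 }, .{ .name = "b", .size = 4 } };
    try std.testing.expect(validateStructLayout(&bt, &ok, 8) == null);

    // A renamed second field is reported by position.
    const renamed = [_]SxField{ .{ .name = "a", .size = 4 }, .{ .name = "c", .size = 4 } };
    const by_name = validateStructLayout(&bt, &renamed, 8) orelse return error.TestUnexpectedResult;
    try std.testing.expect(by_name == .field_name);
    try std.testing.expectEqual(@as(usize, 1), by_name.field_name.index);

    // Matching fields but a different total size is reported as padding drift.
    const by_total = validateStructLayout(&bt, &ok, 12) orelse return error.TestUnexpectedResult;
    try std.testing.expect(by_total == .total_size);
}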

// ── Weld plan (byte-layout override) ────────────────────────────────────────
//
// A welded struct must be laid out byte-identically to the bound Zig type, whose
// fields Zig may REORDER (and pad). The sx struct's natural layout generally
// won't match, so the compiler imposes the Zig layout: it builds the struct's
// LLVM type as the fields in ascending-OFFSET order, with explicit padding
// elements filling the gaps, and remaps each sx field index to its LLVM element
// index. `computeWeldPlan` is that pure layout math; the LLVM type builder + the
// struct-GEP / field-access sites consume the plan (later sub-steps), and the
// interp serializes comptime struct Values through the same offsets.

/// One element of a welded struct's LLVM layout: either a real field (carrying
/// its sx field index) or a padding gap. Always in ascending `offset` order.
pub const WeldElement = struct {
    /// The sx field index this element holds, or null for a padding gap.
    sx_field: ?usize,
    /// Byte offset of this element within the struct (the bound Zig offset).
    offset: usize,
    /// Byte width of this element (the field's size, or the gap width).
    size: usize,
};

/// The byte-layout plan for a welded struct: its LLVM elements in offset order
/// (fields + padding) and the sx-field → LLVM-element-index remap. Owns its
/// slices: `deinit` with the same allocator passed to `computeWeldPlan`.
pub const WeldPlan = struct {
    elements: []const WeldElement,
    /// `sx_to_llvm[i]` is the index into `elements` of sx field `i`.
    sx_to_llvm: []const usize,
    total_size: usize,

    pub fn deinit(self: *WeldPlan, alloc: std.mem.Allocator) void {
        alloc.free(self.elements);
        alloc.free(self.sx_to_llvm);
    }
};

/// Compute the byte-layout plan for a struct whose fields carry their bound Zig
/// offsets (`fields[i].offset`/`.size`, e.g. from a `BoundType`). `total_size` is
/// the bound Zig `@sizeOf`. The result lists LLVM elements in ascending-offset
/// order (real fields interleaved with padding gaps) plus the sx-field →
/// element-index remap that struct-GEP uses. Pure; allocates the result slices.
pub fn computeWeldPlan(
    alloc: std.mem.Allocator,
    fields: []const FieldLayout,
    total_size: usize,
) !WeldPlan {
    // Order the sx field indices by ascending byte offset (stable).
    const order = try alloc.alloc(usize, fields.len);
    defer alloc.free(order);
    for (order, 0..) |*o, i| o.* = i;
    std.sort.insertion(usize, order, fields, struct {
        fn lessThan(fs: []const FieldLayout, a: usize, b: usize) bool {
            return fs[a].offset < fs[b].offset;
        }
    }.lessThan);

    var elements = std.ArrayList(WeldElement).empty;
    errdefer elements.deinit(alloc);
    const sx_to_llvm = try alloc.alloc(usize, fields.len);
    errdefer alloc.free(sx_to_llvm);

    var cursor: usize = 0;
    for (order) |sx_i| {
        const f = fields[sx_i];
        // Fill any gap before this field with a padding element.
        if (f.offset > cursor) {
            try elements.append(alloc, .{ .sx_field = null, .offset = cursor, .size = f.offset - cursor });
        }
        sx_to_llvm[sx_i] = elements.items.len;
        try elements.append(alloc, .{ .sx_field = sx_i, .offset = f.offset, .size = f.size });
        cursor = f.offset + f.size;
    }
    // Trailing padding up to the bound total size (alignment tail).
    if (total_size > cursor) {
        try elements.append(alloc, .{ .sx_field = null, .offset = cursor, .size = total_size - cursor });
    }
    return .{
        .elements = try elements.toOwnedSlice(alloc),
        .sx_to_llvm = sx_to_llvm,
        .total_size = total_size,
    };
}
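
// Illustrative sketch, not part of the original change: the plan for a
// reordered layout shaped like `StructInfo`. The offsets follow the
// `bound_types` comment; the sizes (16-byte slice, 4-byte ids, 1-byte bool,
// 32-byte total) are ASSUMED for a 64-bit host and are hard-coded here rather
// than read from the real type.
test "computeWeldPlan orders by offset and pads the tail (sketch)" {
    // sx declaration order: name, fields, is_protocol, nominal_id.
    const fields = [_]FieldLayout{
        .{ .name = "name", .offset = 16, .size = 4 },
        .{ .name = "fields", .offset = 0, .size = 16 },
        .{ .name = "is_protocol", .offset = 24, .size = 1 },
        .{ .name = "nominal_id", .offset = 20, .size = 4 },
    };
    var plan = try computeWeldPlan(std.testing.allocator, &fields, 32);
    defer plan.deinit(std.testing.allocator);

    // Element order is fields, name, nominal_id, is_protocol, then padding,
    // so sx field 0 (`name`) maps to element 1, sx field 1 to element 0, etc.
    try std.testing.expectEqualSlices(usize, &.{ 1, 0, 3, 2 }, plan.sx_to_llvm);
    try std.testing.expectEqual(@as(usize, 5), plan.elements.len);

    // The last element is the alignment tail: bytes 25..32.
    const tail = plan.elements[4];
    try std.testing.expect(tail.sx_field == null);
    try std.testing.expectEqual(@as(usize, 25), tail.offset);
    try std.testing.expectEqual(@as(usize, 7), tail.size);
}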

// ── Functions (comptime-only, host-call bridged) ────────────────────────────

/// A welded `compiler` function: dispatched under the comptime interpreter to its
/// Zig handler (never dlsym'd). The handler receives the interpreter (for the
/// string pool / type table) and the already-evaluated argument `Value`s, and
/// returns the result `Value`.
pub const FnHandler = *const fn (interp: *Interpreter, args: []const Value) InterpError!Value;

pub const BoundFn = struct {
    sx_name: []const u8,
    handler: FnHandler,
};

/// The welded-function export list. Start small (Phase 1): the `StringId`
/// round-trip pair, `intern` and `text_of`. `find_type` / the guarded
/// `register_*` mutators join in later phases.
pub const bound_fns = [_]BoundFn{
    .{ .sx_name = "intern", .handler = handleIntern },
    .{ .sx_name = "text_of", .handler = handleTextOf },
};

/// Look up a welded function by its sx name. Returns null when the name is not on
/// the `compiler` library's function-export list.
pub fn findFn(sx_name: []const u8) ?*const BoundFn {
    for (&bound_fns) |*bf| {
        if (std.mem.eql(u8, bf.sx_name, sx_name)) return bf;
    }
    return null;
}
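
// Illustrative sketch, not part of the original change: function lookup is
// the same allow-list as `findType`. "dlopen" is a hypothetical name chosen
// because it is not in `bound_fns`. The handlers themselves are not called
// here, since that would need a live `Interpreter`.
test "findFn exposes only the registered handlers (sketch)" {
    try std.testing.expect(findFn("intern") != null);
    try std.testing.expect(findFn("text_of") != null);
    try std.testing.expect(findFn("dlopen") == null);
}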

/// The comptime type table to intern into: the host's mutable mint target when
/// set (the metatype-construction path), else the module's table reached through
/// a const-cast, the same access the interp's mint path uses (interp.zig). The
/// underlying table is genuinely mutable; the interp merely holds it `const`.
fn mintTable(interp: *Interpreter) *types.TypeTable {
    return interp.mint orelse @constCast(&interp.module.types);
}

/// `intern(s: string) -> StringId`: intern `s` into the compiler's string pool
/// and return its handle. The inverse of `text_of`.
fn handleIntern(interp: *Interpreter, args: []const Value) InterpError!Value {
    if (args.len != 1 or args[0] != .string) return error.TypeError;
    const id = mintTable(interp).internString(args[0].string);
    return Value{ .int = @intFromEnum(id) };
}

/// `text_of(id: StringId) -> string`: resolve a string handle back to its text.
/// The inverse of `intern`.
fn handleTextOf(interp: *Interpreter, args: []const Value) InterpError!Value {
    if (args.len != 1 or args[0] != .int) return error.TypeError;
    if (args[0].int < 0 or args[0].int > std.math.maxInt(u32)) return error.TypeError;
    const id: StringId = @enumFromInt(@as(u32, @intCast(args[0].int)));
    return Value{ .string = interp.module.types.getString(id) };
}