Merge branch 'arch-refactor'

agra
2026-06-03 16:34:16 +03:00
25 changed files with 9882 additions and 3525 deletions


@@ -40,7 +40,7 @@ to satisfy all three. "JIT" and "comptime" are **not** the same thing.
|---|---|---|
| **AOT** (`sx build`) | native machine code in an on-disk binary | pointer to an interned `Frame` |
| **JIT** (`sx run`) | ORC-JIT'd machine code in anonymous memory | pointer to an interned `Frame` |
| **Comptime** (`#run`) | the IR interpreter (`interp.zig`) — no machine code | packed `(func_id, ir_offset)` |
| **Comptime** (`#run`) | the IR interpreter (`interp.zig`) — no machine code | packed `(func_id, span.start)` |
The crucial constraint: **the same lowered IR runs in the compiled
backend *and* the interpreter.** So a value the IR produces (like a trace
@@ -92,39 +92,55 @@ so the location — *and the offending source line itself* (`line_text`, for the
formatter reads it directly. No PC capture, no DWARF, no symbolizer, no runtime
file read.
A comptime frame is instead a packed `(func_id: u32, ir_offset: u32)`,
resolved through the interpreter's in-memory IR/source tables. The
interpreter **never dereferences the compiled `Frame` pointer** — it uses
its own representation — so the compiled and interpreted memory models
never collide.
A comptime frame is instead a packed `(func_id: u32, span.start: u32)`
where `span.start` is the op's source byte offset — resolved through the
interpreter's in-memory IR/source tables. The interpreter **never
dereferences the compiled `Frame` pointer** — it uses its own
representation — so the compiled and interpreted memory models never
collide.
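
As a minimal sketch of the packed comptime frame described above: two `u32` fields share one `u64`, so the value fits in the same buffer slot as a compiled `Frame` pointer. The helper names (`demo_*`) are hypothetical, and which half of the word holds which field is an assumption made for illustration; the text only states that the pair is packed.

```c
#include <stdint.h>

/* Hypothetical helpers. Assumption: func_id occupies the high 32 bits and
 * span.start (the op's source byte offset) the low 32 bits. */
static inline uint64_t demo_pack_frame(uint32_t func_id, uint32_t span_start) {
    return ((uint64_t)func_id << 32) | (uint64_t)span_start;
}

/* Recover the function id from a packed comptime frame. */
static inline uint32_t demo_frame_func_id(uint64_t packed) {
    return (uint32_t)(packed >> 32);
}

/* Recover the source byte offset from a packed comptime frame. */
static inline uint32_t demo_frame_span_start(uint64_t packed) {
    return (uint32_t)(packed & 0xFFFFFFFFu);
}
```

A resolver would call `demo_frame_func_id` to pick the function's source file and `demo_frame_span_start` to compute `line:col` within it.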
### The niladic trace-push op
Because the same IR runs in both machines, the push is a **dedicated,
niladic, span-stamped IR op** — the same pattern as `is_comptime` /
`interp_print_frames`. It carries **no operands and no global reference**;
each backend derives the frame from its own context:
Because the same IR runs in both machines, the frame value comes from a
**dedicated, niladic, span-stamped IR op** (`.trace_frame`) — the same
pattern as `is_comptime` / `interp_print_frames`. It carries **no operands
and no global reference**; each backend derives the frame from its own
context:
- **`emit_llvm`:** resolves the op's `span` + current function →
`{file, line, col, func}` (reusing the source map wired in for DWARF),
**interns and builds the `Frame` global in `emit_llvm`** (the same
mechanism as the tag-name table), then emits `call sx_trace_push(ptr)`.
- **`interp`:** pushes the packed `(func_id, ir_offset)` from its own
execution context.
- **`emit_llvm` (the `.trace_frame` arm):** resolves the op's `span` +
current function → `{file, line, col, func}` (reusing the source map
wired in for DWARF), **interns and builds the `Frame` global** in
[`src/backend/llvm/reflection.zig`](../src/backend/llvm/reflection.zig)
(the same mechanism, in the same file, as the tag-name table), and yields
its address as the op's value. The lowerer feeds that value to a separate
`sx_trace_push` call emitted through the normal call lowering.
- **`interp`:** yields the packed `(func_id, span.start)` from its own
execution context as the op's value. The separate `sx_trace_push` call
op consuming it is executed by the interp as a foreign call (via
`host_ffi`/dlsym, the same path as any extern), storing the packed value
in the buffer; the comptime `.trace_resolve` resolver later recovers
`file:line:col` from it.
This keeps the lowerer thin: at each push site it emits the op and nothing
else — no operand wiring, no global construction. The rejected
alternative — an op carrying a `GlobalId` to an IR-level `Frame` global —
would make the global visible to the interpreter (forcing comptime onto
the pointer-deref path) and fatten the lowerer; **do not do this.**
The op stays niladic by design: it carries no operand and no `GlobalId`,
so no IR-level `Frame` global is ever visible to the interpreter. The
rejected alternative — an op carrying a `GlobalId` to an IR-level `Frame`
global — would make the global visible to the interpreter (forcing
comptime onto the pointer-deref path) and fatten the lowerer; **do not do
this.**
`Frame` is defined **once** in sx (`trace.sx`/std); `emit_llvm` builds the
interned global off that `TypeId` through the normal struct-emission path,
never a bespoke byte layout (which would risk the "8-bytes-assumed"
clobber class of bug). `file`/`func` strings are interned into a shared
pool so a path shared by N push sites is stored once — the table stays
tiny. File paths are normalized to a stable relative form so trace output
is machine-independent and snapshot-testable.
`Frame` is defined **once** in sx (`trace.sx`/std), and its runtime layout —
`{ string file, i32 line, i32 col, string func, string line_text }` — is
mirrored by the cached LLVM **literal (anonymous) struct type** `getFrameStructType()`
(`src/ir/emit_llvm.zig`). The reflection builder
(`src/backend/llvm/reflection.zig`) assembles each push site's global as an
LLVM **named-struct constant** over that cached type via
`LLVMConstNamedStruct` — a type-safe LLVM struct, not hand-packed bytes
(which would risk the "8-bytes-assumed" clobber class of bug). It does
**not** derive the layout from the sx `Frame` `TypeId`, nor route through
the normal struct-emission path. `file`/`func`/`line_text` strings are
interned into a shared pool so a path shared by N push sites is stored once
— the table stays tiny. The `file` field is the source basename (full paths
live in DWARF), so trace output is machine-independent and snapshot-testable.
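
To make the compiled-side layout concrete, here is a rough C mirror of the `Frame` fields listed above, plus the pointer-to-`u64` round trip the buffer relies on. This is a sketch under stated assumptions: the `{ptr, len}` string representation, the `Demo*` names, and the sample field values are all hypothetical and are not taken from the sx sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: an sx string is modelled here as a pointer plus a length. */
typedef struct { const char *ptr; size_t len; } DemoStr;

/* Mirrors { string file, i32 line, i32 col, string func, string line_text }. */
typedef struct {
    DemoStr file;
    int32_t line;
    int32_t col;
    DemoStr func;
    DemoStr line_text;
} DemoFrame;

#define DEMO_STR(lit) { lit, sizeof(lit) - 1 }

/* One read-only record per push site, analogous to an interned global. */
static const DemoFrame demo_frame_main = {
    DEMO_STR("main.sx"), 12, 5, DEMO_STR("main"), DEMO_STR("<source line text>")
};

/* The buffer stores u64s, so a compiled frame travels as its address. */
static uint64_t demo_frame_handle(const DemoFrame *f) {
    return (uint64_t)(uintptr_t)f;
}

/* Read side: cast the u64 back to a Frame pointer and load the fields. */
static const DemoFrame *demo_frame_from_handle(uint64_t h) {
    return (const DemoFrame *)(uintptr_t)h;
}
```

Because the record is a typed struct rather than a hand-packed byte array, field offsets come from the compiler, which is the property the text above relies on.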
### Push and clear sites
@@ -193,8 +209,9 @@ stripped without affecting traces.
### What's emitted
In [`src/ir/emit_llvm.zig`](../src/ir/emit_llvm.zig), gated on the same
debug opt levels + a wired source map (`setDebugContext`):
In [`src/backend/llvm/debug.zig`](../src/backend/llvm/debug.zig) (the
`DebugInfo` helper, driven from `emit_llvm`'s `emit()` pipeline), gated on
the same debug opt levels + a wired source map (`setDebugContext`):
- one `DICompileUnit` + `DIFile` on the main file,
- a `DISubprogram` per emitted function (`LLVMSetSubprogram`),
@@ -237,10 +254,12 @@ both the trace path and the DWARF path. Items marked ✅ exist today;
|---|---|
| [`src/core.zig`](../src/core.zig) | `Compilation`: owns `import_sources` (file→source map), constructs the emitter, calls `setDebugContext` + `emit`; re-enters the interpreter for `#run`/post-link |
| [`src/ir/lower.zig`](../src/ir/lower.zig) | AST→IR. Stamps `Inst.span`; emits push/clear at failure/absorb sites; `tracesEnabled` gate; declares the `sx_trace_*` externs |
| [`src/ir/emit_llvm.zig`](../src/ir/emit_llvm.zig) | IR→LLVM. Builds the interned `Frame` table; lowers the push op to a pointer push; emits all DWARF metadata |
| [`src/ir/interp.zig`](../src/ir/interp.zig) | Comptime IR interpreter. Lowers the push op to a packed `(func_id, offset)`; resolves comptime frames |
| [`src/ir/emit_llvm.zig`](../src/ir/emit_llvm.zig) | IR→LLVM orchestrator. Owns `LLVMEmitter` + the source map (`setDebugContext`); dispatches the `.trace_frame` op and the DWARF passes to the helpers below |
| [`src/backend/llvm/reflection.zig`](../src/backend/llvm/reflection.zig) | `Reflection`: builds the interned `Frame` table + the tag-name / type-name tables; yields the `.trace_frame` op's value (the `Frame` global's address) — the `sx_trace_push` call itself is emitted by `lower.zig` |
| [`src/backend/llvm/debug.zig`](../src/backend/llvm/debug.zig) | `DebugInfo`: builds all DWARF metadata (compile unit, per-function subprograms, per-instruction `DILocation`) |
| [`src/ir/interp.zig`](../src/ir/interp.zig) | Comptime IR interpreter. The `.trace_frame` op yields a packed `(func_id, span.start)`; the separate `sx_trace_push` call op runs as a foreign call (dlsym); `.trace_resolve` recovers comptime frames |
| [`src/errors.zig`](../src/errors.zig) | `SourceLoc.compute(source, offset) → {line, col}`; the `import_sources` map type |
| [`src/ir/inst.zig`](../src/ir/inst.zig) | `Inst.span`, `Function.source_file`, the `Op` union (home of the trace-push op) |
| [`src/ir/inst.zig`](../src/ir/inst.zig) | `Inst.span`, `Function.source_file`, the `Op` union (home of the `.trace_frame` op) |
| [`library/vendors/sx_trace_runtime/sx_trace.c`](../library/vendors/sx_trace_runtime/sx_trace.c) | the thread-local ring buffer + `sx_trace_report_unhandled` |
| [`library/modules/trace.sx`](../library/modules/trace.sx) | the formatter (`to_string` / `print_current`) |
| [`src/llvm_api.zig`](../src/llvm_api.zig) | binds `llvm-c/Core.h` + `llvm-c/DebugInfo.h` |
@@ -270,17 +289,23 @@ traces and DWARF can never disagree:
1. `lower.zig` reaches a failure site — `lowerRaise`, `lowerTry`'s
propagation branch, `lowerFailableOr`, or `lowerDestructureDecl` — and
(when `tracesEnabled()`) emits the niladic `.trace_frame_push` op,
replacing today's `emitTracePush(placeholderTraceFrame())`. Absorbing
sites emit `emitTraceClear()` → `call sx_trace_clear()`.
2. **Compiled backend** (`emit_llvm.emitInst`, `.trace_frame_push` arm):
(when `tracesEnabled()`) emits the niladic `.trace_frame` op via
`placeholderTraceFrame()`, whose result feeds a separate `sx_trace_push`
call via `emitTracePush()`. Absorbing sites emit `emitTraceClear()` →
`call sx_trace_clear()`.
2. **Compiled backend** (`emit_llvm.emitInst`, `.trace_frame` arm):
resolve the op's `span` + current function → `{file,line,col,func}`,
intern into the `Frame` table (built alongside `tag_name_array`), and
emit `call sx_trace_push(ptr_to_Frame)`. The `sx_trace_push` extern is
yield the `Frame` global's address as the op's value, which the separate
`sx_trace_push` call (step 1) consumes. The `sx_trace_push` extern is
declared lazily by `getTraceFids()` (which sets `needs_trace_runtime`).
3. **Interpreter** (`interp.zig`, same op): pack `(current_func_id,
ir_offset)` into a `u64` and call the foreign `sx_trace_push` (resolved
via `host_ffi` `dlsym` against the linked `sx_trace.c`).
span.start)` into a `u64` and return it as the op's value. The separate
`sx_trace_push` call op is then executed by the interp as a foreign call
(`callForeign` → `host_ffi.lookupSymbol`/dlsym, the same path as any
extern), storing the packed value in the buffer. The comptime
`.trace_resolve` resolver later turns each packed value back into
`file:line:col` via the IR/source tables.
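
Both paths in the steps above end by storing a `u64` in the runtime buffer. The sketch below shows one plausible shape for such a thread-local ring buffer. It is not the actual `sx_trace.c`: the capacity, the overwrite-oldest policy, and the `demo_*` names are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_TRACE_CAP 32u /* assumed capacity */

static _Thread_local uint64_t demo_frames[DEMO_TRACE_CAP];
static _Thread_local size_t demo_count; /* pushes since the last clear */

/* Store one frame value; once full, the oldest entry is overwritten. */
void demo_trace_push(uint64_t frame) {
    demo_frames[demo_count % DEMO_TRACE_CAP] = frame;
    demo_count++;
}

/* Drop all frames, as an absorbing site would. */
void demo_trace_clear(void) {
    demo_count = 0;
}

/* Number of frames currently retained (at most DEMO_TRACE_CAP). */
size_t demo_trace_len(void) {
    return demo_count < DEMO_TRACE_CAP ? demo_count : DEMO_TRACE_CAP;
}

/* Frame i, where i = 0 is the oldest retained push; 0 if out of range. */
uint64_t demo_trace_frame_at(size_t i) {
    size_t len = demo_trace_len();
    if (i >= len) return 0;
    size_t first = demo_count - len; /* index of the oldest retained push */
    return demo_frames[(first + i) % DEMO_TRACE_CAP];
}
```

The buffer never interprets the values it holds, which is what lets compiled code store `Frame` addresses and the interpreter store packed pairs through the same entry points.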
**Buffer (run time) ✅** — `sx_trace.c` stores the `u64`s. Linked into the
compiler so the JIT resolves `sx_trace_*` via `dlsym`; auto-injected as a
@@ -288,10 +313,10 @@ compiler so the JIT resolves `sx_trace_*` via `dlsym`; auto-injected as a
**Formatter (run time) ✅ (compiled 3a, comptime 3b)** — `trace.sx` `to_string()` loops
`sx_trace_len()` / `sx_trace_frame_at(i)` and resolves each `u64` through
a **read-side context-split primitive** (the mirror of the push op):
a **read-side context-split primitive** (the mirror of the `.trace_frame` op):
- compiled: cast the `u64` → `*Frame`, load the fields.
- comptime: unpack `(func_id, offset)`, resolve via the interpreter's
- comptime: unpack `(func_id, span.start)`, resolve via the interpreter's
IR/source tables → a `Frame`.
The same `trace.sx` source works in both because it runs in the matching
@@ -330,8 +355,8 @@ the failable-`main` wrapper, whose `ret` path in `emit_llvm`
### The gate: one switch, two consumers
`Lowering.tracesEnabled()` (lower.zig) and `LLVMEmitter.debugEnabled()`
(emit_llvm) both reduce to `opt_level == .none or .less`. The `Frame`
`Lowering.tracesEnabled()` (lower.zig) and `DebugInfo.debugEnabled()`
(backend/llvm/debug.zig) both reduce to `opt_level == .none or .less`. The `Frame`
table + push/clear ride `tracesEnabled`; DWARF rides `debugEnabled`.
Release (`-O2`/`-O3`) emits neither. `sx run` defaults to `-O0` (both on);
`sx ir`/`sx asm` default to `-O2` (both off) — which is why the `.ir`
@@ -455,7 +480,7 @@ a Mach-O debug map, never register JIT DWARF.
| IR instructions carry source spans | ✅ done — E3.0 slice 1 (`b44a5d0`) |
| DWARF emission (compile unit / subprogram / line table) | ✅ done — E3.0 slice 2 (`c32d694`) |
| Niladic trace-push op + interned `Frame` table (runtime) | ✅ done — E3.3 slice 3a (`1b6cbc1`) |
| Comptime resolver (`func_id, ir_offset` → location) | ✅ done — slice 3b |
| Comptime resolver (`func_id, span.start` → location) | ✅ done — slice 3b |
| Source snippet + `^` caret | ✅ done — slice 3c (line embedded in `Frame`) |
| `--emit-obj` artifact plumbing | ✅ done — slice 3d |
| Stepping verification: macOS lldb | ✅ done — 3e rung 1 (`tests/debug_stepping_smoke.sh`) |


@@ -326,7 +326,9 @@ error trace:
Traces are on by default in debug builds and compiled out in release
(re-enable with `--release-traces`). They cost nothing on the success
path. Frame locations resolve through the binary's debug info, so
path. Each frame's location comes from `Frame` metadata
(file/line/col/func) baked in at the trace point — the trace resolves
itself with no debug info. Separately, sx emits standard DWARF, so
`lldb` / `gdb` work on sx binaries too.
Interpolating a tag with `{}` prints its **name**, not a number — in


@@ -0,0 +1,180 @@
@__sx_default_context = internal global { { ptr, ptr, ptr }, ptr } { { ptr, ptr, ptr } { ptr null, ptr @__thunk_CAllocator_Allocator_alloc, ptr @__thunk_CAllocator_Allocator_dealloc }, ptr null }
; Function Attrs: nounwind
declare void @out(ptr) #0
declare ptr @malloc(i64)
declare void @free(ptr)
declare ptr @memcpy(ptr, ptr, i64)
declare ptr @memset(ptr, i32, i64)
; Function Attrs: nounwind
define internal ptr @CAllocator.alloc(ptr %0, ptr %1, i64 %2) #0 {
entry:
%alloca = alloca ptr, align 8
store ptr %1, ptr %alloca, align 8
%allocaN = alloca i64, align 8
store i64 %2, ptr %allocaN, align 8
%load = load i64, ptr %allocaN, align 8
%call = call ptr @malloc(i64 %load)
ret ptr %call
}
; Function Attrs: nounwind
define internal void @CAllocator.dealloc(ptr %0, ptr %1, ptr %2) #0 {
entry:
%alloca = alloca ptr, align 8
store ptr %1, ptr %alloca, align 8
%allocaN = alloca ptr, align 8
store ptr %2, ptr %allocaN, align 8
%load = load ptr, ptr %allocaN, align 8
call void @free(ptr %load)
ret void
}
; Function Attrs: nounwind
declare i64 @GPA.init(ptr) #0
; Function Attrs: nounwind
declare ptr @GPA.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @GPA.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare void @Arena.add_chunk(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @Arena.init(ptr sret({ ptr, i64, { ptr, ptr, ptr } }), ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @Arena.reset(ptr, ptr) #0
; Function Attrs: nounwind
declare void @Arena.deinit(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @Arena.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @Arena.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @BufAlloc.init(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @BufAlloc.reset(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @BufAlloc.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @BufAlloc.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare void @TrackingAllocator.init(ptr sret({ { ptr, ptr, ptr }, i64, i64, i64 }), ptr, ptr) #0
; Function Attrs: nounwind
declare i64 @TrackingAllocator.leak_count(ptr, ptr) #0
; Function Attrs: nounwind
declare void @TrackingAllocator.report(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @TrackingAllocator.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @TrackingAllocator.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @cstring(ptr, i64) #0
; Function Attrs: nounwind
declare ptr @int_to_string(ptr, i64) #0
; Function Attrs: nounwind
declare ptr @bool_to_string(ptr, i1) #0
; Function Attrs: nounwind
declare ptr @float_to_string(ptr, double) #0
; Function Attrs: nounwind
declare void @hex_group(ptr, ptr, i64, i64) #0
; Function Attrs: nounwind
declare ptr @int_to_hex_string(ptr, i64) #0
; Function Attrs: nounwind
declare ptr @concat(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @substr(ptr, ptr, i64, i64) #0
; Function Attrs: nounwind
declare ptr @xml_escape(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @path_join(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @any_to_string(ptr, [2 x i64]) #0
; Function Attrs: nounwind
declare ptr @build_format(ptr, ptr) #0
; Function Attrs: nounwind
define internal i64 @accept_c(ptr %0) #0 {
entry:
%byval.load = load { i64, i64, i64, i64 }, ptr %0, align 8
%alloca = alloca { i64, i64, i64, i64 }, align 8
store { i64, i64, i64, i64 } %byval.load, ptr %alloca, align 8
%load = load { i64, i64, i64, i64 }, ptr %alloca, align 8
%sg = extractvalue { i64, i64, i64, i64 } %load, 0
%loadN = load { i64, i64, i64, i64 }, ptr %alloca, align 8
%sgN = extractvalue { i64, i64, i64, i64 } %loadN, 1
%add = add i64 %sg, %sgN
%loadN = load { i64, i64, i64, i64 }, ptr %alloca, align 8
%sgN = extractvalue { i64, i64, i64, i64 } %loadN, 2
%addN = add i64 %add, %sgN
%loadN = load { i64, i64, i64, i64 }, ptr %alloca, align 8
%sgN = extractvalue { i64, i64, i64, i64 } %loadN, 3
%addN = add i64 %addN, %sgN
ret i64 %addN
}
; Function Attrs: nounwind
define i32 @main() #0 {
entry:
%alloca = alloca { i64, i64, i64, i64 }, align 8
store { i64, i64, i64, i64 } { i64 1, i64 10, i64 100, i64 1000 }, ptr %alloca, align 8
%load = load { i64, i64, i64, i64 }, ptr %alloca, align 8
%byval.tmp = alloca { i64, i64, i64, i64 }, align 8
store { i64, i64, i64, i64 } %load, ptr %byval.tmp, align 8
%call = call i64 @accept_c(ptr %byval.tmp)
%icmp = icmp ne i64 %call, 1111
br i1 %icmp, label %if.then.0, label %if.merge.1
if.then.0: ; preds = %entry
ret i32 1
if.merge.1: ; preds = %entry
ret i32 0
}
; Function Attrs: nounwind
define internal ptr @__thunk_CAllocator_Allocator_alloc(ptr %0, ptr %1, i64 %2) #0 {
entry:
%call = call ptr @CAllocator.alloc(ptr %0, ptr %1, i64 %2)
ret ptr %call
}
; Function Attrs: nounwind
define internal void @__thunk_CAllocator_Allocator_dealloc(ptr %0, ptr %1, ptr %2) #0 {
entry:
call void @CAllocator.dealloc(ptr %0, ptr %1, ptr %2)
ret void
}



@@ -0,0 +1,313 @@
@OS = internal global i64 0
@ARCH = internal global i64 0
@POINTER_SIZE = internal global i64 8
@g_held_view = internal global ptr null
@__sx_default_context = internal global { { ptr, ptr, ptr }, ptr } { { ptr, ptr, ptr } { ptr null, ptr @__thunk_CAllocator_Allocator_alloc, ptr @__thunk_CAllocator_Allocator_dealloc }, ptr null }
@str = private unnamed_addr constant [9 x i8] c"onCreate\00", align 1
@str.1 = private unnamed_addr constant [23 x i8] c"(Landroid/os/Bundle;)V\00", align 1
@jni.parent.path = private unnamed_addr constant [21 x i8] c"android/app/Activity\00", align 1
@str.2 = private unnamed_addr constant [7 x i8] c"<init>\00", align 1
@str.3 = private unnamed_addr constant [29 x i8] c"(Landroid/content/Context;)V\00", align 1
@jni.ctor.path = private unnamed_addr constant [25 x i8] c"android/view/SurfaceView\00", align 1
; Function Attrs: nounwind
declare void @out(ptr) #0
declare ptr @malloc(i64)
declare void @free(ptr)
declare ptr @memcpy(ptr, ptr, i64)
declare ptr @memset(ptr, i32, i64)
; Function Attrs: nounwind
define internal ptr @CAllocator.alloc(ptr %0, ptr %1, i64 %2) #0 {
entry:
%alloca = alloca ptr, align 8
store ptr %1, ptr %alloca, align 8
%allocaN = alloca i64, align 8
store i64 %2, ptr %allocaN, align 8
%load = load i64, ptr %allocaN, align 8
%call = call ptr @malloc(i64 %load)
ret ptr %call
}
; Function Attrs: nounwind
define internal void @CAllocator.dealloc(ptr %0, ptr %1, ptr %2) #0 {
entry:
%alloca = alloca ptr, align 8
store ptr %1, ptr %alloca, align 8
%allocaN = alloca ptr, align 8
store ptr %2, ptr %allocaN, align 8
%load = load ptr, ptr %allocaN, align 8
call void @free(ptr %load)
ret void
}
; Function Attrs: nounwind
declare i64 @GPA.init(ptr) #0
; Function Attrs: nounwind
declare ptr @GPA.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @GPA.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare void @Arena.add_chunk(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @Arena.init(ptr sret({ ptr, i64, { ptr, ptr, ptr } }), ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @Arena.reset(ptr, ptr) #0
; Function Attrs: nounwind
declare void @Arena.deinit(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @Arena.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @Arena.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @BufAlloc.init(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @BufAlloc.reset(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @BufAlloc.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @BufAlloc.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare void @TrackingAllocator.init(ptr sret({ { ptr, ptr, ptr }, i64, i64, i64 }), ptr, ptr) #0
; Function Attrs: nounwind
declare i64 @TrackingAllocator.leak_count(ptr, ptr) #0
; Function Attrs: nounwind
declare void @TrackingAllocator.report(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @TrackingAllocator.alloc(ptr, ptr, i64) #0
; Function Attrs: nounwind
declare void @TrackingAllocator.dealloc(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @cstring(ptr, i64) #0
; Function Attrs: nounwind
declare ptr @int_to_string(ptr, i64) #0
; Function Attrs: nounwind
declare ptr @bool_to_string(ptr, i1) #0
; Function Attrs: nounwind
declare ptr @float_to_string(ptr, double) #0
; Function Attrs: nounwind
declare void @hex_group(ptr, ptr, i64, i64) #0
; Function Attrs: nounwind
declare ptr @int_to_hex_string(ptr, i64) #0
; Function Attrs: nounwind
declare ptr @concat(ptr, ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @substr(ptr, ptr, i64, i64) #0
; Function Attrs: nounwind
declare ptr @xml_escape(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @path_join(ptr, ptr) #0
; Function Attrs: nounwind
declare ptr @any_to_string(ptr, [2 x i64]) #0
; Function Attrs: nounwind
declare ptr @build_format(ptr, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.add_link_flag(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.add_framework(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_output_path(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_wasm_shell(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.add_asset_dir(i64, ptr, ptr) #0
; Function Attrs: nounwind
declare i64 @BuildOptions.asset_dir_count(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.asset_dir_src_at(i64, i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.asset_dir_dest_at(i64, i64) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_post_link_callback(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_post_link_module(i64, ptr) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.binary_path(i64) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_bundle_path(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_bundle_id(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_codesign_identity(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_provisioning_profile(i64, ptr) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.bundle_path(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.bundle_id(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.codesign_identity(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.provisioning_profile(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.target_triple(i64) #0
; Function Attrs: nounwind
declare i1 @BuildOptions.is_macos(i64) #0
; Function Attrs: nounwind
declare i1 @BuildOptions.is_ios(i64) #0
; Function Attrs: nounwind
declare i1 @BuildOptions.is_ios_device(i64) #0
; Function Attrs: nounwind
declare i1 @BuildOptions.is_ios_simulator(i64) #0
; Function Attrs: nounwind
declare i1 @BuildOptions.is_android(i64) #0
; Function Attrs: nounwind
declare i64 @BuildOptions.framework_count(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.framework_at(i64, i64) #0
; Function Attrs: nounwind
declare i64 @BuildOptions.framework_path_count(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.framework_path_at(i64, i64) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_manifest_path(i64, ptr) #0
; Function Attrs: nounwind
declare void @BuildOptions.set_keystore_path(i64, ptr) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.manifest_path(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.keystore_path(i64) #0
; Function Attrs: nounwind
declare i64 @BuildOptions.jni_main_count(i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.jni_main_foreign_path_at(i64, i64) #0
; Function Attrs: nounwind
declare ptr @BuildOptions.jni_main_java_source_at(i64, i64) #0
; Function Attrs: nounwind
declare i64 @build_options() #0
; Function Attrs: nounwind
define i32 @main() #0 {
entry:
ret i32 0
}
; Function Attrs: nounwind
define internal ptr @__thunk_CAllocator_Allocator_alloc(ptr %0, ptr %1, i64 %2) #0 {
entry:
%call = call ptr @CAllocator.alloc(ptr %0, ptr %1, i64 %2)
ret ptr %call
}
; Function Attrs: nounwind
define internal void @__thunk_CAllocator_Allocator_dealloc(ptr %0, ptr %1, ptr %2) #0 {
entry:
call void @CAllocator.dealloc(ptr %0, ptr %1, ptr %2)
ret void
}
; Function Attrs: nounwind
define void @Java_co_swipelab_sxjnictor_SxApp_sx_1onCreate(ptr %0, ptr %1, ptr %2) #0 {
entry:
%alloca = alloca ptr, align 8
store ptr %0, ptr %alloca, align 8
%allocaN = alloca ptr, align 8
store ptr %1, ptr %allocaN, align 8
%allocaN = alloca ptr, align 8
store ptr %2, ptr %allocaN, align 8
%load = load ptr, ptr %alloca, align 8
%loadN = load ptr, ptr %allocaN, align 8
%loadN = load ptr, ptr %allocaN, align 8
%jni.ifs = load ptr, ptr %load, align 8
%3 = getelementptr inbounds ptr, ptr %jni.ifs, i32 6
%jni.FindClass = load ptr, ptr %3, align 8
%jni.parent.cls = call ptr %jni.FindClass(ptr %load, ptr @jni.parent.path)
%4 = getelementptr inbounds ptr, ptr %jni.ifs, i32 33
%jni.GetMethodID = load ptr, ptr %4, align 8
%jni.mid = call ptr %jni.GetMethodID(ptr %load, ptr %jni.parent.cls, ptr @str, ptr @str.1)
%jni.parent.cls.slot = alloca ptr, align 8
store ptr %jni.parent.cls, ptr %jni.parent.cls.slot, align 8
%5 = getelementptr inbounds ptr, ptr %jni.ifs, i32 91
%jni.callfn.nonvirtual = load ptr, ptr %5, align 8
call void %jni.callfn.nonvirtual(ptr %load, ptr %loadN, ptr %jni.parent.cls, ptr %jni.mid, ptr %loadN)
%allocaN = alloca ptr, align 8
%loadN = load ptr, ptr %allocaN, align 8
store ptr %loadN, ptr %allocaN, align 8
%loadN = load ptr, ptr %allocaN, align 8
%jni.ifs8 = load ptr, ptr %load, align 8
%6 = getelementptr inbounds ptr, ptr %jni.ifs8, i32 6
%jni.FindClass9 = load ptr, ptr %6, align 8
%jni.ctor.cls = call ptr %jni.FindClass9(ptr %load, ptr @jni.ctor.path)
%7 = getelementptr inbounds ptr, ptr %jni.ifs8, i32 33
%jni.GetMethodID10 = load ptr, ptr %7, align 8
%jni.ctor.mid = call ptr %jni.GetMethodID10(ptr %load, ptr %jni.ctor.cls, ptr @str.2, ptr @str.3)
%8 = getelementptr inbounds ptr, ptr %jni.ifs8, i32 28
%jni.NewObject = load ptr, ptr %8, align 8
%jni.new.obj = call ptr %jni.NewObject(ptr %load, ptr %jni.ctor.cls, ptr %jni.ctor.mid, ptr %loadN)
%allocaN = alloca ptr, align 8
store ptr %jni.new.obj, ptr %allocaN, align 8
%loadN = load ptr, ptr %allocaN, align 8
store ptr %loadN, ptr @g_held_view, align 8
ret void
}


@@ -0,0 +1,84 @@
# 0074 — silent `getRefIRType(arg_ref) orelse .void` fallback in FFI call-arg lowering
> **✅ RESOLVED.** Root cause: four FFI call-arg lowering loops resolved an
> argument's IR type via `getRefIRType(arg_ref) orelse .void` — a silent fallback
> to the load-bearing real type `.void`, which downstream `toLLVMType` →
> `abiCoerceParamType` → `coerceArg` treat as a legitimate (void-typed) foreign
> argument, corrupting the call ABI with no diagnostic. Fix: one shared resolver
> `LLVMEmitter.argIRTypeOrFail` ([src/ir/emit_llvm.zig]) returns the dedicated
> `.unresolved` sentinel on a failed lookup — never `.void`/`.s64` — so the failure
> cannot masquerade as a real type and trips `toLLVMType`'s existing hard `@panic`
> tripwire at the call site. All four sites
> ([src/ir/emit_llvm.zig] JNI constructor; [src/backend/llvm/ops.zig] objc_msgSend,
> JNI non-virtual, JNI `Call<Type>Method`) now route through the helper. Happy path
> is byte-identical (every real arg already has a resolved type) — FFI examples stay
> green with zero snapshot churn. Regression test (fail-before/pass-after):
> `src/ir/emit_llvm.test.zig` — "argIRTypeOrFail surfaces .unresolved for an
> unresolvable FFI arg ref (issue 0074)".
## Symptom
**One-line:** Four FFI call-arg lowering sites silently default a failed
argument-type lookup to `.void` — the forbidden silent-type-fallback anti-pattern
(`.void` as a failed-type-lookup sentinel), which would produce a void-typed
foreign-call argument (wrong LLVM param type → silent ABI corruption) with no
diagnostic if the lookup ever fails.
**Observed:** `self.getRefIRType(arg_ref) orelse .void` at:
- `src/ir/emit_llvm.zig:2463`
- `src/backend/llvm/ops.zig:517` (Obj-C `objc_msgSend` arg loop)
- `src/backend/llvm/ops.zig:731` (JNI non-virtual call arg loop)
- `src/backend/llvm/ops.zig:761` (JNI `Call<Type>Method` arg loop)
Each then does `toLLVMType(raw_ty)` → `abiCoerceParamType` → `coerceArg`, so a
`.void` fallback silently mis-types the foreign-call argument.
**Expected:** `getRefIRType` returning null for a real call argument is a
"must-succeed lookup" failure (every arg is a valid param/instruction ref). Per
`CLAUDE.md` REJECTED PATTERNS — *"`.void` is an UNACCEPTABLE sentinel for a failed
type lookup"* — the lookup failure must surface as a diagnostic / hard tripwire, not
a silently-corrupted argument type.
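
The difference between the two policies can be sketched in a few lines of C. This is an illustration only, not the compiler's Zig code: the `Demo*` types, the toy ref table, and the function names are hypothetical stand-ins for `getRefIRType`, the `.void` fallback, and a sentinel-returning helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum {
    DEMO_TY_VOID,
    DEMO_TY_S64,
    DEMO_TY_PTR,
    DEMO_TY_UNRESOLVED /* dedicated sentinel, never a real type */
} DemoTypeId;

/* Toy table: ref i has type demo_ref_types[i]; any other ref is unknown. */
static const DemoTypeId demo_ref_types[] = { DEMO_TY_S64, DEMO_TY_PTR };
#define DEMO_REF_COUNT (sizeof demo_ref_types / sizeof demo_ref_types[0])

/* Stand-in for an optional-returning lookup: false means "no type found". */
static bool demo_get_ref_type(size_t ref, DemoTypeId *out) {
    if (ref >= DEMO_REF_COUNT) return false;
    *out = demo_ref_types[ref];
    return true;
}

/* Rejected pattern: a failed lookup becomes the real type void. */
static DemoTypeId demo_arg_type_silent(size_t ref) {
    DemoTypeId ty;
    return demo_get_ref_type(ref, &ty) ? ty : DEMO_TY_VOID;
}

/* Preferred pattern: a failed lookup becomes a sentinel that cannot be
 * mistaken for a real type. */
static DemoTypeId demo_arg_type_or_fail(size_t ref) {
    DemoTypeId ty;
    return demo_get_ref_type(ref, &ty) ? ty : DEMO_TY_UNRESOLVED;
}

/* Downstream tripwire: real types lower normally, the sentinel aborts. */
static int demo_type_bit_width(DemoTypeId ty) {
    switch (ty) {
    case DEMO_TY_VOID: return 0;
    case DEMO_TY_S64:  return 64;
    case DEMO_TY_PTR:  return 64;
    case DEMO_TY_UNRESOLVED: break;
    }
    fprintf(stderr, "unresolved type reached backend lowering\n");
    abort();
}
```

With the silent variant, a bad ref flows into `demo_type_bit_width` as `DEMO_TY_VOID` and lowers without complaint; with the sentinel variant, the same bad ref stops at the tripwire.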
## Provenance / scope
Pre-existing pattern (the `emit_llvm.zig` site is original; the three `ops.zig` sites
were relocated **verbatim, behavior-preserving** by step A7.4c of the arch-refactor —
the flow reviewer/observer correctly approved the relocations as equivalence-preserving).
Surfaced by the **A9.3 final fallback-audit** of the arch-refactor stream. Not
introduced by the refactor; filed per the IMPASSIBLE RULE (*"If you find an existing
default-return in the compiler that swallows a lookup failure, treat it as a
discovered bug — file an issue, do not just delete the default in place"*).
## Reproduction
This is a **latent / static** finding: there is no known sx program that drives
`getRefIRType` to `null` for a valid foreign-call argument (well-formed IR always
has a type for every arg ref), so it cannot currently be triggered at runtime — which
is exactly why it is dangerous (a future IR change that breaks the invariant would
corrupt FFI ABI silently). The code paths are exercised (and must stay green after
the fix) by the existing FFI examples, e.g.:
```
examples/13xx-ffi-objc-* # objc_msgSend arg lowering (ops.zig:517)
examples/14xx-ffi-jni-* # JNI Call<Type>Method / non-virtual (ops.zig:731/761)
```
(No new minimal repro `.sx` is meaningful for a latent defensive fallback; the fix is
verified by (a) the FFI suite staying green and (b) a unit test that asserts the new
loud-failure path, see below.)
## Investigation prompt (ready to paste into a fresh session)
> In `/Users/agra/projects/sx`, four FFI call-arg lowering sites use the forbidden
> silent type-fallback `self.getRefIRType(arg_ref) orelse .void`
> (`src/ir/emit_llvm.zig:2463`; `src/backend/llvm/ops.zig:517`, `:731`, `:761`).
> `getRefIRType` (`src/ir/emit_llvm.zig:2229`, returns `?TypeId`) yields `null` only
> when a ref is neither a function param nor a block instruction result — a
> must-not-happen case for a real call argument. Replace the silent `.void` default
> with a loud failure that cannot be mistaken for a real type, per `CLAUDE.md`
> REJECTED PATTERNS: emit a diagnostic via `self.diagnostics.addFmt(.err, span,
> "...", .{...})` and/or a hard tripwire (`@panic`/`bailDetail`-style) naming the op
> and the bad ref — do NOT substitute another real type. Prefer a single shared
> helper (e.g. `argIRTypeOrFail(arg_ref, span)`) used by all four sites so the policy
> lives in one place. Then: (1) `/Users/agra/.zvm/bin/zig build && /Users/agra/.zvm/bin/zig
> build test && bash tests/run_examples.sh` must stay 361/0 with the FFI examples
> green (the happy path is unchanged); (2) add a `*.test.zig` unit test that
> constructs an FFI call op with a bogus arg ref and asserts the loud failure fires
> (not a `.void` silent default). Expected new behavior: an unresolved FFI arg type
> produces a clear compiler error / panic, never a void-typed foreign argument.
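
As a rough illustration of the shared-helper policy the prompt asks for, the sketch below models a resolver that refuses to substitute a real type. Everything in it is a hypothetical stand-in: the `TypeId` enum, the `Emitter` struct, its `ref_types` table, and the `failures` counter (standing in for a diagnostic or tripwire) are assumptions for illustration, not the real `LLVMEmitter` API. The `.unresolved` tag follows the sentinel that issue 0075 attributes to the 0074 fix.

```zig
const std = @import("std");

/// Hypothetical stand-in for the IR `TypeId`; only the tags needed here.
const TypeId = enum { void, s64, any, unresolved };

/// Hypothetical stand-in for the emitter: a ref resolves only if the
/// table has a non-null entry for it.
const Emitter = struct {
    ref_types: []const ?TypeId,
    failures: usize = 0,

    /// Same shape as `getRefIRType`: null when the ref has no known type.
    fn getRefIRType(self: *const Emitter, ref: usize) ?TypeId {
        if (ref >= self.ref_types.len) return null;
        return self.ref_types[ref];
    }

    /// Shared policy: never substitute a real type. Record the failure
    /// (a real emitter would emit a diagnostic or panic here) and return
    /// a sentinel that cannot be mistaken for a real argument type.
    fn argIRTypeOrFail(self: *Emitter, ref: usize) TypeId {
        if (self.getRefIRType(ref)) |ty| return ty;
        self.failures += 1;
        return .unresolved;
    }
};
```

Keeping the policy in one function means each of the four call sites only has to decide what to do with `.unresolved`, not how to detect it.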


@@ -0,0 +1,97 @@
> **RESOLVED** (2026-06-03)
> **Root cause:** the `type_name` / `type_eq` reflection builtins resolved their
> `Type` arg's IR type with `getRefIRType(...) orelse TypeId.s64`, then gated `== .any`
> — so a failed must-succeed lookup silently became "bare i64" (`.s64 != .any`),
> reading the wrong value with no diagnostic.
> **Fix:** added the sibling classifier `LLVMEmitter.reflectArgRepr`
> (`src/ir/emit_llvm.zig`) which routes the lookup through `argIRTypeOrFail` →
> `.unresolved` and returns `{ boxed, bare, unresolved }`. The three emit sites
> (`src/backend/llvm/ops.zig` `type_name` + `type_eq` ×2) now `switch` on it: `.boxed`
> extracts the `Any` value field, `.bare` uses the value directly, and `.unresolved`
> hits a hard `@panic` tripwire — never silently classified as bare. Happy path
> (real args always resolve) is byte-identical; suite stays 361/0.
> **Secondary (confirmed intentional):** `src/ir/lower.zig:2531/2532`
> (`null_literal` / `undef_literal` → `target_type orelse .void`) is a typeless-literal
> default, not a lookup-swallow — `emitConstNull`/`emitConstUndef` deliberately handle
> `.void` (null-ptr / undef-i64). Left in place with an invariant comment.
> **Regression test:** `src/ir/emit_llvm.test.zig` — "emit: reflectArgRepr surfaces
> .unresolved for an unresolvable reflection arg ref (issue 0075)" (fail-before with
> `orelse .s64` → `.bare`; pass-after → `.unresolved`).
# 0075 — silent `getRefIRType(...) orelse TypeId.s64` fallback in reflection builtins
## Symptom
**One-line:** The `type_name` and `type_eq` reflection builtins resolve their Type
argument's IR type via `getRefIRType(...) orelse TypeId.s64` — the forbidden
silent-type-lookup fallback (`.s64` is the exact issue-0042 sentinel the project
rules name) — so a failed must-succeed lookup silently decides "not boxed (`!= .any`)"
and mis-handles the value with no diagnostic.
**Observed (primary — must fix):** `self.e.getRefIRType(...) orelse TypeId.s64` at:
- `src/backend/llvm/ops.zig:1023` (`.type_name` builtin — `arg_ir_ty`, gates the
`== .any` boxed-extract vs bare-i64 decision)
- `src/backend/llvm/ops.zig:1049` (`.type_eq` builtin — first operand)
- `src/backend/llvm/ops.zig:1055` (`.type_eq` builtin — second operand)
`getRefIRType` (`src/ir/emit_llvm.zig:2229`, `?TypeId`) returns `null` only when a ref
is neither a function param nor a block instruction result — a must-not-happen case
for a real builtin argument. On `null` the code defaults to `.s64`, then tests
`arg_ir_ty == .any`; the `.s64` default silently means "treat as a bare TypeId index,
not a boxed `Any`", so a genuinely-boxed arg whose lookup failed would skip the
`ExtractValue` and use the wrong value — silent miscompile, no diagnostic.
**Expected:** per `CLAUDE.md` REJECTED PATTERNS, a failed must-succeed type lookup
surfaces a diagnostic / hard tripwire (e.g. the `.unresolved` sentinel introduced for
issue 0074), never a real-type default.
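
The difference between the observed and expected behavior can be sketched in isolation. This is a minimal model, not the emitter code: `TypeId`, `ArgRepr`, `classifyOld`, and `classifyNew` are hypothetical names, and the optional parameter stands in for the result of `getRefIRType`.

```zig
const std = @import("std");

/// Hypothetical stand-ins; only the tags needed for the illustration.
const TypeId = enum { s64, any, unresolved };
const ArgRepr = enum { boxed, bare, unresolved };

/// The flagged pattern: a failed lookup becomes `.s64`, so the
/// `== .any` gate quietly answers "bare".
fn classifyOld(looked_up: ?TypeId) ArgRepr {
    const ty = looked_up orelse TypeId.s64;
    return if (ty == .any) .boxed else .bare;
}

/// The expected pattern: a failed lookup stays distinguishable, so the
/// caller is forced to handle it (diagnostic or tripwire).
fn classifyNew(looked_up: ?TypeId) ArgRepr {
    const ty = looked_up orelse TypeId.unresolved;
    return switch (ty) {
        .any => .boxed,
        .unresolved => .unresolved,
        .s64 => .bare,
    };
}
```

With a `null` lookup, `classifyOld` returns `.bare` while `classifyNew` returns `.unresolved`; for resolved inputs the two agree, which is why the happy path is unaffected.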
## Secondary (confirm — borderline)
- `src/ir/lower.zig:2527`: `.null_literal => constNull(self.target_type orelse .void)`
- `src/ir/lower.zig:2528`: `.undef_literal => constUndef(self.target_type orelse .void)`
`target_type` is a context hint that may be legitimately absent for a bare
`null`/`undef` with no expected type — this may be an INTENTIONAL default rather
than a lookup-swallow. The fix session should confirm: if a `null`/`undef` literal
reaching here without a `target_type` is actually a must-not-happen case, make it
loud; if a typeless null/undef is legitimate, leave it and add a one-line comment
stating the invariant.
## Audited — intentional language defaults (NO action; documented so they aren't re-flagged)
- `src/ir/lower.zig:4855`, `int_literal => constInt(lit.value, info.ty orelse .s64)`:
an untyped integer literal defaulting to `s64` is standard language semantics, not a
lookup failure.
- `src/ir/lower.zig:4856`, `float_literal => constFloat(..., info.ty orelse .f64)`:
untyped float literal defaults to `f64` — language semantics.
- `src/ir/type_bridge.zig:334`, `.tag_type = tag_type orelse .s64`: documented
("enum unions are always tagged (default i64)") — an intentional default tag type,
not a swallowed lookup.
## Provenance / scope
Pre-existing, NOT introduced by the arch-refactor. Discovered during the **issue-0074
fix** (the fix worker surfaced the reflection `.s64` fallbacks as a separate pattern
outside 0074's FFI-arg scope) and confirmed by a manager sweep
(`rg "orelse \.(s64|void|...)" src`). Filed per the IMPASSIBLE RULE (existing
default-returns that swallow a lookup failure → file, don't fix in place).
## Reproduction
Latent / static (same nature as 0074): well-formed IR always gives a builtin arg a
resolvable type, so the `.s64` default can't be driven at runtime today — which is why
it's dangerous (a future IR change would silently miscompile `type_name`/`type_eq`).
Exercised by the comptime/reflection examples; the fix must keep the suite at 361/0.
## Investigation prompt (ready to paste into a fresh session)
> In `/Users/agra/projects/sx`, the `.type_name` and `.type_eq` reflection builtins in
> `src/backend/llvm/ops.zig` (lines 1023, 1049, 1055) resolve a Type argument's IR type
> with the forbidden silent fallback `getRefIRType(...) orelse TypeId.s64`, used to gate
> a `== .any` boxed-vs-bare decision. Issue 0074 already added the shared resolver
> `LLVMEmitter.argIRTypeOrFail` (`src/ir/emit_llvm.zig`) returning the dedicated
> `.unresolved` sentinel on a failed lookup. Route these three sites through that helper
> (or a sibling) so a failed lookup yields `.unresolved`, never `.s64`; then `== .any`
> is false for `.unresolved` AND you must make the unresolved case loud (diagnostic via
> `self.diagnostics.addFmt(.err, span, ...)` or a hard tripwire), not silently "bare
> i64". Also resolve the borderline `lower.zig:2527/2528` `target_type orelse .void`
> (confirm intentional vs make-loud; comment the invariant either way). Leave the
> audited intentional defaults (`lower.zig:4855/4856`, `type_bridge.zig:334`) untouched.
> Verify: `/Users/agra/.zvm/bin/zig build && /Users/agra/.zvm/bin/zig build test &&
> bash tests/run_examples.sh` stays 361/0; add a `*.test.zig` regression test asserting
> the loud `.unresolved` path for a `type_name`/`type_eq` arg with an unresolvable ref
> (fail-before/pass-after). Expected new behavior: an unresolved reflection-builtin arg
> type surfaces loudly, never silently defaults to `.s64`.

123
src/backend/llvm/abi.zig Normal file

@@ -0,0 +1,123 @@
const std = @import("std");
const llvm = @import("../../llvm_api.zig");
const c = llvm.c;
const ir_types = @import("../../ir/types.zig");
const emit = @import("../../ir/emit_llvm.zig");
const TypeId = ir_types.TypeId;
const LLVMEmitter = emit.LLVMEmitter;
/// C-ABI parameter coercion (architecture phase A7.1), extracted from
/// `LLVMEmitter`. A backend `*LLVMEmitter` facade: it borrows the emitter for
/// the cached LLVM handles, the IR type table, the module data layout, and the
/// IR builder. `LLVMEmitter.{abiCoerceParamType, abiCoerceParamTypeEx,
/// needsByval, materializeByvalArg}` are thin wrappers delegating here.
///
/// On ARM64 (and x86_64), the C calling convention coerces small struct
/// arguments to integers for register passing:
/// - String/slice {ptr, i64} → ptr (extract raw pointer)
/// - Small integer struct (≤ 8 bytes, non-HFA) → i64
/// - HFA (homogeneous float aggregate) → leave as-is (LLVM handles it)
pub const AbiLowering = struct {
e: *LLVMEmitter,
pub fn abiCoerceParamType(self: AbiLowering, ir_ty: TypeId, llvm_ty: c.LLVMTypeRef) c.LLVMTypeRef {
return self.abiCoerceParamTypeEx(ir_ty, llvm_ty, true);
}
/// Same as `abiCoerceParamType` but with an explicit
/// `is_foreign_c_api` knob. When true, sx `string` / `[]T` slices
/// collapse to `ptr` — the libc convention where the user writes
/// `string` to mean `char *` and the length is dropped. When
/// false (sx-internal `callconv(.c)` like block trampolines), the
/// full slice shape is preserved and goes through the general
/// struct-coerce path (16-byte slice → `[2 x i64]`, lands in two
/// registers on AArch64 — the true C ABI for a 16-byte
/// aggregate). Without the split, sx-to-sx calls through a
/// `(*Block, string) -> void callconv(.c)` fn-pointer mismatched
/// the caller's `{ptr, i64}` value against the trampoline's
/// collapsed `ptr` param.
pub fn abiCoerceParamTypeEx(self: AbiLowering, ir_ty: TypeId, llvm_ty: c.LLVMTypeRef, is_foreign_c_api: bool) c.LLVMTypeRef {
if (is_foreign_c_api) {
if (ir_ty == .string) return self.e.cached_ptr;
if (!ir_ty.isBuiltin()) {
const info = self.e.ir_mod.types.get(ir_ty);
if (info == .slice) return self.e.cached_ptr;
}
}
// WASM32: usize/isize are pointer-sized (i32 on wasm32).
// Other integer types (s64, u64) keep their declared size — they represent
// genuinely 64-bit values (SDL_WindowFlags, timestamps, etc.).
if (self.e.target_config.isWasm32()) {
if (ir_ty == .usize or ir_ty == .isize) return self.e.cached_i32;
return llvm_ty;
}
// Only coerce struct types
if (c.LLVMGetTypeKind(llvm_ty) != c.LLVMStructTypeKind) return llvm_ty;
// Check if it's an HFA (all float or all double fields) — leave as-is
const n_fields = c.LLVMCountStructElementTypes(llvm_ty);
if (n_fields >= 1 and n_fields <= 4) {
var all_float = true;
var all_double = true;
var fi: c_uint = 0;
while (fi < n_fields) : (fi += 1) {
const ft = c.LLVMStructGetTypeAtIndex(llvm_ty, fi);
const fk = c.LLVMGetTypeKind(ft);
if (fk != c.LLVMFloatTypeKind) all_float = false;
if (fk != c.LLVMDoubleTypeKind) all_double = false;
}
if (all_float or all_double) return llvm_ty;
}
// Small struct (≤ 8 bytes) → coerce to i64
const size = c.LLVMABISizeOfType(
c.LLVMGetModuleDataLayout(self.e.llvm_module),
llvm_ty,
);
if (size <= 8) return self.e.cached_i64;
// Medium struct (9-16 bytes) → coerce to [2 x i64]
if (size <= 16) {
return c.LLVMArrayType2(self.e.cached_i64, 2);
}
// Large composite (> 16 bytes) → pass by reference: ptr + byval(<T>) at
// the call/sig sites. LLVM's AArch64/x86_64 backend lowers byval to
// the right ABI sequence (caller copy + indirect arg).
return self.e.cached_ptr;
}
pub fn needsByval(self: AbiLowering, ir_ty: TypeId, raw_llvm_ty: c.LLVMTypeRef) bool {
if (self.e.target_config.isWasm32()) return false;
if (ir_ty == .string) return false;
if (!ir_ty.isBuiltin()) {
const info = self.e.ir_mod.types.get(ir_ty);
if (info == .slice) return false;
}
if (c.LLVMGetTypeKind(raw_llvm_ty) != c.LLVMStructTypeKind) return false;
const n = c.LLVMCountStructElementTypes(raw_llvm_ty);
if (n >= 1 and n <= 4) {
var all_f = true;
var all_d = true;
var i: c_uint = 0;
while (i < n) : (i += 1) {
const ft = c.LLVMStructGetTypeAtIndex(raw_llvm_ty, i);
const fk = c.LLVMGetTypeKind(ft);
if (fk != c.LLVMFloatTypeKind) all_f = false;
if (fk != c.LLVMDoubleTypeKind) all_d = false;
}
if (all_f or all_d) return false;
}
const size = c.LLVMABISizeOfType(c.LLVMGetModuleDataLayout(self.e.llvm_module), raw_llvm_ty);
return size > 16;
}
pub fn materializeByvalArg(self: AbiLowering, val: c.LLVMValueRef, struct_ty: c.LLVMTypeRef) c.LLVMValueRef {
const tmp = c.LLVMBuildAlloca(self.e.builder, struct_ty, "byval.tmp");
_ = c.LLVMBuildStore(self.e.builder, val, tmp);
return tmp;
}
};

160
src/backend/llvm/debug.zig Normal file

@@ -0,0 +1,160 @@
const std = @import("std");
const llvm = @import("../../llvm_api.zig");
const c = llvm.c;
const errors = @import("../../errors.zig");
const emit = @import("../../ir/emit_llvm.zig");
const ir_inst = @import("../../ir/inst.zig");
const LLVMEmitter = emit.LLVMEmitter;
const Function = ir_inst.Function;
const Span = ir_inst.Span;
/// DWARF debug-info emission (architecture phase A7.2), extracted from
/// `LLVMEmitter`. A backend `*LLVMEmitter` facade (field `e`): it owns the
/// `DIBuilder` lifecycle, the compile unit, per-function `DISubprogram` scopes,
/// and per-instruction `DILocation`s. The mutable DI state (`di_builder`/
/// `di_cu`/`di_files`/`di_scope`/`current_func_file`) + the shared source map
/// (`import_sources`/`main_file`, also read by `#caller_location`) stay on
/// `LLVMEmitter`; this reads/writes them via `self.e.*`. `LLVMEmitter.emit`
/// drives the pass order and calls in via `self.debugInfo()`.
pub const DebugInfo = struct {
e: *LLVMEmitter,
/// Debug info is emitted only when error traces are kept (opt_level
/// none/less, matching `tracesEnabled` in lower.zig) and a source
/// map is available. Release builds (default/aggressive) skip it, so
/// the DWARF is strippable cost-free.
fn debugEnabled(self: DebugInfo) bool {
if (self.e.import_sources == null) return false;
return self.e.target_config.opt_level == .none or self.e.target_config.opt_level == .less;
}
/// The `DIFile` for `path`, created once and cached. Splits the path
/// into basename + directory as DWARF expects. The directory MUST be
/// non-empty: an empty `DW_AT_comp_dir` makes Apple's `ld` silently drop
/// the whole object's debug map (no `N_OSO`), so a binary built from a
/// bare filename (e.g. `sx build main.sx`) becomes undebuggable. Fall back
/// to "." when the path has no directory component.
fn diFileFor(self: DebugInfo, path: []const u8) c.LLVMMetadataRef {
if (self.e.di_files.get(path)) |f| return f;
const slash = std.mem.lastIndexOfScalar(u8, path, '/');
const dir = if (slash) |s| (if (s == 0) "/" else path[0..s]) else ".";
const base = if (slash) |s| path[s + 1 ..] else path;
const f = c.LLVMDIBuilderCreateFile(self.e.di_builder, base.ptr, base.len, dir.ptr, dir.len);
self.e.di_files.put(path, f) catch {};
return f;
}
/// Create the DIBuilder, the module flags ("Debug Info Version" /
/// "Dwarf Version"), and the single compile unit on the main file.
pub fn initDebugInfo(self: DebugInfo) void {
if (!self.debugEnabled()) return;
self.e.di_builder = c.LLVMCreateDIBuilder(self.e.llvm_module);
c.LLVMAddModuleFlag(
self.e.llvm_module,
c.LLVMModuleFlagBehaviorWarning,
"Debug Info Version",
"Debug Info Version".len,
c.LLVMValueAsMetadata(c.LLVMConstInt(self.e.cached_i32, c.LLVMDebugMetadataVersion(), 0)),
);
c.LLVMAddModuleFlag(
self.e.llvm_module,
c.LLVMModuleFlagBehaviorWarning,
"Dwarf Version",
"Dwarf Version".len,
c.LLVMValueAsMetadata(c.LLVMConstInt(self.e.cached_i32, 4, 0)),
);
const cu_file = self.diFileFor(if (self.e.main_file.len > 0) self.e.main_file else "sx");
self.e.di_cu = c.LLVMDIBuilderCreateCompileUnit(
self.e.di_builder,
c.LLVMDWARFSourceLanguageC,
cu_file,
"sx",
"sx".len,
0, // isOptimized
"",
0, // flags
0, // runtime version
"",
0, // split name
c.LLVMDWARFEmissionFull,
0, // DWOId
0, // split debug inlining
0, // debug info for profiling
"",
0, // sysroot
"",
0, // sdk
);
}
/// Create a `DISubprogram` for `func` and attach it to `llvm_func`,
/// making it the scope (`di_scope`) for the function's instruction
/// locations. Clears any stale builder location first so synthetic
/// functions emitted between sx functions carry none.
pub fn beginFunctionDebug(self: DebugInfo, func: *const Function, llvm_func: c.LLVMValueRef, name: []const u8) void {
self.e.di_scope = null;
c.LLVMSetCurrentDebugLocation2(self.e.builder, null);
if (self.e.di_builder == null) return;
const file = func.source_file orelse self.e.main_file;
self.e.current_func_file = file;
const di_file = self.diFileFor(file);
const subroutine_ty = c.LLVMDIBuilderCreateSubroutineType(self.e.di_builder, di_file, null, 0, c.LLVMDIFlagZero);
// Line = the first instruction's line (the function body's start),
// else 1 when the body is empty / span-less.
var line: c_uint = 1;
if (func.blocks.items.len > 0 and func.blocks.items[0].insts.items.len > 0) {
const sp = func.blocks.items[0].insts.items[0].span;
const src = self.e.sourceForFile(file);
line = errors.SourceLoc.compute(src, sp.start).line;
}
const is_local: c.LLVMBool = if (func.linkage == .external) 0 else 1;
const subprogram = c.LLVMDIBuilderCreateFunction(
self.e.di_builder,
di_file, // scope
name.ptr,
name.len,
name.ptr,
name.len, // linkage name
di_file,
line,
subroutine_ty,
is_local,
1, // is definition
line, // scope line
c.LLVMDIFlagZero,
0, // isOptimized
);
c.LLVMSetSubprogram(llvm_func, subprogram);
self.e.di_scope = subprogram;
}
/// End the current function's debug scope and clear the builder's
/// location, so the next (possibly synthetic) function doesn't
/// inherit a DILocation pointing into this function's subprogram.
pub fn endFunctionDebug(self: DebugInfo) void {
self.e.di_scope = null;
c.LLVMSetCurrentDebugLocation2(self.e.builder, null);
}
/// Set the builder's current debug location from an instruction span,
/// scoped to the current function's subprogram. No-op when debug info
/// is off (`di_scope == null`).
pub fn setInstDebugLocation(self: DebugInfo, span: Span) void {
const scope = self.e.di_scope orelse return;
const src = self.e.sourceForFile(self.e.current_func_file);
const loc = errors.SourceLoc.compute(src, span.start);
const di_loc = c.LLVMDIBuilderCreateDebugLocation(self.e.context, loc.line, loc.col, scope, null);
c.LLVMSetCurrentDebugLocation2(self.e.builder, di_loc);
}
pub fn finalizeDebugInfo(self: DebugInfo) void {
if (self.e.di_builder == null) return;
c.LLVMDIBuilderFinalize(self.e.di_builder);
}
};


@@ -0,0 +1,509 @@
const std = @import("std");
const llvm = @import("../../llvm_api.zig");
const c = llvm.c;
const emit = @import("../../ir/emit_llvm.zig");
const LLVMEmitter = emit.LLVMEmitter;
const JniSlotPair = LLVMEmitter.JniSlotPair;
/// Obj-C / JNI runtime-constructor emission (architecture phase A7.3), extracted
/// from `LLVMEmitter`. A backend `*LLVMEmitter` facade (field `e`): it builds the
/// module-init constructors that populate the cached selector / class slots and
/// register sx-defined `#objc_class` class pairs (IMP tables, ivars, +alloc /
/// -dealloc / property IMPs, `#implements` protocol conformances). Reads the
/// emit-time caches (`ir_mod.objc_*_cache`, `global_map`) + cached LLVM handles
/// via `self.e.*`; the shared infra it calls back into
/// (`lazyDeclareCRuntime`/`emitPrivateCString`/`injectCtorIntoMain`) stays on
/// `LLVMEmitter`. `LLVMEmitter.emit` drives pass order via `self.ffiCtors()`.
pub const FfiCtors = struct {
e: *LLVMEmitter,
pub fn emitObjcSelectorInit(self: FfiCtors) void {
if (self.e.ir_mod.objc_selector_cache.items.len == 0) return;
// Lazy-declare sel_registerName for the constructor body —
// lower.zig only declares it when a non-literal selector
// appears, which the constructor doesn't depend on.
const sel_reg_name = "sel_registerName";
const sel_reg_z = self.e.alloc.dupeZ(u8, sel_reg_name) catch unreachable;
defer self.e.alloc.free(sel_reg_z);
var sel_reg_fn = c.LLVMGetNamedFunction(self.e.llvm_module, sel_reg_z.ptr);
var sel_reg_ty: c.LLVMTypeRef = undefined;
if (sel_reg_fn == null) {
var params: [1]c.LLVMTypeRef = .{self.e.cached_ptr};
sel_reg_ty = c.LLVMFunctionType(self.e.cached_ptr, &params, 1, 0);
sel_reg_fn = c.LLVMAddFunction(self.e.llvm_module, sel_reg_z.ptr, sel_reg_ty);
c.LLVMSetLinkage(sel_reg_fn, c.LLVMExternalLinkage);
} else {
sel_reg_ty = c.LLVMGlobalGetValueType(sel_reg_fn);
}
// Constructor: void __sx_objc_selector_init().
var no_params: [0]c.LLVMTypeRef = .{};
const ctor_ty = c.LLVMFunctionType(self.e.cached_void, &no_params, 0, 0);
const ctor = c.LLVMAddFunction(self.e.llvm_module, "__sx_objc_selector_init", ctor_ty);
c.LLVMSetLinkage(ctor, c.LLVMInternalLinkage);
const entry = c.LLVMAppendBasicBlockInContext(self.e.context, ctor, "entry");
c.LLVMPositionBuilderAtEnd(self.e.builder, entry);
for (self.e.ir_mod.objc_selector_cache.items) |entry_kv| {
const sel_str = entry_kv.sel;
const slot_gid = entry_kv.slot;
const slot_global = self.e.global_map.get(@intCast(slot_gid.index())) orelse continue;
// Method-name C-string — names match clang's convention
// so debuggers / nm / dyld see the same symbols, even
// though the surrounding section tagging isn't load-
// bearing in our JIT path.
const meth_str_z = self.e.alloc.allocSentinel(u8, sel_str.len, 0) catch continue;
defer self.e.alloc.free(meth_str_z);
@memcpy(meth_str_z[0..sel_str.len], sel_str);
const str_const = c.LLVMConstStringInContext(self.e.context, meth_str_z.ptr, @intCast(sel_str.len), 0);
const str_global = c.LLVMAddGlobal(self.e.llvm_module, c.LLVMTypeOf(str_const), "OBJC_METH_VAR_NAME_");
c.LLVMSetInitializer(str_global, str_const);
c.LLVMSetLinkage(str_global, c.LLVMPrivateLinkage);
c.LLVMSetGlobalConstant(str_global, 1);
c.LLVMSetUnnamedAddress(str_global, c.LLVMGlobalUnnamedAddr);
var sel_args: [1]c.LLVMValueRef = .{str_global};
const sel_val = c.LLVMBuildCall2(self.e.builder, sel_reg_ty, sel_reg_fn, &sel_args, 1, "sel");
_ = c.LLVMBuildStore(self.e.builder, sel_val, slot_global);
}
_ = c.LLVMBuildRetVoid(self.e.builder);
// Register the constructor in @llvm.global_ctors. dyld picks
// this up for a fully-linked binary at load time.
const i32_ty = self.e.cached_i32;
const ptr_ty = self.e.cached_ptr;
var ctor_field_types: [3]c.LLVMTypeRef = .{ i32_ty, ptr_ty, ptr_ty };
const ctor_struct_ty = c.LLVMStructTypeInContext(self.e.context, &ctor_field_types, 3, 0);
var ctor_fields: [3]c.LLVMValueRef = .{
c.LLVMConstInt(i32_ty, 65535, 0),
ctor,
c.LLVMConstNull(ptr_ty),
};
const ctor_entry = c.LLVMConstNamedStruct(ctor_struct_ty, &ctor_fields, 3);
const ctors_arr_ty = c.LLVMArrayType2(ctor_struct_ty, 1);
var ctor_entries: [1]c.LLVMValueRef = .{ctor_entry};
const ctors_init = c.LLVMConstArray2(ctor_struct_ty, &ctor_entries, 1);
const ctors_global = c.LLVMAddGlobal(self.e.llvm_module, ctors_arr_ty, "llvm.global_ctors");
c.LLVMSetInitializer(ctors_global, ctors_init);
c.LLVMSetLinkage(ctors_global, c.LLVMAppendingLinkage);
// BUT — LLVM's ORC JIT (the engine for `sx run`) doesn't
// automatically run `@llvm.global_ctors`. Inject a direct
// call from `main`'s entry block as well; idempotent under
// dyld (sel_registerName returns the same SEL on second call).
const main_z = "main";
const main_fn = c.LLVMGetNamedFunction(self.e.llvm_module, main_z);
if (main_fn != null) {
const entry_bb = c.LLVMGetEntryBasicBlock(main_fn);
const first_inst = c.LLVMGetFirstInstruction(entry_bb);
if (first_inst != null) {
c.LLVMPositionBuilderBefore(self.e.builder, first_inst);
} else {
c.LLVMPositionBuilderAtEnd(self.e.builder, entry_bb);
}
var no_args: [0]c.LLVMValueRef = .{};
_ = c.LLVMBuildCall2(self.e.builder, ctor_ty, ctor, &no_args, 0, "");
}
}
/// Phase 3.1 companion to `emitObjcSelectorInit`. Walks
/// `module.objc_class_cache` and synthesizes a constructor that
/// populates each cached `Class*` slot via `objc_getClass(name)`
/// exactly once at module-init. Registered in `@llvm.global_ctors`
/// AND injected at the top of `main()` for the ORC JIT path.
pub fn emitObjcClassInit(self: FfiCtors) void {
if (self.e.ir_mod.objc_class_cache.items.len == 0) return;
// Lazy-declare objc_getClass(name: *u8) -> *void.
const get_class_name = "objc_getClass";
const get_class_z = self.e.alloc.dupeZ(u8, get_class_name) catch unreachable;
defer self.e.alloc.free(get_class_z);
var get_class_fn = c.LLVMGetNamedFunction(self.e.llvm_module, get_class_z.ptr);
var get_class_ty: c.LLVMTypeRef = undefined;
if (get_class_fn == null) {
var params: [1]c.LLVMTypeRef = .{self.e.cached_ptr};
get_class_ty = c.LLVMFunctionType(self.e.cached_ptr, &params, 1, 0);
get_class_fn = c.LLVMAddFunction(self.e.llvm_module, get_class_z.ptr, get_class_ty);
c.LLVMSetLinkage(get_class_fn, c.LLVMExternalLinkage);
} else {
get_class_ty = c.LLVMGlobalGetValueType(get_class_fn);
}
// Constructor: void __sx_objc_class_init().
var no_params: [0]c.LLVMTypeRef = .{};
const ctor_ty = c.LLVMFunctionType(self.e.cached_void, &no_params, 0, 0);
const ctor = c.LLVMAddFunction(self.e.llvm_module, "__sx_objc_class_init", ctor_ty);
c.LLVMSetLinkage(ctor, c.LLVMInternalLinkage);
const entry = c.LLVMAppendBasicBlockInContext(self.e.context, ctor, "entry");
c.LLVMPositionBuilderAtEnd(self.e.builder, entry);
for (self.e.ir_mod.objc_class_cache.items) |entry_kv| {
const class_name = entry_kv.name;
const slot_gid = entry_kv.slot;
const slot_global = self.e.global_map.get(@intCast(slot_gid.index())) orelse continue;
// Class-name C-string.
const name_z = self.e.alloc.allocSentinel(u8, class_name.len, 0) catch continue;
defer self.e.alloc.free(name_z);
@memcpy(name_z[0..class_name.len], class_name);
const str_const = c.LLVMConstStringInContext(self.e.context, name_z.ptr, @intCast(class_name.len), 0);
const str_global = c.LLVMAddGlobal(self.e.llvm_module, c.LLVMTypeOf(str_const), "OBJC_CLASS_NAME_");
c.LLVMSetInitializer(str_global, str_const);
c.LLVMSetLinkage(str_global, c.LLVMPrivateLinkage);
c.LLVMSetGlobalConstant(str_global, 1);
c.LLVMSetUnnamedAddress(str_global, c.LLVMGlobalUnnamedAddr);
var call_args: [1]c.LLVMValueRef = .{str_global};
const class_val = c.LLVMBuildCall2(self.e.builder, get_class_ty, get_class_fn, &call_args, 1, "cls");
_ = c.LLVMBuildStore(self.e.builder, class_val, slot_global);
}
_ = c.LLVMBuildRetVoid(self.e.builder);
// Register in @llvm.global_ctors for AOT + inject into main for ORC JIT.
const i32_ty = self.e.cached_i32;
const ptr_ty = self.e.cached_ptr;
var ctor_field_types: [3]c.LLVMTypeRef = .{ i32_ty, ptr_ty, ptr_ty };
const ctor_struct_ty = c.LLVMStructTypeInContext(self.e.context, &ctor_field_types, 3, 0);
var ctor_fields: [3]c.LLVMValueRef = .{
c.LLVMConstInt(i32_ty, 65535, 0),
ctor,
c.LLVMConstNull(ptr_ty),
};
const ctor_entry = c.LLVMConstNamedStruct(ctor_struct_ty, &ctor_fields, 3);
// Append-vs-replace the existing global_ctors. Selector init may
// have created `@llvm.global_ctors` already — extend its array
// rather than overwriting.
const existing_z = "llvm.global_ctors";
const existing = c.LLVMGetNamedGlobal(self.e.llvm_module, existing_z);
if (existing != null) {
const existing_init = c.LLVMGetInitializer(existing);
const existing_arr_ty = c.LLVMGlobalGetValueType(existing);
const old_count = c.LLVMGetArrayLength(existing_arr_ty);
const new_count: c_uint = old_count + 1;
var new_entries = std.ArrayList(c.LLVMValueRef).empty;
defer new_entries.deinit(self.e.alloc);
var i: c_uint = 0;
while (i < old_count) : (i += 1) {
new_entries.append(self.e.alloc, c.LLVMGetAggregateElement(existing_init, i)) catch unreachable;
}
new_entries.append(self.e.alloc, ctor_entry) catch unreachable;
const new_arr_ty = c.LLVMArrayType2(ctor_struct_ty, new_count);
const new_init = c.LLVMConstArray2(ctor_struct_ty, new_entries.items.ptr, new_count);
const new_global = c.LLVMAddGlobal(self.e.llvm_module, new_arr_ty, "llvm.global_ctors.new");
c.LLVMSetInitializer(new_global, new_init);
c.LLVMSetLinkage(new_global, c.LLVMAppendingLinkage);
c.LLVMSetValueName2(existing, "llvm.global_ctors.old", "llvm.global_ctors.old".len);
c.LLVMSetValueName2(new_global, "llvm.global_ctors", "llvm.global_ctors".len);
c.LLVMDeleteGlobal(existing);
} else {
const ctors_arr_ty = c.LLVMArrayType2(ctor_struct_ty, 1);
var ctor_entries: [1]c.LLVMValueRef = .{ctor_entry};
const ctors_init = c.LLVMConstArray2(ctor_struct_ty, &ctor_entries, 1);
const ctors_global = c.LLVMAddGlobal(self.e.llvm_module, ctors_arr_ty, "llvm.global_ctors");
c.LLVMSetInitializer(ctors_global, ctors_init);
c.LLVMSetLinkage(ctors_global, c.LLVMAppendingLinkage);
}
// ORC JIT injection: same trick as emitObjcSelectorInit. Inject a
// direct call from main's entry so the JIT path populates the
// slots too. Must run AFTER the selector init's main injection
// (selectors are needed independently of class objects), so we
// place this call AFTER the first instruction (which is the
// selector-init call, if present) rather than at the very top.
const main_z = "main";
const main_fn = c.LLVMGetNamedFunction(self.e.llvm_module, main_z);
if (main_fn != null) {
const entry_bb = c.LLVMGetEntryBasicBlock(main_fn);
// Walk past any existing init calls (selector init etc.) so
// class init runs after them. The order within main's prelude
// doesn't matter functionally (the two caches are independent),
// but stable ordering keeps IR snapshots deterministic.
var insert_before = c.LLVMGetFirstInstruction(entry_bb);
while (insert_before != null) : (insert_before = c.LLVMGetNextInstruction(insert_before)) {
if (c.LLVMGetInstructionOpcode(insert_before) != c.LLVMCall) break;
}
if (insert_before != null) {
c.LLVMPositionBuilderBefore(self.e.builder, insert_before);
} else {
c.LLVMPositionBuilderAtEnd(self.e.builder, entry_bb);
}
var no_args: [0]c.LLVMValueRef = .{};
_ = c.LLVMBuildCall2(self.e.builder, ctor_ty, ctor, &no_args, 0, "");
}
}
/// M1.2 A.4 — emit class-pair registration constructor for every
/// sx-defined `#objc_class` declaration. Same shape as the Phase
/// 3.1 `emitObjcClassInit` companion: a `@llvm.global_ctors`-
/// registered constructor that runs at module load AND gets
/// injected at the top of `main` for the ORC JIT path (which
/// doesn't honor `@llvm.global_ctors`).
///
/// For each entry in `objc_defined_class_cache`:
/// super_cls = objc_getClass("<ParentName>") // default NSObject
/// cls = objc_allocateClassPair(super_cls, "<ClassName>", 0)
/// class_addIvar(cls, "__sx_state", 8, 3, "^v") // M1.2 A.4b.i
/// objc_registerClassPair(cls)
/// g_<ClassName>_state_ivar = class_getInstanceVariable(cls, "__sx_state")
///
/// Method IMPs (`class_addMethod`) and the `+alloc` / `-dealloc`
/// overrides come in A.4b.ii / A.5 / A.6.
pub fn emitObjcDefinedClassInit(self: FfiCtors) void {
if (self.e.ir_mod.objc_defined_class_cache.items.len == 0) return;
const ptr_ty = self.e.cached_ptr;
const i32_ty = self.e.cached_i32;
const i64_ty = self.e.cached_i64;
const i8_ty = c.LLVMInt8TypeInContext(self.e.context);
// Lazy-declare the Obj-C runtime APIs the constructor calls.
// objc_getClass(name: *u8) -> *void.
const get_class_fn, const get_class_ty = self.e.lazyDeclareCRuntime("objc_getClass", &[_]c.LLVMTypeRef{ptr_ty}, ptr_ty, 0);
// objc_allocateClassPair(super: *void, name: *u8, extra: usize) -> *void.
const alloc_pair_fn, const alloc_pair_ty = self.e.lazyDeclareCRuntime("objc_allocateClassPair", &[_]c.LLVMTypeRef{ ptr_ty, ptr_ty, i64_ty }, ptr_ty, 0);
// class_addIvar(cls: *void, name: *u8, size: u64, log2align: u8, type: *u8) -> bool.
const add_ivar_fn, const add_ivar_ty = self.e.lazyDeclareCRuntime("class_addIvar", &[_]c.LLVMTypeRef{ ptr_ty, ptr_ty, i64_ty, i8_ty, ptr_ty }, i8_ty, 0);
// sel_registerName(name: *u8) -> *void.
const sel_reg_fn, const sel_reg_ty = self.e.lazyDeclareCRuntime("sel_registerName", &[_]c.LLVMTypeRef{ptr_ty}, ptr_ty, 0);
// class_addMethod(cls: *void, sel: *void, imp: *void, types: *u8) -> bool.
const add_method_fn, const add_method_ty = self.e.lazyDeclareCRuntime("class_addMethod", &[_]c.LLVMTypeRef{ ptr_ty, ptr_ty, ptr_ty, ptr_ty }, i8_ty, 0);
// objc_registerClassPair(cls: *void) -> void.
const register_fn, const register_ty = self.e.lazyDeclareCRuntime("objc_registerClassPair", &[_]c.LLVMTypeRef{ptr_ty}, self.e.cached_void, 0);
// class_getInstanceVariable(cls: *void, name: *u8) -> *Ivar.
const get_iv_fn, const get_iv_ty = self.e.lazyDeclareCRuntime("class_getInstanceVariable", &[_]c.LLVMTypeRef{ ptr_ty, ptr_ty }, ptr_ty, 0);
// Constructor: void __sx_objc_defined_class_init().
var no_params: [0]c.LLVMTypeRef = .{};
const ctor_ty = c.LLVMFunctionType(self.e.cached_void, &no_params, 0, 0);
const ctor = c.LLVMAddFunction(self.e.llvm_module, "__sx_objc_defined_class_init", ctor_ty);
c.LLVMSetLinkage(ctor, c.LLVMInternalLinkage);
const entry = c.LLVMAppendBasicBlockInContext(self.e.context, ctor, "entry");
c.LLVMPositionBuilderAtEnd(self.e.builder, entry);
// Reusable C-string globals for ivar metadata (same across classes).
const sx_state_name_global = self.e.emitPrivateCString("__sx_state", "OBJC_IVAR_NAME_");
const sx_state_enc_global = self.e.emitPrivateCString("^v", "OBJC_IVAR_TYPE_");
for (self.e.ir_mod.objc_defined_class_cache.items) |entry_kv| {
const fcd = entry_kv.decl;
const class_name = fcd.name;
// Parent class — pre-resolved Obj-C runtime name from
// lower.zig (M2.3 resolveObjcParentName). Stored on the
// cache entry so emit_llvm doesn't re-walk
// foreign_class_map here.
const parent_name = entry_kv.parent_objc_name;
const parent_str_global = self.e.emitPrivateCString(parent_name, "OBJC_CLASS_NAME_");
const class_str_global = self.e.emitPrivateCString(class_name, "OBJC_CLASS_NAME_");
// super_cls = objc_getClass("ParentName")
var get_args: [1]c.LLVMValueRef = .{parent_str_global};
const super_val = c.LLVMBuildCall2(self.e.builder, get_class_ty, get_class_fn, &get_args, 1, "super_cls");
// cls = objc_allocateClassPair(super_cls, "ClassName", 0)
var alloc_args: [3]c.LLVMValueRef = .{ super_val, class_str_global, c.LLVMConstInt(i64_ty, 0, 0) };
const cls_val = c.LLVMBuildCall2(self.e.builder, alloc_pair_ty, alloc_pair_fn, &alloc_args, 3, "cls");
// class_addIvar(cls, "__sx_state", 8, 3, "^v")
// size = 8 (pointer) — sizeof(*void) on 64-bit
// log2align = 3 — alignof(*void) = 8 = 2^3
// type = "^v" (encoded *void)
var ivar_args: [5]c.LLVMValueRef = .{
cls_val,
sx_state_name_global,
c.LLVMConstInt(i64_ty, 8, 0),
c.LLVMConstInt(i8_ty, 3, 0),
sx_state_enc_global,
};
_ = c.LLVMBuildCall2(self.e.builder, add_ivar_ty, add_ivar_fn, &ivar_args, 5, "");
// Class-method registration (M2.1(b)) and the +alloc IMP
// (M1.2 A.5) both target the metaclass. Compute it once
// up-front so all metaclass-bound class_addMethod calls
// can reference the same LLVM value.
//
// metaclass = object_getClass(cls). (object_getClass on a
// Class returns the metaclass — a Class IS an instance of
// its metaclass. Distinct from objc_getClass(name).)
const obj_get_class_fn, const obj_get_class_ty = self.e.lazyDeclareCRuntime("object_getClass", &[_]c.LLVMTypeRef{ptr_ty}, ptr_ty, 0);
var ogc_args: [1]c.LLVMValueRef = .{cls_val};
const metaclass_val = c.LLVMBuildCall2(self.e.builder, obj_get_class_ty, obj_get_class_fn, &ogc_args, 1, "metacls");
// class_addMethod(target, sel_registerName(sel), imp, encoding)
// — register each method's IMP trampoline (M1.2 A.4b.iii
// + M2.1(b)). Instance methods register on `cls`; class
// methods (`is_class`) on the metaclass. Must run BEFORE
// objc_registerClassPair; the runtime locks the method
// list at registration time on some SDK versions.
for (entry_kv.methods) |method| {
const sel_str_global = self.e.emitPrivateCString(method.sel, "OBJC_METH_VAR_NAME_");
const enc_str_global = self.e.emitPrivateCString(method.encoding, "OBJC_METH_VAR_TYPE_");
var sel_args: [1]c.LLVMValueRef = .{sel_str_global};
const sel_val = c.LLVMBuildCall2(self.e.builder, sel_reg_ty, sel_reg_fn, &sel_args, 1, "sel");
const imp_z = self.e.alloc.dupeZ(u8, method.imp_name) catch continue;
defer self.e.alloc.free(imp_z);
const imp_fn = c.LLVMGetNamedFunction(self.e.llvm_module, imp_z.ptr);
if (imp_fn == null) continue;
const target_cls = if (method.is_class) metaclass_val else cls_val;
var add_args: [4]c.LLVMValueRef = .{ target_cls, sel_val, imp_fn, enc_str_global };
_ = c.LLVMBuildCall2(self.e.builder, add_method_ty, add_method_fn, &add_args, 4, "");
}
// M2.3 / M3.2 — register `#implements` protocol conformances
// BEFORE objc_registerClassPair. iOS checks
// `class_conformsToProtocol` when instantiating scene
// delegates and other protocol-typed callbacks; without
// these the runtime silently rejects the class.
//
// The protocol may not be present on every SDK / runtime
// (dead-strip pruning, version skew), so `objc_getProtocol`
// can return null. No null check is emitted here: the result
// is passed to `class_addProtocol` as-is.
const get_proto_fn, const get_proto_ty = self.e.lazyDeclareCRuntime("objc_getProtocol", &[_]c.LLVMTypeRef{ptr_ty}, ptr_ty, 0);
const add_proto_fn, const add_proto_ty = self.e.lazyDeclareCRuntime("class_addProtocol", &[_]c.LLVMTypeRef{ ptr_ty, ptr_ty }, i8_ty, 0);
for (fcd.members) |m| switch (m) {
.implements => |proto_alias| {
const proto_str_global = self.e.emitPrivateCString(proto_alias, "OBJC_PROTOCOL_NAME_");
var gp_args: [1]c.LLVMValueRef = .{proto_str_global};
const proto_val = c.LLVMBuildCall2(self.e.builder, get_proto_ty, get_proto_fn, &gp_args, 1, "proto");
var ap_args: [2]c.LLVMValueRef = .{ cls_val, proto_val };
_ = c.LLVMBuildCall2(self.e.builder, add_proto_ty, add_proto_fn, &ap_args, 2, "");
},
else => {},
};
// objc_registerClassPair(cls)
var reg_args: [1]c.LLVMValueRef = .{cls_val};
_ = c.LLVMBuildCall2(self.e.builder, register_ty, register_fn, &reg_args, 1, "");
// Cache the class pointer in `__<Cls>_class` global so the
// synthesized -dealloc trampoline (M1.2 A.6) can use it for
// [super dealloc] dispatch via objc_msgSendSuper2.
const class_global_name = std.fmt.allocPrint(self.e.alloc, "__{s}_class", .{class_name}) catch continue;
defer self.e.alloc.free(class_global_name);
const class_global_z = self.e.alloc.dupeZ(u8, class_global_name) catch continue;
defer self.e.alloc.free(class_global_z);
const class_global = c.LLVMGetNamedGlobal(self.e.llvm_module, class_global_z.ptr);
if (class_global != null) {
_ = c.LLVMBuildStore(self.e.builder, cls_val, class_global);
}
// M1.2 A.6 — register the synthesized `-dealloc` IMP on the
// class itself (instance method). The runtime fires it at
// refcount-zero; the IMP frees __sx_state and chains to
// [super dealloc].
const dealloc_imp_name = std.fmt.allocPrint(self.e.alloc, "__{s}_dealloc_imp", .{class_name}) catch continue;
defer self.e.alloc.free(dealloc_imp_name);
const dealloc_imp_z = self.e.alloc.dupeZ(u8, dealloc_imp_name) catch continue;
defer self.e.alloc.free(dealloc_imp_z);
const dealloc_imp_fn = c.LLVMGetNamedFunction(self.e.llvm_module, dealloc_imp_z.ptr);
if (dealloc_imp_fn != null) {
const dealloc_sel_global = self.e.emitPrivateCString("dealloc", "OBJC_METH_VAR_NAME_");
const dealloc_enc_global = self.e.emitPrivateCString("v@:", "OBJC_METH_VAR_TYPE_");
var sel_args: [1]c.LLVMValueRef = .{dealloc_sel_global};
const sel_val = c.LLVMBuildCall2(self.e.builder, sel_reg_ty, sel_reg_fn, &sel_args, 1, "sel_dealloc");
var add_args: [4]c.LLVMValueRef = .{ cls_val, sel_val, dealloc_imp_fn, dealloc_enc_global };
_ = c.LLVMBuildCall2(self.e.builder, add_method_ty, add_method_fn, &add_args, 4, "");
}
// M1.2 A.5 — register the synthesized `+alloc` IMP on the
// metaclass. Class methods live on the metaclass (every
// Class object's `isa` points to the metaclass), so we
// resolve it via `object_getClass(cls)` and `class_addMethod`
// the IMP there. Encoding `@@:` = returns id, takes Class,
// then SEL — Apple's standard `+alloc` shape. This override
// wins over NSObject's default +alloc; runtime instantiations
// (UIKit, Info.plist, NSCoder) go through our IMP and get the
// __sx_state ivar bound.
const alloc_imp_name = std.fmt.allocPrint(self.e.alloc, "__{s}_alloc_imp", .{class_name}) catch continue;
defer self.e.alloc.free(alloc_imp_name);
const alloc_imp_z = self.e.alloc.dupeZ(u8, alloc_imp_name) catch continue;
defer self.e.alloc.free(alloc_imp_z);
const alloc_imp_fn = c.LLVMGetNamedFunction(self.e.llvm_module, alloc_imp_z.ptr);
if (alloc_imp_fn != null) {
// metaclass_val was computed up-front above (shared
// with class-method registration). +alloc is a class
// method registered on the metaclass.
const alloc_sel_global = self.e.emitPrivateCString("alloc", "OBJC_METH_VAR_NAME_");
const alloc_enc_global = self.e.emitPrivateCString("@@:", "OBJC_METH_VAR_TYPE_");
var sel_args: [1]c.LLVMValueRef = .{alloc_sel_global};
const sel_val = c.LLVMBuildCall2(self.e.builder, sel_reg_ty, sel_reg_fn, &sel_args, 1, "sel_alloc");
var add_args: [4]c.LLVMValueRef = .{ metaclass_val, sel_val, alloc_imp_fn, alloc_enc_global };
_ = c.LLVMBuildCall2(self.e.builder, add_method_ty, add_method_fn, &add_args, 4, "");
}
// Cache the ivar handle in the per-class global so trampolines
// can read the __sx_state ivar without re-looking-it-up. The
// global is declared by lower.zig (M1.2 A.4b.i) and starts as
// null; the constructor fills it in here.
const ivar_global_name = std.fmt.allocPrint(self.e.alloc, "__{s}_state_ivar", .{class_name}) catch continue;
defer self.e.alloc.free(ivar_global_name);
const ivar_global_z = self.e.alloc.dupeZ(u8, ivar_global_name) catch continue;
defer self.e.alloc.free(ivar_global_z);
const ivar_global = c.LLVMGetNamedGlobal(self.e.llvm_module, ivar_global_z.ptr);
if (ivar_global != null) {
var iv_args: [2]c.LLVMValueRef = .{ cls_val, sx_state_name_global };
const iv_val = c.LLVMBuildCall2(self.e.builder, get_iv_ty, get_iv_fn, &iv_args, 2, "iv");
_ = c.LLVMBuildStore(self.e.builder, iv_val, ivar_global);
}
}
_ = c.LLVMBuildRetVoid(self.e.builder);
// Inject the call into main's entry block ONLY — skip
// @llvm.global_ctors. Apple's frameworks (UIKit on iOS,
// AppKit on macOS) register their Obj-C classes during
// dyld's image-init phase, which overlaps global_ctors. If
// we ran there too, `objc_getClass("UIResponder")` would
// return null and `objc_allocateClassPair(null, ...)` would
// crash inside objc_registerClassPair. main's entry runs
// AFTER dyld's framework init is complete but BEFORE user
// code (UIApplicationMain), so the runtime sees the parent
// class properly.
self.e.injectCtorIntoMain(ctor, ctor_ty);
_ = i32_ty;
}
/// Return `{cls_slot, mid_slot}` global pair for the
/// `(name, sig)` literal — created on first lookup, shared across
/// later `#jni_call` sites with the same literal pair. Both
/// slots are zero-initialized `ptr`; the call-site lowering does
/// lazy population on first dispatch. The cache (`jni_slots`) +
/// `mangleJniKey` stay on `LLVMEmitter`.
pub fn getOrCreateJniSlots(self: FfiCtors, name: []const u8, sig: []const u8) JniSlotPair {
// Compose the key as name + NUL + sig. A NUL byte cannot appear in
// a JNI method name or signature, so two distinct (name, sig)
// pairs can never produce the same key.
const key = std.fmt.allocPrint(self.e.alloc, "{s}\x00{s}", .{ name, sig }) catch unreachable;
if (self.e.jni_slots.get(key)) |existing| {
self.e.alloc.free(key);
return existing;
}
const mangled = self.e.mangleJniKey(name, sig);
defer self.e.alloc.free(mangled);
const cls_name = std.fmt.allocPrintSentinel(self.e.alloc, "SX_JNI_CLS_{s}", .{mangled}, 0) catch unreachable;
defer self.e.alloc.free(cls_name);
const mid_name = std.fmt.allocPrintSentinel(self.e.alloc, "SX_JNI_MID_{s}", .{mangled}, 0) catch unreachable;
defer self.e.alloc.free(mid_name);
const cls_slot = c.LLVMAddGlobal(self.e.llvm_module, self.e.cached_ptr, cls_name.ptr);
c.LLVMSetLinkage(cls_slot, c.LLVMInternalLinkage);
c.LLVMSetInitializer(cls_slot, c.LLVMConstNull(self.e.cached_ptr));
const mid_slot = c.LLVMAddGlobal(self.e.llvm_module, self.e.cached_ptr, mid_name.ptr);
c.LLVMSetLinkage(mid_slot, c.LLVMInternalLinkage);
c.LLVMSetInitializer(mid_slot, c.LLVMConstNull(self.e.cached_ptr));
const pair = JniSlotPair{ .cls_slot = cls_slot, .mid_slot = mid_slot };
self.e.jni_slots.put(key, pair) catch unreachable;
return pair;
}
};
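
The key composition in `getOrCreateJniSlots` can be checked in isolation. The following is a minimal sketch, assuming a hypothetical helper `makeSlotKey` that is not part of this commit and mirrors only the `allocPrint` line. It shows why the NUL separator keeps `("ab", "c")` and `("a", "bc")` apart where plain concatenation would not.

```zig
const std = @import("std");

/// Hypothetical helper (not in the commit): joins name and sig with a NUL
/// byte, mirroring the `allocPrint` call in `getOrCreateJniSlots`.
fn makeSlotKey(alloc: std.mem.Allocator, name: []const u8, sig: []const u8) ![]u8 {
    return std.fmt.allocPrint(alloc, "{s}\x00{s}", .{ name, sig });
}

test "NUL separator keeps distinct (name, sig) pairs distinct" {
    const alloc = std.testing.allocator;

    const a = try makeSlotKey(alloc, "ab", "c");
    defer alloc.free(a);
    const b = try makeSlotKey(alloc, "a", "bc");
    defer alloc.free(b);

    // Plain concatenation would give "abc" for both pairs.
    try std.testing.expect(!std.mem.eql(u8, a, b));
    // The separator occupies one byte: 2 + 1 + 1.
    try std.testing.expectEqual(@as(usize, 4), a.len);
    try std.testing.expectEqual(@as(u8, 0), a[2]);
}
```

Unlike the original, the sketch propagates allocation failure with `try` rather than `catch unreachable`; that is a choice made for the example, not a claim about the emitter.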

1947
src/backend/llvm/ops.zig Normal file

File diff suppressed because it is too large

View File

@@ -0,0 +1,205 @@
const std = @import("std");
const llvm = @import("../../llvm_api.zig");
const c = llvm.c;
const errors = @import("../../errors.zig");
const emit = @import("../../ir/emit_llvm.zig");
const ir_inst = @import("../../ir/inst.zig");
const ir_types = @import("../../ir/types.zig");
const LLVMEmitter = emit.LLVMEmitter;
const Inst = ir_inst.Inst;
const TypeId = ir_types.TypeId;
const StringId = ir_types.StringId;
/// Reflection metadata + trace-frame emission (architecture phase A7.2),
/// extracted from `LLVMEmitter`. A backend `*LLVMEmitter` facade (field `e`):
/// the type/field/tag reflection NAME-ARRAY builders (memoized into
/// `type_name_array`/`field_name_arrays`/`tag_name_array` on `LLVMEmitter`) and
/// the error-trace `Frame` builders. Reads cached LLVM handles / the IR type
/// table / the module via `self.e.*`; the memoizing composite getters
/// (`getStringStructType`/`getFrameStructType`) + `emitFieldValueGet` stay on
/// `LLVMEmitter`. Entry points are reached via `self.reflection()`.
pub const Reflection = struct {
e: *LLVMEmitter,
/// Lazy global `[N x string]` indexed by `TypeId.index()`, holding each
/// type's display name. Built on the first dynamic `type_name(t)` call site.
pub fn getOrBuildTypeNameArray(self: Reflection) c.LLVMValueRef {
if (self.e.type_name_array) |g| return g;
const n: u32 = @intCast(self.e.ir_mod.types.infos.items.len);
const string_ty = self.e.getStringStructType();
var field_vals = std.ArrayList(c.LLVMValueRef).empty;
defer field_vals.deinit(self.e.alloc);
var i: u32 = 0;
while (i < n) : (i += 1) {
const tid = TypeId.fromIndex(i);
const name_str = self.e.ir_mod.types.formatTypeName(self.e.alloc, tid);
const str_z = self.e.alloc.dupeZ(u8, name_str) catch unreachable;
defer self.e.alloc.free(str_z);
const global_str = c.LLVMAddGlobal(self.e.llvm_module, c.LLVMArrayType(self.e.cached_i8, @intCast(name_str.len + 1)), "tn.str");
c.LLVMSetInitializer(global_str, c.LLVMConstStringInContext(self.e.context, str_z.ptr, @intCast(name_str.len + 1), 1));
c.LLVMSetGlobalConstant(global_str, 1);
c.LLVMSetLinkage(global_str, c.LLVMPrivateLinkage);
const len_val = c.LLVMConstInt(self.e.cached_i64, name_str.len, 0);
var struct_fields = [2]c.LLVMValueRef{ global_str, len_val };
const const_struct = c.LLVMConstStructInContext(self.e.context, &struct_fields, 2, 0);
field_vals.append(self.e.alloc, const_struct) catch unreachable;
}
const arr_ty = c.LLVMArrayType(string_ty, n);
const arr_init = c.LLVMConstArray(string_ty, field_vals.items.ptr, n);
const global = c.LLVMAddGlobal(self.e.llvm_module, arr_ty, "__sx_type_names");
c.LLVMSetInitializer(global, arr_init);
c.LLVMSetGlobalConstant(global, 1);
c.LLVMSetLinkage(global, c.LLVMPrivateLinkage);
self.e.type_name_array = global;
self.e.type_name_array_len = n;
return global;
}
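/// Build (or return cached) a global constant array of {ptr, i64} string values
/// holding the field names of a struct, union, or tagged union, or the variant
/// names of an enum. Any other type kind yields an empty array.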
pub fn getOrBuildFieldNameArray(self: Reflection, struct_type: TypeId) c.LLVMValueRef {
if (self.e.field_name_arrays.get(struct_type.index())) |g| return g;
const info = self.e.ir_mod.types.get(struct_type);
// Collect name StringIds from struct fields, union fields, or enum variants
var name_ids = std.ArrayList(StringId).empty;
defer name_ids.deinit(self.e.alloc);
switch (info) {
.@"struct" => |s| {
for (s.fields) |f| name_ids.append(self.e.alloc, f.name) catch unreachable;
},
.@"union" => |u| {
for (u.fields) |f| name_ids.append(self.e.alloc, f.name) catch unreachable;
},
.tagged_union => |u| {
for (u.fields) |f| name_ids.append(self.e.alloc, f.name) catch unreachable;
},
.@"enum" => |e| {
for (e.variants) |v| name_ids.append(self.e.alloc, v) catch unreachable;
},
else => {},
}
const string_ty = self.e.getStringStructType();
const n: u32 = @intCast(name_ids.items.len);
// Build constant initializer: [N x {ptr, i64}]
var field_vals = std.ArrayList(c.LLVMValueRef).empty;
defer field_vals.deinit(self.e.alloc);
for (name_ids.items) |name_id| {
const name_str = self.e.ir_mod.types.getString(name_id);
const str_z = self.e.alloc.dupeZ(u8, name_str) catch unreachable;
defer self.e.alloc.free(str_z);
const global_str = c.LLVMAddGlobal(self.e.llvm_module, c.LLVMArrayType(self.e.cached_i8, @intCast(name_str.len + 1)), "fld.str");
c.LLVMSetInitializer(global_str, c.LLVMConstStringInContext(self.e.context, str_z.ptr, @intCast(name_str.len + 1), 1));
c.LLVMSetGlobalConstant(global_str, 1);
c.LLVMSetLinkage(global_str, c.LLVMPrivateLinkage);
// Build fat pointer {ptr, len} as constant struct
const len_val = c.LLVMConstInt(self.e.cached_i64, name_str.len, 0);
var struct_fields = [2]c.LLVMValueRef{ global_str, len_val };
const const_struct = c.LLVMConstStructInContext(self.e.context, &struct_fields, 2, 0);
field_vals.append(self.e.alloc, const_struct) catch unreachable;
}
// Create global array [N x {ptr, i64}]
const array_ty = c.LLVMArrayType(string_ty, n);
const array_init = c.LLVMConstArray(string_ty, field_vals.items.ptr, n);
const global = c.LLVMAddGlobal(self.e.llvm_module, array_ty, "field_names");
c.LLVMSetInitializer(global, array_init);
c.LLVMSetGlobalConstant(global, 1);
c.LLVMSetLinkage(global, c.LLVMPrivateLinkage);
self.e.field_name_arrays.put(struct_type.index(), global) catch unreachable;
return global;
}
/// The always-linked tag-name table: a `[N x {ptr, i64}]` global of tag
/// names indexed by global tag id (the `TagRegistry` namespace; slot 0 is
/// the reserved "" no-error name). `error_tag_name_get` GEPs into it at the
/// runtime tag id. Built once per module. Always emitted (not trace-gated)
/// so `{}` interpolation of an error tag works even in release builds.
pub fn getOrBuildTagNameArray(self: Reflection) c.LLVMValueRef {
if (self.e.tag_name_array) |g| return g;
const string_ty = self.e.getStringStructType();
const names = self.e.ir_mod.types.tags.names.items;
var field_vals = std.ArrayList(c.LLVMValueRef).empty;
defer field_vals.deinit(self.e.alloc);
for (names) |name_str| {
const str_z = self.e.alloc.dupeZ(u8, name_str) catch unreachable;
defer self.e.alloc.free(str_z);
const global_str = c.LLVMAddGlobal(self.e.llvm_module, c.LLVMArrayType(self.e.cached_i8, @intCast(name_str.len + 1)), "tag.str");
c.LLVMSetInitializer(global_str, c.LLVMConstStringInContext(self.e.context, str_z.ptr, @intCast(name_str.len + 1), 1));
c.LLVMSetGlobalConstant(global_str, 1);
c.LLVMSetLinkage(global_str, c.LLVMPrivateLinkage);
const len_val = c.LLVMConstInt(self.e.cached_i64, name_str.len, 0);
var struct_fields = [2]c.LLVMValueRef{ global_str, len_val };
const const_struct = c.LLVMConstStructInContext(self.e.context, &struct_fields, 2, 0);
field_vals.append(self.e.alloc, const_struct) catch unreachable;
}
const n: u32 = @intCast(names.len);
const array_ty = c.LLVMArrayType(string_ty, n);
const array_init = c.LLVMConstArray(string_ty, field_vals.items.ptr, n);
const global = c.LLVMAddGlobal(self.e.llvm_module, array_ty, "tag_names");
c.LLVMSetInitializer(global, array_init);
c.LLVMSetGlobalConstant(global, 1);
c.LLVMSetLinkage(global, c.LLVMPrivateLinkage);
self.e.tag_name_array = global;
return global;
}
/// An interned constant sx `string` (`{ ptr, i64 }`) of the cached string
/// struct type, backed by a private NUL-terminated data global. Cached by
/// content so a path/name shared by many push sites is emitted once.
fn buildStringConst(self: Reflection, s: []const u8) c.LLVMValueRef {
if (self.e.frame_str_cache.get(s)) |v| return v;
const str_z = self.e.alloc.dupeZ(u8, s) catch unreachable;
defer self.e.alloc.free(str_z);
const data = c.LLVMAddGlobal(self.e.llvm_module, c.LLVMArrayType(self.e.cached_i8, @intCast(s.len + 1)), "frame.str");
c.LLVMSetInitializer(data, c.LLVMConstStringInContext(self.e.context, str_z.ptr, @intCast(s.len + 1), 1));
c.LLVMSetGlobalConstant(data, 1);
c.LLVMSetLinkage(data, c.LLVMPrivateLinkage);
c.LLVMSetUnnamedAddress(data, c.LLVMGlobalUnnamedAddr);
var fields = [_]c.LLVMValueRef{ data, c.LLVMConstInt(self.e.cached_i64, s.len, 0) };
const str_const = c.LLVMConstNamedStruct(self.e.getStringStructType(), &fields, 2);
const key = self.e.alloc.dupe(u8, s) catch return str_const;
self.e.frame_str_cache.put(key, str_const) catch self.e.alloc.free(key);
return str_const;
}
/// Build the interned `Frame` global for a `.trace_frame` push site and
/// return its address as `i64` (the value `sx_trace_push` stores). Resolves
/// the instruction's span + current function to
/// `{file,line,col,func,line_text}`. The file is shown as its basename so
/// trace output is machine-independent (the harness passes absolute paths);
/// full paths live in DWARF.
pub fn emitTraceFrame(self: Reflection, instruction: *const Inst) c.LLVMValueRef {
const file = std.fs.path.basename(self.e.current_func_file);
const src = self.e.sourceForFile(self.e.current_func_file);
const loc = errors.SourceLoc.compute(src, instruction.span.start);
const func_name = self.e.ir_mod.types.getString(self.e.ir_mod.functions.items[self.e.current_func_idx].name);
var fields = [_]c.LLVMValueRef{
self.buildStringConst(file),
c.LLVMConstInt(self.e.cached_i32, loc.line, 0),
c.LLVMConstInt(self.e.cached_i32, loc.col, 0),
self.buildStringConst(func_name),
self.buildStringConst(errors.lineAt(src, instruction.span.start)),
};
const frame_ty = self.e.getFrameStructType();
const frame_const = c.LLVMConstNamedStruct(frame_ty, &fields, 5);
const g = c.LLVMAddGlobal(self.e.llvm_module, frame_ty, "trace.frame");
c.LLVMSetInitializer(g, frame_const);
c.LLVMSetGlobalConstant(g, 1);
c.LLVMSetLinkage(g, c.LLVMPrivateLinkage);
return c.LLVMConstPtrToInt(g, self.e.cached_i64);
}
};
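
The content-keyed caching used by `buildStringConst` can be illustrated without LLVM. This is a rough sketch under stated assumptions: `StrCache` is a hypothetical stand-in for `frame_str_cache`, and it maps each distinct string to a small integer id instead of an LLVM constant. It shows the same two ideas: look up by content first, and dupe the key on insert so the cache owns its memory.

```zig
const std = @import("std");

/// Hypothetical stand-in for `frame_str_cache` (not in the commit): maps
/// string content to a small integer id instead of an LLVM constant.
const StrCache = struct {
    alloc: std.mem.Allocator,
    map: std.StringHashMap(u32),
    next_id: u32 = 0,

    fn init(alloc: std.mem.Allocator) StrCache {
        return .{ .alloc = alloc, .map = std.StringHashMap(u32).init(alloc) };
    }

    fn deinit(self: *StrCache) void {
        // The cache owns its keys, so free them before the table itself.
        var it = self.map.keyIterator();
        while (it.next()) |k| self.alloc.free(k.*);
        self.map.deinit();
    }

    /// Return the id for `s`, creating one on first sight. The key is duped
    /// so the cache never points into caller-owned memory.
    fn intern(self: *StrCache, s: []const u8) !u32 {
        if (self.map.get(s)) |id| return id;
        const key = try self.alloc.dupe(u8, s);
        errdefer self.alloc.free(key);
        try self.map.put(key, self.next_id);
        self.next_id += 1;
        return self.next_id - 1;
    }
};

test "same content interns to the same id" {
    var cache = StrCache.init(std.testing.allocator);
    defer cache.deinit();

    const a = try cache.intern("main.sx");
    const b = try cache.intern("util.sx");
    const c = try cache.intern("main.sx");

    try std.testing.expectEqual(a, c);
    try std.testing.expect(a != b);
    try std.testing.expectEqual(@as(u32, 2), cache.map.count());
}
```

One deliberate difference: when the cache insert fails, `buildStringConst` still returns the freshly built constant uncached, whereas this sketch simply propagates the error.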

178
src/backend/llvm/types.zig Normal file
View File

@@ -0,0 +1,178 @@
const std = @import("std");
const llvm = @import("../../llvm_api.zig");
const c = llvm.c;
const ir_types = @import("../../ir/types.zig");
const emit = @import("../../ir/emit_llvm.zig");
const TypeId = ir_types.TypeId;
const LLVMEmitter = emit.LLVMEmitter;
/// IR-type → LLVM-type lowering (architecture phase A7.1), extracted from
/// `LLVMEmitter`. A backend `*LLVMEmitter` facade (the backend analogue of the
/// IR-side `*Lowering` facades): it borrows the emitter for the cached LLVM
/// handles (`context`/`cached_*`), the IR type table (`ir_mod`), the scratch
/// allocator, and the memoizing composite-type getters
/// (`getStringStructType`/`getAnyStructType`/`getClosureStructType`) that stay
/// on `LLVMEmitter`. `LLVMEmitter.toLLVMType` is a thin wrapper delegating here.
pub const TypeLowering = struct {
e: *LLVMEmitter,
pub fn toLLVMType(self: TypeLowering, ty: TypeId) c.LLVMTypeRef {
return switch (ty) {
.void => self.e.cached_void,
.bool => self.e.cached_i1,
.s8 => self.e.cached_i8,
.s16 => self.e.cached_i16,
.s32 => self.e.cached_i32,
.s64 => self.e.cached_i64,
.u8 => self.e.cached_i8,
.u16 => self.e.cached_i16,
.u32 => self.e.cached_i32,
.u64 => self.e.cached_i64,
.f32 => self.e.cached_f32,
.f64 => self.e.cached_f64,
.string => self.e.getStringStructType(),
.any => self.e.getAnyStructType(),
.noreturn => self.e.cached_void,
.isize, .usize => if (self.e.target_config.isWasm32()) self.e.cached_i32 else self.e.cached_i64,
else => self.toLLVMTypeInfo(ty),
};
}
fn toLLVMTypeInfo(self: TypeLowering, ty: TypeId) c.LLVMTypeRef {
const info = self.e.ir_mod.types.get(ty);
return switch (info) {
.signed => |w| switch (w) {
1 => self.e.cached_i1,
8 => self.e.cached_i8,
16 => self.e.cached_i16,
32 => self.e.cached_i32,
64 => self.e.cached_i64,
else => c.LLVMIntTypeInContext(self.e.context, w),
},
.unsigned => |w| switch (w) {
1 => self.e.cached_i1,
8 => self.e.cached_i8,
16 => self.e.cached_i16,
32 => self.e.cached_i32,
64 => self.e.cached_i64,
else => c.LLVMIntTypeInContext(self.e.context, w),
},
.f32 => self.e.cached_f32,
.f64 => self.e.cached_f64,
.void => self.e.cached_void,
.bool => self.e.cached_i1,
.error_set => self.e.cached_i32, // u32 tag id on the error channel
.string => self.e.getStringStructType(),
.pointer, .many_pointer, .function => self.e.cached_ptr,
.closure => self.e.getClosureStructType(),
.slice => self.e.getStringStructType(), // same {ptr, i64} layout
.optional => |opt| {
// ?*T / ?fn → bare pointer (null = none)
const child_info = self.e.ir_mod.types.get(opt.child);
if (child_info == .pointer or child_info == .many_pointer or child_info == .function) {
return self.e.cached_ptr;
}
if (child_info == .closure) {
return self.e.getClosureStructType();
}
// ?Protocol → protocol struct (ctx ptr = field 0 is null when none).
if (child_info == .@"struct" and child_info.@"struct".is_protocol) {
return self.toLLVMType(opt.child);
}
// ?T → { T, i1 }
var field_types: [2]c.LLVMTypeRef = .{
self.toLLVMType(opt.child),
self.e.cached_i1,
};
return c.LLVMStructTypeInContext(self.e.context, &field_types, 2, 0);
},
.array => |arr| {
const elem = self.toLLVMType(arr.element);
return c.LLVMArrayType2(elem, arr.length);
},
.vector => |vec| {
const elem = self.toLLVMType(vec.element);
return c.LLVMVectorType(elem, vec.length);
},
.any => self.e.getAnyStructType(),
.noreturn => self.e.cached_void,
.@"struct" => |s| {
// Build LLVM struct type from fields
const n: c_uint = @intCast(s.fields.len);
const field_llvm_types = self.e.alloc.alloc(c.LLVMTypeRef, s.fields.len) catch unreachable;
defer self.e.alloc.free(field_llvm_types);
for (s.fields, 0..) |field, j| {
field_llvm_types[j] = self.toLLVMType(field.ty);
}
return c.LLVMStructTypeInContext(self.e.context, field_llvm_types.ptr, n, 0);
},
.@"enum" => |e| {
// Use backing type if declared (e.g. enum u32 → i32), else i64
if (e.backing_type) |bt| return self.toLLVMType(bt);
return self.e.cached_i64;
},
.@"union" => |u| {
// Untagged union — just [N x i8]
var max_size: usize = 0;
for (u.fields) |field| {
const sz = self.e.ir_mod.types.typeSizeBytes(field.ty);
if (sz > max_size) max_size = sz;
}
if (max_size == 0) max_size = 8;
return c.LLVMArrayType2(self.e.cached_i8, @intCast(max_size));
},
.tagged_union => |u| {
// Tagged union — { header, [N x i8] }
var max_size: usize = 0;
for (u.fields) |field| {
const sz = self.e.ir_mod.types.typeSizeBytes(field.ty);
if (sz > max_size) max_size = sz;
}
if (max_size == 0) max_size = 8;
var header_size: usize = self.e.ir_mod.types.typeSizeBytes(u.tag_type);
if (u.backing_type) |bt| {
const bi = self.e.ir_mod.types.get(bt);
if (bi == .@"struct" and bi.@"struct".fields.len > 1) {
header_size = 0;
const fields = bi.@"struct".fields;
for (fields[0 .. fields.len - 1]) |f| {
header_size += self.e.ir_mod.types.typeSizeBytes(f.ty);
}
const backing_payload = self.e.ir_mod.types.typeSizeBytes(fields[fields.len - 1].ty);
if (backing_payload > max_size) max_size = backing_payload;
}
}
const header_llvm = c.LLVMIntTypeInContext(self.e.context, @intCast(header_size * 8));
var field_types: [2]c.LLVMTypeRef = .{
header_llvm,
c.LLVMArrayType2(self.e.cached_i8, @intCast(max_size)),
};
return c.LLVMStructTypeInContext(self.e.context, &field_types, 2, 0);
},
.tuple => |t| {
const n: c_uint = @intCast(t.fields.len);
const field_llvm_types = self.e.alloc.alloc(c.LLVMTypeRef, t.fields.len) catch unreachable;
defer self.e.alloc.free(field_llvm_types);
for (t.fields, 0..) |f, j| {
field_llvm_types[j] = self.toLLVMType(f);
}
return c.LLVMStructTypeInContext(self.e.context, field_llvm_types.ptr, n, 0);
},
.protocol => {
// Protocol values: { ctx: *void, vtable_or_fn_ptrs... }
// For now, use opaque ptr
return self.e.cached_ptr;
},
.usize, .isize => if (self.e.target_config.isWasm32()) self.e.cached_i32 else self.e.cached_i64,
// Comptime-only: a pack is expanded to flat positional args before
// codegen, so it must never reach LLVM type emission.
.pack => @panic("pack type has no LLVM representation (comptime-only)"),
// Tripwire: a failed type resolution must have been diagnosed and
// aborted long before LLVM emission.
.unresolved => @panic("unresolved type reached LLVM emission — a type resolution failure was not diagnosed/aborted"),
};
}
};
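
The payload sizing shared by the `.@"union"` and `.tagged_union` arms of `toLLVMTypeInfo` reduces to a small rule: the widest field wins, and an empty or all-zero field list falls back to 8 bytes. A minimal sketch of that rule, assuming a hypothetical helper `payloadBytes` that takes precomputed field sizes (the real code calls `typeSizeBytes` per field, and the tagged-union arm can additionally widen the payload from its backing type, which is not modeled here):

```zig
const std = @import("std");

/// Hypothetical helper (not in the commit) mirroring the payload sizing
/// loop: take the maximum field size, defaulting to 8 when it is zero.
fn payloadBytes(field_sizes: []const usize) usize {
    var max_size: usize = 0;
    for (field_sizes) |sz| {
        if (sz > max_size) max_size = sz;
    }
    return if (max_size == 0) 8 else max_size;
}

test "payload is sized by the widest field" {
    try std.testing.expectEqual(@as(usize, 16), payloadBytes(&.{ 4, 16, 8 }));
    // No fields, or only zero-sized fields, fall back to 8 bytes.
    try std.testing.expectEqual(@as(usize, 8), payloadBytes(&.{}));
    try std.testing.expectEqual(@as(usize, 8), payloadBytes(&.{ 0, 0 }));
}
```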

View File

@@ -2,7 +2,6 @@ const std = @import("std");
const ast = @import("ast.zig");
const parser = @import("parser.zig");
const imports = @import("imports.zig");
const sema = @import("sema.zig");
const errors = @import("errors.zig");
const c_import = @import("c_import.zig");
const ir = @import("ir/ir.zig");
@@ -27,7 +26,6 @@ pub const Compilation = struct {
import_sources: std.StringHashMap([:0]const u8),
module_scopes: std.StringHashMap(std.StringHashMap(void)),
import_graph: std.StringHashMap(std.StringHashMap(void)),
sema_result: ?sema.SemaResult = null,
ir_emitter: ?ir.LLVMEmitter = null,
/// Lowered IR module, kept alive past `generateCode` so post-link
/// callbacks can re-enter the interpreter to invoke sx functions
@@ -128,18 +126,6 @@ pub const Compilation = struct {
self.resolved_root = new_root;
}
pub fn analyze(self: *Compilation) !void {
const root = self.resolved_root orelse self.root orelse return error.CompileError;
var analyzer = sema.Analyzer.init(self.allocator);
self.sema_result = analyzer.analyze(root) catch return error.CompileError;
// Merge sema diagnostics into our list
if (self.sema_result) |sr| {
for (sr.diagnostics) |d| {
self.diagnostics.add(d.level, d.message, d.span);
}
}
}
/// Generate code via the IR pipeline: lower AST → IR → LLVM.
pub fn generateCode(self: *Compilation) !void {
// Heap-allocate the IR module so its address is stable during emit

View File

@@ -353,6 +353,83 @@ test "emit: type conversion toLLVMType" {
_ = emitter.toLLVMType(.noreturn);
}
// ── A7.1 scaffolding: ABI param coercion ────────────────────────────
// Lock the C-ABI struct-coercion buckets (abiCoerceParamType / needsByval),
// which feed callconv(.c) / #foreign signatures, before they move to
// src/backend/llvm/abi.zig in A7.1 sub-step 2.
const llvm = @import("../llvm_api.zig");
const cc = llvm.c;
fn internStruct(module: *Module, name: []const u8, field_tys: []const TypeId) TypeId {
var fields = std.ArrayList(types.TypeInfo.StructInfo.Field).empty;
defer fields.deinit(std.testing.allocator);
for (field_tys, 0..) |fty, i| {
var nb: [8]u8 = undefined;
const fname = std.fmt.bufPrint(&nb, "f{d}", .{i}) catch unreachable;
fields.append(std.testing.allocator, .{ .name = str(module, fname), .ty = fty }) catch unreachable;
}
// Dupe into the module arena so the interned struct's field slice lives for
// the module's lifetime (freed at module.deinit) — no testing-allocator leak.
const owned = module.slice_arena.allocator().dupe(types.TypeInfo.StructInfo.Field, fields.items) catch unreachable;
return module.types.intern(.{ .@"struct" = .{ .name = str(module, name), .fields = owned } });
}
test "emit: abiCoerceParamType coerces C-ABI structs by size bucket" {
const alloc = std.testing.allocator;
var module = Module.init(alloc);
defer module.deinit();
// Intern the shapes before building the emitter (toLLVMType reads live).
const small = internStruct(&module, "Small", &.{ .s32, .s32 }); // 8 bytes
const mid = internStruct(&module, "Mid", &.{ .s64, .s64 }); // 16 bytes
const big = internStruct(&module, "Big", &.{ .s64, .s64, .s64 }); // 24 bytes
const hfa_f = internStruct(&module, "HfaF", &.{ .f32, .f32, .f32, .f32 }); // 16, all-float
const hfa_d = internStruct(&module, "HfaD", &.{ .f64, .f64 }); // 16, all-double
const sl = module.types.sliceOf(.s32);
var emitter = LLVMEmitter.init(alloc, &module, "test_abi", .{});
defer emitter.deinit();
// ≤ 8 bytes → i64.
try std.testing.expect(emitter.abiCoerceParamType(small, emitter.toLLVMType(small)) == emitter.cached_i64);
// 9-16 bytes → [2 x i64].
try std.testing.expect(emitter.abiCoerceParamType(mid, emitter.toLLVMType(mid)) == cc.LLVMArrayType2(emitter.cached_i64, 2));
// > 16 bytes → ptr (passed byval at the call/sig sites).
try std.testing.expect(emitter.abiCoerceParamType(big, emitter.toLLVMType(big)) == emitter.cached_ptr);
// HFA (all-float / all-double, ≤ 4 fields) → unchanged.
try std.testing.expect(emitter.abiCoerceParamType(hfa_f, emitter.toLLVMType(hfa_f)) == emitter.toLLVMType(hfa_f));
try std.testing.expect(emitter.abiCoerceParamType(hfa_d, emitter.toLLVMType(hfa_d)) == emitter.toLLVMType(hfa_d));
// string / slice collapse to ptr at the C-API boundary (len dropped).
try std.testing.expect(emitter.abiCoerceParamType(.string, emitter.toLLVMType(.string)) == emitter.cached_ptr);
try std.testing.expect(emitter.abiCoerceParamType(sl, emitter.toLLVMType(sl)) == emitter.cached_ptr);
// Scalars pass through unchanged.
try std.testing.expect(emitter.abiCoerceParamType(.s32, emitter.toLLVMType(.s32)) == emitter.toLLVMType(.s32));
}
test "emit: needsByval only for > 16-byte non-HFA structs" {
const alloc = std.testing.allocator;
var module = Module.init(alloc);
defer module.deinit();
const small = internStruct(&module, "Small", &.{ .s32, .s32 });
const mid = internStruct(&module, "Mid", &.{ .s64, .s64 });
const big = internStruct(&module, "Big", &.{ .s64, .s64, .s64 });
const hfa_d = internStruct(&module, "HfaD", &.{ .f64, .f64 });
const sl = module.types.sliceOf(.s32);
var emitter = LLVMEmitter.init(alloc, &module, "test_byval", .{});
defer emitter.deinit();
try std.testing.expect(emitter.needsByval(big, emitter.toLLVMType(big))); // > 16
try std.testing.expect(!emitter.needsByval(small, emitter.toLLVMType(small)));
try std.testing.expect(!emitter.needsByval(mid, emitter.toLLVMType(mid))); // exactly 16
try std.testing.expect(!emitter.needsByval(hfa_d, emitter.toLLVMType(hfa_d))); // HFA
try std.testing.expect(!emitter.needsByval(.string, emitter.toLLVMType(.string)));
try std.testing.expect(!emitter.needsByval(sl, emitter.toLLVMType(sl)));
try std.testing.expect(!emitter.needsByval(.s32, emitter.toLLVMType(.s32))); // non-struct
}
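// Illustrative sketch only, not part of this commit: the size-bucket rule that
// the two tests above pin down, restated as a standalone classifier.
// `ParamClass` and `classifyBySize` are hypothetical names; the real decisions
// are made by `abiCoerceParamType` / `needsByval`, which also special-case
// HFAs, strings, slices, and scalars before any size check applies.
const ParamClass = enum { i64_scalar, i64_pair, byval_ptr };

fn classifyBySize(size_bytes: usize) ParamClass {
    if (size_bytes <= 8) return .i64_scalar; // e.g. Small { s32, s32 }
    if (size_bytes <= 16) return .i64_pair; // e.g. Mid { s64, s64 }, exactly 16
    return .byval_ptr; // e.g. Big { s64, s64, s64 }, the only byval case
}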
// ── Struct/Enum/Union tests ─────────────────────────────────────────
test "emit: struct_init and struct_get" {
@@ -1020,3 +1097,101 @@ test "emit: ERR E3.0 — no DWARF without a debug context (unit-test default)" {
try std.testing.expect(std.mem.indexOf(u8, ir_str, "DICompileUnit") == null);
try std.testing.expect(std.mem.indexOf(u8, ir_str, "!dbg") == null);
}
// ── issue 0074: FFI arg-type lookup must fail loudly, never silently `.void` ──
// `argIRTypeOrFail` backs the four FFI call-arg lowering sites (objc_msgSend,
// JNI Call<Type>Method / non-virtual / constructor). A ref it cannot resolve is
// a codegen invariant violation; it must surface the dedicated `.unresolved`
// tripwire sentinel (which `toLLVMType` hard-panics on) rather than the old
// silent `.void` default that would emit a void-typed foreign-call argument.
test "emit: argIRTypeOrFail surfaces .unresolved for an unresolvable FFI arg ref (issue 0074)" {
const alloc = std.testing.allocator;
var module = Module.init(alloc);
defer module.deinit();
var b = Builder.init(&module);
// func ffifn(a: s64, b: f64) -> void { <entry> }
const fid = b.beginFunction(str(&module, "ffifn"), &[_]Function.Param{
.{ .name = str(&module, "a"), .ty = .s64 },
.{ .name = str(&module, "b"), .ty = .f64 },
}, .void);
const entry = b.appendBlock(str(&module, "entry"), &.{});
b.switchToBlock(entry);
b.retVoid();
b.finalize();
var emitter = LLVMEmitter.init(alloc, &module, "test_ffi_argty", .{});
defer emitter.deinit();
emitter.current_func_idx = fid.index();
// Happy path: a real arg ref (param 0 / param 1) resolves byte-identically
// to its declared IR type — the FFI fast path is unchanged.
try std.testing.expectEqual(TypeId.s64, emitter.argIRTypeOrFail(Ref.fromIndex(0)));
try std.testing.expectEqual(TypeId.f64, emitter.argIRTypeOrFail(Ref.fromIndex(1)));
// A ref past every param and instruction is unresolvable.
const bogus = Ref.fromIndex(100_000);
try std.testing.expectEqual(@as(?TypeId, null), emitter.getRefIRType(bogus));
// Fail-before: the old `getRefIRType(arg) orelse .void` would silently
// yield `.void` here — a real, load-bearing type that downstream ABI
// coercion treats as a legitimate (void-typed) foreign argument.
try std.testing.expectEqual(TypeId.void, emitter.getRefIRType(bogus) orelse TypeId.void);
// Pass-after: the helper returns the dedicated `.unresolved` sentinel,
// never `.void`, so the failure cannot masquerade as a real type.
try std.testing.expectEqual(TypeId.unresolved, emitter.argIRTypeOrFail(bogus));
try std.testing.expect(emitter.argIRTypeOrFail(bogus) != .void);
}
// ── issue 0075: reflection-builtin arg-type lookup must fail loudly, never `.s64` ──
// `reflectArgRepr` backs the `type_name` / `type_eq` reflection builtins, which read
// their `Type` arg as a boxed `Any` aggregate (`.any` → extract value field) or a bare
// i64 TypeId index. A ref it cannot resolve is a codegen invariant violation; it must
// surface `.unresolved` (which the emit site hard-panics on) instead of the old silent
// `getRefIRType(arg) orelse .s64` default that would mis-classify a boxed arg as bare
// and read the wrong value with no diagnostic.
test "emit: reflectArgRepr surfaces .unresolved for an unresolvable reflection arg ref (issue 0075)" {
const alloc = std.testing.allocator;
var module = Module.init(alloc);
defer module.deinit();
var b = Builder.init(&module);
// func reflfn(boxed: any, bare: s64) -> void { <entry> }
const fid = b.beginFunction(str(&module, "reflfn"), &[_]Function.Param{
.{ .name = str(&module, "boxed"), .ty = .any },
.{ .name = str(&module, "bare"), .ty = .s64 },
}, .void);
const entry = b.appendBlock(str(&module, "entry"), &.{});
b.switchToBlock(entry);
b.retVoid();
b.finalize();
var emitter = LLVMEmitter.init(alloc, &module, "test_refl_argty", .{});
defer emitter.deinit();
emitter.current_func_idx = fid.index();
// Happy path: a boxed `.any` Type arg classifies as `.boxed` (extract value
// field); a bare `.s64` TypeId arg classifies as `.bare` (use directly).
// These decisions are byte-identical to the pre-fix `== .any` gate.
try std.testing.expectEqual(LLVMEmitter.ReflectArgRepr.boxed, emitter.reflectArgRepr(Ref.fromIndex(0)));
try std.testing.expectEqual(LLVMEmitter.ReflectArgRepr.bare, emitter.reflectArgRepr(Ref.fromIndex(1)));
// A ref past every param and instruction is unresolvable.
const bogus = Ref.fromIndex(100_000);
try std.testing.expectEqual(@as(?TypeId, null), emitter.getRefIRType(bogus));
// Fail-before: the old `getRefIRType(arg) orelse .s64` would silently yield
// `.s64` here — which `!= .any`, so the reflection arm would treat a failed
// lookup as a bare i64 and read the wrong value with no diagnostic.
try std.testing.expectEqual(TypeId.s64, emitter.getRefIRType(bogus) orelse TypeId.s64);
try std.testing.expect((emitter.getRefIRType(bogus) orelse TypeId.s64) != .any);
// Pass-after: the classifier returns the dedicated `.unresolved` variant,
// never `.bare`, so the emit site trips its hard panic instead of silently
// reading the wrong value.
try std.testing.expectEqual(LLVMEmitter.ReflectArgRepr.unresolved, emitter.reflectArgRepr(bogus));
try std.testing.expect(emitter.reflectArgRepr(bogus) != .bare);
}
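// Illustrative sketch only, not part of this commit: the fail-loud pattern the
// two tests above (issues 0074 / 0075) exercise, reduced to its core. A failed
// optional lookup maps to a dedicated sentinel instead of a real, load-bearing
// type, so a later consumer can panic on it rather than silently proceed.
// `SketchType` and `lookupOrFail` are hypothetical names; how the real
// `argIRTypeOrFail` / `reflectArgRepr` are implemented is an assumption here.
const SketchType = enum { void_ty, s64, unresolved };

fn lookupOrFail(found: ?SketchType) SketchType {
    // Old shape: `found orelse .void_ty` (indistinguishable from a real type).
    return found orelse .unresolved;
}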

File diff suppressed because it is too large

@@ -64,7 +64,6 @@ pub const LLVMEmitter = emit_llvm.LLVMEmitter;
pub const type_bridge = @import("type_bridge.zig");
pub const resolveAstType = type_bridge.resolveAstType;
pub const bridgeType = type_bridge.bridgeType;
pub const jni_descriptor = @import("jni_descriptor.zig");
pub const jni_java_emit = @import("jni_java_emit.zig");

@@ -2524,6 +2524,10 @@ pub const Lowering = struct {
const sid = self.module.types.internString(str);
break :blk self.builder.constString(sid);
},
// A bare `null` / `---` with no surrounding type expectation is a
// legitimate typeless literal, not a failed lookup: `.void` is its
// intentional default (emitConstNull/emitConstUndef handle void as
// null-ptr / undef-i64). Not a candidate for the `.unresolved` tripwire.
.null_literal => self.builder.constNull(self.target_type orelse .void),
.undef_literal => self.builder.constUndef(self.target_type orelse .void),

@@ -9,42 +9,6 @@ const TypeId = types.TypeId;
const TypeInfo = types.TypeInfo;
const TypeTable = types.TypeTable;
test "bridgeType: primitives" {
const alloc = std.testing.allocator;
var table = TypeTable.init(alloc);
defer table.deinit();
try std.testing.expectEqual(TypeId.s32, type_bridge.bridgeType(.{ .signed = 32 }, &table, null));
try std.testing.expectEqual(TypeId.u8, type_bridge.bridgeType(.{ .unsigned = 8 }, &table, null));
try std.testing.expectEqual(TypeId.f64, type_bridge.bridgeType(.f64, &table, null));
try std.testing.expectEqual(TypeId.void, type_bridge.bridgeType(.void_type, &table, null));
try std.testing.expectEqual(TypeId.bool, type_bridge.bridgeType(.boolean, &table, null));
try std.testing.expectEqual(TypeId.string, type_bridge.bridgeType(.string_type, &table, null));
try std.testing.expectEqual(TypeId.any, type_bridge.bridgeType(.any_type, &table, null));
}
test "bridgeType: composite types" {
const alloc = std.testing.allocator;
var table = TypeTable.init(alloc);
defer table.deinit();
// Pointer
const ptr_id = type_bridge.bridgeType(.{ .pointer_type = .{ .pointee_name = "s32" } }, &table, null);
try std.testing.expectEqual(TypeInfo{ .pointer = .{ .pointee = .s32 } }, table.get(ptr_id));
// Slice
const slice_id = type_bridge.bridgeType(.{ .slice_type = .{ .element_name = "u8" } }, &table, null);
try std.testing.expectEqual(TypeInfo{ .slice = .{ .element = .u8 } }, table.get(slice_id));
// Array
const arr_id = type_bridge.bridgeType(.{ .array_type = .{ .element_name = "f32", .length = 4 } }, &table, null);
try std.testing.expectEqual(TypeInfo{ .array = .{ .element = .f32, .length = 4 } }, table.get(arr_id));
// Optional
const opt_id = type_bridge.bridgeType(.{ .optional_type = .{ .child_name = "s64" } }, &table, null);
try std.testing.expectEqual(TypeInfo{ .optional = .{ .child = .s64 } }, table.get(opt_id));
}
test "resolveAstType: primitive type_expr" {
const alloc = std.testing.allocator;
var table = TypeTable.init(alloc);

@@ -2,7 +2,6 @@ const std = @import("std");
const Allocator = std.mem.Allocator;
const ast = @import("../ast.zig");
const Node = ast.Node;
const sx_types = @import("../types.zig");
const ir_types = @import("types.zig");
const TypeId = ir_types.TypeId;
const TypeInfo = ir_types.TypeInfo;
@@ -106,122 +105,8 @@ pub fn resolveAstType(node: ?*const Node, table: *TypeTable, alias_map: AliasMap
};
}
// ── types.Type → TypeId ─────────────────────────────────────────────────
// Translate an existing codegen Type value into an IR TypeId. Used when
// we have access to the codegen's resolved type info (Phase 3+).
pub fn bridgeType(ty: sx_types.Type, table: *TypeTable, alias_map: AliasMap) TypeId {
return switch (ty) {
.signed => |w| switch (w) {
8 => .s8,
16 => .s16,
32 => .s32,
64 => .s64,
// Non-standard width: intern the exact width rather than quantising
// to s64 (which would silently change the type's size).
else => table.intern(.{ .signed = w }),
},
.unsigned => |w| switch (w) {
8 => .u8,
16 => .u16,
32 => .u32,
64 => .u64,
else => table.intern(.{ .unsigned = w }),
},
.f32 => .f32,
.f64 => .f64,
.void_type => .void,
.boolean => .bool,
.string_type => .string,
.any_type => .any,
.usize_type => .usize,
.isize_type => .isize,
.enum_type => |name| resolveNamedType(name, .@"enum", table),
.struct_type => |name| resolveNamedType(name, .@"struct", table),
.union_type => |name| resolveNamedType(name, .@"union", table),
.array_type => |info| blk: {
const elem = resolveTypeName(info.element_name, table, alias_map);
break :blk table.arrayOf(elem, info.length);
},
.slice_type => |info| blk: {
const elem = resolveTypeName(info.element_name, table, alias_map);
break :blk table.sliceOf(elem);
},
.pointer_type => |info| blk: {
const pointee = resolveTypeName(info.pointee_name, table, alias_map);
break :blk table.ptrTo(pointee);
},
.many_pointer_type => |info| blk: {
const elem = resolveTypeName(info.element_name, table, alias_map);
break :blk table.manyPtrTo(elem);
},
.optional_type => |info| blk: {
const child = resolveTypeName(info.child_name, table, alias_map);
break :blk table.optionalOf(child);
},
.vector_type => |info| blk: {
const elem = resolveTypeName(info.element_name, table, alias_map);
break :blk table.vectorOf(elem, info.length);
},
.function_type => |info| blk: {
const alloc = table.alloc;
var param_ids = std.ArrayList(TypeId).empty;
for (info.param_types) |pt| {
param_ids.append(alloc, bridgeType(pt, table, alias_map)) catch unreachable;
}
const ret_id = bridgeType(info.return_type.*, table, alias_map);
break :blk table.functionType(param_ids.items, ret_id);
},
.closure_type => |info| blk: {
const alloc = table.alloc;
var param_ids = std.ArrayList(TypeId).empty;
for (info.param_types) |pt| {
param_ids.append(alloc, bridgeType(pt, table, alias_map)) catch unreachable;
}
const ret_id = bridgeType(info.return_type.*, table, alias_map);
break :blk table.closureType(param_ids.items, ret_id);
},
.tuple_type => |info| blk: {
const alloc = table.alloc;
var field_ids = std.ArrayList(TypeId).empty;
for (info.field_types) |ft| {
field_ids.append(alloc, bridgeType(ft, table, alias_map)) catch unreachable;
}
var name_ids: ?[]const StringId = null;
if (info.field_names) |names| {
var ids = std.ArrayList(StringId).empty;
for (names) |n| {
ids.append(alloc, table.internString(n)) catch unreachable;
}
name_ids = ids.items;
}
break :blk table.intern(.{ .tuple = .{
.fields = field_ids.items,
.names = name_ids,
} });
},
.meta_type => .any, // meta types map to Any for now
.unresolved => .unresolved,
};
}
// ── Internal helpers ─────────────────────────────────────────────────────
const NamedKind = enum { @"struct", @"enum", @"union" };
fn resolveNamedType(name: []const u8, kind: NamedKind, table: *TypeTable) TypeId {
// Check if primitive first
if (resolveTypePrimitive(name)) |id| return id;
// Register as a named type
const name_id = table.internString(name);
return switch (kind) {
.@"struct" => table.intern(.{ .@"struct" = .{ .name = name_id, .fields = &.{} } }),
.@"enum" => table.intern(.{ .@"enum" = .{ .name = name_id, .variants = &.{} } }),
.@"union" => table.intern(.{ .@"union" = .{ .name = name_id, .fields = &.{} } }),
};
}
/// Resolve a bare type name. The algorithm lives in `type_resolver.zig`
/// (`TypeResolver.resolveNamed`, the single source); `type_bridge` forwards the
/// caller-threaded `alias_map` (the single-source `ProgramIndex.type_alias_map`).

@@ -23,7 +23,9 @@ pub const Document = struct {
version: i64,
/// AST root for this file only (not merged).
root: ?*sx.ast.Node,
/// Sema results for this file (references are relative to this source).
/// Editor index for this file — symbols/references/types for navigation,
/// completion, and hover (references are relative to this source). Not a
/// diagnostic source; see `sema.zig` module doc.
sema: ?sx.sema.SemaResult,
/// Last successful sema (preserved across parse failures for completions).
last_good_sema: ?sx.sema.SemaResult = null,

@@ -158,7 +158,7 @@ pub const Server = struct {
const text = jsonStr(jsonGet(td, "text") orelse return) orelse return;
const version = jsonInt(jsonGet(td, "version") orelse return) orelse return;
try self.analyzeAndPublish(uri, text, version);
try self.refreshEditorIndex(uri, text, version);
self.runProjectCheck();
}
@@ -173,7 +173,7 @@ pub const Server = struct {
const last = changes_arr[changes_arr.len - 1];
const text = jsonStr(jsonGet(last, "text") orelse return) orelse return;
try self.analyzeAndPublish(uri, text, version);
try self.refreshEditorIndex(uri, text, version);
}
fn handleDidClose(_: *Server, params: std.json.Value) void {
@@ -1919,41 +1919,16 @@ pub const Server = struct {
// ---- Core analysis pipeline ----
fn analyzeAndPublish(self: *Server, uri: []const u8, text: []const u8, version: i64) !void {
/// Refresh the editor index for a document (symbols/references/types that
/// power navigation, completion, hover, and token classification). Publishes
/// no diagnostics — authoritative diagnostics come only from the canonical
/// compiler pipeline in `runProjectCheck`.
fn refreshEditorIndex(self: *Server, uri: []const u8, text: []const u8, version: i64) !void {
const file_path = uriToFilePath(uri) orelse "";
const source = try self.allocator.dupeZ(u8, text);
const doc = try self.documents.openOrUpdate(file_path, source, version);
self.documents.analyzeDocument(doc) catch {};
// Publish diagnostics from sema
if (doc.sema) |sema| {
try self.sendDiagnostics(uri, semaToLspDiags(self.allocator, doc.source, sema.diagnostics));
} else {
try self.sendDiagnostics(uri, &.{});
}
}
fn semaToLspDiags(allocator: std.mem.Allocator, source: [:0]const u8, diags: []const sx.errors.Diagnostic) []const lsp.Diagnostic {
var result = std.ArrayList(lsp.Diagnostic).empty;
for (diags) |d| {
const range = if (d.span) |span| spanToRange(source, span) else lsp.Range{
.start = .{ .line = 0, .character = 0 },
.end = .{ .line = 0, .character = 1 },
};
const severity: u32 = switch (d.level) {
.err => 1,
.warn => 2,
.note => 3,
.help => 4,
};
result.append(allocator, .{
.range = range,
.severity = severity,
.message = d.message,
}) catch continue;
}
return result.items;
}
fn sendDiagnostics(self: *Server, uri: []const u8, diagnostics: []const lsp.Diagnostic) !void {
@@ -2012,8 +1987,8 @@ pub const Server = struct {
}
/// Drive the whole-program check from the workspace entry point and publish
/// the real compiler's diagnostics per file (runs on save; the sema layer
/// keeps live per-keystroke feedback).
/// the real compiler's diagnostics per file. Runs on open and save; this is
/// the sole source of LSP diagnostics (the editor index publishes none).
fn runProjectCheck(self: *Server) void {
if (self.root_path.len == 0) return;
const entry_path = std.fmt.allocPrint(self.allocator, "{s}/main.sx", .{self.root_path}) catch return;

@@ -565,7 +565,7 @@ pub const Analyzer = struct {
/// Infer an approximate editor `Type` for an expression (hover/completion;
/// metadata only — NOT a compiler type decision, which uses `TypeId`).
/// Uses fn_signatures for call return types, struct_types for field access,
/// symbols for identifier types, and Type.widen for arithmetic promotion.
/// and symbols for identifier types.
pub fn inferExprType(self: *Analyzer, node: *const Node) Type {
return switch (node.data) {
.int_literal => Type.s(64),
@@ -578,9 +578,13 @@ pub const Analyzer = struct {
switch (binop.op) {
.eq, .neq, .lt, .lte, .gt, .gte, .and_op, .or_op, .in_op => return .boolean,
else => {
// Editor display only: approximate an arithmetic result as
// its left operand's type (or the right when the left is
// unresolved). Numeric promotion is a compiler decision on
// `TypeId`, never recomputed here.
const lhs_ty = self.inferExprType(binop.lhs);
const rhs_ty = self.inferExprType(binop.rhs);
return Type.widen(lhs_ty, rhs_ty);
if (lhs_ty == .unresolved) return self.inferExprType(binop.rhs);
return lhs_ty;
},
}
},

@@ -2,12 +2,13 @@ const std = @import("std");
const ast = @import("ast.zig");
const Node = ast.Node;
/// Editor metadata type model, used only by `src/sema.zig` (the language-server
/// symbol/type index) for navigation, completion, and hover. NOT the compiler's
/// source of truth: lowering, codegen, and layout use the canonical
/// `TypeId` / `TypeTable` model in `src/ir/types.zig`. Do not expand this to
/// carry new compiler semantics; the architecture endpoint (phase A8) is to
/// delete it or reduce it to display-only data derived from `TypeId`.
/// Editor-indexing and parse-time name metadata — used by `src/sema.zig` (the
/// language-server symbol/type index) for navigation, completion, and hover, and
/// by `src/parser.zig` for parse-time primitive-name classification. This is NOT
/// a compiler type model: it carries no type-resolution surface (no widening,
/// convertibility, or layout). The canonical model the compiler resolves, lowers,
/// and lays out against is `TypeId` / `TypeTable` in `src/ir/types.zig`. Keep this
/// display- and classification-only; never add resolution semantics here.
pub const Type = union(enum) {
// Variable-width integers (1-64 bits)
signed: u8,
@@ -86,61 +87,6 @@ pub const Type = union(enum) {
field_types: []const Type,
};
/// Content-based equality: compares string fields by content, not pointer identity.
pub fn eql(self: Type, other: Type) bool {
const Tag = std.meta.Tag(Type);
const self_tag: Tag = self;
const other_tag: Tag = other;
if (self_tag != other_tag) return false;
return switch (self) {
.signed => |w| w == other.signed,
.unsigned => |w| w == other.unsigned,
.f32, .f64, .void_type, .boolean, .string_type, .any_type, .usize_type, .isize_type, .unresolved => true,
.enum_type => |n| std.mem.eql(u8, n, other.enum_type),
.struct_type => |n| std.mem.eql(u8, n, other.struct_type),
.union_type => |n| std.mem.eql(u8, n, other.union_type),
.array_type => |info| info.length == other.array_type.length and
std.mem.eql(u8, info.element_name, other.array_type.element_name),
.slice_type => |info| std.mem.eql(u8, info.element_name, other.slice_type.element_name),
.pointer_type => |info| std.mem.eql(u8, info.pointee_name, other.pointer_type.pointee_name),
.many_pointer_type => |info| std.mem.eql(u8, info.element_name, other.many_pointer_type.element_name),
.vector_type => |info| info.length == other.vector_type.length and
std.mem.eql(u8, info.element_name, other.vector_type.element_name),
.function_type => |info| {
const o = other.function_type;
if (info.param_types.len != o.param_types.len) return false;
for (info.param_types, o.param_types) |a, b| {
if (!a.eql(b)) return false;
}
return info.return_type.eql(o.return_type.*);
},
.closure_type => |info| {
const o = other.closure_type;
if (info.param_types.len != o.param_types.len) return false;
for (info.param_types, o.param_types) |a, b| {
if (!a.eql(b)) return false;
}
return info.return_type.eql(o.return_type.*);
},
.optional_type => |info| std.mem.eql(u8, info.child_name, other.optional_type.child_name),
.meta_type => |info| std.mem.eql(u8, info.name, other.meta_type.name),
.tuple_type => |info| {
const o = other.tuple_type;
if (info.field_types.len != o.field_types.len) return false;
for (info.field_types, o.field_types) |a, b| {
if (!a.eql(b)) return false;
}
// If both have names, compare them
if (info.field_names != null and o.field_names != null) {
for (info.field_names.?, o.field_names.?) |a, b| {
if (!std.mem.eql(u8, a, b)) return false;
}
}
return true;
},
};
}
// Convenience constructors
pub fn s(width: u8) Type {
return .{ .signed = width };
@@ -255,13 +201,6 @@ pub const Type = union(enum) {
return fromName(node.data.type_expr.name);
}
pub fn isEnum(self: Type) bool {
return switch (self) {
.enum_type => true,
else => false,
};
}
pub fn isStruct(self: Type) bool {
return switch (self) {
.struct_type => true,
@@ -269,20 +208,6 @@ pub const Type = union(enum) {
};
}
pub fn isUnion(self: Type) bool {
return switch (self) {
.union_type => true,
else => false,
};
}
pub fn isTuple(self: Type) bool {
return switch (self) {
.tuple_type => true,
else => false,
};
}
pub fn isOptional(self: Type) bool {
return switch (self) {
.optional_type => true,
@@ -290,33 +215,6 @@ pub const Type = union(enum) {
};
}
pub fn optionalChild(self: Type) ?[]const u8 {
return switch (self) {
.optional_type => |info| info.child_name,
else => null,
};
}
pub fn isAny(self: Type) bool {
return switch (self) {
.any_type => true,
else => false,
};
}
pub fn isString(self: Type) bool {
return self == .string_type;
}
/// Returns true for both `string` (null-terminated) and `[]u8` (byte slice)
pub fn isStringLike(self: Type) bool {
if (self == .string_type) return true;
if (self.isSlice()) {
return std.mem.eql(u8, self.slice_type.element_name, "u8");
}
return false;
}
pub fn isSlice(self: Type) bool {
return switch (self) {
.slice_type => true,
@@ -324,13 +222,6 @@ pub const Type = union(enum) {
};
}
pub fn sliceElementType(self: Type) ?Type {
return switch (self) {
.slice_type => |info| fromName(info.element_name),
else => null,
};
}
pub fn isPointer(self: Type) bool {
return switch (self) {
.pointer_type => true,
@@ -352,35 +243,6 @@ pub const Type = union(enum) {
};
}
pub fn manyPointerElementType(self: Type) ?Type {
return switch (self) {
.many_pointer_type => |info| fromName(info.element_name),
else => null,
};
}
pub fn isFunctionType(self: Type) bool {
return switch (self) {
.function_type => true,
else => false,
};
}
pub fn isClosureType(self: Type) bool {
return switch (self) {
.closure_type => true,
else => false,
};
}
/// Returns true for both bare function pointers and closures
pub fn isCallable(self: Type) bool {
return switch (self) {
.function_type, .closure_type => true,
else => false,
};
}
pub fn isArray(self: Type) bool {
return switch (self) {
.array_type => true,
@@ -388,156 +250,6 @@ pub const Type = union(enum) {
};
}
pub fn isVector(self: Type) bool {
return switch (self) {
.vector_type => true,
else => false,
};
}
pub fn vectorElementType(self: Type) ?Type {
return switch (self) {
.vector_type => |info| fromName(info.element_name),
else => null,
};
}
pub fn isFloat(self: Type) bool {
return switch (self) {
.f32, .f64 => true,
else => false,
};
}
pub fn isInt(self: Type) bool {
return self.isSigned() or self.isUnsigned();
}
pub fn isSigned(self: Type) bool {
return switch (self) {
.signed => true,
else => false,
};
}
pub fn isUnsigned(self: Type) bool {
return switch (self) {
.unsigned => true,
else => false,
};
}
pub fn bitWidth(self: Type) u32 {
return switch (self) {
.signed => |w| w,
.unsigned => |w| w,
.f32 => 32,
.f64 => 64,
.boolean => 1,
.pointer_type, .many_pointer_type, .function_type => 64,
.closure_type => 128, // { ptr, ptr } = 16 bytes
else => 0,
};
}
/// Check if this type can be implicitly converted to `target` without `xx`.
/// Safe (implicit) conversions:
/// - Same type
/// - Both unsigned int, target width >= source width
/// - Both signed int, target width >= source width
/// - Unsigned to signed, target width strictly > source width
/// - Any int to any float
/// - Float to wider float (f32 → f64)
/// Everything else requires `xx`.
pub fn isImplicitlyConvertibleTo(self: Type, target: Type) bool {
if (self.eql(target)) return true;
// string <-> []u8: same layout, bidirectional implicit conversion
if (self == .string_type and target.isSlice() and
std.mem.eql(u8, target.slice_type.element_name, "u8")) return true;
if (self.isSlice() and std.mem.eql(u8, self.slice_type.element_name, "u8") and
target == .string_type) return true;
// *void is universal pointer (both directions)
if (self.isPointer() and target.isPointer()) {
if (std.mem.eql(u8, self.pointer_type.pointee_name, "void")) return true;
if (std.mem.eql(u8, target.pointer_type.pointee_name, "void")) return true;
}
// *T → [*]T: pointer to element is implicitly convertible to many-pointer
// null (*void) → [*]T is also allowed
if (self.isPointer() and target.isManyPointer()) {
if (std.mem.eql(u8, self.pointer_type.pointee_name, "void")) return true;
return std.mem.eql(u8, self.pointer_type.pointee_name, target.many_pointer_type.element_name);
}
// [*]T → *void: any many-pointer converts to void pointer
if (self.isManyPointer() and target.isPointer()) {
return std.mem.eql(u8, target.pointer_type.pointee_name, "void");
}
// Tuple → tuple: same field count and each field implicitly convertible
if (self.isTuple() and target.isTuple()) {
const si = self.tuple_type;
const ti = target.tuple_type;
if (si.field_types.len != ti.field_types.len) return false;
for (si.field_types, ti.field_types) |sf, tf| {
if (!sf.isImplicitlyConvertibleTo(tf)) return false;
}
return true;
}
// T → ?T: any type implicitly wraps into its optional
if (target.isOptional()) {
const child_name = target.optional_type.child_name;
// null → ?T
if (self.isPointer() and std.mem.eql(u8, self.pointer_type.pointee_name, "void")) return true;
// ?T → ?U when T → U
if (self.isOptional()) {
const self_child = fromName(self.optional_type.child_name) orelse return false;
const target_child = fromName(child_name) orelse return false;
return self_child.isImplicitlyConvertibleTo(target_child);
}
// T → ?T: check if self matches the child type
if (fromName(child_name)) |child_type| {
return self.eql(child_type) or self.isImplicitlyConvertibleTo(child_type);
}
// Non-primitive child (struct/enum name): compare by name
return switch (self) {
.struct_type => |n| std.mem.eql(u8, n, child_name),
.enum_type => |n| std.mem.eql(u8, n, child_name),
.union_type => |n| std.mem.eql(u8, n, child_name),
else => false,
};
}
const src_float = self.isFloat();
const dst_float = target.isFloat();
const src_int = self.isInt();
// Float → wider float
if (src_float and dst_float) {
return target.bitWidth() >= self.bitWidth();
}
// Int → float (always safe)
if (src_int and dst_float) return true;
// Both unsigned → target width >= source width
if (self.isUnsigned() and target.isUnsigned()) {
return target.bitWidth() >= self.bitWidth();
}
// Both signed → target width >= source width
if (self.isSigned() and target.isSigned()) {
return target.bitWidth() >= self.bitWidth();
}
// Unsigned → signed: target must be strictly wider
if (self.isUnsigned() and target.isSigned()) {
return target.bitWidth() > self.bitWidth();
}
// Everything else requires xx
return false;
}
fn fmtAlloc(allocator: std.mem.Allocator, comptime fmt: []const u8, args: anytype) ![]const u8 {
var buf: [128]u8 = undefined;
const result = std.fmt.bufPrint(&buf, fmt, args) catch
@@ -621,65 +333,4 @@ pub const Type = union(enum) {
},
};
}
/// Widen two types to a common type for binary operations.
/// Used for arithmetic type promotion (e.g., s16 + s32 → s32, int + float → float).
pub fn widen(a: Type, b: Type) Type {
// Same type → return it
if (a.eql(b)) return a;
// Tuple + tuple → return a if same field count
if (a.isTuple() and b.isTuple()) {
if (a.tuple_type.field_types.len == b.tuple_type.field_types.len) return a;
}
// Vector + vector of same dimensions → return a
if (a.isVector() and b.isVector()) return a;
// Vector + scalar → return vector (scalar will be broadcast)
if (a.isVector() and !b.isVector()) return a;
if (b.isVector() and !a.isVector()) return b;
const a_float = a.isFloat();
const b_float = b.isFloat();
const a_int = a.isInt();
const b_int = b.isInt();
// Both float → wider float
if (a_float and b_float) {
return if (a.bitWidth() >= b.bitWidth()) a else b;
}
// int + float → float
if (a_int and b_float) return b;
if (b_int and a_float) return a;
// Both signed → wider signed
if (a.isSigned() and b.isSigned()) {
return Type.s(@intCast(@max(a.bitWidth(), b.bitWidth())));
}
// Both unsigned → wider unsigned
if (a.isUnsigned() and b.isUnsigned()) {
return Type.u(@intCast(@max(a.bitWidth(), b.bitWidth())));
}
// signed + unsigned (mixed)
if (a_int and b_int) {
const aw = a.bitWidth();
const bw = b.bitWidth();
const max_w = @max(aw, bw);
// If same width, need one extra bit for sign; otherwise max is enough
const need: u32 = if (aw == bw) max_w + 1 else max_w;
const capped: u8 = @intCast(@min(need, 128));
return Type.s(capped);
}
// Optional types: widen inner types
if (a.isOptional() and b.isOptional()) return a;
// Pointer types: both are pointers → return first (all are opaque ptr at LLVM level)
if ((a.isPointer() or a.isManyPointer()) and (b.isPointer() or b.isManyPointer())) return a;
return a;
}
};