feat(asm): Phase C.1 + D — inline asm codegen (runs end-to-end)

lowerAsmExpr stops bailing and builds the inline_asm op: resolves each operand's
effective name (§II.5 — explicit [name] else the {reg} pin), interns
template/constraints/clobbers, lowers input Refs, derives the result TypeId
(0→void, 1→T). Adds the last deferred validation (every %[name] must name an
operand). Multi-output (N>1) bails with a named "Phase E" diagnostic.
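
The naming and result-type rules above can be sketched in a few lines. This is a minimal illustration in Python, not the Zig in `lowerAsmExpr`; the function names and the regex used to find a `{reg}` pin are assumptions made for the example.

```python
import re
from typing import List, Optional


def pinned_register(constraint: str) -> Optional[str]:
    """Return the register of a `{reg}` pin such as "={rax}", else None.

    Assumption: the pin is the first `{...}` group in the constraint.
    """
    m = re.search(r"\{([^{}]+)\}", constraint)
    return m.group(1) if m else None


def effective_name(explicit: Optional[str], constraint: str) -> str:
    """Explicit [name] wins; otherwise fall back to the pinned register.

    An operand with neither stays anonymous (empty string).
    """
    if explicit is not None:
        return explicit
    return pinned_register(constraint) or ""


def result_type(out_value_types: List[str]) -> str:
    """0 value outputs -> void, 1 -> that type, N>1 -> rejected for now."""
    if not out_value_types:
        return "void"
    if len(out_value_types) == 1:
        return out_value_types[0]
    raise NotImplementedError("multi-output inline asm is Phase E")
```

Under these assumptions `[quot] "={rax}"` keeps its explicit name `quot`, an unnamed `"{rdx}"` operand becomes addressable as `%[rdx]`, and a bare `"r"` operand stays anonymous.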

emitInlineAsm (backend/llvm/ops.zig) ports Zig's airAssembly: assembles the LLVM
constraint string (outputs → inputs → ~{clobber}, ',' → '|'), rewrites the
template (%[name]→${N}, %%→%, $→$$, %=→${:uid}), then LLVMGetInlineAsm +
LLVMBuildCall2 (AT&T dialect). Dispatch wired in emit_llvm.zig (replacing the C.0
@panic tripwire).
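
The two string transformations described above can be sketched as follows. This is a hedged Python illustration of the same rules, not the Zig code in `ops.zig`; the `(role, name, constraint)` tuple shape and the function names are assumptions made for the example.

```python
from typing import List, Tuple

# (role, effective_name, constraint); role is "out" or "in".
# This tuple layout is hypothetical, chosen only for the sketch.
Operand = Tuple[str, str, str]


def build_constraints(operands: List[Operand], clobbers: List[str]) -> str:
    """Outputs first, then inputs, then ~{clobber}; ',' inside one
    operand's constraint becomes '|' (LLVM's alternative separator)."""
    parts = []
    for want in ("out", "in"):
        for role, _name, cons in operands:
            if role == want:
                parts.append(cons.replace(",", "|"))
    parts += ["~{" + c + "}" for c in clobbers]
    return ",".join(parts)


def operand_index(operands: List[Operand], name: str) -> int:
    """Position of a named operand in the outputs-then-inputs order."""
    idx = 0
    for want in ("out", "in"):
        for role, op_name, _cons in operands:
            if role != want:
                continue
            if op_name and op_name == name:
                return idx
            idx += 1
    raise KeyError(name)


def render_template(template: str, operands: List[Operand]) -> str:
    """Rewrite $ -> $$, %% -> %, %= -> ${:uid}, %[name] -> ${N},
    %[name:mod] -> ${N:mod}. Assumes every %[name] was validated."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        nxt = template[i + 1] if i + 1 < len(template) else ""
        if ch == "$":
            out.append("$$")
            i += 1
        elif ch == "%" and nxt == "%":
            out.append("%")
            i += 2
        elif ch == "%" and nxt == "=":
            out.append("${:uid}")
            i += 2
        elif ch == "%" and nxt == "[":
            close = template.index("]", i + 2)
            name, sep, modifier = template[i + 2:close].partition(":")
            n = operand_index(operands, name)
            out.append("${" + str(n) + (":" + modifier if sep else "") + "}")
            i = close + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For the `add` example in this commit, operands `out "=r"`, `a "r"`, `b "r"` give the constraint string `=r,r,r`, and the template `add %[out], %[a], %[b]` renders as `add ${0}, ${1}, ${2}`, matching the `.ir` snapshot.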

inferType gains an .asm_expr arm (expr_typer.zig) so a bare `x := asm {…-> T}`
binding types correctly — without it the binding inferred .unresolved and
silently produced 0.

llvm_shim.c: LLVMInitializeNativeAsmParser() — the JIT must assemble inline asm
at run time.

Verified end-to-end on the aarch64 host: `add` with two register-class inputs
and a value output runs (exit 42), `mov` into a value output runs (exit 99), and
`nop volatile` runs (exit 0). The emitted IR has the expected form:
`call i64 asm "add ${0}, ${1}, ${2}", "=r,r,r"(…)`.

Locked with 1645 (aarch64 add, runs; ir-only on non-aarch64) + 1646 (:= binding).
Updated 1640 (now Phase-E bail) + 1642 (now runs).

zig build test green (654 corpus, 446 unit).
agra
2026-06-15 21:39:54 +03:00
parent 6c08de8ec1
commit 5a5e04c6d5
23 changed files with 395 additions and 50 deletions

@@ -6,7 +6,31 @@ commit, one step at a time per the cadence rule (no commit may both add a test
and make it pass).
## Last completed step
**C.0** — IR op `inline_asm` (lock; no behavior change). Added `inline_asm:
**C.1 + D** — inline asm CODEGEN (lowering builds the op + LLVM emit). **Inline
assembly now runs end-to-end.** `lowerAsmExpr` (`src/ir/lower/expr.zig`) stops
bailing: it resolves each operand's effective name (§II.5 auto-naming), interns
template/constraints/clobbers, lowers input `Ref`s, derives the result `TypeId`
(0→void, 1→T), and builds the `inline_asm` op. Added a `%[name]`-references-a-
real-operand check (the last deferred validation). Multi-output (N>1) still bails
loudly ("Phase E"). `emitInlineAsm` (`src/backend/llvm/ops.zig`, port of Zig's
`airAssembly`): assembles the LLVM constraint string (outputs→inputs→`~{clobber}`,
`,`→`|`), rewrites the template (`%[name]`→`${N}`, `%%`→`%`, `$`→`$$`, `%=`→
`${:uid}`), then `LLVMGetInlineAsm` + `LLVMBuildCall2` (AT&T). Dispatch wired
(`emit_llvm.zig`, replacing the C.0 `@panic`). **`llvm_shim.c`**: added
`LLVMInitializeNativeAsmParser()` — the JIT must assemble inline asm at run time.
Verified end-to-end: aarch64 `add`/`mov` run on the host (exit 42/99), `nop volatile`
runs (1642 now exit 0), IR is textbook (`call i64 asm "add ${0}, ${1}, ${2}",
"=r,r,r"(…)`). Locked with `examples/1645-platform-asm-aarch64-add.sx` (runs on
aarch64, ir-only elsewhere via `.build` + `.ir`). Also added the `inferType`
`.asm_expr` arm (`src/ir/expr_typer.zig`, 0→void / 1→T) — without it a bare
`x := asm {…-> T}` binding inferred `.unresolved` and silently produced 0;
regression-locked with `examples/1646-platform-asm-value-binding.sx`. Updated
1640 (now Phase-E bail) + 1642 (now runs). `zig build test` green (654 corpus,
446 unit). Files: `src/ir/lower/expr.zig`, `src/backend/llvm/ops.zig`,
`src/ir/emit_llvm.zig`, `src/ir/expr_typer.zig`, `llvm_shim.c`,
`examples/164{0,2,5,6}-*`.
Prior: **C.0** — IR op `inline_asm` (lock; no behavior change). Added `inline_asm:
InlineAsm` to the IR `Op` union + the `InlineAsm` struct (`template: StringId`,
`operands: []const AsmOperand` {role/name/constraint/operand}, `clobbers:
[]const StringId`, `has_side_effects`) in `src/ir/inst.zig` — all strings
@@ -88,40 +112,40 @@ guards fire: corrupting the `.ir` → IR mismatch; deleting it → the require-f
`src/corpus_run.test.zig`, `examples/1639-*`.
## Current state
Phase A underway: `asm { … }` lexes (A.0) and **parses** into `AsmExpr` (A.1);
lowering bails LOUD + named (no IR op / emit yet). Result-type derivation, the
operand auto-naming rule, and the validation checklist are **Phase B** (not yet
implemented — any asm reaching lowering errors out). The adopted **operand
auto-naming rule** (design §II.5, decided this session): name auto-derived from a
`{reg}` pin; explicit `[name]` only when it differs or for register-class (`=r`)
operands; echo form `[eax] "={eax}"` rejected. Parser stores `name: ?[]const u8`;
the rule is a Phase-B (typing) concern, so the parser needs no change for it.
**Inline assembly works end-to-end for 0/1 value outputs.** Pipeline complete:
lex (A.0) → parse (A.1) → validate (B.0/B.1 + the `%[name]` check) → IR op (C.0)
→ lower-builds-op + LLVM emit + JIT asm-parser init (C.1/D). Single-value-output
and no-output `volatile` asm assemble and execute on the host JIT; the auto-naming
rule (§II.5) is live (effective name = explicit `[name]` else `{reg}`). **Phase E
(multi-output tuples) is the remaining feature gap** — N>1 value outputs bail with
a named "Phase E" diagnostic (1640). `-> @place` write-through outputs are still
rejected at parse (Phase 2). Global asm (Phase F) not started.
Known orthogonal bug: **issue 0137**: `sx run` on a program with no `main`
segfaults (`src/target.zig:256-273`, unguarded JIT entry lookup). Pre-existing,
asm-independent; does NOT block the ASM stream (every example has a `main`).
Phase B–E feasibility already confirmed against the live tree
Phase E–F feasibility already confirmed against the live tree
(`LLVMGetInlineAsm` / `LLVMBuildCall2` / `LLVMAppendModuleInlineAsm` in LLVM@19
`Core.h`; ERR-stream `extractvalue`→tuple in `emit_llvm.zig:726-927`; lib-less
`extern`, 60 sites; `--target` a global CLI flag).
## Next step
**C.1 + D together** (must land as one green step) — wire `lowerAsmExpr` to BUILD
the `inline_asm` op (intern template + constraints + clobber names; resolve each
operand's effective name via the §II.5 auto-naming rule; lower input `Ref`s;
compute the result `TypeId` from the `out_value` operands — 0→void, 1→T, N→tuple,
named) AND implement `emitInlineAsm` in `src/ir/emit_llvm.zig` (replacing the
`@panic` tripwire) — the port of Zig's `airAssembly`: assemble the LLVM constraint
string (outputs `=`/`+`, inputs, `clobbers`→`~{name}`), rewrite `%[name]`→`${N}` /
`%%` / `%=`, `LLVMGetInlineAsm` + `LLVMBuildCall2`, AT&T dialect. They land
together because the moment lowering stops bailing, emit is reached — a half-step
would hit the tripwire. First target: the single-value-output syscall on
`x86_64-linux` (ir-only via a `.build` `{ "target": "x86_64-linux" }` + `.ir`
snapshot, since the host is aarch64). Result-type derivation for `expr_typer.zig`
(`inferType` `.asm_expr` arm) also lands here — now observable. Then E (multi-
return tuples) + remaining validation (`%[name]` references a real operand). See
`PLAN-ASM.md` Phases C–E + design §II.6.
**Phase E** (multi-output tuples) — replace the N>1 "Phase E" bail in
`lowerAsmExpr`: build a tuple `TypeId` from the `out_value` types (named via the
effective-name rule), set it as the op result, and in `emitInlineAsm` make the
LLVM return type an anonymous struct `{T1,…,Tn}`, then `extractvalue i` per
`out_value` → assemble the sx tuple. Lock with `divmod`→`(quot,rem)` (reuse 1640's
shape, now running) + `cpuid`→4-tuple, arch-pinned. See `PLAN-ASM.md` Phase E +
design §II.6 (multi-return). Also worth adding: the x86_64-linux syscall-write
example (ir-only on this host via `.build { "target": "x86_64-linux" }` + `.ir`)
to lock the cross-target lowering, per the plan's D verification.
Then Phase 2 (`-> @place` write-through / read-write / indirect-memory) and Phase
F (global asm + `extern` call into asm symbols). Result-type derivation for the
0/1 cases now lives in BOTH `lowerAsmExpr` (the op's `Inst.ty`) and
`expr_typer.zig`'s `inferType` (for `:=`/value-position typing); Phase E extends
both to the tuple case.
## Log
- (init) Plan + design doc written; ASM stream opened.
@@ -151,6 +175,12 @@ return tuples) + remaining validation (`%[name]` references a real operand). See
- (C.0) IR op `inline_asm: InlineAsm` + interp `bailDetail` + print arm + emit
`@panic` tripwire (Phase D). No behavior change (lowering still bails). Unit
test `inline_asm op shape`. `zig build test` green (652 corpus, 446 unit).
- (C.1+D) CODEGEN — `lowerAsmExpr` builds the op (effective names, interned
strings, input Refs, 0/1 result type) + `%[name]` validation; `emitInlineAsm`
(constraint string + template rewrite + `LLVMGetInlineAsm`/`BuildCall2`, AT&T);
`inferType` arm; `LLVMInitializeNativeAsmParser` for the JIT. **Inline asm runs
end-to-end.** N>1 bails (Phase E). Locked with 1645 (aarch64 add, runs) + 1646
(`:=` binding); updated 1640/1642. `zig build test` green (654 corpus, 446 unit).
## Known issues
- **0137** — `sx run` on a program with no `main` segfaults (unguarded JIT entry

@@ -1,10 +1,10 @@
// ASM stream Phase A.1 — `asm { … }` PARSES into an AsmExpr: template, named
// value outputs (`[quot] "={rax}" -> u64`), register-pinned inputs, and a
// `clobbers(.…)` clause are all accepted with no parse error. Codegen is not
// implemented yet (the IR op + LLVM emit land in Phases C–E), so lowering bails
// LOUD + named. This example pins that intermediate diagnostic; a later phase
// turns it into a running multi-return example. Called from `main` so lowering
// actually reaches the asm body (lazy lowering skips uncalled functions).
// ASM stream — `asm { … }` parses + validates the full rich shape: named value
// outputs (`[quot] "={rax}" -> u64`), register-pinned inputs, and a
// `clobbers(.…)` clause, all accepted. This is a MULTI-output (tuple-returning)
// asm, which is deferred to Phase E — so lowering bails LOUD + named with the
// specific "Phase E" diagnostic (single-output asm already runs; see 1645).
// Called from `main` so lowering reaches the asm body (lazy lowering skips
// uncalled functions).
divmod :: (n: u64, d: u64) -> (quot: u64, rem: u64) {
return asm {
"divq %[d]",

@@ -1,5 +1,5 @@
// ASM stream Phase B — the no-output form IS accepted when `volatile` is
// present: validation passes, and lowering then bails on the not-yet-
// implemented codegen (Phases C–E). Confirms the volatile rule's positive side.
// ASM stream — the no-output `volatile` form runs end-to-end: a bare `nop`
// (no operands, no result) assembles and executes cleanly (exit 0). Confirms
// the no-output⇒volatile rule's positive side AND the zero-operand emit path.
nop :: () { asm volatile { "nop" }; }
main :: () { nop(); }

@@ -0,0 +1,10 @@
// ASM stream Phase D — inline assembly that RUNS end-to-end. An aarch64 `add`
// with two register-class inputs (`%[a]`, `%[b]`) and a value output (`%[out]`)
// returned from the function. The `.build` pins aarch64-macOS: on a matching
// host the runner executes it (exit 42); elsewhere it falls to ir-only mode and
// asserts the `.ir` snapshot (the inline_asm op + LLVM `call asm` are target-
// independent in the IR text). Regression for the full lower→emit→JIT path.
add_asm :: (a: i64, b: i64) -> i64 {
return asm { "add %[out], %[a], %[b]", [out] "=r" -> i64, [a] "r" = a, [b] "r" = b };
}
main :: () -> i64 { return add_asm(40, 2); }

@@ -0,0 +1,9 @@
// ASM stream Phase D — a bare `x := asm { … -> T }` binding (not a direct
// `return asm`) types correctly: the value output flows through the local and
// out as the exit code. Regression for the `inferType` `.asm_expr` arm (without
// it the binding inferred `.unresolved` and silently produced 0). aarch64-pinned
// via `.build` → runs on a matching host, ir-only elsewhere.
main :: () -> i64 {
x := asm { "mov %[out], #99", [out] "=r" -> i64 };
return x;
}

@@ -1,4 +1,4 @@
error: inline assembly codegen is not yet implemented (ASM stream: lowering + emit land in Phases C–E)
error: multi-output (tuple-returning) inline assembly is not yet implemented (ASM stream Phase E)
--> examples/1640-platform-asm-parse.sx:9:12
|
9 | return asm {

@@ -1,5 +1 @@
error: inline assembly codegen is not yet implemented (ASM stream: lowering + emit land in Phases C–E)
--> examples/1642-platform-asm-nop-volatile.sx:4:13
|
4 | nop :: () { asm volatile { "nop" }; }
| ^^^^^^^^^^^^^^^^^^^^^^

@@ -0,0 +1 @@
{ "target": "macos" }

@@ -0,0 +1 @@
42

@@ -0,0 +1,21 @@
; Function Attrs: nounwind
define internal i64 @add_asm(i64 %0, i64 %1) #0 {
entry:
%alloca = alloca i64, align 8
store i64 %0, ptr %alloca, align 8
%allocaN = alloca i64, align 8
store i64 %1, ptr %allocaN, align 8
%load = load i64, ptr %alloca, align 8
%loadN = load i64, ptr %allocaN, align 8
%asm = call i64 asm "add ${0}, ${1}, ${2}", "=r,r,r"(i64 %load, i64 %loadN)
ret i64 %asm
}
; Function Attrs: nounwind
define i32 @main() #0 {
entry:
%call = call i64 @add_asm(i64 40, i64 2)
%ca.tr = trunc i64 %call to i32
ret i32 %ca.tr
}

@@ -0,0 +1 @@

@@ -0,0 +1 @@

@@ -0,0 +1 @@
{ "target": "macos" }

@@ -0,0 +1 @@
99

@@ -0,0 +1,11 @@
; Function Attrs: nounwind
define i32 @main() #0 {
entry:
%asm = call i64 asm "mov ${0}, #99", "=r"()
%alloca = alloca i64, align 8
store i64 %asm, ptr %alloca, align 8
%load = load i64, ptr %alloca, align 8
%ca.tr = trunc i64 %load to i32
ret i32 %ca.tr
}

@@ -0,0 +1 @@

@@ -0,0 +1 @@

@@ -14,4 +14,7 @@ void sx_llvm_init_all_targets(void) {
void sx_llvm_init_native_target(void) {
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
// Required for inline assembly: the JIT must assemble the asm template at
// run time, which needs the target's asm parser (ASM stream Phase D).
LLVMInitializeNativeAsmParser();
}

@@ -24,6 +24,7 @@ const Call = ir_inst.Call;
const CallIndirect = ir_inst.CallIndirect;
const ObjcMsgSend = ir_inst.ObjcMsgSend;
const JniMsgSend = ir_inst.JniMsgSend;
const InlineAsm = ir_inst.InlineAsm;
const BuiltinCall = ir_inst.BuiltinCall;
const TriOp = ir_inst.TriOp;
const Branch = ir_inst.Branch;
@@ -774,6 +775,161 @@ pub const Ops = struct {
self.e.mapRef(result);
}
/// Inline assembly (ASM stream Phase D) — the port of Zig's `airAssembly`.
/// Handles 0 value outputs (void) and 1 (scalar); multi-output tuples are
/// Phase E (lowering bails before reaching here). Builds the LLVM constraint
/// string, rewrites the `%[name]` template, then `LLVMGetInlineAsm` +
/// `LLVMBuildCall2`.
pub fn emitInlineAsm(self: Ops, instruction: *const Inst, a: InlineAsm) void {
const e = self.e;
const alloc = e.alloc;
var n_inputs: usize = 0;
for (a.operands) |op| {
if (op.role == .input) n_inputs += 1;
}
// Result LLVM type: void (no value output) or the single scalar.
const ret_ty = if (instruction.ty == .void) e.cached_void else e.toLLVMType(instruction.ty);
// One LLVM call param per input operand, in source order.
const param_types = alloc.alloc(c.LLVMTypeRef, n_inputs) catch unreachable;
defer alloc.free(param_types);
const call_args = alloc.alloc(c.LLVMValueRef, n_inputs) catch unreachable;
defer alloc.free(call_args);
{
var i: usize = 0;
for (a.operands) |op| {
if (op.role != .input) continue;
const raw_ty = e.argIRTypeOrFail(op.operand);
const llvm_ty = e.toLLVMType(raw_ty);
param_types[i] = llvm_ty;
call_args[i] = e.coerceArg(e.resolveRef(op.operand), llvm_ty);
i += 1;
}
}
// ── Constraint string: outputs first, then inputs, then ~{clobber}. ──
var cons: std.ArrayList(u8) = .empty;
defer cons.deinit(alloc);
self.appendAsmConstraints(&cons, a, false); // outputs (out_value / out_place)
self.appendAsmConstraints(&cons, a, true); // inputs
for (a.clobbers) |cl| {
if (cons.items.len != 0) cons.append(alloc, ',') catch unreachable;
cons.appendSlice(alloc, "~{") catch unreachable;
cons.appendSlice(alloc, e.ir_mod.types.getString(cl)) catch unreachable;
cons.append(alloc, '}') catch unreachable;
}
// ── Template rewrite: %[name]->${N}, %%->%, $->$$, %=->${:uid}. ──
var rendered: std.ArrayList(u8) = .empty;
defer rendered.deinit(alloc);
self.renderAsmTemplate(&rendered, a);
const fn_ty = c.LLVMFunctionType(ret_ty, param_types.ptr, @intCast(n_inputs), 0);
const asm_val = c.LLVMGetInlineAsm(
fn_ty,
rendered.items.ptr,
rendered.items.len,
cons.items.ptr,
cons.items.len,
@intFromBool(a.has_side_effects),
0, // IsAlignStack
c.LLVMInlineAsmDialectATT,
0, // CanThrow
);
const label: [*:0]const u8 = if (instruction.ty == .void) "" else "asm";
const result = c.LLVMBuildCall2(e.builder, fn_ty, asm_val, call_args.ptr, @intCast(n_inputs), label);
// Always mapRef — the IR Ref counter advances regardless of result type.
e.mapRef(result);
}
/// Append the constraint fragments for one role group (outputs or inputs),
/// comma-separated, with each operand's `,` rewritten to LLVM's `|`
/// (alternative-constraint separator). Mirrors `FuncGen.airAssembly`.
fn appendAsmConstraints(self: Ops, cons: *std.ArrayList(u8), a: InlineAsm, inputs: bool) void {
const e = self.e;
const alloc = e.alloc;
for (a.operands) |op| {
const is_input = op.role == .input;
if (is_input != inputs) continue;
if (cons.items.len != 0) cons.append(alloc, ',') catch unreachable;
const s = e.ir_mod.types.getString(op.constraint);
for (s) |ch| cons.append(alloc, if (ch == ',') '|' else ch) catch unreachable;
}
}
/// The positional index of a named operand in the LLVM operand list
/// (outputs first, then inputs) — the `N` in `%[name]` → `${N}`. Lowering
/// guarantees every `%[name]` names an operand, so callers can assume a hit.
fn asmOperandIndex(self: Ops, a: InlineAsm, name: []const u8) ?usize {
const e = self.e;
var idx: usize = 0;
for ([_]bool{ false, true }) |inputs| {
for (a.operands) |op| {
const is_input = op.role == .input;
if (is_input != inputs) continue;
if (op.name != .empty and std.mem.eql(u8, e.ir_mod.types.getString(op.name), name)) return idx;
idx += 1;
}
}
return null;
}
/// Rewrite the asm template into LLVM form. State machine over the bytes:
/// `$`→`$$`, `%%`→`%`, `%=`→`${:uid}`, `%[name]`→`${N}`, `%[name:mod]`→
/// `${N:mod}`. Port of `FuncGen.zig`'s template rewriter.
fn renderAsmTemplate(self: Ops, out: *std.ArrayList(u8), a: InlineAsm) void {
const e = self.e;
const alloc = e.alloc;
const tmpl = e.ir_mod.types.getString(a.template);
var i: usize = 0;
while (i < tmpl.len) {
const ch = tmpl[i];
if (ch == '$') {
out.appendSlice(alloc, "$$") catch unreachable;
i += 1;
continue;
}
if (ch == '%' and i + 1 < tmpl.len) {
const nxt = tmpl[i + 1];
if (nxt == '%') {
out.append(alloc, '%') catch unreachable;
i += 2;
continue;
}
if (nxt == '=') {
out.appendSlice(alloc, "${:uid}") catch unreachable;
i += 2;
continue;
}
if (nxt == '[') {
const close = std.mem.indexOfScalarPos(u8, tmpl, i + 2, ']').?; // lowering validated
var name = tmpl[i + 2 .. close];
var modifier: ?[]const u8 = null;
if (std.mem.indexOfScalar(u8, name, ':')) |colon| {
modifier = name[colon + 1 ..];
name = name[0..colon];
}
const idx = self.asmOperandIndex(a, name).?; // lowering validated
var buf: [16]u8 = undefined;
const ds = std.fmt.bufPrint(&buf, "{d}", .{idx}) catch unreachable;
out.appendSlice(alloc, "${") catch unreachable;
out.appendSlice(alloc, ds) catch unreachable;
if (modifier) |m| {
out.append(alloc, ':') catch unreachable;
out.appendSlice(alloc, m) catch unreachable;
}
out.append(alloc, '}') catch unreachable;
i = close + 1;
continue;
}
}
out.append(alloc, ch) catch unreachable;
i += 1;
}
}
pub fn emitCall(self: Ops, instruction: *const Inst, call_op: Call) void {
// Evaluate comptime functions at compile time
const callee_func = &self.e.ir_mod.functions.items[call_op.callee.index()];

@@ -1563,11 +1563,7 @@ pub const LLVMEmitter = struct {
// ── Calls ─────────────────────────────────────────────
.objc_msg_send => |msg| self.ops().emitObjcMsgSend(instruction, msg),
.jni_msg_send => |msg| self.ops().emitJniMsgSend(instruction, msg),
// Tripwire (ASM stream): the IR op exists (Phase C.0) but emit lands
// in Phase D. Until then `lowerAsmExpr` still bails, so no inline_asm
// op is ever created — reaching here means lowering switched over
// before emit was ready. Crash loudly rather than miscompile.
.inline_asm => @panic("inline_asm reached LLVM emit before Phase D — lowering must still bail until emitInlineAsm lands"),
.inline_asm => |a| self.ops().emitInlineAsm(instruction, a),
.call => |call_op| self.ops().emitCall(instruction, call_op),
.call_indirect => |call_op| self.ops().emitCallIndirect(instruction, call_op),

@@ -398,6 +398,22 @@ pub const ExprTyper = struct {
}
break :blk self.l.inferExprType(nc.rhs);
},
// Inline asm result type from the `out_value` operands: 0 → void,
// 1 → that operand's type. N>1 (tuple) is Phase E → `.unresolved`
// here (lowering bails on it anyway). Mirrors `lowerAsmExpr`, so a
// bare `x := asm {…-> T}` binding types correctly.
.asm_expr => |ae| blk: {
var n_out: usize = 0;
var first_out: ?*Node = null;
for (ae.operands) |op| {
if (op.role != .out_value) continue;
n_out += 1;
if (first_out == null) first_out = op.payload;
}
if (n_out == 0) break :blk .void;
if (n_out == 1) break :blk self.l.resolveTypeWithBindings(first_out.?);
break :blk .unresolved;
},
// Statements don't produce values (`.return_stmt` is handled above
// as `.noreturn` — it diverges rather than yielding `void`).
.assignment, .var_decl, .const_decl, .fn_decl,

@@ -2261,9 +2261,98 @@ pub fn lowerAsmExpr(self: *Lowering, ae: *const ast.AsmExpr, span: ast.Span) Ref
return self.emitPlaceholder("inline_asm");
}
// Shape is valid; codegen just isn't implemented yet (Phases C–E).
diags.addFmt(.err, span, "inline assembly codegen is not yet implemented (ASM stream: lowering + emit land in Phases C–E)", .{});
// (4) Every `%[name]` in the template must name an operand (effective name:
// explicit `[name]` or auto-derived register). Caught here so emit's
// template rewriter never sees an unknown reference. §II.6.
{
const tmpl = ae.template.data.string_literal.raw;
var i: usize = 0;
while (i < tmpl.len) : (i += 1) {
if (tmpl[i] != '%' or i + 1 >= tmpl.len) continue;
const nxt = tmpl[i + 1];
if (nxt == '%' or nxt == '=') {
i += 1;
continue;
}
if (nxt != '[') continue;
const close = std.mem.indexOfScalarPos(u8, tmpl, i + 2, ']') orelse {
diags.addFmt(.err, span, "unterminated `%[` in asm template", .{});
return self.emitPlaceholder("inline_asm");
};
var ref_name = tmpl[i + 2 .. close];
if (std.mem.indexOfScalar(u8, ref_name, ':')) |colon| ref_name = ref_name[0..colon];
var found = false;
for (ae.operands) |op| {
const eff = op.name orelse (pinnedRegister(op.constraint) orelse "");
if (eff.len != 0 and std.mem.eql(u8, eff, ref_name)) {
found = true;
break;
}
}
if (!found) {
diags.addFmt(.err, span, "asm template references `%[{s}]` but no operand is named `{s}`", .{ ref_name, ref_name });
return self.emitPlaceholder("inline_asm");
}
i = close;
}
}
// ── Build the IR op (C.1). D emits 0 or 1 value output; N>1 (tuple result)
// is Phase E — bail loudly until then. ──
var n_value_outputs: usize = 0;
for (ae.operands) |op| {
if (op.role == .out_value) n_value_outputs += 1;
}
if (n_value_outputs > 1) {
diags.addFmt(.err, span, "multi-output (tuple-returning) inline assembly is not yet implemented (ASM stream Phase E)", .{});
return self.emitPlaceholder("inline_asm");
}
// Result type: 0 outputs → void; 1 → that operand's resolved type. (The
// resolver diagnoses an unresolvable type and returns `.unresolved`.)
var result_ty: TypeId = .void;
for (ae.operands) |op| {
if (op.role == .out_value) {
result_ty = self.resolveTypeWithBindings(op.payload);
break;
}
}
if (result_ty == .unresolved) return self.emitPlaceholder("inline_asm");
// IR operands, in source order (= `%N` index space + LLVM operand order).
const ir_ops = self.alloc.alloc(inst_mod.InlineAsm.AsmOperand, ae.operands.len) catch unreachable;
for (ae.operands, 0..) |op, i| {
// Effective name (design §II.5): explicit `[name]`, else auto-derived
// from a `{reg}` pin, else anonymous (`.empty`).
const eff_name: []const u8 = op.name orelse (pinnedRegister(op.constraint) orelse "");
ir_ops[i] = .{
.role = switch (op.role) {
.out_value => .out_value,
.out_place => .out_place,
.input => .input,
},
.name = if (eff_name.len == 0) types.StringId.empty else self.module.types.internString(eff_name),
.constraint = self.module.types.internString(op.constraint),
// input → the lowered value Ref; an output yields its value (none).
.operand = if (op.role == .input) self.lowerExpr(op.payload) else Ref.none,
};
}
const ir_clobbers = self.alloc.alloc(types.StringId, ae.clobbers.len) catch unreachable;
for (ae.clobbers, 0..) |cl, i| {
ir_clobbers[i] = self.module.types.internString(cl);
}
// Template text RAW — no sx escape processing (matches `#string` literal
// bytes; the `%[name]`/`%%`/`$` rewrite happens at emit). §II.11.
const template_text = ae.template.data.string_literal.raw;
return self.builder.emit(.{ .inline_asm = .{
.template = self.module.types.internString(template_text),
.operands = ir_ops,
.clobbers = ir_clobbers,
.has_side_effects = ae.is_volatile,
} }, result_ty);
}
/// If `node` names a `for xs: (*x)` by-ref capture (an `*elem`), returns