feat(asm): Phase C.1 + D — inline asm codegen (runs end-to-end)
lowerAsmExpr stops bailing and builds the inline_asm op: resolves each operand's
effective name (§II.5 — explicit [name] else the {reg} pin), interns
template/constraints/clobbers, lowers input Refs, derives the result TypeId
(0→void, 1→T). Adds the last deferred validation (every %[name] must name an
operand). Multi-output (N>1) bails with a named "Phase E" diagnostic.
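
A minimal sketch of the effective-name rule described above, written in C purely for illustration. `pinned_register` and `effective_name` are hypothetical helpers, not the compiler's actual functions, and the sketch assumes a register pin is spelled `{reg}` inside the constraint string (as in `"={rax}"`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `constraint` contains a `{reg}` pin, copy the
 * register name into `out` and return its length; otherwise write an empty
 * string and return 0. Names longer than the buffer are truncated. */
static size_t pinned_register(const char *constraint, char *out, size_t cap) {
    assert(constraint != NULL && out != NULL && cap > 0);
    const char *open = strchr(constraint, '{');
    const char *close = open ? strchr(open, '}') : NULL;
    if (open == NULL || close == NULL) {
        out[0] = '\0';
        return 0;
    }
    size_t len = (size_t)(close - open - 1);
    if (len >= cap) len = cap - 1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return len;
}

/* Effective name: the explicit `[name]` if present, else the pinned
 * register, else the empty string (an anonymous operand). */
static const char *effective_name(const char *explicit_name,
                                  const char *constraint,
                                  char *buf, size_t cap) {
    if (explicit_name != NULL) return explicit_name;
    pinned_register(constraint, buf, cap);
    return buf;
}
```

Under these assumptions, `effective_name(NULL, "={rax}", buf, sizeof buf)` yields `rax`, while a register-class constraint such as `"=r"` with no explicit name stays anonymous.
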
emitInlineAsm (backend/llvm/ops.zig) ports Zig's airAssembly: assembles the LLVM
constraint string (outputs → inputs → ~{clobber}, ',' → '|'), rewrites the
template (%[name]→${N}, %%→%, $→$$, %=→${:uid}), then LLVMGetInlineAsm +
LLVMBuildCall2 (AT&T dialect). Dispatch wired in emit_llvm.zig (replacing the C.0
@panic tripwire).
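
The constraint-string assembly described above can be sketched as follows. This is an illustrative C version, not the Zig code in this commit; `append_fragment` and `build_constraints` are hypothetical names, and clobber names are assumed to contain no commas.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append prefix + frag + suffix to the NUL-terminated string in `dst`,
 * comma-separating it from earlier fragments and rewriting any ',' inside
 * `frag` to '|' (LLVM's alternative-constraint separator). */
static void append_fragment(char *dst, size_t cap, const char *prefix,
                            const char *frag, const char *suffix) {
    size_t n = strlen(dst);
    /* Room for an optional ',' separator plus the trailing NUL. */
    assert(n + strlen(prefix) + strlen(frag) + strlen(suffix) + 2 <= cap);
    if (n != 0) dst[n++] = ',';
    for (const char *p = prefix; *p; p++) dst[n++] = *p;
    for (const char *p = frag; *p; p++) dst[n++] = (*p == ',') ? '|' : *p;
    for (const char *p = suffix; *p; p++) dst[n++] = *p;
    dst[n] = '\0';
}

/* Order matters: outputs first, then inputs, then clobbers as ~{name}. */
static void build_constraints(char *dst, size_t cap,
                              const char *const *outputs, size_t n_out,
                              const char *const *inputs, size_t n_in,
                              const char *const *clobbers, size_t n_clob) {
    assert(dst != NULL && cap > 0);
    dst[0] = '\0';
    for (size_t i = 0; i < n_out; i++) append_fragment(dst, cap, "", outputs[i], "");
    for (size_t i = 0; i < n_in; i++) append_fragment(dst, cap, "", inputs[i], "");
    for (size_t i = 0; i < n_clob; i++) append_fragment(dst, cap, "~{", clobbers[i], "}");
}
```

With one `=r` output and two `r` inputs this produces `=r,r,r`, the constraint string that appears in the verified IR.
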
inferType gains an .asm_expr arm (expr_typer.zig) so a bare `x := asm {…-> T}`
binding types correctly — without it the binding inferred .unresolved and
silently produced 0.
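
The result-type rule that `inferType` and `lowerAsmExpr` share (0 value outputs gives void, 1 gives that operand's type, more than 1 is deferred) can be sketched in C as below. The enums and `asm_result_type` are hypothetical stand-ins for the compiler's `TypeId` and operand roles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the operand roles and type ids. */
enum role { ROLE_OUT_VALUE, ROLE_OUT_PLACE, ROLE_INPUT };
enum type_id { TY_UNRESOLVED, TY_VOID, TY_I64, TY_U64 };

struct operand {
    enum role role;
    enum type_id declared; /* the `-> T` type of a value output */
};

/* 0 value outputs -> void, 1 -> that output's type, N>1 -> unresolved
 * (tuple results are deferred to Phase E). */
static enum type_id asm_result_type(const struct operand *ops, size_t n) {
    assert(ops != NULL || n == 0);
    size_t n_out = 0;
    enum type_id first = TY_UNRESOLVED;
    for (size_t i = 0; i < n; i++) {
        if (ops[i].role != ROLE_OUT_VALUE) continue;
        if (n_out == 0) first = ops[i].declared;
        n_out++;
    }
    if (n_out == 0) return TY_VOID;
    if (n_out == 1) return first;
    return TY_UNRESOLVED;
}
```

Keeping this rule in one small function is a design choice of the sketch only; the commit itself implements the rule separately in lowering and in the expression typer.
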
llvm_shim.c: LLVMInitializeNativeAsmParser() — the JIT must assemble inline asm
at run time.
Verified end-to-end on the aarch64 host: `mov`/`add` with register-class inputs
and a value output run (exit 42/99), `nop volatile` runs (exit 0). The emitted
IR has the expected shape: `call i64 asm "add ${0}, ${1}, ${2}", "=r,r,r"(…)`.
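
The template in the IR line quoted above is the result of the `%[name]` rewrite. A rough C sketch of that rewrite follows; `operand_index` and `render_template` are hypothetical names, and the operand names are assumed to be supplied in LLVM operand order (outputs first, then inputs).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Positional index of `name` (length `len`) in `names`, or -1 if unknown. */
static int operand_index(const char *const *names, size_t n,
                         const char *name, size_t len) {
    for (size_t i = 0; i < n; i++)
        if (strlen(names[i]) == len && memcmp(names[i], name, len) == 0)
            return (int)i;
    return -1;
}

/* Rewrite `tmpl` into `out`: $ -> $$, %% -> %, %= -> ${:uid},
 * %[name] -> ${N}, %[name:mod] -> ${N:mod}. Returns 0 on success, -1 on an
 * unterminated `%[`, an unknown operand name, or insufficient space. */
static int render_template(const char *tmpl, const char *const *names,
                           size_t n_names, char *out, size_t cap) {
    assert(tmpl != NULL && out != NULL && cap > 0);
    size_t o = 0;
    for (size_t i = 0; tmpl[i] != '\0';) {
        char piece[64];
        size_t step = 1;
        if (tmpl[i] == '$') {
            strcpy(piece, "$$");
        } else if (tmpl[i] == '%' && tmpl[i + 1] == '%') {
            strcpy(piece, "%");
            step = 2;
        } else if (tmpl[i] == '%' && tmpl[i + 1] == '=') {
            strcpy(piece, "${:uid}");
            step = 2;
        } else if (tmpl[i] == '%' && tmpl[i + 1] == '[') {
            const char *name = tmpl + i + 2;
            const char *close = strchr(name, ']');
            if (close == NULL) return -1;
            const char *colon = memchr(name, ':', (size_t)(close - name));
            size_t name_len = (size_t)((colon ? colon : close) - name);
            int idx = operand_index(names, n_names, name, name_len);
            if (idx < 0) return -1;
            if (colon != NULL)
                snprintf(piece, sizeof piece, "${%d:%.*s}", idx,
                         (int)(close - colon - 1), colon + 1);
            else
                snprintf(piece, sizeof piece, "${%d}", idx);
            step = (size_t)(close - tmpl) + 1 - i;
        } else {
            piece[0] = tmpl[i];
            piece[1] = '\0';
        }
        size_t len = strlen(piece);
        if (o + len + 1 > cap) return -1; /* keep room for the NUL */
        memcpy(out + o, piece, len);
        o += len;
        i += step;
    }
    out[o] = '\0';
    return 0;
}
```

Unlike the emit path in this commit, which relies on lowering having validated every reference, the sketch reports an unknown name itself by returning -1.
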
Locked with 1645 (aarch64 add, runs; ir-only on non-aarch64) + 1646 (:= binding).
Updated 1640 (now Phase-E bail) + 1642 (now runs).
zig build test green (654 corpus, 446 unit).
@@ -6,7 +6,31 @@ commit, one step at a time per the cadence rule (no commit may both add a test
 and make it pass).

 ## Last completed step
-**C.0** — IR op `inline_asm` (lock; no behavior change). Added `inline_asm:
+**C.1 + D** — inline asm CODEGEN (lowering builds the op + LLVM emit). **Inline
+assembly now runs end-to-end.** `lowerAsmExpr` (`src/ir/lower/expr.zig`) stops
+bailing: it resolves each operand's effective name (§II.5 auto-naming), interns
+template/constraints/clobbers, lowers input `Ref`s, derives the result `TypeId`
+(0→void, 1→T), and builds the `inline_asm` op. Added a `%[name]`-references-a-
+real-operand check (the last deferred validation). Multi-output (N>1) still bails
+loudly ("Phase E"). `emitInlineAsm` (`src/backend/llvm/ops.zig`, port of Zig's
+`airAssembly`): assembles the LLVM constraint string (outputs→inputs→`~{clobber}`,
+`,`→`|`), rewrites the template (`%[name]`→`${N}`, `%%`→`%`, `$`→`$$`, `%=`→
+`${:uid}`), then `LLVMGetInlineAsm` + `LLVMBuildCall2` (AT&T). Dispatch wired
+(`emit_llvm.zig`, replacing the C.0 `@panic`). **`llvm_shim.c`**: added
+`LLVMInitializeNativeAsmParser()` — the JIT must assemble inline asm at run time.
+Verified end-to-end: aarch64 `add`/`mov` run on the host (exit 42), `nop volatile`
+runs (1642 now exit 0), IR is textbook (`call i64 asm "add ${0},${1},${2}",
+"=r,r,r"(…)`). Locked with `examples/1645-platform-asm-aarch64-add.sx` (runs on
+aarch64, ir-only elsewhere via `.build` + `.ir`). Also added the `inferType`
+`.asm_expr` arm (`src/ir/expr_typer.zig`, 0→void / 1→T) — without it a bare
+`x := asm {…-> T}` binding inferred `.unresolved` and silently produced 0;
+regression-locked with `examples/1646-platform-asm-value-binding.sx`. Updated
+1640 (now Phase-E bail) + 1642 (now runs). `zig build test` green (654 corpus,
+446 unit). Files: `src/ir/lower/expr.zig`, `src/backend/llvm/ops.zig`,
+`src/ir/emit_llvm.zig`, `src/ir/expr_typer.zig`, `llvm_shim.c`,
+`examples/164{0,2,5,6}-*`.
+
+Prior: **C.0** — IR op `inline_asm` (lock; no behavior change). Added `inline_asm:
 InlineAsm` to the IR `Op` union + the `InlineAsm` struct (`template: StringId`,
 `operands: []const AsmOperand` {role/name/constraint/operand}, `clobbers:
 []const StringId`, `has_side_effects`) in `src/ir/inst.zig` — all strings
@@ -88,40 +112,40 @@ guards fire: corrupting the `.ir` → IR mismatch; deleting it → the require-f
 `src/corpus_run.test.zig`, `examples/1639-*`.

 ## Current state
-Phase A underway: `asm { … }` lexes (A.0) and **parses** into `AsmExpr` (A.1);
-lowering bails LOUD + named (no IR op / emit yet). Result-type derivation, the
-operand auto-naming rule, and the validation checklist are **Phase B** (not yet
-implemented — any asm reaching lowering errors out). The adopted **operand
-auto-naming rule** (design §II.5, decided this session): name auto-derived from a
-`{reg}` pin; explicit `[name]` only when it differs or for register-class (`=r`)
-operands; echo form `[eax] "={eax}"` rejected. Parser stores `name: ?[]const u8`;
-the rule is a Phase-B (typing) concern, so the parser needs no change for it.
+**Inline assembly works end-to-end for 0/1 value outputs.** Pipeline complete:
+lex (A.0) → parse (A.1) → validate (B.0/B.1 + the `%[name]` check) → IR op (C.0)
+→ lower-builds-op + LLVM emit + JIT asm-parser init (C.1/D). Single-value-output
+and no-output `volatile` asm assemble and execute on the host JIT; the auto-naming
+rule (§II.5) is live (effective name = explicit `[name]` else `{reg}`). **Phase E
+(multi-output tuples) is the remaining feature gap** — N>1 value outputs bail with
+a named "Phase E" diagnostic (1640). `-> @place` write-through outputs are still
+rejected at parse (Phase 2). Global asm (Phase F) not started.

 Known orthogonal bug: **issue 0137** — `sx run` on a program with no `main`
 segfaults (`src/target.zig:256-273`, unguarded JIT entry lookup). Pre-existing,
 asm-independent; does NOT block the ASM stream (every example has a `main`).

-Phase B–E feasibility already confirmed against the live tree
+Phase E–F feasibility already confirmed against the live tree
 (`LLVMGetInlineAsm` / `LLVMBuildCall2` / `LLVMAppendModuleInlineAsm` in LLVM@19
 `Core.h`; ERR-stream `extractvalue`→tuple in `emit_llvm.zig:726-927`; lib-less
 `extern`, 60 sites; `--target` a global CLI flag).

 ## Next step
-**C.1 + D together** (must land as one green step) — wire `lowerAsmExpr` to BUILD
-the `inline_asm` op (intern template + constraints + clobber names; resolve each
-operand's effective name via the §II.5 auto-naming rule; lower input `Ref`s;
-compute the result `TypeId` from the `out_value` operands — 0→void, 1→T, N→tuple,
-named) AND implement `emitInlineAsm` in `src/ir/emit_llvm.zig` (replacing the
-`@panic` tripwire) — the port of Zig's `airAssembly`: assemble the LLVM constraint
-string (outputs `=`/`+`, inputs, `clobbers`→`~{name}`), rewrite `%[name]`→`${N}` /
-`%%` / `%=`, `LLVMGetInlineAsm` + `LLVMBuildCall2`, AT&T dialect. They land
-together because the moment lowering stops bailing, emit is reached — a half-step
-would hit the tripwire. First target: the single-value-output syscall on
-`x86_64-linux` (ir-only via a `.build` `{ "target": "x86_64-linux" }` + `.ir`
-snapshot, since the host is aarch64). Result-type derivation for `expr_typer.zig`
-(`inferType` `.asm_expr` arm) also lands here — now observable. Then E (multi-
-return tuples) + remaining validation (`%[name]` references a real operand). See
-`PLAN-ASM.md` Phases C–E + design §II.6.
+**Phase E** (multi-output tuples) — replace the N>1 "Phase E" bail in
+`lowerAsmExpr`: build a tuple `TypeId` from the `out_value` types (named via the
+effective-name rule), set it as the op result, and in `emitInlineAsm` make the
+LLVM return type an anonymous struct `{T1,…,Tn}`, then `extractvalue i` per
+`out_value` → assemble the sx tuple. Lock with `divmod`→`(quot,rem)` (reuse 1640's
+shape, now running) + `cpuid`→4-tuple, arch-pinned. See `PLAN-ASM.md` Phase E +
+design §II.6 (multi-return). Also worth adding: the x86_64-linux syscall-write
+example (ir-only on this host via `.build { "target": "x86_64-linux" }` + `.ir`)
+to lock the cross-target lowering, per the plan's D verification.
+
+Then Phase 2 (`-> @place` write-through / read-write / indirect-memory) and Phase
+F (global asm + `extern` call into asm symbols). Result-type derivation for the
+0/1 cases now lives in BOTH `lowerAsmExpr` (the op's `Inst.ty`) and
+`expr_typer.zig`'s `inferType` (for `:=`/value-position typing); Phase E extends
+both to the tuple case.

 ## Log
 - (init) Plan + design doc written; ASM stream opened.
@@ -151,6 +175,12 @@ return tuples) + remaining validation (`%[name]` references a real operand). See
 - (C.0) IR op `inline_asm: InlineAsm` + interp `bailDetail` + print arm + emit
 `@panic` tripwire (Phase D). No behavior change (lowering still bails). Unit
 test `inline_asm op shape`. `zig build test` green (652 corpus, 446 unit).
+- (C.1+D) CODEGEN — `lowerAsmExpr` builds the op (effective names, interned
+strings, input Refs, 0/1 result type) + `%[name]` validation; `emitInlineAsm`
+(constraint string + template rewrite + `LLVMGetInlineAsm`/`BuildCall2`, AT&T);
+`inferType` arm; `LLVMInitializeNativeAsmParser` for the JIT. **Inline asm runs
+end-to-end.** N>1 bails (Phase E). Locked with 1645 (aarch64 add, runs) + 1646
+(`:=` binding); updated 1640/1642. `zig build test` green (654 corpus, 446 unit).

 ## Known issues
 - **0137** — `sx run` on a program with no `main` segfaults (unguarded JIT entry
@@ -1,10 +1,10 @@
-// ASM stream Phase A.1 — `asm { … }` PARSES into an AsmExpr: template, named
-// value outputs (`[quot] "={rax}" -> u64`), register-pinned inputs, and a
-// `clobbers(.…)` clause are all accepted with no parse error. Codegen is not
-// implemented yet (the IR op + LLVM emit land in Phases C–E), so lowering bails
-// LOUD + named. This example pins that intermediate diagnostic; a later phase
-// turns it into a running multi-return example. Called from `main` so lowering
-// actually reaches the asm body (lazy lowering skips uncalled functions).
+// ASM stream — `asm { … }` parses + validates the full rich shape: named value
+// outputs (`[quot] "={rax}" -> u64`), register-pinned inputs, and a
+// `clobbers(.…)` clause, all accepted. This is a MULTI-output (tuple-returning)
+// asm, which is deferred to Phase E — so lowering bails LOUD + named with the
+// specific "Phase E" diagnostic (single-output asm already runs; see 1645).
+// Called from `main` so lowering reaches the asm body (lazy lowering skips
+// uncalled functions).
 divmod :: (n: u64, d: u64) -> (quot: u64, rem: u64) {
     return asm {
         "divq %[d]",
@@ -1,5 +1,5 @@
-// ASM stream Phase B — the no-output form IS accepted when `volatile` is
-// present: validation passes, and lowering then bails on the not-yet-
-// implemented codegen (Phases C–E). Confirms the volatile rule's positive side.
+// ASM stream — the no-output `volatile` form runs end-to-end: a bare `nop`
+// (no operands, no result) assembles and executes cleanly (exit 0). Confirms
+// the no-output⇒volatile rule's positive side AND the zero-operand emit path.
 nop :: () { asm volatile { "nop" }; }
 main :: () { nop(); }
10
examples/1645-platform-asm-aarch64-add.sx
Normal file
@@ -0,0 +1,10 @@
+// ASM stream Phase D — inline assembly that RUNS end-to-end. An aarch64 `add`
+// with two register-class inputs (`%[a]`, `%[b]`) and a value output (`%[out]`)
+// returned from the function. The `.build` pins aarch64-macOS: on a matching
+// host the runner executes it (exit 42); elsewhere it falls to ir-only mode and
+// asserts the `.ir` snapshot (the inline_asm op + LLVM `call asm` are target-
+// independent in the IR text). Regression for the full lower→emit→JIT path.
+add_asm :: (a: i64, b: i64) -> i64 {
+    return asm { "add %[out], %[a], %[b]", [out] "=r" -> i64, [a] "r" = a, [b] "r" = b };
+}
+main :: () -> i64 { return add_asm(40, 2); }
9
examples/1646-platform-asm-value-binding.sx
Normal file
@@ -0,0 +1,9 @@
+// ASM stream Phase D — a bare `x := asm { … -> T }` binding (not a direct
+// `return asm`) types correctly: the value output flows through the local and
+// out as the exit code. Regression for the `inferType` `.asm_expr` arm (without
+// it the binding inferred `.unresolved` and silently produced 0). aarch64-pinned
+// via `.build` → runs on a matching host, ir-only elsewhere.
+main :: () -> i64 {
+    x := asm { "mov %[out], #99", [out] "=r" -> i64 };
+    return x;
+}
@@ -1,4 +1,4 @@
-error: inline assembly codegen is not yet implemented (ASM stream: lowering + emit land in Phases C–E)
+error: multi-output (tuple-returning) inline assembly is not yet implemented (ASM stream Phase E)
 --> examples/1640-platform-asm-parse.sx:9:12
 |
 9 | return asm {
@@ -1 +1 @@
-1
+0
@@ -1,5 +1 @@
-error: inline assembly codegen is not yet implemented (ASM stream: lowering + emit land in Phases C–E)
---> examples/1642-platform-asm-nop-volatile.sx:4:13
-|
-4 | nop :: () { asm volatile { "nop" }; }
-| ^^^^^^^^^^^^^^^^^^^^^^
+
1
examples/expected/1645-platform-asm-aarch64-add.build
Normal file
@@ -0,0 +1 @@
+{ "target": "macos" }
1
examples/expected/1645-platform-asm-aarch64-add.exit
Normal file
@@ -0,0 +1 @@
+42
21
examples/expected/1645-platform-asm-aarch64-add.ir
Normal file
@@ -0,0 +1,21 @@
+
+; Function Attrs: nounwind
+define internal i64 @add_asm(i64 %0, i64 %1) #0 {
+entry:
+  %alloca = alloca i64, align 8
+  store i64 %0, ptr %alloca, align 8
+  %allocaN = alloca i64, align 8
+  store i64 %1, ptr %allocaN, align 8
+  %load = load i64, ptr %alloca, align 8
+  %loadN = load i64, ptr %allocaN, align 8
+  %asm = call i64 asm "add ${0}, ${1}, ${2}", "=r,r,r"(i64 %load, i64 %loadN)
+  ret i64 %asm
+}
+
+; Function Attrs: nounwind
+define i32 @main() #0 {
+entry:
+  %call = call i64 @add_asm(i64 40, i64 2)
+  %ca.tr = trunc i64 %call to i32
+  ret i32 %ca.tr
+}
1
examples/expected/1645-platform-asm-aarch64-add.stderr
Normal file
@@ -0,0 +1 @@
|
||||
|
||||
1
examples/expected/1645-platform-asm-aarch64-add.stdout
Normal file
@@ -0,0 +1 @@
|
||||
|
||||
1
examples/expected/1646-platform-asm-value-binding.build
Normal file
@@ -0,0 +1 @@
+{ "target": "macos" }
1
examples/expected/1646-platform-asm-value-binding.exit
Normal file
@@ -0,0 +1 @@
+99
11
examples/expected/1646-platform-asm-value-binding.ir
Normal file
@@ -0,0 +1,11 @@
+
+; Function Attrs: nounwind
+define i32 @main() #0 {
+entry:
+  %asm = call i64 asm "mov ${0}, #99", "=r"()
+  %alloca = alloca i64, align 8
+  store i64 %asm, ptr %alloca, align 8
+  %load = load i64, ptr %alloca, align 8
+  %ca.tr = trunc i64 %load to i32
+  ret i32 %ca.tr
+}
1
examples/expected/1646-platform-asm-value-binding.stderr
Normal file
@@ -0,0 +1 @@
|
||||
|
||||
1
examples/expected/1646-platform-asm-value-binding.stdout
Normal file
@@ -0,0 +1 @@
|
||||
|
||||
@@ -14,4 +14,7 @@ void sx_llvm_init_all_targets(void) {
 void sx_llvm_init_native_target(void) {
     LLVMInitializeNativeTarget();
     LLVMInitializeNativeAsmPrinter();
+    // Required for inline assembly: the JIT must assemble the asm template at
+    // run time, which needs the target's asm parser (ASM stream Phase D).
+    LLVMInitializeNativeAsmParser();
 }
@@ -24,6 +24,7 @@ const Call = ir_inst.Call;
 const CallIndirect = ir_inst.CallIndirect;
 const ObjcMsgSend = ir_inst.ObjcMsgSend;
 const JniMsgSend = ir_inst.JniMsgSend;
+const InlineAsm = ir_inst.InlineAsm;
 const BuiltinCall = ir_inst.BuiltinCall;
 const TriOp = ir_inst.TriOp;
 const Branch = ir_inst.Branch;
@@ -774,6 +775,161 @@ pub const Ops = struct {
         self.e.mapRef(result);
     }

+    /// Inline assembly (ASM stream Phase D) — the port of Zig's `airAssembly`.
+    /// Handles 0 value outputs (void) and 1 (scalar); multi-output tuples are
+    /// Phase E (lowering bails before reaching here). Builds the LLVM constraint
+    /// string, rewrites the `%[name]` template, then `LLVMGetInlineAsm` +
+    /// `LLVMBuildCall2`.
+    pub fn emitInlineAsm(self: Ops, instruction: *const Inst, a: InlineAsm) void {
+        const e = self.e;
+        const alloc = e.alloc;
+
+        var n_inputs: usize = 0;
+        for (a.operands) |op| {
+            if (op.role == .input) n_inputs += 1;
+        }
+
+        // Result LLVM type: void (no value output) or the single scalar.
+        const ret_ty = if (instruction.ty == .void) e.cached_void else e.toLLVMType(instruction.ty);
+
+        // One LLVM call param per input operand, in source order.
+        const param_types = alloc.alloc(c.LLVMTypeRef, n_inputs) catch unreachable;
+        defer alloc.free(param_types);
+        const call_args = alloc.alloc(c.LLVMValueRef, n_inputs) catch unreachable;
+        defer alloc.free(call_args);
+        {
+            var i: usize = 0;
+            for (a.operands) |op| {
+                if (op.role != .input) continue;
+                const raw_ty = e.argIRTypeOrFail(op.operand);
+                const llvm_ty = e.toLLVMType(raw_ty);
+                param_types[i] = llvm_ty;
+                call_args[i] = e.coerceArg(e.resolveRef(op.operand), llvm_ty);
+                i += 1;
+            }
+        }
+
+        // ── Constraint string: outputs first, then inputs, then ~{clobber}. ──
+        var cons: std.ArrayList(u8) = .empty;
+        defer cons.deinit(alloc);
+        self.appendAsmConstraints(&cons, a, false); // outputs (out_value / out_place)
+        self.appendAsmConstraints(&cons, a, true); // inputs
+        for (a.clobbers) |cl| {
+            if (cons.items.len != 0) cons.append(alloc, ',') catch unreachable;
+            cons.appendSlice(alloc, "~{") catch unreachable;
+            cons.appendSlice(alloc, e.ir_mod.types.getString(cl)) catch unreachable;
+            cons.append(alloc, '}') catch unreachable;
+        }
+
+        // ── Template rewrite: %[name]->${N}, %%->%, $->$$, %=->${:uid}. ──
+        var rendered: std.ArrayList(u8) = .empty;
+        defer rendered.deinit(alloc);
+        self.renderAsmTemplate(&rendered, a);
+
+        const fn_ty = c.LLVMFunctionType(ret_ty, param_types.ptr, @intCast(n_inputs), 0);
+        const asm_val = c.LLVMGetInlineAsm(
+            fn_ty,
+            rendered.items.ptr,
+            rendered.items.len,
+            cons.items.ptr,
+            cons.items.len,
+            @intFromBool(a.has_side_effects),
+            0, // IsAlignStack
+            c.LLVMInlineAsmDialectATT,
+            0, // CanThrow
+        );
+        const label: [*:0]const u8 = if (instruction.ty == .void) "" else "asm";
+        const result = c.LLVMBuildCall2(e.builder, fn_ty, asm_val, call_args.ptr, @intCast(n_inputs), label);
+        // Always mapRef — the IR Ref counter advances regardless of result type.
+        e.mapRef(result);
+    }
+
+    /// Append the constraint fragments for one role group (outputs or inputs),
+    /// comma-separated, with each operand's `,` rewritten to LLVM's `|`
+    /// (alternative-constraint separator). Mirrors `FuncGen.airAssembly`.
+    fn appendAsmConstraints(self: Ops, cons: *std.ArrayList(u8), a: InlineAsm, inputs: bool) void {
+        const e = self.e;
+        const alloc = e.alloc;
+        for (a.operands) |op| {
+            const is_input = op.role == .input;
+            if (is_input != inputs) continue;
+            if (cons.items.len != 0) cons.append(alloc, ',') catch unreachable;
+            const s = e.ir_mod.types.getString(op.constraint);
+            for (s) |ch| cons.append(alloc, if (ch == ',') '|' else ch) catch unreachable;
+        }
+    }
+
+    /// The positional index of a named operand in the LLVM operand list
+    /// (outputs first, then inputs) — the `N` in `%[name]` → `${N}`. Lowering
+    /// guarantees every `%[name]` names an operand, so callers can assume a hit.
+    fn asmOperandIndex(self: Ops, a: InlineAsm, name: []const u8) ?usize {
+        const e = self.e;
+        var idx: usize = 0;
+        for ([_]bool{ false, true }) |inputs| {
+            for (a.operands) |op| {
+                const is_input = op.role == .input;
+                if (is_input != inputs) continue;
+                if (op.name != .empty and std.mem.eql(u8, e.ir_mod.types.getString(op.name), name)) return idx;
+                idx += 1;
+            }
+        }
+        return null;
+    }
+
+    /// Rewrite the asm template into LLVM form. State machine over the bytes:
+    /// `$`→`$$`, `%%`→`%`, `%=`→`${:uid}`, `%[name]`→`${N}`, `%[name:mod]`→
+    /// `${N:mod}`. Port of `FuncGen.zig`'s template rewriter.
+    fn renderAsmTemplate(self: Ops, out: *std.ArrayList(u8), a: InlineAsm) void {
+        const e = self.e;
+        const alloc = e.alloc;
+        const tmpl = e.ir_mod.types.getString(a.template);
+        var i: usize = 0;
+        while (i < tmpl.len) {
+            const ch = tmpl[i];
+            if (ch == '$') {
+                out.appendSlice(alloc, "$$") catch unreachable;
+                i += 1;
+                continue;
+            }
+            if (ch == '%' and i + 1 < tmpl.len) {
+                const nxt = tmpl[i + 1];
+                if (nxt == '%') {
+                    out.append(alloc, '%') catch unreachable;
+                    i += 2;
+                    continue;
+                }
+                if (nxt == '=') {
+                    out.appendSlice(alloc, "${:uid}") catch unreachable;
+                    i += 2;
+                    continue;
+                }
+                if (nxt == '[') {
+                    const close = std.mem.indexOfScalarPos(u8, tmpl, i + 2, ']').?; // lowering validated
+                    var name = tmpl[i + 2 .. close];
+                    var modifier: ?[]const u8 = null;
+                    if (std.mem.indexOfScalar(u8, name, ':')) |colon| {
+                        modifier = name[colon + 1 ..];
+                        name = name[0..colon];
+                    }
+                    const idx = self.asmOperandIndex(a, name).?; // lowering validated
+                    var buf: [16]u8 = undefined;
+                    const ds = std.fmt.bufPrint(&buf, "{d}", .{idx}) catch unreachable;
+                    out.appendSlice(alloc, "${") catch unreachable;
+                    out.appendSlice(alloc, ds) catch unreachable;
+                    if (modifier) |m| {
+                        out.append(alloc, ':') catch unreachable;
+                        out.appendSlice(alloc, m) catch unreachable;
+                    }
+                    out.append(alloc, '}') catch unreachable;
+                    i = close + 1;
+                    continue;
+                }
+            }
+            out.append(alloc, ch) catch unreachable;
+            i += 1;
+        }
+    }
+
     pub fn emitCall(self: Ops, instruction: *const Inst, call_op: Call) void {
         // Evaluate comptime functions at compile time
         const callee_func = &self.e.ir_mod.functions.items[call_op.callee.index()];
@@ -1563,11 +1563,7 @@ pub const LLVMEmitter = struct {
             // ── Calls ─────────────────────────────────────────────
             .objc_msg_send => |msg| self.ops().emitObjcMsgSend(instruction, msg),
             .jni_msg_send => |msg| self.ops().emitJniMsgSend(instruction, msg),
-            // Tripwire (ASM stream): the IR op exists (Phase C.0) but emit lands
-            // in Phase D. Until then `lowerAsmExpr` still bails, so no inline_asm
-            // op is ever created — reaching here means lowering switched over
-            // before emit was ready. Crash loudly rather than miscompile.
-            .inline_asm => @panic("inline_asm reached LLVM emit before Phase D — lowering must still bail until emitInlineAsm lands"),
+            .inline_asm => |a| self.ops().emitInlineAsm(instruction, a),
             .call => |call_op| self.ops().emitCall(instruction, call_op),
             .call_indirect => |call_op| self.ops().emitCallIndirect(instruction, call_op),

@@ -398,6 +398,22 @@ pub const ExprTyper = struct {
                 }
                 break :blk self.l.inferExprType(nc.rhs);
             },
+            // Inline asm result type from the `out_value` operands: 0 → void,
+            // 1 → that operand's type. N>1 (tuple) is Phase E → `.unresolved`
+            // here (lowering bails on it anyway). Mirrors `lowerAsmExpr`, so a
+            // bare `x := asm {…-> T}` binding types correctly.
+            .asm_expr => |ae| blk: {
+                var n_out: usize = 0;
+                var first_out: ?*Node = null;
+                for (ae.operands) |op| {
+                    if (op.role != .out_value) continue;
+                    n_out += 1;
+                    if (first_out == null) first_out = op.payload;
+                }
+                if (n_out == 0) break :blk .void;
+                if (n_out == 1) break :blk self.l.resolveTypeWithBindings(first_out.?);
+                break :blk .unresolved;
+            },
             // Statements don't produce values (`.return_stmt` is handled above
             // as `.noreturn` — it diverges rather than yielding `void`).
             .assignment, .var_decl, .const_decl, .fn_decl,
@@ -2261,9 +2261,98 @@ pub fn lowerAsmExpr(self: *Lowering, ae: *const ast.AsmExpr, span: ast.Span) Ref
|
||||
return self.emitPlaceholder("inline_asm");
|
||||
}
|
||||
|
||||
// Shape is valid — codegen just isn't implemented yet (Phases C–E).
|
||||
diags.addFmt(.err, span, "inline assembly codegen is not yet implemented (ASM stream: lowering + emit land in Phases C–E)", .{});
|
||||
// (4) Every `%[name]` in the template must name an operand (effective name:
|
||||
// explicit `[name]` or auto-derived register). Caught here so emit's
|
||||
// template rewriter never sees an unknown reference. §II.6.
|
||||
{
|
||||
const tmpl = ae.template.data.string_literal.raw;
|
||||
var i: usize = 0;
|
||||
while (i < tmpl.len) : (i += 1) {
|
||||
if (tmpl[i] != '%' or i + 1 >= tmpl.len) continue;
|
||||
const nxt = tmpl[i + 1];
|
||||
if (nxt == '%' or nxt == '=') {
|
||||
i += 1;
|
||||
continue;
|
||||
}
|
||||
if (nxt != '[') continue;
|
||||
const close = std.mem.indexOfScalarPos(u8, tmpl, i + 2, ']') orelse {
|
||||
diags.addFmt(.err, span, "unterminated `%[` in asm template", .{});
|
||||
return self.emitPlaceholder("inline_asm");
|
||||
};
|
||||
var ref_name = tmpl[i + 2 .. close];
|
||||
if (std.mem.indexOfScalar(u8, ref_name, ':')) |colon| ref_name = ref_name[0..colon];
|
||||
var found = false;
|
||||
for (ae.operands) |op| {
|
||||
const eff = op.name orelse (pinnedRegister(op.constraint) orelse "");
|
||||
if (eff.len != 0 and std.mem.eql(u8, eff, ref_name)) {
|
||||
found = true;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!found) {
|
||||
diags.addFmt(.err, span, "asm template references `%[{s}]` but no operand is named `{s}`", .{ ref_name, ref_name });
|
||||
return self.emitPlaceholder("inline_asm");
|
||||
}
|
||||
i = close;
|
||||
}
|
||||
}
|
||||
|
||||
// ── Build the IR op (C.1). D emits 0 or 1 value output; N>1 (tuple result)
|
||||
// is Phase E — bail loudly until then. ──
|
||||
var n_value_outputs: usize = 0;
|
||||
for (ae.operands) |op| {
|
||||
if (op.role == .out_value) n_value_outputs += 1;
|
||||
}
|
||||
if (n_value_outputs > 1) {
|
||||
diags.addFmt(.err, span, "multi-output (tuple-returning) inline assembly is not yet implemented (ASM stream Phase E)", .{});
|
||||
return self.emitPlaceholder("inline_asm");
|
||||
}
|
||||
|
||||
// Result type: 0 outputs → void; 1 → that operand's resolved type. (The
|
||||
// resolver diagnoses an unresolvable type and returns `.unresolved`.)
|
||||
var result_ty: TypeId = .void;
|
||||
for (ae.operands) |op| {
|
||||
if (op.role == .out_value) {
|
||||
result_ty = self.resolveTypeWithBindings(op.payload);
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (result_ty == .unresolved) return self.emitPlaceholder("inline_asm");
|
||||
|
||||
// IR operands, in source order (= `%N` index space + LLVM operand order).
|
||||
const ir_ops = self.alloc.alloc(inst_mod.InlineAsm.AsmOperand, ae.operands.len) catch unreachable;
|
||||
for (ae.operands, 0..) |op, i| {
|
||||
// Effective name (design §II.5): explicit `[name]`, else auto-derived
|
||||
// from a `{reg}` pin, else anonymous (`.empty`).
|
||||
const eff_name: []const u8 = op.name orelse (pinnedRegister(op.constraint) orelse "");
|
||||
ir_ops[i] = .{
|
||||
.role = switch (op.role) {
|
||||
.out_value => .out_value,
|
||||
.out_place => .out_place,
|
||||
.input => .input,
|
||||
},
|
||||
.name = if (eff_name.len == 0) types.StringId.empty else self.module.types.internString(eff_name),
|
||||
.constraint = self.module.types.internString(op.constraint),
|
||||
// input → the lowered value Ref; an output yields its value (none).
|
||||
.operand = if (op.role == .input) self.lowerExpr(op.payload) else Ref.none,
|
||||
};
|
||||
}
|
||||
|
||||
const ir_clobbers = self.alloc.alloc(types.StringId, ae.clobbers.len) catch unreachable;
|
||||
for (ae.clobbers, 0..) |cl, i| {
|
||||
ir_clobbers[i] = self.module.types.internString(cl);
|
||||
}
|
||||
|
||||
// Template text RAW — no sx escape processing (matches `#string` literal
|
||||
// bytes; the `%[name]`/`%%`/`$` rewrite happens at emit). §II.11.
|
||||
const template_text = ae.template.data.string_literal.raw;
|
||||
|
||||
return self.builder.emit(.{ .inline_asm = .{
|
||||
.template = self.module.types.internString(template_text),
|
||||
.operands = ir_ops,
|
||||
.clobbers = ir_clobbers,
|
||||
.has_side_effects = ae.is_volatile,
|
||||
} }, result_ty);
|
||||
}
|
||||
|
||||
/// If `node` names a `for xs: (*x)` by-ref capture (an `*elem`), returns
|
||||
|
||||