sx/current/CHECKPOINT-ASM.md
agra 0e0ee40528 docs(asm): symbol refs are portable — explain the auto-:c mechanism
Updates the symbol-operand guide: x86 now uses the same plain %[fn] as
aarch64, and a 'How the portability works' note explains the mechanism
(compiler auto-injects LLVM's :c modifier for "s" operands, equivalent
to GCC :P/%P0 for x86 calls, no-op on aarch64, overridable). Drops the
stale per-arch :P guidance; checkpoint updated.
2026-06-16 09:05:15 +03:00

sx Inline Assembly — Checkpoint (ASM stream)

Companion to current/PLAN-ASM.md; design in design/inline-asm-design.md. Update after every commit, one step at a time per the cadence rule (no commit may both add a test and make it pass).

Last completed step

G (indirect-memory =*m place outputs) — the LAST substantive asm feature. Unlike a write-through = output (which returns a value then stored), an indirect output passes the place ADDRESS to the asm and the asm writes through it — no return slot. emitInlineAsm (src/backend/llvm/ops.zig): indirect outputs are excluded from the LLVM return type; their pointer is an opaque ptr call arg placed first (arg-consuming constraint order = output-section indirect pointers → inputs → read-write tied seeds); each gets an elementtype(T) call-site attribute (required in the opaque-pointer era) via LLVMCreateTypeAttribute/LLVMAddCallSiteAttribute; the store-back loop skips them. New asmIsIndirect(e, op) helper. Lowering (lowerAsmExpr) stops rejecting * (constraint kept verbatim, =*m reaches the constraint string as-is). asmOperandIndex unchanged — indirect outputs still count as operands, so %[name]→${N} holds. Verified by running on aarch64: store-through-pointer (str x9, %[out] → 42, IR "=*m,~{x9}"(ptr elementtype(i64) …)) and a mixed case (indirect + value output + input → "=*m,=r,r", indirect ptr arg first, ${0}/${1}/${2} correct). Two commits per cadence: (1) examples/1652-platform-asm-indirect-mem.sx locked the rejection; (2) implemented + flipped 1652 to a runnable aarch64-pinned example ({ "target": "macos" }, ir-only elsewhere). zig build test green (661 corpus, 446 unit). Files: src/ir/lower/expr.zig, src/backend/llvm/ops.zig, examples/1652-*.

Prior: G (read-write + place outputs) — a +r / +{reg} -> @place output is now implemented. LLVM has no + constraint, so a read-write place lowers to: an output = constraint (return slot, stored back through the place after the call; the leading + rewritten to = in appendAsmConstraints), plus a tied input (the decimal index of that output) appended after the regular inputs, seeded with the place's loaded value passed as a call arg. Tied inputs come last so existing operand indices (%[name]→${N}) are undisturbed — asmOperandIndex unchanged. Lowering (lowerAsmExpr) no longer rejects + (indirect * still rejected loudly). emitInlineAsm (src/backend/llvm/ops.zig): grows arg/param arrays by the rw count (n_args = n_inputs + n_rw), loads each seed (asm.rw.seed), emits the tied constraint, and the existing store-back path writes the modified output back. New asmIsReadWrite(e, op) helper. Verified by running: increment-in-place (41→42, IR "=r,0") and a mixed case (rw place + regular input + value output) → textbook "=r,=r,r,0" with correct ${N} indices and args (input, seed). Two commits per cadence: (1) examples/1650-platform-asm-rw-place.sx locked the rejection; (2) implemented + flipped 1650 to a runnable aarch64-pinned example ({ "target": "macos" }, ir-only elsewhere). zig build test green (658 corpus, 446 unit). Files: src/ir/lower/expr.zig, src/backend/llvm/ops.zig, examples/1650-*.
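The two G steps above fix an ordering for constraints, return slots, and call arguments. A minimal Python model of that ordering may make it easier to check by hand. It is not the Zig code in emitInlineAsm: the `Operand` class, the role strings, and the `ptr:`/`seed:` argument labels are illustrative assumptions, and the model assumes outputs precede inputs in the operand list.

```python
from dataclasses import dataclass


@dataclass
class Operand:
    name: str
    role: str        # "out_value", "out_place", or "input" (role strings assumed)
    constraint: str  # as written in source: "=r", "+r", "=*m", "r", ...


def is_indirect(op: Operand) -> bool:
    return op.role == "out_place" and "*" in op.constraint


def is_read_write(op: Operand) -> bool:
    return op.role == "out_place" and op.constraint.startswith("+")


def call_shape(operands, clobbers=()):
    """Return (constraint string, return-slot names, call-arg labels)."""
    outputs = [op for op in operands if op.role != "input"]
    inputs = [op for op in operands if op.role == "input"]
    constraints, returns, args = [], [], []

    for op in outputs:
        if is_indirect(op):
            # Indirect output: no return slot, its pointer is a leading call arg.
            constraints.append(op.constraint)
            args.append(f"ptr:{op.name}")
        else:
            # Read-write output: the leading "+" becomes "=" (LLVM has no "+").
            text = "=" + op.constraint[1:] if is_read_write(op) else op.constraint
            constraints.append(text)
            returns.append(op.name)

    for op in inputs:
        constraints.append(op.constraint)
        args.append(op.name)

    # Tied inputs go last so earlier operand indices stay unchanged.
    for index, op in enumerate(outputs):
        if is_read_write(op):
            constraints.append(str(index))
            args.append(f"seed:{op.name}")

    constraints.extend("~{" + reg + "}" for reg in clobbers)
    return ",".join(constraints), returns, args


mixed_rw = [Operand("p", "out_place", "+r"),
            Operand("v", "out_value", "=r"),
            Operand("x", "input", "r")]
print(call_shape(mixed_rw))

mixed_indirect = [Operand("out", "out_place", "=*m"),
                  Operand("v", "out_value", "=r"),
                  Operand("x", "input", "r")]
print(call_shape(mixed_indirect))
```

Under these assumptions the model reproduces the constraint strings quoted above: "=r,=r,r,0" for the mixed read-write case and "=*m,=r,r" for the mixed indirect case, with the indirect pointer as the first argument.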

Prior: 2 — -> @place write-through outputs. An asm result can be stored through a place (local / struct field) instead of returned; the place output does NOT join the result tuple. Parser: -> @place parses the @place as an ordinary address-of expression → an out_place operand (src/parser.zig). Lowering (lowerAsmExpr): out_place operand = the lowered @place address, out_ty = the pointee; read-write (+) and indirect-memory (*) constraints rejected loudly (not yet implemented). Added out_ty: TypeId to the IR AsmOperand (src/ir/inst.zig) so emit builds the combined return struct (ALL outputs). emitInlineAsm rewrite (src/backend/llvm/ops.zig): the LLVM return type is now built from every output's out_ty; after the call, out_place slots are stored through their address and out_value slots rebuild the sx result — with a fast path (no place outputs → the asm's struct return IS the result, so pure-value asm IR is unchanged). Verified: write-to-local (get42→42), struct field (@p.b), mixed value+place (v=10 b=20), + rejected. Locked with examples/1649-platform-asm-place-output.sx (mixed, runs on aarch64). zig build test green (657 corpus, 446 unit). Files: src/parser.zig, src/ir/inst.zig, src/ir/lower/expr.zig, src/backend/llvm/ops.zig, examples/1649-*.
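The split between place stores and the sx result described above can be modelled in a few lines of Python. This is a sketch only: the function name, the `(role, name)` pairs, and the use of a dict for stores are assumptions for illustration, not the emitter's data structures.

```python
def split_outputs(outputs, returned):
    """Split the asm call's returned slots into place stores and the sx result.

    outputs:  list of (role, name) in operand order; role is "out_value"
              or "out_place" (role strings assumed for this sketch).
    returned: one value per output, in the same order.
    """
    stores = {}   # place name -> value to store through its address
    values = []   # values that make up the sx result
    for (role, name), value in zip(outputs, returned):
        if role == "out_place":
            stores[name] = value
        else:
            values.append(value)

    if not values:
        result = None            # 0 value outputs: void
    elif len(values) == 1:
        result = values[0]       # 1 value output: scalar
    else:
        result = tuple(values)   # N value outputs: tuple
    return result, stores


print(split_outputs([("out_value", "v"), ("out_place", "b")], [10, 20]))
```

The call mirrors the mixed value+place check (v=10, b=20): the value output becomes the result and the place output becomes a store. The real emitter additionally takes the fast path noted above when there are no place outputs.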

Prior: F — global (module-scope) asm. A top-level asm { "tmpl", }; block (template only) lowers to LLVM module asm, and a lib-less extern calls into the symbols it defines. New asm_global AST node (src/ast.zig) + parseAsmGlobal (src/parser.zig, dispatched from parseTopLevel on kw_asm) — rejects volatile and any operands/clobbers. The node forced (and got) arms in the same three Node.Data switches as asm_expr (sema.zig ×2, semantic_diagnostics.zig). Module gains a global_asm: ArrayList([]const u8) (src/ir/module.zig); lowerMainAndComptime captures each template (the dead lowerDecls is NOT the top-level pass — lowerRoot Pass 2 uses lowerMainAndComptime); emit_llvm.zig's emit() appends each via LLVMAppendModuleInlineAsm (source order). Verified end-to-end: an aarch64 _my_add global routine called via extern returns 42. Locked with examples/1648-platform-asm-global.sx (.build { "aot": true, "target": "macos" } → AOT build+run on aarch64, ir-only elsewhere). zig build test green (656 corpus, 446 unit). (Correction, later: module asm ALSO runs under the JIT — sx run compiles to an in-memory object, the integrated assembler assembles the module asm into it, ORC relocates and runs it, so the symbol is resolvable at JIT main execution. The original "AOT only" note was wrong; see 1653 for the JIT sibling. The genuine boundary is a COMPILE-TIME #run call into a module-asm symbol, which fails loud via host dlsym-miss — see 1654.) Files: src/ast.zig, src/parser.zig, src/sema.zig, src/ir/semantic_diagnostics.zig, src/ir/module.zig, src/ir/lower/decl.zig, src/ir/emit_llvm.zig, examples/1648-*.

Prior: E — multi-output tuples. Inline asm now returns tuples. Replaced the N>1 bail with a shared asmResultType helper (src/ir/lower/expr.zig, mixed into Lowering) that derives the result type from the out_value operands (0→void, 1→T, N→named tuple, named via the §II.5 effective-name rule). The key realization: toLLVMType(tuple) already produces a literal struct {T1,…,Tn} — exactly LLVM's multi-output asm return — so emit needed NO change; building the op with a tuple result type makes the asm call return the struct, which IS sx's tuple value (destructured by the normal tuple_get path). inferType's .asm_expr arm now also delegates to asmResultType (single owner), so return asm, x := asm, and q, r := asm all agree on the type. Verified end-to-end on aarch64: split(0x1234) → (lo=52, hi=18), a udiv/msub divmod → (3, 2). The x86_64 IR is textbook: call { i64, i64 } asm "divq ${4}", "={rax},={rdx},{rax},{rdx},r,~{cc}"(…) → extractvalue → tuple. Converted 1640 to the x86_64 multi-output IR lock (ir-only) + added 1647-platform-asm-aarch64-multi (runs on aarch64). zig build test green (655 corpus, 446 unit). Files: src/ir/lower/expr.zig, src/ir/lower.zig, src/ir/expr_typer.zig, examples/164{0,7}-*.
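The 0→void, 1→T, N→named tuple rule is small enough to state as a Python sketch. The function name and the `(role, name, type)` triples are assumptions for illustration; the real helper works on IR TypeIds, not strings.

```python
def asm_result_type(operands):
    """Derive the asm expression's result type from its value outputs.

    operands: list of (role, effective_name, type_name) triples.
    Only "out_value" operands contribute; place outputs and inputs do not.
    """
    values = [(name, ty) for role, name, ty in operands if role == "out_value"]
    if not values:
        return "void"
    if len(values) == 1:
        return values[0][1]
    # N value outputs: a named tuple, one field per output, in operand order.
    return ("tuple", values)


print(asm_result_type([("out_value", "lo", "i64"),
                       ("out_value", "hi", "i64"),
                       ("input", "x", "i64")]))
```

Keeping this rule in one function is the point of the "single owner" remark: lowering and type inference both call it, so they cannot disagree about the result type.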

Prior: C.1 + D — inline asm CODEGEN (lowering builds the op + LLVM emit). Inline assembly now runs end-to-end. lowerAsmExpr (src/ir/lower/expr.zig) stops bailing: it resolves each operand's effective name (§II.5 auto-naming), interns template/constraints/clobbers, lowers input Refs, derives the result TypeId (0→void, 1→T), and builds the inline_asm op. Added a %[name]-references-a-real-operand check (the last deferred validation). Multi-output (N>1) still bails loudly ("Phase E"). emitInlineAsm (src/backend/llvm/ops.zig, port of Zig's airAssembly): assembles the LLVM constraint string (outputs→inputs→~{clobber}, ,→|), rewrites the template (%[name]→${N}, %%→%, $→$$, %=→${:uid}), then LLVMGetInlineAsm + LLVMBuildCall2 (AT&T). Dispatch wired (emit_llvm.zig, replacing the C.0 @panic). llvm_shim.c: added LLVMInitializeNativeAsmParser() — the JIT must assemble inline asm at run time. Verified end-to-end: aarch64 add/mov run on the host (exit 42), nop volatile runs (1642 now exit 0), IR is textbook (call i64 asm "add ${0},${1},${2}", "=r,r,r"(…)). Locked with examples/1645-platform-asm-aarch64-add.sx (runs on aarch64, ir-only elsewhere via .build + .ir). Also added the inferType .asm_expr arm (src/ir/expr_typer.zig, 0→void / 1→T) — without it a bare x := asm {…-> T} binding inferred .unresolved and silently produced 0; regression-locked with examples/1646-platform-asm-value-binding.sx. Updated 1640 (now Phase-E bail) + 1642 (now runs). zig build test green (654 corpus, 446 unit). Files: src/ir/lower/expr.zig, src/backend/llvm/ops.zig, src/ir/emit_llvm.zig, src/ir/expr_typer.zig, llvm_shim.c, examples/164{0,2,5,6}-*.
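The template rewrite rules listed above (%[name]→${N}, %%→%, $→$$, %=→${:uid}) can be sketched as a single left-to-right scan. This Python model is an illustration under assumptions: the function name and the flat `names` list are hypothetical, and passing a `%[name:mod]` modifier through as `${N:mod}` is inferred from the later log entry on explicit modifiers, not from this step.

```python
def render_template(template, names):
    """Rewrite an sx asm template into LLVM inline-asm template syntax.

    names: effective operand names in operand order; the index of a name
           is the N in ${N}.
    """
    out = []
    i = 0
    while i < len(template):
        if template[i] == "$":
            out.append("$$")          # literal dollar must be escaped for LLVM
            i += 1
        elif template.startswith("%%", i):
            out.append("%")           # escaped percent becomes a literal percent
            i += 2
        elif template.startswith("%=", i):
            out.append("${:uid}")     # unique id per asm instance
            i += 2
        elif template.startswith("%[", i):
            end = template.index("]", i)
            name, _, modifier = template[i + 2:end].partition(":")
            if name not in names:
                raise ValueError(f"%[{name}] does not name an operand")
            index = names.index(name)
            out.append(f"${{{index}:{modifier}}}" if modifier else f"${{{index}}}")
            i = end + 1
        else:
            out.append(template[i])
            i += 1
    return "".join(out)


print(render_template("add %[ret], %[a], %[b]", ["ret", "a", "b"]))
print(render_template("movq $42, %[out]", ["out"]))
```

The `ValueError` branch corresponds to the "%[name] references a real operand" check; the real compiler reports a named diagnostic instead of raising.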

Prior: C.0 — IR op inline_asm (lock; no behavior change). Added inline_asm: InlineAsm to the IR Op union + the InlineAsm struct (template: StringId, operands: []const AsmOperand {role/name/constraint/operand}, clobbers: []const StringId, has_side_effects) in src/ir/inst.zig — all strings interned, operands in source order, result on Inst.ty. The new variant forced (and got) arms in two exhaustive Op switches: src/ir/interp.zig (loud bailDetail — inline asm is never comptime-evaluable) and src/ir/print.zig (IR dump). src/ir/emit_llvm.zig gets a @panic tripwire — emit lands in Phase D, and until then lowerAsmExpr still bails so no inline_asm op is ever created (reaching emit would be a lowering-switched-over-too-early bug). Unit test inline_asm op shape in src/ir/inst.test.zig. zig build test green (652 corpus, 446 unit). Files: src/ir/inst.zig, src/ir/interp.zig, src/ir/print.zig, src/ir/emit_llvm.zig, src/ir/inst.test.zig.

Prior: B.1 — operand-name validation (design §II.5 auto-naming rule). Extended lowerAsmExpr with a pinnedRegister(constraint) helper ("={eax}"→eax, "+{rax}"→rax, "=r"→null) and two checks: (1) reject the echo form [eax] "={eax}" — a label identical to its own pinned register is redundant (the operand is already auto-named after the register); (2) reject duplicate operand names (ambiguous %[name] / result field). Locked with examples/1643-platform-asm-echo-name.sx + 1644-platform-asm-duplicate-name.sx. zig build test green (652 corpus, 0 failed; 445 unit). Files: src/ir/lower/expr.zig.
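A short Python sketch of the pinned-register helper and the two name checks, for readers who want to try the rule on examples. Names and error messages are hypothetical, and leaving an unlabeled, unpinned operand nameless is an assumption about §II.5 rather than something stated here.

```python
import re


def pinned_register(constraint):
    """Return the register inside a {reg} pin, or None for a register class."""
    match = re.fullmatch(r"[=+]?\{([^{}]+)\}", constraint)
    return match.group(1) if match else None


def effective_names(operands):
    """operands: list of (label or None, constraint). Returns the name list.

    Raises ValueError for the echo form and for duplicate names.
    """
    names = []
    for label, constraint in operands:
        register = pinned_register(constraint)
        if label is not None and label == register:
            raise ValueError(f"redundant label [{label}] on pinned register")
        # An unlabeled pinned operand is auto-named after its register.
        name = label if label is not None else register
        if name is None:
            continue  # assumed: unlabeled register-class operands stay nameless
        if name in names:
            raise ValueError(f"duplicate operand name '{name}'")
        names.append(name)
    return names


print(pinned_register("={eax}"), pinned_register("+{rax}"), pinned_register("=r"))
print(effective_names([(None, "={rax}"), ("n", "r")]))
```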

Prior: B.0 — asm shape validation (compile-path diagnostics). Restructured the .asm_expr lowering arm into lowerAsmExpr (src/ir/lower/expr.zig, mixed into Lowering in src/ir/lower.zig): it validates BEFORE the not-yet-implemented codegen bail, so the user sees the real problem first. Two checklist items now enforced with named diagnostics: (1) template must be a compile-time-known string ("..." / #string); (2) no value outputs ⇒ must be volatile (mirrors Zig — a result-less asm could be deleted). Valid shapes still bail with the "codegen not yet implemented" message. Result-type derivation + auto-naming stay deferred to a later step (observable only once Phase C produces a real IR op). Locked with examples/1641-platform-asm-missing-volatile.sx (volatile error) + 1642-platform-asm-nop-volatile.sx (volatile no-output accepted → codegen bail). zig build test green (650 corpus, 0 failed; 445 unit). Files: src/ir/lower/expr.zig, src/ir/lower.zig, examples/164{1,2}-*.

Prior: A.1 — parse asm { … } + loud lowering bail (folded A.1+A.2 into one honest lock commit, since the loud bail IS current correct behavior — cadence option (a)). Added AsmExpr/AsmOperand to src/ast.zig + the asm_expr Node.Data arm; parseAsmExpr in src/parser.zig (parsePrimary .kw_asm dispatch) — parses the template, flat operand list ([name]? "constraint" -> Type value output / = expr input), and clobbers(.…); volatile/clobbers recognized contextually via isContextualWord. The new asm_expr tag forced (and got) arms in three exhaustive Node.Data switches: src/sema.zig analyzeNode + findNodeAtOffset, src/ir/semantic_diagnostics.zig checkBindingNames (all recurse into template + operand payloads). Lowering bails LOUD + named in src/ir/lower/expr.zig ("inline assembly codegen is not yet implemented…") via an explicit .asm_expr arm (not the generic unknown_expr else) returning emitPlaceholder. -> @place write-through is rejected with a clear "Phase 2" parse error. Locked with examples/1640-platform-asm-parse.sx (multi-output divmod, named operands, register pins, clobbers — parses then bails; called from main). zig build test green (648 corpus, 0 failed; 445 unit). Files: src/ast.zig, src/parser.zig, src/sema.zig, src/ir/semantic_diagnostics.zig, src/ir/lower/expr.zig, examples/1640-*.

Prior: A.0 — kw_asm keyword (first compiler code). Added the kw_asm Token.Tag variant + .{ "asm", .kw_asm } keyword-map entry in src/token.zig; volatile / clobbers deliberately stay OUT of the global table (contextual). New exhaustive Tag switch in src/lsp/server.zig classifyToken flagged the missing arm (the intended coverage tripwire) — added .kw_asm to the keyword group. Lock test in new src/lexer.test.zig (asm→kw_asm, volatile/clobbers→identifier), wired into the src/root.zig barrel as lexer_tests. zig build test green (648 corpus, 0 failed; 445 unit, 0 failed — +1). Files: src/token.zig, src/lexer.test.zig, src/root.zig, src/lsp/server.zig.

Prior: 0.2 — CLAUDE.md docs for <name>.build; Phase 0 COMPLETE. 0.1 — corpus runner ir-only branch for cross-target examples. Replaced 0.0's loud placeholder bail: when cfg.target doesn't match the host (ir_only), sweepRoot skips run/build/exec and verifies via sx ir --target only — asserting .exit (ir cmd) + .ir (normalized stdout) + .stderr, never .stdout (write skipped in update mode, assertion skipped in verify mode). An .ir snapshot is required in ir-only mode — its absence is a loud failure ("needs an .ir snapshot for ir-only mode"). Locked with examples/1639-platform-target-cross.sx (asm-free main :: () -> i64 { return 0; }), .build { "target": "x86_64-linux" }, + checked-in .ir. Verified both guards fire: corrupting the .ir → IR mismatch; deleting it → the require-failure. zig build test green (647 corpus, 0 failed; 444 unit). Files: src/corpus_run.test.zig, examples/1639-*.

Current state

Inline assembly works end-to-end: 0, 1, and N value outputs (tuples). Full pipeline: lex (A.0) → parse (A.1) → validate (B.0/B.1 + %[name] check) → IR op (C.0) → lower-builds-op + LLVM emit + JIT asm-parser init (C.1/D) → multi-output tuples (E). Register-class + register-pinned operands, inputs, symbol operands ("s" → direct bl/call to a function/global by mangled name), clobbers, #string multi-instruction templates, %[name]/%% rewriting, and the §II.5 auto-naming rule all work and execute on the host JIT. Global asm { … } (Phase F) works via lib-less extern under BOTH the JIT (sx run → 1653) and AOT (1648) — sx run compiles to an object, so the integrated assembler bakes the module asm symbol in and ORC resolves it. All three -> @place output forms now work and execute on aarch64: write-through = (Phase 2), read-write + (tied input), and indirect-memory =*m (pointer arg + elementtype, asm writes through it). Inline assembly is now feature-complete — no substantive features remain. The x86_64 syscall-write ir-only example is DONE (1651). readme.md now has an "Inline Assembly" section.

Known orthogonal bug: issue 0137 — sx run on a program with no main segfaults (src/target.zig:256-273, unguarded JIT entry lookup). Pre-existing, asm-independent; does NOT block the ASM stream (every example has a main).

Phase E/F feasibility already confirmed against the live tree (LLVMGetInlineAsm / LLVMBuildCall2 / LLVMAppendModuleInlineAsm in LLVM@19 Core.h; ERR-stream extractvalue→tuple in emit_llvm.zig:726-927; lib-less extern, 60 sites; --target a global CLI flag).

Next step

Inline assembly is feature-complete. All substantive features are done: 0/1/N value outputs (tuples), register-class + pinned operands, inputs, clobbers, #string templates, %[name]/%%/$/%= rewriting, §II.5 auto-naming, global asm { … } (JIT and AOT), and all three -> @place output forms — write-through (=), read-write (+), and indirect-memory (=*m). The x86_64 syscall-write ir-only example (1651) and the output-to-const rejection (issue 0138) are also done.

Global asm runs under BOTH the JIT (sx run → object → ORC; 1653) and AOT (1648) — the earlier "AOT only / sx run mishandles module-asm" note was stale and has been corrected. The one genuine boundary is a COMPILE-TIME #run into a module-asm symbol: the interpreter resolves externs via host dlsym, the symbol isn't linked yet, so it already fails loud (comptime extern call: symbol not found via dlsym) — pinned by 1654.

Remaining work, all polish (optional):

  • None substantive. Possible niceties: tighten the #run-into-module-asm error text to name module-asm specifically; broaden clobber validation to a checked per-arch enum (design doc Phase 4).

Orthogonal: issue 0137 (no-main JIT segfault).

Done since last: output-to-const rejection (issue 0138), x86_64 syscall-write ir-only example (1651).

Log

  • (init) Plan + design doc written; ASM stream opened.
  • (0.0) Corpus runner target-gating: <name>.build JSON config (replaces .aot marker), --target threading, hostMatchesTarget execute-gate, loud cross-target placeholder bail. Migrated 1226/1227 .aot → .build; locked with 1638 fixture + unit tests. zig build test green.
  • (0.1) ir-only branch: cross-target examples verify via sx ir --target only (exit+ir+stderr, no stdout; .ir required). Locked with 1639 fixture; verified corrupt-.ir → mismatch and missing-.ir → loud failure. zig build test green.
  • (0.2) docs: CLAUDE.md documents <name>.build JSON sidecar (aot + target + ir-only gating), replacing stale .aot marker prose. Phase 0 COMPLETE.
  • (A.0) kw_asm keyword in token.zig (+ map entry); LSP classifyToken switch coverage; lock test in new lexer.test.zig (wired via root.zig). volatile / clobbers stay contextual identifiers. zig build test green (445 unit, +1).
  • (A.1) parse asm { … } → AsmExpr + loud lowering bail; asm_expr arms in 3 exhaustive Node.Data switches; -> @place rejected (Phase 2). Adopted operand auto-naming rule (design §II.5). Locked with 1640 fixture. Filed orthogonal issue 0137 (no-main JIT segfault). zig build test green (648 corpus, 445 unit).
  • (B.0) asm shape validation in lowerAsmExpr: comptime-string template + no-output⇒volatile, with named diagnostics before the codegen bail. Locked with 1641 (volatile error) + 1642 (volatile accepted). zig build test green (650 corpus, 445 unit).
  • (B.1) operand-name validation: pinnedRegister helper + reject echo form ([eax] "={eax}") and duplicate names. Locked with 1643 + 1644. zig build test green (652 corpus, 445 unit).
  • (C.0) IR op inline_asm: InlineAsm + interp bailDetail + print arm + emit @panic tripwire (Phase D). No behavior change (lowering still bails). Unit test inline_asm op shape. zig build test green (652 corpus, 446 unit).
  • (C.1+D) CODEGEN — lowerAsmExpr builds the op (effective names, interned strings, input Refs, 0/1 result type) + %[name] validation; emitInlineAsm (constraint string + template rewrite + LLVMGetInlineAsm/BuildCall2, AT&T); inferType arm; LLVMInitializeNativeAsmParser for the JIT. Inline asm runs end-to-end. N>1 bails (Phase E). Locked with 1645 (aarch64 add, runs) + 1646 (:= binding); updated 1640/1642. zig build test green (654 corpus, 446 unit).
  • (E) multi-output tuples — asmResultType helper (0→void/1→T/N→named tuple), shared by lowering + inferType. toLLVMType(tuple) == LLVM multi-output struct, so emit unchanged; the asm struct return IS the sx tuple. Runs on aarch64 (1647: split(lo,hi)); 1640 → x86 multi-output IR lock (ir-only). zig build test green (655 corpus, 446 unit).
  • (F) global asm — asm_global AST node + parseAsmGlobal (top-level, rejects volatile/operands); Module.global_asm captured in lowerMainAndComptime; emit() appends via LLVMAppendModuleInlineAsm; call-into via lib-less extern. AOT-verified (1648, _my_add→42). zig build test green (656 corpus).
  • (docs) readme.md "Inline Assembly" section (b8800a2).
  • (2) -> @place write-through — out_place operand; out_ty on the IR AsmOperand; emitInlineAsm builds the combined output struct + splits (out_place → store-through, out_value → result), fast-path when no places. +/* rejected. Locked with 1649 (mixed, runs). zig build test green (657 corpus, 446 unit).
  • (G) read-write + place outputs — + lowers to an output = + a tied input (output-index constraint) seeded with the place's loaded value, tied inputs appended last (operand indices undisturbed). appendAsmConstraints rewrites + → =; emitInlineAsm grows args by the rw count + loads seeds; asmIsReadWrite helper. Lowering stops rejecting + (* still rejected). Two commits (cadence): 1650 locked the rejection, then flipped to a runnable aarch64 example ("=r,0" IR). zig build test green (658 corpus, 446 unit).
  • (0138) output-to-const rejection — fixed the underlying general bug: scalar @const (address-of a folded :: constant) reinterpreted the value as a pointer (inttoptr). src/ir/lower/expr.zig .address_of now diagnoses a scalar const (local + module) instead of falling through; array/struct consts keep storage. asm -> @const gets the clean diagnostic for free (same path). Regression examples/1177-diagnostics-addr-of-const-rejected.sx. Issue 0138 RESOLVED. zig build test green (659 corpus, 446 unit).
  • (x86 syscall) x86_64 Linux write(2) via raw syscall — locks the constraint string ={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{memory} (register-pinned inputs + pinned value output + pointer input + clobbers). ir-only on aarch64 (.ir asserted), runs on x86_64-linux (hand-authored "ok\n" stdout). examples/1651-platform-asm-x86-syscall-write.sx. Pure additive lock, no compiler change. zig build test green (660 corpus, 446 unit).
  • (G indirect) indirect-memory =*m place outputs — the place address is passed as an opaque ptr arg (with an elementtype(T) call-site attr), placed before inputs; asm writes through it; no return slot; store-back skips it. asmIsIndirect helper; lowering stops rejecting *. Verified by running on aarch64 (store-through → 42; mixed indirect+value+input → "=*m,=r,r"). Two commits (cadence): 1652 locked the rejection, then flipped to a runnable aarch64 example. Inline asm now feature-complete. zig build test green (661 corpus, 446 unit).
  • (jit) explored "asm in JIT": found it ALREADY works — sx run emits an in-memory object (integrated assembler bakes in both in-function inline asm and module asm), then ORC relocates+runs it. The stale "AOT only / sx run mishandles module-asm" checkpoint prose was corrected. Locked global-asm-under-JIT with examples/1653-platform-asm-global-jit.sx ({ "target": "macos" }, no aot, → 42). zig build test green (662 corpus, 446 unit).
  • (comptime guard) pinned the one genuine module-asm boundary: examples/1654-platform-asm-global-comptime-call.sx — #run into a module-asm symbol fails loud (comptime extern call: symbol not found via dlsym) because the interpreter resolves externs via host dlsym before link. Arch-independent (no .build). zig build test green (663 corpus, 446 unit).
  • (round trip) examples/1655-platform-asm-callback-into-sx.sx — global-asm trampoline that bl _cb back into an exported sx function (sx→asm→sx, → 42). Documented that export (external linkage + C symbol + C ABI) is what makes the callback resolvable; callconv(.c) alone leaves it internal (DCE'd). zig build test green (664 corpus, 446 unit).
  • (symbol ops) symbol operands ("s") — feed a function/global symbol; the template emits its platform-mangled name so bl %[fn] is a DIRECT branch (one fewer indirection than register-indirect blr, portable — no hardcoded _). Emit passes the operand with its own llvm type (LLVMTypeOf), no coercion (asmIsSymbol helper); lowering lowers the function RHS to ptr @fn. Decided AGAINST mirroring Zig (which has no symbol operand — 483 std asm sites, none call a function) because the direct bl matters. Two commits (cadence): 1656 locked the rejection (replacing an LLVM-verifier crash), then implemented + flipped to a runnable aarch64 example (objdump-confirmed direct bl <_cb>). zig build test green (665 corpus, 446 unit).
  • (x86 cross-arch) ir-only x86_64 siblings so each emit path is locked on BOTH arches: 1657 read-write ("incq ${0}","=r,0"), 1658 indirect ("movq $$42, ${0}","=*m"(ptr elementtype)), 1659 symbol ("call ${2:P}", direct call). x86 templates validated by cross-emitting an object (integrated assembler accepts; objdump confirms 1659's direct call reloc). Pure additive locks. zig build test green (668 corpus, 446 unit).
  • (symbol portability) made %[fn] portable across arches — renderAsmTemplate auto-injects LLVM's :c modifier (${N} → ${N:c}) for symbol ("s") operands lacking an explicit modifier (asmNamedIsSymbol helper). Without it x86 renders $cb (a bad call target needing a hand-written :P); aarch64 unaffected. Verified :c ≡ :P for x86-64 calls (both → R_X86_64_PLT32). Explicit %[fn:X] still wins (escape hatch). 1659 dropped its :P → same plain %[fn] as aarch64 1656; both IRs regen to ${N:c}. zig build test green (668 corpus, 446 unit).
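
The auto-injection rule in the last log entry comes down to one decision per operand reference. A minimal Python sketch, assuming a hypothetical `operand_ref` helper that is handed the operand index, any explicit modifier from the template, and the operand's constraint string:

```python
def operand_ref(index, modifier, constraint):
    """Render one operand reference in LLVM template syntax.

    A symbol ("s") operand with no explicit modifier gets ":c" injected;
    an explicit modifier always wins; other operands are left plain.
    """
    if not modifier and constraint == "s":
        modifier = "c"
    return f"${{{index}:{modifier}}}" if modifier else f"${{{index}}}"


print(operand_ref(2, "", "s"))   # symbol operand, no modifier written
print(operand_ref(2, "P", "s"))  # explicit modifier kept as the escape hatch
print(operand_ref(0, "", "r"))   # ordinary register operand
```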

Known issues

  • 0138 — RESOLVED. @const (address-of a :: comptime constant) yielded a wild pointer (inttoptr (i64 <value> to ptr)). Fixed by diagnosing scalar @const in src/ir/lower/expr.zig .address_of (no storage; array/struct consts unaffected). Delivered the ASM "output-to-const rejection" for free. Regression examples/1177-diagnostics-addr-of-const-rejected.sx.
  • 0137 — sx run on a program with no main segfaults (unguarded JIT entry lookup, src/target.zig:256-273). Pre-existing, asm-independent. Filed issues/0137-jit-run-no-main-segfault.md. Does not block A.1.