sx/current/PLAN-ASM.md
agra e7eeecc0f3 docs: move inline-asm design doc to a top-level design/ folder
Moves docs/inline-asm-design.md -> design/inline-asm-design.md (the
internal design record now lives under design/, separate from the
user-facing docs/). Updates all links: current/CHECKPOINT-ASM.md,
current/PLAN-ASM.md, current/PLAN-EXTERN-EXPORT.md (../docs -> ../design)
and docs/inline-assembly.md (same-dir -> ../design).
2026-06-16 07:46:01 +03:00


sx Inline Assembly — Implementation Plan (ASM stream)

Design source of truth: design/inline-asm-design.md. This plan turns that doc's §II.7 stage-map + §II.8 phasing into ordered, commit-sized, testable steps. Read the design doc first — this file is the how/when, not the what/why.

Surface (decided): asm volatile { "template", "=r" -> T, "r" = expr, clobbers(.cc, .memory) } — brace block; -> output / = input; clobbers(.…) dot-name list; N -> Type outputs return a tuple; templates are pure AT&T (via LLVM).

Feasibility (confirmed): sx links LLVM@19; src/llvm_api.zig @cImports llvm-c/Core.h, so llvm_api.c.* already exposes LLVMGetInlineAsm (9-arg), LLVMInlineAsmDialectATT, LLVMBuildCall2, LLVMAppendModuleInlineAsm. No shim.

Relationship to other streams:

  • Phases A–E (the inline-asm expression) are independent of EXTERN-EXPORT.
  • Phase F (global asm) consumes extern/export to import/expose asm symbols — do it after PLAN-EXTERN-EXPORT.md Phase 2.

Cadence (IMPASSIBLE)

No commit may both add a test AND make it pass. Each feature step is either a behavior-locking PASSING test, or an xfail test the next commit turns green. Arch-pinned tests live in examples/16xx-platform-asm-* and declare their target via the target key of the expected/<name>.build config file (Phase 0). Never regenerate snapshots while red.

Phase 0 — corpus target-gating (test-infra prerequisite; no compiler code)

Why first. The flagship v1 examples are x86_64 (syscall-write, divmod, cpuid) but the dev host is aarch64-Darwin, and the corpus runner (src/corpus_run.test.zig) currently (a) never threads a per-example --target and (b) has no host-arch gate; its only skip is "marker has no .sx". So D.0's …-syscall-write markers asserting exit/stdout describe output the harness cannot produce on this host, which would violate the cadence rule (the "next commit turns it green" can never happen). Phase 0 closes that gap. It touches only the runner + two fixtures: zero compiler code, zero risk to A–E, and it unblocks every arch-pinned asm example.

Marker taxonomy (the cleanup). The runner currently spreads per-example directives across standalone boolean/value sidecars (.aot now, .target proposed, more later). Replace that sprawl with one optional config file, expected/<name>.build, holding all build/run directives; the output snapshots (.exit / .stdout / .stderr / .ir) stay separate — they are machine-regenerated data, not config. .exit remains the test-discovery key (every test has one; .build is optional).

.build format — JSON, parsed with std.json:

{ "aot": true, "target": "x86_64-linux" }

Parse via std.json.parseFromSlice(BuildConfig, …) into struct { aot: bool = false, target: ?[]const u8 = null }. Field defaults cover omitted keys; std.json's default ignore_unknown_fields = false makes an unknown key a loud error.UnknownField (surfaced as a runner failure, never a silent ignore — CLAUDE.md no-silent-default rule). Extensible: future "cpu", "link", "cwd" are just new optional struct fields, no new sidecar file and no custom parser.
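
The parse described above can be sketched as follows. This is a minimal illustration, not the runner's actual code: the struct mirrors the fields named in this plan, while `parseBuildConfig` is a hypothetical helper name. It assumes a Zig standard library where `std.json.parseFromSlice` returns a `std.json.Parsed(T)`.

```zig
const std = @import("std");

/// Mirror of the struct described in the plan; future keys would be
/// added here as further optional fields.
const BuildConfig = struct {
    aot: bool = false,
    target: ?[]const u8 = null,
};

/// Hypothetical helper: parse the bytes of an expected/<name>.build file.
/// The caller owns the result and must call `.deinit()` on it.
fn parseBuildConfig(
    gpa: std.mem.Allocator,
    bytes: []const u8,
) !std.json.Parsed(BuildConfig) {
    // Default ParseOptions keep ignore_unknown_fields = false, so a stray
    // key fails with error.UnknownField instead of being dropped.
    return std.json.parseFromSlice(BuildConfig, gpa, bytes, .{});
}

test "omitted keys fall back to the struct defaults" {
    const parsed = try parseBuildConfig(
        std.testing.allocator,
        "{ \"target\": \"x86_64-linux\" }",
    );
    defer parsed.deinit();
    try std.testing.expect(!parsed.value.aot);
    try std.testing.expectEqualStrings("x86_64-linux", parsed.value.target.?);
}
```

Because the error surfaces from the parser itself, the runner only has to propagate it to fail the example loudly.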

What the directives do:

  1. target = <triple|shorthand> threads --target <value> into every sx invocation for that example (run / build / ir; --target is a global flag, confirmed main.zig:39), AND host-match selects the mode. The runner parses the leading arch + os tokens of the resolved triple and compares them to @import("builtin").target (normalizing arm64 → aarch64):
    • match ⇒ execute exactly as today (sx run, or aot build+exec) with the target threaded, plus the .ir diff if an .ir snapshot exists. ⇒ an x86_64 example gives real end-to-end coverage on an x86_64 CI runner.
    • mismatch ⇒ ir-only: run only sx ir <file> --target <t>; assert .exit (the ir command's exit), .ir (normalized stdout), and .stderr (diagnostics, normally empty). Do not run/build/exec; do not assert .stdout. An .ir snapshot is required in ir-only mode; its absence is a loud runner failure ("arch-pinned <name>: ir-only mode requires an .ir snapshot"), never a silent pass. Robust even if sx ir treats --target as a partial no-op: the inline_asm op carries the template + constraint string verbatim, so the IR snapshot still locks the exact thing §II.11 flags as silently-miscompiling (the constraint assembler + template rewrite).
  2. aot is the existing JIT-vs-build+exec switch, just relocated from the standalone .aot marker into .build.
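
A sketch of the host-match gate described in item 1, under stated assumptions. `hostMatchesTarget` is the name this plan uses; `tripleMatches`, `normalizeArch`, `selectMode`, and the `Mode` enum are hypothetical helpers introduced for illustration. The sketch assumes Zig-style `arch-os[-abi]` values and does not handle triples that carry a vendor token in second position.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical: the two runner modes described above.
const Mode = enum { execute, ir_only };

/// Fold the alternative spelling onto the name Zig's builtin uses.
fn normalizeArch(arch: []const u8) []const u8 {
    return if (std.mem.eql(u8, arch, "arm64")) "aarch64" else arch;
}

/// Compare the leading arch and os tokens of `triple` with explicit host
/// names. Kept separate from hostMatchesTarget so it is testable on any host.
fn tripleMatches(triple: []const u8, host_arch: []const u8, host_os: []const u8) bool {
    var it = std.mem.splitScalar(u8, triple, '-');
    const arch = it.next() orelse return false;
    const os = it.next() orelse return false;
    return std.mem.eql(u8, normalizeArch(arch), normalizeArch(host_arch)) and
        std.mem.eql(u8, os, host_os);
}

fn hostMatchesTarget(triple: []const u8) bool {
    return tripleMatches(
        triple,
        @tagName(builtin.target.cpu.arch),
        @tagName(builtin.target.os.tag),
    );
}

/// No target key means the example runs exactly as it does today.
fn selectMode(target: ?[]const u8) Mode {
    const t = target orelse return .execute;
    return if (hostMatchesTarget(t)) .execute else .ir_only;
}

test "arch and os tokens decide the mode" {
    try std.testing.expect(tripleMatches("arm64-macos", "aarch64", "macos"));
    try std.testing.expect(!tripleMatches("x86_64-linux", "aarch64", "macos"));
}
```

Splitting the pure comparison from the `builtin` lookup keeps the unit tests host-independent, which matters because the same tests must pass on both the aarch64 dev host and an x86_64 CI runner.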

Negative compile-error examples need NO .build. …-missing-volatile (no-output-without-volatile) is a Sema diagnostic raised before codegen/JIT, so plain sx run reports it identically on any host — it stays a normal example with no config file.

update-goldens interaction: in ir-only mode, -Dupdate-goldens writes .exit (ir exit) + .ir (+ .stderr if non-empty) and skips .stdout. Execute mode (incl. aot) is unchanged. .build is hand-authored — update-goldens never writes it.

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| 0.0 | lock | Add BuildConfig + std.json parse of expected/<name>.build (unknown-key ⇒ error.UnknownField); migrate the 2 existing .aot markers → .build (content { "aot": true }) and delete them; thread target's --target into the spawned argv; add hostMatchesTarget(value) bool (arch+os token parse, arm64 → aarch64) gating the execute path. Lock with examples/16xx-platform-target-host.sx (trivial main) + a .build { "target": "<host arch triple>" } (still runs+passes) and unit tests for the JSON parse + hostMatchesTarget. | src/corpus_run.test.zig, examples/expected/1226-*.{aot→build}, …/1227-*, + fixture |
| 0.1 | lock | Implement the mismatch ⇒ ir-only branch (skip run/build/exec; assert .exit+.ir+.stderr from sx ir --target; require .ir). Lock with examples/16xx-platform-target-cross.sx (asm-free () -> i64 { return 0; }) + .build { "target": "x86_64-linux" } + a checked-in .ir snapshot; exercises ir-only on the arm64 host. | src/corpus_run.test.zig + fixture |
| 0.2 | docs | Update CLAUDE.md §"Test layout"/§"Testing" to document .build (format + aot/target keys) replacing the standalone .aot marker prose (lines ~435, ~492). | CLAUDE.md |

Both 0.0 and 0.1 are lock commits: the runner change and the fixture that exercises it land together and pass the moment they land (the mechanism works immediately — nothing is left red), which is the cadence rule's "lock in current behavior" flavor, not a feature red→green. No asm lowering is gated on either.

Phase 0 verification: zig build test green; deliberately corrupt the cross-target .ir fixture and confirm the runner reports an IR mismatch (proves ir-only actually asserts, isn't a no-op); delete it and confirm the "requires an .ir snapshot" failure fires.

Estimated runner delta: ~70–90 lines (sidecar read + --target argv threading + hostMatchesTarget + the ir-only branch + update-mode tweak). Within the "no step > ~500 new lines" rule; well under the read budget.

Phase A — keyword + AST + parser (parses; no codegen)

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| A.0 | lock | add kw_asm keyword + map entry; unit lex test asm → kw_asm | src/token.zig, src/lexer.zig + .test.zig |
| A.1 | xfail | parse asm { … } → AsmExpr/AsmOperand in parsePrimary; pin an AST/sx ir parse snapshot; lowering still bailDetail("inline asm codegen unimplemented") | src/ast.zig (:85 union arm, :721 structs), src/parser.zig (parsePrimary), src/ir/interp.zig |
| A.2 | green | parse-shape snapshot lands green; the unimplemented bail is loud + named | |
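
To illustrate the shape of the A.0 lock test, here is a minimal sketch of a keyword lookup. The `Tag` enum, the `Keyword` table, and `classifyWord` are all hypothetical stand-ins; the real definitions live in src/token.zig and src/lexer.zig and may be organized differently. It also assumes the recommended open decision that volatile stays a contextual keyword, so the lexer reports it as a plain identifier.

```zig
const std = @import("std");

/// Hypothetical token tags; only kw_asm is named in this plan.
const Tag = enum { identifier, kw_asm, kw_return };

const Keyword = struct { text: []const u8, tag: Tag };

/// Hypothetical keyword map; A.0 adds the "asm" entry.
const keywords = [_]Keyword{
    .{ .text = "asm", .tag = .kw_asm },
    .{ .text = "return", .tag = .kw_return },
};

/// Classify an already-scanned word as a keyword or an identifier.
fn classifyWord(word: []const u8) Tag {
    for (keywords) |kw| {
        if (std.mem.eql(u8, kw.text, word)) return kw.tag;
    }
    return .identifier;
}

test "asm lexes as kw_asm" {
    try std.testing.expect(classifyWord("asm") == .kw_asm);
    // Assumption: volatile is contextual, so the lexer leaves it alone.
    try std.testing.expect(classifyWord("volatile") == .identifier);
}
```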

Phase B — sema / typing

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| B.0 | xfail | result-type rule (0→void / 1→T / N→named-or-positional tuple) + checklist (no-output⇒volatile, layout, comptime-string template); pin error messages | src/ir/expr_typer.zig |
| B.1 | green | typing + diagnostics implemented; .unresolved sentinel on failure (no silent default) | src/ir/expr_typer.zig, src/ir/semantic_diagnostics.zig |
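
The result-type rule and the no-output⇒volatile check from B.0 can be sketched as a small pure function. `ResultShape`, `AsmTypeError`, and `resultShape` are hypothetical names; the real typer works on sx types and diagnostics rather than a bare count, and the layout and comptime-string checks are not modeled here.

```zig
const std = @import("std");

/// Hypothetical summary of what an asm expression evaluates to.
const ResultShape = union(enum) {
    void_result, // 0 outputs
    single, // 1 output: the output's own type T
    tuple: usize, // N outputs: a tuple with N elements
};

const AsmTypeError = error{NoOutputRequiresVolatile};

/// 0 -> void / 1 -> T / N -> tuple, plus the rule that an asm with no
/// outputs must be volatile (otherwise it has no observable effect).
fn resultShape(n_outputs: usize, is_volatile: bool) AsmTypeError!ResultShape {
    if (n_outputs == 0) {
        if (!is_volatile) return error.NoOutputRequiresVolatile;
        return .void_result;
    }
    if (n_outputs == 1) return .single;
    return .{ .tuple = n_outputs };
}

test "result shape follows the output count" {
    try std.testing.expect((try resultShape(0, true)) == .void_result);
    try std.testing.expect((try resultShape(1, false)) == .single);
    try std.testing.expectEqual(ResultShape{ .tuple = 4 }, try resultShape(4, false));
}
```

Returning an error rather than a fallback shape matches the plan's no-silent-default rule: a failed check never degrades into a void result.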

Phase C — IR op + lowering

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| C.0 | lock | add inline_asm: InlineAsm to Op + AsmOperand (role/name/constraint/operand) + interp bailDetail arm; unit tests for the IR shape | src/ir/inst.zig (:80), src/ir/interp.zig |
| C.1 | xfail→green | lowerAsmExpr in lowerExpr dispatch: interns template/constraints/clobber-names, lowers input Refs, sets result TypeId | src/ir/lower/expr.zig |
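
One possible shape for the C.0 payload, sketched under assumptions. The field list (role/name/constraint/operand) and the out_value role name come from this plan; the handle types `StringId`, `Ref`, and `TypeId` are modeled as plain integers, and `outputCount` is a hypothetical helper. The real definitions in src/ir/inst.zig may differ.

```zig
const std = @import("std");

// Assumption: interned strings, value refs, and types are integer handles.
const StringId = u32;
const Ref = u32;
const TypeId = u32;

const AsmOperand = struct {
    role: Role,
    name: ?StringId, // set when the operand is named, e.g. %[fd]
    constraint: StringId, // interned constraint text such as "=r"
    operand: ?Ref, // inputs carry a lowered value; value outputs do not

    const Role = enum { out_value, input };
};

const InlineAsm = struct {
    template: StringId,
    operands: []const AsmOperand,
    clobbers: []const StringId,
    is_volatile: bool,
    result: TypeId,
};

/// Number of value outputs, which drives the result-type rule.
fn outputCount(op: InlineAsm) usize {
    var n: usize = 0;
    for (op.operands) |o| {
        if (o.role == .out_value) n += 1;
    }
    return n;
}

test "one output and one input" {
    const operands = [_]AsmOperand{
        .{ .role = .out_value, .name = null, .constraint = 1, .operand = null },
        .{ .role = .input, .name = null, .constraint = 2, .operand = 10 },
    };
    const op = InlineAsm{
        .template = 0,
        .operands = &operands,
        .clobbers = &.{ 3, 4 },
        .is_volatile = true,
        .result = 7,
    };
    try std.testing.expectEqual(@as(usize, 1), outputCount(op));
}
```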

Phase D — LLVM emit (single value-output; the core)

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| D.0 | xfail | examples/16xx-platform-asm-syscall-write.sx + …-register-read.sx + …-no-output-volatile.sx + …-missing-volatile.sx (expected compile error); all red | examples + expected/ markers |
| D.1 | green | emitInlineAsm: port FuncGen.airAssembly: constraint-string assembler (outputs =/+, inputs, clobbers(.name) → ~{name}), %[name] → ${N} / %% / %= template rewriter, LLVMGetInlineAsm+LLVMBuildCall2, sideeffect=volatile, AT&T dialect | src/ir/emit_llvm.zig (emitInst dispatch + handler) |
| D.2 | green | lock the template-rewrite + constraint string via an expected/*.ir snapshot on …-template-subst.sx | examples |
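
The constraint-string assembler in D.1 joins output constraints, input constraints, and clobbers into the single comma-separated string LLVM expects, with each clobber wrapped as ~{name}. A minimal sketch follows; `buildConstraints` is a hypothetical name, it writes into a caller-provided buffer for simplicity, and it does not model the extra handling that read-write ("+") outputs need.

```zig
const std = @import("std");

/// Join constraints in LLVM order: outputs, then inputs, then clobbers.
/// Clobber names arrive without decoration and are wrapped as ~{name}.
fn buildConstraints(
    buf: []u8,
    outputs: []const []const u8,
    inputs: []const []const u8,
    clobbers: []const []const u8,
) error{NoSpaceLeft}![]const u8 {
    var len: usize = 0;
    for (outputs) |c| {
        len += (try std.fmt.bufPrint(buf[len..], "{s},", .{c})).len;
    }
    for (inputs) |c| {
        len += (try std.fmt.bufPrint(buf[len..], "{s},", .{c})).len;
    }
    for (clobbers) |c| {
        // "{{" and "}}" are literal braces in a Zig format string.
        len += (try std.fmt.bufPrint(buf[len..], "~{{{s}}},", .{c})).len;
    }
    // Drop the trailing comma; saturating subtraction keeps 0 at 0.
    return buf[0 .. len -| 1];
}

test "constraint string for the surface example" {
    var buf: [64]u8 = undefined;
    const s = try buildConstraints(&buf, &.{"=r"}, &.{"r"}, &.{ "cc", "memory" });
    try std.testing.expectEqualStrings("=r,r,~{cc},~{memory}", s);
}
```

The test input corresponds to the surface form `"=r" -> T, "r" = expr, clobbers(.cc, .memory)`; this is the kind of string the D.2 IR snapshot is meant to pin.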

Phase D verification: zig build test; the syscall example runs on x86_64-linux; IR snapshot matches the design doc's worked sys_write lowering.
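
The template rewrite that the IR snapshot locks can be sketched as a single pass over the template. Assumptions, labeled here and in the code: `rewriteTemplate`, `Out`, and `operandIndex` are hypothetical names; operand numbering is outputs first, then inputs; %= maps to LLVM's `${:uid}`; and a literal `$` is doubled because `$` is the escape character in LLVM's inline-asm template syntax (the plan does not list that last rule explicitly). Operand modifiers are not handled.

```zig
const std = @import("std");

const RewriteError = error{ NoSpaceLeft, UnknownOperand, BadTemplate };

/// Append-only view over a caller-provided buffer.
const Out = struct {
    buf: []u8,
    len: usize = 0,

    fn put(self: *Out, bytes: []const u8) error{NoSpaceLeft}!void {
        if (bytes.len > self.buf.len - self.len) return error.NoSpaceLeft;
        @memcpy(self.buf[self.len..][0..bytes.len], bytes);
        self.len += bytes.len;
    }
};

/// Position of `name` in the operand list (assumed: outputs, then inputs).
fn operandIndex(names: []const []const u8, name: []const u8) ?usize {
    for (names, 0..) |n, i| {
        if (std.mem.eql(u8, n, name)) return i;
    }
    return null;
}

fn rewriteTemplate(buf: []u8, template: []const u8, names: []const []const u8) RewriteError![]const u8 {
    var out = Out{ .buf = buf };
    var i: usize = 0;
    while (i < template.len) {
        const c = template[i];
        if (c == '$') {
            try out.put("$$"); // assumption: escape literal '$' for LLVM
            i += 1;
        } else if (c != '%') {
            try out.put(template[i .. i + 1]);
            i += 1;
        } else {
            if (i + 1 >= template.len) return error.BadTemplate;
            switch (template[i + 1]) {
                '%' => {
                    try out.put("%"); // %% is a literal percent sign
                    i += 2;
                },
                '=' => {
                    try out.put("${:uid}"); // unique id per asm instance
                    i += 2;
                },
                '[' => {
                    const close = std.mem.indexOfScalarPos(u8, template, i + 2, ']') orelse
                        return error.BadTemplate;
                    const n = operandIndex(names, template[i + 2 .. close]) orelse
                        return error.UnknownOperand;
                    var digits: [20]u8 = undefined;
                    try out.put("${");
                    try out.put(try std.fmt.bufPrint(&digits, "{d}", .{n}));
                    try out.put("}");
                    i = close + 1;
                },
                else => return error.BadTemplate,
            }
        }
    }
    return buf[0..out.len];
}

test "named operands become positional" {
    var buf: [64]u8 = undefined;
    const names = [_][]const u8{ "dst", "src" };
    const got = try rewriteTemplate(&buf, "mov %[src], %[dst]", &names);
    try std.testing.expectEqualStrings("mov ${1}, ${0}", got);
}
```

Unknown names and malformed escapes return errors instead of passing text through, since a template that silently keeps a stray `%[x]` is exactly the kind of miscompile the snapshot is supposed to catch.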

Phase E — multi-return tuples + clobbers(.…)

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| E.0 | xfail | …-asm-multi-return.sx (divmod(quot,rem), cpuid→4-tuple) red | examples |
| E.1 | green | N out_value → LLVM struct return + extractvalue i → sx tuple (named when operands named); clobbers(.name) dot-name lowering finalized | src/ir/emit_llvm.zig, src/ir/lower/expr.zig |

Phase F — global asm (needs EXTERN-EXPORT Phase 2)

| Step | Commit | What | Files |
| --- | --- | --- | --- |
| F.0 | xfail | top-level asm { … } decl parsed (reject operands/volatile); …-asm-global.sx (defines a symbol, imported via extern) red | src/parser.zig, src/ast.zig |
| F.1 | green | lower asm_global → c.LLVMAppendModuleInlineAsm; comptime-call guard (dlsym-miss is loud); blocks concatenate in source order | src/ir/lower/decl.zig, src/ir/emit_llvm.zig, src/ir/interp.zig |

Phase G — later (own steps when scheduled)

-> @place write-through + read-write ("+r" -> @place) + indirect-memory ("=*m") outputs · %= unique-id · output-to-const rejection · Intel-dialect opt-in · naked functions (callconv(.naked), coordinate with EXTERN-EXPORT).

Open decisions (design doc §II.10)

Dialect (AT&T-only v1, recommended) · volatile contextual-keyword (recommended) · brace separator comma (recommended) · clobbers(.name) dot-name sugar now → checked per-arch Clobber enum later (Phase 4 of the design doc).

End-to-end verification (per phase)

zig build && zig build test; for arch-pinned examples confirm they run on a matching host or assert on sx ir/.s snapshots. After intentional output changes only: zig build test -Dupdate-goldens, then review the diff.