# sx Inline Assembly — Implementation Plan (ASM stream)

**Design source of truth:** [design/inline-asm-design.md](../design/inline-asm-design.md).
This plan turns that doc's §II.7 stage-map + §II.8 phasing into ordered,
commit-sized, testable steps. Read the design doc first: this file is the
*how/when*, not the *what/why*.

**Surface (decided):**
`asm volatile { "template", "=r" -> T, "r" = expr, clobbers(.cc, .memory) }`:
brace block; `->` output / `=` input; `clobbers(.…)` dot-name list; N `-> Type`
outputs return a tuple; templates are pure AT&T (via LLVM).

**Feasibility (confirmed):** sx links LLVM@19; `src/llvm_api.zig` `@cImport`s
`llvm-c/Core.h`, so `llvm_api.c.*` already exposes `LLVMGetInlineAsm` (9-arg),
`LLVMInlineAsmDialectATT`, `LLVMBuildCall2`, `LLVMAppendModuleInlineAsm`. No shim.

**Relationship to other streams:**

- Phases A–E (the inline-asm *expression*) are independent of EXTERN-EXPORT.
- Phase F (global asm) consumes `extern`/`export` to import/expose asm symbols;
  do it **after** `PLAN-EXTERN-EXPORT.md` Phase 2.

## Cadence (IMPASSIBLE)

No commit may both add a test AND make it pass. Each feature step is either a
behavior-locking PASSING test, or an xfail test the *next* commit turns green.
Arch-pinned tests live in `examples/16xx-platform-asm-*` and declare their target
via the `"target"` key of the `expected/<name>.build` config file (Phase 0). Never
regenerate snapshots while red.

## Phase 0 — corpus target-gating (test-infra prerequisite; no compiler code)

**Why first.** The flagship v1 examples are `x86_64` (syscall-write, divmod,
cpuid) but the dev host is `aarch64`-Darwin, and the corpus runner
([src/corpus_run.test.zig](../src/corpus_run.test.zig)) currently (a) never threads
a per-example `--target` and (b) has no host-arch gate: its only skip is "marker
has no `.sx`". So D.0's `…-syscall-write` markers asserting exit/stdout describe
output the harness *cannot* produce on this host, which would violate the cadence
rule (the "next commit turns it green" can never happen). Phase 0 closes that gap.
It touches **only the runner + two fixtures** (zero compiler code, zero risk to
A–E) and unblocks every arch-pinned asm example.

**Marker taxonomy (the cleanup).** The runner currently spreads per-example
*directives* across standalone boolean/value sidecars (`.aot` now, `.target`
proposed, more later). Replace that sprawl with **one optional config file,
`expected/<name>.build`**, holding all build/run directives; the output snapshots
(`.exit` / `.stdout` / `.stderr` / `.ir`) stay separate because they are
machine-regenerated data, not config. `.exit` remains the **test-discovery key**
(every test has one; `.build` is optional).

**`.build` format:** JSON, parsed with `std.json`:

```json
{ "aot": true, "target": "x86_64-linux" }
```

Parse via `std.json.parseFromSlice(BuildConfig, …)` into
`struct { aot: bool = false, target: ?[]const u8 = null }`. Field defaults cover
omitted keys; `std.json`'s default `ignore_unknown_fields = false` makes an
**unknown key a loud `error.UnknownField`** (surfaced as a runner failure, never a
silent ignore, per the CLAUDE.md no-silent-default rule). Extensible: future `"cpu"`,
`"link"`, `"cwd"` are just new optional struct fields, with no new sidecar file and
no custom parser.
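
As a language-neutral illustration of the parsing contract above (defaults for omitted keys, a hard error on unknown keys), here is a minimal Python sketch. It is not the runner's code, which this plan places in Zig on top of `std.json`; `parse_build_config` and `UnknownField` are hypothetical stand-ins for `std.json.parseFromSlice` and `error.UnknownField`.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class BuildConfig:
    # Mirrors the struct in the plan: omitted keys fall back to these defaults.
    aot: bool = False
    target: Optional[str] = None


class UnknownField(ValueError):
    """Hypothetical stand-in for std.json's error.UnknownField."""


def parse_build_config(text: str) -> BuildConfig:
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError(".build must hold a single JSON object")
    known = {f.name for f in fields(BuildConfig)}
    unknown = sorted(set(raw) - known)
    if unknown:
        # Loud failure, never a silent ignore.
        raise UnknownField("unknown .build key(s): " + ", ".join(unknown))
    cfg = BuildConfig(**raw)
    # Python's json module does not type-check fields, so do it by hand.
    if not isinstance(cfg.aot, bool):
        raise ValueError('"aot" must be a boolean')
    if cfg.target is not None and not isinstance(cfg.target, str):
        raise ValueError('"target" must be a string')
    return cfg
```

Under this sketch, adding a future directive such as `"cpu"` means adding one defaulted field to `BuildConfig`; until that field exists, a `.build` file that mentions it fails instead of being silently accepted.
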

**What the directives do:**

1. **`target = <triple|shorthand>`** threads `--target <value>` into every `sx`
   invocation for that example (`run` / `build` / `ir`; `--target` is a global
   flag, confirmed [main.zig:39](../src/main.zig#L39)), AND **host-match selects
   the mode.** The runner parses the leading `arch` + `os` tokens of the resolved
   triple and compares them to `@import("builtin").target` (normalizing
   `arm64`→`aarch64`):
   - **match** → *execute* exactly as today (`sx run`, or `aot` build+exec) with
     the target threaded, plus the `.ir` diff if an `.ir` snapshot exists. An
     x86_64 example therefore gives **real end-to-end coverage on an x86_64 CI
     runner**.
   - **mismatch** → **ir-only**: run *only* `sx ir <file> --target <t>`; assert
     `.exit` (the ir command's exit), `.ir` (normalized stdout), and `.stderr`
     (diagnostics, normally empty). Do **not** run/build/exec; do **not** assert
     `.stdout`. An `.ir` snapshot is **required** in ir-only mode: its absence is
     a loud runner failure (`arch-pinned <name>: ir-only mode requires an .ir
     snapshot`), never a silent pass. This is robust even if `sx ir` treats
     `--target` as a partial no-op: the `inline_asm` op carries the template +
     constraint string verbatim, so the IR snapshot still locks the exact thing
     §II.11 flags as silently-miscompiling (the constraint assembler + template
     rewrite).
2. **`aot`** is the existing JIT-vs-build+exec switch, just relocated from the
   standalone `.aot` marker into `.build`.
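
The mode selection described in item 1 can be sketched as follows. This is a Python illustration of the decision logic only, not the Zig runner. It assumes the resolved triple has the form `arch-os[-abi]`, that shorthand resolution has already happened, that an example with no `target` keeps executing as it does today, and that the host is passed in as a string (the real runner would read it from `@import("builtin").target`). `select_mode` and `arch_and_os` are hypothetical names; `host_matches_target` mirrors the `hostMatchesTarget` helper named in step 0.0.

```python
from typing import Optional, Tuple

# The plan normalizes arm64 to aarch64 before comparing.
ARCH_ALIASES = {"arm64": "aarch64"}


def arch_and_os(triple: str) -> Tuple[str, str]:
    """Return the leading (arch, os) tokens of an `arch-os[-abi]` triple."""
    parts = triple.split("-")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("cannot read arch and os from target %r" % triple)
    return ARCH_ALIASES.get(parts[0], parts[0]), parts[1]


def host_matches_target(target: str, host: str) -> bool:
    return arch_and_os(target) == arch_and_os(host)


def select_mode(name: str, target: Optional[str], host: str,
                has_ir_snapshot: bool) -> str:
    """Pick "execute" or "ir-only" for one corpus example."""
    if target is None or host_matches_target(target, host):
        return "execute"
    if not has_ir_snapshot:
        # ir-only without an .ir snapshot would assert nothing, so fail loudly.
        raise RuntimeError(
            "arch-pinned %s: ir-only mode requires an .ir snapshot" % name
        )
    return "ir-only"
```

Comparing only the first two tokens means an ABI suffix on either side does not turn a matching host into a mismatch, which is one reading of "leading `arch` + `os` tokens"; a stricter comparison would be a different design choice.
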

**Negative compile-error examples need NO `.build`.** `…-missing-volatile`
(no-output-without-`volatile`) is a Sema diagnostic raised before codegen/JIT, so
plain `sx run` reports it identically on any host; it stays a normal example with
no config file.

**update-goldens interaction:** in ir-only mode, `-Dupdate-goldens` writes `.exit`
(ir exit) + `.ir` (+ `.stderr` if non-empty) and skips `.stdout`. Execute mode
(incl. `aot`) is unchanged. `.build` is hand-authored; update-goldens never
writes it.

| Step | Commit | What | Files |
|---|---|---|---|
| 0.0 | lock | Add `BuildConfig` + `std.json` parse of `expected/<name>.build` (unknown-key ⇒ `error.UnknownField`); **migrate** the 2 existing `.aot` markers → `.build` (content `{ "aot": true }`) and delete them; thread `target`'s `--target` into the spawned argv; add `hostMatchesTarget(value) bool` (arch+os token parse, `arm64`→`aarch64`) gating the **execute** path. Lock with `examples/16xx-platform-target-host.sx` (trivial `main`) + a `.build` `{ "target": "<host arch triple>" }` (still runs+passes) and unit `test`s for the JSON parse + `hostMatchesTarget`. | `src/corpus_run.test.zig`, `examples/expected/1226-*.{aot→build}`, `…/1227-*`, + fixture |
| 0.1 | lock | Implement the **mismatch ⇒ ir-only** branch (skip run/build/exec; assert `.exit`+`.ir`+`.stderr` from `sx ir --target`; require `.ir`). Lock with `examples/16xx-platform-target-cross.sx` (asm-free `() -> i64 { return 0; }`) + `.build` `{ "target": "x86_64-linux" }` + a checked-in `.ir` snapshot; this exercises ir-only on the arm64 host. | `src/corpus_run.test.zig` + fixture |
| 0.2 | docs | Update CLAUDE.md §"Test layout"/§"Testing" to document `.build` (format + `aot`/`target` keys) replacing the standalone `.aot` marker prose (lines ~435, ~492). | `CLAUDE.md` |

Both 0.0 and 0.1 are **lock** commits: the runner change and the fixture that
exercises it land together and pass the moment they land (the mechanism works
immediately; nothing is left red), which is the cadence rule's "lock in current
behavior" flavor, not a feature red→green. No asm lowering is gated on either.

**Phase 0 verification:** `zig build test` green; deliberately corrupt the
cross-target `.ir` fixture and confirm the runner reports an IR mismatch (proves
ir-only actually asserts, isn't a no-op); delete it and confirm the
"requires an .ir snapshot" failure fires.

**Estimated runner delta:** ~70–90 lines (sidecar read + `--target` argv threading
+ `hostMatchesTarget` + the ir-only branch + update-mode tweak). Within the
"no step > ~500 new lines" rule; well under the read budget.

## Phase A — keyword + AST + parser (parses; no codegen)

| Step | Commit | What | Files |
|---|---|---|---|
| A.0 | lock | add `kw_asm` keyword + map entry; unit lex test `asm → kw_asm` | `src/token.zig`, `src/lexer.zig` + `.test.zig` |
| A.1 | xfail | parse `asm { … }` → `AsmExpr`/`AsmOperand` in `parsePrimary`; pin an AST/`sx ir` parse snapshot; lowering still `bailDetail("inline asm codegen unimplemented")` | `src/ast.zig` (:85 union arm, :721 structs), `src/parser.zig` (parsePrimary), `src/ir/interp.zig` |
| A.2 | green | parse-shape snapshot lands green; the unimplemented bail is loud + named | (none) |

## Phase B — sema / typing

| Step | Commit | What | Files |
|---|---|---|---|
| B.0 | xfail | result-type rule (0→`void` / 1→`T` / N→named-or-positional tuple) + checklist (no-output⇒`volatile`, layout, comptime-string template); pin error messages | `src/ir/expr_typer.zig` |
| B.1 | green | typing + diagnostics implemented; `.unresolved` sentinel on failure (no silent default) | `src/ir/expr_typer.zig`, `src/ir/semantic_diagnostics.zig` |
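
The result-type rule in B.0 can be pictured with a small Python sketch. It is an illustration under assumptions, not the `expr_typer.zig` logic: the tuple spelling, the `u64`-style type names used when calling it, the diagnostic text, and the choice to fall back to a positional tuple when only some outputs are named are all hypothetical; the real messages are the ones B.0 pins.

```python
from typing import List, Optional, Tuple


class AsmTypeError(Exception):
    """Hypothetical diagnostic; the real messages are pinned in step B.0."""


def asm_result_type(outputs: List[Tuple[Optional[str], str]],
                    is_volatile: bool) -> str:
    """outputs holds (operand name or None, type name) per `-> Type` output."""
    if not outputs:
        # Checklist item: an asm with no outputs must be marked volatile.
        if not is_volatile:
            raise AsmTypeError("asm with no outputs must be volatile")
        return "void"
    if len(outputs) == 1:
        return outputs[0][1]
    if all(name is not None for name, _ in outputs):
        # Every output is named: produce a named tuple.
        return "(" + ", ".join("%s: %s" % (n, t) for n, t in outputs) + ")"
    # Assumption: any unnamed output makes the whole tuple positional.
    return "(" + ", ".join(t for _, t in outputs) + ")"
```

For instance, two outputs named `quot` and `rem` would give a named pair under this sketch, while four unnamed outputs (the `cpuid` shape from Phase E) would give a positional 4-tuple.
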
## Phase C — IR op + lowering

| Step | Commit | What | Files |
|---|---|---|---|
| C.0 | lock | add `inline_asm: InlineAsm` to `Op` + `AsmOperand` (role/name/constraint/operand) + interp `bailDetail` arm; unit tests for the IR shape | `src/ir/inst.zig` (:80), `src/ir/interp.zig` |
| C.1 | xfail→green | `lowerAsmExpr` in `lowerExpr` dispatch: interns template/constraints/clobber-names, lowers input `Ref`s, sets result `TypeId` | `src/ir/lower/expr.zig` |

## Phase D — LLVM emit (single value-output; the core)

| Step | Commit | What | Files |
|---|---|---|---|
| D.0 | xfail | `examples/16xx-platform-asm-syscall-write.sx` + `…-register-read.sx` + `…-no-output-volatile.sx` + `…-missing-volatile.sx` (expected compile error); all red | examples + `expected/` markers |
| D.1 | green | `emitInlineAsm`: **port `FuncGen.airAssembly`**: constraint-string assembler (outputs `=`/`+`, inputs, `clobbers(.name)`→`~{name}`), `%[name]`→`${N}` / `%%` / `%=` template rewriter, `LLVMGetInlineAsm`+`LLVMBuildCall2`, `sideeffect=volatile`, AT&T dialect | `src/ir/emit_llvm.zig` (emitInst dispatch + handler) |
| D.2 | green | lock the template-rewrite + constraint string via an `expected/*.ir` snapshot on `…-template-subst.sx` | examples |

**Phase D verification:** `zig build test`; the syscall example runs on
`x86_64-linux`; IR snapshot matches the design doc's worked `sys_write` lowering.
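
The two string transformations D.1 lists (the constraint-string assembler and the `%[name]`→`${N}` template rewriter) can be sketched in Python as below. This is a rough model, not a port of `FuncGen.airAssembly`: the `Operand` fields follow the `AsmOperand` shape from C.0 (role/name/constraint), operands are numbered outputs-first because LLVM expects output constraints before inputs, and escaping a literal `$` as `$$` is taken from LLVM's inline-asm string syntax rather than from this plan. `%=` is deliberately left out of the sketch, and the function names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Operand:
    role: str              # "out" or "in"
    name: Optional[str]    # name usable as %[name] in the template
    constraint: str        # e.g. "=r", "r", "{rax}"


def ordered(operands: List[Operand]) -> List[Operand]:
    """Outputs first, then inputs, keeping source order within each group."""
    outs = [op for op in operands if op.role == "out"]
    ins = [op for op in operands if op.role == "in"]
    return outs + ins


def build_constraints(operands: List[Operand], clobbers: List[str]) -> str:
    parts = [op.constraint for op in ordered(operands)]
    # clobbers(.name) becomes ~{name}
    parts += ["~{%s}" % name for name in clobbers]
    return ",".join(parts)


_TOKEN = re.compile(r"%%|%\[([A-Za-z_][A-Za-z0-9_]*)\]|\$")


def rewrite_template(template: str, operands: List[Operand]) -> str:
    index = {op.name: i for i, op in enumerate(ordered(operands)) if op.name}

    def repl(match):
        token = match.group(0)
        if token == "%%":
            return "%"       # escaped percent becomes a literal percent
        if token == "$":
            return "$$"      # literal dollar must be doubled for LLVM
        name = match.group(1)
        if name not in index:
            raise KeyError("unknown asm operand name: " + name)
        return "${%d}" % index[name]

    return _TOKEN.sub(repl, template)
```

With one input named `src` (constraint `r`), one output named `dst` (constraint `=r`), and `clobbers(.cc, .memory)`, the sketch yields the constraint string `=r,r,~{cc},~{memory}` and rewrites `mov %[src], %[dst]` to `mov ${1}, ${0}`. Strings of this kind are what the D.2 `.ir` snapshot is meant to lock.
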
## Phase E — multi-return tuples + `clobbers(.…)`

| Step | Commit | What | Files |
|---|---|---|---|
| E.0 | xfail | `…-asm-multi-return.sx` (`divmod`→`(quot,rem)`, `cpuid`→4-tuple) red | examples |
| E.1 | green | N `out_value` → LLVM struct return + `extractvalue i` → sx tuple (named when operands named); `clobbers(.name)` dot-name lowering finalized | `src/ir/emit_llvm.zig`, `src/ir/lower/expr.zig` |

## Phase F — global asm (needs EXTERN-EXPORT Phase 2)

| Step | Commit | What | Files |
|---|---|---|---|
| F.0 | xfail | top-level `asm { … }` decl parsed (reject operands/`volatile`); `…-asm-global.sx` (defines a symbol, imported via `extern`) red | `src/parser.zig`, `src/ast.zig` |
| F.1 | green | lower `asm_global` → `c.LLVMAppendModuleInlineAsm`; comptime-call guard (dlsym-miss is loud); blocks concatenate in source order | `src/ir/lower/decl.zig`, `src/ir/emit_llvm.zig`, `src/ir/interp.zig` |

## Phase G — later (own steps when scheduled)

`-> @place` write-through + read-write (`"+r" -> @place`) + indirect-memory
(`"=*m"`) outputs · `%=` unique-id · output-to-const rejection · Intel-dialect
opt-in · naked functions (`callconv(.naked)`, coordinate with EXTERN-EXPORT).

## Open decisions (design doc §II.10)

Dialect (AT&T-only v1, recommended) · `volatile` contextual-keyword (recommended)
· brace separator comma (recommended) · `clobbers(.name)` dot-name sugar now →
checked per-arch `Clobber` enum later (Phase 4 of the design doc).

## End-to-end verification (per phase)

`zig build && zig build test`; for arch-pinned examples confirm they run on a
matching host or assert on `sx ir`/`.s` snapshots. After intentional output
changes only: `zig build test -Dupdate-goldens`, then review the diff.