Files
sx/current/PLAN-ASM.md
agra e7eeecc0f3 docs: move inline-asm design doc to a top-level design/ folder
Moves docs/inline-asm-design.md -> design/inline-asm-design.md (the
internal design record now lives under design/, separate from the
user-facing docs/). Updates all links: current/CHECKPOINT-ASM.md,
current/PLAN-ASM.md, current/PLAN-EXTERN-EXPORT.md (../docs -> ../design)
and docs/inline-assembly.md (same-dir -> ../design).
2026-06-16 07:46:01 +03:00

168 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# sx Inline Assembly — Implementation Plan (ASM stream)
**Design source of truth:** [design/inline-asm-design.md](../design/inline-asm-design.md).
This plan turns that doc's §II.7 stage-map + §II.8 phasing into ordered,
commit-sized, testable steps. Read the design doc first — this file is the
*how/when*, not the *what/why*.
**Surface (decided):**
`asm volatile { "template", "=r" -> T, "r" = expr, clobbers(.cc, .memory) }`
— brace block; `->` output / `=` input; `clobbers(.…)` dot-name list; N `-> Type`
outputs return a tuple; templates are pure AT&T (via LLVM).
**Feasibility (confirmed):** sx links LLVM@19; `src/llvm_api.zig` `@cImport`s
`llvm-c/Core.h`, so `llvm_api.c.*` already exposes `LLVMGetInlineAsm` (9-arg),
`LLVMInlineAsmDialectATT`, `LLVMBuildCall2`, `LLVMAppendModuleInlineAsm`. No shim.
**Relationship to other streams:**
- Phases AE (the inline-asm *expression*) are independent of EXTERN-EXPORT.
- Phase F (global asm) consumes `extern`/`export` to import/expose asm symbols —
do it **after** `PLAN-EXTERN-EXPORT.md` Phase 2.
## Cadence (IMPASSIBLE)
No commit may both add a test AND make it pass. Each feature step is either a
behavior-locking PASSING test, or an xfail test the *next* commit turns green.
Arch-pinned tests live in `examples/16xx-platform-asm-*` and declare their target
via the `expected/<name>.target` sidecar marker (Phase 0). Never regenerate
snapshots while red.
## Phase 0 — corpus target-gating (test-infra prerequisite; no compiler code)
**Why first.** The flagship v1 examples are `x86_64` (syscall-write, divmod,
cpuid) but the dev host is `aarch64`-Darwin, and the corpus runner
([src/corpus_run.test.zig](../src/corpus_run.test.zig)) currently (a) never threads
a per-example `--target` and (b) has no host-arch gate — its only skip is "marker
has no `.sx`". So D.0's `…-syscall-write` markers asserting exit/stdout describe
output the harness *cannot* produce on this host, which would violate the cadence
rule (the "next commit turns it green" can never happen). Phase 0 closes that gap.
It touches **only the runner + two fixtures** — zero compiler code, zero risk to
AE, and unblocks every arch-pinned asm example.
**Marker taxonomy (the cleanup).** The runner currently spreads per-example
*directives* across standalone boolean/value sidecars (`.aot` now, `.target`
proposed, more later). Replace that sprawl with **one optional config file,
`expected/<name>.build`**, holding all build/run directives; the output snapshots
(`.exit` / `.stdout` / `.stderr` / `.ir`) stay separate — they are
machine-regenerated data, not config. `.exit` remains the **test-discovery key**
(every test has one; `.build` is optional).
**`.build` format** — JSON, parsed with `std.json`:
```json
{ "aot": true, "target": "x86_64-linux" }
```
Parse via `std.json.parseFromSlice(BuildConfig, …)` into
`struct { aot: bool = false, target: ?[]const u8 = null }`. Field defaults cover
omitted keys; `std.json`'s default `ignore_unknown_fields = false` makes an
**unknown key a loud `error.UnknownField`** (surfaced as a runner failure, never a
silent ignore — CLAUDE.md no-silent-default rule). Extensible: future `"cpu"`,
`"link"`, `"cwd"` are just new optional struct fields, no new sidecar file and no
custom parser.
**What the directives do:**
1. **`target = <triple|shorthand>`** threads `--target <value>` into every `sx`
invocation for that example (`run` / `build` / `ir``--target` is a global
flag, confirmed [main.zig:39](../src/main.zig#L39)), AND **host-match selects
the mode.** The runner parses the leading `arch` + `os` tokens of the resolved
triple and compares them to `@import("builtin").target` (normalizing
`arm64``aarch64`):
- **match** → *execute* exactly as today (`sx run`, or `aot` build+exec) with
the target threaded, plus the `.ir` diff if an `.ir` snapshot exists. ⇒ an
x86_64 example gives **real end-to-end coverage on an x86_64 CI runner**.
- **mismatch** → **ir-only**: run *only* `sx ir <file> --target <t>`; assert
`.exit` (the ir command's exit), `.ir` (normalized stdout), and `.stderr`
(diagnostics, normally empty). Do **not** run/build/exec; do **not** assert
`.stdout`. An `.ir` snapshot is **required** in ir-only mode — its absence is
a loud runner failure ("arch-pinned <name>: ir-only mode requires an .ir
snapshot"), never a silent pass. Robust even if `sx ir` treats `--target` as
a partial no-op: the `inline_asm` op carries the template + constraint string
verbatim, so the IR snapshot still locks the exact thing §II.11 flags as
silently-miscompiling (the constraint assembler + template rewrite).
2. **`aot`** is the existing JIT-vs-build+exec switch, just relocated from the
standalone `.aot` marker into `.build`.
**Negative compile-error examples need NO `.build`.** `…-missing-volatile`
(no-output-without-`volatile`) is a Sema diagnostic raised before codegen/JIT, so
plain `sx run` reports it identically on any host — it stays a normal example with
no config file.
**update-goldens interaction:** in ir-only mode, `-Dupdate-goldens` writes `.exit`
(ir exit) + `.ir` (+ `.stderr` if non-empty) and skips `.stdout`. Execute mode
(incl. `aot`) is unchanged. `.build` is hand-authored — update-goldens never
writes it.
| Step | Commit | What | Files |
|---|---|---|---|
| 0.0 | lock | Add `BuildConfig` + `std.json` parse of `expected/<name>.build` (unknown-key ⇒ `error.UnknownField`); **migrate** the 2 existing `.aot` markers → `.build` (content `{ "aot": true }`) and delete them; thread `target`'s `--target` into the spawned argv; add `hostMatchesTarget(value) bool` (arch+os token parse, `arm64``aarch64`) gating the **execute** path. Lock with `examples/16xx-platform-target-host.sx` (trivial `main`) + a `.build` `{ "target": "<host arch triple>" }` (still runs+passes) and unit `test`s for the JSON parse + `hostMatchesTarget`. | `src/corpus_run.test.zig`, `examples/expected/1226-*.{aot→build}`, `…/1227-*`, + fixture |
| 0.1 | lock | Implement the **mismatch ⇒ ir-only** branch (skip run/build/exec; assert `.exit`+`.ir`+`.stderr` from `sx ir --target`; require `.ir`). Lock with `examples/16xx-platform-target-cross.sx` (asm-free `() -> i64 { return 0; }`) + `.build` `{ "target": "x86_64-linux" }` + a checked-in `.ir` snapshot — exercises ir-only on the arm64 host. | `src/corpus_run.test.zig` + fixture |
| 0.2 | docs | Update CLAUDE.md §"Test layout"/§"Testing" to document `.build` (format + `aot`/`target` keys) replacing the standalone `.aot` marker prose (lines ~435, ~492). | `CLAUDE.md` |
Both 0.0 and 0.1 are **lock** commits: the runner change and the fixture that
exercises it land together and pass the moment they land (the mechanism works
immediately — nothing is left red), which is the cadence rule's "lock in current
behavior" flavor, not a feature red→green. No asm lowering is gated on either.
**Phase 0 verification:** `zig build test` green; deliberately corrupt the
cross-target `.ir` fixture and confirm the runner reports an IR mismatch (proves
ir-only actually asserts, isn't a no-op); delete it and confirm the
"requires an .ir snapshot" failure fires.
**Estimated runner delta:** ~7090 lines (sidecar read + `--target` argv threading
+ `hostMatchesTarget` + the ir-only branch + update-mode tweak). Within the
"no step > ~500 new lines" rule; well under the read budget.
## Phase A — keyword + AST + parser (parses; no codegen)
| Step | Commit | What | Files |
|---|---|---|---|
| A.0 | lock | add `kw_asm` keyword + map entry; unit lex test `asm → kw_asm` | `src/token.zig`, `src/lexer.zig` + `.test.zig` |
| A.1 | xfail | parse `asm { … }``AsmExpr`/`AsmOperand` in `parsePrimary`; pin an AST/`sx ir` parse snapshot; lowering still `bailDetail("inline asm codegen unimplemented")` | `src/ast.zig` (:85 union arm, :721 structs), `src/parser.zig` (parsePrimary), `src/ir/interp.zig` |
| A.2 | green | parse-shape snapshot lands green; the unimplemented bail is loud + named | — |
## Phase B — sema / typing
| Step | Commit | What | Files |
|---|---|---|---|
| B.0 | xfail | result-type rule (0→`void` / 1→`T` / N→named-or-positional tuple) + checklist (no-output⇒`volatile`, layout, comptime-string template) — pin error messages | `src/ir/expr_typer.zig` |
| B.1 | green | typing + diagnostics implemented; `.unresolved` sentinel on failure (no silent default) | `src/ir/expr_typer.zig`, `src/ir/semantic_diagnostics.zig` |
## Phase C — IR op + lowering
| Step | Commit | What | Files |
|---|---|---|---|
| C.0 | lock | add `inline_asm: InlineAsm` to `Op` + `AsmOperand` (role/name/constraint/operand) + interp `bailDetail` arm; unit tests for the IR shape | `src/ir/inst.zig` (:80), `src/ir/interp.zig` |
| C.1 | xfail→green | `lowerAsmExpr` in `lowerExpr` dispatch — interns template/constraints/clobber-names, lowers input `Ref`s, sets result `TypeId` | `src/ir/lower/expr.zig` |
## Phase D — LLVM emit (single value-output; the core)
| Step | Commit | What | Files |
|---|---|---|---|
| D.0 | xfail | `examples/16xx-platform-asm-syscall-write.sx` + `…-register-read.sx` + `…-no-output-volatile.sx` + `…-missing-volatile.sx` (expected compile error) — all red | examples + `expected/` markers |
| D.1 | green | `emitInlineAsm`: **port `FuncGen.airAssembly`** — constraint-string assembler (outputs `=`/`+`, inputs, `clobbers(.name)``~{name}`), `%[name]``${N}` / `%%` / `%=` template rewriter, `LLVMGetInlineAsm`+`LLVMBuildCall2`, `sideeffect=volatile`, AT&T dialect | `src/ir/emit_llvm.zig` (emitInst dispatch + handler) |
| D.2 | green | lock the template-rewrite + constraint string via an `expected/*.ir` snapshot on `…-template-subst.sx` | examples |
**Phase D verification:** `zig build test`; the syscall example runs on
`x86_64-linux`; IR snapshot matches the design doc's worked `sys_write` lowering.
## Phase E — multi-return tuples + `clobbers(.…)`
| Step | Commit | What | Files |
|---|---|---|---|
| E.0 | xfail | `…-asm-multi-return.sx` (`divmod``(quot,rem)`, `cpuid`→4-tuple) red | examples |
| E.1 | green | N `out_value` → LLVM struct return + `extractvalue i` → sx tuple (named when operands named); `clobbers(.name)` dot-name lowering finalized | `src/ir/emit_llvm.zig`, `src/ir/lower/expr.zig` |
## Phase F — global asm (needs EXTERN-EXPORT Phase 2)
| Step | Commit | What | Files |
|---|---|---|---|
| F.0 | xfail | top-level `asm { … }` decl parsed (reject operands/`volatile`); `…-asm-global.sx` (defines a symbol, imported via `extern`) red | `src/parser.zig`, `src/ast.zig` |
| F.1 | green | lower `asm_global``c.LLVMAppendModuleInlineAsm`; comptime-call guard (dlsym-miss is loud); blocks concatenate in source order | `src/ir/lower/decl.zig`, `src/ir/emit_llvm.zig`, `src/ir/interp.zig` |
## Phase G — later (own steps when scheduled)
`-> @place` write-through + read-write (`"+r" -> @place`) + indirect-memory
(`"=*m"`) outputs · `%=` unique-id · output-to-const rejection · Intel-dialect
opt-in · naked functions (`callconv(.naked)`, coordinate with EXTERN-EXPORT).
## Open decisions (design doc §II.10)
Dialect (AT&T-only v1, recommended) · `volatile` contextual-keyword (recommended)
· brace separator comma (recommended) · `clobbers(.name)` dot-name sugar now →
checked per-arch `Clobber` enum later (Phase 4 of the design doc).
## End-to-end verification (per phase)
`zig build && zig build test`; for arch-pinned examples confirm they run on a
matching host or assert on `sx ir`/`.s` snapshots. After intentional output
changes only: `zig build test -Dupdate-goldens`, then review the diff.