docs(asm): checkpoint Phase G — read-write + place outputs

agra
2026-06-15 23:08:24 +03:00
parent 4128416d48
commit 97a4050462


@@ -6,7 +6,28 @@ commit, one step at a time per the cadence rule (no commit may both add a test
and make it pass).
## Last completed step
**2** — `-> @place` write-through outputs. An asm result can be **stored through
**G (read-write `+` place outputs)** — a `+r` / `+{reg}` `-> @place` output is now
implemented (the last substantive feature). LLVM has no `+` constraint, so a
read-write place lowers to: an output **`=`** constraint (return slot, stored back
through the place after the call; the leading `+` rewritten to `=` in
`appendAsmConstraints`), **plus** a **tied input** (the decimal index of that
output) appended **after** the regular inputs, seeded with the place's loaded
value passed as a call arg. Tied inputs come **last** so existing operand indices
(`%[name]` → `${N}`) are undisturbed — `asmOperandIndex` unchanged. Lowering
(`lowerAsmExpr`) no longer rejects `+` (indirect `*` still rejected loudly).
`emitInlineAsm` (`src/backend/llvm/ops.zig`): grows arg/param arrays by the rw
count (`n_args = n_inputs + n_rw`), loads each seed (`asm.rw.seed`), emits the
tied constraint, and the existing store-back path writes the modified output back.
New `asmIsReadWrite(e, op)` helper. Verified by **running**: increment-in-place
(41→42, IR `"=r,0"`) and a mixed case (rw place + regular input + value output) →
textbook `"=r,=r,r,0"` with correct `${N}` indices and args `(input, seed)`. Two
commits per cadence: (1) `examples/1650-platform-asm-rw-place.sx` locked the
rejection; (2) implemented + flipped 1650 to a runnable aarch64-pinned example
(`{ "target": "macos" }`, ir-only elsewhere). `zig build test` green (658 corpus,
446 unit). Files: `src/ir/lower/expr.zig`, `src/backend/llvm/ops.zig`,
`examples/1650-*`.
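
The constraint rewrite described above can be modelled in a few lines. This is an illustrative Python sketch, not the project's Zig code in `appendAsmConstraints` / `emitInlineAsm`. The function name `lower_asm_operands`, the tuple shapes, and the assumption that the read-write place is the first output in the mixed case are all hypothetical, chosen only to reproduce the two constraint strings quoted in this checkpoint.

```python
def lower_asm_operands(outputs, inputs):
    """Model the `+` -> `=` plus tied-input lowering.

    outputs: list of (constraint, seed) pairs; seed is the loaded place
             value for a read-write (`+`) output, or None otherwise.
    inputs:  list of (constraint, arg) pairs for regular inputs.
    Returns (constraint_string, call_args).
    """
    constraints = []
    tied = []   # decimal output indices, emitted after the regular inputs
    seeds = []  # seed values, passed after the regular input args

    for index, (constraint, seed) in enumerate(outputs):
        if constraint.startswith("+"):
            # No `+` in LLVM constraint strings: emit `=` and tie an input.
            constraints.append("=" + constraint[1:])
            tied.append(str(index))
            seeds.append(seed)
        else:
            constraints.append(constraint)

    # Regular inputs keep their positions, so existing operand indices hold.
    constraints.extend(c for c, _ in inputs)
    args = [a for _, a in inputs]

    # Tied inputs come last, both in the constraint string and in the args.
    constraints.extend(tied)
    args.extend(seeds)

    return ",".join(constraints), args


increment = lower_asm_operands([("+r", "seed")], [])
mixed = lower_asm_operands([("+r", "seed"), ("=r", None)], [("r", "input")])
print(increment)
print(mixed)
```

Under these assumptions the increment-in-place case yields the `"=r,0"` string and the mixed case yields `"=r,=r,r,0"` with args `(input, seed)`, matching the IR described above. Because the tied inputs are appended after every regular operand, no existing operand index shifts.
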
Prior: **2** — `-> @place` write-through outputs. An asm result can be **stored through
a place** (local / struct field) instead of returned; the place output does NOT
join the result tuple. Parser: `-> @place` parses the `@place` as an ordinary
address-of expression → an `out_place` operand (`src/parser.zig`). Lowering
@@ -173,9 +194,10 @@ pipeline: lex (A.0) → parse (A.1) → validate (B.0/B.1 + `%[name]` check) →
tuples (E). Register-class + register-pinned operands, inputs, clobbers, `#string`
multi-instruction templates, `%[name]`/`%%` rewriting, and the §II.5 auto-naming
rule all work and execute on the host JIT. Global `asm { … }` (Phase F) works AOT (call-into-asm
via lib-less `extern`). `-> @place` **write-through** outputs work (Phase 2);
read-write (`+`) and indirect-memory (`*`) place outputs are rejected loudly as
not-yet-implemented — the remaining feature work. Smaller
via lib-less `extern`). `-> @place` **write-through** outputs work (Phase 2) and
**read-write (`+`)** place outputs work (Phase G — tied-input lowering, runs on
aarch64). Indirect-memory (`*`) place outputs are still rejected loudly as
not-yet-implemented — the only remaining substantive feature. Smaller
follow-ups: the comptime-call guard for global asm (`#run` into a module-asm
symbol should fail loud via dlsym-miss — pin a test), a JIT-vs-global-asm note
(`sx run` silently mishandles module-asm symbols; AOT is correct), and the x86_64
@@ -191,13 +213,8 @@ Phase EF feasibility already confirmed against the live tree
`extern`, 60 sites; `--target` a global CLI flag).
## Next step
Inline assembly is **feature-complete for the common surface**. Remaining work,
all optional / additive (pick any):
- **Read-write (`"+…" -> @place`) place outputs**: LLVM expresses `+` as an
output `=` + a TIED input (`0` referencing the output index), with the seed
value passed as an arg — Zig's `llvm_rw_vals` mechanism. Currently rejected at
lowering. Needs the tied-input plumbing in `emitInlineAsm` + seeding a load of
the place.
Inline assembly is **feature-complete for the common surface** plus read-write
(`+`) place outputs. Remaining work, all optional / additive (pick any):
- **Indirect-memory (`"=*m"`) outputs**: pass the place address as an arg, asm
writes through it (no return slot). Currently rejected.
- **Output-to-`const` rejection** for `-> @place` (the place must be mutable).
@@ -255,6 +272,13 @@ Orthogonal: **issue 0137** (no-`main` segfault).
(out_place → store-through, out_value → result), fast-path when no places.
`+`/`*` rejected. Locked with 1649 (mixed, runs). `zig build test` green (657
corpus, 446 unit).
- (G) read-write `+` place outputs — `+` lowers to an output `=` + a tied input
(output-index constraint) seeded with the place's loaded value, tied inputs
appended last (operand indices undisturbed). `appendAsmConstraints` rewrites
`+` → `=`; `emitInlineAsm` grows args by the rw count + loads seeds;
`asmIsReadWrite` helper. Lowering stops rejecting `+` (`*` still rejected). Two
commits (cadence): 1650 locked the rejection, then flipped to a runnable
aarch64 example (`"=r,0"` IR). `zig build test` green (658 corpus, 446 unit).
## Known issues
- **0137** — `sx run` on a program with no `main` segfaults (unguarded JIT entry