docs(asm): checkpoint Phase G — read-write + place outputs
@@ -6,7 +6,28 @@ commit, one step at a time per the cadence rule (no commit may both add a test
 and make it pass).
 
 ## Last completed step
-**2** — `-> @place` write-through outputs. An asm result can be **stored through
+**G (read-write `+` place outputs)** — a `+r` / `+{reg}` `-> @place` output is now
+implemented (the last substantive feature). LLVM has no `+` constraint, so a
+read-write place lowers to: an output **`=`** constraint (return slot, stored back
+through the place after the call; the leading `+` rewritten to `=` in
+`appendAsmConstraints`), **plus** a **tied input** (the decimal index of that
+output) appended **after** the regular inputs, seeded with the place's loaded
+value passed as a call arg. Tied inputs come **last** so existing operand indices
+(`%[name]`→`${N}`) are undisturbed — `asmOperandIndex` unchanged. Lowering
+(`lowerAsmExpr`) no longer rejects `+` (indirect `*` still rejected loudly).
+`emitInlineAsm` (`src/backend/llvm/ops.zig`): grows arg/param arrays by the rw
+count (`n_args = n_inputs + n_rw`), loads each seed (`asm.rw.seed`), emits the
+tied constraint, and the existing store-back path writes the modified output back.
+New `asmIsReadWrite(e, op)` helper. Verified by **running**: increment-in-place
+(41→42, IR `"=r,0"`) and a mixed case (rw place + regular input + value output) →
+textbook `"=r,=r,r,0"` with correct `${N}` indices and args `(input, seed)`. Two
+commits per cadence: (1) `examples/1650-platform-asm-rw-place.sx` locked the
+rejection; (2) implemented + flipped 1650 to a runnable aarch64-pinned example
+(`{ "target": "macos" }`, ir-only elsewhere). `zig build test` green (658 corpus,
+446 unit). Files: `src/ir/lower/expr.zig`, `src/backend/llvm/ops.zig`,
+`examples/1650-*`.
+
+Prior: **2** — `-> @place` write-through outputs. An asm result can be **stored through
 a place** (local / struct field) instead of returned; the place output does NOT
 join the result tuple. Parser: `-> @place` parses the `@place` as an ordinary
 address-of expression → an `out_place` operand (`src/parser.zig`). Lowering
@@ -173,9 +194,10 @@ pipeline: lex (A.0) → parse (A.1) → validate (B.0/B.1 + `%[name]` check) →
 tuples (E). Register-class + register-pinned operands, inputs, clobbers, `#string`
 multi-instruction templates, `%[name]`/`%%` rewriting, and the §II.5 auto-naming
 rule all work and execute on the host JIT. Global `asm { … }` (Phase F) works AOT (call-into-asm
-via lib-less `extern`). `-> @place` **write-through** outputs work (Phase 2);
-read-write (`+`) and indirect-memory (`*`) place outputs are rejected loudly as
-not-yet-implemented — the remaining feature work. Smaller
+via lib-less `extern`). `-> @place` **write-through** outputs work (Phase 2) and
+**read-write (`+`)** place outputs work (Phase G — tied-input lowering, runs on
+aarch64). Indirect-memory (`*`) place outputs are still rejected loudly as
+not-yet-implemented — the only remaining substantive feature. Smaller
 follow-ups: the comptime-call guard for global asm (`#run` into a module-asm
 symbol should fail loud via dlsym-miss — pin a test), a JIT-vs-global-asm note
 (`sx run` silently mishandles module-asm symbols; AOT is correct), and the x86_64
@@ -191,13 +213,8 @@ Phase E–F feasibility already confirmed against the live tree
 `extern`, 60 sites; `--target` a global CLI flag).
 
 ## Next step
-Inline assembly is **feature-complete for the common surface**. Remaining work,
-all optional / additive (pick any):
-- **Read-write (`"+…" -> @place`) place outputs**: LLVM expresses `+` as an
-output `=` + a TIED input (`0` referencing the output index), with the seed
-value passed as an arg — Zig's `llvm_rw_vals` mechanism. Currently rejected at
-lowering. Needs the tied-input plumbing in `emitInlineAsm` + seeding a load of
-the place.
+Inline assembly is **feature-complete for the common surface** plus read-write
+(`+`) place outputs. Remaining work, all optional / additive (pick any):
 - **Indirect-memory (`"=*m"`) outputs**: pass the place address as an arg, asm
 writes through it (no return slot). Currently rejected.
 - **Output-to-`const` rejection** for `-> @place` (the place must be mutable).
@@ -255,6 +272,13 @@ Orthogonal: **issue 0137** (no-`main` segfault).
 (out_place → store-through, out_value → result), fast-path when no places.
 `+`/`*` rejected. Locked with 1649 (mixed, runs). `zig build test` green (657
 corpus, 446 unit).
+- (G) read-write `+` place outputs — `+` lowers to an output `=` + a tied input
+(output-index constraint) seeded with the place's loaded value, tied inputs
+appended last (operand indices undisturbed). `appendAsmConstraints` rewrites
+`+`→`=`; `emitInlineAsm` grows args by the rw count + loads seeds;
+`asmIsReadWrite` helper. Lowering stops rejecting `+` (`*` still rejected). Two
+commits (cadence): 1650 locked the rejection, then flipped to a runnable
+aarch64 example (`"=r,0"` IR). `zig build test` green (658 corpus, 446 unit).
 
 ## Known issues
 - **0137** — `sx run` on a program with no `main` segfaults (unguarded JIT entry
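Reading aid (not part of the commit): the tied-input lowering described in the Phase G notes can be sketched in a few lines of Python. This is an illustrative model only, not the Zig code in `src/backend/llvm/ops.zig`; the function name `lower_operands` and the `(constraint, place)` / `(constraint, value)` tuple shapes are hypothetical. It assumes outputs are listed first, regular inputs second, and tied inputs last, as the notes state.

```python
def lower_operands(outputs, inputs):
    """Model of the read-write (`+`) lowering described above.

    outputs: list of (constraint, place) pairs, e.g. ("+r", "x") or ("=r", None)
    inputs:  list of (constraint, value) pairs, e.g. ("r", "y")
    Returns (constraint_string, call_args). All names here are hypothetical.
    """
    parts = []       # constraint pieces, joined with "," at the end
    tied = []        # decimal output indices, one per read-write output
    seed_args = []   # loaded place values that seed the tied inputs

    for index, (constraint, place) in enumerate(outputs):
        if constraint.startswith("+"):
            # LLVM has no "+": emit "=" for the output and tie an input to it.
            parts.append("=" + constraint[1:])
            tied.append(str(index))
            seed_args.append("load " + place)
        else:
            parts.append(constraint)

    # Regular inputs keep their positions, so operand indices are undisturbed.
    parts.extend(constraint for constraint, _ in inputs)
    call_args = [value for _, value in inputs]

    # Tied inputs come last, each seeded with the place's loaded value.
    parts.extend(tied)
    call_args.extend(seed_args)

    return ",".join(parts), call_args


if __name__ == "__main__":
    # Increment-in-place: a single read-write place output.
    print(lower_operands([("+r", "x")], []))
    # Mixed: read-write place, value output, one regular input.
    print(lower_operands([("+r", "x"), ("=r", None)], [("r", "y")]))
```

Under these assumptions, the two cases reproduce the constraint strings quoted in the notes (`"=r,0"` and `"=r,=r,r,0"`), with the call arguments ordered as (input, seed).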