diff --git a/current/CHECKPOINT-FIBERS.md b/current/CHECKPOINT-FIBERS.md index b94dd656..fc2de3fd 100644 --- a/current/CHECKPOINT-FIBERS.md +++ b/current/CHECKPOINT-FIBERS.md @@ -4,55 +4,67 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step per the cadence rule). New corpus category: `18xx` concurrency. ## Last completed step -**B1.0a (`abi(.pure)` lock commit) — DONE.** Plumbed the `is_pure` flag end-to-end and made -emit bail loudly: -- IR `Function.is_pure: bool` ([inst.zig](../src/ir/inst.zig)) — set from `fd.abi == .pure` - at both `declareFunction` decl sites ([decl.zig](../src/ir/lower/decl.zig)). -- `funcWantsImplicitCtx` returns false for `.pure` (mirrors the `.c` skip, decl.zig:515) — - a `.pure` fn gets no synthetic `__sx_ctx`. -- Both body-lowering paths bypass `lowerValueBody` for `.pure`: lower the asm body as - statements + cap with `unreachable` (a `.pure` body has no sx return — the asm rets - itself; this avoids the implicit-return diagnostic). -- `emit_llvm` Pass 2 (~line 402) **bails loudly** when `func.is_pure` - ("`abi(.pure)` function '…' LLVM emission not yet implemented") via `comptime_failed` - (driver aborts nonzero) — NOT a framed body (whose epilogue would corrupt a context - switch's SP-in ≠ SP-out). -- `examples/1800-concurrency-pure-asm.sx` — one host example (no `.build` pin; the bail is - host-independent, fires before any asm/instruction selection), locked to the bail snapshot - (exit 1, empty stdout, the loud diagnostic on stderr). -- **Adversarial review (closed in-step):** the review caught that `is_pure` was set ONLY at - the two `declareFunction` decl sites — generic monomorphization - ([generic.zig](../src/ir/lower/generic.zig)) and pack expansion - ([pack.zig](../src/ir/lower/pack.zig)) create the `Function` via a different path and left - `is_pure` false, so a generic `.pure` instance silently shipped a framed body (returned 42 - but leaked the prologue's stack adjustment — the exact corruption the lock prevents). Both - paths now set `is_pure` + route `.pure` bodies through the asm-only + `unreachable` cap. - Locked by `examples/1801-concurrency-pure-generic-bail.sx`. (The review's other CRITICAL — - a `.pure` *lambda* — is a **false positive**: `isLambda`'s return-type scan - (parser.zig:3652) breaks on the `abi` keyword, so a `.pure` lambda is unparseable and - `parseLambda`'s abi-handling is never reached. Latent `isLambda`/`parseLambda` - inconsistency, not a B1 concern.) -- **Naming:** the sx-facing name is **`pure`** throughout (field, diagnostic); LLVM's - `naked` attribute is only the B1.0b lowering mechanism (per user direction — don't call - the function "naked"). -- `zig build && zig build test` green: **723 ran, 0 failed**. +**B1.0b (`abi(.pure)` real emission) — DONE. B1.0 complete.** Replaced the emit bail with +real LLVM `naked` emission: +- `emit_llvm` declaration pass: for `func.is_pure`, add the LLVM `naked` + `noinline` + + `nounwind` attributes and **skip** the `frame-pointer=all` attribute (incompatible with a + frameless function). Pass 2 now emits the `.pure` body normally — `naked` makes the + backend emit it verbatim (the inline asm + its own `ret`) with no prologue/epilogue. +- IR shape (verified): `; Function Attrs: naked noinline nounwind` / `define internal i64 + @answer() #0 { entry: call void asm sideeffect "…ret…", ""() unreachable }` / + `attributes #0 = { naked noinline nounwind }`. The caller invokes it as an ordinary + `() -> i64` call (`.pure` is `call_conv == .default`). +- `examples/1800-concurrency-pure-asm.sx` — now GREEN, aarch64-pinned (`.build {"target": + "macos"}`): runs end-to-end → **exit 42** on this host, ir-only on a mismatch; `.ir` + snapshot captured. +- `examples/1801-concurrency-pure-generic.sx` (renamed from `-bail`) — the generic `.pure` + now emits a correct naked `answer__i64` (exit 42), proving generic.zig produces a naked + body, not a framed one. aarch64-pinned. +- `examples/1802-concurrency-pure-asm-x86.sx` — x86_64 cross sibling (`.build {"target": + "x86_64-linux"}`, ir-only here): `.ir` locks `naked` + `movl $42, %eax` / `ret`. +- Unit test `emit: abi(.pure) function gets the naked attribute (no frame-pointer)` in + `emit_llvm.test.zig` (asserts `naked` present, `frame-pointer` absent). +- **B1.0c (review-hardening):** a param-bearing `.pure` fn emitted invalid LLVM (loud + verifier error "cannot use argument of naked function") because the param-alloca loop + wasn't gated. Fixed forward (this *enables* the B1.3 context-switch use case rather than + rejecting it): gated the param-alloca loop on `fd.abi != .pure` in decl.zig (both paths) + + generic.zig; a naked fn's args stay in registers (read by asm), declared-but-unused in + LLVM. Locked by `examples/1803-concurrency-pure-asm-param.sx` (`add(a,b)` → x0+x1 → 42). +- `zig build && zig build test` green: **725 ran, 0 failed** + unit tests. + +### Earlier — B1.0a (lock + review hardening) +Plumbed `Function.is_pure` (set from `fd.abi == .pure` at both decl sites + generic.zig + +pack.zig); `funcWantsImplicitCtx` skips `.pure` (no synthetic ctx, like `.c`); all +body-lowering paths bypass `lowerValueBody` for `.pure` (asm body + `unreachable` cap — no sx +return); `emit_llvm` Pass 2 bailed loudly (since flipped to real emission). Adversarial +review caught the generic/pack `is_pure` gap (a generic `.pure` silently shipped a framed +body); closed + locked. The review's `.pure`-lambda CRITICAL was a false positive +(unparseable — `isLambda` breaks on the `abi` keyword). ## Current state -Stream A (atomics) is feature-complete (✅) and unblocks B2-channels. Stream B1: **B1.0a -landed**; the `abi(.pure)` ABI is plumbed but emit deliberately bails (B1.0b flips it to -real LLVM `naked` emission). No fibers/Io/scheduler code yet. Grounded floor facts: +Stream A (atomics) is feature-complete (✅). Stream B1: **B1.0 complete** — `abi(.pure)` +emits a real LLVM `naked` function end-to-end (decl, generic, pack paths), the substrate for +the fiber context-switch. No fibers/Io/scheduler code yet. Grounded floor facts: - `context` is already an implicit `*Context` param (slot 0) + `push Context` is a stack `alloca` ⇒ **fiber-local for free**. Only shared root = `__sx_default_context` global (entry-point bind). B1.1 expected to be a **library convention** (spawn trampoline snapshots the spawner's ctx into slot 0), **likely zero compiler change** — probe first. - Inline asm works end-to-end (lower→emit→JIT, aarch64 + x86_64) — the `.pure` body reuses it. +- **`.pure` with PARAMS works** (B1.0c, the B1.3 substrate): the param-alloca loop is gated + on `fd.abi != .pure` in decl.zig (both paths) + generic.zig — a naked fn's args stay in + ABI registers (read by the asm body), declared-but-unused in LLVM (verifier-legal). + Example `1803-concurrency-pure-asm-param.sx` (`add(a,b)` reads x0/x1). **Unsupported (loud, + not silent):** a `.pure` *variadic-pack* fn (pack.zig's param loop is intertwined with + comptime-param/`#insert` handling, and a naked fn can't read a runtime-sized pack from + registers anyway) → loud LLVM-verifier error for that nonsensical construct. Acceptable + boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker. ## Next step -**B1.0b (`abi(.pure)` real emission)** — per PLAN-FIBERS.md "Phases → B1.0 → B1.0b" and the -kickoff prompt at the bottom of that file. Replace the emit bail with LLVM's `naked` -attribute + asm-only body; pin `1800` aarch64 (run end-to-end → exit 42, capture `.ir`); add -x86_64 cross sibling `1802` (ir-only); add an `emit_llvm.test.zig` unit test asserting the -`naked` attr. Separate commit (cadence rule — B1.0a locked, B1.0b greens). +**B1.1 (per-fiber `context` root) — probe-first.** Per PLAN-FIBERS.md "Phases → B1.1". Write +a probe confirming a spawn trampoline can pass a snapshotted `Context` as slot 0 with no +compiler change (grounded as likely zero-change); lock the behavior with an `18xx` example + +a checkpoint note on the convention. Only if the probe surfaces a real gap (a path re-reads +`__sx_default_context` mid-stack) does this become a compiler step. ## Known issues / capability gaps - **Orthogonal (not a B1 blocker):** default VALUES for comptime params don't bind on @@ -117,4 +129,16 @@ x86_64 cross sibling `1802` (ir-only); add an `emit_llvm.test.zig` unit test ass corrupted the stack). Fixed generic.zig + pack.zig (set `is_pure` + asm-only `unreachable` cap); locked by `examples/1801-concurrency-pure-generic-bail.sx`. The review's `.pure`- lambda CRITICAL was a false positive (unparseable — `isLambda` breaks on `abi`). Suite - green (723/0). **Next: B1.0b (real `naked` emission).** + green (723/0). +- **B1.0b** — real `naked` emission: emit_llvm declaration pass adds LLVM `naked`/`noinline`/ + `nounwind` + skips `frame-pointer` for `func.is_pure`; Pass 2 emits the body verbatim (no + prologue). `1800` green aarch64-pinned (exit 42 + `.ir`); renamed `1801` → `-generic` + (generic `.pure` emits a naked body, exit 42); added x86_64 sibling `1802` (ir-only, `.ir` + locks `naked` + `movl $42, %eax`). Unit test asserts `naked` present + `frame-pointer` + absent. Suite green (724/0). +- **B1.0c** — review-hardening: param-bearing `.pure` emitted invalid LLVM (loud verifier + error). Gated the param-alloca loop on `fd.abi != .pure` (decl.zig both paths + generic.zig) + — naked args stay in registers, read by the asm body (the B1.3 context-switch shape). + Locked by `examples/1803-concurrency-pure-asm-param.sx`. Pack `.pure` left unsupported + (loud, nonsensical). **B1.0 complete.** Suite green (725/0). **Next: B1.1 (per-fiber + context, probe-first).** diff --git a/current/PLAN-FIBERS.md b/current/PLAN-FIBERS.md index 1172f23c..fca4a46a 100644 --- a/current/PLAN-FIBERS.md +++ b/current/PLAN-FIBERS.md @@ -1,8 +1,8 @@ # PLAN-FIBERS — Stream B1 (fibers + Io + M:1 scheduler) -> **STATUS: 🚧 in progress.** B1.0a (`abi(.pure)` lock commit) ✅ landed. Next step = -> **B1.0b** (`abi(.pure)` real emission — LLVM `naked` attr). See the kickoff prompt at the -> bottom. +> **STATUS: 🚧 in progress.** B1.0 (`abi(.pure)` codegen) ✅ complete — emits a real LLVM +> `naked` function end-to-end (decl / generic / pack paths; examples 1800/1801/1802 + unit +> test). Next step = **B1.1** (per-fiber `context` root — probe-first, likely library-only). Carved from [PLAN-POST-METATYPE.md](PLAN-POST-METATYPE.md) Stream B (§B1) + the design-of-record [../design/execution-evolution-roadmap.md](../design/execution-evolution-roadmap.md) @@ -152,23 +152,25 @@ B1.0 (`.pure`) forces these plumbing sites: ## Phases (xfail→green steps) -### B1.0 — `abi(.pure)` codegen -- **B1.0a (lock) — ✅ DONE** (commit pending). Carried `abi == .pure` into IR - `Function.is_pure`; threaded through `decl.zig` (`funcWantsImplicitCtx` skips `.pure` like - `.c`; both body-lowering paths bypass `lowerValueBody` for `.pure`, lowering the asm body + - capping with `unreachable`); `emit_llvm` Pass 2 **bails loudly** on `func.is_pure` - ("`abi(.pure)` function '…' LLVM emission not yet implemented", build-gating nonzero exit). - `examples/1800-concurrency-pure-asm.sx` (one host example, no `.build` pin — the bail is - host-independent) locked to the bail snapshot. Suite green (722/0). -- **B1.0b (green) ← NEXT** — emit LLVM's `naked` attr - (`LLVMGetEnumAttributeKindForName("naked", 5)` + `LLVMCreateEnumAttribute` + - `LLVMAddAttributeAtIndex` at func index −1; shape per `nounwind` at emit_llvm.zig:1339); - emit the `.pure` body as the asm block only (no prologue/epilogue/ctx). Pin `1800` - aarch64 (`.build {"target":"aarch64-macos"}`) → runs end-to-end (exit 42) on this host, - ir-only on a mismatch; capture its `.ir` (asserts `naked` + the asm). Add an x86_64 cross - sibling `examples/1802-concurrency-pure-asm-x86.sx` (`.build {"target":"x86_64-linux"}`, - ir-only here). Add a unit test in `emit_llvm.test.zig` asserting the `naked` attribute is - present on a `.pure` function. Review the diff (no stray error text). Commit. +### B1.0 — `abi(.pure)` codegen — ✅ COMPLETE +- **B1.0a (lock) — ✅ DONE.** Carried `abi == .pure` into IR `Function.is_pure`; threaded + through `decl.zig` (`funcWantsImplicitCtx` skips `.pure` like `.c`; all body-lowering paths + bypass `lowerValueBody` for `.pure`, lowering the asm body + capping with `unreachable`) + + generic.zig + pack.zig; `emit_llvm` Pass 2 bailed loudly on `func.is_pure`. Locked by + `examples/1800-concurrency-pure-asm.sx` + the generic regression (review-found gap). +- **B1.0b (green) — ✅ DONE.** `emit_llvm` declaration pass adds LLVM `naked` + `noinline` + + `nounwind` for `func.is_pure` and skips `frame-pointer=all` (incompatible with a frameless + function); Pass 2 emits the body normally (`naked` ⇒ verbatim asm + own `ret`, no + prologue). `1800` pinned aarch64 → exit 42 + `.ir`; `1801-concurrency-pure-generic.sx` + (renamed from `-bail`) proves the generic path emits a naked body (exit 42); + `1802-concurrency-pure-asm-x86.sx` x86_64 cross sibling (ir-only here, `.ir` locks `naked` + + `movl $42, %eax`). Unit test `emit: abi(.pure) function gets the naked attribute` asserts + `naked` present + `frame-pointer` absent. Suite green (724/0). +- **B1.0c (review-hardening) — ✅ DONE.** A param-bearing `.pure` fn emitted invalid LLVM + (loud verifier error). Gated the param-alloca loop on `fd.abi != .pure` (decl.zig both + paths + generic.zig) so a naked fn's args stay in registers (read by the asm body) — this + *enables* B1.3's `swap_context(from, to)`. Locked by `1803-concurrency-pure-asm-param.sx`. + Pack `.pure` (variadic + naked, nonsensical) left unsupported → loud verifier error. ### B1.1 — per-fiber `context` root (probe-first; likely zero compiler change) - **B1.1a (probe + lock)** — write a probe (`.sx-tmp/`) + an `18xx` example that snapshots a diff --git a/examples/1803-concurrency-pure-asm-param.sx b/examples/1803-concurrency-pure-asm-param.sx new file mode 100644 index 00000000..9da05e67 --- /dev/null +++ b/examples/1803-concurrency-pure-asm-param.sx @@ -0,0 +1,25 @@ +// Stream B1 (fibers) — an `abi(.pure)` function with PARAMETERS reads its args +// from ABI registers (the shape the fiber context-switch needs: `swap_context` +// reads `from`/`to` from x0/x1). +// +// A naked function has no frame, so params are NOT spilled to stack slots — they +// stay in their ABI registers and the asm body reads them directly. Here `a` is +// in x0, `b` in x1 (aarch64 AAPCS), and the result returns in x0: `add x0, x0, +// x1`. The lowering skips the param-alloca loop for `.pure` (decl.zig / +// generic.zig); the LLVM args are declared-but-unused, which the verifier allows +// (spilling them would emit `store i64 %0, …` → "cannot use argument of naked +// function"). aarch64-pinned; runs end-to-end (exit 42), ir-only on a mismatch. +// +// Regression for an adversarial-review finding: before the param-alloca guard, a +// param-bearing `.pure` fn emitted invalid LLVM (loud verifier error) instead of +// a working naked function. +add :: (a: i64, b: i64) -> i64 abi(.pure) { + asm volatile { + #string A + add x0, x0, x1 + ret +A + }; +} + +main :: () -> i64 { return add(40, 2); } diff --git a/examples/expected/1803-concurrency-pure-asm-param.build b/examples/expected/1803-concurrency-pure-asm-param.build new file mode 100644 index 00000000..42e24dd2 --- /dev/null +++ b/examples/expected/1803-concurrency-pure-asm-param.build @@ -0,0 +1 @@ +{ "target": "macos" } diff --git a/examples/expected/1803-concurrency-pure-asm-param.exit b/examples/expected/1803-concurrency-pure-asm-param.exit new file mode 100644 index 00000000..d81cc071 --- /dev/null +++ b/examples/expected/1803-concurrency-pure-asm-param.exit @@ -0,0 +1 @@ +42 diff --git a/examples/expected/1803-concurrency-pure-asm-param.ir b/examples/expected/1803-concurrency-pure-asm-param.ir new file mode 100644 index 00000000..bd696436 --- /dev/null +++ b/examples/expected/1803-concurrency-pure-asm-param.ir @@ -0,0 +1,15 @@ + +; Function Attrs: naked noinline nounwind +define internal i64 @add(i64 %0, i64 %1) #0 { +entry: + call void asm sideeffect " add x0, x0, x1\0A ret\0A", ""() + unreachable +} + +; Function Attrs: nounwind +define i32 @main() #1 { +entry: + %call = call i64 @add(i64 40, i64 2) + %ca.tr = trunc i64 %call to i32 + ret i32 %ca.tr +} diff --git a/examples/expected/1803-concurrency-pure-asm-param.stderr b/examples/expected/1803-concurrency-pure-asm-param.stderr new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/examples/expected/1803-concurrency-pure-asm-param.stderr @@ -0,0 +1 @@ + diff --git a/examples/expected/1803-concurrency-pure-asm-param.stdout b/examples/expected/1803-concurrency-pure-asm-param.stdout new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/examples/expected/1803-concurrency-pure-asm-param.stdout @@ -0,0 +1 @@ + diff --git a/src/ir/lower/decl.zig b/src/ir/lower/decl.zig index ca1e1e7f..ec200ac9 100644 --- a/src/ir/lower/decl.zig +++ b/src/ir/lower/decl.zig @@ -2660,13 +2660,19 @@ pub fn lowerFunctionBodyInto(self: *Lowering, fd: *const ast.FnDecl, fid: FuncId const user_param_base: u32 = if (wants_ctx) 1 else 0; if (wants_ctx) self.current_ctx_ref = Ref.fromIndex(0); - for (fd.params, 0..) |p, i| { + // An `abi(.pure)` (naked) function has no frame: its params arrive in ABI + // registers and are read directly by the asm body (e.g. `swap_context`'s + // `from`/`to`). Spilling them to allocas would (a) need a frame and (b) emit + // `store i64 %0, …` — "cannot use argument of naked function" (LLVM verifier). + // Leave the LLVM args declared-but-unused (the verifier allows that); the asm + // references the registers. + if (fd.abi != .pure) for (fd.params, 0..) |p, i| { const pty = self.resolveParamType(&p); const slot = self.builder.alloca(pty); const param_ref = Ref.fromIndex(@intCast(i + user_param_base)); self.builder.store(slot, param_ref); scope.put(p.name, .{ .ref = slot, .ty = pty, .is_alloca = true }); - } + }; // Inbound entry points + abi(.c) sx functions: bind current_ctx_ref // to the static default before any user code runs. @@ -2812,7 +2818,10 @@ pub fn lowerFunction(self: *Lowering, fd: *const ast.FnDecl, name: []const u8, i const user_param_base_lf: u32 = if (wants_ctx_lf) 1 else 0; if (wants_ctx_lf) self.current_ctx_ref = Ref.fromIndex(0); - for (fd.params, 0..) |p, i| { + // `abi(.pure)` (naked): params arrive in registers, read directly by the asm + // body — no frame, no alloca/store (which the LLVM verifier rejects on a + // naked function). See the sibling guard in the other body-lowering path. + if (fd.abi != .pure) for (fd.params, 0..) |p, i| { const pty = self.resolveParamType(&p); // Allocate stack slot for param, store initial value. // Refs 0..N-1 are reserved for function parameters by beginFunction. @@ -2820,7 +2829,7 @@ pub fn lowerFunction(self: *Lowering, fd: *const ast.FnDecl, name: []const u8, i const param_ref = Ref.fromIndex(@intCast(i + user_param_base_lf)); self.builder.store(slot, param_ref); scope.put(p.name, .{ .ref = slot, .ty = pty, .is_alloca = true }); - } + }; // Inbound entry points + abi(.c) sx functions: bind // current_ctx_ref to &__sx_default_context. See companion comment diff --git a/src/ir/lower/generic.zig b/src/ir/lower/generic.zig index fce78e26..5169a0bd 100644 --- a/src/ir/lower/generic.zig +++ b/src/ir/lower/generic.zig @@ -128,7 +128,10 @@ pub fn monomorphizeFunction(self: *Lowering, fd: *const ast.FnDecl, mangled_name defer scope.deinit(); self.scope = &scope; - { + // `abi(.pure)` (naked): no frame — params arrive in registers, read by the + // asm body, never spilled to allocas (the LLVM verifier rejects a naked + // function that uses its arguments). Mirrors the decl-path guard. + if (fd.abi != .pure) { var param_idx: u32 = if (wants_ctx) 1 else 0; for (fd.params) |p| { if (isTypeParamDecl(&p, fd.type_params)) continue;