sx/current/PLAN-FIBERS.md
agra b631590574 fibers B1.0c: support params in abi(.pure) (read from registers)
Adversarial review of B1.0b found a param-bearing abi(.pure) function
emitted invalid LLVM ("cannot use argument of naked function" — loud
verifier error, not silent) because the param-alloca loop spilled the
args to stack slots, which a naked function cannot have.

Fixed forward — this ENABLES the B1.3 context-switch use case rather
than rejecting it: gate the param-alloca loop on fd.abi != .pure in
decl.zig (both body-lowering paths) and generic.zig. A naked function's
args stay in their ABI registers and are read directly by the asm body
(e.g. swap_context reads from/to from x0/x1); the LLVM args are
declared-but-unused, which the verifier allows.

examples/1803-concurrency-pure-asm-param.sx: naked add(a, b) reads x0/x1
(add x0, x0, x1; ret) -> 40 + 2 = 42. aarch64-pinned.

Pack abi(.pure) (variadic + naked — nonsensical, can't read a runtime
pack from registers) left unsupported: pack.zig's param loop is
intertwined with comptime-param/#insert handling, so that case still
hits the loud verifier error. Documented in the checkpoint.

Also updates PLAN-FIBERS / CHECKPOINT-FIBERS for B1.0 completion.
B1.0 complete. Suite green (725/0).
2026-06-20 16:36:31 +03:00

# PLAN-FIBERS — Stream B1 (fibers + Io + M:1 scheduler)
> **STATUS: 🚧 in progress.** B1.0 (`abi(.pure)` codegen) ✅ complete — emits a real LLVM
> `naked` function end-to-end (decl / generic / pack paths; examples 1800/1801/1802 + unit
> test). Next step = **B1.1** (per-fiber `context` root — probe-first, likely library-only).
Carved from [PLAN-POST-METATYPE.md](PLAN-POST-METATYPE.md) Stream B (§B1) + the
design-of-record [../design/execution-evolution-roadmap.md](../design/execution-evolution-roadmap.md)
§4 (async), §7 steps 4–9, §8.1 (risks), §10 (testing). Progress in
[CHECKPOINT-FIBERS.md](CHECKPOINT-FIBERS.md). Stream B2 (channels/cancel/stdlib) is a
separate carve ([PLAN-CHANNELS.md], when reached) and depends on this + atomics (✅).
**Goal:** the colorblind, stackful, **pure-sx** async runtime — fibers behind an `Io`
interface, an M:1 scheduler, blocking + deterministic-sim + event-loop `Io` impls. The
**compiler floor is small and net-new**: make `abi(.pure)` actually emit an LLVM `naked`
function (B1.0), and confirm/close the per-fiber `context` root (B1.1). **Everything
else — the context-switch asm, fiber bootstrap, `mmap` stacks, the scheduler, futures,
the `Io` vtables — is ordinary sx library code** (design §4, §4.4). The irreducible FFI
floor: the per-arch asm context-switch (in `.sx`), syscall `extern`s, and `mmap`.
**Cadence (IMPASSIBLE):** no commit both adds a test AND makes it pass (lock-to-bail, then
flip to green); `zig build && zig build test` green after every step; never regen snapshots
while red; scope regens with `-Dname=examples/NNNN-…sx -Dupdate-goldens` + review the diff.
New corpus category: `18xx` concurrency. On an **unrelated** compiler bug → file
`issues/NNNN`, mark this checkpoint BLOCKED, STOP (CLAUDE.md). The in-session
worker-fix override (delegate a blocker to a worker) applies only with explicit user
authorization.
---
## Design (grounded against the tree)
### B1.0 — `abi(.pure)` codegen (the one genuinely net-new compiler piece in B1)
The design doc spells this `callconv(.naked)`; the **real sx surface is `abi(.pure)`**
written in the postfix slot, `name :: (sig) -> Ret abi(.pure) { asm { … }; }` (cf.
`build_options :: () -> BuildOptions abi(.compiler);` in [build.sx:28](../library/modules/build.sx#L28)).
The sx-facing name is **pure** throughout (field, flag, diagnostics); LLVM's `naked`
function attribute is only the *lowering mechanism* (B1.0b), not what we call the function.
**Grounding (verified — do not re-derive):**
- The `ABI` enum **already carries `.pure`**: `ABI = enum { default, c, compiler, pure }`
([ast.zig:142](../src/ast.zig#L142)), documented "pure / naked function (inline asm
body), no calling-convention prologue/epilogue." So B1.0 is **NOT** "extend the enum."
- `.pure` is **inert today**: [type_resolver.zig:237](../src/ir/type_resolver.zig#L237)
maps `.compiler, .pure → .default` CC, and `emit_llvm` emits **no LLVM `naked`
attribute**. So the net-new work is exactly: **carry `abi == .pure` into the IR
`Function`, emit LLVM's `naked` attr, and skip the implicit-`Context` / prologue
lowering** so the body is just the asm block + its own `ret`.
- The IR `Function` struct ([inst.zig:605](../src/ir/inst.zig#L605)) carries `call_conv`
(default/c) + `is_compiler_domain`, but **no pure flag** — add one (`is_pure: bool`).
- Attribute API is in-tree: `nounwind` is set at
[emit_llvm.zig:1339](../src/ir/emit_llvm.zig#L1339) via
`LLVMGetEnumAttributeKindForName("nounwind", 8)` → `LLVMCreateEnumAttribute(ctx, id, 0)` →
`LLVMAddAttributeAtIndex(func, func_idx_attr /* -1 */, attr)`. The LLVM `naked` attr
is the same shape: `LLVMGetEnumAttributeKindForName("naked", 5)`.
- The `.c` ABI **already skips the implicit ctx** at lowering — `lam.abi == .c` /
`fd.abi == .c` gates (closure.zig:171, [decl.zig:515](../src/ir/lower/decl.zig#L515)).
`.pure` must skip it **too** (a `.pure` fn gets no synthetic `__sx_ctx`, no stack frame,
no prologue — args arrive in ABI registers and are read directly from asm). The
implicit-return machinery (`lowerValueBody`) must also be bypassed: a `.pure` body has no
sx return (the asm rets itself), so lower its statements and cap the block with
`unreachable`.
- **Inline asm already works end-to-end** (lower→emit→JIT): aarch64
([examples/1645](../examples/1645-platform-asm-aarch64-add.sx)), x86_64
([examples/1651](../examples/1651-platform-asm-x86-syscall-write.sx)), global asm, JIT
([1653](../examples/1653-platform-asm-global-jit.sx)). `emitInlineAsm` /
`LLVMGetInlineAsm` at [ops.zig:915](../src/backend/llvm/ops.zig#L915). The `.pure` body
is a single asm block reusing this path.
**`.pure` vs. `.c` (design §4.6 context-switch note):** a `.c` epilogue restores SP from the
frame; a context switch deliberately makes SP-in ≠ SP-out, so the `.c` epilogue would
restore from the *wrong* stack. `.pure` = no prologue/epilogue/frame — the asm emits its
own `ret`. This is *why* the switch must be `.pure`, not `.c`.
**Snapshot story (per the atomics precedent):** a `.pure` fn's *body is raw per-arch asm*
(it can't be portable — that's the point), while LLVM's `naked` attribute text is
arch-invariant. **B1.0a** (lock) needs only **one host example** locked to the emit bail —
the bail fires at the function level *before* any asm/instruction selection, so it is
host-independent (no `.build` target pin). **B1.0b** (green) adds emission, pins that
example aarch64 (`.build {"target": "aarch64-macos"}`, end-to-end on a matching host,
ir-only on a mismatch), and adds an x86_64 cross sibling — mirroring the existing asm
corpus split (1645 aarch64 / 1651 x86). The ir-only `.ir` (only producible once emission
lands in B1.0b) asserts the `naked` attribute + the asm body. State loudly: **the `.ir`
proves the `naked` keyword + asm emitted, NOT that any hand-written register save/restore
is correct** — that is the B1.3 switch-stress harness's job, never the corpus's.
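
As a rough illustration of what the ir-only gate can and cannot assert, the sketch below checks a textual LLVM IR dump for two things only: the function's attribute group contains `naked`, and its body contains an inline-asm call. It is Python, not part of the tree; the helper name `ir_proves_naked_asm`, the sample IR, and the assumption that a `.ir` golden is plain LLVM textual IR are all hypothetical.

```python
import re


def ir_proves_naked_asm(ir_text: str, fn_name: str) -> bool:
    """True if `fn_name` is defined with a `naked` attribute group and its
    body contains an inline-asm call. Says nothing about whether the asm
    itself (register save/restore, stack handling) is correct."""
    # Match `define ... @fn_name(...) #N {` and capture group number + body.
    m = re.search(
        r"define\b[^\n]*@" + re.escape(fn_name) + r"\([^\n]*#(\d+)\s*\{\n(.*?)\n\}",
        ir_text,
        re.DOTALL,
    )
    if m is None:
        return False
    group, body = m.group(1), m.group(2)
    # Look up the matching `attributes #N = { ... }` line.
    attrs = re.search(r"attributes #" + group + r" = \{([^}]*)\}", ir_text)
    if attrs is None:
        return False
    has_naked = "naked" in attrs.group(1).split()
    has_asm = re.search(r"\bcall\b[^\n]*\basm\b", body) is not None
    return has_naked and has_asm


# Hypothetical IR for a trivial naked function (illustrative, not a real golden).
SAMPLE_IR = '''define i64 @answer() #0 {
entry:
  call void asm sideeffect "mov x0, #42\\0Aret", ""()
  unreachable
}

attributes #0 = { naked noinline nounwind }
'''

print(ir_proves_naked_asm(SAMPLE_IR, "answer"))
```

A check of this shape passes for any asm text at all, which is the point of the caveat above: correctness of the switch belongs to the B1.3 harness.
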
### B1.1 — per-fiber `context` root (grounding says this is SMALL, likely library-only)
**Grounding (verified — closes the design doc's open sizing question):**
- `context` is an **implicit `*Context` parameter** (`__sx_ctx`, slot 0), threaded through
every default-conv sx call ([lower.zig:259](../src/ir/lower.zig#L259)) — **not raw TLS**.
Inside a function `current_ctx_ref = Ref.fromIndex(0)` (the param) → it **rides the fiber
stack frame for free**.
- `push Context.{…}` allocates the new `Context` with a **stack `alloca`** and rebinds
`current_ctx_ref` to that slot ([stmt.zig:1263](../src/ir/lower/stmt.zig#L1263)) — "No
global, no walk." So **push frames are fiber-local for free**.
- The **only shared root** is the `__sx_default_context` **global**, bound at
entry-points / `abi(.c)` fns *before any user code runs*
([decl.zig:2667](../src/ir/lower/decl.zig#L2667), :2815).
⇒ The design doc's "lower as swappable indirection, never raw TLS" guards a **non-problem**
(confirmed). The **real, now-sized** B1.1 work is purely a **library convention**: a
freshly-`spawn`ed fiber must take its root `Context` from the **spawner's snapshot** (passed
as the fiber-entry fn's `__sx_ctx` slot-0 arg by the spawn trampoline), **not** the
`__sx_default_context` global. That is sx-side (the trampoline already controls slot 0) —
**expected to be ZERO compiler change.** B1.1's first action is a probe confirming this; if
a fiber genuinely re-reads the global root mid-stack (it should not — entry binds once),
*then* and only then is there a compiler obligation. **Ground the probe before sizing any
compiler work.** Prerequisite of B1.3 (a fiber needs a valid root before it switches).
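
The convention above can be modelled in a few lines. This is a Python sketch, not sx: `Context`, `DEFAULT_CONTEXT`, `spawn`, and `current_allocator` are hypothetical stand-ins for the implicit slot-0 parameter, the `__sx_default_context` global, the spawn trampoline, and an ordinary default-convention callee.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    allocator: str  # stand-in for the Allocator protocol at field 0


# Stand-in for the `__sx_default_context` global: the only shared root.
DEFAULT_CONTEXT = Context(allocator="default")


def current_allocator(ctx: Context) -> str:
    # Every default-convention call receives ctx as its slot-0 parameter.
    return ctx.allocator


def spawn(ctx: Context, entry):
    """Hypothetical spawn trampoline: snapshot the spawner's ctx now and
    pass it to the fiber entry as its slot-0 argument when it runs."""
    snapshot = ctx
    return lambda: entry(snapshot)


def main(ctx: Context):
    # Models `push Context.{...}`: a new Context local to this frame.
    pushed = replace(ctx, allocator="arena")
    return spawn(pushed, current_allocator)


fiber = main(DEFAULT_CONTEXT)
print(fiber())  # → arena (the spawner's pushed Context, not the global)
```

The fiber entry never touches `DEFAULT_CONTEXT`; if the B1.1 probe finds a path in real sx that does the equivalent of re-reading the global mid-stack, that is the compiler gap to file.
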
### B1.2–B1.5 — pure sx over the primitives (design §4)
- **B1.2 (A1):** `Io` interface + `context.io` + `Future` + `cancel()` — a protocol/vtable
threaded exactly like `Allocator` (which already lives at `Context` field 0; see
`allocViaContext` [call.zig:1214](../src/ir/lower/call.zig#L1214)). `Io` becomes another
`Context` field. No compiler change — protocols + context already carry it.
- **B1.3 (A2):** the fiber runtime — naked context-switch asm (per-arch), bootstrap, `mmap`
stacks **with mandatory guard pages**. All sx. **Highest corruption risk in the stream**
(§8.1.1) and **untestable by the deterministic `Io`** (which tests *scheduling*, not the
*switch*). Its **first deliverable, before the scheduler AND the deterministic `Io`**: a
standalone **2-fiber ping-pong switch-stress harness** (§10.7) — scribble every
callee-saved register + a stack canary before each suspend, deep/recursive chains, verify
all survive post-resume. This harness — not B1.4 — is A2's correctness gate.
- **B1.4 (A3):** `Io` impls in order **blocking → deterministic-sim (KEYSTONE) → event-loop**
(kqueue/epoll/io_uring). Build the deterministic `Io` right after blocking; **calibrate it
against blocking `Io`** before trusting it to gate everything async (§8.1.3, §10.7) — a
deterministic-but-wrong scheduler snapshots garbage. (Open, deferred: the event loop does
**not** yet cooperate with a platform UI run loop — CFRunLoop/ALooper; that's a §6
app-target gap, out of B1.)
- **B1.5 (A5·M:1):** the single-thread scheduler — validates the whole colorblind stack
end-to-end. `18xx` corpus runs under the deterministic `Io`, asserting a **program-emitted
ordering contract** (sequence markers), not raw interleaving, so scheduler-policy tweaks
don't churn every snapshot.
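
The B1.5 idea of asserting a program-emitted ordering contract instead of the raw interleaving can be sketched as follows. This is a Python model, not sx: generators stand in for stackful fibers (each `yield` plays the role of a suspend), and `run_m1`, `producer`, `consumer`, and `check_contract` are hypothetical names invented for illustration.

```python
from collections import deque


def run_m1(fibers):
    """Single-threaded round-robin scheduler over generator 'fibers'.
    Returns the sequence markers the program emitted, in order."""
    trace = []
    ready = deque(f(trace.append) for f in fibers)
    while ready:
        fiber = ready.popleft()
        try:
            next(fiber)          # resume until the next suspend point
            ready.append(fiber)  # still alive: requeue at the back
        except StopIteration:
            pass                 # fiber ran to completion
    return trace


def producer(mark):
    mark("produce:1")
    yield
    mark("produce:2")


def consumer(mark):
    mark("consume:start")
    yield
    mark("consume:end")


def check_contract(trace):
    """Ordering contract: per-fiber order holds and produce:1 precedes
    consume:end. Deliberately silent about the exact interleaving."""
    pos = {marker: i for i, marker in enumerate(trace)}
    return (
        pos["produce:1"] < pos["produce:2"]
        and pos["consume:start"] < pos["consume:end"]
        and pos["produce:1"] < pos["consume:end"]
    )


print(run_m1([producer, consumer]))
```

Running `[producer, consumer]` and `[consumer, producer]` yields different raw traces, yet both satisfy `check_contract`, so a change in scheduling policy would not force a snapshot regen under this style of assertion.
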
### Files the compiler floor touches (B1.0 only; B1.1–B1.5 are library + tests)
B1.0 (`.pure`) forces these plumbing sites:
- [ast.zig:142](../src/ast.zig#L142) — `ABI.pure` (exists; reference only).
- [inst.zig:605](../src/ir/inst.zig#L605) — add `is_pure: bool = false` to `Function`.
- [decl.zig](../src/ir/lower/decl.zig) — set `is_pure` from `fd.abi == .pure`; gate the
implicit-ctx off for `.pure` in `funcWantsImplicitCtx` (mirror the `.c` skip at
decl.zig:515) and bypass `lowerValueBody` for `.pure` bodies (lower statements + cap with
`unreachable`, in both body-lowering paths) — a `.pure` fn binds no ctx and has no sx
return.
- [type_resolver.zig:237](../src/ir/type_resolver.zig#L237) — leave CC `.default` (a `.pure`
fn-pointer type has no CC of its own; pureness is a decl-level emit attribute).
- [emit_llvm.zig:402](../src/ir/emit_llvm.zig#L402) Pass 2 — **B1.0a:** bail loudly when
`func.is_pure` (build-gating). **B1.0b:** instead emit LLVM's `naked` attr (shape per
`nounwind` at emit_llvm.zig:1339) + the asm-only body (no prologue).
- Any `switch` over `.op` or `Function` fields that the Zig build flags after these changes: let the build tell you.
---
## Phases (xfail→green steps)
### B1.0 — `abi(.pure)` codegen — ✅ COMPLETE
- **B1.0a (lock) — ✅ DONE.** Carried `abi == .pure` into IR `Function.is_pure`; threaded
through `decl.zig` (`funcWantsImplicitCtx` skips `.pure` like `.c`; all body-lowering paths
bypass `lowerValueBody` for `.pure`, lowering the asm body + capping with `unreachable`) +
generic.zig + pack.zig; `emit_llvm` Pass 2 bailed loudly on `func.is_pure`. Locked by
`examples/1800-concurrency-pure-asm.sx` + the generic regression (review-found gap).
- **B1.0b (green) — ✅ DONE.** `emit_llvm` declaration pass adds LLVM `naked` + `noinline` +
`nounwind` for `func.is_pure` and skips `frame-pointer=all` (incompatible with a frameless
function); Pass 2 emits the body normally (`naked` ⇒ verbatim asm + own `ret`, no
prologue). `1800` pinned aarch64 → exit 42 + `.ir`; `1801-concurrency-pure-generic.sx`
(renamed from `-bail`) proves the generic path emits a naked body (exit 42);
`1802-concurrency-pure-asm-x86.sx` x86_64 cross sibling (ir-only here, `.ir` locks `naked`
+ `movl $42, %eax`). Unit test `emit: abi(.pure) function gets the naked attribute` asserts
`naked` present + `frame-pointer` absent. Suite green (724/0).
- **B1.0c (review-hardening) — ✅ DONE.** A param-bearing `.pure` fn emitted invalid LLVM
(loud verifier error). Gated the param-alloca loop on `fd.abi != .pure` (decl.zig both
paths + generic.zig) so a naked fn's args stay in registers (read by the asm body) — this
*enables* B1.3's `swap_context(from, to)`. Locked by `1803-concurrency-pure-asm-param.sx`.
Pack `.pure` (variadic + naked, nonsensical) left unsupported → loud verifier error.
### B1.1 — per-fiber `context` root (probe-first; likely zero compiler change)
- **B1.1a (probe + lock)** — write a probe (`.sx-tmp/`) + an `18xx` example that snapshots a
`Context` (e.g. a custom allocator pushed via `push Context`) and confirms it is carried by
slot 0 across an ordinary call chain (it is — grounded). If the probe shows a fiber-entry
trampoline can pass a snapshotted ctx as slot 0 with **no compiler change**, this phase is a
**library convention doc** (record it in the checkpoint) + a corpus example locking the
behavior. If (and only if) the probe surfaces a real compiler gap (a path re-reads
`__sx_default_context` mid-stack), file it as a step here and size it then.
### B1.2 — A1: `Io` interface + `context.io` + `Future` + `cancel()` API
Library-only. `Io` as a protocol added to `Context` (mirror `Allocator`). `Future`/`cancel`
API surface. xfail→green via an `18xx` example exercising the blocking `Io` default (real
suspend lands in B1.3). No compiler change expected; if a protocol-in-context gap appears,
file it.
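
One possible shape for this API surface, sketched in Python rather than sx. Everything here is an assumption made for illustration: the method names `start`/`wait`, the `Future` fields, and the two-field `Context` are not taken from the design doc. The point is only that `Io` rides in `Context` the same way the allocator does, and that calling code does not care which implementation is installed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol


class Future:
    """Result slot for one operation (hypothetical shape)."""

    def __init__(self) -> None:
        self.done = False
        self.cancelled = False
        self.value: Any = None

    def resolve(self, value: Any) -> None:
        # A cancelled future ignores late completions.
        if not self.cancelled:
            self.value, self.done = value, True

    def cancel(self) -> bool:
        """Request cancellation; True only if it took effect."""
        if self.done:
            return False
        self.cancelled = True
        return True


class Io(Protocol):
    def start(self, op: Callable[[], Any]) -> Future: ...
    def wait(self, fut: Future) -> Any: ...


class BlockingIo:
    """Blocking default: the operation runs to completion inside start()."""

    def start(self, op: Callable[[], Any]) -> Future:
        fut = Future()
        fut.resolve(op())
        return fut

    def wait(self, fut: Future) -> Any:
        if not fut.done:
            raise RuntimeError("future was cancelled or never completed")
        return fut.value


@dataclass(frozen=True)
class Context:
    allocator: str  # stand-in for the existing Allocator field
    io: Io          # the new field, threaded the same way


def read_config(ctx: Context) -> str:
    # Colorblind caller: identical code whichever Io is installed in ctx.
    fut = ctx.io.start(lambda: "key=value")
    return ctx.io.wait(fut)


print(read_config(Context(allocator="default", io=BlockingIo())))
```

Under the blocking default a future is already complete when `start` returns, so `cancel()` on it reports `False`; real suspension only appears once B1.3 lands.
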
### B1.3 — A2: fiber runtime (naked switch + bootstrap + guarded `mmap` stacks)
- **B1.3a (switch-stress harness FIRST)** — the standalone 2-fiber ping-pong harness
(register + canary survival, deep chains) per §10.7. This is A2's gate and predates the
scheduler + deterministic `Io`. Arch-gated run test (matching-host run; ir-only elsewhere).
- **B1.3b** — fiber bootstrap + `mmap` stacks **with guard pages** (mandatory — §8.1.1).
- (Cadence inside B1.3 follows lock→green per sub-piece; the asm switch is the highest-risk
artifact — review adversarially, with a worker if authorized.)
### B1.4 — A3: `Io` impls (blocking → deterministic-sim KEYSTONE → event-loop)
Blocking first; then the deterministic-sim `Io`, **calibrated against blocking** before any
`18xx` test trusts it; then the event loop. The deterministic `Io` is the test harness for
*all* of B1.5 + Stream B2.
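
The calibration step can be pictured as running one program under both implementations and comparing what it observed. The sketch below is a Python model under loud assumptions: `BlockingIo`, `DeterministicIo`, `program`, and `calibrated` are hypothetical, and to stay fast and reproducible the "blocking" reference advances a counter instead of really sleeping.

```python
import heapq


class BlockingIo:
    """Reference: each operation completes immediately, in call order."""

    def __init__(self) -> None:
        self.clock = 0

    def sleep(self, ticks: int) -> None:
        self.clock += ticks  # stand-in for a real blocking wait

    def now(self) -> int:
        return self.clock


class DeterministicIo:
    """Simulated: sleeps become timers on a virtual clock, fired in
    (deadline, sequence) order so every run is reproducible."""

    def __init__(self) -> None:
        self.clock = 0
        self._seq = 0
        self._timers = []

    def sleep(self, ticks: int) -> None:
        heapq.heappush(self._timers, (self.clock + ticks, self._seq))
        self._seq += 1
        # Single fiber in this sketch: drive the loop until the timer fires.
        deadline, _ = heapq.heappop(self._timers)
        self.clock = deadline

    def now(self) -> int:
        return self.clock


def program(io):
    """A sequential program that records what it observed."""
    trace = []
    for name, ticks in [("a", 5), ("b", 3), ("c", 0)]:
        io.sleep(ticks)
        trace.append((name, io.now()))
    return trace


def calibrated(prog, reference_io, candidate_io) -> bool:
    # The candidate is trusted only if it reproduces the reference trace.
    return prog(reference_io) == prog(candidate_io)


print(calibrated(program, BlockingIo(), DeterministicIo()))
```

A simulated `Io` that, say, ignored `ticks` would still be perfectly deterministic and would still fail `calibrated`, which is the failure mode the calibration gate exists to catch.
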
### B1.5 — A5: M:1 scheduler
End-to-end validation of the colorblind stack. `18xx` corpus under the deterministic `Io`,
asserting program-emitted ordering contracts.
---
## Gates
- **B1.0:** unit `emit_llvm.test.zig` (the `naked` attr present on a `.pure` fn); two
arch-gated examples (aarch64 + x86_64) run end-to-end on a matching host, ir-only on a
mismatch (assert `naked` + asm in `.ir`). **OUT of corpus scope, stated loudly:** the
*correctness* of any hand-written register save/restore — that's the B1.3 stress harness.
- **B1.1:** an `18xx` example locking context-carried-by-slot-0 behavior + a checkpoint note
on the spawn-trampoline convention.
- **B1.3:** the **switch-stress harness is A2's gate** (register/canary survival — §10.7),
NOT a run/snapshot test; plus arch-gated run tests.
- **B1.4:** deterministic `Io` **calibrated** against blocking `Io` (§8.1.3) before trusting
it; `18xx` under the deterministic `Io`.
- **B1.5:** `18xx` ordering-contract snapshots under the deterministic `Io`.
## Kickoff prompt (B1.0b — paste into a fresh session)
> Implement Stream B1 step **B1.0b** (`abi(.pure)` real emission) per
> `current/PLAN-FIBERS.md`. Verify `zig build && zig build test` is green first (B1.0a is
> already landed: `Function.is_pure` plumbed, `decl.zig` skips ctx + bypasses implicit-return
> for `.pure`, `emit_llvm` Pass 2 bails loudly, `examples/1800-concurrency-pure-asm.sx`
> locked to the bail). Then: (1) in `src/ir/emit_llvm.zig` Pass 2 (~line 402), REPLACE the
> `func.is_pure` bail with real emission — set LLVM's `naked` attribute on the function
> (`LLVMGetEnumAttributeKindForName("naked", 5)` → `LLVMCreateEnumAttribute(ctx, id, 0)` →
> `LLVMAddAttributeAtIndex(llvm_func, -1, attr)`; shape per the `nounwind` set at
> emit_llvm.zig:1339) and emit the `.pure` body as its asm block only, no prologue/epilogue
> (the body already lowers to the inline-asm op + an `unreachable` terminator). (2) Pin
> `examples/1800-concurrency-pure-asm.sx` aarch64 with a `.build` sidecar
> `{"target":"aarch64-macos"}`; on this aarch64 host it runs end-to-end (exit 42), capture
> `.ir` + regen (`-Dname=examples/1800-concurrency-pure-asm.sx -Dupdate-goldens`), review the
> diff (assert the `.ir` shows the `naked` attr + `mov x0, #42` / `ret`, NO stray error
> text). (3) Add `examples/1802-concurrency-pure-asm-x86.sx` (x86_64 body, `.build
> {"target":"x86_64-linux"}`, ir-only on this host — requires its `.ir`, now producible).
> (4) Add a unit test in `src/ir/emit_llvm.test.zig` asserting the `naked` attribute is
> present on an `abi(.pure)` function. Confirm `zig build test` green, commit. NOTE: the
> `.ir` proves the keyword + asm emitted, NOT register-save correctness (that's the B1.3
> switch-stress harness). If you hit an UNRELATED compiler bug, file `issues/NNNN`, mark
> `CHECKPOINT-FIBERS.md` BLOCKED, and STOP.