sx/current/CHECKPOINT-FIBERS.md
agra 40424df1b8 fibers B1.0a: close generic/pack is_pure gap (review)
Adversarial review of dd363ca found is_pure was set only at the two
declareFunction decl sites. Generic monomorphization (generic.zig) and
pack expansion (pack.zig) create the IR Function via a different path
and left is_pure false, so a generic abi(.pure) instance bypassed the
emit bail and silently shipped a framed body — it returned 42 but
leaked the prologue's stack adjustment (the exact SP-in != SP-out
corruption the lock exists to prevent).

Both paths now set is_pure and route .pure bodies through the asm-only
+ unreachable cap, mirroring the decl path. Locked by
examples/1801-concurrency-pure-generic-bail.sx (generic .pure reaches
the loud bail).

The review's other CRITICAL (a .pure lambda) is a false positive:
isLambda's return-type scan (parser.zig:3652) breaks on the abi
keyword, so a .pure lambda is unparseable and parseLambda's abi
handling is never reached. Latent isLambda/parseLambda inconsistency,
not a B1 concern.

Suite green (723/0).
2026-06-20 14:45:29 +03:00


# CHECKPOINT-FIBERS — Stream B1 (fibers + Io + M:1 scheduler)
Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step at a time,
per the cadence rule). New corpus category: `18xx` concurrency.
## Last completed step
**B1.0a (`abi(.pure)` lock commit) — DONE.** Plumbed the `is_pure` flag end-to-end and made
emit bail loudly:
- IR `Function.is_pure: bool` ([inst.zig](../src/ir/inst.zig)) — set from `fd.abi == .pure`
at both `declareFunction` decl sites ([decl.zig](../src/ir/lower/decl.zig)).
- `funcWantsImplicitCtx` returns false for `.pure` (mirrors the `.c` skip, decl.zig:515) —
a `.pure` fn gets no synthetic `__sx_ctx`.
- Both body-lowering paths bypass `lowerValueBody` for `.pure`: lower the asm body as
statements + cap with `unreachable` (a `.pure` body has no sx return — the asm rets
itself; this avoids the implicit-return diagnostic).
- `emit_llvm` Pass 2 (~line 402) **bails loudly** when `func.is_pure`
("`abi(.pure)` function '…' LLVM emission not yet implemented") via `comptime_failed`
(driver aborts nonzero) — NOT a framed body (whose epilogue would corrupt a context
switch's SP-in ≠ SP-out).
- `examples/1800-concurrency-pure-asm.sx` — one host example (no `.build` pin; the bail is
host-independent, fires before any asm/instruction selection), locked to the bail snapshot
(exit 1, empty stdout, the loud diagnostic on stderr).
- **Adversarial review (closed in-step):** the review caught that `is_pure` was set ONLY at
the two `declareFunction` decl sites — generic monomorphization
([generic.zig](../src/ir/lower/generic.zig)) and pack expansion
([pack.zig](../src/ir/lower/pack.zig)) create the `Function` via a different path and left
`is_pure` false, so a generic `.pure` instance silently shipped a framed body (returned 42
but leaked the prologue's stack adjustment — the exact corruption the lock prevents). Both
paths now set `is_pure` + route `.pure` bodies through the asm-only + `unreachable` cap.
Locked by `examples/1801-concurrency-pure-generic-bail.sx`. (The review's other CRITICAL —
a `.pure` *lambda* — is a **false positive**: `isLambda`'s return-type scan
(parser.zig:3652) breaks on the `abi` keyword, so a `.pure` lambda is unparseable and
`parseLambda`'s abi-handling is never reached. Latent `isLambda`/`parseLambda`
inconsistency, not a B1 concern.)
- **Naming:** the sx-facing name is **`pure`** throughout (field, diagnostic); LLVM's
`naked` attribute is only the B1.0b lowering mechanism (per user direction — don't call
the function "naked").
- `zig build && zig build test` green: **723 ran, 0 failed**.
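
The gap the review found can be modeled in a few lines. The Python sketch below is a toy, not the compiler's Zig code: `FnDecl`, `make_function`, `declare_function`, `monomorphize`, and `emit` are hypothetical stand-ins for the decl, generic, and emit stages, and `switch_to` is an invented function name. The actual fix sets `is_pure` in each path; deriving it in one shared helper is an assumption of this sketch, shown as one way to keep every creation path consistent so the emit bail fires regardless of how the `Function` was created.

```python
from dataclasses import dataclass


@dataclass
class FnDecl:
    """Toy stand-in for a parsed function declaration."""
    name: str
    abi: str  # "default", "c", or "pure"


@dataclass
class Function:
    """Toy stand-in for the IR Function; only the flag under discussion."""
    name: str
    is_pure: bool


class EmitBail(Exception):
    """Raised instead of emitting a framed body for a .pure function."""


def make_function(decl: FnDecl, name: str) -> Function:
    # Single place where is_pure is derived (an assumption of this sketch).
    return Function(name=name, is_pure=(decl.abi == "pure"))


def declare_function(decl: FnDecl) -> Function:
    # Ordinary declaration path.
    return make_function(decl, decl.name)


def monomorphize(decl: FnDecl, type_arg: str) -> Function:
    # Generic-instance path: a different route to a Function, same flag logic.
    return make_function(decl, f"{decl.name}<{type_arg}>")


def emit(func: Function) -> str:
    # Bail loudly rather than ship a framed body for a .pure function.
    if func.is_pure:
        raise EmitBail(
            f"abi(.pure) function '{func.name}' LLVM emission not yet implemented"
        )
    return f"define @{func.name} (framed body)"


# A generic .pure instance now reaches the same bail as a plain declaration.
generic_pure = monomorphize(FnDecl("switch_to", "pure"), "i64")
try:
    emit(generic_pure)
except EmitBail as err:
    print(err)
```

If `monomorphize` built the `Function` itself and forgot the flag, `emit` would return the framed body string without complaint, which is the silent failure mode that `1801` now locks against.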
## Current state
Stream A (atomics) is feature-complete (✅) and unblocks B2-channels. Stream B1: **B1.0a
landed**; the `abi(.pure)` ABI is plumbed but emit deliberately bails (B1.0b flips it to
real LLVM `naked` emission). No fibers/Io/scheduler code yet. Grounded floor facts:
- `context` is already an implicit `*Context` param (slot 0) + `push Context` is a stack
`alloca` → **fiber-local for free**. Only shared root = `__sx_default_context` global
(entry-point bind). B1.1 expected to be a **library convention** (spawn trampoline
snapshots the spawner's ctx into slot 0), **likely zero compiler change** — probe first.
- Inline asm works end-to-end (lower→emit→JIT, aarch64 + x86_64) — the `.pure` body reuses it.
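
Why an implicit slot-0 parameter is fiber-local can be illustrated with a toy model. In the Python sketch below, generators stand in for fibers, `Context` is a hypothetical one-field record, and `spawn` plays the role of the spawn trampoline that snapshots the spawner's context. None of these names come from the sx library, and the B1.1 probe has not yet confirmed the real behavior. The point is only that a context carried as an argument, with each push creating a new local value, cannot leak between interleaved fibers.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class Context:
    """Hypothetical context record; the real Context has other fields."""
    allocator: str


def worker(ctx: Context, tag: str, log: List[str]) -> Iterator[None]:
    # "push Context": a new local value derived from the incoming ctx.
    pushed = replace(ctx, allocator=f"arena-{tag}")
    yield  # suspend; another fiber runs here
    # After resuming, the local still holds this fiber's own pushed context.
    log.append(f"{tag}:{pushed.allocator}")


def spawn(spawner_ctx: Context, body: Callable[..., Iterator[None]], *args) -> Iterator[None]:
    # Trampoline: hand the spawner's ctx to the body as its first argument.
    return body(spawner_ctx, *args)


def run_round_robin(fibers: List[Iterator[None]]) -> None:
    # Minimal M:1 loop: step each fiber in turn until all have finished.
    pending = list(fibers)
    while pending:
        fiber = pending.pop(0)
        try:
            next(fiber)
            pending.append(fiber)
        except StopIteration:
            pass


root = Context(allocator="default")
log: List[str] = []
run_round_robin([spawn(root, worker, "a", log), spawn(root, worker, "b", log)])
print(log)             # ['a:arena-a', 'b:arena-b']
print(root.allocator)  # default
```

Both fibers suspend after their push and resume later, yet each sees only its own pushed value and the root context is unchanged. A global or thread-local context would not give this isolation on a single thread.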
## Next step
**B1.0b (`abi(.pure)` real emission)** — per PLAN-FIBERS.md "Phases → B1.0 → B1.0b" and the
kickoff prompt at the bottom of that file. Replace the emit bail with LLVM's `naked`
attribute + asm-only body; pin `1800` aarch64 (run end-to-end → exit 42, capture `.ir`); add
x86_64 cross sibling `1802` (ir-only); add an `emit_llvm.test.zig` unit test asserting the
`naked` attr. Separate commit (cadence rule — B1.0a locked, B1.0b greens).
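
As a rough illustration of what asserting the `naked` attribute involves, and not the planned `emit_llvm.test.zig` test itself, the Python sketch below scans textual LLVM IR for a function definition and checks whether `naked` appears either inline or in a referenced attribute group. The helper `has_naked_attr`, the function names `switch_to` and `framed`, and the sample IR are all assumptions made for the example.

```python
import re


def has_naked_attr(ir_text: str, func_name: str) -> bool:
    """Return True if func_name is defined with the naked attribute.

    Handles an inline attribute on the define line as well as a reference
    to an attribute group (#N) declared elsewhere in the module.
    """
    define = re.search(
        rf"^define\b[^\n]*@{re.escape(func_name)}\s*\([^\n]*\)([^\n{{]*)\{{",
        ir_text,
        re.MULTILINE,
    )
    if define is None:
        return False
    trailer = define.group(1)  # text between the parameter list and "{"
    if re.search(r"\bnaked\b", trailer):
        return True
    for group_id in re.findall(r"#(\d+)", trailer):
        group = re.search(
            rf"^attributes #{group_id} = \{{([^}}]*)\}}", ir_text, re.MULTILINE
        )
        if group and re.search(r"\bnaked\b", group.group(1)):
            return True
    return False


# Hand-written sample module, not compiler output.
SAMPLE_IR = """\
define i64 @switch_to() #0 {
  call void asm sideeffect "ret", ""()
  unreachable
}

define i64 @framed() #1 {
  ret i64 42
}

attributes #0 = { naked nounwind }
attributes #1 = { nounwind }
"""

print(has_naked_attr(SAMPLE_IR, "switch_to"))  # True
print(has_naked_attr(SAMPLE_IR, "framed"))     # False
```

A text scan like this only shows that the attribute was emitted. As the snapshot-scope decision notes, it says nothing about register-save correctness.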
## Known issues / capability gaps
- **Orthogonal (not a B1 blocker):** default VALUES for comptime params don't bind on
generic-struct methods (free-fn defaults DO work) — inherited from Stream A. Only matters
if a B2 lib type wants a defaulted comptime param; atomics/fibers require explicit
arguments, so they are unaffected.
- **Issue 0144 (open, independent):** calling an unrecognized bodiless `#builtin` silently
returns 0 / exit 0 — a silent-fallback footgun in the generic builtin-call path. Filed;
leave for its own fix session unless prioritized. Not a B1 blocker.
- **Deferred design gap (documented):** the B1.4 event-loop `Io` does not yet cooperate with
a platform UI run loop (CFRunLoop/NSRunLoop/ALooper); pinning gives thread-affinity, not
run-loop integration — a §6 app-target concern, out of B1 scope.
## Decisions (Stream B1 specifics; surface locked in design §4 / §4.6)
- **The async runtime is sx LIBRARY code.** The compiler provides only: the general
primitives (inline asm ✅, `abi(.pure)` naked [B1.0], atomics ✅) + fiber-safe codegen
(`context` already fiber-local — B1.1). Schedulers, fibers, channels, futures, `Io`
vtables, `mmap` stacks are all sx.
- **`abi(.pure)` is the real spelling of the design's `callconv(.naked)`** — postfix slot,
`name :: (sig) -> Ret abi(.pure) { asm { … }; }`. B1.0 = carry it into IR + emit LLVM
`naked` + skip prologue/ctx (mirror the existing `.c` skip), NOT extend the enum (it's
already there, just inert).
- **`.pure` ≠ `.c`:** a `.c` epilogue would restore SP from the wrong stack across a context
switch (SP-in ≠ SP-out by design). `.pure` = no prologue/epilogue/frame; the asm emits its
own `ret`. This is why the switch must be `.pure`.
- **Naming:** sx-facing name is **`pure`** (field `is_pure`, the diagnostic). LLVM's `naked`
function attribute is only the lowering mechanism (B1.0b) — do not call the function
"naked" (user direction).
- **B1.0 snapshot scope:** a `.pure` body is raw per-arch asm; LLVM's `naked` attr text is
arch-invariant. **B1.0a** = one host example locked to the emit bail (host-independent —
fires before instruction selection; no `.build` pin). **B1.0b** = pin aarch64 + add an
x86_64 cross sibling (`.build` target-gated, ir-only on mismatch), like the asm corpus
split. The `.ir` proves the `naked` attr + asm emitted, NOT register-save correctness
(that's B1.3's stress harness).
- **B1.1 grounded as library-only (pending probe):** push frames are stack-`alloca`'d and
the implicit ctx rides slot 0, so a spawn trampoline can pass a snapshotted ctx with no
compiler change. The design doc's "never raw TLS" guards a non-problem (context is not
TLS). Probe to confirm before sizing any compiler work.
- **Test keystones (design §10):** the **B1.3 switch-stress harness** gates the
context-switch (the one piece the deterministic `Io` can't test — §8.1.1, §10.7); the
**B1.4 deterministic-sim `Io`** (calibrated against blocking `Io` — §8.1.3) gates all
scheduling tests. Both must exist + be calibrated before the async tests they gate are
trusted. `18xx` asserts program-emitted ordering contracts, not raw interleaving.
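
The distinction between an ordering contract and a raw interleaving can be sketched with a seeded toy scheduler. The Python below is an assumption-laden model, not the B1.4 deterministic-sim `Io` (which does not exist yet): `producer`, `run_seeded`, `per_fiber_order_holds`, and `trace` are hypothetical names, and generators stand in for fibers. The scheduler picks the next runnable fiber from a seeded RNG, so a given seed replays identically, while the check asserts only what the program promises: each fiber's own events appear in order.

```python
import random
from typing import Dict, Iterator, List, Tuple

Event = Tuple[str, int]


def producer(tag: str, count: int, out: List[Event]) -> Iterator[None]:
    # Emit (tag, 0), (tag, 1), ... and suspend after each one.
    for i in range(count):
        out.append((tag, i))
        yield


def run_seeded(fibers: List[Iterator[None]], seed: int) -> None:
    # Deterministic given the seed: the RNG decides which fiber steps next.
    rng = random.Random(seed)
    runnable = list(fibers)
    while runnable:
        index = rng.randrange(len(runnable))
        try:
            next(runnable[index])
        except StopIteration:
            runnable.pop(index)


def per_fiber_order_holds(events: List[Event]) -> bool:
    # Contract: within each tag, sequence numbers are 0, 1, 2, ... in order.
    last: Dict[str, int] = {}
    for tag, i in events:
        if i != last.get(tag, -1) + 1:
            return False
        last[tag] = i
    return True


def trace(seed: int) -> List[Event]:
    out: List[Event] = []
    run_seeded([producer("a", 3, out), producer("b", 3, out)], seed)
    return out


# The contract holds for every seed; the exact interleaving is not asserted.
assert all(per_fiber_order_holds(trace(seed)) for seed in range(20))
# The same seed replays the same schedule.
assert trace(7) == trace(7)
```

A test that compared `trace(seed)` against one fixed list would pin the scheduler's choices rather than the program's guarantees, and would break on any scheduling change that still honors the contract.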
## Log
- **carve** — wrote PLAN-FIBERS.md + CHECKPOINT-FIBERS.md. Grounded the B1 compiler floor:
`ABI.pure` inert (type_resolver.zig:237), IR `Function` has no naked flag (inst.zig:605),
attribute API pattern (emit_llvm.zig:1339 nounwind), `.c` ctx-skip precedent
(decl.zig:515), `push Context` stack-alloca + slot-0 implicit ctx (stmt.zig:1263,
lower.zig:259), `__sx_default_context` root (decl.zig:2667/2815), inline-asm corpus
(1645/1651). Corrected the design's `callconv(.naked)` → real `abi(.pure)` spelling and
the B1.0 snapshot story. B1.1 grounded as likely library-only. Baseline green (721/0).
- **B1.0a** — plumbed `Function.is_pure` (set from `fd.abi == .pure` at both decl sites);
`funcWantsImplicitCtx` skips `.pure` (no implicit ctx, like `.c`); both body-lowering
paths bypass `lowerValueBody` for `.pure` (asm body + `unreachable` cap — no sx return);
`emit_llvm` Pass 2 bails loudly on `func.is_pure`. `examples/1800-concurrency-pure-asm.sx`
locked to the bail (exit 1 + diagnostic). Renamed `is_naked` → `is_pure` per user direction
(sx says `pure`, not "naked"; LLVM `naked` attr is only the B1.0b mechanism). Suite green
(722/0).
- **B1.0a review-hardening** — adversarial review found generic/pack Function-creation paths
left `is_pure` false (silent framed body for a generic `.pure` instance — returned 42 but
corrupted the stack). Fixed generic.zig + pack.zig (set `is_pure` + asm-only `unreachable`
cap); locked by `examples/1801-concurrency-pure-generic-bail.sx`. The review's `.pure`-
lambda CRITICAL was a false positive (unparseable — `isLambda` breaks on `abi`). Suite
green (723/0). **Next: B1.0b (real `naked` emission).**