sx/current/CHECKPOINT-FIBERS.md
agra 0ab26c8a40 fibers B1.2: record review findings — async surface blocked on 0151 (widened)
Adversarial review of 45d869d: the Io infrastructure (both materializers,
push-inherit, 37 .ir regens, !-lint) is correct + landed; but await/cancel
(*Future($R)) are uncallable in EVERY form because sx can't infer a generic
$T from a pointer-wrapped arg. Widened issue 0151 to that root (repro:
unbox(b: *Box($T)) -> $T). Checkpoint: B1.2 partially landed; next = fix 0151
generic inference -> make await/cancel callable -> add 1805/1806 -> B1.3.
2026-06-21 00:43:09 +03:00

CHECKPOINT-FIBERS — Stream B1 (fibers + Io + M:1 scheduler)

Companion to PLAN-FIBERS.md. Update after every step (one step at a time, per the cadence rule). New corpus category: 18xx concurrency.

Last completed step

B1.2 (Io capability) — PARTIALLY LANDED + adversarially reviewed. Infrastructure GOOD; async SURFACE blocked on a generic-inference compiler bug (issue 0151, widened). Commits a1b14f0 (lock) + 45d869d (Io capability) + 3eeb965 (issue 0151). Suite green 726/0, master clean.

  • LANDED + review-confirmed correct (commit 45d869d): Io :: protocol #inline (spawn_raw/suspend_raw/ready/poll/now_ms/arm_timer) + io field on Context ({allocator; data; io}, io LAST); BOTH __sx_default_context materializers (protocol.zig + comptime_vm.zig) build an identical CBlockingIo→Io vtable (review verified byte-for-byte agreement; context.io.now_ms() dispatches at runtime AND comptime); the push Context.{…} omitted-field-inherits-ambient fix (review: correct, right fix, no bad blast radius); library/modules/std/io.sx (Future($R), CBlockingIo, async/await/cancel); the !-protocol-impl-lint suppression; 37 .ir regens (review: pure layout/type-table, no error text, zero .exit/.stdout/.stderr change).
  • BLOCKED — async surface non-functional: await/cancel take *Future($R) and are uncallable in EVERY form (not just UFCS) — sx can't infer a generic $T from a pointer-wrapped arg (*Future($R)). async(...) (create) works via explicit call and produces a correct .ready Future, but you can't await it. Root bug = issue 0151 (WIDENED): infer $T from *T-wrapped params + closure-return-via-pack + UFCS dispatch. Minimal repro: unbox :: (b: *Box($T)) -> $T fails to infer T.
  • No async example in the corpus (1805 was removed because it needs the blocked surface) → the green suite does NOT cover async. Restore 1805 (async/await) + add 1806 (cancel) once 0151 is fixed.
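As a rough picture of what the two missing examples (1805 async/await, 1806 cancel) are meant to exercise, here is a minimal Python model of a blocking Io. It is a sketch of the semantics described above, not the sx library: the names State, Future, io_async, future_await and future_cancel are hypothetical, and the assumption that cancel flips an already-ready future to canceled is inferred from the 1806 description only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class State(Enum):
    READY = "ready"
    CANCELED = "canceled"


@dataclass
class Future:
    state: State
    value: Any = None


def io_async(worker: Callable[..., Any], *args: Any) -> Future:
    # Blocking Io has no scheduler, so the worker runs to completion right here
    # and the caller receives a future that is already ready.
    return Future(State.READY, worker(*args))


def future_await(f: Future) -> Any:
    # Awaiting a ready future just hands back the stored value.
    if f.state is not State.READY:
        raise RuntimeError(f"cannot await a {f.state.value} future")
    return f.value


def future_cancel(f: Future) -> None:
    # Assumption: cancel only records the state change in this model.
    f.state = State.CANCELED


fut = io_async(lambda a, b: a + b, 40, 2)
result = future_await(fut)
print(result)  # 42

doomed = io_async(lambda: 7)
future_cancel(doomed)
print(doomed.state.value)  # canceled
```

In sx the await/cancel steps take a *Future($R), which is exactly the call shape that issue 0151 currently makes uncallable; the model sidesteps that because Python has no generic inference to perform.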

Earlier — B1.1 (per-fiber context root) — DONE. Zero compiler change (confirmed by probe).

The fiber-spawn context convention works end-to-end with ordinary language features:

  • snap := context captures the spawner's Context as a value;
  • the snapshot is stored in a struct (the stand-in Fiber);
  • a trampoline running under a different ambient context installs the fiber's stored root with push f.root { … }, and the body reads the snapshot — not the trampoline's ambient context — because context is an implicit slot-0 *Context param (call-carried, rides the callee's own stack) and push allocates on the caller frame (no global, no TLS).
  • Locked by examples/1804-concurrency-context-snapshot.sx: prints fiber root: 42 (the installed snapshot wins over ambient 99) + ambient after: 99 (the push scope restores the ambient context on exit). No fiber runtime yet (that's B1.3) — this proves the plumbing it will build on. No .build pin (pure sx, host-independent).
  • Probe result: the design doc's "lower as swappable indirection, never raw TLS" guarded a non-problem — context was already param-carried, never TLS. No path re-reads __sx_default_context mid-stack, so there is no compiler obligation here.
  • zig build && zig build test green: 726 ran, 0 failed.
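The convention in the bullets above can be modelled in a few lines of Python, with the context passed as an explicit first argument to stand in for the implicit slot-0 *Context param. This is only an illustration of why the snapshot wins over the trampoline's ambient context; Context, Fiber, spawn and trampoline are hypothetical names, not the sx runtime.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Context:
    data: int


@dataclass
class Fiber:
    root: Context                      # snapshot taken by the spawner
    entry: Callable[[Context], str]    # body that will run under that root


def body(context: Context) -> str:
    # The body only sees the context it is handed (call-carried, no global).
    return f"fiber root: {context.data}"


def spawn(context: Context, entry: Callable[[Context], str]) -> Fiber:
    # Models `snap := context` followed by storing the snapshot in a struct.
    return Fiber(root=context, entry=entry)


def trampoline(context: Context, f: Fiber) -> List[str]:
    # Models `push f.root { entry() }`: the callee receives the stored root,
    # and the trampoline's own context is untouched once the call returns.
    lines = [f.entry(f.root)]
    lines.append(f"ambient after: {context.data}")
    return lines


fiber = spawn(Context(data=42), body)
output = trampoline(Context(data=99), fiber)
print("\n".join(output))
# fiber root: 42
# ambient after: 99
```

Because nothing in the model reads a global or thread-local, swapping the root is just a matter of which value is passed down, which is the property the B1.1 probe confirmed for sx.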

Earlier — B1.0 (abi(.naked) codegen) — complete

Replaced the emit bail with real LLVM naked emission:

  • emit_llvm declaration pass: for func.is_naked, add the LLVM naked + noinline + nounwind attributes and skip the frame-pointer=all attribute (incompatible with a frameless function). Pass 2 now emits the .naked body normally — naked makes the backend emit it verbatim (the inline asm + its own ret) with no prologue/epilogue.
  • IR shape (verified): ; Function Attrs: naked noinline nounwind / define internal i64 @answer() #0 { entry: call void asm sideeffect "…ret…", ""() unreachable } / attributes #0 = { naked noinline nounwind }. The caller invokes it as an ordinary () -> i64 call (.naked is call_conv == .default).
  • examples/1800-concurrency-naked-asm.sx — now GREEN, aarch64-pinned (.build {"target": "macos"}): runs end-to-end → exit 42 on this host, ir-only on a mismatch; .ir snapshot captured.
  • examples/1801-concurrency-naked-generic.sx (renamed from -bail) — the generic .naked now emits a correct naked answer__i64 (exit 42), proving generic.zig produces a naked body, not a framed one. aarch64-pinned.
  • examples/1802-concurrency-naked-asm-x86.sx — x86_64 cross sibling (.build {"target": "x86_64-linux"}, ir-only here): .ir locks naked + movl $42, %eax / ret.
  • Unit test emit: abi(.naked) function gets the naked attribute (no frame-pointer) in emit_llvm.test.zig (asserts naked present, frame-pointer absent).
  • B1.0c (review-hardening): a param-bearing .naked fn emitted invalid LLVM (loud verifier error "cannot use argument of naked function") because the param-alloca loop wasn't gated. Fixed forward (this enables the B1.3 context-switch use case rather than rejecting it): gated the param-alloca loop on fd.abi != .naked in decl.zig (both paths) + generic.zig; a naked fn's args stay in registers (read by asm), declared-but-unused in LLVM. Locked by examples/1803-concurrency-naked-asm-param.sx (add(a,b) → x0+x1 → 42).
  • zig build && zig build test green: 725 ran, 0 failed + unit tests.

Earlier — B1.0a (lock + review hardening)

Plumbed Function.is_naked (set from fd.abi == .naked at both decl sites + generic.zig + pack.zig); funcWantsImplicitCtx skips .naked (no synthetic ctx, like .c); all body-lowering paths bypass lowerValueBody for .naked (asm body + unreachable cap — no sx return); emit_llvm Pass 2 bailed loudly (since flipped to real emission). Adversarial review caught the generic/pack is_naked gap (a generic .naked silently shipped a framed body); closed + locked. The review's .naked-lambda CRITICAL was a false positive (unparseable — isLambda breaks on the abi keyword).

Current state

B1.2 is UNBLOCKED. Master GREEN (726/0), installed sx clean. The earlier "blockers" were NOT real: issue 0151 was INVALID (its repro used the non-idiomatic ($A)->$R bare-fn-ptr form) — removed. The correct async idiom works today with no compiler change (verified live): spawn :: (worker: Closure(..$args) -> $R, ..$args) -> Wrap($R) with a lambda worker + the w : Wrap($R) = ---; w.v = worker(..args); build form — mirrors the canonical examples/0543-packs-canonical-map.sx. Ran 42 42 for homogeneous + heterogeneous args. Caveats (work within them, not "bugs"): lambda params must be annotated ((a: i64, b: i64) -> i64 => …); a bare named fn passed as the worker is non-idiomatic — use a lambda; build the result struct with = --- + field-assign, not a struct-literal in return. Issue 0150 (void struct field → SIGTRAP exit 133) is a real bug but only reached via Future(void) (void-returning worker / timeout) — DEFERRED: B1.2 supports non-void workers; revisit Future(void) in B1.4 (or fix 0150 standalone). The B1.2 design (Io protocol on Context, blocking CBlockingIo, context.io.now_ms()) was validated live; WIP at .sx-tmp/b12-wip/ has the working Io/Context/materializer parts — reuse those, rewrite the async layer to the pack-lambda idiom above.

B1.2 attempt (BLOCKED — design proven, two compiler bugs filed)

What was built + verified WORKING (then reverted to keep master green):

  • Io :: protocol #inline { spawn_raw; suspend_raw -> !; ready; poll; now_ms; arm_timer; } in core.sx next to Allocator, with SpawnOpts{ pin: PinTarget } + ParkToken{ handle }. Six methods, each justified by a downstream consumer (B1.3-B1.5).
  • Context :: struct { allocator; data; io: Io; }: io is appended LAST so allocator stays index 0 (the call.zig:1229 hardcode) and data keeps index 1 (minimal VM-fallback churn).
  • Both __sx_default_context materializers updated in lockstep + verified: protocol.zig emitDefaultContextGlobal (extended ctx_fields 2→3, built the CBlockingIo→Io inline 7-word vtable {null-ctx, fn0..fn5} via getOrCreateThunks("Io","CBlockingIo")) and comptime_vm.zig materializeDefaultContext fallback (wrote the 6 thunk func-refs at io_base = addr + 4*ps, offset + (i+1)*ps). The global path auto-followed the 3-field Context type. context.io.now_ms() printed clock ok live — the capability threads + the vtable dispatches correctly.
  • Stateless CBlockingIo :: struct {} + impl Io for CBlockingIo (mirror of CAllocator): blocking semantics, with spawn_raw/ready/poll/arm_timer no-op/0 and now_ms → time.mono_ms().
  • push-inherit-omitted fix (stmt.zig lowerPush): a push Context.{...} now SEEDS the new slot from the ambient context (load+store), then overwrites ONLY the literal's named fields — so omitted fields (now incl. io) are INHERITED, never zero-inited to a null vtable. Eliminates the omitted-field footgun globally (zero per-site churn across the 17 partial-literal sites). This is the correct capability-bag semantics; it compiled clean.
  • !-protocol-method warning fix (error_analysis.zig + a new Lowering.impl_method_names set populated in protocols.zig registerImplBlock): a protocol impl method may be declared ! by contract (e.g. Io.suspend_raw) yet never raise; the "declared ! but never errors — drop the !" hint is a false positive for impl methods, now suppressed for them.
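The push-inherit-omitted fix is easiest to see side by side with the behaviour it replaces. The Python sketch below models both under stated assumptions: a three-field Context, None standing in for a zero-initialised (null-vtable) field, and hypothetical helper names push_zero_init and push_inherit. It does not reflect how stmt.zig lowerPush is written.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Context:
    allocator: Optional[str]
    data: Optional[int]
    io: Optional[str]


def push_zero_init(ambient: Context, **named) -> Context:
    # Modelled old behaviour: fields omitted from the literal start out empty.
    blank = {"allocator": None, "data": None, "io": None}
    return Context(**{**blank, **named})


def push_inherit(ambient: Context, **named) -> Context:
    # Modelled fixed behaviour: seed the new slot from the ambient context,
    # then overwrite only the fields the literal names.
    return replace(ambient, **named)


ambient = Context(allocator="CAllocator", data=0, io="CBlockingIo")
before = push_zero_init(ambient, data=7)
after = push_inherit(ambient, data=7)
print(before.io, after.io)  # None CBlockingIo
```

The second form is what lets a newly appended field such as io be inherited by every existing partial literal without touching those call sites.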

Where it BROKE (the two blockers — both INDEPENDENT of the Io design, both repro standalone):

  • issue 0150: Future(void) (for timeout -> Future(void)) makes a result: void field; a void struct field crashes the compiler with an unsized-type SIGTRAP in LLVM getTypeSizeInBits (a bare struct { v: void; } repros it). timeout was DEFERRED (it is a B1.4 stub needing arm_timer anyway) rather than routed around with a non-void shape.
  • issue 0151: in async(io, worker: ($A) -> $R, arg: $A) -> Future($R), an $R inferred from a fn-pointer parameter's RETURN type type-checks at the call but is NOT bound as a usable type in the body, so Future(R) errors with unknown type 'R'. A direct arg: $A binds fine; the gap is specific to type-vars nested in a fn-ptr/closure param signature. This blocks the central async/await free-fns. (Manifested as the "unresolved type reached LLVM emission" panic, the same one another session filed against my dirty binary as issue 0149, now moot after the revert.)

Per the IMPASSABLE STOP rule: filed 0150 + 0151, reverted all B1.2 working changes (master green again, photo project unbroken), STOPPED. Resume B1.2 once 0150 + 0151 land — the WIP in .sx-tmp/b12-wip/ makes it ~mechanical (the design is proven).

Earlier — B1.0 + B1.1 complete

Stream A (atomics) is feature-complete. Stream B1: B1.0 + B1.1 complete. The two compiler-floor preconditions for the fiber runtime are in place: (1) abi(.naked) emits a real LLVM naked function end-to-end (decl, generic, pack paths), which is the context-switch substrate; (2) per-fiber context root needs no compiler change, because the spawn convention (snapshot context, store, push it from the trampoline) is pure library sx. No fibers/Io/scheduler code yet. Grounded floor facts:

  • context is an implicit slot-0 *Context param + push Context is a stack alloca → fiber-local for free (confirmed by the B1.1 probe: never TLS, never re-read from the __sx_default_context global mid-stack). A spawn passes the snapshot as the fiber-entry fn's slot-0 ctx via push f.root { entry(args) }. Locked by 1804-...-context-snapshot.
  • Inline asm works end-to-end (lower→emit→JIT, aarch64 + x86_64) — the .naked body reuses it.
  • .naked with PARAMS works (B1.0c, the B1.3 substrate): the param-alloca loop is gated on fd.abi != .naked in decl.zig (both paths) + generic.zig — a naked fn's args stay in ABI registers (read by the asm body), declared-but-unused in LLVM (verifier-legal). Example 1803-concurrency-naked-asm-param.sx (add(a,b) reads x0/x1). Unsupported (loud, not silent): a .naked variadic-pack fn (pack.zig's param loop is intertwined with comptime-param/#insert handling, and a naked fn can't read a runtime-sized pack from registers anyway) → loud LLVM-verifier error for that nonsensical construct. Acceptable boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker.

Next step

Fix issue 0151 (generic inference through a pointer) so await/cancel become callable — then complete B1.2's async surface. Sequence:

  1. Fix 0151 (WIDENED): make the generic-inference engine bind $T from a pointer-wrapped param (b: *Box($T) → unbox(@b) infers T), which unblocks await/cancel (*Future($R)). The same bug has two more faces: $R from a closure-return via a pack, and via UFCS dot-dispatch (the original 0151 + the f.await() SIGTRAP). Acceptance cases are in issues/0151-...md. This is generic-inference-engine work (src/ir/lower/generic.zig extractTypeParam + the UFCS call path) and its own focused step.
  2. Restore/add the async examples: examples/1805-concurrency-io-blocking-async.sx (context.io.async((a:i64,b:i64)->i64 => a+b, 40, 2).await() → 42) + 1806-...-io-cancel.sx (cancel → state .canceled). lock→green. Regen .ir only after green; confirm layout-only.
  3. Then B1.2 is truly done → proceed to B1.3 (fiber runtime).
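The inference that step 1 asks for can be pictured as structural unification that looks through the pointer wrapper before matching the generic application. The Python toy below shows that idea on the minimal repro shape; it is a sketch under assumed type encodings (tuples tagged "var", "ptr", "app", "prim"), not a description of extractTypeParam in generic.zig.

```python
def unify(pattern, actual, bindings):
    """Bind type variables in `pattern` so that it matches `actual`."""
    kind = pattern[0]
    if kind == "var":
        name = pattern[1]
        if name in bindings and bindings[name] != actual:
            raise TypeError(f"conflicting bindings for ${name}")
        bindings[name] = actual
        return bindings
    if kind != actual[0]:
        raise TypeError(f"shape mismatch: {pattern} vs {actual}")
    if kind == "prim":
        if pattern[1] != actual[1]:
            raise TypeError(f"type mismatch: {pattern[1]} vs {actual[1]}")
        return bindings
    if kind == "ptr":
        # The step that matters for the repro: recurse through the pointer
        # instead of stopping at it.
        return unify(pattern[1], actual[1], bindings)
    if kind == "app":
        if pattern[1] != actual[1] or len(pattern[2]) != len(actual[2]):
            raise TypeError(f"generic mismatch: {pattern} vs {actual}")
        for p, a in zip(pattern[2], actual[2]):
            unify(p, a, bindings)
        return bindings
    raise ValueError(f"unknown type node: {kind}")


# unbox :: (b: *Box($T)) -> $T, called with an argument of type *Box(i64)
param = ("ptr", ("app", "Box", (("var", "T"),)))
arg = ("ptr", ("app", "Box", (("prim", "i64"),)))
bound = unify(param, arg, {})
print(bound)  # {'T': ('prim', 'i64')}
```

The same recursion would cover *Future($R) for await/cancel; the closure-return-via-pack and UFCS faces of 0151 involve call-site plumbing that this toy does not attempt to model.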

Deferred (do NOT block B1.2 on these): issue 0150 (void struct field SIGTRAP) — only Future(void)/timeout, which are B1.4. The :: callable-parameter feature (named-fn async workers async(read_a, conn)) — WIP at .sx-tmp/wip-callable-params/patch.diff (parser done, inference incomplete); a dedicated effort; lambda workers are the B1.2 idiom meanwhile.

Context layout settled: { allocator; data; io; } (allocator index 0 fixed by call.zig:1229, io last). Io protocol + materializers + push-inherit are LANDED + reviewed.
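For reference, the slot arithmetic recorded for the comptime materializer (io_base = addr + 4*ps, thunk i at io_base + (i+1)*ps, a 7-word vtable {null-ctx, fn0..fn5}) can be tabulated with a short Python helper. This takes those figures at face value; the mapping of fn0..fn5 to the protocol's declaration order and the example address are assumptions made for illustration.

```python
# Assumption: fn0..fn5 follow the order in which the Io protocol lists its methods.
IO_METHODS = ["spawn_raw", "suspend_raw", "ready", "poll", "now_ms", "arm_timer"]


def io_slot_addresses(addr: int, ps: int) -> dict:
    """Return the address of each word of the inline Io vtable.

    addr is the base address of the Context value, ps the pointer size in bytes.
    """
    io_base = addr + 4 * ps            # io starts four words into Context
    slots = {"ctx": io_base}           # word 0: the null context pointer
    for i, name in enumerate(IO_METHODS):
        slots[name] = io_base + (i + 1) * ps   # words 1..6: the six thunks
    return slots


slots = io_slot_addresses(addr=0x1000, ps=8)
for name, address in slots.items():
    print(f"{name:12s} {address:#x}")
```

With a hypothetical base of 0x1000 and 8-byte pointers, the vtable starts at 0x1020 and now_ms lands at 0x1048.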

Known issues / capability gaps

  • 🔴 B1.2 BLOCKERS (both filed, both standalone-reproducible, both independent of the Io design):
    • issue 0150: a void struct field crashes the compiler (unsized-type SIGTRAP in LLVM getTypeSizeInBits). Blocks Future(void)/timeout. Repro: issues/0150-....
    • issue 0151: a type-var inferred from a fn-pointer parameter's RETURN type is not bound in the function body (unknown type 'R'). Blocks async(io, worker: ($A)->$R, arg)'s Future(R). Repro: issues/0151-....
    • (Note: issue 0149, filed by another session against the dirty in-progress binary, was a manifestation of 0151 — "unresolved type reached LLVM emission". Moot after the revert; its real root cause is 0151.)
  • Orthogonal (not a B1 blocker): default VALUES for comptime params don't bind on generic-struct methods (free-fn defaults DO work) — inherited from Stream A. Only matters if a B2 lib type wants a defaulted comptime param; atomics/fibers require explicit, so unaffected.
  • Issue 0144 (open, independent): calling an unrecognized bodiless #builtin silently returns 0 / exit 0 — a silent-fallback footgun in the generic builtin-call path. Filed; leave for its own fix session unless prioritized. Not a B1 blocker.
  • Deferred design gap (documented): the B1.4 event-loop Io does not yet cooperate with a platform UI run loop (CFRunLoop/NSRunLoop/ALooper); pinning gives thread-affinity, not run-loop integration — a §6 app-target concern, out of B1 scope.

Decisions (Stream B1 specifics; surface locked in design §4 / §4.6)

  • The async runtime is sx LIBRARY code. The compiler provides only: the general primitives (inline asm, abi(.naked) naked [B1.0], atomics) + fiber-safe codegen (context already fiber-local, per B1.1). Schedulers, fibers, channels, futures, Io vtables, mmap stacks are all sx.
  • abi(.naked) is the real spelling of the design's callconv(.naked) — postfix slot, name :: (sig) -> Ret abi(.naked) { asm { … }; }. B1.0 = carry it into IR + emit LLVM naked + skip prologue/ctx (mirror the existing .c skip), NOT extend the enum (it's already there, just inert).
  • .naked, not .c: a .c epilogue would restore SP from the wrong stack across a context switch (SP-in ≠ SP-out by design). .naked = no prologue/epilogue/frame; the asm emits its own ret. This is why the switch must be .naked.
  • Naming: sx-facing name is naked (keyword abi(.naked), field is_naked, the diagnostic), matching LLVM's naked attribute and the industry term (Zig/Rust/GCC/Clang). The ABI variant was renamed .pure → .naked (user direction): "pure" universally means side-effect-free, the opposite of a register-clobbering context switch.
  • B1.0 snapshot scope: a .naked body is raw per-arch asm; LLVM's naked attr text is arch-invariant. B1.0a = one host example locked to the emit bail (host-independent — fires before instruction selection; no .build pin). B1.0b = pin aarch64 + add an x86_64 cross sibling (.build target-gated, ir-only on mismatch), like the asm corpus split. The .ir proves the naked attr + asm emitted, NOT register-save correctness (that's B1.3's stress harness).
  • B1.1 — per-fiber context is library-only (CONFIRMED by probe): push frames are stack-alloca'd and the implicit ctx rides slot 0, so the spawn convention — snapshot context, store it, push f.root { entry(args) } from the trampoline — installs the fiber's root with no compiler change. Verified: the body reads the snapshot over a different ambient context, and push restores ambient on exit (1804-...-context-snapshot). The design doc's "never raw TLS" guarded a non-problem (context was never TLS).
  • Test keystones (design §10): the B1.3 switch-stress harness gates the context-switch (the one piece the deterministic Io can't test — §8.1.1, §10.7); the B1.4 deterministic-sim Io (calibrated against blocking Io — §8.1.3) gates all scheduling tests. Both must exist + be calibrated before the async tests they gate are trusted. 18xx asserts program-emitted ordering contracts, not raw interleaving.

Log

  • carve — wrote PLAN-FIBERS.md + CHECKPOINT-FIBERS.md. Grounded the B1 compiler floor: ABI.naked inert (type_resolver.zig:237), IR Function has no naked flag (inst.zig:605), attribute API pattern (emit_llvm.zig:1339 nounwind), .c ctx-skip precedent (decl.zig:515), push Context stack-alloca + slot-0 implicit ctx (stmt.zig:1263, lower.zig:259), __sx_default_context root (decl.zig:2667/2815), inline-asm corpus (1645/1651). Corrected the design's callconv(.naked) → real abi(.naked) spelling and the B1.0 snapshot story. B1.1 grounded as likely library-only. Baseline green (721/0).
  • B1.0a — plumbed Function.is_naked (set from fd.abi == .naked at both decl sites); funcWantsImplicitCtx skips .naked (no implicit ctx, like .c); both body-lowering paths bypass lowerValueBody for .naked (asm body + unreachable cap — no sx return); emit_llvm Pass 2 bails loudly on func.is_naked. examples/1800-concurrency-naked-asm.sx locked to the bail (exit 1 + diagnostic). Suite green (722/0). (ABI variant later renamed .pure → .naked — see the Naming decision above — so all is_*/abi(.*)/example names here read naked.)
  • B1.0a review-hardening: adversarial review found generic/pack Function-creation paths left is_naked false (silent framed body for a generic .naked instance; it returned 42 but corrupted the stack). Fixed generic.zig + pack.zig (set is_naked + asm-only unreachable cap); locked by examples/1801-concurrency-naked-generic-bail.sx. The review's .naked-lambda CRITICAL was a false positive (unparseable: isLambda breaks on abi). Suite green (723/0).
  • B1.0b: real naked emission. The emit_llvm declaration pass adds LLVM naked/noinline/nounwind + skips frame-pointer for func.is_naked; Pass 2 emits the body verbatim (no prologue). 1800 green aarch64-pinned (exit 42 + .ir); renamed 1801-generic (generic .naked emits a naked body, exit 42); added x86_64 sibling 1802 (ir-only, .ir locks naked + movl $42, %eax). Unit test asserts naked present + frame-pointer absent. Suite green (724/0).
  • B1.0c — review-hardening: param-bearing .naked emitted invalid LLVM (loud verifier error). Gated the param-alloca loop on fd.abi != .naked (decl.zig both paths + generic.zig) — naked args stay in registers, read by the asm body (the B1.3 context-switch shape). Locked by examples/1803-concurrency-naked-asm-param.sx. Pack .naked left unsupported (loud, nonsensical). B1.0 complete. Suite green (725/0).
  • rename — ABI variant .pure → .naked (keyword, Function.is_naked, diagnostics, examples 1800-1803 *-pure-* → *-naked-*, docs). "pure" universally means side-effect-free — wrong for a register-clobbering switch; "naked" matches LLVM/Zig/Rust/GCC/Clang. Pure cosmetics, no semantic change. Suite green (725/0).
  • B1.1 — per-fiber context root: zero compiler change (probe-confirmed). The spawn convention (snapshot context → store in a struct → push f.root { entry() } from the trampoline) installs the fiber's root via the implicit slot-0 *Context param; the body reads the snapshot, not the trampoline's ambient ctx, and the push scope restores ambient on exit. Locked by examples/1804-concurrency-context-snapshot.sx (prints fiber root: 42 / ambient after: 99). Suite green (726/0). Next: B1.2 (Io interface + context.io).
  • B1.2 (BLOCKED): built the full Io capability (protocol on Context, stateless CBlockingIo blocking default, both __sx_default_context materializers, push-inherit-omitted fix, !-impl-method warning fix) and VERIFIED the core works live (context.io.now_ms() printed clock ok). Two independent compiler bugs blocked the async/await/timeout layer: 0150 (void struct field → unsized SIGTRAP, blocks Future(void)) and 0151 (type-var from a fn-ptr param's return type not bound in the body, blocks async's Future(R)). Both filed with standalone repros + investigation prompts. Per the STOP rule: reverted ALL B1.2 working changes (master green again, 726/0; the dirty binary had broken the photo project, see the now-moot 0149), saved WIP to .sx-tmp/b12-wip/, STOPPED. Resume after 0150 + 0151.