fibers: event-loop Io — real fd readiness via kqueue (B1.4c)

A fiber can block on a file descriptor and the run loop blocks on
kevent until the kernel reports it ready. Reuses the existing
std/net/kqueue.sx bindings. Scheduler gains a lazy kq fd + an
io_waiters list; block_on_fd arms a one-shot EVFILT_READ registration,
records an IoWaiter, and suspends. Run-loop Mode 2: when the ready
queue drains and no timer is pending, block on kq_wait(-1), match each
fired ident to its waiter, evict it, wake the fiber. wake evicts a
pending fd-waiter (cancel_io_waiter_for) so no stale IoWaiter outlives
a reaped fiber.

Adversarial review found two CRITICALs: (1) two fibers on the same fd
share one kqueue registration (macOS EV_ADD replaces), so one is lost
and the loop hangs -- fixed by enforcing one-waiter-per-fd with a loud
abort; (2) an fd-waiter on a never-ready fd 'hangs' -- reclassified as
correct event-loop semantics (a server idling on a socket), with the
misleading orphan-check comment corrected. UAF parity, ident width,
EINTR handling, timer/io precedence all probed safe.

Example: 1816 (pipe roundtrip -- reader blocks, writer writes, reader
wakes via kqueue). macOS only; linux epoll twin deferred. Suite green 754/0.
agra
2026-06-21 19:39:16 +03:00
parent 62ffea0663
commit 1b0d640f73
8 changed files with 433 additions and 40 deletions


@@ -4,8 +4,65 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step
per the cadence rule). New corpus category: `18xx` concurrency.
## Last completed step
**B1.4b: deterministic VIRTUAL-TIME timer scheduling (the KEYSTONE), landed + adversarially
reviewed (caught a CRITICAL UAF, fixed).** `library/modules/std/sched.sx` gained a virtual clock +
**B1.4c: REAL fd-readiness blocking via kqueue (macOS).** `library/modules/std/sched.sx` now lets a
fiber park on a file descriptor and the run loop block on `kevent` until the kernel reports it ready.
Reuses the existing verified `library/modules/std/net/kqueue.sx` bindings (`Kevent` (32 bytes),
`kqueue`/`kevent`/`kq_apply`/`kq_wait` + the `EVFILT_READ`/`EV_ADD`/`EV_ENABLE`/`EV_ONESHOT`
constants) rather than re-deriving the FFI — sched.sx imports it as `kqb`. Added to `Scheduler`:
- `kq: i32` (LAZY — `-1` in `init`, opened by the first `block_on_fd`, so a pure-compute /
virtual-timer scheduler never opens a kqueue fd; leaks one fd at exit once opened, same class as the
documented spawn-env / go-Task leaks — no deinit yet);
- `io_waiters: List(IoWaiter)` (`IoWaiter :: struct { fd: i32; fiber: *Fiber; }`, grown through
`own_allocator` per the long-lived-container rule);
- `block_on_fd(self, fd, want_read)` — lazily opens `kq`, arms a one-shot `EVFILT_READ` registration,
records an `IoWaiter{fd, current}`, then `suspend_self()`. Guards a null `current` (loud abort, like
`sleep`); `want_read=false` (write-readiness) is not wired yet → loud abort rather than silently
arming a read filter.
- Run-loop: after the ready queue drains, **Mode 1 (virtual time)** fires the earliest pending timer
(takes precedence — a program uses `sleep` OR fds, documented non-unification limitation); **Mode 2
(real fd)** — if `io_waiters` is non-empty, BLOCK on `kq_wait(kq, evbuf, MAXEV=16, -1)` (null
timeout), then for each fired event match `ev.ident` back to its waiter, evict it, and `wake` the
fiber; **else** break. Orphan-deadlock check unchanged in spirit but now correct: an fd waiter is NOT
an orphan (while `io_waiters.len > 0` the loop blocks on kqueue rather than reaching the check), and
a genuine no-timer/no-fd suspend still aborts loudly (verified with a probe: exit 134).
- `wake` now also evicts a pending fd-waiter (`cancel_io_waiter_for`, mirror of `cancel_timer_for`) —
same UAF reasoning: a fiber woken by another path must not leave a stale `IoWaiter` pointing at a
reaped `*Fiber`. The kqueue registration is `EV_ONESHOT` so we never `EV_DELETE` (a never-fired
one-shot lingers harmlessly; the drain ignores an unmatched ident; closing the fd auto-removes it).
- DE-RISK probe (run first, no scheduler): confirmed `size_of(Kevent) == 32`, the pipe roundtrip
(`kq_wait` returned 1 with `out.ident == read_fd`, `out.filter == -1` (EVFILT_READ), `out.data == 1`
byte readable) — the struct layout reads back the fd correctly.
- Locked by `examples/concurrency/1816-concurrency-fiber-io-pipe.sx`: a `pipe`; a reader fiber spawned
FIRST blocks on the empty read end, then a writer fiber writes `a b c` → the run loop blocks on
kqueue, wakes the reader, which reads the 3 bytes. Output `log: wrote read 3 [97 98 99]` /
`n_suspended: 0` (the "wrote" before "read" ordering proves the reader actually blocked then woke via
kqueue readiness). `.build` `{ "target": "macos" }` (matches host arch → runs end-to-end; ir-only on
a mismatch, like 1814/1815 — no `.ir` snapshot needed since it runs here). The example declares its
own `read`/`write`/`close` externs with the CANONICAL signatures std already binds
(`(i32,[*]u8,usize)->isize` / `(i32)->i32`) — a divergent re-binding is rejected by the extern dedupe.
- **Adversarial review (worker) of the run-loop change — found 2 CRITICALs:**
- **(1) two fibers on the SAME fd → lost wakeup + permanent hang.** macOS `EV_ADD` for an existing
`(ident, filter)` REPLACES the registration (doesn't stack), so two waiters share one registration:
the fd fires once, one wakes, the other is stranded in `io_waiters` and the next `kq_wait(-1)` blocks
forever. FIXED: `block_on_fd` now enforces one-waiter-per-fd with a loud abort (the model already
assumed it). Verified: dup-fd → `sched: block_on_fd: fd N already has a waiter`, not a hang.
- **(2) an fd-waiter on a never-ready fd hangs instead of the timer path's loud abort.** Re-examined:
this is CORRECT event-loop semantics — blocking on I/O until ready (possibly forever, like a server
idling on a socket) is the point; the scheduler cannot know an fd will never become ready, so it must
keep waiting. NOT a scheduler deadlock. Fixed the MISLEADING comment that implied the orphan check
covers fd-waiters: it does not, by design (it covers only pure `suspend_self` parks). No code change —
the "hang" is a caller-side logic issue (waiting on input that never arrives), not a bug to abort on.
- Review CLEARED: the IoWaiter UAF parity (early-wake evicts the waiter; a lingering one-shot that later
fires hits no match → clean no-op), ident width/sign, `kq_wait` EINTR/error handling, timer-vs-io
precedence (timer wins; no hang). All probed safe.
- Suite GREEN **754/0** with the dup-fd guard in place. No new example pins the guard, because the abort
is host-fragile to pin (like 1809's guard-firing). Next: **B1.5** (end-to-end M:1 validation under the deterministic timers
/ fd readiness); a linux epoll twin of `block_on_fd` (mirror via `std/net/epoll`, the OS-neutral facade
is `std.event`) is future work.
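
For readers without the sx sources at hand, the block-on-fd flow described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not the `sched.sx` implementation: generators stand in for fibers, the standard `selectors` module stands in for the `kqb` bindings (it picks kqueue on macOS and epoll on Linux), and the `Scheduler`, `reader`, `writer`, and `demo` names are hypothetical. One-shot behaviour is emulated by unregistering the fd as soon as it fires.

```python
import os
import selectors
from collections import deque


class Scheduler:
    """Toy M:1 scheduler (illustrative only): generator fibers, read waits."""

    def __init__(self):
        self.ready = deque()    # runnable fibers
        self.io_waiters = {}    # fd -> parked fiber (one waiter per fd)
        self.sel = None         # lazy, like the `kq` field: opened on first use

    def spawn(self, fiber):
        self.ready.append(fiber)

    def block_on_fd(self, fd, fiber):
        if fd in self.io_waiters:
            raise RuntimeError(f"block_on_fd: fd {fd} already has a waiter")
        if self.sel is None:
            self.sel = selectors.DefaultSelector()
        self.sel.register(fd, selectors.EVENT_READ)
        self.io_waiters[fd] = fiber

    def run(self):
        while True:
            while self.ready:
                fiber = self.ready.popleft()
                try:
                    request = next(fiber)
                except StopIteration:
                    continue                        # fiber finished, reap it
                if request is None:
                    self.ready.append(fiber)        # cooperative yield
                else:
                    self.block_on_fd(request, fiber)  # park on that fd
            if not self.io_waiters:
                break                               # nothing runnable or parked
            # "Mode 2": block with no timeout until the kernel reports readiness.
            for key, _events in self.sel.select(timeout=None):
                self.sel.unregister(key.fd)         # emulate a one-shot filter
                self.ready.append(self.io_waiters.pop(key.fd))
        if self.sel is not None:
            self.sel.close()


def reader(read_fd, log):
    yield read_fd                   # park until read_fd is readable
    data = os.read(read_fd, 16)
    log.append(f"read {len(data)} {list(data)}")


def writer(write_fd, log):
    yield None                      # let the reader park first
    os.write(write_fd, b"abc")
    log.append("wrote")


def demo():
    read_fd, write_fd = os.pipe()
    log = []
    sched = Scheduler()
    sched.spawn(reader(read_fd, log))   # reader is spawned FIRST, as in 1816
    sched.spawn(writer(write_fd, log))
    sched.run()
    os.close(read_fd)
    os.close(write_fd)
    return log


if __name__ == "__main__":
    print(demo())
```

As in the 1816 example, "wrote" lands in the log before the read entry, which is only possible if the reader parked on the empty pipe and was woken by readiness rather than by polling.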
### Earlier — B1.4b — deterministic VIRTUAL-TIME timer scheduling (the KEYSTONE) — landed + adversarially
reviewed (caught a CRITICAL UAF, fixed).
`library/modules/std/sched.sx` gained a virtual clock +
sleep timers so fibers schedule in reproducible simulated time (no real clock): `clock_ms` (advances
ONLY as timers fire), a `timers: List(Timer)` (insertion-order, linear min-scan, FIFO tiebreak),
`now_ms()`, `sleep(ms)` (arm `{clock_ms+ms, current}` + `suspend_self`), and a timer-driven `run`
@@ -282,17 +339,21 @@ body); closed + locked. The review's `.naked`-lambda CRITICAL was a false positi
(unparseable — `isLambda` breaks on the `abi` keyword).
## Current state
**B1.4b COMPLETE — deterministic virtual-time timer scheduling exists.** `library/modules/std/sched.sx`
**B1.4c COMPLETE — real fd-readiness blocking via kqueue (macOS) exists.** `library/modules/std/sched.sx`
now carries: the M:1 scheduler core (B1.5a: `spawn`/`yield_now`/`suspend_self`/`wake`/`run`), the
suspending fiber-task async (B1.4a: `Task($R)`/`go`/`wait`/`cancel`), AND deterministic timers (B1.4b:
`clock_ms` virtual clock, `timers` list, `now_ms`/`sleep`, timer-driven `run`). Fibers `sleep(ms)` in
reproducible simulated time and wake in deadline order. The timer-vs-early-wake UAF found in review is
fixed (`wake` evicts the fiber's pending timer). Locked by `1811` (round-robin), `1812` (suspend/wake),
`1813` (async go/wait/cancel), `1814` (sim-timer deadline ordering), `1815` (timer early-wake eviction).
Suite GREEN (count below).
suspending fiber-task async (B1.4a: `Task($R)`/`go`/`wait`/`cancel`), deterministic timers (B1.4b:
`clock_ms` virtual clock, `timers` list, `now_ms`/`sleep`, timer-driven `run`), AND real fd readiness
(B1.4c: lazy `kq`, `io_waiters` list, `block_on_fd`, a kqueue-blocking run-loop Mode 2 that wakes the
fiber whose fd fired). It reuses the verified `std/net/kqueue.sx` bindings (imported as `kqb`) rather
than re-deriving the FFI. Fibers can now block on either virtual `sleep(ms)` OR a real fd; both park
paths are balanced through `wake` (which evicts a stale timer AND a stale fd-waiter, the UAF guard).
Locked by `1811` (round-robin), `1812` (suspend/wake), `1813` (async go/wait/cancel), `1814` (sim-timer
deadline ordering), `1815` (timer early-wake eviction), `1816` (pipe fd block→kqueue-wake→read). Suite
GREEN **754/0**.
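
The two invariants mentioned here (one waiter per fd, and `wake` evicting a stale fd-waiter) can be illustrated without any OS calls. The following Python sketch is a simplified model, not the sx code: `Fiber`, `WaitTable`, and `on_event` are hypothetical names, and the waiter list is a plain list of `(fd, fiber)` pairs scanned linearly.

```python
class Fiber:
    """Stand-in for a scheduler fiber; only its state matters here."""

    def __init__(self, name):
        self.name = name
        self.state = "running"


class WaitTable:
    """Illustrative model of the io_waiters bookkeeping."""

    def __init__(self):
        self.io_waiters = []    # list of (fd, fiber) pairs

    def block_on_fd(self, fd, fiber):
        # A second waiter on the same fd would share one kernel registration,
        # so the model refuses loudly instead of risking a lost wakeup.
        if any(waiting_fd == fd for waiting_fd, _ in self.io_waiters):
            raise RuntimeError(f"sched: block_on_fd: fd {fd} already has a waiter")
        self.io_waiters.append((fd, fiber))
        fiber.state = "suspended"

    def cancel_io_waiter_for(self, fiber):
        # Drop any entry that still points at this fiber.
        self.io_waiters = [(fd, f) for fd, f in self.io_waiters if f is not fiber]

    def wake(self, fiber):
        # Woken by any path: evict the fd-waiter first so no stale entry remains.
        self.cancel_io_waiter_for(fiber)
        fiber.state = "ready"

    def on_event(self, ident):
        # Match a fired ident back to its waiter; an unmatched ident is a no-op.
        for waiting_fd, fiber in self.io_waiters:
            if waiting_fd == ident:
                self.wake(fiber)
                return fiber
        return None
```

If a fiber is woken early and its one-shot registration fires later, `on_event` finds no match and returns `None`, which corresponds to the "clean no-op" the review cleared.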
The remaining B1 work: **B1.4c** the event-loop `Io` (kqueue mac / epoll linux — real fd readiness),
then **B1.5** end-to-end M:1 validation under the deterministic timers. NOTE: the suspending async +
The remaining B1 work: **B1.5** end-to-end M:1 validation under the deterministic timers / fd readiness;
a **linux epoll twin** of `block_on_fd` (mirror via `std/net/epoll`; the OS-neutral facade is
`std.event`) is future work — B1.4c wired the **macOS kqueue** path only. NOTE: the suspending async +
deterministic timers live as `sched.*` methods (M:1, receiver-driven), NOT routed through the erased
`context.io` (which would force sched.sx into every std consumer + duplicate the `_fib_tramp` global
asm); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/`arm_timer`/`poll` remain reserved for the
@@ -388,17 +449,20 @@ fibers/Io/scheduler code yet. Grounded floor facts:
boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker.
## Next step
**→ B1.4c — the event-loop `Io` (real fd readiness).** B1.4b (deterministic virtual-time timers,
`sched.sleep`/`now_ms`/timer-`run`) is done — the KEYSTONE deterministic harness exists at the
scheduler level. Now add real-I/O readiness: a `poll`-style step over `kqueue` (macOS) / `epoll`
(linux) that blocks until an fd is readable/writable (or a real-time timeout), then wakes the parked
fiber waiting on it. Likely shape: a `block_on_fd(fd, events)` that registers the current fiber's
interest, suspends, and is woken when `run`'s poll step reports the fd ready. Lock with an `18xx`
example doing genuine fd I/O (e.g. a `pipe(2)`: a fiber blocks reading, another writes, the reader
wakes with the bytes) — aarch64-macOS-pinned, kqueue. The deterministic timers (1814) and real I/O
should compose (a real `poll` with a timeout vs the virtual clock — keep them as separate run modes,
or unify with care). Then **B1.5** end-to-end M:1 validation. The §10.7 gate (1808) + guarded-stack
(1809) + Win64 (1810) + scheduler/async/timers (1811-1815) must keep passing throughout.
**→ B1.5 — end-to-end M:1 validation under the deterministic timers / fd readiness.** B1.4c (real
fd-readiness blocking via kqueue, `sched.block_on_fd` + the kqueue-blocking run-loop Mode 2) is done —
the macOS event-loop path exists. Build an `18xx` example that exercises the full M:1 story together
(multiple fibers, a mix of `sleep`/`go`/`wait` and `block_on_fd`, reaping, the orphan-deadlock guard).
The §10.7 gate (1808) + guarded-stack (1809) + Win64 (1810) + scheduler/async/timers/fd
(1811-1816) must keep passing throughout.
**Deferred (future B1.4c sibling): the linux epoll twin of `block_on_fd`.** B1.4c wired the **macOS
kqueue** path only (the host is aarch64-macOS). The linux mirror would register interest via
`std/net/epoll` and the OS-neutral facade is `std.event` — keep the two as separate run modes inside
`run`, branching on the platform, exactly as the timer-vs-fd modes are kept separate now. Documented
non-unification: virtual-time timers and real kqueue timeouts are NOT merged — `run` fires a pending
timer before ever blocking on kqueue (a program uses `sleep` OR fds); a true "fd-or-real-timeout" wants
a kqueue `EVFILT_TIMER`, future work.
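
The precedence rule above (a pending virtual timer always fires before the loop would block on an fd) can be written as a small decision function. This Python sketch assumes a simplified representation, timers as `(deadline_ms, fiber)` tuples in insertion order, and the `earliest_timer` and `idle_step` names are hypothetical; it models only the choice made once the ready queue is empty, not the blocking itself.

```python
def earliest_timer(timers):
    """Index of the earliest deadline; min() keeps the first on ties (FIFO)."""
    return min(range(len(timers)), key=lambda i: timers[i][0])


def idle_step(clock_ms, timers, io_waiter_count):
    """Decide what the run loop does when no fiber is ready.

    Returns (action, new_clock_ms, woken_fiber_or_None).
    """
    if timers:
        # Mode 1: virtual time. Fire the earliest timer; the clock advances
        # only here, never from a real clock.
        deadline, fiber = timers.pop(earliest_timer(timers))
        return ("fire_timer", max(clock_ms, deadline), fiber)
    if io_waiter_count > 0:
        # Mode 2: real fd readiness. The caller would now block on the kernel.
        return ("block_on_fd", clock_ms, None)
    # Nothing pending at all: the run loop exits.
    return ("break", clock_ms, None)
```

Because the timer branch is checked first, a program that mixes `sleep` with fd waits never blocks in the kernel while a timer is pending, which is the non-unification limitation described above.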
Design note carried forward: an event-loop `Io` needs a current-`Scheduler` handle. `sched.*` methods
thread it via `self`/the `Task`; if B1.4c wants the capability-threaded `context.io` form it'll need
@@ -508,6 +572,21 @@ incomplete); a dedicated effort; lambda workers are the idiom meanwhile.
trusted. `18xx` asserts program-emitted ordering contracts, not raw interleaving.
## Log
- **B1.4c — real fd-readiness blocking via kqueue (macOS).** De-risked first with a no-scheduler probe
(confirmed `size_of(Kevent)==32` and the pipe→kevent roundtrip: `kq_wait` returned 1, `out.ident ==
read_fd`, `out.filter == -1`, `out.data == 1` — the struct layout reads the fd back correctly). Then
added to `library/modules/std/sched.sx` (importing the existing verified `std/net/kqueue.sx` as `kqb`
rather than re-deriving the FFI): a lazy `kq: i32` (-1 until first use), `io_waiters: List(IoWaiter)`,
`block_on_fd(fd, want_read)` (arm one-shot `EVFILT_READ`, record waiter, `suspend_self`), a run-loop
Mode 2 (block on `kq_wait(kq, evbuf, MAXEV=16, -1)` when only fd waiters remain, wake the fiber whose
fd fired), and `wake` now also evicts a stale fd-waiter (`cancel_io_waiter_for`, the same UAF guard as
`cancel_timer_for`). Timers keep precedence over fds (documented non-unification). Orphan-deadlock
check still fires for a genuine no-timer/no-fd suspend (probed: exit 134). Locked by
`1816-concurrency-fiber-io-pipe.sx` (reader blocks on empty pipe → writer writes `a b c` → kqueue
wakes reader → reads 3 bytes; `log: wrote read 3 [97 98 99]`, `n_suspended: 0`), `.build`
`{ "target": "macos" }`, runs end-to-end on host. The example's `read`/`write`/`close` externs use the
canonical signatures std already binds (extern-dedupe rejects a divergent re-binding). Suite GREEN
**754/0**. Next: B1.5 (end-to-end M:1 validation); linux epoll twin deferred.
- **carve** — wrote PLAN-FIBERS.md + CHECKPOINT-FIBERS.md. Grounded the B1 compiler floor:
`ABI.naked` inert (type_resolver.zig:237), IR `Function` has no naked flag (inst.zig:605),
attribute API pattern (emit_llvm.zig:1339 nounwind), `.c` ctx-skip precedent