fibers: deterministic virtual-time timers (B1.4b)

Add a virtual clock + sleep timers to the M:1 scheduler so fibers
schedule in reproducible simulated time. Scheduler gains clock_ms (the
virtual clock, advances only as timers fire), a timers list, now_ms(),
sleep(ms) (arm {clock_ms+ms, current} + suspend), and a timer-driven
run (drain ready -> fire earliest timer -> advance clock -> wake ->
repeat; the orphan-suspend deadlock check is preserved for a genuine
no-timer park). Wakes fire in deadline order with a FIFO tiebreak.
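
The run loop described above can be sketched in a few lines. This is only an illustration, not the `sched.sx` code: Python generators stand in for fibers, and the `Scheduler`, `spawn`, and `sleeper` names here are hypothetical. It assumes a fiber requests a sleep by yielding a duration in milliseconds, and it replays the five-fiber scenario the commit attributes to example 1814.

```python
from collections import deque


class Scheduler:
    """Toy M:1 scheduler with a virtual clock (illustrative, not sched.sx)."""

    def __init__(self):
        self.clock_ms = 0      # virtual clock: advances only when a timer fires
        self.ready = deque()   # runnable fibers, FIFO
        self.timers = []       # insertion-ordered list of (deadline_ms, fiber)

    def spawn(self, fiber):
        self.ready.append(fiber)

    def run(self):
        while self.ready or self.timers:
            # 1. Drain the ready queue; a fiber that yields N asks to sleep N ms.
            while self.ready:
                fiber = self.ready.popleft()
                try:
                    ms = next(fiber)
                except StopIteration:
                    continue  # fiber finished
                self.timers.append((self.clock_ms + ms, fiber))
            # 2. Fire the earliest timer. min() returns the first minimal
            #    element, so equal deadlines fire in insertion (FIFO) order.
            if self.timers:
                i = min(range(len(self.timers)), key=lambda k: self.timers[k][0])
                deadline, fiber = self.timers.pop(i)
                self.clock_ms = deadline   # advance the clock to the deadline
                self.ready.append(fiber)   # wake the sleeper


sched = Scheduler()
log = []


def sleeper(name, ms):
    yield ms                               # sleep(ms)
    log.append((name, sched.clock_ms))     # record the virtual wake time


for name, ms in [("A", 30), ("B", 10), ("C", 20), ("D", 15), ("E", 15)]:
    sched.spawn(sleeper(name, ms))
sched.run()
print(log, sched.clock_ms)
```

The linear min-scan over an insertion-ordered list is what gives the FIFO tiebreak for free: D and E share a deadline of 15, and D was armed first.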

Adversarial review found a use-after-free: a fiber woken early (manual
or Task wake) before its sleep timer fired was reaped while its Timer
kept a dangling *Fiber, so a later fire dereferenced freed memory.
Fixed: wake evicts the fiber's pending timer (cancel_timer_for) -- every
re-ready path funnels through wake, so no stale timer outlives its fiber.
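
The early-wake hazard and its fix can be modelled roughly as follows. Python cannot express a real use-after-free, so this sketch substitutes a `reaped` flag and raises when a reaped fiber is woken. Everything here is hypothetical scaffolding rather than the `sched.sx` implementation, apart from the `wake` and `cancel_timer_for` names taken from the message; the `evict_on_wake` switch exists only so the unfixed behaviour can be reproduced.

```python
from collections import deque


class Fiber:
    def __init__(self, body):
        self.body = body        # generator standing in for the fiber's stack
        self.reaped = False     # set once the fiber has run to completion


class Scheduler:
    def __init__(self, evict_on_wake=True):
        self.clock_ms = 0
        self.ready = deque()
        self.timers = []                    # list of (deadline_ms, fiber)
        self.evict_on_wake = evict_on_wake  # False reproduces the bug

    def spawn(self, body):
        fiber = Fiber(body)
        self.ready.append(fiber)
        return fiber

    def cancel_timer_for(self, fiber):
        # Drop any pending timer that still points at this fiber.
        self.timers = [(d, f) for (d, f) in self.timers if f is not fiber]

    def wake(self, fiber):
        if fiber.reaped:
            raise RuntimeError("woke a reaped fiber (stand-in for the UAF)")
        if self.evict_on_wake:
            self.cancel_timer_for(fiber)
        self.ready.append(fiber)

    def run(self):
        while self.ready or self.timers:
            while self.ready:
                fiber = self.ready.popleft()
                try:
                    op, arg = next(fiber.body)
                except StopIteration:
                    fiber.reaped = True     # fiber finished: reap it
                    continue
                if op == "sleep":
                    self.timers.append((self.clock_ms + arg, fiber))
                elif op == "wake":
                    self.wake(arg)
                    self.ready.append(fiber)
            if self.timers:
                i = min(range(len(self.timers)), key=lambda k: self.timers[k][0])
                deadline, fiber = self.timers.pop(i)
                self.clock_ms = deadline
                self.wake(fiber)


def early_wake_demo(evict_on_wake):
    sched = Scheduler(evict_on_wake)
    log = []

    def sleeper():
        yield ("sleep", 100)
        log.append(sched.clock_ms)   # virtual time at which the sleeper resumed

    def waker(target):
        yield ("wake", target)       # wake the sleeper before its timer fires

    target = sched.spawn(sleeper())
    sched.spawn(waker(target))
    sched.run()
    return log, sched.clock_ms
```

With eviction enabled, `early_wake_demo(True)` finishes with the clock still at 0 because the 100 ms timer was removed when the sleeper was woken. With `early_wake_demo(False)` the stale timer survives, fires after the sleeper has been reaped, and trips the guard in `wake`.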

Examples: 1814 (sim-timer deadline ordering), 1815 (early-wake timer
eviction regression). Suite green 753/0.
@@ -4,8 +4,33 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step
 per the cadence rule). New corpus category: `18xx` concurrency.
 
 ## Last completed step
-**B1.4a — a truly-SUSPENDING fiber-task async layer (`go`/`wait`/`cancel`) — landed +
-adversarially reviewed; cleared two more compiler blockers en route.** `library/modules/std/sched.sx`
+**B1.4b — deterministic VIRTUAL-TIME timer scheduling (the KEYSTONE) — landed + adversarially
+reviewed (caught a CRITICAL UAF, fixed).** `library/modules/std/sched.sx` gained a virtual clock +
+sleep timers so fibers schedule in reproducible simulated time (no real clock): `clock_ms` (advances
+ONLY as timers fire), a `timers: List(Timer)` (insertion-order, linear min-scan, FIFO tiebreak),
+`now_ms()`, `sleep(ms)` (arm `{clock_ms+ms, current}` + `suspend_self`), and a timer-driven `run`
+(drain ready → fire earliest timer → advance clock → wake → repeat; orphan-deadlock check preserved
+for a genuine no-timer suspend). Locked by `1814` (5 fibers sleep 30/10/20/15/15 → wake order
+B@10, D@15, E@15 (FIFO), C@20, A@30 — deadline order, not spawn order; `now_ms()` reads each virtual
+deadline; final clock 30). §8.1.3 calibration note in the header: the deterministic wake ORDER
+equals what real `sleep`s produce, reproducing blocking semantics' observable ordering without real
+time. The deterministic-sim `Io` is realized at the scheduler level (`sleep`/`now_ms`/timer-`run`),
+not as an erased `Io`-protocol impl (same erasure reason as FiberIo).
+- **Adversarial review (worker) of the run-loop change: found a CRITICAL use-after-free** — a fiber
+that armed a `sleep` timer but was woken EARLY by another path (a manual/`Task` `wake`) ran to
+completion + was reaped (stack `munmap`'d, `Fiber` freed) while its `Timer` still held a dangling
+`*Fiber`; a later fire would `wake` freed memory (silent-corruption: "passes" only because the
+freed slot coincidentally read `state != .suspended`). FIXED: `wake` now evicts the woken fiber's
+pending timer (`cancel_timer_for`) — every re-ready path funnels through `wake` (the timer-fire in
+`run` already removed the fired timer, so it's a harmless re-scan there), so no stale timer can
+outlive its fiber. Regression `1815-concurrency-fiber-timer-early-wake.sx` (early wake → `clock: 0`,
+the stale 100ms timer evicted, not fired). Review CLEARED: `n_suspended` accounting,
+orphan-deadlock false-positives, timer-list integrity (re-arm during fire), clock monotonicity,
+termination — all traced/probed safe.
+- Suite GREEN (count below). Next: **B1.4c** (event-loop `Io` — real fd readiness, kqueue/epoll).
+
+### Earlier — B1.4a — a truly-SUSPENDING fiber-task async layer (`go`/`wait`/`cancel`)
+landed + adversarially reviewed; cleared two more compiler blockers en route. `library/modules/std/sched.sx`
 now carries `Task($R)` + `Scheduler.go(work) -> *Task($R)` + `wait`/`cancel` (a `ufcs` layer over
 the M:1 scheduler). `s.go(work)` runs the nullary thunk `work` as a REAL fiber; `t.wait()` SUSPENDS
 the caller until it completes (vs io.sx's blocking `context.io.async`, which runs inline). Locked by
@@ -257,21 +282,21 @@ body); closed + locked. The review's `.naked`-lambda CRITICAL was a false positi
 (unparseable — `isLambda` breaks on the `abi` keyword).
 
 ## Current state
-**B1.4a COMPLETE — truly-suspending fiber-task async exists.** `library/modules/std/sched.sx` carries
-the M:1 scheduler core (B1.5a) PLUS the async-task layer: `Task($R)` + `Scheduler.go(work) ->
-*Task($R)` + `wait`/`cancel`. `s.go(work)` spawns a nullary thunk as a fiber; `t.wait()` suspends
-the caller until it completes. Locked by `1813` (`sequence: 1 2 3 42 100 -99` — real interleave +
-awaited values + cancel). Two compiler blockers fixed en route (0156 Part 1 — `$R` type-arg in a
-pack-fn; 0157 — UFCS generic name collision), both regression-tested (`0216`, `0217`). Adversarially
-reviewed; determinism + non-fiber-wait + cancel-skip-work all hardened. The io.sx blocking
-`context.io.async` (1805/1806) is untouched and coexists. Suite GREEN 751/0.
+**B1.4b COMPLETE — deterministic virtual-time timer scheduling exists.** `library/modules/std/sched.sx`
+now carries: the M:1 scheduler core (B1.5a: `spawn`/`yield_now`/`suspend_self`/`wake`/`run`), the
+suspending fiber-task async (B1.4a: `Task($R)`/`go`/`wait`/`cancel`), AND deterministic timers (B1.4b:
+`clock_ms` virtual clock, `timers` list, `now_ms`/`sleep`, timer-driven `run`). Fibers `sleep(ms)` in
+reproducible simulated time and wake in deadline order. The timer-vs-early-wake UAF found in review is
+fixed (`wake` evicts the fiber's pending timer). Locked by `1811` (round-robin), `1812` (suspend/wake),
+`1813` (async go/wait/cancel), `1814` (sim-timer deadline ordering), `1815` (timer early-wake eviction).
+Suite GREEN (count below).
 
-The remaining B1.4 work: **B1.4b** the deterministic-sim `Io` (virtual clock + timer min-heap,
-calibrated against blocking — the KEYSTONE test harness), **B1.4c** the event-loop `Io`
-(kqueue/epoll). Then **B1.5** end-to-end M:1 validation under the deterministic `Io`. NOTE: the
-suspending async lives as `sched.go`/`wait` (M:1, receiver-driven), NOT routed through the erased
+The remaining B1 work: **B1.4c** the event-loop `Io` (kqueue mac / epoll linux — real fd readiness),
+then **B1.5** end-to-end M:1 validation under the deterministic timers. NOTE: the suspending async +
+deterministic timers live as `sched.*` methods (M:1, receiver-driven), NOT routed through the erased
 `context.io` (which would force sched.sx into every std consumer + duplicate the `_fib_tramp` global
-asm); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready` remain reserved for the M:N evolution.
+asm); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/`arm_timer`/`poll` remain reserved for the
+M:N evolution / when a program wants the capability-threaded form.
 
 ### Earlier — B1.5a COMPLETE — the M:1 scheduler CORE exists
 `library/modules/std/sched.sx` drives N fibers
@@ -363,24 +388,22 @@ fibers/Io/scheduler code yet. Grounded floor facts:
 boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker.
 
 ## Next step
-**→ B1.4b — the deterministic-sim `Io` (the KEYSTONE test harness).** B1.4a (suspending fiber-task
-async, `sched.go`/`wait`) is done. Now build a deterministic `Io` impl: a virtual clock (`now_ms`
-returns simulated time), a timer min-heap (`arm_timer` schedules a wake at a sim deadline), and
-`poll` advances the clock to the next due timer and wakes its parked fiber. Drive it over the M:1
-scheduler so a program using sim-time sleeps/timeouts runs fully deterministically. **Calibrate it
-against blocking `Io`** (§8.1.3): the same program under blocking vs deterministic `Io` must produce
-the same observable result before the deterministic one is trusted to gate async tests. Lock with an
-`18xx` example asserting a program-emitted ORDERING contract (sim-time scheduling), aarch64-pinned
-(`.build {"target":"macos"}`). This harness gates B1.5 + Stream B2.
+**→ B1.4c — the event-loop `Io` (real fd readiness).** B1.4b (deterministic virtual-time timers,
+`sched.sleep`/`now_ms`/timer-`run`) is done — the KEYSTONE deterministic harness exists at the
+scheduler level. Now add real-I/O readiness: a `poll`-style step over `kqueue` (macOS) / `epoll`
+(linux) that blocks until an fd is readable/writable (or a real-time timeout), then wakes the parked
+fiber waiting on it. Likely shape: a `block_on_fd(fd, events)` that registers the current fiber's
+interest, suspends, and is woken when `run`'s poll step reports the fd ready. Lock with an `18xx`
+example doing genuine fd I/O (e.g. a `pipe(2)`: a fiber blocks reading, another writes, the reader
+wakes with the bytes) — aarch64-macOS-pinned, kqueue. The deterministic timers (1814) and real I/O
+should compose (a real `poll` with a timeout vs the virtual clock — keep them as separate run modes,
+or unify with care). Then **B1.5** end-to-end M:1 validation. The §10.7 gate (1808) + guarded-stack
+(1809) + Win64 (1810) + scheduler/async/timers (1811-1815) must keep passing throughout.
 
-Then: **B1.4c** event-loop `Io` (kqueue mac / epoll linux — real fd readiness), **B1.5** end-to-end
-M:1 validation under the deterministic `Io`. The §10.7 gate (1808) + guarded-stack (1809) + Win64
-(1810) + scheduler (1811/1812) + async (1813) must keep passing throughout.
-
-Open design question for B1.4b/c: a deterministic/event-loop `Io` needs a current-`Scheduler`
-handle to park/wake. `sched.go`/`wait` thread it via the `Task`; an `Io` impl that wants the same
-will likely need an ambient current-scheduler accessor in sched.sx (deferred from B1.4a — the
-`Task`-threaded form sufficed). Decide when wiring `arm_timer` → a parked fiber.
+Design note carried forward: an event-loop `Io` needs a current-`Scheduler` handle. `sched.*` methods
+thread it via `self`/the `Task`; if B1.4c wants the capability-threaded `context.io` form it'll need
+an ambient current-scheduler accessor in sched.sx (still deferred — the `sched.*`-method form
+suffices). The `Io` protocol's `poll`/`arm_timer` map onto this when/if that wiring is built.
 
 **Side thread (optional, low priority): the SysV/Linux x86_64 sibling.** A THIRD switch variant
 for `x86_64-linux`: SysV callee-saved = rbx, rbp, r12-r15 + rsp (6 GP + sp; **no** callee-saved
@@ -670,3 +693,13 @@ incomplete); a dedicated effort; lambda workers are the idiom meanwhile.
 diagnostic), a `wait`-outside-fiber null-deref (loud guard), and cancel-not-skipping-work (skip
 if pre-canceled) — all fixed. Simplified `1812` (`**Fiber` → `Sh.parked`). 0156 Part 2 reframed
 OPEN/non-blocking. Suite GREEN **751/0**. Next: B1.4b (deterministic-sim `Io`, the KEYSTONE).
+- **B1.4b COMPLETE (this session) — deterministic virtual-time timers + a CRITICAL UAF fix.** Added
+`clock_ms`/`timers`/`now_ms`/`sleep` + a timer-driven `run` to `sched.sx` (worker-built): fibers
+sleep in reproducible simulated time, waking in deadline order (FIFO tiebreak). Locked `1814`
+(5 fibers, wake order B@10/D@15/E@15/C@20/A@30). Adversarial review of the run-loop change found a
+CRITICAL use-after-free — a fiber woken EARLY (manual/Task `wake`) before its `sleep` timer fired
+was reaped while its `Timer` kept a dangling `*Fiber`; a later fire dereferenced freed memory
+(silent "pass" only by luck). Fixed: `wake` evicts the fiber's pending timer (`cancel_timer_for`);
+regression `1815` (early wake → `clock: 0`, stale timer never fires). Review cleared n_suspended
+accounting, deadlock false-positives, timer-list integrity, clock monotonicity, termination.
+Suite GREEN **753/0**. Next: B1.4c (event-loop `Io`, kqueue/epoll).
@@ -7,11 +7,10 @@
 > `suspend_self`/`wake`/`run`) ✅** (fixed blocker 0154) · **B1.4a (suspending fiber-task async —
 > `sched.go`/`wait`/`cancel` over `Task($R)`, nullary-thunk) ✅** (adversarially reviewed; fixed
 > blockers 0156-Part1 + 0157 en route; locked `1813`).
-> **→ NOW: B1.4b** — the deterministic-sim `Io` (virtual clock + timer min-heap, calibrated against
-> blocking — §8.1.3, the KEYSTONE test harness). Then B1.4c (event-loop `Io`), B1.5 (end-to-end M:1
-> under deterministic `Io`). Detailed progress in [CHECKPOINT-FIBERS.md](CHECKPOINT-FIBERS.md).
-> NOTE: the suspending async is `sched.go`/`wait` (M:1, receiver-driven), NOT routed through the
-> erased `context.io` (avoids forcing sched.sx into every std consumer + the `_fib_tramp` dup-symbol
+> **B1.4b (deterministic virtual-time timers — sched.sleep/now_ms/timer-run) ✅** (reviewed; fixed a CRITICAL timer-vs-early-wake UAF; locked 1814/1815).
+> **→ NOW: B1.4c** — the event-loop `Io` (kqueue/epoll, real fd readiness). Then B1.5 (end-to-end
+> M:1). Detailed progress in [CHECKPOINT-FIBERS.md](CHECKPOINT-FIBERS.md). NOTE: suspending async +
+> deterministic timers live as `sched.*` methods (M:1), NOT routed through the erased `context.io` (avoids forcing sched.sx into every std consumer + the `_fib_tramp` dup-symbol
 > trap); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready` stay reserved for M:N. Deferred:
 > issue 0150 (`Future(void)`/`timeout`); 0156-Part2 (deferred `..` spread); the `::` callable-param
 > feature.