From 62ffea0663ccad93576db1220d3d144e62e20872 Mon Sep 17 00:00:00 2001
From: agra
Date: Sun, 21 Jun 2026 19:09:22 +0300
Subject: [PATCH] fibers: deterministic virtual-time timers (B1.4b)

Add a virtual clock + sleep timers to the M:1 scheduler so fibers
schedule in reproducible simulated time. Scheduler gains clock_ms (the
virtual clock, advances only as timers fire), a timers list, now_ms(),
sleep(ms) (arm {clock_ms+ms, current} + suspend), and a timer-driven run
(drain ready -> fire earliest timer -> advance clock -> wake -> repeat;
the orphan-suspend deadlock check is preserved for a genuine no-timer
park). Wakes fire in deadline order with a FIFO tiebreak.

Adversarial review found a use-after-free: a fiber woken early (manual
or Task wake) before its sleep timer fired was reaped while its Timer
kept a dangling *Fiber, so a later fire dereferenced freed memory.
Fixed: wake evicts the fiber's pending timer (cancel_timer_for) -- every
re-ready path funnels through wake, so no stale timer outlives its
fiber.

Examples: 1814 (sim-timer deadline ordering), 1815 (early-wake timer
eviction regression). Suite green 753/0.
--- current/CHECKPOINT-FIBERS.md | 97 ++++++--- current/PLAN-FIBERS.md | 9 +- .../1814-concurrency-fiber-sim-timer.sx | 74 +++++++ ...1815-concurrency-fiber-timer-early-wake.sx | 47 +++++ .../1814-concurrency-fiber-sim-timer.build | 1 + .../1814-concurrency-fiber-sim-timer.exit | 1 + .../1814-concurrency-fiber-sim-timer.stderr | 1 + .../1814-concurrency-fiber-sim-timer.stdout | 8 + ...5-concurrency-fiber-timer-early-wake.build | 1 + ...15-concurrency-fiber-timer-early-wake.exit | 1 + ...-concurrency-fiber-timer-early-wake.stderr | 1 + ...-concurrency-fiber-timer-early-wake.stdout | 2 + library/modules/std/sched.sx | 184 +++++++++++++++--- 13 files changed, 363 insertions(+), 64 deletions(-) create mode 100644 examples/concurrency/1814-concurrency-fiber-sim-timer.sx create mode 100644 examples/concurrency/1815-concurrency-fiber-timer-early-wake.sx create mode 100644 examples/concurrency/expected/1814-concurrency-fiber-sim-timer.build create mode 100644 examples/concurrency/expected/1814-concurrency-fiber-sim-timer.exit create mode 100644 examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stderr create mode 100644 examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stdout create mode 100644 examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.build create mode 100644 examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.exit create mode 100644 examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stderr create mode 100644 examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stdout diff --git a/current/CHECKPOINT-FIBERS.md b/current/CHECKPOINT-FIBERS.md index 8a804ffc..180ee24f 100644 --- a/current/CHECKPOINT-FIBERS.md +++ b/current/CHECKPOINT-FIBERS.md @@ -4,8 +4,33 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step per the cadence rule). New corpus category: `18xx` concurrency. 
## Last completed step -**B1.4a — a truly-SUSPENDING fiber-task async layer (`go`/`wait`/`cancel`) — landed + -adversarially reviewed; cleared two more compiler blockers en route.** `library/modules/std/sched.sx` +**B1.4b — deterministic VIRTUAL-TIME timer scheduling (the KEYSTONE) — landed + adversarially +reviewed (caught a CRITICAL UAF, fixed).** `library/modules/std/sched.sx` gained a virtual clock + +sleep timers so fibers schedule in reproducible simulated time (no real clock): `clock_ms` (advances +ONLY as timers fire), a `timers: List(Timer)` (insertion-order, linear min-scan, FIFO tiebreak), +`now_ms()`, `sleep(ms)` (arm `{clock_ms+ms, current}` + `suspend_self`), and a timer-driven `run` +(drain ready → fire earliest timer → advance clock → wake → repeat; orphan-deadlock check preserved +for a genuine no-timer suspend). Locked by `1814` (5 fibers sleep 30/10/20/15/15 → wake order +B@10, D@15, E@15 (FIFO), C@20, A@30 — deadline order, not spawn order; `now_ms()` reads each virtual +deadline; final clock 30). §8.1.3 calibration note in the header: the deterministic wake ORDER +equals what real `sleep`s produce, reproducing blocking semantics' observable ordering without real +time. The deterministic-sim `Io` is realized at the scheduler level (`sleep`/`now_ms`/timer-`run`), +not as an erased `Io`-protocol impl (same erasure reason as FiberIo). +- **Adversarial review (worker) of the run-loop change: found a CRITICAL use-after-free** — a fiber + that armed a `sleep` timer but was woken EARLY by another path (a manual/`Task` `wake`) ran to + completion + was reaped (stack `munmap`'d, `Fiber` freed) while its `Timer` still held a dangling + `*Fiber`; a later fire would `wake` freed memory (silent-corruption: "passes" only because the + freed slot coincidentally read `state != .suspended`). 
FIXED: `wake` now evicts the woken fiber's + pending timer (`cancel_timer_for`) — every re-ready path funnels through `wake` (the timer-fire in + `run` already removed the fired timer, so it's a harmless re-scan there), so no stale timer can + outlive its fiber. Regression `1815-concurrency-fiber-timer-early-wake.sx` (early wake → `clock: 0`, + the stale 100ms timer evicted, not fired). Review CLEARED: `n_suspended` accounting, + orphan-deadlock false-positives, timer-list integrity (re-arm during fire), clock monotonicity, + termination — all traced/probed safe. +- Suite GREEN (count below). Next: **B1.4c** (event-loop `Io` — real fd readiness, kqueue/epoll). + +### Earlier — B1.4a — a truly-SUSPENDING fiber-task async layer (`go`/`wait`/`cancel`) +landed + adversarially reviewed; cleared two more compiler blockers en route. `library/modules/std/sched.sx` now carries `Task($R)` + `Scheduler.go(work) -> *Task($R)` + `wait`/`cancel` (a `ufcs` layer over the M:1 scheduler). `s.go(work)` runs the nullary thunk `work` as a REAL fiber; `t.wait()` SUSPENDS the caller until it completes (vs io.sx's blocking `context.io.async`, which runs inline). Locked by @@ -257,21 +282,21 @@ body); closed + locked. The review's `.naked`-lambda CRITICAL was a false positi (unparseable — `isLambda` breaks on the `abi` keyword). ## Current state -**B1.4a COMPLETE — truly-suspending fiber-task async exists.** `library/modules/std/sched.sx` carries -the M:1 scheduler core (B1.5a) PLUS the async-task layer: `Task($R)` + `Scheduler.go(work) -> -*Task($R)` + `wait`/`cancel`. `s.go(work)` spawns a nullary thunk as a fiber; `t.wait()` suspends -the caller until it completes. Locked by `1813` (`sequence: 1 2 3 42 100 -99` — real interleave + -awaited values + cancel). Two compiler blockers fixed en route (0156 Part 1 — `$R` type-arg in a -pack-fn; 0157 — UFCS generic name collision), both regression-tested (`0216`, `0217`). 
Adversarially -reviewed; determinism + non-fiber-wait + cancel-skip-work all hardened. The io.sx blocking -`context.io.async` (1805/1806) is untouched and coexists. Suite GREEN 751/0. +**B1.4b COMPLETE — deterministic virtual-time timer scheduling exists.** `library/modules/std/sched.sx` +now carries: the M:1 scheduler core (B1.5a: `spawn`/`yield_now`/`suspend_self`/`wake`/`run`), the +suspending fiber-task async (B1.4a: `Task($R)`/`go`/`wait`/`cancel`), AND deterministic timers (B1.4b: +`clock_ms` virtual clock, `timers` list, `now_ms`/`sleep`, timer-driven `run`). Fibers `sleep(ms)` in +reproducible simulated time and wake in deadline order. The timer-vs-early-wake UAF found in review is +fixed (`wake` evicts the fiber's pending timer). Locked by `1811` (round-robin), `1812` (suspend/wake), +`1813` (async go/wait/cancel), `1814` (sim-timer deadline ordering), `1815` (timer early-wake eviction). +Suite GREEN (count below). -The remaining B1.4 work: **B1.4b** the deterministic-sim `Io` (virtual clock + timer min-heap, -calibrated against blocking — the KEYSTONE test harness), **B1.4c** the event-loop `Io` -(kqueue/epoll). Then **B1.5** end-to-end M:1 validation under the deterministic `Io`. NOTE: the -suspending async lives as `sched.go`/`wait` (M:1, receiver-driven), NOT routed through the erased +The remaining B1 work: **B1.4c** the event-loop `Io` (kqueue mac / epoll linux — real fd readiness), +then **B1.5** end-to-end M:1 validation under the deterministic timers. NOTE: the suspending async + +deterministic timers live as `sched.*` methods (M:1, receiver-driven), NOT routed through the erased `context.io` (which would force sched.sx into every std consumer + duplicate the `_fib_tramp` global -asm); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready` remain reserved for the M:N evolution. +asm); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/`arm_timer`/`poll` remain reserved for the +M:N evolution / when a program wants the capability-threaded form. 
### Earlier — B1.5a COMPLETE — the M:1 scheduler CORE exists `library/modules/std/sched.sx` drives N fibers @@ -363,24 +388,22 @@ fibers/Io/scheduler code yet. Grounded floor facts: boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker. ## Next step -**→ B1.4b — the deterministic-sim `Io` (the KEYSTONE test harness).** B1.4a (suspending fiber-task -async, `sched.go`/`wait`) is done. Now build a deterministic `Io` impl: a virtual clock (`now_ms` -returns simulated time), a timer min-heap (`arm_timer` schedules a wake at a sim deadline), and -`poll` advances the clock to the next due timer and wakes its parked fiber. Drive it over the M:1 -scheduler so a program using sim-time sleeps/timeouts runs fully deterministically. **Calibrate it -against blocking `Io`** (§8.1.3): the same program under blocking vs deterministic `Io` must produce -the same observable result before the deterministic one is trusted to gate async tests. Lock with an -`18xx` example asserting a program-emitted ORDERING contract (sim-time scheduling), aarch64-pinned -(`.build {"target":"macos"}`). This harness gates B1.5 + Stream B2. +**→ B1.4c — the event-loop `Io` (real fd readiness).** B1.4b (deterministic virtual-time timers, +`sched.sleep`/`now_ms`/timer-`run`) is done — the KEYSTONE deterministic harness exists at the +scheduler level. Now add real-I/O readiness: a `poll`-style step over `kqueue` (macOS) / `epoll` +(linux) that blocks until an fd is readable/writable (or a real-time timeout), then wakes the parked +fiber waiting on it. Likely shape: a `block_on_fd(fd, events)` that registers the current fiber's +interest, suspends, and is woken when `run`'s poll step reports the fd ready. Lock with an `18xx` +example doing genuine fd I/O (e.g. a `pipe(2)`: a fiber blocks reading, another writes, the reader +wakes with the bytes) — aarch64-macOS-pinned, kqueue. 
The deterministic timers (1814) and real I/O +should compose (a real `poll` with a timeout vs the virtual clock — keep them as separate run modes, +or unify with care). Then **B1.5** end-to-end M:1 validation. The §10.7 gate (1808) + guarded-stack +(1809) + Win64 (1810) + scheduler/async/timers (1811-1815) must keep passing throughout. -Then: **B1.4c** event-loop `Io` (kqueue mac / epoll linux — real fd readiness), **B1.5** end-to-end -M:1 validation under the deterministic `Io`. The §10.7 gate (1808) + guarded-stack (1809) + Win64 -(1810) + scheduler (1811/1812) + async (1813) must keep passing throughout. - -Open design question for B1.4b/c: a deterministic/event-loop `Io` needs a current-`Scheduler` -handle to park/wake. `sched.go`/`wait` thread it via the `Task`; an `Io` impl that wants the same -will likely need an ambient current-scheduler accessor in sched.sx (deferred from B1.4a — the -`Task`-threaded form sufficed). Decide when wiring `arm_timer` → a parked fiber. +Design note carried forward: an event-loop `Io` needs a current-`Scheduler` handle. `sched.*` methods +thread it via `self`/the `Task`; if B1.4c wants the capability-threaded `context.io` form it'll need +an ambient current-scheduler accessor in sched.sx (still deferred — the `sched.*`-method form +suffices). The `Io` protocol's `poll`/`arm_timer` map onto this when/if that wiring is built. **Side thread (optional, low priority): the SysV/Linux x86_64 sibling.** A THIRD switch variant for `x86_64-linux`: SysV callee-saved = rbx, rbp, r12-r15 + rsp (6 GP + sp; **no** callee-saved @@ -670,3 +693,13 @@ incomplete); a dedicated effort; lambda workers are the idiom meanwhile. diagnostic), a `wait`-outside-fiber null-deref (loud guard), and cancel-not-skipping-work (skip if pre-canceled) — all fixed. Simplified `1812` (`**Fiber` → `Sh.parked`). 0156 Part 2 reframed OPEN/non-blocking. Suite GREEN **751/0**. Next: B1.4b (deterministic-sim `Io`, the KEYSTONE). 
+- **B1.4b COMPLETE (this session) — deterministic virtual-time timers + a CRITICAL UAF fix.** Added + `clock_ms`/`timers`/`now_ms`/`sleep` + a timer-driven `run` to `sched.sx` (worker-built): fibers + sleep in reproducible simulated time, waking in deadline order (FIFO tiebreak). Locked `1814` + (5 fibers, wake order B@10/D@15/E@15/C@20/A@30). Adversarial review of the run-loop change found a + CRITICAL use-after-free — a fiber woken EARLY (manual/Task `wake`) before its `sleep` timer fired + was reaped while its `Timer` kept a dangling `*Fiber`; a later fire dereferenced freed memory + (silent "pass" only by luck). Fixed: `wake` evicts the fiber's pending timer (`cancel_timer_for`); + regression `1815` (early wake → `clock: 0`, stale timer never fires). Review cleared n_suspended + accounting, deadlock false-positives, timer-list integrity, clock monotonicity, termination. + Suite GREEN **753/0**. Next: B1.4c (event-loop `Io`, kqueue/epoll). diff --git a/current/PLAN-FIBERS.md b/current/PLAN-FIBERS.md index 3bb56772..4324b9d9 100644 --- a/current/PLAN-FIBERS.md +++ b/current/PLAN-FIBERS.md @@ -7,11 +7,10 @@ > `suspend_self`/`wake`/`run`) ✅** (fixed blocker 0154) · **B1.4a (suspending fiber-task async — > `sched.go`/`wait`/`cancel` over `Task($R)`, nullary-thunk) ✅** (adversarially reviewed; fixed > blockers 0156-Part1 + 0157 en route; locked `1813`). -> **→ NOW: B1.4b** — the deterministic-sim `Io` (virtual clock + timer min-heap, calibrated against -> blocking — §8.1.3, the KEYSTONE test harness). Then B1.4c (event-loop `Io`), B1.5 (end-to-end M:1 -> under deterministic `Io`). Detailed progress in [CHECKPOINT-FIBERS.md](CHECKPOINT-FIBERS.md). 
-> NOTE: the suspending async is `sched.go`/`wait` (M:1, receiver-driven), NOT routed through the -> erased `context.io` (avoids forcing sched.sx into every std consumer + the `_fib_tramp` dup-symbol +> **B1.4b (deterministic virtual-time timers — sched.sleep/now_ms/timer-run) ✅** (reviewed; fixed a CRITICAL timer-vs-early-wake UAF; locked 1814/1815). +> **→ NOW: B1.4c** — the event-loop `Io` (kqueue/epoll, real fd readiness). Then B1.5 (end-to-end +> M:1). Detailed progress in [CHECKPOINT-FIBERS.md](CHECKPOINT-FIBERS.md). NOTE: suspending async + +> deterministic timers live as `sched.*` methods (M:1), NOT routed through the erased `context.io` (avoids forcing sched.sx into every std consumer + the `_fib_tramp` dup-symbol > trap); the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready` stay reserved for M:N. Deferred: > issue 0150 (`Future(void)`/`timeout`); 0156-Part2 (deferred `..` spread); the `::` callable-param > feature. diff --git a/examples/concurrency/1814-concurrency-fiber-sim-timer.sx b/examples/concurrency/1814-concurrency-fiber-sim-timer.sx new file mode 100644 index 00000000..a7cb0a6b --- /dev/null +++ b/examples/concurrency/1814-concurrency-fiber-sim-timer.sx @@ -0,0 +1,74 @@ +// Stream B1 (fibers) B1.4b — deterministic VIRTUAL-TIME timer scheduling (the +// KEYSTONE), in pure sx over the M:1 scheduler. A fiber `sleep(ms)`s in +// SIMULATED time; the scheduler wakes fibers in DEADLINE order, advancing a +// virtual clock that moves only when the ready queue drains and the earliest +// timer fires. No real wall clock is ever read — the wake ORDER and the +// observed timestamps are fully reproducible, which is exactly what a +// deterministic-sim Io test harness needs. +// +// HOW IT WORKS. `s.sleep(ms)` arms a timer `{ clock_ms + ms, current }` and +// parks the fiber off-queue. 
`s.run` drives ready fibers to quiescence, then +// fires the earliest pending timer: it advances `clock_ms` to that deadline and +// `wake`s the sleeper (re-readying it), and repeats until both the ready queue +// AND the timer set are empty. So a fiber that just woke reads `now_ms()` equal +// to its own deadline. +// +// WHAT THIS PROVES. +// - Deadline-ordered wake (NOT spawn order): spawn A, B, C in that order; +// A sleep(30), B sleep(10), C sleep(20). Wakes fire B(10), C(20), A(30) — +// reordered by deadline, not by spawn order. +// - Virtual timestamps: each fiber on wake reads `now_ms()` == its deadline +// (10, 20, 30) — the virtual clock landed exactly on the firing deadline. +// - FIFO tiebreak: two fibers D, E both sleep(15) — they wake in spawn +// (insertion) order D then E, the deterministic equal-deadline contract. +// +// §8.1.3 CALIBRATION NOTE. The deterministic virtual-time wake ORDER equals +// what real `sleep`s would produce: under real blocking sleeps the OS would +// also wake the shortest sleeper first, i.e. in deadline order. The sim +// reproduces blocking semantics' OBSERVABLE ordering (and the relative +// timestamps) without consuming real time or admitting nondeterminism — so a +// harness can assert exact orderings that a wall-clock test could only +// approximate. (No real-time variant is run here; the equivalence is the +// contract the deterministic test relies on.) +// +// aarch64-macOS-pinned (the scheduler's `swap_context` asm + guard-page mmap +// constants are per-arch / Apple-specific): runs end-to-end on a matching host, +// ir-only on a mismatch. +#import "modules/std.sx"; +sched :: #import "modules/std/sched.sx"; + +// Shared wake log, captured by pointer into each fiber's thunk (closure +// capture-by-value does not write back, so outputs flow through `*Log`). 
+Log :: struct { ids: [16]i64; ts: [16]i64; n: i64; } +rec :: (l: *Log, id: i64, t: i64) { l.ids[l.n] = id; l.ts[l.n] = t; l.n = l.n + 1; } + +main :: () -> i64 { + lg : Log = ---; + lg.n = 0; + + s := sched.Scheduler.init(); + ps := @s; + pl := @lg; + + // Spawn order A, B, C, D, E — but the WAKE order is set by deadline. + ps.spawn(() => { ps.sleep(30); rec(pl, 1, ps.now_ms()); }); // A: latest + ps.spawn(() => { ps.sleep(10); rec(pl, 2, ps.now_ms()); }); // B: earliest + ps.spawn(() => { ps.sleep(20); rec(pl, 3, ps.now_ms()); }); // C: middle + // Same-deadline FIFO pair: D before E, both at t=15 → wake D then E. + ps.spawn(() => { ps.sleep(15); rec(pl, 4, ps.now_ms()); }); // D + ps.spawn(() => { ps.sleep(15); rec(pl, 5, ps.now_ms()); }); // E + + s.run(); + + // Ordering contract: deadline order with a FIFO tiebreak → B, D, E, C, A + // at virtual times 10, 15, 15, 20, 30. + print("wake order (id @ virtual-ms):\n"); + i := 0; + while i < lg.n { + print(" id={} @ {}ms\n", lg.ids[i], lg.ts[i]); + i = i + 1; + } + print("final virtual clock: {}ms\n", s.now_ms()); + print("spawned: {}\n", s.n_spawned); + return 0; +} diff --git a/examples/concurrency/1815-concurrency-fiber-timer-early-wake.sx b/examples/concurrency/1815-concurrency-fiber-timer-early-wake.sx new file mode 100644 index 00000000..d279c0ae --- /dev/null +++ b/examples/concurrency/1815-concurrency-fiber-timer-early-wake.sx @@ -0,0 +1,47 @@ +// Stream B1 (fibers) B1.4b — a fiber's pending `sleep` timer is EVICTED when it +// is woken early by another path, so a stale timer can never outlive (and +// dereference) a reaped fiber. +// +// Scenario: a "sleeper" fiber arms `sleep(100)` and parks; a "waker" fiber wakes +// it EARLY (at virtual t=0) via `wake`. The sleeper resumes, finishes, and is +// reaped (its stack `munmap`'d + `Fiber` freed). 
Its 100ms timer must already be +// gone — otherwise, when the run loop later fired that stale timer, it would +// `wake` a freed `*Fiber` (use-after-free) and wrongly advance the virtual clock +// to 100. Here `wake` evicts the timer, so the clock stays at 0 and nothing +// dereferences freed memory. +// +// Regression: the timer-vs-early-wake use-after-free found reviewing B1.4b. +// Contract: `log: 2 1` (waker records 2, then the early-woken sleeper records 1), +// `clock: 0` (no stale timer fired), `n_suspended: 0` (balanced). +// +// aarch64-macOS-pinned (the scheduler's per-arch asm + Apple mmap constants): +// runs end-to-end on a matching host, ir-only on a mismatch. +#import "modules/std.sx"; +sched :: #import "modules/std/sched.sx"; + +S :: struct { sleeper: *sched.Fiber; log: [8]i64; n: i64; } +rec :: (s: *S, v: i64) { s.log[s.n] = v; s.n = s.n + 1; } + +main :: () -> i64 { + st : S = ---; st.n = 0; st.sleeper = null; + s := sched.Scheduler.init(); + ps := @s; pst := @st; + + // Sleeper: arm sleep(100), park; when woken (early), record 1 and finish. + mk_sleeper :: (ps: *sched.Scheduler, pst: *S) { + pst.sleeper = ps.spawn(() => { ps.sleep(100); rec(pst, 1); }); + } + // Waker: record 2, then wake the sleeper BEFORE its 100ms timer fires. 
+ mk_waker :: (ps: *sched.Scheduler, pst: *S) { + ps.spawn(() => { rec(pst, 2); ps.wake(pst.sleeper); }); + } + mk_sleeper(ps, pst); + mk_waker(ps, pst); + s.run(); + + print("log:"); + i := 0; while i < st.n { print(" {}", st.log[i]); i = i + 1; } + print("\n"); + print("clock: {} n_suspended: {}\n", s.now_ms(), s.n_suspended); + return 0; +} diff --git a/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.build b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.build new file mode 100644 index 00000000..42e24dd2 --- /dev/null +++ b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.build @@ -0,0 +1 @@ +{ "target": "macos" } diff --git a/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.exit b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.exit new file mode 100644 index 00000000..573541ac --- /dev/null +++ b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.exit @@ -0,0 +1 @@ +0 diff --git a/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stderr b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stderr new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stderr @@ -0,0 +1 @@ + diff --git a/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stdout b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stdout new file mode 100644 index 00000000..44850627 --- /dev/null +++ b/examples/concurrency/expected/1814-concurrency-fiber-sim-timer.stdout @@ -0,0 +1,8 @@ +wake order (id @ virtual-ms): + id=2 @ 10ms + id=4 @ 15ms + id=5 @ 15ms + id=3 @ 20ms + id=1 @ 30ms +final virtual clock: 30ms +spawned: 5 diff --git a/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.build b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.build new file mode 100644 index 00000000..42e24dd2 --- /dev/null +++ 
b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.build @@ -0,0 +1 @@ +{ "target": "macos" } diff --git a/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.exit b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.exit new file mode 100644 index 00000000..573541ac --- /dev/null +++ b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.exit @@ -0,0 +1 @@ +0 diff --git a/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stderr b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stderr new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stderr @@ -0,0 +1 @@ + diff --git a/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stdout b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stdout new file mode 100644 index 00000000..32e17e2e --- /dev/null +++ b/examples/concurrency/expected/1815-concurrency-fiber-timer-early-wake.stdout @@ -0,0 +1,2 @@ +log: 2 1 +clock: 0 n_suspended: 0 diff --git a/library/modules/std/sched.sx b/library/modules/std/sched.sx index 388b8686..5fb4a45b 100644 --- a/library/modules/std/sched.sx +++ b/library/modules/std/sched.sx @@ -57,6 +57,15 @@ Fiber :: struct { next: *Fiber; // intrusive FIFO ready-queue link } +// A pending virtual-time timer: wake `fiber` once the virtual clock reaches +// `deadline_ms`. Stored in `Scheduler.timers` (a `List`) in insertion order, so +// a linear min-scan that takes the FIRST entry at the minimum deadline gives a +// stable FIFO tiebreak for equal deadlines. 
+Timer :: struct { + deadline_ms: i64; + fiber: *Fiber; +} + Scheduler :: struct { sched_ctx: FiberCtx; // the scheduler loop's own saved context current: *Fiber; // running fiber; null while in the scheduler loop @@ -67,6 +76,15 @@ Scheduler :: struct { n_spawned: i64; n_suspended: i64; // fibers parked off-queue (suspend_self minus wake) + // --- B1.4b: deterministic virtual-time timer scheduling ---------------- + clock_ms: i64; // the VIRTUAL clock (ms). Starts 0; advances ONLY + // when the ready queue drains and the earliest + // pending timer fires. No real wall clock is ever + // read — wake ORDER + timestamps are reproducible. + timers: List(Timer); // pending sleep timers, in insertion order. Grown + // through `own_allocator` (long-lived-container + // rule: a timer outlives the `sleep` call's scope). + // Construct a scheduler BY VALUE (allocator value-return convention). // Captures the current `context.allocator` into `own_allocator` — fibers and // their heap `Fiber` structs outlive their spawn scope, so all internal @@ -81,6 +99,8 @@ Scheduler :: struct { s.next_id = 0; s.n_spawned = 0; s.n_suspended = 0; + s.clock_ms = 0; + s.timers = .{}; return s; } @@ -162,38 +182,97 @@ Scheduler :: struct { // a genuinely parked fiber may be re-enqueued; any other wake is a no-op. wake :: (self: *Scheduler, f: *Fiber) { if f.state != .suspended { return; } + // Evict any pending sleep timer for `f`. EVERY path that re-readies a + // suspended fiber funnels through `wake` (a manual/Task wake, or the + // timer-fire in `run` — which already removed the fired timer, so this + // is a harmless re-scan there). Without this, a fiber that armed a + // `sleep` timer but was woken EARLY by another path would run to + // completion and be reaped (stack munmap'd + Fiber freed) while its + // Timer still held a dangling `*Fiber` — a later fire would dereference + // freed memory (use-after-free). 
One timer per fiber max in the M:1 + // model, so a single eviction suffices; it also prevents a stale timer + // from spuriously re-waking a since-re-slept fiber. + cancel_timer_for(self, f); self.n_suspended = self.n_suspended - 1; f.state = .ready; enqueue(self, f); } - // The scheduler loop. Runs until the ready queue drains. Each iteration: - // dequeue the next fiber, switch into it, and — on its switch back — reap it - // if done (munmap stack, free the Fiber), re-enqueue it if it yielded, or - // leave it parked if it suspended. - run :: (self: *Scheduler) { - while self.ready_head != null { - f := dequeue(self); - self.current = f; - f.state = .running; - swap_context(@self.sched_ctx, @f.ctx); // returns here when f yields / suspends / finishes - self.current = null; - if f.state == .done { - // We've switched OFF f's stack already (the final swap landed - // here), so the stack is free to unmap. Free the Fiber struct - // AFTER munmap. - munmap(f.stack_region, f.stack_len); - self.own_allocator.dealloc_bytes(xx f); - } else if f.state == .ready { - enqueue(self, f); - } - // .suspended: leave it parked (not in any queue; `wake` re-adds it). + // Read the VIRTUAL clock — the simulated millisecond time. Advances only as + // timers fire (in `run`), never from a real wall clock, so two runs of the + // same fiber program observe identical timestamps. A fiber that just woke + // from `sleep(ms)` sees `now_ms()` equal to its deadline. + now_ms :: (self: *Scheduler) -> i64 { + return self.clock_ms; + } + + // Sleep the running fiber for `ms` simulated milliseconds: arm a timer at + // `clock_ms + ms`, then park off-queue. The scheduler advances the virtual + // clock to this deadline and wakes the fiber once the ready queue has fully + // drained AND no earlier timer is pending (deadline order, FIFO tiebreak). + // MUST be called from inside a fiber (there must be a `current` to park); + // a null `current` bails loudly, mirroring `suspend_self`. 
+ // + // Virtual time only moves forward: `ms >= 0` makes the deadline + // `>= clock_ms`, so a fired timer never rewinds the clock. + sleep :: (self: *Scheduler, ms: i64) { + cur := self.current; + if cur == null { + print("sched: sleep() called outside a fiber (no running fiber)\n"); + abort(); } - // The queue drained. If any fiber is still parked, nothing will ever - // wake it — its stack + struct are leaked and the program believes it - // finished. That is a deadlock; surface it loudly rather than returning - // a silent success. (FiberIo, which uses suspend/wake, must balance - // every suspend with a wake before the queue empties.) + t : Timer = .{ deadline_ms = self.clock_ms + ms, fiber = cur }; + // Long-lived-container rule: a timer outlives this `sleep` call's scope + // (it survives in `self.timers` until the scheduler fires it), so grow + // through the captured `own_allocator`, never the transient current one. + self.timers.append(t, self.own_allocator); + self.suspend_self(); // parks `cur` off-queue; the timer fire re-wakes it + } + + // The scheduler loop. Drives ready fibers to quiescence, then advances the + // virtual clock by firing the earliest pending timer (which re-readies its + // sleeper), and repeats — until both the ready queue and the timer set are + // empty. Within the inner drain each iteration: dequeue the next fiber, + // switch into it, and — on its switch back — reap it if done (munmap stack, + // free the Fiber), re-enqueue it if it yielded, or leave it parked if it + // suspended. + run :: (self: *Scheduler) { + while true { + while self.ready_head != null { + f := dequeue(self); + self.current = f; + f.state = .running; + swap_context(@self.sched_ctx, @f.ctx); // returns here when f yields / suspends / finishes + self.current = null; + if f.state == .done { + // We've switched OFF f's stack already (the final swap landed + // here), so the stack is free to unmap. Free the Fiber struct + // AFTER munmap. 
+ munmap(f.stack_region, f.stack_len); + self.own_allocator.dealloc_bytes(xx f); + } else if f.state == .ready { + enqueue(self, f); + } + // .suspended: leave it parked (not in any queue; `wake` re-adds it). + } + // Ready queue drained. Fire the earliest pending timer — the one + // sleeper whose deadline is next — advancing the virtual clock to it. + // No timers left ⇒ nothing more can run; exit the loop. + idx := earliest_timer(self); + if idx < 0 { break; } + t := self.timers.items[idx]; + remove_timer(self, idx); + self.clock_ms = t.deadline_ms; // advance VIRTUAL time forward + self.wake(t.fiber); // re-enqueue the sleeper → drain again + } + // Both the ready queue and the timer set are empty. If a fiber is STILL + // parked, no timer will ever wake it (a `suspend_self` without an armed + // timer, never externally woken) — its stack + struct are leaked and the + // program believes it finished. That is a genuine deadlock; surface it + // loudly. (Timer sleepers are balanced: each `sleep` increments + // `n_suspended` via `suspend_self`, and the timer-fire `wake` decrements + // it — so once every timer has fired, `n_suspended` counts only true + // orphans.) if self.n_suspended != 0 { print("sched: deadlock — {} fiber(s) suspended with an empty run queue\n", self.n_suspended); abort(); @@ -303,8 +382,59 @@ dequeue :: (self: *Scheduler) -> *Fiber { return f; } +// --- virtual-time timer set (linear min-scan, FIFO tiebreak) --------------- +// +// The timer set is a plain `List(Timer)` kept in INSERTION order. Fiber counts +// are tiny, so a linear scan for the minimum deadline is ideal — no heap to +// maintain — and "first entry at the minimum" naturally gives FIFO ordering for +// equal deadlines (the earlier-inserted timer is visited first, so it wins the +// tie). Removal shifts the tail down by one to preserve that insertion order for +// the remaining entries. + +// Index of the earliest-deadline pending timer, or -1 if none. 
On a deadline +// tie the lowest index (earliest inserted) wins → deterministic FIFO wake order. +earliest_timer :: (self: *Scheduler) -> i64 { + if self.timers.len == 0 { return -1; } + best := 0; + i := 1; + while i < self.timers.len { + // Strict `<` so equal deadlines do NOT displace the earlier (lower) + // index — that is the FIFO tiebreak. + if self.timers.items[i].deadline_ms < self.timers.items[best].deadline_ms { + best = i; + } + i = i + 1; + } + return best; +} + +// Remove the timer at `idx`, shifting every later entry down one slot so the +// remaining timers keep their insertion order (preserving the FIFO tiebreak). +remove_timer :: (self: *Scheduler, idx: i64) { + i := idx; + while i < self.timers.len - 1 { + self.timers.items[i] = self.timers.items[i + 1]; + i = i + 1; + } + self.timers.len = self.timers.len - 1; +} + +// Remove a pending sleep timer referencing fiber `f`, if any. A fiber has at +// most one pending timer in the M:1 model (it can only `sleep` once before +// suspending), so the first match is the only one. No-op if `f` has none. +cancel_timer_for :: (self: *Scheduler, f: *Fiber) { + i := 0; + while i < self.timers.len { + if self.timers.items[i].fiber == f { + remove_timer(self, i); + return; + } + i = i + 1; + } +} + // The public API lives as methods on `Scheduler` (above): `init`, `spawn`, -// `yield_now`, `suspend_self`, `wake`, `run`. +// `yield_now`, `suspend_self`, `wake`, `run`, `now_ms`, `sleep`. // --- B1.4a: truly-suspending fiber-task async (`go` / `wait` / `cancel`) ---- //