fibers: deterministic virtual-time timers (B1.4b)

Add a virtual clock + sleep timers to the M:1 scheduler so fibers
schedule in reproducible simulated time. Scheduler gains clock_ms (the
virtual clock, advances only as timers fire), a timers list, now_ms(),
sleep(ms) (arm {clock_ms+ms, current} + suspend), and a timer-driven
run (drain ready -> fire earliest timer -> advance clock -> wake ->
repeat; the orphan-suspend deadlock check is preserved for a genuine
no-timer park). Wakes fire in deadline order with a FIFO tiebreak.
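
The loop described above can be modeled in a few lines. This is only a rough Python sketch, not the committed code: generators stand in for fibers, and `SimScheduler`, `worker`, and `demo` are hypothetical names. Only `clock_ms`, `timers`, and the strict-`<` min-scan mirror the change itself.

```python
from collections import deque


class SimScheduler:
    """Toy model of the virtual-time loop; generators stand in for fibers."""

    def __init__(self):
        self.clock_ms = 0      # virtual clock, advances only when a timer fires
        self.ready = deque()   # FIFO ready queue
        self.timers = []       # (deadline_ms, fiber) kept in insertion order

    def spawn(self, fiber):
        self.ready.append(fiber)

    def earliest_timer(self):
        """Index of the minimum deadline, or -1; strict '<' keeps FIFO on ties."""
        if not self.timers:
            return -1
        best = 0
        for i in range(1, len(self.timers)):
            if self.timers[i][0] < self.timers[best][0]:
                best = i
        return best

    def run(self):
        while True:
            # Drain the ready queue. In this toy model a fiber "sleeps" by
            # yielding a millisecond count (an assumption of the sketch).
            while self.ready:
                fiber = self.ready.popleft()
                try:
                    ms = next(fiber)
                except StopIteration:
                    continue  # fiber finished
                self.timers.append((self.clock_ms + ms, fiber))
            # Ready queue drained: fire the earliest timer, if any.
            idx = self.earliest_timer()
            if idx < 0:
                break
            deadline, fiber = self.timers.pop(idx)  # pop keeps the remaining order
            self.clock_ms = deadline
            self.ready.append(fiber)


def worker(sched, name, ms, log):
    yield ms                              # sleep for `ms` virtual milliseconds
    log.append((name, sched.clock_ms))    # record the virtual wake time


def demo():
    sched = SimScheduler()
    log = []
    sched.spawn(worker(sched, "a", 30, log))
    sched.spawn(worker(sched, "b", 10, log))
    sched.spawn(worker(sched, "c", 10, log))
    sched.run()
    return log
```

Here `demo()` wakes `b` and `c` at virtual time 10 (in spawn order, the FIFO tiebreak) and `a` at 30, regardless of how long the host machine takes to run it.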

Adversarial review found a use-after-free: a fiber woken early (manual
or Task wake) before its sleep timer fired was reaped while its Timer
kept a dangling *Fiber, so a later fire dereferenced freed memory.
Fixed: wake evicts the fiber's pending timer (cancel_timer_for) -- every
re-ready path funnels through wake, so no stale timer outlives its fiber.
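
A minimal Python sketch of the eviction rule follows, with the caveat that Python cannot reproduce the use-after-free itself. It only shows that an early wake leaves no stale timer behind. `Fiber`, `Sched`, and `demo` are hypothetical stand-ins; `wake` and `cancel_timer_for` follow the names used in this commit.

```python
class Fiber:
    def __init__(self, name):
        self.name = name
        self.state = "ready"


class Sched:
    def __init__(self):
        self.clock_ms = 0
        self.timers = []   # (deadline_ms, fiber) in insertion order
        self.ready = []

    def sleep(self, fiber, ms):
        # Arm a timer and park the fiber (the context switch is omitted here).
        self.timers.append((self.clock_ms + ms, fiber))
        fiber.state = "suspended"

    def cancel_timer_for(self, fiber):
        # At most one pending timer per fiber, so the first match is the only one.
        for i, (_, f) in enumerate(self.timers):
            if f is fiber:
                del self.timers[i]
                return

    def wake(self, fiber):
        if fiber.state != "suspended":
            return                      # waking a non-parked fiber is a no-op
        self.cancel_timer_for(fiber)    # evict before re-readying
        fiber.state = "ready"
        self.ready.append(fiber)


def demo():
    s = Sched()
    f = Fiber("sleeper")
    s.sleep(f, 100)
    s.wake(f)   # early wake, before the timer fires
    s.wake(f)   # second wake is ignored: the fiber is no longer suspended
    return len(s.timers), f.state, [x.name for x in s.ready]
```

After the early wake, the timer list is empty, so a later timer-fire pass has nothing left that refers to the woken fiber.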

Examples: 1814 (sim-timer deadline ordering), 1815 (early-wake timer
eviction regression). Suite green 753/0.
agra
2026-06-21 19:09:22 +03:00
parent 02ab077bfb
commit 62ffea0663
13 changed files with 363 additions and 64 deletions


@@ -57,6 +57,15 @@ Fiber :: struct {
next: *Fiber; // intrusive FIFO ready-queue link
}
// A pending virtual-time timer: wake `fiber` once the virtual clock reaches
// `deadline_ms`. Stored in `Scheduler.timers` (a `List`) in insertion order, so
// a linear min-scan that takes the FIRST entry at the minimum deadline gives a
// stable FIFO tiebreak for equal deadlines.
Timer :: struct {
deadline_ms: i64;
fiber: *Fiber;
}
Scheduler :: struct {
sched_ctx: FiberCtx; // the scheduler loop's own saved context
current: *Fiber; // running fiber; null while in the scheduler loop
@@ -67,6 +76,15 @@ Scheduler :: struct {
n_spawned: i64;
n_suspended: i64; // fibers parked off-queue (suspend_self minus wake)
// --- B1.4b: deterministic virtual-time timer scheduling ----------------
clock_ms: i64; // the VIRTUAL clock (ms). Starts 0; advances ONLY
// when the ready queue drains and the earliest
// pending timer fires. No real wall clock is ever
// read — wake ORDER + timestamps are reproducible.
timers: List(Timer); // pending sleep timers, in insertion order. Grown
// through `own_allocator` (long-lived-container
// rule: a timer outlives the `sleep` call's scope).
// Construct a scheduler BY VALUE (allocator value-return convention).
// Captures the current `context.allocator` into `own_allocator` — fibers and
// their heap `Fiber` structs outlive their spawn scope, so all internal
@@ -81,6 +99,8 @@ Scheduler :: struct {
s.next_id = 0;
s.n_spawned = 0;
s.n_suspended = 0;
s.clock_ms = 0;
s.timers = .{};
return s;
}
@@ -162,38 +182,97 @@ Scheduler :: struct {
// a genuinely parked fiber may be re-enqueued; any other wake is a no-op.
wake :: (self: *Scheduler, f: *Fiber) {
if f.state != .suspended { return; }
// Evict any pending sleep timer for `f`. EVERY path that re-readies a
// suspended fiber funnels through `wake` (a manual/Task wake, or the
// timer-fire in `run` — which already removed the fired timer, so this
// is a harmless re-scan there). Without this, a fiber that armed a
// `sleep` timer but was woken EARLY by another path would run to
// completion and be reaped (stack munmap'd + Fiber freed) while its
// Timer still held a dangling `*Fiber` — a later fire would dereference
// freed memory (use-after-free). One timer per fiber max in the M:1
// model, so a single eviction suffices; it also prevents a stale timer
// from spuriously re-waking a since-re-slept fiber.
cancel_timer_for(self, f);
self.n_suspended = self.n_suspended - 1;
f.state = .ready;
enqueue(self, f);
}
// The scheduler loop. Runs until the ready queue drains. Each iteration:
// dequeue the next fiber, switch into it, and — on its switch back — reap it
// if done (munmap stack, free the Fiber), re-enqueue it if it yielded, or
// leave it parked if it suspended.
run :: (self: *Scheduler) {
while self.ready_head != null {
f := dequeue(self);
self.current = f;
f.state = .running;
swap_context(@self.sched_ctx, @f.ctx); // returns here when f yields / suspends / finishes
self.current = null;
if f.state == .done {
// We've switched OFF f's stack already (the final swap landed
// here), so the stack is free to unmap. Free the Fiber struct
// AFTER munmap.
munmap(f.stack_region, f.stack_len);
self.own_allocator.dealloc_bytes(xx f);
} else if f.state == .ready {
enqueue(self, f);
}
// .suspended: leave it parked (not in any queue; `wake` re-adds it).
// Read the VIRTUAL clock — the simulated millisecond time. Advances only as
// timers fire (in `run`), never from a real wall clock, so two runs of the
// same fiber program observe identical timestamps. A fiber that just woke
// from `sleep(ms)` sees `now_ms()` equal to its deadline.
now_ms :: (self: *Scheduler) -> i64 {
return self.clock_ms;
}
// Sleep the running fiber for `ms` simulated milliseconds: arm a timer at
// `clock_ms + ms`, then park off-queue. The scheduler advances the virtual
// clock to this deadline and wakes the fiber once the ready queue has fully
// drained AND no earlier timer is pending (deadline order, FIFO tiebreak).
// MUST be called from inside a fiber (there must be a `current` to park);
// a null `current` bails loudly, mirroring `suspend_self`.
//
// Virtual time only moves forward provided `ms >= 0` (the caller's
// responsibility; it is not checked here): the deadline is then
// `>= clock_ms`, so a fired timer never rewinds the clock.
sleep :: (self: *Scheduler, ms: i64) {
cur := self.current;
if cur == null {
print("sched: sleep() called outside a fiber (no running fiber)\n");
abort();
}
// The queue drained. If any fiber is still parked, nothing will ever
// wake it — its stack + struct are leaked and the program believes it
// finished. That is a deadlock; surface it loudly rather than returning
// a silent success. (FiberIo, which uses suspend/wake, must balance
// every suspend with a wake before the queue empties.)
t : Timer = .{ deadline_ms = self.clock_ms + ms, fiber = cur };
// Long-lived-container rule: a timer outlives this `sleep` call's scope
// (it survives in `self.timers` until the scheduler fires it), so grow
// through the captured `own_allocator`, never the transient current one.
self.timers.append(t, self.own_allocator);
self.suspend_self(); // parks `cur` off-queue; the timer fire re-wakes it
}
// The scheduler loop. Drives ready fibers to quiescence, then advances the
// virtual clock by firing the earliest pending timer (which re-readies its
// sleeper), and repeats — until both the ready queue and the timer set are
// empty. Within the inner drain each iteration: dequeue the next fiber,
// switch into it, and — on its switch back — reap it if done (munmap stack,
// free the Fiber), re-enqueue it if it yielded, or leave it parked if it
// suspended.
run :: (self: *Scheduler) {
while true {
while self.ready_head != null {
f := dequeue(self);
self.current = f;
f.state = .running;
swap_context(@self.sched_ctx, @f.ctx); // returns here when f yields / suspends / finishes
self.current = null;
if f.state == .done {
// We've switched OFF f's stack already (the final swap landed
// here), so the stack is free to unmap. Free the Fiber struct
// AFTER munmap.
munmap(f.stack_region, f.stack_len);
self.own_allocator.dealloc_bytes(xx f);
} else if f.state == .ready {
enqueue(self, f);
}
// .suspended: leave it parked (not in any queue; `wake` re-adds it).
}
// Ready queue drained. Fire the earliest pending timer — the one
// sleeper whose deadline is next — advancing the virtual clock to it.
// No timers left ⇒ nothing more can run; exit the loop.
idx := earliest_timer(self);
if idx < 0 { break; }
t := self.timers.items[idx];
remove_timer(self, idx);
self.clock_ms = t.deadline_ms; // advance VIRTUAL time forward
self.wake(t.fiber); // re-enqueue the sleeper → drain again
}
// Both the ready queue and the timer set are empty. If a fiber is STILL
// parked, no timer will ever wake it (a `suspend_self` without an armed
// timer, never externally woken) — its stack + struct are leaked and the
// program believes it finished. That is a genuine deadlock; surface it
// loudly. (Timer sleepers are balanced: each `sleep` increments
// `n_suspended` via `suspend_self`, and the timer-fire `wake` decrements
// it — so once every timer has fired, `n_suspended` counts only true
// orphans.)
if self.n_suspended != 0 {
print("sched: deadlock — {} fiber(s) suspended with an empty run queue\n", self.n_suspended);
abort();
@@ -303,8 +382,59 @@ dequeue :: (self: *Scheduler) -> *Fiber {
return f;
}
// --- virtual-time timer set (linear min-scan, FIFO tiebreak) ---------------
//
// The timer set is a plain `List(Timer)` kept in INSERTION order. Fiber counts
// are tiny, so an O(n) linear scan for the minimum deadline is cheap enough (no
// heap to maintain), and "first entry at the minimum" naturally gives FIFO
// ordering for equal deadlines (the earlier-inserted timer is visited first, so
// it wins the tie). Removal shifts the tail down by one to preserve that
// insertion order for the remaining entries.
// Index of the earliest-deadline pending timer, or -1 if none. On a deadline
// tie the lowest index (earliest inserted) wins → deterministic FIFO wake order.
earliest_timer :: (self: *Scheduler) -> i64 {
if self.timers.len == 0 { return -1; }
best := 0;
i := 1;
while i < self.timers.len {
// Strict `<` so equal deadlines do NOT displace the earlier (lower)
// index — that is the FIFO tiebreak.
if self.timers.items[i].deadline_ms < self.timers.items[best].deadline_ms {
best = i;
}
i = i + 1;
}
return best;
}
// Remove the timer at `idx`, shifting every later entry down one slot so the
// remaining timers keep their insertion order (preserving the FIFO tiebreak).
remove_timer :: (self: *Scheduler, idx: i64) {
i := idx;
while i < self.timers.len - 1 {
self.timers.items[i] = self.timers.items[i + 1];
i = i + 1;
}
self.timers.len = self.timers.len - 1;
}
// Remove a pending sleep timer referencing fiber `f`, if any. A fiber has at
// most one pending timer in the M:1 model (it can only `sleep` once before
// suspending), so the first match is the only one. No-op if `f` has none.
cancel_timer_for :: (self: *Scheduler, f: *Fiber) {
i := 0;
while i < self.timers.len {
if self.timers.items[i].fiber == f {
remove_timer(self, i);
return;
}
i = i + 1;
}
}
// The public API lives as methods on `Scheduler` (above): `init`, `spawn`,
// `yield_now`, `suspend_self`, `wake`, `run`.
// `yield_now`, `suspend_self`, `wake`, `run`, `now_ms`, `sleep`.
// --- B1.4a: truly-suspending fiber-task async (`go` / `wait` / `cancel`) ----
//