fibers: deterministic virtual-time timers (B1.4b)

Add a virtual clock + sleep timers to the M:1 scheduler so fibers
schedule in reproducible simulated time. Scheduler gains clock_ms (the
virtual clock, advances only as timers fire), a timers list, now_ms(),
sleep(ms) (arm {clock_ms+ms, current} + suspend), and a timer-driven
run (drain ready -> fire earliest timer -> advance clock -> wake ->
repeat; the orphan-suspend deadlock check is preserved for a genuine
no-timer park). Wakes fire in deadline order with a FIFO tiebreak.

Adversarial review found a use-after-free: a fiber woken early (manual
or Task wake) before its sleep timer fired was reaped while its Timer
kept a dangling *Fiber, so a later fire dereferenced freed memory.
Fixed: wake evicts the fiber's pending timer (cancel_timer_for) -- every
re-ready path funnels through wake, so no stale timer outlives its fiber.

Examples: 1814 (sim-timer deadline ordering), 1815 (early-wake timer
eviction regression). Suite green 753/0.
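
The drain -> fire earliest timer -> advance clock -> wake cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not the committed code: generators stand in for fibers, and the names SimScheduler, spawn and log are assumptions made for the sketch. Only clock_ms, the timers list, earliest_timer and the strict `<` tiebreak mirror the commit.

```python
from collections import deque


class SimScheduler:
    """Toy virtual-time scheduler; generators stand in for fibers."""

    def __init__(self):
        self.clock_ms = 0     # virtual clock, moves only when a timer fires
        self.ready = deque()  # FIFO ready queue
        self.timers = []      # (deadline_ms, fiber) kept in insertion order
        self.log = []         # (clock_ms, name) recorded after each wake-up

    def spawn(self, name, delays):
        self.ready.append(self._fiber(name, delays))

    def _fiber(self, name, delays):
        for ms in delays:
            yield ms  # request: sleep(ms)
            self.log.append((self.clock_ms, name))  # resumed by a timer fire

    def earliest_timer(self):
        if not self.timers:
            return -1
        best = 0
        for i in range(1, len(self.timers)):
            # Strict < keeps the earlier-inserted timer on a tie (FIFO).
            if self.timers[i][0] < self.timers[best][0]:
                best = i
        return best

    def run(self):
        while True:
            while self.ready:  # drain every ready fiber first
                fiber = self.ready.popleft()
                try:
                    ms = next(fiber)
                except StopIteration:
                    continue  # fiber finished
                self.timers.append((self.clock_ms + ms, fiber))
            idx = self.earliest_timer()
            if idx < 0:
                break  # no ready fibers and no timers left
            deadline, fiber = self.timers.pop(idx)  # the rest keep their order
            self.clock_ms = deadline  # advance virtual time
            self.ready.append(fiber)  # wake the sleeper, then drain again


sched = SimScheduler()
sched.spawn("a", [30])
sched.spawn("b", [10, 10])
sched.spawn("c", [10])
sched.run()
print(sched.log)  # [(10, 'b'), (10, 'c'), (20, 'b'), (30, 'a')]
```

Fibers b and c share the deadline 10; b armed its timer first, so it wakes first. No wall clock is read, so the log is identical on every run.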
@@ -57,6 +57,15 @@ Fiber :: struct {
     next: *Fiber; // intrusive FIFO ready-queue link
 }

+// A pending virtual-time timer: wake `fiber` once the virtual clock reaches
+// `deadline_ms`. Stored in `Scheduler.timers` (a `List`) in insertion order, so
+// a linear min-scan that takes the FIRST entry at the minimum deadline gives a
+// stable FIFO tiebreak for equal deadlines.
+Timer :: struct {
+    deadline_ms: i64;
+    fiber: *Fiber;
+}
+
 Scheduler :: struct {
     sched_ctx: FiberCtx; // the scheduler loop's own saved context
     current: *Fiber; // running fiber; null while in the scheduler loop
@@ -67,6 +76,15 @@ Scheduler :: struct {
     n_spawned: i64;
     n_suspended: i64; // fibers parked off-queue (suspend_self minus wake)

+    // --- B1.4b: deterministic virtual-time timer scheduling ----------------
+    clock_ms: i64;       // the VIRTUAL clock (ms). Starts 0; advances ONLY
+                         // when the ready queue drains and the earliest
+                         // pending timer fires. No real wall clock is ever
+                         // read — wake ORDER + timestamps are reproducible.
+    timers: List(Timer); // pending sleep timers, in insertion order. Grown
+                         // through `own_allocator` (long-lived-container
+                         // rule: a timer outlives the `sleep` call's scope).
+
     // Construct a scheduler BY VALUE (allocator value-return convention).
     // Captures the current `context.allocator` into `own_allocator` — fibers and
     // their heap `Fiber` structs outlive their spawn scope, so all internal
@@ -81,6 +99,8 @@ Scheduler :: struct {
         s.next_id = 0;
         s.n_spawned = 0;
         s.n_suspended = 0;
+        s.clock_ms = 0;
+        s.timers = .{};
         return s;
     }

@@ -162,38 +182,97 @@ Scheduler :: struct {
     // a genuinely parked fiber may be re-enqueued; any other wake is a no-op.
     wake :: (self: *Scheduler, f: *Fiber) {
         if f.state != .suspended { return; }
+        // Evict any pending sleep timer for `f`. EVERY path that re-readies a
+        // suspended fiber funnels through `wake` (a manual/Task wake, or the
+        // timer-fire in `run` — which already removed the fired timer, so this
+        // is a harmless re-scan there). Without this, a fiber that armed a
+        // `sleep` timer but was woken EARLY by another path would run to
+        // completion and be reaped (stack munmap'd + Fiber freed) while its
+        // Timer still held a dangling `*Fiber` — a later fire would dereference
+        // freed memory (use-after-free). One timer per fiber max in the M:1
+        // model, so a single eviction suffices; it also prevents a stale timer
+        // from spuriously re-waking a since-re-slept fiber.
+        cancel_timer_for(self, f);
         self.n_suspended = self.n_suspended - 1;
         f.state = .ready;
         enqueue(self, f);
     }

-    // The scheduler loop. Runs until the ready queue drains. Each iteration:
-    // dequeue the next fiber, switch into it, and — on its switch back — reap it
-    // if done (munmap stack, free the Fiber), re-enqueue it if it yielded, or
-    // leave it parked if it suspended.
-    run :: (self: *Scheduler) {
-        while self.ready_head != null {
-            f := dequeue(self);
-            self.current = f;
-            f.state = .running;
-            swap_context(@self.sched_ctx, @f.ctx); // returns here when f yields / suspends / finishes
-            self.current = null;
-            if f.state == .done {
-                // We've switched OFF f's stack already (the final swap landed
-                // here), so the stack is free to unmap. Free the Fiber struct
-                // AFTER munmap.
-                munmap(f.stack_region, f.stack_len);
-                self.own_allocator.dealloc_bytes(xx f);
-            } else if f.state == .ready {
-                enqueue(self, f);
-            }
-            // .suspended: leave it parked (not in any queue; `wake` re-adds it).
+    // Read the VIRTUAL clock — the simulated millisecond time. Advances only as
+    // timers fire (in `run`), never from a real wall clock, so two runs of the
+    // same fiber program observe identical timestamps. A fiber that just woke
+    // from `sleep(ms)` sees `now_ms()` equal to its deadline.
+    now_ms :: (self: *Scheduler) -> i64 {
+        return self.clock_ms;
+    }
+
+    // Sleep the running fiber for `ms` simulated milliseconds: arm a timer at
+    // `clock_ms + ms`, then park off-queue. The scheduler advances the virtual
+    // clock to this deadline and wakes the fiber once the ready queue has fully
+    // drained AND no earlier timer is pending (deadline order, FIFO tiebreak).
+    // MUST be called from inside a fiber (there must be a `current` to park);
+    // a null `current` bails loudly, mirroring `suspend_self`.
+    //
+    // Virtual time only moves forward: `ms >= 0` makes the deadline
+    // `>= clock_ms`, so a fired timer never rewinds the clock.
+    sleep :: (self: *Scheduler, ms: i64) {
+        cur := self.current;
+        if cur == null {
+            print("sched: sleep() called outside a fiber (no running fiber)\n");
+            abort();
         }
-        // The queue drained. If any fiber is still parked, nothing will ever
-        // wake it — its stack + struct are leaked and the program believes it
-        // finished. That is a deadlock; surface it loudly rather than returning
-        // a silent success. (FiberIo, which uses suspend/wake, must balance
-        // every suspend with a wake before the queue empties.)
+        t : Timer = .{ deadline_ms = self.clock_ms + ms, fiber = cur };
+        // Long-lived-container rule: a timer outlives this `sleep` call's scope
+        // (it survives in `self.timers` until the scheduler fires it), so grow
+        // through the captured `own_allocator`, never the transient current one.
+        self.timers.append(t, self.own_allocator);
+        self.suspend_self(); // parks `cur` off-queue; the timer fire re-wakes it
+    }
+
+    // The scheduler loop. Drives ready fibers to quiescence, then advances the
+    // virtual clock by firing the earliest pending timer (which re-readies its
+    // sleeper), and repeats — until both the ready queue and the timer set are
+    // empty. Within the inner drain each iteration: dequeue the next fiber,
+    // switch into it, and — on its switch back — reap it if done (munmap stack,
+    // free the Fiber), re-enqueue it if it yielded, or leave it parked if it
+    // suspended.
+    run :: (self: *Scheduler) {
+        while true {
+            while self.ready_head != null {
+                f := dequeue(self);
+                self.current = f;
+                f.state = .running;
+                swap_context(@self.sched_ctx, @f.ctx); // returns here when f yields / suspends / finishes
+                self.current = null;
+                if f.state == .done {
+                    // We've switched OFF f's stack already (the final swap landed
+                    // here), so the stack is free to unmap. Free the Fiber struct
+                    // AFTER munmap.
+                    munmap(f.stack_region, f.stack_len);
+                    self.own_allocator.dealloc_bytes(xx f);
+                } else if f.state == .ready {
+                    enqueue(self, f);
+                }
+                // .suspended: leave it parked (not in any queue; `wake` re-adds it).
+            }
+            // Ready queue drained. Fire the earliest pending timer — the one
+            // sleeper whose deadline is next — advancing the virtual clock to it.
+            // No timers left ⇒ nothing more can run; exit the loop.
+            idx := earliest_timer(self);
+            if idx < 0 { break; }
+            t := self.timers.items[idx];
+            remove_timer(self, idx);
+            self.clock_ms = t.deadline_ms; // advance VIRTUAL time forward
+            self.wake(t.fiber); // re-enqueue the sleeper → drain again
+        }
+        // Both the ready queue and the timer set are empty. If a fiber is STILL
+        // parked, no timer will ever wake it (a `suspend_self` without an armed
+        // timer, never externally woken) — its stack + struct are leaked and the
+        // program believes it finished. That is a genuine deadlock; surface it
+        // loudly. (Timer sleepers are balanced: each `sleep` increments
+        // `n_suspended` via `suspend_self`, and the timer-fire `wake` decrements
+        // it — so once every timer has fired, `n_suspended` counts only true
+        // orphans.)
         if self.n_suspended != 0 {
             print("sched: deadlock — {} fiber(s) suspended with an empty run queue\n", self.n_suspended);
             abort();
@@ -303,8 +382,59 @@ dequeue :: (self: *Scheduler) -> *Fiber {
     return f;
 }

+// --- virtual-time timer set (linear min-scan, FIFO tiebreak) ---------------
+//
+// The timer set is a plain `List(Timer)` kept in INSERTION order. Fiber counts
+// are tiny, so a linear scan for the minimum deadline is ideal — no heap to
+// maintain — and "first entry at the minimum" naturally gives FIFO ordering for
+// equal deadlines (the earlier-inserted timer is visited first, so it wins the
+// tie). Removal shifts the tail down by one to preserve that insertion order for
+// the remaining entries.
+
+// Index of the earliest-deadline pending timer, or -1 if none. On a deadline
+// tie the lowest index (earliest inserted) wins → deterministic FIFO wake order.
+earliest_timer :: (self: *Scheduler) -> i64 {
+    if self.timers.len == 0 { return -1; }
+    best := 0;
+    i := 1;
+    while i < self.timers.len {
+        // Strict `<` so equal deadlines do NOT displace the earlier (lower)
+        // index — that is the FIFO tiebreak.
+        if self.timers.items[i].deadline_ms < self.timers.items[best].deadline_ms {
+            best = i;
+        }
+        i = i + 1;
+    }
+    return best;
+}
+
+// Remove the timer at `idx`, shifting every later entry down one slot so the
+// remaining timers keep their insertion order (preserving the FIFO tiebreak).
+remove_timer :: (self: *Scheduler, idx: i64) {
+    i := idx;
+    while i < self.timers.len - 1 {
+        self.timers.items[i] = self.timers.items[i + 1];
+        i = i + 1;
+    }
+    self.timers.len = self.timers.len - 1;
+}
+
+// Remove a pending sleep timer referencing fiber `f`, if any. A fiber has at
+// most one pending timer in the M:1 model (it can only `sleep` once before
+// suspending), so the first match is the only one. No-op if `f` has none.
+cancel_timer_for :: (self: *Scheduler, f: *Fiber) {
+    i := 0;
+    while i < self.timers.len {
+        if self.timers.items[i].fiber == f {
+            remove_timer(self, i);
+            return;
+        }
+        i = i + 1;
+    }
+}
+
 // The public API lives as methods on `Scheduler` (above): `init`, `spawn`,
-// `yield_now`, `suspend_self`, `wake`, `run`.
+// `yield_now`, `suspend_self`, `wake`, `run`, `now_ms`, `sleep`.

 // --- B1.4a: truly-suspending fiber-task async (`go` / `wait` / `cancel`) ----
 //
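
The early-wake hazard that cancel_timer_for closes can be modelled in Python. This is a hedged sketch, not the regression example 1815: ToyScheduler, the evict_on_wake switch, the freed flag (standing in for the real munmap + dealloc) and the dangling helper are all assumptions introduced for illustration.

```python
class Fiber:
    def __init__(self, name):
        self.name = name
        self.state = "suspended"
        self.freed = False  # stands in for munmap + dealloc in the real scheduler


class ToyScheduler:
    def __init__(self, evict_on_wake):
        self.evict_on_wake = evict_on_wake  # hypothetical switch: fix on / off
        self.timers = []  # (deadline_ms, fiber) in insertion order

    def sleep(self, fiber, deadline_ms):
        fiber.state = "suspended"
        self.timers.append((deadline_ms, fiber))

    def cancel_timer_for(self, fiber):
        for i, (_, f) in enumerate(self.timers):
            if f is fiber:
                del self.timers[i]  # at most one timer per fiber
                return

    def wake(self, fiber):
        if fiber.state != "suspended":
            return
        if self.evict_on_wake:
            self.cancel_timer_for(fiber)
        fiber.state = "ready"

    def reap(self, fiber):
        fiber.state = "done"
        fiber.freed = True

    def dangling(self):
        # Timers that still point at a reaped fiber.
        return [f.name for _, f in self.timers if f.freed]


def early_wake_scenario(evict_on_wake):
    sched = ToyScheduler(evict_on_wake)
    sleeper = Fiber("sleeper")
    sched.sleep(sleeper, deadline_ms=100)
    sched.wake(sleeper)  # early wake, before the timer fires
    sched.reap(sleeper)  # the fiber runs to completion and is reaped
    return sched.dangling()


print(early_wake_scenario(evict_on_wake=False))  # ['sleeper']
print(early_wake_scenario(evict_on_wake=True))   # []
```

With eviction off, the timer list still references the reaped fiber, which is the dangling *Fiber the review found. With eviction on, the timer is removed at the moment of the wake.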