feat: true cancellation for the fiber Io layer (PLAN-IO-UNIFY Phase 3)

A cancelled async worker now abandons its body at its next suspend instead
of running to completion.
- Cancel-flag back-ref (D4): SpawnOpts.cancel_flag (core.sx) + Fiber.cancel_flag
(sched.sx), set from opts.cancel_flag in Scheduler.spawn_raw; async passes
xx @f.canceled (the Future.canceled Atomic(bool) erased to *void).
- Delivery: Scheduler.suspend_raw consults fiber_canceled(self.current) PRE-park
(raise without parking — no deadlock if cancel landed before the worker ran)
and POST-resume (cancel landed while parked), raising error.Canceled.
cancel(f) flips the sticky flag, marks .canceled, and wakes the worker.
- async worker is failable Closure() -> ($R, !); the completion closure
f.value = worker() catch {…} marks .canceled/.failed and wakes the awaiter,
so post-suspend side effects never run. New failable io.sleep(ms) is the
cancellation point.
- Compiler: a -> ! fn whose only error source is try-ing a protocol method
(io.suspend_raw) was wrongly flagged 'declared ! but never errors';
collectErrorSites now marks a try of a non-identifier callee as a dynamic
(opaque) error source, suppressing the warning.
- Two UAFs found by adversarial review and fixed: (1) cancel-before-park
orphaned io.sleep's armed timer — suspend_raw's pre-park raise now evicts the
current fiber's timer/waiter first; (2) cancel(f) could wake a reaped worker —
now only wakes when was_pending.
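
The delivery rules in the list above (pre-park check, post-resume check, timer eviction, and the `was_pending` wake guard) can be pictured with a small Python model. This is only a sketch, not the project's sx code: Python generators stand in for fibers, and `Scheduler`, `Future`, `worker`, `run`, and `demo` are illustrative names invented for this example rather than the repository's API.

```python
class Canceled(Exception):
    """Delivered to a worker at a suspend point once its task is cancelled."""


class Future:
    def __init__(self):
        self.state = "pending"   # pending | ready | canceled
        self.canceled = False    # sticky cancel flag
        self.value = None
        self.log = []            # side effects recorded by the worker (illustration only)


class Scheduler:
    def __init__(self):
        self.timers = {}         # Future -> delay, armed by sleep()
        self.parked = {}         # Future -> parked worker generator

    def spawn(self, fut, body):
        # Queue a worker without running it (a generator starts lazily).
        return body(self, fut)

    def suspend(self, fut):
        if fut.canceled:                 # PRE-park: do not park, evict the timer
            self.timers.pop(fut, None)
            raise Canceled()
        yield                            # park until woken
        if fut.canceled:                 # POST-resume: cancel landed while parked
            raise Canceled()

    def sleep(self, fut, ms):
        self.timers[fut] = ms            # arm the timer, then park
        yield from self.suspend(fut)

    def run(self, fut, gen):
        # Resume the worker until it parks again or finishes, then publish.
        self.parked.pop(fut, None)
        try:
            next(gen)
            self.parked[fut] = gen
        except StopIteration as done:
            fut.value, fut.state = done.value, "ready"
        except Canceled:
            fut.state = "canceled"

    def cancel(self, fut):
        was_pending = fut.state == "pending"
        fut.canceled = True
        fut.state = "canceled"
        gen = self.parked.get(fut)
        if was_pending and gen is not None:   # wake only a live, parked worker
            self.timers.pop(fut, None)
            self.run(fut, gen)


def worker(sched, fut):
    fut.log.append("before sleep")
    yield from sched.sleep(fut, 10)
    fut.log.append("after sleep")        # must not run once cancelled
    return 42


def demo():
    sched = Scheduler()
    a = Future()                         # case 1: cancel while parked in sleep()
    ga = sched.spawn(a, worker)
    sched.run(a, ga)
    sched.cancel(a)
    b = Future()                         # case 2: cancel before the worker ever ran
    gb = sched.spawn(b, worker)
    sched.cancel(b)
    sched.run(b, gb)
    c = Future()                         # case 3: no cancel, the timer fires
    g3 = sched.spawn(c, worker)
    sched.run(c, g3)
    sched.timers.pop(c)
    sched.run(c, g3)
    return sched, a, b, c
```

In this model both cancelled workers stop at the sleep, so "after sleep" is never logged, and case 2 leaves no armed timer behind. Cancelling the already finished future `c` sets the sticky flag without resuming anything, which mirrors the intent of the `was_pending` guard.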
Migrated 1805/1806/1824 to failable workers. Lock: example 1825 (seq: 1 -99,
post-suspend line never runs); byte-identical on aarch64-macOS + aarch64-linux.
.ir churn is the SpawnOpts layout change (type-table string renumbering).
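
The compiler change described in the list above can be illustrated with a toy model in Python. It is a sketch under stated assumptions, not the compiler's actual code: the node classes (`Ident`, `Member`, `Try`, `Raise`), the `failable_fns` set, and the handling of identifier callees are all invented for illustration. Only the rule itself comes from the commit message: a `try` whose callee is not a plain identifier counts as a dynamic (opaque) error source.

```python
from dataclasses import dataclass


@dataclass
class Ident:            # a plain function name, e.g. `parse`
    name: str


@dataclass
class Member:           # a method reached through a value, e.g. `io.suspend_raw`
    base: str
    field: str


@dataclass
class Try:              # `try <callee>(...)`
    callee: object


@dataclass
class Raise:            # `raise error.<name>`
    error: str


def collect_error_sites(body, failable_fns):
    """Return the places in `body` that can produce an error."""
    sites = []
    for stmt in body:
        if isinstance(stmt, Raise):
            sites.append(("raise", stmt.error))
        elif isinstance(stmt, Try):
            if isinstance(stmt.callee, Ident):
                # Assumption: a named callee counts only if it is known to be failable.
                if stmt.callee.name in failable_fns:
                    sites.append(("call", stmt.callee.name))
            else:
                # Non-identifier callee: cannot be resolved statically, so
                # treat it as a dynamic (opaque) error source.
                sites.append(("dynamic", f"{stmt.callee.base}.{stmt.callee.field}"))
    return sites


def warns_never_errors(body, failable_fns):
    """True when a function declared `!` would get the 'never errors' warning."""
    return not collect_error_sites(body, failable_fns)
```

With this rule, a body consisting only of `Try(Member("io", "suspend_raw"))` no longer triggers the warning, while an empty body still does.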
@@ -102,6 +102,14 @@ PinTarget :: enum { any; main; on_thread; }
 
 SpawnOpts :: struct {
     pin: PinTarget = .any;
+    // Cancellation back-ref (Phase 3 — true cancellation). A pointer to the
+    // spawned task's cancel flag (the `Future.canceled` `Atomic(bool)`, erased to
+    // `*void` so this foundation type stays free of the atomic dependency). A
+    // suspending impl stashes it on the spawned execution context so its
+    // `suspend_raw` can consult it and raise `IoErr.Canceled` on resume. `null`
+    // (the default) means "no cancellation channel" — the blocking impl and any
+    // uncancellable spawn leave it unset.
+    cancel_flag: *void = null;
 }
 
 ParkToken :: struct {
@@ -145,7 +145,7 @@ sx_run_boxed_closure :: (arg: *void) {
 // long-lived allocator) is safe. A deeper fix — `async` capturing the scheduler's
 // own long-lived allocator the way `sched.go` does — needs a protocol affordance
 // to reach it and is deferred to the convergence phase.
-async :: ufcs (io: Io, worker: Closure() -> $R) -> *Future($R) {
+async :: ufcs (io: Io, worker: Closure() -> ($R, !)) -> *Future($R) {
     raw := context.allocator.alloc_bytes(size_of(Future($R)));
     f : *Future($R) = xx raw;
     f.state = .pending;
@@ -154,14 +154,31 @@ async :: ufcs (io: Io, worker: Closure() -> $R) -> *Future($R) {
     // The completion closure: run the worker, publish the result, wake any parked
     // awaiter. Heap-boxed so it survives until the worker actually runs (deferred
     // under the fiber impl). It captures `f` + `worker`; nothing variadic crosses.
+    //
+    // Phase 3 (true cancellation): the worker is FAILABLE (`Closure() -> ($R, !)`).
+    // A suspend that delivers cancellation (`suspend_raw` raising `Canceled` on a
+    // cancelled worker), or any genuine `raise`, unwinds the worker's body right
+    // here — so its post-suspend side effects never run. On success publish the
+    // value and mark `.ready`; on error mark `.canceled` when `cancel` set the
+    // flag, else `.failed`. Either way wake any parked awaiter. Under `CBlockingIo`
+    // `suspend_raw` is a no-op, so the worker never raises Canceled inline — it
+    // runs to completion (a post-hoc `cancel` still makes `await` raise via the
+    // sticky `f.canceled`, the 1806 contract).
     braw := context.allocator.alloc_bytes(size_of(ThunkBox));
     b : *ThunkBox = xx braw;
     b.run = () => {
-        f.value = worker();
+        f.value = worker() catch {
+            if f.canceled.load(.acquire) { f.state = .canceled; }
+            else { f.state = .failed; }
+            context.io.ready(f.park);
+            return;
+        };
         f.state = .ready;
         context.io.ready(f.park); // no-op if no awaiter parked yet
     };
-    f.task = io.spawn_raw(xx sx_run_boxed_closure, xx b, .{});
+    // Pass the cancel-flag back-ref so the worker fiber's `suspend_raw` can consult
+    // it (Phase 3). `xx @f.canceled` erases the `*Atomic(bool)` to `*void`.
+    f.task = io.spawn_raw(xx sx_run_boxed_closure, xx b, .{ cancel_flag = xx @f.canceled });
     return f;
 }
 
@@ -194,15 +211,45 @@ await :: ufcs (f: *Future($R)) -> ($R, !IoErr) {
     return f.value;
 }
 
-// `cancel(f)` — request cancellation. Sets the per-future cancel flag + marks the
-// state so a subsequent `await` raises `.Canceled` (model (a) — cancel rides the
-// `!` channel). DOES NOT STOP AN ALREADY-SPAWNED WORKER: under the fiber impl the
-// worker fiber is already queued, so `run()` still executes it to completion (its
-// side effects happen; it flips `.canceled -> .ready`). The sticky `canceled`
-// atomic is the source of truth — subsequent awaits keep raising regardless of
-// the state field. True work-cancellation (the worker's next suspend raising
-// `Canceled` so it abandons its body) is Phase 3.
+// `cancel(f)` — request cancellation (model (a) — cancel rides the `!` channel).
+// Sets the sticky per-future cancel flag + marks `.canceled` (so a subsequent
+// `await` raises `.Canceled`), then WAKES the worker fiber so it delivers the
+// cancellation at its current/next suspend.
+//
+// Phase 3 (TRUE cancellation): `ready(.{ handle = f.task })` re-readies the worker
+// fiber parked under the fiber impl. On resume its `suspend_raw` sees the flag and
+// raises `Canceled`, so the worker ABANDONS its body — post-suspend side effects
+// never run. The sticky `canceled` atomic is the source of truth (`await` keeps
+// raising regardless of the state field). `wake` is guarded on `.suspended`, so a
+// `ready` of a not-yet-parked worker is a safe no-op (its first `suspend_raw`'s
+// pre-park check then delivers the cancel without parking). Under `CBlockingIo`
+// `f.task` is null and `ready` is a no-op — the worker already ran inline, and the
+// sticky flag still makes `await` raise (the 1806 contract, unchanged).
 cancel :: ufcs (f: *Future($R)) {
+    // Wake the worker fiber ONLY while the task is still in flight (`.pending`).
+    // Once it has completed (`.ready`/`.failed`) or was already cancelled, its
+    // fiber may have been REAPED (the run loop `munmap`s + frees a `.done`
+    // fiber), so `f.task` would dangle — `ready` on it is a use-after-free. The
+    // sticky `canceled` flag still makes a subsequent `await` raise in those
+    // cases (the 1806 model-(a) contract), so no wake is needed there. A
+    // not-yet-run worker is `.pending` with a live (queued) fiber; `ready` is a
+    // safe no-op on it (its first `suspend_raw` pre-park check then delivers).
+    was_pending := f.state == .pending;
     f.canceled.store(true, .release);
     f.state = .canceled;
+    if was_pending { context.io.ready(.{ handle = f.task }); }
 }
+
+// `sleep(io, ms)` — a FAILABLE suspend for `ms` virtual milliseconds. Arms a
+// timer at `now_ms() + ms` and parks via `suspend_raw`; the fired timer
+// re-readies the fiber, and on resume `suspend_raw` raises `Canceled` if the task
+// was cancelled while sleeping (Phase 3). So `try io.sleep(..)` inside an `async`
+// worker is a cancellation point: a `cancel` lands the worker's body unwinding
+// here instead of running past the sleep. No-op under `CBlockingIo` (its
+// `arm_timer`/`suspend_raw` are stubs — the blocking model has no scheduler to
+// advance a virtual clock).
+sleep :: ufcs (io: Io, ms: i64) -> ! {
+    pk : ParkToken = .{ handle = null };
+    io.arm_timer(io.now_ms() + ms, pk);
+    try io.suspend_raw(@pk);
+}
@@ -26,6 +26,11 @@
 // epoll on linux). Runs end-to-end on a matching aarch64 host, ir-only on an
 // arch mismatch.
 #import "modules/std.sx";
+// Phase 3 (true cancellation): `suspend_raw` reads the spawned fiber's cancel
+// flag — a `*Atomic(bool)` back-ref to the `Future.canceled` — to decide whether
+// to raise `error.Canceled` on resume. atomic.sx has no naked asm, so it is safe
+// to pull in here (no fib_tramp re-emit).
+#import "modules/std/atomic.sx";
 // `race` synthesizes its result type (a tagged-union mirroring the input tuple's
 // labels) and constructs the winner variant by runtime index — both need the
 // metatype WRITE side (`make_enum`/`make_variant`/`EnumVariant`/`field_type`).
@@ -98,6 +103,13 @@ Fiber :: struct {
     // not the blocking default. Behavior-preserving for fibers spawned under the
     // default context (the capture just re-pushes that same default).
     dctx: Context;
+    // Phase 3 (true cancellation): back-ref to this fiber's task cancel flag
+    // (the `Future.canceled` `Atomic(bool)`, erased to `*void`), set from
+    // `SpawnOpts.cancel_flag` at `spawn_raw`. `suspend_raw` consults it and
+    // raises `IoErr.Canceled` when set, so a cancelled worker abandons its body
+    // at its next suspend. `null` for a fiber spawned with no cancel channel
+    // (a bare `spawn`, or `spawn_raw` with `cancel_flag = null`).
+    cancel_flag: *void = null;
 }
 
 // A pending virtual-time timer: wake `fiber` once the virtual clock reaches
@@ -630,6 +642,9 @@ impl Io for Scheduler {
             entry_fn : (*void) -> void = xx entry;
             entry_fn(arg);
         });
+        // Stash the cancel-flag back-ref (Phase 3): `suspend_raw` consults it on
+        // this fiber to raise `error.Canceled`. `null` for an uncancellable spawn.
+        f.cancel_flag = opts.cancel_flag;
         return xx f;
     }
 
@@ -638,10 +653,42 @@ impl Io for Scheduler {
     // out here when the parked task was cancelled (wired in Phase 3). The M:1
     // impl does not raise yet — it just parks the current fiber.
     suspend_raw :: (self: *Scheduler, park: *ParkToken) -> ! {
+        // Phase 3 (true cancellation), PRE-PARK check: if this fiber's task was
+        // already cancelled before it reached this suspend, deliver immediately —
+        // do NOT park (a park with no pending waker would deadlock; the cancel's
+        // `ready` already fired as a no-op against a not-yet-suspended fiber).
+        if self.fiber_canceled(self.current) {
+            // A timer / fd-waiter armed by a higher-level suspend (e.g.
+            // `io.sleep`'s `arm_timer`) just before this call would be ORPHANED:
+            // raising without parking means this fiber never `wake`s (the path
+            // that normally evicts its timer), yet it runs to its end and is
+            // reaped — a later timer fire would then `wake` freed memory (UAF).
+            // Evict any pending timer / waiter for this fiber before unwinding.
+            cancel_timer_for(self, self.current);
+            cancel_io_waiter_for(self, self.current);
+            raise error.Canceled;
+        }
         // Record the parking fiber so a cross-fiber `ready(park)` (the worker that
-        // completes the awaited task) can find and wake it.
+        // completes the awaited task, or a `cancel` that wakes this worker) can
+        // find and wake it.
         park.handle = xx self.current;
         self.suspend_self();
+        // POST-RESUME check: a `cancel` that landed while we were parked woke us
+        // here (evicting any armed timer). Deliver `Canceled` so the worker
+        // abandons its body — its post-suspend lines never run.
+        if self.fiber_canceled(self.current) { raise error.Canceled; }
     }
 
+    // True iff fiber `f`'s task carries a cancel flag that is set. The flag is a
+    // `*Atomic(bool)` back-ref (`Fiber.cancel_flag`, erased to `*void`) to the
+    // owning `Future.canceled`. A null flag (uncancellable spawn / bare fiber)
+    // is never cancelled. A null `f` (called outside a fiber) likewise can't be
+    // cancelled — the caller's own loud `suspend_self` guard handles that path.
+    fiber_canceled :: (self: *Scheduler, f: *Fiber) -> bool {
+        if f == null { return false; }
+        if f.cancel_flag == null { return false; }
+        flag : *Atomic(bool) = xx f.cancel_flag;
+        return flag.load(.acquire);
+    }
+
     // Re-ready a fiber parked under `park` (its `handle` is the `*Fiber`, recorded