feat: true cancellation for the fiber Io layer (PLAN-IO-UNIFY Phase 3)

A cancelled async worker now abandons its body at its next suspend instead
of running to completion.

- Cancel-flag back-ref (D4): SpawnOpts.cancel_flag (core.sx) + Fiber.cancel_flag
  (sched.sx), set from opts.cancel_flag in Scheduler.spawn_raw; async passes
  xx @f.canceled (the Future.canceled Atomic(bool) erased to *void).
- Delivery: Scheduler.suspend_raw consults fiber_canceled(self.current) PRE-park
  (raise without parking — no deadlock if cancel landed before the worker ran)
  and POST-resume (cancel landed while parked), raising error.Canceled.
  cancel(f) flips the sticky flag, marks .canceled, and wakes the worker.
- async worker is failable Closure() -> ($R, !); the completion closure
  f.value = worker() catch {…} marks .canceled/.failed and wakes the awaiter,
  so post-suspend side effects never run. New failable io.sleep(ms) is the
  cancellation point.
- Compiler: a -> ! fn whose only error source is try-ing a protocol method
  (io.suspend_raw) was wrongly flagged 'declared ! but never errors';
  collectErrorSites now marks a try of a non-identifier callee as a dynamic
  (opaque) error source, suppressing the warning.
- Two UAFs found by adversarial review and fixed: (1) cancel-before-park
  orphaned io.sleep's armed timer — suspend_raw's pre-park raise now evicts the
  current fiber's timer/waiter first; (2) cancel(f) could wake a reaped worker —
  now only wakes when was_pending.
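
The delivery rules above can be modelled in a few lines. This is a minimal
sketch in Python, not the sx implementation: generators stand in for fibers,
and every name here (Flag, Fiber.outcome, step, demo) is a hypothetical
stand-in chosen for illustration. It assumes a single-threaded scheduler with
a virtual clock and shows the pre-park check (with timer eviction), the
post-resume check, and the was_pending guard in cancel.

```python
from collections import deque

class Canceled(Exception):
    """Stand-in for error.Canceled."""

class Flag:
    """Stand-in for the sticky Atomic(bool) behind Future.canceled."""
    def __init__(self):
        self.value = False

class Fiber:
    def __init__(self, gen, cancel_flag):
        self.gen = gen
        self.cancel_flag = cancel_flag  # None models an uncancellable spawn
        self.outcome = None             # None while the fiber is pending

class Scheduler:
    def __init__(self):
        self.ready = deque()
        self.timers = {}    # fiber -> virtual deadline (ms)
        self.now = 0
        self.current = None

    def spawn(self, body, cancel_flag=None):
        fiber = Fiber(body(self), cancel_flag)
        self.ready.append(fiber)
        return fiber

    def fiber_canceled(self, fiber):
        flag = fiber.cancel_flag if fiber is not None else None
        return flag is not None and flag.value

    def suspend_raw(self):
        fiber = self.current
        if self.fiber_canceled(fiber):
            # Pre-park: evict the timer armed by sleep(), then raise unparked.
            self.timers.pop(fiber, None)
            raise Canceled()
        yield  # park until the scheduler resumes this fiber
        if self.fiber_canceled(fiber):
            raise Canceled()  # post-resume: cancel landed while parked

    def sleep(self, ms):
        self.timers[self.current] = self.now + ms
        yield from self.suspend_raw()

    def cancel(self, fiber):
        if fiber.cancel_flag is None:
            return  # uncancellable spawn: nothing to flip
        was_pending = fiber.outcome is None
        fiber.cancel_flag.value = True
        # Wake only a still-pending, parked worker; never touch a finished one.
        if was_pending and fiber in self.timers:
            del self.timers[fiber]
            self.ready.append(fiber)

    def step(self):
        fiber = self.ready.popleft()
        self.current = fiber
        try:
            next(fiber.gen)  # run until the fiber parks, finishes, or raises
        except StopIteration:
            fiber.outcome = "done"
        except Canceled:
            fiber.outcome = "canceled"
        finally:
            self.current = None

    def run(self):
        while self.ready or self.timers:
            if not self.ready:  # idle: advance virtual time to the next timer
                fiber = min(self.timers, key=self.timers.get)
                self.now = self.timers.pop(fiber)
                self.ready.append(fiber)
            self.step()

def demo(cancel_at):
    """cancel_at is 'never', 'before_run', or 'while_parked'."""
    sched, log = Scheduler(), []

    def worker(s):
        log.append("start")
        yield from s.sleep(10)
        log.append("after-sleep")  # post-suspend side effect

    fiber = sched.spawn(worker, Flag())
    if cancel_at == "before_run":
        sched.cancel(fiber)
    elif cancel_at == "while_parked":
        sched.step()  # worker runs until it parks inside sleep()
        sched.cancel(fiber)
    sched.run()
    return log, fiber.outcome, dict(sched.timers)
```

In this model, demo("never") logs both lines and finishes as "done", while
demo("before_run") and demo("while_parked") both stop after "start", finish
as "canceled", and leave no timer behind.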

Migrated 1805/1806/1824 to failable workers. Lock: example 1825 (seq: 1 -99,
post-suspend line never runs); byte-identical on aarch64-macOS + aarch64-linux.
.ir churn is the SpawnOpts layout change (type-table string renumbering).
agra
2026-06-28 09:19:01 +03:00
parent 45bd561a0d
commit 8bacb2b01c
54 changed files with 58249 additions and 57562 deletions


@@ -26,6 +26,11 @@
// epoll on linux). Runs end-to-end on a matching aarch64 host, ir-only on an
// arch mismatch.
#import "modules/std.sx";
// Phase 3 (true cancellation): `suspend_raw` reads the spawned fiber's cancel
// flag (a `*Atomic(bool)` back-ref to the `Future.canceled`) to decide whether
// to raise `error.Canceled` before parking or on resume. atomic.sx has no naked
// asm, so it is safe to pull in here (no fib_tramp re-emit).
#import "modules/std/atomic.sx";
// `race` synthesizes its result type (a tagged-union mirroring the input tuple's
// labels) and constructs the winner variant by runtime index — both need the
// metatype WRITE side (`make_enum`/`make_variant`/`EnumVariant`/`field_type`).
@@ -98,6 +103,13 @@ Fiber :: struct {
// not the blocking default. Behavior-preserving for fibers spawned under the
// default context (the capture just re-pushes that same default).
dctx: Context;
// Phase 3 (true cancellation): back-ref to this fiber's task cancel flag
// (the `Future.canceled` `Atomic(bool)`, erased to `*void`), set from
// `SpawnOpts.cancel_flag` at `spawn_raw`. `suspend_raw` consults it and
// raises `IoErr.Canceled` when set, so a cancelled worker abandons its body
// at its next suspend. `null` for a fiber spawned with no cancel channel
// (a bare `spawn`, or `spawn_raw` with `cancel_flag = null`).
cancel_flag: *void = null;
}
// A pending virtual-time timer: wake `fiber` once the virtual clock reaches
@@ -630,6 +642,9 @@ impl Io for Scheduler {
entry_fn : (*void) -> void = xx entry;
entry_fn(arg);
});
// Stash the cancel-flag back-ref (Phase 3): `suspend_raw` consults it on
// this fiber to raise `error.Canceled`. `null` for an uncancellable spawn.
f.cancel_flag = opts.cancel_flag;
return xx f;
}
@@ -638,10 +653,42 @@ impl Io for Scheduler {
// out here when the parked task was cancelled (wired in Phase 3). The M:1
// impl raises `error.Canceled` both pre-park and post-resume.
suspend_raw :: (self: *Scheduler, park: *ParkToken) -> ! {
// Phase 3 (true cancellation), PRE-PARK check: if this fiber's task was
// already cancelled before it reached this suspend, deliver immediately —
// do NOT park (a park with no pending waker would deadlock; the cancel's
// `ready` already fired as a no-op against a not-yet-suspended fiber).
if self.fiber_canceled(self.current) {
// A timer / fd-waiter armed by a higher-level suspend (e.g.
// `io.sleep`'s `arm_timer`) just before this call would be ORPHANED:
// raising without parking means this fiber never `wake`s (the path
// that normally evicts its timer), yet it runs to its end and is
// reaped — a later timer fire would then `wake` freed memory (UAF).
// Evict any pending timer / waiter for this fiber before unwinding.
cancel_timer_for(self, self.current);
cancel_io_waiter_for(self, self.current);
raise error.Canceled;
}
// Record the parking fiber so a cross-fiber `ready(park)` (the worker that
// completes the awaited task, or a `cancel` that wakes this worker) can
// find and wake it.
park.handle = xx self.current;
self.suspend_self();
// POST-RESUME check: a `cancel` that landed while we were parked woke us
// here (evicting any armed timer). Deliver `Canceled` so the worker
// abandons its body — its post-suspend lines never run.
if self.fiber_canceled(self.current) { raise error.Canceled; }
}
// True iff fiber `f`'s task carries a cancel flag that is set. The flag is a
// `*Atomic(bool)` back-ref (`Fiber.cancel_flag`, erased to `*void`) to the
// owning `Future.canceled`. A null flag (uncancellable spawn / bare fiber)
// is never cancelled. A null `f` (called outside a fiber) likewise can't be
// cancelled — the caller's own loud `suspend_self` guard handles that path.
fiber_canceled :: (self: *Scheduler, f: *Fiber) -> bool {
if f == null { return false; }
if f.cancel_flag == null { return false; }
flag : *Atomic(bool) = xx f.cancel_flag;
return flag.load(.acquire);
}
// Re-ready a fiber parked under `park` (its `handle` is the `*Fiber`, recorded