fix: aarch64-linux port of the M:1 fiber runtime (sched.sx)

Port library/modules/std/sched.sx to run on aarch64-linux alongside
aarch64-macOS, validated byte-identical on both via Apple `container`.

Per-OS bits are comptime-branched:
- MAP_AP (mmap MAP_ANON flag): linux 0x22 / macOS 0x1002.
- fd-readiness backend: epoll on linux, kqueue on darwin (epoll import
  scoped to the linux branch). block_on_fd, the run-loop Mode-2 drain,
  and cancel_io_waiter_for each branch; the epoll paths EPOLL_CTL_DEL on
  fire and on early-wake (EPOLLONESHOT only disables a registration;
  kqueue EV_ONESHOT auto-removes it).
- first-entry trampoline: a per-OS hand-written global-asm symbol
  becomes a naked sx fn fib_tramp (mov x0,x19; br x20) +
  register-indirect dispatch (spawn presets regs[1] == x20 ==
  &fib_dispatch), dropping the per-OS .global symbol entirely.

Fixes issue 0193 Bug A: the trampoline redesign bus-errored on the
go/wait/sleep capstone (1817) until `export "fib_dispatch"` was restored.
Without the export, fib_dispatch reverts to sx's internal ABI (x0 =
implicit context, first arg self shifted to x1) while the trampoline
hands self over in x0 (C-ABI); on first entry the body runs (x1 happens
to alias self) but the closure then loads regs[1] == &fib_dispatch as its
first capture and re-invokes fib_dispatch forever -> stack overflow ->
bus error. The export pins fib_dispatch to the C-ABI (self in x0),
matching the trampoline. Root cause found via lldb on an AOT build;
confirmed against the compiler source.

Bug B (a top-level asm block wrapped in inline-if is dropped during the
comptime-conditional flatten) is carved out to issue 0194 (OPEN) -- no
live trigger remains, since the naked-fn trampoline sidesteps it.

1811/1814/1816/1817 run byte-identical on the aarch64-macOS host and in
an aarch64-linux container; full suite green (817/0).

Documents the fiber runtime in readme.md.
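
For readers less familiar with the stack primitive the MAP_AP bullet feeds into, here is a minimal C sketch of a guarded `mmap` stack laid out as `[GUARD | usable]`. It is an illustration under stated assumptions, not the runtime's `boot_stack`: the helper name `alloc_guarded_stack` and its signature are hypothetical, and it takes the flag values from the libc headers instead of hard-coding them per OS.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON /* some SDKs only spell it MAP_ANON */
#endif

/* Hypothetical helper (not the runtime's boot_stack): map [GUARD | usable],
 * make the low guard region inaccessible, and return the 16-byte-aligned top
 * of the usable region (the initial SP), or NULL on failure. */
static void *alloc_guarded_stack(size_t guard, size_t usable) {
    assert(guard > 0 && usable >= 16);

    /* MAP_PRIVATE | MAP_ANONYMOUS is what the commit spells MAP_AP:
     * 0x22 on aarch64-linux, 0x1002 on macOS. */
    void *base = mmap(NULL, guard + usable, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return NULL;

    /* Stacks grow down, so the guard sits at the low end of the mapping. */
    if (mprotect(base, guard, PROT_NONE) != 0) {
        munmap(base, guard + usable);
        return NULL;
    }

    uintptr_t top = (uintptr_t)base + guard + usable;
    return (void *)(top & ~(uintptr_t)15); /* AArch64 requires a 16-aligned SP */
}
```

Calling `alloc_guarded_stack(16384, 131072)` mirrors the `GUARD` and `STACK` constants in the diff. A write just below the returned top succeeds, while a fiber that overruns its usable region hits the PROT_NONE guard and faults instead of silently corrupting a neighbouring mapping.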
@@ -13,19 +13,28 @@
 // - `swap_context` (aarch64 `abi(.naked)`, 13-slot save area: x19..x28, fp,
 // lr, sp) saves the callee-saved registers + SP into `*from` and loads them
 // from `*to`, then `ret`s onto `to`'s stack.
-// - the `_fib_tramp` global-asm first-entry trampoline: x19 holds the
-// bootstrapped `*Fiber`; it moves it to x0 and `bl`s the exported generic
-// dispatch `fib_dispatch`, which calls the body then switches back to the
-// scheduler.
+// - the `fib_tramp` first-entry trampoline (a naked sx fn): x19 holds the
+// bootstrapped `*Fiber` and x20 = `&fib_dispatch`; it moves the fiber to x0
+// and `br`s through x20 to the C-ABI `fib_dispatch`, which calls the body
+// then switches back to the scheduler.
 // - guarded `mmap` stacks: `[GUARD | usable]`, low GUARD page `mprotect`'d
 // PROT_NONE, 16-aligned top returned as the bootstrapped SP.
 //
-// aarch64-macOS-pinned: the `swap_context` asm + the 13-slot save area are
-// per-arch; the `mmap` flag constants (MAP_ANON = 0x1000) and the 16 KB guard
-// page are Apple-specific. Runs end-to-end on a matching host, ir-only on a
-// mismatch.
+// aarch64-pinned (macOS + linux): the `swap_context` asm + the 13-slot save
+// area are per-arch. The per-OS bits are branched at comptime — `mmap`'s
+// MAP_ANON flag (`MAP_AP`) and the fd-readiness backend (kqueue on darwin,
+// epoll on linux). Runs end-to-end on a matching aarch64 host, ir-only on an
+// arch mismatch.
 #import "modules/std.sx";
 kqb :: #import "modules/std/net/kqueue.sx";
+// The fd-readiness backend is per-OS: kqueue (kqb, above) on darwin, epoll on
+// linux. The epoll import is scoped to the linux branch so darwin never pulls
+// epoll's types into the concurrency examples' type tables (the same
+// std-barrel-drift rule std.event.Loop follows); `block_on_fd` / the run loop
+// reference `ep` only inside their own `inline if OS == .linux` arms.
+inline if OS == .linux {
+    ep :: #import "modules/std/net/epoll.sx";
+}
 
 // --- libc mmap stack primitives -------------------------------------------
 
@@ -40,7 +49,14 @@ abort :: () -> noreturn extern libc "abort";
 
 PROT_NONE :: 0;
 PROT_RW :: 3; // PROT_READ | PROT_WRITE
-MAP_AP :: 0x1002; // macOS MAP_PRIVATE (0x2) | MAP_ANON (0x1000)
+// Exhaustive on the SUPPORTED OSes (linux/macOS), no default case: an
+// unsupported target matches no case → MAP_AP undefined → a loud compile error
+// on use rather than a silent wrong flag. (The fiber runtime is aarch64-only
+// anyway — the swap_context asm — so only these two platforms are wired.)
+inline if OS == {
+    case .linux: MAP_AP :: 0x22; // linux MAP_PRIVATE (0x2) | MAP_ANON (0x20)
+    case .macos: MAP_AP :: 0x1002; // macOS MAP_PRIVATE (0x2) | MAP_ANON (0x1000)
+}
 GUARD :: 16384; // one 16 KB page (aarch64-macOS)
 STACK :: 131072; // 128 KB usable per fiber
 
@@ -172,10 +188,11 @@ Scheduler :: struct {
 self.n_spawned = self.n_spawned + 1;
 
 top := boot_stack(f, STACK);
-f.ctx.regs[0] = xx f; // x19 = self
-f.ctx.regs[10] = 0; // fp
-f.ctx.regs[11] = xx fib_tramp; // lr → trampoline
-f.ctx.regs[12] = top; // sp
+f.ctx.regs[0] = xx f; // x19 = self (→ x0 in the tramp)
+f.ctx.regs[1] = xx fib_dispatch; // x20 = dispatch entry (tramp `br`s to it)
+f.ctx.regs[10] = 0; // fp
+f.ctx.regs[11] = xx fib_tramp; // lr → trampoline
+f.ctx.regs[12] = top; // sp
 
 f.state = .ready;
 enqueue(self, f);
@@ -239,12 +256,13 @@ Scheduler :: struct {
 // but was woken by another path (a manual wake, a Task completion), its
 // `IoWaiter` would otherwise survive pointing at a fiber that runs to
 // completion and is reaped (stack munmap'd + Fiber freed). A later
-// kqueue drain matching that stale record would `wake` freed memory.
-// Evict it here. NOTE: we do NOT EV_DELETE the kqueue registration — it
-// is EV_ONESHOT, so a never-fired registration simply lingers in the
-// kernel queue until the fd is readable, at which point the drain finds
-// no matching waiter and ignores it (see `run`). The fd is the example's
-// to close; closing it auto-removes any pending registration.
+// readiness drain matching that stale record would `wake` freed memory.
+// Evict it here. The kernel-side registration is handled per-OS inside
+// `cancel_io_waiter_for`: on darwin the EV_ONESHOT kqueue registration is
+// left to linger (a never-fired one-shot the drain ignores; the fd's
+// owner closes it, auto-removing it), but on linux the EPOLLONESHOT
+// registration stays enabled and must be `EPOLL_CTL_DEL`'d (else it could
+// fire later with no waiter and would block a re-arm of the same fd).
 cancel_io_waiter_for(self, f);
 self.n_suspended = self.n_suspended - 1;
 f.state = .ready;
@@ -333,20 +351,38 @@ Scheduler :: struct {
     }
     j = j + 1;
 }
-// Lazily open the kqueue fd the first time fd-blocking is used.
+// Lazily open the event-queue fd the first time fd-blocking is used:
+// kqueue on darwin, epoll on linux. `self.kq` holds whichever — it is
+// just "the readiness queue fd".
 if self.kq < 0 {
-    self.kq = kqb.kqueue();
+    inline if OS == {
+        case .linux: self.kq = ep.ep_create();
+        case .macos: self.kq = kqb.kqueue();
+    }
     if self.kq < 0 {
-        print("sched: kqueue() failed to open the event queue\n");
+        print("sched: failed to open the event queue\n");
         abort();
     }
 }
-// Arm a one-shot read-readiness registration for `fd`. udata is unused
-// (we match the waiter by fd in the drain), so pass 0.
-chg := kqb.kev_change(fd, kqb.EVFILT_READ, kqb.EV_ADD | kqb.EV_ENABLE | kqb.EV_ONESHOT, 0);
-if !kqb.kq_apply(self.kq, chg) {
-    print("sched: kevent() failed to register fd {} for read readiness\n", fd);
-    abort();
+// Arm a one-shot read-readiness registration for `fd`, matched back by
+// the run-loop drain (kqueue by ident; epoll stashes the fd in `data`).
+// darwin EV_ONESHOT auto-removes the registration on fire; epoll's
+// EPOLLONESHOT only DISABLES it, so the linux paths additionally
+// EPOLL_CTL_DEL on fire (run) and on early-wake (cancel_io_waiter_for).
+inline if OS == {
+    case .linux: {
+        if !ep.ep_ctl(self.kq, ep.EPOLL_CTL_ADD, fd, ep.EPOLLIN | ep.EPOLLONESHOT) {
+            print("sched: epoll_ctl() failed to register fd {} for read readiness\n", fd);
+            abort();
+        }
+    }
+    case .macos: {
+        chg := kqb.kev_change(fd, kqb.EVFILT_READ, kqb.EV_ADD | kqb.EV_ENABLE | kqb.EV_ONESHOT, 0);
+        if !kqb.kq_apply(self.kq, chg) {
+            print("sched: kevent() failed to register fd {} for read readiness\n", fd);
+            abort();
+        }
+    }
 }
 // Record the waiter BEFORE parking — the run loop matches the fired
 // event's ident back to this record. Long-lived-container rule: the
@@ -407,20 +443,42 @@ Scheduler :: struct {
 // kernel reports at least one fd ready, then wake every waiter whose
 // fd fired. (null timeout via -1 → wait forever.)
 if self.io_waiters.len > 0 {
-    evbuf : [MAXEV]kqb.Kevent = ---;
-    n := kqb.kq_wait(self.kq, @evbuf[0], MAXEV, -1);
-    if n < 0 {
-        print("sched: kevent() wait failed while blocking on fd readiness\n");
-        abort();
-    }
-    // For each fired event, find the io-waiter whose fd matches its
-    // ident, evict it, and wake its fiber. EV_ONESHOT already removed
-    // the kernel registration, so we only drop the waiter record.
-    i := 0;
-    while i < n {
-        ready_fd : i32 = xx evbuf[i].ident;
-        wake_io_waiter_for_fd(self, ready_fd);
-        i = i + 1;
+    // BLOCK on the readiness queue until ≥1 fd fires (timeout -1 =
+    // forever), then for each fired event match the fd back to its
+    // io-waiter, evict the record, and wake the fiber.
+    inline if OS == {
+        case .linux: {
+            evbuf : [MAXEV]ep.EpollEvent = ---;
+            n := ep.ep_wait(self.kq, .{ ptr = @evbuf[0], len = MAXEV }, MAXEV, -1);
+            if n < 0 {
+                print("sched: epoll_wait() failed while blocking on fd readiness\n");
+                abort();
+            }
+            i := 0;
+            while i < n {
+                ready_fd := ep.ev_fd(evbuf[i]);
+                wake_io_waiter_for_fd(self, ready_fd);
+                // EPOLLONESHOT only DISABLED the registration; remove it
+                // fully so the fd can be re-armed by a future block_on_fd
+                // (kqueue's EV_ONESHOT removes it for free).
+                ep.ep_ctl(self.kq, ep.EPOLL_CTL_DEL, ready_fd, 0);
+                i = i + 1;
+            }
+        }
+        case .macos: {
+            evbuf : [MAXEV]kqb.Kevent = ---;
+            n := kqb.kq_wait(self.kq, @evbuf[0], MAXEV, -1);
+            if n < 0 {
+                print("sched: kevent() wait failed while blocking on fd readiness\n");
+                abort();
+            }
+            i := 0;
+            while i < n {
+                ready_fd : i32 = xx evbuf[i].ident;
+                wake_io_waiter_for_fd(self, ready_fd);
+                i = i + 1;
+            }
+        }
     }
     continue;
 }
@@ -539,23 +597,48 @@ ASM
     };
 }
 
-// First-entry trampoline: a fiber's bootstrapped LR points here. x19 holds the
-// `*Fiber` (preset in the saved context); move it to x0 and call the generic
-// dispatch.
-asm {
-    #string T
-.global _fib_tramp
-_fib_tramp:
+// First-entry trampoline: a fiber's bootstrapped LR points here, with x19 =
+// `*Fiber` and x20 = `&fib_dispatch` (both preset in the saved context by
+// `spawn`, both callee-saved so `swap_context` restores them on first entry).
+// Move the fiber to x0 and tail-branch to dispatch via the REGISTER (x20) — so
+// there is no hand-written global-asm symbol and nothing here needs per-OS
+// symbol naming (`_fib_tramp` on darwin vs `fib_tramp` on linux) or a `bl` to a
+// named export. As a naked sx fn `fib_tramp`'s own symbol is emitted with the
+// platform-correct name automatically, so `spawn`'s `xx fib_tramp` resolves on
+// every target. This register-indirect bootstrap replaced an OS-conditional
+// global `asm` block (a top-level `asm` wrapped in an `inline if` is dropped in
+// this module's context — see issues/0193) and sidesteps the hand-written
+// symbol entirely, which is cleaner regardless.
+fib_tramp :: () abi(.naked) {
+    asm volatile {
+        #string T
         mov x0, x19
-        bl _fib_dispatch
-        brk #0
-T,
-};
-fib_tramp :: () extern;
+        br x20
+        T
+    };
+}
 
-// The ONE place that runs a fiber body. Reached only from `_fib_tramp` on first
+// The ONE place that runs a fiber body. Reached only from `fib_tramp` on first
 // entry, on the fiber's own fresh stack. Runs the body, marks the fiber done,
 // and switches back to the scheduler — never returns past the final switch.
+//
+// `export "fib_dispatch"` is MANDATORY, not decorative: it pins this fn to the
+// **C ABI** (first real arg `self` in x0). The trampoline hands the fiber over
+// in x0 (`mov x0, x19; br x20`), which is exactly C-ABI. Drop the export and the
+// fn reverts to sx's INTERNAL calling convention, which reserves x0 for the
+// implicit `context` pointer and shifts `self` to x1 — so the trampoline's x0
+// would land in the context slot and `self` would be read from a garbage x1. On
+// first entry that garbage happens to alias `&fiber.ctx == self` (left in x1 by
+// the scheduler's prior `swap_context`), so the body runs once; but inside it
+// the closure loads `[Fiber+8] == regs[1] == &fib_dispatch` as its "first
+// capture" and re-invokes `fib_dispatch` forever → stack overflow → bus error
+// (issue 0193 Bug A, observed only on the go/wait/sleep capstone 1817).
+//
+// One consequence of the C-ABI boundary: an exported fn has no implicit
+// `context` param, so `self.body()` runs under the static `__sx_default_context`
+// — NOT whatever `push Context { allocator = ... }` was in force at the
+// `run()` call site. Fiber bodies do not inherit a caller-scoped allocator; a
+// body that needs one must capture it explicitly (the long-lived-container rule).
 fib_dispatch :: (self: *Fiber) export "fib_dispatch" {
     self.body();
     self.state = .done;
@@ -687,7 +770,19 @@ cancel_io_waiter_for :: (self: *Scheduler, f: *Fiber) {
 i := 0;
 while i < self.io_waiters.len {
     if self.io_waiters.items[i].fiber == f {
-        remove_io_waiter(self, i);
+        // Early-wake: the fiber is re-readied by another path while its fd
+        // registration is still armed. kqueue's EV_ONESHOT lingers
+        // harmlessly (a never-fired one-shot the drain ignores); epoll's
+        // EPOLLONESHOT registration stays enabled — it could fire later with
+        // no waiter, and blocks a re-arm of the same fd — so remove it.
+        inline if OS == {
+            case .linux: {
+                fd := self.io_waiters.items[i].fd;
+                remove_io_waiter(self, i);
+                if self.kq >= 0 { ep.ep_ctl(self.kq, ep.EPOLL_CTL_DEL, fd, 0); }
+            }
+            case .macos: remove_io_waiter(self, i);
+        }
         return;
     }
     i = i + 1;
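
The EPOLLONESHOT behaviour that motivates the extra `EPOLL_CTL_DEL` calls in this commit can be reproduced outside the runtime. The following is a minimal, linux-only C sketch against the raw libc epoll API, not the `ep` wrapper module from the diff; the struct and function names (`oneshot_result`, `oneshot_rearm_demo`) are hypothetical and exist only for this illustration.

```c
#include <assert.h>   /* for the checks described in the note below */
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical result record for the demonstration. */
struct oneshot_result {
    int ok;               /* 1 if setup (pipe, epoll fd, first ADD, write) worked */
    int fired;            /* events returned by the first epoll_wait */
    int second_wait;      /* events returned by a second, non-blocking wait */
    int readd_rc;         /* EPOLL_CTL_ADD on the still-registered fd */
    int readd_errno;      /* errno from that failed ADD */
    int add_after_del_rc; /* EPOLL_CTL_ADD after an explicit EPOLL_CTL_DEL */
};

static struct oneshot_result oneshot_rearm_demo(void) {
    struct oneshot_result r = {0};
    int p[2];
    if (pipe(p) != 0) return r;
    int ep = epoll_create1(0);
    if (ep < 0) { close(p[0]); close(p[1]); return r; }

    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = p[0]; /* stash the fd in `data`, as the commit describes */
    struct epoll_event out;

    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        r.ok = 1;
        /* The one-shot fires once for the readable pipe end. */
        r.fired = epoll_wait(ep, &out, 1, 1000);
        /* The byte is still unread, but the registration is now disabled. */
        r.second_wait = epoll_wait(ep, &out, 1, 0);
        /* Disabled is not removed: a fresh ADD is rejected. */
        r.readd_rc = epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev);
        r.readd_errno = errno;
        /* After an explicit DEL the fd can be armed again. */
        epoll_ctl(ep, EPOLL_CTL_DEL, p[0], NULL);
        r.add_after_del_rc = epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev);
    }
    close(ep);
    close(p[0]);
    close(p[1]);
    return r;
}
```

On linux the first wait reports one event, the second reports none, the repeated `EPOLL_CTL_ADD` fails with `EEXIST`, and the ADD after `EPOLL_CTL_DEL` succeeds. That is the sequence the fire path in `run` and the early-wake path in `cancel_io_waiter_for` guard against. Re-enabling with `EPOLL_CTL_MOD` would be another way to re-arm a fired one-shot; the commit chooses removal, which also leaves no stale registration behind once the waiter record is gone.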