fix: aarch64-linux port of the M:1 fiber runtime (sched.sx)

Port library/modules/std/sched.sx to run on aarch64-linux alongside
aarch64-macOS, validated byte-identical on both via Apple `container`.

Per-OS bits are comptime-branched:
- MAP_AP (mmap MAP_PRIVATE | MAP_ANON flags): linux 0x22 / macOS 0x1002.
- fd-readiness backend: epoll on linux, kqueue on darwin (epoll import
  scoped to the linux branch). block_on_fd, the run-loop Mode-2 drain,
  and cancel_io_waiter_for each branch on OS; the epoll paths issue
  EPOLL_CTL_DEL on fire and on early wake, because EPOLLONESHOT only
  disables a registration, whereas kqueue's EV_ONESHOT removes it once
  it fires.
- first-entry trampoline: the hand-written global-asm symbol (which
  needs per-OS naming) is replaced by a naked sx fn fib_tramp
  (mov x0, x19; br x20) that dispatches register-indirectly (spawn
  presets regs[1], i.e. x20, to &fib_dispatch), dropping the per-OS
  .global symbol entirely.
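
The EPOLLONESHOT behaviour the second bullet relies on can be checked
outside sx. The following is a minimal Python sketch, not code from this
commit: it assumes a Linux host (it uses `select.epoll`), and the helper
name `oneshot_rearm_demo` is made up for illustration. It arms a one-shot
read registration on a pipe, lets it fire, and shows that a second
EPOLL_CTL_ADD fails with EEXIST until the registration is removed with
EPOLL_CTL_DEL.

```python
import os
import select


def oneshot_rearm_demo():
    """Illustrative only: returns (fired_once, silent, readd_blocked, rearmed)."""
    r, w = os.pipe()
    ep = select.epoll()
    try:
        # register() issues EPOLL_CTL_ADD: arm a one-shot read registration.
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        os.write(w, b"x")

        # The registration fires once for the readable pipe end.
        fired_once = [fd for fd, _ in ep.poll(1.0)] == [r]

        # The byte is still unread, but the registration is now disabled,
        # so a second poll reports nothing.
        silent = ep.poll(0) == []

        # Disabled is not removed: a fresh EPOLL_CTL_ADD fails with EEXIST.
        try:
            ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
            readd_blocked = False
        except FileExistsError:
            readd_blocked = True

        # unregister() issues EPOLL_CTL_DEL; after it the fd can be re-armed.
        ep.unregister(r)
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        rearmed = [fd for fd, _ in ep.poll(1.0)] == [r]

        return fired_once, silent, readd_blocked, rearmed
    finally:
        ep.close()
        os.close(r)
        os.close(w)
```

On Linux all four flags are expected to be true, which is why the linux
paths delete the registration after a fire and on early wake, while the
kqueue paths need no equivalent step after a fire.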

Fixes issue 0193 Bug A: the trampoline redesign bus-errored on the
go/wait/sleep capstone (1817) until `export "fib_dispatch"` was restored.
Without the export, fib_dispatch reverts to sx's internal ABI (x0 =
implicit context, first arg self shifted to x1), while the trampoline
hands self over in x0 (C-ABI). On first entry the body still runs,
because x1 happens to alias self, but the closure then loads
regs[1] == &fib_dispatch as its first capture and re-invokes
fib_dispatch forever -> stack overflow -> bus error. The export pins
fib_dispatch to the C-ABI (self in x0), matching the trampoline. Root
cause found via lldb on an AOT build and confirmed against the compiler
source.

Bug B (a top-level asm block wrapped in inline-if is dropped during the
comptime-conditional flatten) is carved out to issue 0194 (OPEN) -- no
live trigger remains, since the naked-fn trampoline sidesteps it.

1811/1814/1816/1817 run byte-identical on the aarch64-macOS host and in
an aarch64-linux container; full suite green (817/0). Documents the fiber
runtime in readme.md.
agra
2026-06-26 11:32:01 +03:00
parent 7218280bf0
commit 22f4719e83
5 changed files with 370 additions and 65 deletions


@@ -13,19 +13,28 @@
 // - `swap_context` (aarch64 `abi(.naked)`, 13-slot save area: x19..x28, fp,
 // lr, sp) saves the callee-saved registers + SP into `*from` and loads them
 // from `*to`, then `ret`s onto `to`'s stack.
-// - the `_fib_tramp` global-asm first-entry trampoline: x19 holds the
-// bootstrapped `*Fiber`; it moves it to x0 and `bl`s the exported generic
-// dispatch `fib_dispatch`, which calls the body then switches back to the
-// scheduler.
+// - the `fib_tramp` first-entry trampoline (a naked sx fn): x19 holds the
+// bootstrapped `*Fiber` and x20 = `&fib_dispatch`; it moves the fiber to x0
+// and `br`s through x20 to the C-ABI `fib_dispatch`, which calls the body
+// then switches back to the scheduler.
 // - guarded `mmap` stacks: `[GUARD | usable]`, low GUARD page `mprotect`'d
 // PROT_NONE, 16-aligned top returned as the bootstrapped SP.
 //
-// aarch64-macOS-pinned: the `swap_context` asm + the 13-slot save area are
-// per-arch; the `mmap` flag constants (MAP_ANON = 0x1000) and the 16 KB guard
-// page are Apple-specific. Runs end-to-end on a matching host, ir-only on a
-// mismatch.
+// aarch64-pinned (macOS + linux): the `swap_context` asm + the 13-slot save
+// area are per-arch. The per-OS bits are branched at comptime — `mmap`'s
+// MAP_ANON flag (`MAP_AP`) and the fd-readiness backend (kqueue on darwin,
+// epoll on linux). Runs end-to-end on a matching aarch64 host, ir-only on an
+// arch mismatch.
 #import "modules/std.sx";
 kqb :: #import "modules/std/net/kqueue.sx";
+// The fd-readiness backend is per-OS: kqueue (kqb, above) on darwin, epoll on
+// linux. The epoll import is scoped to the linux branch so darwin never pulls
+// epoll's types into the concurrency examples' type tables (the same
+// std-barrel-drift rule std.event.Loop follows); `block_on_fd` / the run loop
+// reference `ep` only inside their own `inline if OS == .linux` arms.
+inline if OS == .linux {
+ep :: #import "modules/std/net/epoll.sx";
+}
 // --- libc mmap stack primitives -------------------------------------------
@@ -40,7 +49,14 @@ abort :: () -> noreturn extern libc "abort";
 PROT_NONE :: 0;
 PROT_RW :: 3; // PROT_READ | PROT_WRITE
-MAP_AP :: 0x1002; // macOS MAP_PRIVATE (0x2) | MAP_ANON (0x1000)
+// Exhaustive on the SUPPORTED OSes (linux/macOS), no default case: an
+// unsupported target matches no case → MAP_AP undefined → a loud compile error
+// on use rather than a silent wrong flag. (The fiber runtime is aarch64-only
+// anyway — the swap_context asm — so only these two platforms are wired.)
+inline if OS == {
+case .linux: MAP_AP :: 0x22; // linux MAP_PRIVATE (0x2) | MAP_ANON (0x20)
+case .macos: MAP_AP :: 0x1002; // macOS MAP_PRIVATE (0x2) | MAP_ANON (0x1000)
+}
 GUARD :: 16384; // one 16 KB page (aarch64-macOS)
 STACK :: 131072; // 128 KB usable per fiber
@@ -172,10 +188,11 @@ Scheduler :: struct {
 self.n_spawned = self.n_spawned + 1;
 top := boot_stack(f, STACK);
-f.ctx.regs[0] = xx f; // x19 = self
-f.ctx.regs[10] = 0; // fp
-f.ctx.regs[11] = xx fib_tramp; // lr → trampoline
-f.ctx.regs[12] = top; // sp
+f.ctx.regs[0] = xx f; // x19 = self (→ x0 in the tramp)
+f.ctx.regs[1] = xx fib_dispatch; // x20 = dispatch entry (tramp `br`s to it)
+f.ctx.regs[10] = 0; // fp
+f.ctx.regs[11] = xx fib_tramp; // lr → trampoline
+f.ctx.regs[12] = top; // sp
 f.state = .ready;
 enqueue(self, f);
@@ -239,12 +256,13 @@ Scheduler :: struct {
 // but was woken by another path (a manual wake, a Task completion), its
 // `IoWaiter` would otherwise survive pointing at a fiber that runs to
 // completion and is reaped (stack munmap'd + Fiber freed). A later
-// kqueue drain matching that stale record would `wake` freed memory.
-// Evict it here. NOTE: we do NOT EV_DELETE the kqueue registration — it
-// is EV_ONESHOT, so a never-fired registration simply lingers in the
-// kernel queue until the fd is readable, at which point the drain finds
-// no matching waiter and ignores it (see `run`). The fd is the example's
-// to close; closing it auto-removes any pending registration.
+// readiness drain matching that stale record would `wake` freed memory.
+// Evict it here. The kernel-side registration is handled per-OS inside
+// `cancel_io_waiter_for`: on darwin the EV_ONESHOT kqueue registration is
+// left to linger (a never-fired one-shot the drain ignores; the fd's
+// owner closes it, auto-removing it), but on linux the EPOLLONESHOT
+// registration stays enabled and must be `EPOLL_CTL_DEL`'d (else it could
+// fire later with no waiter and would block a re-arm of the same fd).
 cancel_io_waiter_for(self, f);
 self.n_suspended = self.n_suspended - 1;
 f.state = .ready;
@@ -333,20 +351,38 @@ Scheduler :: struct {
 }
 j = j + 1;
 }
-// Lazily open the kqueue fd the first time fd-blocking is used.
+// Lazily open the event-queue fd the first time fd-blocking is used:
+// kqueue on darwin, epoll on linux. `self.kq` holds whichever — it is
+// just "the readiness queue fd".
 if self.kq < 0 {
-self.kq = kqb.kqueue();
+inline if OS == {
+case .linux: self.kq = ep.ep_create();
+case .macos: self.kq = kqb.kqueue();
+}
 if self.kq < 0 {
-print("sched: kqueue() failed to open the event queue\n");
+print("sched: failed to open the event queue\n");
 abort();
 }
 }
-// Arm a one-shot read-readiness registration for `fd`. udata is unused
-// (we match the waiter by fd in the drain), so pass 0.
-chg := kqb.kev_change(fd, kqb.EVFILT_READ, kqb.EV_ADD | kqb.EV_ENABLE | kqb.EV_ONESHOT, 0);
-if !kqb.kq_apply(self.kq, chg) {
-print("sched: kevent() failed to register fd {} for read readiness\n", fd);
-abort();
+// Arm a one-shot read-readiness registration for `fd`, matched back by
+// the run-loop drain (kqueue by ident; epoll stashes the fd in `data`).
+// darwin EV_ONESHOT auto-removes the registration on fire; epoll's
+// EPOLLONESHOT only DISABLES it, so the linux paths additionally
+// EPOLL_CTL_DEL on fire (run) and on early-wake (cancel_io_waiter_for).
+inline if OS == {
+case .linux: {
+if !ep.ep_ctl(self.kq, ep.EPOLL_CTL_ADD, fd, ep.EPOLLIN | ep.EPOLLONESHOT) {
+print("sched: epoll_ctl() failed to register fd {} for read readiness\n", fd);
+abort();
+}
+}
+case .macos: {
+chg := kqb.kev_change(fd, kqb.EVFILT_READ, kqb.EV_ADD | kqb.EV_ENABLE | kqb.EV_ONESHOT, 0);
+if !kqb.kq_apply(self.kq, chg) {
+print("sched: kevent() failed to register fd {} for read readiness\n", fd);
+abort();
+}
+}
 }
 // Record the waiter BEFORE parking — the run loop matches the fired
 // event's ident back to this record. Long-lived-container rule: the
@@ -407,20 +443,42 @@ Scheduler :: struct {
 // kernel reports at least one fd ready, then wake every waiter whose
 // fd fired. (null timeout via -1 → wait forever.)
 if self.io_waiters.len > 0 {
-evbuf : [MAXEV]kqb.Kevent = ---;
-n := kqb.kq_wait(self.kq, @evbuf[0], MAXEV, -1);
-if n < 0 {
-print("sched: kevent() wait failed while blocking on fd readiness\n");
-abort();
-}
-// For each fired event, find the io-waiter whose fd matches its
-// ident, evict it, and wake its fiber. EV_ONESHOT already removed
-// the kernel registration, so we only drop the waiter record.
-i := 0;
-while i < n {
-ready_fd : i32 = xx evbuf[i].ident;
-wake_io_waiter_for_fd(self, ready_fd);
-i = i + 1;
+// BLOCK on the readiness queue until ≥1 fd fires (timeout -1 =
+// forever), then for each fired event match the fd back to its
+// io-waiter, evict the record, and wake the fiber.
+inline if OS == {
+case .linux: {
+evbuf : [MAXEV]ep.EpollEvent = ---;
+n := ep.ep_wait(self.kq, .{ ptr = @evbuf[0], len = MAXEV }, MAXEV, -1);
+if n < 0 {
+print("sched: epoll_wait() failed while blocking on fd readiness\n");
+abort();
+}
+i := 0;
+while i < n {
+ready_fd := ep.ev_fd(evbuf[i]);
+wake_io_waiter_for_fd(self, ready_fd);
+// EPOLLONESHOT only DISABLED the registration; remove it
+// fully so the fd can be re-armed by a future block_on_fd
+// (kqueue's EV_ONESHOT removes it for free).
+ep.ep_ctl(self.kq, ep.EPOLL_CTL_DEL, ready_fd, 0);
+i = i + 1;
+}
+}
+case .macos: {
+evbuf : [MAXEV]kqb.Kevent = ---;
+n := kqb.kq_wait(self.kq, @evbuf[0], MAXEV, -1);
+if n < 0 {
+print("sched: kevent() wait failed while blocking on fd readiness\n");
+abort();
+}
+i := 0;
+while i < n {
+ready_fd : i32 = xx evbuf[i].ident;
+wake_io_waiter_for_fd(self, ready_fd);
+i = i + 1;
+}
+}
 }
 continue;
 }
@@ -539,23 +597,48 @@ ASM
 };
 }
-// First-entry trampoline: a fiber's bootstrapped LR points here. x19 holds the
-// `*Fiber` (preset in the saved context); move it to x0 and call the generic
-// dispatch.
-asm {
-#string T
-.global _fib_tramp
-_fib_tramp:
+// First-entry trampoline: a fiber's bootstrapped LR points here, with x19 =
+// `*Fiber` and x20 = `&fib_dispatch` (both preset in the saved context by
+// `spawn`, both callee-saved so `swap_context` restores them on first entry).
+// Move the fiber to x0 and tail-branch to dispatch via the REGISTER (x20) — so
+// there is no hand-written global-asm symbol and nothing here needs per-OS
+// symbol naming (`_fib_tramp` on darwin vs `fib_tramp` on linux) or a `bl` to a
+// named export. As a naked sx fn `fib_tramp`'s own symbol is emitted with the
+// platform-correct name automatically, so `spawn`'s `xx fib_tramp` resolves on
+// every target. This register-indirect bootstrap replaced an OS-conditional
+// global `asm` block (a top-level `asm` wrapped in an `inline if` is dropped in
+// this module's context — see issues/0193) and sidesteps the hand-written
+// symbol entirely, which is cleaner regardless.
+fib_tramp :: () abi(.naked) {
+asm volatile {
+#string T
 mov x0, x19
-bl _fib_dispatch
-brk #0
-T,
-};
-fib_tramp :: () extern;
+br x20
+T
+};
+}
-// The ONE place that runs a fiber body. Reached only from `_fib_tramp` on first
+// The ONE place that runs a fiber body. Reached only from `fib_tramp` on first
 // entry, on the fiber's own fresh stack. Runs the body, marks the fiber done,
 // and switches back to the scheduler — never returns past the final switch.
+//
+// `export "fib_dispatch"` is MANDATORY, not decorative: it pins this fn to the
+// **C ABI** (first real arg `self` in x0). The trampoline hands the fiber over
+// in x0 (`mov x0, x19; br x20`), which is exactly C-ABI. Drop the export and the
+// fn reverts to sx's INTERNAL calling convention, which reserves x0 for the
+// implicit `context` pointer and shifts `self` to x1 — so the trampoline's x0
+// would land in the context slot and `self` would be read from a garbage x1. On
+// first entry that garbage happens to alias `&fiber.ctx == self` (left in x1 by
+// the scheduler's prior `swap_context`), so the body runs once; but inside it
+// the closure loads `[Fiber+8] == regs[1] == &fib_dispatch` as its "first
+// capture" and re-invokes `fib_dispatch` forever → stack overflow → bus error
+// (issue 0193 Bug A, observed only on the go/wait/sleep capstone 1817).
+//
+// One consequence of the C-ABI boundary: an exported fn has no implicit
+// `context` param, so `self.body()` runs under the static `__sx_default_context`
+// — NOT whatever `push Context { allocator = ... }` was in force at the
+// `run()` call site. Fiber bodies do not inherit a caller-scoped allocator; a
+// body that needs one must capture it explicitly (the long-lived-container rule).
 fib_dispatch :: (self: *Fiber) export "fib_dispatch" {
 self.body();
 self.state = .done;
@@ -687,7 +770,19 @@ cancel_io_waiter_for :: (self: *Scheduler, f: *Fiber) {
 i := 0;
 while i < self.io_waiters.len {
 if self.io_waiters.items[i].fiber == f {
-remove_io_waiter(self, i);
+// Early-wake: the fiber is re-readied by another path while its fd
+// registration is still armed. kqueue's EV_ONESHOT lingers
+// harmlessly (a never-fired one-shot the drain ignores); epoll's
+// EPOLLONESHOT registration stays enabled — it could fire later with
+// no waiter, and blocks a re-arm of the same fd — so remove it.
+inline if OS == {
+case .linux: {
+fd := self.io_waiters.items[i].fd;
+remove_io_waiter(self, i);
+if self.kq >= 0 { ep.ep_ctl(self.kq, ep.EPOLL_CTL_DEL, fd, 0); }
+}
+case .macos: remove_io_waiter(self, i);
+}
 return;
 }
 i = i + 1;