fix: aarch64-linux port of the M:1 fiber runtime (sched.sx)

Port library/modules/std/sched.sx to run on aarch64-linux alongside
aarch64-macOS, validated byte-identical on both via Apple `container`.

Per-OS bits are comptime-branched:

- MAP_AP (mmap MAP_ANON flag): linux 0x22 / macOS 0x1002.
- fd-readiness backend: epoll on linux, kqueue on darwin (epoll import
  scoped to the linux branch). block_on_fd, the run-loop Mode-2 drain,
  and cancel_io_waiter_for each branch; the epoll paths EPOLL_CTL_DEL on
  fire and on early-wake (EPOLLONESHOT only disables a registration;
  kqueue EV_ONESHOT auto-removes it).
- first-entry trampoline: a per-OS hand-written global-asm symbol
  becomes a naked sx fn fib_tramp (mov x0,x19; br x20) +
  register-indirect dispatch (spawn presets regs[1] == x20 ==
  &fib_dispatch), dropping the per-OS .global symbol entirely.

Fixes issue 0193 Bug A: the trampoline redesign bus-errored on the
go/wait/sleep capstone (1817) until `export "fib_dispatch"` was restored.
Without the export, fib_dispatch reverts to sx's internal ABI (x0 =
implicit context, first arg self shifted to x1) while the trampoline
hands self over in x0 (C-ABI); on first entry the body runs (x1 happens
to alias self) but the closure then loads regs[1] == &fib_dispatch as its
first capture and re-invokes fib_dispatch forever -> stack overflow ->
bus error. The export pins fib_dispatch to the C-ABI (self in x0),
matching the trampoline. Root cause found via lldb on an AOT build;
confirmed against the compiler source.

Bug B (a top-level asm block wrapped in inline-if is dropped during the
comptime-conditional flatten) is carved out to issue 0194 (OPEN) -- no
live trigger remains, since the naked-fn trampoline sidesteps it.

1811/1814/1816/1817 run byte-identical on the aarch64-macOS host and in
an aarch64-linux container; full suite green (817/0).

Documents the fiber runtime in readme.md.
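For readers unfamiliar with the epoll/kqueue asymmetry the message mentions, here is a minimal C sketch, independent of sched.sx, of why the epoll paths need an explicit `EPOLL_CTL_DEL`. The name `oneshot_demo` is hypothetical; the sketch only exercises the documented Linux behavior that `EPOLLONESHOT` disables a registration after it fires but leaves it in the interest list. On non-Linux hosts the function is a stub.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#if defined(__linux__)
#include <sys/epoll.h>

/* Hypothetical demo, not taken from sched.sx.
 * Returns 0 if EPOLLONESHOT behaves as described, -1 otherwise. */
int oneshot_demo(void) {
    int p[2];
    if (pipe(p) != 0) return -1;
    int ep = epoll_create1(0);
    if (ep < 0) { close(p[0]); close(p[1]); return -1; }

    int rc = -1;
    struct epoll_event ev = {0}, out;
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = p[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) != 0) goto done;

    /* First readiness event is delivered. */
    if (write(p[1], "a", 1) != 1) goto done;
    if (epoll_wait(ep, &out, 1, 0) != 1) goto done;

    /* The fd is still readable, but the registration is now disabled. */
    if (write(p[1], "b", 1) != 1) goto done;
    if (epoll_wait(ep, &out, 1, 0) != 0) goto done;

    /* Disabled is not removed: adding the same fd again fails with EEXIST. */
    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) != -1 || errno != EEXIST) goto done;

    /* So the waiter has to remove the registration explicitly. */
    if (epoll_ctl(ep, EPOLL_CTL_DEL, p[0], NULL) != 0) goto done;

    /* Only now is it gone: a second delete reports ENOENT. */
    if (epoll_ctl(ep, EPOLL_CTL_DEL, p[0], NULL) != -1 || errno != ENOENT) goto done;

    rc = 0;
done:
    close(ep);
    close(p[0]);
    close(p[1]);
    return rc;
}
#else
/* Stub for non-Linux hosts: kqueue's EV_ONESHOT removes the event itself. */
int oneshot_demo(void) { return 0; }
#endif
```

Calling `oneshot_demo()` from a test `main` and checking for a zero return is enough to confirm the behavior on a given kernel. A runtime that re-arms with `EPOLL_CTL_MOD` instead of deleting would be a different design; the commit describes delete-on-fire and delete-on-early-wake.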
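The two `MAP_AP` constants in the message are the usual "private + anonymous" mmap flag pair on each OS. The C sketch below shows the same per-OS selection done by the platform headers instead of a comptime branch. `map_ap_flags`, `alloc_stack`, and `free_stack` are hypothetical names for illustration, not sched.sx functions, and the numeric check assumes the common Linux ABIs (aarch64, x86_64) where `MAP_ANONYMOUS` is 0x20.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Some platforms only spell the flag MAP_ANON. */
#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: flags for a private anonymous mapping.
 * Expected to equal 0x22 on aarch64/x86_64 Linux and 0x1002 on macOS. */
static int map_ap_flags(void) {
    return MAP_PRIVATE | MAP_ANONYMOUS;
}

/* Hypothetical helper: map `len` bytes of zeroed read/write memory,
 * for example for a fiber stack. Returns NULL on failure. */
static void *alloc_stack(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, map_ap_flags(), -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a mapping made by alloc_stack.
 * Returns 0 on success, like munmap. */
static int free_stack(void *p, size_t len) {
    return munmap(p, len);
}
```

A caller would pair the two helpers, e.g. `void *s = alloc_stack(64 * 1024); ... free_stack(s, 64 * 1024);`. Hard-coding the numbers, as the sx source does, trades header access for an exhaustive per-OS branch that fails loudly on an unsupported target.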
@@ -4,7 +4,32 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step
 per the cadence rule). New corpus category: `18xx` concurrency.

 ## Last completed step
-**B1 follow-up — `Scheduler.deinit` (close the bounded leaks).** Post-B1 non-blocking cleanup: a
+**B1.6 — aarch64-LINUX port of the M:1 fiber runtime (sched.sx).** `library/modules/std/sched.sx`
+now runs end-to-end on aarch64-linux as well as aarch64-macOS, validated **byte-identical** on both
+via Apple `container` (static ELF, no emulation). The per-OS bits are comptime-branched:
+- `MAP_AP` (mmap MAP_ANON flag) — `inline if OS == { case .linux: 0x22 case .macos: 0x1002 }`,
+exhaustive on the supported OSes (no default → a new target fails loud on use).
+- The fd-readiness backend — kqueue on darwin, **epoll on linux**. The `epoll` import is scoped to
+the linux branch (`inline if OS == .linux { ep :: #import "modules/std/net/epoll.sx" }`) so darwin
+never pulls epoll types into the concurrency examples (the std-barrel-drift rule). `block_on_fd`, the
+run-loop Mode-2 drain, and `cancel_io_waiter_for` each branch kqueue/epoll; epoll additionally
+`EPOLL_CTL_DEL`s on fire + on early-wake (EPOLLONESHOT only DISABLES, kqueue EV_ONESHOT auto-removes).
+- The first-entry trampoline was redesigned from a per-OS hand-written global-asm symbol to a **naked
+sx fn** `fib_tramp` (`mov x0, x19; br x20`) + register-indirect dispatch (spawn presets
+`regs[1] == x20 == &fib_dispatch`), so no per-OS `.global _fib_tramp`/`fib_tramp` symbol literal is
+needed. This sidesteps a compiler bug (wrapped top-level `asm` dropped — now **issue 0194**, OPEN).
+
+**Bug fixed en route (issue 0193 Bug A):** the tramp redesign initially bus-errored on the 1817
+go/wait/sleep capstone (both OSes) because the WIP had dropped `export "fib_dispatch"`. Without the
+export `fib_dispatch` uses sx's internal ABI (x0 = implicit `context`, `self` shifted to x1), but the
+trampoline hands `self` in x0 (C-ABI) → on first entry the body runs (x1 happens to alias `self`) but
+the closure then loads `regs[1] == &fib_dispatch` as its first capture and recurses forever → stack
+overflow. **Fix: restore `export "fib_dispatch"`** (pins it to C-ABI, `self` in x0). Root cause found
+via lldb on an AOT macOS build; confirmed by an adversarial source review (`src/ir/lower/decl.zig`).
+The 1817 capstone in the suite guards the fix. Suite GREEN **817/0**; 1811/1814/1816/1817 byte-identical
+macOS host ↔ aarch64-linux container.
+
+### Earlier — B1 follow-up — `Scheduler.deinit` (close the bounded leaks). Post-B1 non-blocking cleanup: a
 terminal `deinit` on `library/modules/std/sched.sx`'s `Scheduler` releases the resources B1 documented
 as leaked. Frees, in order: (1) any fibers still enqueued ready (leak-safety net for `spawn`/`go`
 without `run()` — `munmap` stack + free struct; a suspended off-queue fiber is unreachable, but a clean
@@ -401,12 +426,12 @@ env remains unfreeable (language limitation). Locked by `18xx` 1800–1820 (nake
 blocking async, the switch + §10.7 stress gate + guarded stacks + Win64 sibling, scheduler round-robin,
 suspend/wake, async go/wait/cancel, sim-timer ordering, timer early-wake eviction, kqueue pipe I/O, the
 **1817 end-to-end capstone**, sleep-negative/double-wait guards, and **1820 scheduler-deinit**). Suite
-GREEN **759/0**, committed.
+GREEN **817/0**, committed. **B1.6: now also runs on aarch64-linux** (epoll fd-backend + comptime-branched
+`MAP_AP` + naked-fn trampoline) — validated byte-identical to macOS in an Apple `container`.

-Future work (none blocking B1): a **linux epoll twin** of `block_on_fd` (mirror via `std/net/epoll`;
-OS-neutral facade `std.event`) — B1.4c wired macOS kqueue only; routing the suspending async through
-the erased `context.io` (forces sched.sx into every std consumer + duplicates the `_fib_tramp` global
-asm — deferred to the M:N model, where the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/
+Future work (none blocking B1): routing the suspending async through
+the erased `context.io` (forces sched.sx into every std consumer — deferred to the M:N model, where
+the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/
 `arm_timer`/`poll` hooks take over); `Future(void)`/`timeout` (issue 0150); freeing the heap-Task /
 closure-env / kq-fd (a Scheduler `deinit` + closure-env-ownership affordance). **Next carve: Stream
 B2** (channels / structured cancel / async stdlib) — see PLAN-CHANNELS.md when started.
@@ -684,6 +709,19 @@ incomplete); a dedicated effort; lambda workers are the idiom meanwhile.
 trusted. `18xx` asserts program-emitted ordering contracts, not raw interleaving.

 ## Log
+- **B1.6 — aarch64-linux port of sched.sx.** Comptime-branched the per-OS bits: `MAP_AP` (linux
+`0x22` / macOS `0x1002`), the fd-readiness backend (epoll on linux, kqueue on darwin — epoll import
+scoped to the linux branch; `block_on_fd` / run-loop Mode-2 / `cancel_io_waiter_for` each branch,
+epoll `EPOLL_CTL_DEL`s on fire + early-wake), and the first-entry trampoline (per-OS global-asm
+symbol → naked sx fn `fib_tramp` + register-indirect `br x20` to `&fib_dispatch` preset in
+`regs[1]`). **Fixed issue 0193 Bug A:** the tramp redesign bus-errored on 1817 (both OSes) until
+`export "fib_dispatch"` was restored — without it the fn uses sx's internal ABI (x0 = implicit
+`context`, `self` → x1) while the trampoline supplies `self` in x0, so the closure loads
+`regs[1] == &fib_dispatch` as its first capture and recurses forever → stack-overflow bus error.
+Root cause found via lldb (AOT macOS build) + an adversarial source review. **Bug B** (wrapped
+top-level `asm` dropped) carved to **issue 0194** (OPEN; no live trigger — the naked-fn tramp
+sidesteps it). Validated byte-identical on aarch64-macOS host AND aarch64-linux Apple `container`
+for 1811/1814/1816/1817; full suite GREEN **817/0**.
 - **B1 follow-up — `Scheduler.deinit`.** Closes the bounded leaks B1 documented. Added a `task_allocs:
 List(*void)` field (appended in `go` so the scheduler can reach its generic `Task($R)`s) + a canonical
 `close` extern, then a terminal idempotent `deinit`: reap leftover ready fibers (`munmap` + free) →