fix: aarch64-linux port of the M:1 fiber runtime (sched.sx)

Port library/modules/std/sched.sx to run on aarch64-linux alongside
aarch64-macOS, validated byte-identical on both via Apple `container`.

Per-OS bits are comptime-branched:

- MAP_AP (mmap MAP_PRIVATE | MAP_ANON flags): linux 0x22 / macOS 0x1002.
- fd-readiness backend: epoll on linux, kqueue on darwin (epoll import
  scoped to the linux branch). block_on_fd, the run-loop Mode-2 drain,
  and cancel_io_waiter_for each branch per OS; the epoll paths issue
  EPOLL_CTL_DEL on fire and on early-wake (EPOLLONESHOT only disables a
  registration; kqueue EV_ONESHOT auto-removes it).
- first-entry trampoline: the per-OS hand-written global-asm symbol
  becomes a naked sx fn fib_tramp (mov x0,x19; br x20) plus
  register-indirect dispatch (spawn presets regs[1] == x20 ==
  &fib_dispatch), dropping the per-OS .global symbol entirely.

Fixes issue 0193 Bug A: the trampoline redesign bus-errored on the
go/wait/sleep capstone (1817) until `export "fib_dispatch"` was restored.
Without the export, fib_dispatch reverts to sx's internal ABI (x0 =
implicit context, first arg self shifted to x1) while the trampoline
hands self over in x0 (C-ABI); on first entry the body runs (x1 happens
to alias self) but the closure then loads regs[1] == &fib_dispatch as its
first capture and re-invokes fib_dispatch forever -> stack overflow ->
bus error. The export pins fib_dispatch to the C-ABI (self in x0),
matching the trampoline. Root cause found via lldb on an AOT build;
confirmed against the compiler source.

Bug B (a top-level asm block wrapped in inline-if is dropped during the
comptime-conditional flatten) is carved out to issue 0194 (OPEN) -- no
live trigger remains, since the naked-fn trampoline sidesteps it.

1811/1814/1816/1817 run byte-identical on the aarch64-macOS host and in
an aarch64-linux container; full suite green (817/0).

Documents the fiber runtime in readme.md.
@@ -4,7 +4,32 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step
 per the cadence rule). New corpus category: `18xx` concurrency.
 
 ## Last completed step
-**B1 follow-up — `Scheduler.deinit` (close the bounded leaks).** Post-B1 non-blocking cleanup: a
+**B1.6 — aarch64-LINUX port of the M:1 fiber runtime (sched.sx).** `library/modules/std/sched.sx`
+now runs end-to-end on aarch64-linux as well as aarch64-macOS, validated **byte-identical** on both
+via Apple `container` (static ELF, no emulation). The per-OS bits are comptime-branched:
+- `MAP_AP` (mmap MAP_ANON flag) — `inline if OS == { case .linux: 0x22 case .macos: 0x1002 }`,
+  exhaustive on the supported OSes (no default → a new target fails loud on use).
+- The fd-readiness backend — kqueue on darwin, **epoll on linux**. The `epoll` import is scoped to
+  the linux branch (`inline if OS == .linux { ep :: #import "modules/std/net/epoll.sx" }`) so darwin
+  never pulls epoll types into the concurrency examples (the std-barrel-drift rule). `block_on_fd`, the
+  run-loop Mode-2 drain, and `cancel_io_waiter_for` each branch kqueue/epoll; epoll additionally
+  `EPOLL_CTL_DEL`s on fire + on early-wake (EPOLLONESHOT only DISABLES, kqueue EV_ONESHOT auto-removes).
+- The first-entry trampoline was redesigned from a per-OS hand-written global-asm symbol to a **naked
+  sx fn** `fib_tramp` (`mov x0, x19; br x20`) + register-indirect dispatch (spawn presets
+  `regs[1] == x20 == &fib_dispatch`), so no per-OS `.global _fib_tramp`/`fib_tramp` symbol literal is
+  needed. This sidesteps a compiler bug (wrapped top-level `asm` dropped — now **issue 0194**, OPEN).
+
+**Bug fixed en route (issue 0193 Bug A):** the tramp redesign initially bus-errored on the 1817
+go/wait/sleep capstone (both OSes) because the WIP had dropped `export "fib_dispatch"`. Without the
+export `fib_dispatch` uses sx's internal ABI (x0 = implicit `context`, `self` shifted to x1), but the
+trampoline hands `self` in x0 (C-ABI) → on first entry the body runs (x1 happens to alias `self`) but
+the closure then loads `regs[1] == &fib_dispatch` as its first capture and recurses forever → stack
+overflow. **Fix: restore `export "fib_dispatch"`** (pins it to C-ABI, `self` in x0). Root cause found
+via lldb on an AOT macOS build; confirmed by an adversarial source review (`src/ir/lower/decl.zig`).
+The 1817 capstone in the suite guards the fix. Suite GREEN **817/0**; 1811/1814/1816/1817 byte-identical
+macOS host ↔ aarch64-linux container.
+
+### Earlier — B1 follow-up — `Scheduler.deinit` (close the bounded leaks). Post-B1 non-blocking cleanup: a
 terminal `deinit` on `library/modules/std/sched.sx`'s `Scheduler` releases the resources B1 documented
 as leaked. Frees, in order: (1) any fibers still enqueued ready (leak-safety net for `spawn`/`go`
 without `run()` — `munmap` stack + free struct; a suspended off-queue fiber is unreachable, but a clean
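
Review note on the `EPOLLONESHOT` bullet in this hunk. The disable-versus-remove difference it describes can be observed from user space. What follows is a minimal Python sketch, not code from `sched.sx`: it assumes a Linux host and uses the standard library's `select.epoll` wrapper, and `oneshot_demo` is a hypothetical name chosen for illustration.

```python
import errno
import os
import select


def oneshot_demo():
    """Show that EPOLLONESHOT disables, but does not remove, a registration.

    Returns (events on first poll, events on second poll,
             errno from re-adding the fd, events after DEL + ADD).
    """
    r, w = os.pipe()
    ep = select.epoll()  # Linux only
    try:
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        os.write(w, b"x")

        first = ep.poll(0)   # the one-shot registration fires once
        second = ep.poll(0)  # now disabled, even though the byte is still unread

        # The fd is still in the interest list, so a second ADD is rejected.
        try:
            ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
            readd_errno = 0
        except OSError as exc:
            readd_errno = exc.errno

        # After an explicit delete (EPOLL_CTL_DEL) a plain ADD works again.
        ep.unregister(r)
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        third = ep.poll(0)

        return len(first), len(second), readd_errno, len(third)
    finally:
        ep.close()
        os.close(r)
        os.close(w)
```

On Linux, `oneshot_demo()` returns `(1, 0, errno.EEXIST, 1)`. The `EEXIST` on the second add is the user-space analogue of the re-arm problem the bullet mentions, and it disappears once the registration is deleted first.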
@@ -401,12 +426,12 @@ env remains unfreeable (language limitation). Locked by `18xx` 1800–1820 (nake
 blocking async, the switch + §10.7 stress gate + guarded stacks + Win64 sibling, scheduler round-robin,
 suspend/wake, async go/wait/cancel, sim-timer ordering, timer early-wake eviction, kqueue pipe I/O, the
 **1817 end-to-end capstone**, sleep-negative/double-wait guards, and **1820 scheduler-deinit**). Suite
-GREEN **759/0**, committed.
+GREEN **817/0**, committed. **B1.6: now also runs on aarch64-linux** (epoll fd-backend + comptime-branched
+`MAP_AP` + naked-fn trampoline) — validated byte-identical to macOS in an Apple `container`.
 
-Future work (none blocking B1): a **linux epoll twin** of `block_on_fd` (mirror via `std/net/epoll`;
-OS-neutral facade `std.event`) — B1.4c wired macOS kqueue only; routing the suspending async through
-the erased `context.io` (forces sched.sx into every std consumer + duplicates the `_fib_tramp` global
-asm — deferred to the M:N model, where the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/
+Future work (none blocking B1): routing the suspending async through
+the erased `context.io` (forces sched.sx into every std consumer — deferred to the M:N model, where
+the `Io` protocol's `spawn_raw`/`suspend_raw`/`ready`/
 `arm_timer`/`poll` hooks take over); `Future(void)`/`timeout` (issue 0150); freeing the heap-Task /
 closure-env / kq-fd (a Scheduler `deinit` + closure-env-ownership affordance). **Next carve: Stream
 B2** (channels / structured cancel / async stdlib) — see PLAN-CHANNELS.md when started.
@@ -684,6 +709,19 @@ incomplete); a dedicated effort; lambda workers are the idiom meanwhile.
 trusted. `18xx` asserts program-emitted ordering contracts, not raw interleaving.
 
 ## Log
+- **B1.6 — aarch64-linux port of sched.sx.** Comptime-branched the per-OS bits: `MAP_AP` (linux
+  `0x22` / macOS `0x1002`), the fd-readiness backend (epoll on linux, kqueue on darwin — epoll import
+  scoped to the linux branch; `block_on_fd` / run-loop Mode-2 / `cancel_io_waiter_for` each branch,
+  epoll `EPOLL_CTL_DEL`s on fire + early-wake), and the first-entry trampoline (per-OS global-asm
+  symbol → naked sx fn `fib_tramp` + register-indirect `br x20` to `&fib_dispatch` preset in
+  `regs[1]`). **Fixed issue 0193 Bug A:** the tramp redesign bus-errored on 1817 (both OSes) until
+  `export "fib_dispatch"` was restored — without it the fn uses sx's internal ABI (x0 = implicit
+  `context`, `self` → x1) while the trampoline supplies `self` in x0, so the closure loads
+  `regs[1] == &fib_dispatch` as its first capture and recurses forever → stack-overflow bus error.
+  Root cause found via lldb (AOT macOS build) + an adversarial source review. **Bug B** (wrapped
+  top-level `asm` dropped) carved to **issue 0194** (OPEN; no live trigger — the naked-fn tramp
+  sidesteps it). Validated byte-identical on aarch64-macOS host AND aarch64-linux Apple `container`
+  for 1811/1814/1816/1817; full suite GREEN **817/0**.
 - **B1 follow-up — `Scheduler.deinit`.** Closes the bounded leaks B1 documented. Added a `task_allocs:
   List(*void)` field (appended in `go` so the scheduler can reach its generic `Task($R)`s) + a canonical
   `close` extern, then a terminal idempotent `deinit`: reap leftover ready fibers (`munmap` + free) →
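
Review note on the Bug A description in the B1.6 log entry. The register mismatch is easier to follow with concrete values. The following is a toy Python model, not compiler or runtime code: the addresses are made up, and `trampoline`, `first_arg`, `save_area` and `word_at` are hypothetical helpers. It only encodes what the entry states, namely that the C ABI passes the first argument in x0, that the internal convention reserves x0 for the implicit `context` and shifts `self` to x1, and that `spawn` presets `regs[0]` (x19) and `regs[1]` (x20) in a 13-slot save area at the start of the fiber.

```python
import struct

SAVE_SLOTS = 13  # x19..x28, fp, lr, sp
FIBER, DISPATCH, STALE = 0x1000, 0x2000, 0x3000  # made-up addresses


def trampoline(regs):
    """Model of `mov x0, x19; br x20`: returns (branch target, registers)."""
    out = dict(regs)
    out["x0"] = out["x19"]
    return out["x20"], out


def first_arg(convention, regs):
    """Register a callee reads its first declared argument from."""
    if convention == "c":         # exported fn: arguments start at x0
        return regs["x0"]
    if convention == "internal":  # x0 = implicit context, arguments start at x1
        return regs["x1"]
    raise ValueError(convention)


def save_area(fiber, dispatch):
    """Pack the save area as spawn presets it: regs[0] = x19, regs[1] = x20."""
    slots = [0] * SAVE_SLOTS
    slots[0], slots[1] = fiber, dispatch
    return struct.pack("<13Q", *slots)


def word_at(buf, offset):
    """Read one little-endian 64-bit word at a byte offset."""
    return struct.unpack_from("<Q", buf, offset)[0]


target, regs = trampoline({"x0": 0, "x1": STALE, "x19": FIBER, "x20": DISPATCH})
```

With the export in place the callee follows the `"c"` rule and `first_arg("c", regs)` is `FIBER`. Without it, `first_arg("internal", regs)` is whatever was left in x1. `word_at(save_area(FIBER, DISPATCH), 8)` is `DISPATCH`, which is why a load from `[Fiber+8]` picks up `&fib_dispatch`.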
@@ -1,8 +1,35 @@
 # Issue 0193 — linux fiber-runtime port (sched.sx) + a wrapped top-level `asm` drop
 
-Status: **OPEN.** Two intertwined items uncovered while porting `library/modules/std/sched.sx`
-(the M:1 fiber runtime) to aarch64-linux. The WIP sched.sx port is preserved in
-`git stash` (`stash@{0}`, "WIP on fix/0192-qualified-import-const-comptime") — pop it to resume.
+> **RESOLVED — port landed on aarch64-linux.**
+>
+> **Bug A (register-indirect trampoline bus-errors on 1817): FIXED.** Root cause found via lldb on an
+> AOT macOS build (the bug reproduced on macOS too, so no container needed): the WIP port had dropped
+> `export "fib_dispatch"` from `fib_dispatch`. Without the export the fn reverts to sx's INTERNAL
+> calling convention, which reserves x0 for the implicit `context` pointer and shifts the first real
+> arg `self` to x1 — but the trampoline (`mov x0, x19; br x20`) hands the fiber over in x0, C-ABI
+> style. On first entry x1 coincidentally aliases `&fiber.ctx == self` (left there by the scheduler's
+> prior `swap_context(from, to)`, x1 = to), so the body runs once; but inside it the closure loads
+> `[Fiber+8] == ctx.regs[1] == &fib_dispatch` as its "first capture" and re-invokes `fib_dispatch`
+> forever → stack overflow → bus error. **Fix:** restore `export "fib_dispatch"` so the fn keeps the
+> C-ABI (`self` in x0), matching what the trampoline supplies — a one-line library change, no compiler
+> change. The register-indirect naked-fn trampoline design is kept (it sidesteps Bug B's hand-written
+> per-OS global-asm symbol). Adversarially reviewed against the compiler source (`src/ir/lower/decl.zig`
+> `funcWantsImplicitCtx`/`wants_ctx`/`CallingConvention.c`); root cause + fix confirmed CORRECT.
+>
+> **Validation:** 1811 / 1814 / 1816 / 1817 (the go/wait/sleep capstone) all run **byte-identical** on
+> the aarch64-macOS host AND in an aarch64-linux Apple `container` (`sum: 123`, completion order
+> `2@10 3@20 1@30`, etc.). Full `zig build test` macOS suite GREEN (817/0).
+>
+> **Bug B (wrapped top-level `asm` dropped): carved out to `issues/0194-wrapped-toplevel-asm-dropped.md`
+> as an OPEN compiler bug.** It is no longer triggered anywhere in the tree (the port no longer uses a
+> wrapped global-asm block), so it does not block anything — but it is a real defect and stays filed.
+>
+> Original writeup below for history.
+
+---
+
+Status: **(historical — see RESOLVED banner above).** Two intertwined items uncovered while porting
+`library/modules/std/sched.sx` (the M:1 fiber runtime) to aarch64-linux.
 
 The epoll *bindings* + `std.event.Loop` epoll backend are already committed (`cc137002`) and
 **runtime-validated on real Linux** via Apple `container` (see the event.sx VALIDATION note / the
issues/0194-wrapped-toplevel-asm-dropped.md (new file, 99 lines)
@@ -0,0 +1,99 @@
+# Issue 0194 — a top-level global `asm` block wrapped in `inline if` / `case` is DROPPED
+
+Status: **OPEN.** Carved out of issue 0193 (the linux fiber-runtime port). The port itself is
+RESOLVED — it sidesteps this bug entirely by using a naked-sx-fn trampoline (`fib_tramp`) plus a
+register-indirect `br x20` instead of a hand-written global-asm symbol, so there is **no live trigger
+for this bug in the tree today.** It is filed standalone so the compiler defect is not lost.
+
+## Symptom
+
+A top-level global `asm { … }` block that defines a symbol (e.g. `.global _foo` / `_foo: …`) is
+**not emitted** when it is wrapped in a comptime `inline if OS == { case … }` (or
+`inline if OS == .linux { asm } else { asm }`). `nm main.o` shows the symbol as `U` (undefined) and
+the link fails on both platforms. A PLAIN, unwrapped top-level `asm { … }` emits fine.
+
+- **Observed:** symbol undefined, link error.
+- **Expected:** the `asm` block in the taken comptime arm emits its template into the module's global
+  asm exactly as an unwrapped block would (the comptime-conditional pre-pass already surfaces the
+  taken arm's *other* top-level decls — fns, consts, imports — correctly; only the `asm_global` node
+  is lost).
+
+## Reproduction
+
+**Not yet reproducible in isolation.** During the 0193 port, minimal/medium repros ALL emitted +
+linked correctly: a top-level `asm` in a single `case`; two `case` blocks; a `case` asm in an
+imported module; a naked fn + `case` asm with `bl` to an exported fn; a one-sided
+`inline if .linux { #import }` before the asm. **Only the full `library/modules/std/sched.sx`
+dropped it** — so the trigger is an interaction with something else in that module, not the wrapped
+`asm` alone.
+
+The exact form that triggered it (now replaced on the branch, recoverable from history): the original
+global trampoline
+
+```sx
+asm {
+    #string T
+.global _fib_tramp
+_fib_tramp:
+    mov x0, x19
+    bl _fib_dispatch
+    brk #0
+T,
+};
+fib_tramp :: () extern;
+```
+
+wrapped as
+
+```sx
+inline if OS == {
+    case .linux: asm { #string T
+fib_tramp:
+    mov x0, x19
+    bl fib_dispatch
+    br x30
+T, };
+    case .macos: asm { #string T
+.global _fib_tramp
+_fib_tramp:
+    mov x0, x19
+    bl _fib_dispatch
+    brk #0
+T, };
+}
+```
+
+dropped the asm in BOTH arms (whichever was taken). See `issues/0193-linux-fiber-port.patch` for the
+full module context that triggers it, and the 0193 writeup for the larger investigation history.
+
+## Investigation prompt (ready to paste)
+
+> A top-level global `asm` block defining a symbol is dropped when wrapped in a comptime
+> `inline if OS == { case … }` — but only inside the full `library/modules/std/sched.sx`; it can't be
+> reproduced in isolation. Find where the surfaced `asm_global` node is lost between the
+> comptime-conditional flatten and IR lowering.
+>
+> Key files:
+> - `src/imports.zig` — `flattenComptimeConditionals` (line ~38) + `appendBranchDecls` (line ~72): the
+>   pre-pass that surfaces a taken comptime arm's top-level decls. It *appears* correct — it appends
+>   every node of the taken branch's block, `asm_global` included — so confirm the flattened slice
+>   actually carries the `asm_global` node (dump `flat_decls` at `src/imports.zig:932`).
+> - `src/ir/lower/decl.zig` — `lowerMainAndComptime` (line ~1494), whose `.asm_global` arm (line ~1503)
+>   appends the verbatim template to `self.module.global_asm`. **Prime suspect:** does the lowering
+>   entry point feed `lowerMainAndComptime` the *flattened* decl list, or a pre-flatten `root.decls`
+>   that never contains the surfaced (formerly-nested) `asm_global`? If the asm-emission pass walks a
+>   different decl list than the one flattening wrote to, a surfaced `asm_global` is silently skipped.
+> - `src/ir/emit_llvm.zig:384` — where `module.global_asm` is concatenated into the LLVM module. If the
+>   node never reached `global_asm`, it never emits.
+>
+> Steps: (1) build sched.sx's wrapped-asm variant (recover from `issues/0193-linux-fiber-port.patch`
+> or git history of branch `fix/0192-qualified-import-const-comptime`), (2) instrument
+> `flattenComptimeConditionals` to log whether the `asm_global` node survives into `flat_decls`,
+> (3) instrument `lowerMainAndComptime` to log whether it ever *sees* an `asm_global`, (4) bisect what
+> else in sched.sx must be present for the drop to occur (the isolation repros lacked it).
+> Verification: `nm` the object shows the wrapped-asm symbol DEFINED (not `U`); the wrapped form links
+> and runs identically to a plain unwrapped `asm`.
+>
+> **Verify it isn't a syntax issue first:** it reproduces with both the `case` and `if/else` forms,
+> and plain unwrapped asm emits fine — so the wrapping, not the asm itself, is the trigger. That points
+> to the flatten/lowering interaction, not user error.
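
Review note on the verification step in the issue text (`nm` must show the symbol DEFINED, not `U`). That check can be scripted. The following is a minimal Python sketch under stated assumptions: it parses the default text format printed by `nm` (an optional address, a one-letter type, the symbol name), the two sample listings are invented for illustration and are not captured output, and `symbol_type` / `is_defined` are hypothetical helper names.

```python
def symbol_type(nm_output, name):
    """Return the nm type letter for `name`, or None if the symbol is absent."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined:   "<address> <type> <name>"   Undefined: "<type> <name>"
        if len(fields) >= 2 and fields[-1] == name:
            return fields[-2]
    return None


def is_defined(nm_output, name):
    """True when the object itself provides `name` (type letter is not U)."""
    return symbol_type(nm_output, name) not in (None, "U")


# Illustrative listings only: the shape of the symptom versus the expected result.
SAMPLE_DROPPED = """\
                 U _fib_tramp
0000000000000000 T _main
"""
SAMPLE_EMITTED = """\
0000000000000040 T _fib_tramp
0000000000000000 T _main
"""
```

In a real check the text would come from running `nm` on the object file. `is_defined(SAMPLE_DROPPED, "_fib_tramp")` is `False` and `is_defined(SAMPLE_EMITTED, "_fib_tramp")` is `True`.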
@@ -13,19 +13,28 @@
 // - `swap_context` (aarch64 `abi(.naked)`, 13-slot save area: x19..x28, fp,
 // lr, sp) saves the callee-saved registers + SP into `*from` and loads them
 // from `*to`, then `ret`s onto `to`'s stack.
-// - the `_fib_tramp` global-asm first-entry trampoline: x19 holds the
-// bootstrapped `*Fiber`; it moves it to x0 and `bl`s the exported generic
-// dispatch `fib_dispatch`, which calls the body then switches back to the
-// scheduler.
+// - the `fib_tramp` first-entry trampoline (a naked sx fn): x19 holds the
+// bootstrapped `*Fiber` and x20 = `&fib_dispatch`; it moves the fiber to x0
+// and `br`s through x20 to the C-ABI `fib_dispatch`, which calls the body
+// then switches back to the scheduler.
 // - guarded `mmap` stacks: `[GUARD | usable]`, low GUARD page `mprotect`'d
 // PROT_NONE, 16-aligned top returned as the bootstrapped SP.
 //
-// aarch64-macOS-pinned: the `swap_context` asm + the 13-slot save area are
-// per-arch; the `mmap` flag constants (MAP_ANON = 0x1000) and the 16 KB guard
-// page are Apple-specific. Runs end-to-end on a matching host, ir-only on a
-// mismatch.
+// aarch64-pinned (macOS + linux): the `swap_context` asm + the 13-slot save
+// area are per-arch. The per-OS bits are branched at comptime — `mmap`'s
+// MAP_ANON flag (`MAP_AP`) and the fd-readiness backend (kqueue on darwin,
+// epoll on linux). Runs end-to-end on a matching aarch64 host, ir-only on an
+// arch mismatch.
 #import "modules/std.sx";
 kqb :: #import "modules/std/net/kqueue.sx";
+// The fd-readiness backend is per-OS: kqueue (kqb, above) on darwin, epoll on
+// linux. The epoll import is scoped to the linux branch so darwin never pulls
+// epoll's types into the concurrency examples' type tables (the same
+// std-barrel-drift rule std.event.Loop follows); `block_on_fd` / the run loop
+// reference `ep` only inside their own `inline if OS == .linux` arms.
+inline if OS == .linux {
+    ep :: #import "modules/std/net/epoll.sx";
+}
 
 // --- libc mmap stack primitives -------------------------------------------
 
@@ -40,7 +49,14 @@ abort :: () -> noreturn extern libc "abort";
 
 PROT_NONE :: 0;
 PROT_RW :: 3; // PROT_READ | PROT_WRITE
-MAP_AP :: 0x1002; // macOS MAP_PRIVATE (0x2) | MAP_ANON (0x1000)
+// Exhaustive on the SUPPORTED OSes (linux/macOS), no default case: an
+// unsupported target matches no case → MAP_AP undefined → a loud compile error
+// on use rather than a silent wrong flag. (The fiber runtime is aarch64-only
+// anyway — the swap_context asm — so only these two platforms are wired.)
+inline if OS == {
+    case .linux: MAP_AP :: 0x22; // linux MAP_PRIVATE (0x2) | MAP_ANON (0x20)
+    case .macos: MAP_AP :: 0x1002; // macOS MAP_PRIVATE (0x2) | MAP_ANON (0x1000)
+}
 GUARD :: 16384; // one 16 KB page (aarch64-macOS)
 STACK :: 131072; // 128 KB usable per fiber
 
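
Review note on the two `MAP_AP` values in the hunk above. They can be cross-checked on a given host without the sx toolchain. The following is a minimal Python sketch that assumes a Unix host: it reads the flag constants from the standard library's `mmap` module and makes one private anonymous mapping with them. `QUOTED`, `host_map_ap` and `map_scratch` are hypothetical names, and the 128 KB size only mirrors the `STACK` constant; no guard page is set up here.

```python
import mmap
import sys

# Values quoted in the hunk, keyed by sys.platform.
QUOTED = {"linux": 0x22, "darwin": 0x1002}


def host_map_ap():
    """Return MAP_PRIVATE | MAP_ANON as defined on the running host."""
    return mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS


def map_scratch(size=128 * 1024):
    """Create a private anonymous mapping with those flags and touch it."""
    region = mmap.mmap(-1, size, flags=host_map_ap(),
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
    try:
        region[0] = 0x5A   # first byte is writable
        return region[0]   # and reads back
    finally:
        region.close()
```

On the common Linux architectures (x86-64 and aarch64 among them) `host_map_ap()` is `0x22`, matching `QUOTED["linux"]`; a few other Linux architectures number `MAP_ANONYMOUS` differently, which is one reason the constant is branched per target instead of shared.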
@@ -172,10 +188,11 @@ Scheduler :: struct {
         self.n_spawned = self.n_spawned + 1;
 
         top := boot_stack(f, STACK);
-        f.ctx.regs[0] = xx f; // x19 = self
-        f.ctx.regs[10] = 0; // fp
-        f.ctx.regs[11] = xx fib_tramp; // lr → trampoline
-        f.ctx.regs[12] = top; // sp
+        f.ctx.regs[0] = xx f; // x19 = self (→ x0 in the tramp)
+        f.ctx.regs[1] = xx fib_dispatch; // x20 = dispatch entry (tramp `br`s to it)
+        f.ctx.regs[10] = 0; // fp
+        f.ctx.regs[11] = xx fib_tramp; // lr → trampoline
+        f.ctx.regs[12] = top; // sp
 
         f.state = .ready;
         enqueue(self, f);
@@ -239,12 +256,13 @@ Scheduler :: struct {
         // but was woken by another path (a manual wake, a Task completion), its
         // `IoWaiter` would otherwise survive pointing at a fiber that runs to
         // completion and is reaped (stack munmap'd + Fiber freed). A later
-        // kqueue drain matching that stale record would `wake` freed memory.
-        // Evict it here. NOTE: we do NOT EV_DELETE the kqueue registration — it
-        // is EV_ONESHOT, so a never-fired registration simply lingers in the
-        // kernel queue until the fd is readable, at which point the drain finds
-        // no matching waiter and ignores it (see `run`). The fd is the example's
-        // to close; closing it auto-removes any pending registration.
+        // readiness drain matching that stale record would `wake` freed memory.
+        // Evict it here. The kernel-side registration is handled per-OS inside
+        // `cancel_io_waiter_for`: on darwin the EV_ONESHOT kqueue registration is
+        // left to linger (a never-fired one-shot the drain ignores; the fd's
+        // owner closes it, auto-removing it), but on linux the EPOLLONESHOT
+        // registration stays enabled and must be `EPOLL_CTL_DEL`'d (else it could
+        // fire later with no waiter and would block a re-arm of the same fd).
         cancel_io_waiter_for(self, f);
         self.n_suspended = self.n_suspended - 1;
         f.state = .ready;
@@ -333,20 +351,38 @@ Scheduler :: struct {
             }
             j = j + 1;
         }
-        // Lazily open the kqueue fd the first time fd-blocking is used.
+        // Lazily open the event-queue fd the first time fd-blocking is used:
+        // kqueue on darwin, epoll on linux. `self.kq` holds whichever — it is
+        // just "the readiness queue fd".
         if self.kq < 0 {
-            self.kq = kqb.kqueue();
+            inline if OS == {
+                case .linux: self.kq = ep.ep_create();
+                case .macos: self.kq = kqb.kqueue();
+            }
             if self.kq < 0 {
-                print("sched: kqueue() failed to open the event queue\n");
+                print("sched: failed to open the event queue\n");
                 abort();
             }
         }
-        // Arm a one-shot read-readiness registration for `fd`. udata is unused
-        // (we match the waiter by fd in the drain), so pass 0.
-        chg := kqb.kev_change(fd, kqb.EVFILT_READ, kqb.EV_ADD | kqb.EV_ENABLE | kqb.EV_ONESHOT, 0);
-        if !kqb.kq_apply(self.kq, chg) {
-            print("sched: kevent() failed to register fd {} for read readiness\n", fd);
-            abort();
+        // Arm a one-shot read-readiness registration for `fd`, matched back by
+        // the run-loop drain (kqueue by ident; epoll stashes the fd in `data`).
+        // darwin EV_ONESHOT auto-removes the registration on fire; epoll's
+        // EPOLLONESHOT only DISABLES it, so the linux paths additionally
+        // EPOLL_CTL_DEL on fire (run) and on early-wake (cancel_io_waiter_for).
+        inline if OS == {
+            case .linux: {
+                if !ep.ep_ctl(self.kq, ep.EPOLL_CTL_ADD, fd, ep.EPOLLIN | ep.EPOLLONESHOT) {
+                    print("sched: epoll_ctl() failed to register fd {} for read readiness\n", fd);
+                    abort();
+                }
+            }
+            case .macos: {
+                chg := kqb.kev_change(fd, kqb.EVFILT_READ, kqb.EV_ADD | kqb.EV_ENABLE | kqb.EV_ONESHOT, 0);
+                if !kqb.kq_apply(self.kq, chg) {
+                    print("sched: kevent() failed to register fd {} for read readiness\n", fd);
+                    abort();
+                }
+            }
         }
         // Record the waiter BEFORE parking — the run loop matches the fired
         // event's ident back to this record. Long-lived-container rule: the
@@ -407,20 +443,42 @@ Scheduler :: struct {
             // kernel reports at least one fd ready, then wake every waiter whose
             // fd fired. (null timeout via -1 → wait forever.)
             if self.io_waiters.len > 0 {
-                evbuf : [MAXEV]kqb.Kevent = ---;
-                n := kqb.kq_wait(self.kq, @evbuf[0], MAXEV, -1);
-                if n < 0 {
-                    print("sched: kevent() wait failed while blocking on fd readiness\n");
-                    abort();
-                }
-                // For each fired event, find the io-waiter whose fd matches its
-                // ident, evict it, and wake its fiber. EV_ONESHOT already removed
-                // the kernel registration, so we only drop the waiter record.
-                i := 0;
-                while i < n {
-                    ready_fd : i32 = xx evbuf[i].ident;
-                    wake_io_waiter_for_fd(self, ready_fd);
-                    i = i + 1;
+                // BLOCK on the readiness queue until ≥1 fd fires (timeout -1 =
+                // forever), then for each fired event match the fd back to its
+                // io-waiter, evict the record, and wake the fiber.
+                inline if OS == {
+                    case .linux: {
+                        evbuf : [MAXEV]ep.EpollEvent = ---;
+                        n := ep.ep_wait(self.kq, .{ ptr = @evbuf[0], len = MAXEV }, MAXEV, -1);
+                        if n < 0 {
+                            print("sched: epoll_wait() failed while blocking on fd readiness\n");
+                            abort();
+                        }
+                        i := 0;
+                        while i < n {
+                            ready_fd := ep.ev_fd(evbuf[i]);
+                            wake_io_waiter_for_fd(self, ready_fd);
+                            // EPOLLONESHOT only DISABLED the registration; remove it
+                            // fully so the fd can be re-armed by a future block_on_fd
+                            // (kqueue's EV_ONESHOT removes it for free).
+                            ep.ep_ctl(self.kq, ep.EPOLL_CTL_DEL, ready_fd, 0);
+                            i = i + 1;
+                        }
+                    }
+                    case .macos: {
+                        evbuf : [MAXEV]kqb.Kevent = ---;
+                        n := kqb.kq_wait(self.kq, @evbuf[0], MAXEV, -1);
+                        if n < 0 {
+                            print("sched: kevent() wait failed while blocking on fd readiness\n");
+                            abort();
+                        }
+                        i := 0;
+                        while i < n {
+                            ready_fd : i32 = xx evbuf[i].ident;
+                            wake_io_waiter_for_fd(self, ready_fd);
+                            i = i + 1;
+                        }
+                    }
                 }
                 continue;
             }
@@ -539,23 +597,48 @@ ASM
     };
 }
 
-// First-entry trampoline: a fiber's bootstrapped LR points here. x19 holds the
-// `*Fiber` (preset in the saved context); move it to x0 and call the generic
-// dispatch.
-asm {
-    #string T
-.global _fib_tramp
-_fib_tramp:
+// First-entry trampoline: a fiber's bootstrapped LR points here, with x19 =
+// `*Fiber` and x20 = `&fib_dispatch` (both preset in the saved context by
+// `spawn`, both callee-saved so `swap_context` restores them on first entry).
+// Move the fiber to x0 and tail-branch to dispatch via the REGISTER (x20) — so
+// there is no hand-written global-asm symbol and nothing here needs per-OS
+// symbol naming (`_fib_tramp` on darwin vs `fib_tramp` on linux) or a `bl` to a
+// named export. As a naked sx fn `fib_tramp`'s own symbol is emitted with the
+// platform-correct name automatically, so `spawn`'s `xx fib_tramp` resolves on
+// every target. This register-indirect bootstrap replaced an OS-conditional
+// global `asm` block (a top-level `asm` wrapped in an `inline if` is dropped in
+// this module's context — see issues/0193) and sidesteps the hand-written
+// symbol entirely, which is cleaner regardless.
+fib_tramp :: () abi(.naked) {
+    asm volatile {
+        #string T
     mov x0, x19
-    bl _fib_dispatch
-    brk #0
-T,
-};
-fib_tramp :: () extern;
+    br x20
+T
+    };
+}
 
-// The ONE place that runs a fiber body. Reached only from `_fib_tramp` on first
+// The ONE place that runs a fiber body. Reached only from `fib_tramp` on first
 // entry, on the fiber's own fresh stack. Runs the body, marks the fiber done,
 // and switches back to the scheduler — never returns past the final switch.
+//
+// `export "fib_dispatch"` is MANDATORY, not decorative: it pins this fn to the
+// **C ABI** (first real arg `self` in x0). The trampoline hands the fiber over
+// in x0 (`mov x0, x19; br x20`), which is exactly C-ABI. Drop the export and the
+// fn reverts to sx's INTERNAL calling convention, which reserves x0 for the
+// implicit `context` pointer and shifts `self` to x1 — so the trampoline's x0
+// would land in the context slot and `self` would be read from a garbage x1. On
+// first entry that garbage happens to alias `&fiber.ctx == self` (left in x1 by
+// the scheduler's prior `swap_context`), so the body runs once; but inside it
+// the closure loads `[Fiber+8] == regs[1] == &fib_dispatch` as its "first
+// capture" and re-invokes `fib_dispatch` forever → stack overflow → bus error
+// (issue 0193 Bug A, observed only on the go/wait/sleep capstone 1817).
+//
+// One consequence of the C-ABI boundary: an exported fn has no implicit
+// `context` param, so `self.body()` runs under the static `__sx_default_context`
|
||||||
|
// — NOT whatever `push Context { allocator = ... }` was in force at the
|
||||||
|
// `run()` call site. Fiber bodies do not inherit a caller-scoped allocator; a
|
||||||
|
// body that needs one must capture it explicitly (the long-lived-container rule).
|
||||||
fib_dispatch :: (self: *Fiber) export "fib_dispatch" {
|
fib_dispatch :: (self: *Fiber) export "fib_dispatch" {
|
||||||
self.body();
|
self.body();
|
||||||
self.state = .done;
|
self.state = .done;
|
||||||
@@ -687,7 +770,19 @@ cancel_io_waiter_for :: (self: *Scheduler, f: *Fiber) {
|
|||||||
i := 0;
|
i := 0;
|
||||||
while i < self.io_waiters.len {
|
while i < self.io_waiters.len {
|
||||||
if self.io_waiters.items[i].fiber == f {
|
if self.io_waiters.items[i].fiber == f {
|
||||||
remove_io_waiter(self, i);
|
// Early-wake: the fiber is re-readied by another path while its fd
|
||||||
|
// registration is still armed. kqueue's EV_ONESHOT lingers
|
||||||
|
// harmlessly (a never-fired one-shot the drain ignores); epoll's
|
||||||
|
// EPOLLONESHOT registration stays enabled — it could fire later with
|
||||||
|
// no waiter, and blocks a re-arm of the same fd — so remove it.
|
||||||
|
inline if OS == {
|
||||||
|
case .linux: {
|
||||||
|
fd := self.io_waiters.items[i].fd;
|
||||||
|
remove_io_waiter(self, i);
|
||||||
|
if self.kq >= 0 { ep.ep_ctl(self.kq, ep.EPOLL_CTL_DEL, fd, 0); }
|
||||||
|
}
|
||||||
|
case .macos: remove_io_waiter(self, i);
|
||||||
|
}
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
i = i + 1;
|
i = i + 1;
|
||||||
|
|||||||
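The on-fire and early-wake comments in the hunks above rest on one epoll property: `EPOLLONESHOT` disables a registration once it fires but leaves it in the interest list, so the same fd cannot be added again until it is explicitly deleted. Below is a minimal sketch of that behaviour, using Python's `select.epoll` binding (Linux only) rather than the module's `ep.ep_ctl` wrapper. The pipe, the function name `oneshot_demo`, and the assumption that a later `block_on_fd` re-arms with an add-style call are illustrative and not taken from `sched.sx`.

```python
import os
import select


def oneshot_demo():
    """Show that EPOLLONESHOT disables, but does not remove, a registration.

    Hypothetical helper for illustration; requires Linux (select.epoll).
    """
    r, w = os.pipe()
    ep = select.epoll()
    results = {}
    try:
        # Arm a one-shot read registration, then make the pipe readable.
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        os.write(w, b"x")
        results["first_poll"] = [fd for fd, _ in ep.poll(0)] == [r]

        # The registration has fired and is now disabled: no further events
        # are reported, even though the pipe still holds unread data.
        results["second_poll_empty"] = ep.poll(0) == []

        # It still occupies the fd's slot, so a fresh add fails with EEXIST.
        try:
            ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
            results["readd_blocked"] = False
        except FileExistsError:
            results["readd_blocked"] = True

        # After the equivalent of EPOLL_CTL_DEL the fd can be armed again.
        ep.unregister(r)
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        results["rearmed_poll"] = [fd for fd, _ in ep.poll(0)] == [r]
    finally:
        ep.close()
        os.close(r)
        os.close(w)
    return results


if __name__ == "__main__":
    for step, ok in oneshot_demo().items():
        print(step, ok)
```

kqueue's `EV_ONESHOT` deletes the event after it is delivered, which is why only the linux branches issue the extra delete.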
46
readme.md
@@ -30,6 +30,7 @@ main :: () {
|
|||||||
- Pattern matching on enums, optionals, and type categories
|
- Pattern matching on enums, optionals, and type categories
|
||||||
- C interop via `extern` / `export` and `#import c`
|
- C interop via `extern` / `export` and `#import c`
|
||||||
- Inline assembly as a first-class expression
|
- Inline assembly as a first-class expression
|
||||||
|
- Colorblind async via a pure-sx cooperative fiber runtime (no function coloring)
|
||||||
- Targets: macOS (ARM64, x86_64), Linux (x86_64, ARM64), Windows (x86_64), WebAssembly
|
- Targets: macOS (ARM64, x86_64), Linux (x86_64, ARM64), Windows (x86_64), WebAssembly
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
@@ -511,6 +512,51 @@ fence(.seq_cst); // standalone memory fence
|
|||||||
combinations are compile errors. The same operations run at compile time (`#run`)
|
combinations are compile errors. The same operations run at compile time (`#run`)
|
||||||
under single-threaded semantics.
|
under single-threaded semantics.
|
||||||
|
|
||||||
|
### Async / Concurrency (`modules/std/sched.sx`)
|
||||||
|
|
||||||
|
A pure-sx cooperative fiber runtime — **colorblind async**, with no `async` /
|
||||||
|
`await` keywords and no function coloring. Any function can suspend; a `Scheduler`
|
||||||
|
drives any number of stackful fibers, each on its own guard-paged stack. The
|
||||||
|
high-level API is `go` to spawn a task and `wait` to suspend until it completes:
|
||||||
|
|
||||||
|
```sx
|
||||||
|
#import "modules/std.sx";
|
||||||
|
sched :: #import "modules/std/sched.sx";
|
||||||
|
|
||||||
|
main :: () {
|
||||||
|
s := sched.Scheduler.init();
|
||||||
|
ps := @s; // closures capture by value — capture a pointer to the scheduler
|
||||||
|
|
||||||
|
// The coordinator runs as a fiber so `wait` has a fiber to park.
|
||||||
|
s.spawn(() => {
|
||||||
|
a := ps.go(() -> i64 => { ps.sleep(30); 100 }); // launch async tasks
|
||||||
|
b := ps.go(() -> i64 => { ps.sleep(10); 20 });
|
||||||
|
c := ps.go(() -> i64 => { ps.sleep(20); 3 });
|
||||||
|
|
||||||
|
sum := (a.wait() or 0) + (b.wait() or 0) + (c.wait() or 0); // 123
|
||||||
|
print("sum: {}\n", sum);
|
||||||
|
});
|
||||||
|
|
||||||
|
s.run(); // drive the scheduler until all fibers finish
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Tasks complete in deadline order, not spawn or await order. The runtime offers:
|
||||||
|
|
||||||
|
- **`go(work) -> *Task($R)`** / **`wait() -> R !TaskErr`** / **`cancel()`** — the
|
||||||
|
task layer. `wait` rides the `!` error channel so a cancel surfaces as
|
||||||
|
`error.Canceled`.
|
||||||
|
- **`spawn`**, **`yield_now`**, **`suspend_self`**, **`wake`** — the raw fiber
|
||||||
|
primitives the task layer is built on.
|
||||||
|
- **`sleep(ms)`** / **`now_ms()`** — timer-driven suspension on a virtual clock
|
||||||
|
(deterministic, no real wall time).
|
||||||
|
- **`block_on_fd(fd, want_read)`** — suspend until a file descriptor is ready,
|
||||||
|
backed by kqueue (darwin) or epoll (linux).
|
||||||
|
|
||||||
|
It's an M:1 model (cooperative, no preemption — so no data races between fibers
|
||||||
|
and no atomics needed across them), built on `abi(.naked)` context switching over
|
||||||
|
guarded `mmap` stacks. Currently aarch64-pinned (macOS + Linux).
|
||||||
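The readme's claim that tasks complete in deadline order on a virtual clock can be illustrated with a small model. This is a sketch only: it assumes a min-heap of timers and a clock that jumps straight to the next deadline, which may differ from how `sched.sx` actually stores its sleepers. The name `run_virtual_sleepers` is hypothetical.

```python
import heapq


def run_virtual_sleepers(sleeps):
    """Model sleepers on a virtual clock.

    sleeps: list of (name, ms) pairs, all started at virtual time 0.
    Returns (completion_order, final_virtual_time_ms).
    """
    now_ms = 0
    timers = []
    for seq, (name, ms) in enumerate(sleeps):
        # seq breaks ties so equal deadlines wake in spawn order.
        heapq.heappush(timers, (now_ms + ms, seq, name))

    order = []
    while timers:
        deadline, _, name = heapq.heappop(timers)
        # Jump the clock to the earliest deadline; no real time passes.
        now_ms = max(now_ms, deadline)
        order.append(name)
    return order, now_ms


if __name__ == "__main__":
    # Same sleeps as the readme example: a=30, b=10, c=20.
    order, end = run_virtual_sleepers([("a", 30), ("b", 10), ("c", 20)])
    print(order, end)  # ['b', 'c', 'a'] 30
```

Because the clock only advances when the earliest timer is popped, the run is deterministic and takes no wall time, matching the behaviour the readme describes for `sleep(ms)` / `now_ms()`.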
|
|
||||||
### Command-line interface (`modules/std/cli.sx`)
|
### Command-line interface (`modules/std/cli.sx`)
|
||||||
|
|
||||||
`std.cli` builds command-line front-ends over an explicit logical argv: `os_args`
|
`std.cli` builds command-line front-ends over an explicit logical argv: `os_args`
|
||||||
|
|||||||