fibers B1.3a-1: stackful context switch (naked swap_context + fiber bootstrap)
The first piece of the B1.3 fiber runtime — the stackful context switch, pure sx over abi(.naked). swap_context(from, to) saves the callee-saved registers + SP/LR into *from and loads them from *to, then rets onto to's stack (SP-in != SP-out by design — why it must be .naked). Fibers are bootstrapped by hand: the saved context starts with SP = top of an alloc_bytes stack, LR = a global-asm trampoline (mov x0, x19; bl _fib_body, reaching the sx body via export), and x19 = the *Fiber.

Locked by examples/1807-concurrency-fiber-context-switch.sx (aarch64-pinned):
- 2-fiber ping-pong (A <-> B, 3 rounds each): rounds: 6, and a per-fiber stack canary held live across every suspend survives (canary fails: 0);
- a 64-frame deep recursive chain suspended at the bottom and resumed, verifying every frame's stack-local on the unwind (frames verified: 64, depth fails: 0).

Scope (honest): exercises register/stack preservation INDIRECTLY (compiler-allocated live values + the canary). The EXPLICIT every-callee-saved GP (x19-x28) + FP (d8-d15) sentinel scribble — the full design-section-10.7 gate — is B1.3a-2, still owed. x86_64 sibling + mmap guard-page stacks are B1.3b.

Suite green 733/0. Runs under JIT, ir-only on a non-arm host.
@@ -4,9 +4,24 @@ Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step
per the cadence rule). New corpus category: `18xx` concurrency.

## Last completed step
**B1.2 COMPLETE — the async surface works end-to-end.** All three surface blockers (0151, 0152,
0153) are FIXED + committed; the async examples are landed + green. Suite green **732/0**, master
clean.
**B1.3a-1 — the stackful context switch works (foundational + indirect survival harness).**
Pure sx over `abi(.naked)`: a naked `swap_context(from, to)` saves callee-saved + SP/LR into
`*from` and loads from `*to`; fibers are bootstrapped by hand (SP = top of an `alloc_bytes`
stack, LR = a global-asm trampoline, x19 = `*Fiber`; the trampoline `mov x0, x19; bl _fib_body`).
Locked by `examples/1807-concurrency-fiber-context-switch.sx` (aarch64-pinned, `.build
{"target":"macos"}`, `.ir` captured): a 2-fiber ping-pong (A⇄B, 3 rounds each → `rounds: 6`) with
a per-fiber stack canary surviving every switch (`canary fails: 0`), plus a 64-frame deep
recursive chain suspended at the bottom and resumed, verifying every frame's stack-local on the
unwind (`frames verified: 64` / `depth fails: 0`). Suite green **733/0**, master clean.
- **Honest scope:** this exercises register/stack preservation INDIRECTLY (compiler-allocated
live values + the canary), which catches a broken SP/LR or a dropped callee-saved. It does NOT
yet EXPLICITLY scribble every callee-saved GP (x19-x28) + FP (d8-d15) with sentinels — that
full §10.7 gate is **B1.3a-2** (a dedicated naked scribble/verify routine, see Next step).
- The mechanism (bootstrap + naked switch + resume-mid-stack + `alloc_bytes` stacks) is proven;
the WIP probe lives at `.sx-tmp/fib_full.sx` / `.sx-tmp/fib_probe.sx`.
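
The by-hand bootstrap described above comes down to filling four slots of the 13-slot save area before the first switch. A minimal C sketch of that layout follows. It is not the sx implementation: `boot_ctx` and the `SLOT_*` names are hypothetical, `malloc` stands in for `alloc_bytes`, and the unused slots are zeroed here although the sx example leaves them uninitialized. It only prepares a context and does not perform a switch.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Slot index * 8 is the byte offset used by the stp/str pairs in swap_context:
 * x19..x28 occupy slots 0..9, then fp (#80), lr (#88), sp (#96). */
enum { SLOT_X19 = 0, SLOT_FP = 10, SLOT_LR = 11, SLOT_SP = 12, SLOT_COUNT = 13 };

typedef struct { uint64_t regs[SLOT_COUNT]; } FiberCtx;

_Static_assert(sizeof(FiberCtx) == 13 * 8, "save area is 13 u64 slots");

/* Hypothetical helper: prepare ctx so that the first switch into it "returns"
 * to entry with x19 = self and sp = the aligned top of a fresh stack.
 * Returns the stack base (for a later free), or NULL on allocation failure. */
static void *boot_ctx(FiberCtx *ctx, void *self, void (*entry)(void),
                      size_t stack_bytes) {
    assert(stack_bytes >= 16);
    uint8_t *base = malloc(stack_bytes);
    if (base == NULL) return NULL;

    uintptr_t top = (uintptr_t)base + stack_bytes;
    top &= ~(uintptr_t)15;                 /* round down to a 16-byte boundary */

    for (int i = 0; i < SLOT_COUNT; i++) ctx->regs[i] = 0;
    ctx->regs[SLOT_X19] = (uint64_t)(uintptr_t)self;   /* handed to the trampoline */
    ctx->regs[SLOT_FP]  = 0;                           /* terminates the frame chain */
    ctx->regs[SLOT_LR]  = (uint64_t)(uintptr_t)entry;  /* ret lands here */
    ctx->regs[SLOT_SP]  = (uint64_t)top;
    return base;
}
```

Rounding the top down rather than up keeps the initial SP inside the allocation, which is the same choice the sx `boot` makes with `top - (top % 16)`.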

### Earlier — B1.2 COMPLETE — the async surface works end-to-end
All three surface blockers (0151, 0152, 0153) FIXED + committed; async examples landed + green.
- **0151 fixed** (`362674f`): generic `$T` infers through generic-struct / pointer / UFCS-pack
params. Regression `0214` + `0215`.
- **0152 fixed** (`e5586f6`): `Atomic(bool)` load/store byte-promoted to `i8` in the codegen
@@ -189,11 +204,28 @@ fibers/Io/scheduler code yet. Grounded floor facts:
boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker.

## Next step
**B1.2 is done → start B1.3 (fiber runtime).** The compiler floor (B1.0 `abi(.naked)`, B1.1
per-fiber `context`) + the capability surface (B1.2 Io / `async`/`await`/`cancel`) are all in.
B1.3 builds the actual M:1 fiber scheduler on the `.naked` context-switch substrate — see
`PLAN-FIBERS.md` for the B1.3 step list. The B1.3 switch-stress harness (design §10.7) gates the
context-switch correctness the deterministic Io can't test.
**B1.3a-2 — the EXPLICIT register/FP scribble gate (design §10.7), then B1.3b.** The foundational
switch (B1.3a-1) is in; the rigorous gate is still owed. Sequence:
1. **B1.3a-2 (explicit scribble — the real §10.7 gate):** a naked `scribble_verify(self, peer,
base)` that loads known sentinels into EVERY callee-saved GP (x19-x28) AND FP (d8-d15), saves
the original return addr on the stack, swaps to the peer, and on resume reads every register
back and returns a mismatch count. Mechanism worked out: a naked fn CAN `bl swap_context` and
resume in-place (the swap saves/restores its lr), so push the caller-return on the fiber stack
before the swap and pop it after (sp is part of the saved context, so it round-trips). Add to
`1807` (or a sibling `1808`); expect 0 mismatches. This is the single highest-corruption-risk
asm in the stream — **review adversarially (worker if authorized)** per the plan.
2. **B1.3b:** the x86_64 sibling of `swap_context` (rbx/rbp/r12-r15/rsp save area — different
slot count + regs) + `mmap` stacks **with mandatory guard pages** (`mprotect` the low page
`PROT_NONE`; a fixed stack without a guard silently corrupts neighbors — §8.1.1). Replace the
`alloc_bytes` stack in `1807` with the guarded `mmap` path; add the x86_64 run sibling.
3. Then B1.3 (fiber runtime substrate) is done → **B1.4** (`Io` impls: blocking ✅ →
deterministic-sim KEYSTONE → event-loop) and **B1.5** (M:1 scheduler) build the real scheduler
on top, replacing the hand-bootstrapped ping-pong with `spawn`/`yield`/`resume`.
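
The guarded-stack step in item 2 can be sketched in C as follows. This is an illustration under stated assumptions, not the planned sx code: `GuardedStack`, `guarded_stack_alloc` and `guarded_stack_free` are hypothetical names, the host is assumed to be POSIX with `mmap`/`mprotect`, and the stack is assumed to grow downward so the guard is the lowest page.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    void  *map;        /* start of the mapping; the first page is the guard */
    size_t map_bytes;  /* guard page + usable stack */
    void  *top;        /* initial SP: one past the highest usable byte */
} GuardedStack;

/* Map guard + stack in one region, then revoke all access to the low page so
 * an overflow faults instead of silently writing into a neighbor.
 * Returns 0 on success, -1 on failure. */
static int guarded_stack_alloc(GuardedStack *s, size_t usable_bytes) {
    long page_l = sysconf(_SC_PAGESIZE);
    if (page_l <= 0 || usable_bytes == 0) return -1;
    size_t page = (size_t)page_l;

    size_t usable = (usable_bytes + page - 1) / page * page;  /* round up to pages */
    size_t total = usable + page;

    void *map = mmap(NULL, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) return -1;

    if (mprotect(map, page, PROT_NONE) != 0) {   /* the guard page */
        munmap(map, total);
        return -1;
    }

    s->map = map;
    s->map_bytes = total;
    s->top = (uint8_t *)map + total;  /* page-aligned, hence 16-byte aligned */
    return 0;
}

static void guarded_stack_free(GuardedStack *s) {
    int rc = munmap(s->map, s->map_bytes);
    assert(rc == 0);
    (void)rc;
}
```

Because the mapping is page-aligned and a whole number of pages long, `top` needs no extra alignment step before it is stored as the saved SP.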

**Deferred (do NOT block on these):** issue **0150** (`void` struct field SIGTRAP) — only
`Future(void)`/`timeout`, which are B1.4. The **`::` callable-parameter feature** (named-fn
async workers `async(read_a, conn)`) — WIP at `.sx-tmp/wip-callable-params/patch.diff` (parser
done, inference incomplete); a dedicated effort; lambda workers are the B1.2 idiom meanwhile.

**Deferred (do NOT block on these):** issue **0150** (`void` struct field SIGTRAP) — only
`Future(void)`/`timeout`, which are B1.4. The **`::` callable-parameter feature** (named-fn
@@ -350,3 +382,15 @@ done, inference incomplete); a dedicated effort; lambda workers are the B1.2 idi
42` / `double: 42` / `clock ok`) + **`1806`** (`cancel` → `await` raises `.Canceled` → `or`
default; `ok: 7` / `canceled: -99`). **B1.2 (Io capability + M:1 async surface) is COMPLETE.**
Next: B1.3 (fiber runtime) on the `.naked` context-switch substrate.
- **B1.3a-1 — context switch works.** Implemented the stackful switch in pure sx over
`abi(.naked)`: `swap_context(from, to)` (save callee-saved x19-x28 + fp/lr + sp into `*from`,
load from `*to`, `ret` onto `to`'s stack) + by-hand fiber bootstrap (SP = top of an
`alloc_bytes` stack, LR = a `.global _fib_tramp` global-asm trampoline that does `mov x0, x19;
bl _fib_body`, x19 = `*Fiber`). Proven via a probe (main↔fiber), then locked by
`examples/1807-concurrency-fiber-context-switch.sx` (aarch64-pinned): a 2-fiber ping-pong
(`rounds: 6`, `canary fails: 0` — a per-fiber stack canary survives every switch) + a 64-frame
deep recursive chain suspended at the bottom and resumed (`frames verified: 64` / `depth fails:
0`). The `bl _fib_body` reaches the sx body via `export "fib_body"` (the 1655 asm→sx pattern);
runs under JIT, ir-only on a non-arm host (`.ir` captured — `swap_context` shows `naked noinline
nounwind`). Suite green 733/0. **Honest scope:** indirect register/stack survival only; the
EXPLICIT every-callee-saved + FP scribble (§10.7) is B1.3a-2, still owed. Next: B1.3a-2.
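
For readers without an aarch64 host, the ping-pong protocol in the entry above can be reproduced in C with POSIX `ucontext`. This is only an analogue and an assumption-laden sketch: it uses `swapcontext`/`makecontext` from glibc rather than a hand-written naked switch, `run_ping_pong` and `fiber_body` are hypothetical names, and the canary is declared `volatile` to force it into a stack slot so the check is not folded away.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

enum { STACK_BYTES = 128 * 1024, ROUNDS = 3 };

static ucontext_t main_ctx, fiber_ctx[2];
static long rounds_done, canary_fails;

/* Each fiber bumps the shared counter, hands off to its peer, and on resume
 * checks that its stack-resident canary survived the switch. */
static void fiber_body(int id) {
    volatile uint64_t canary = 0xCA11AB1E0000ull + (uint64_t)id;
    for (int i = 0; i < ROUNDS; i++) {
        rounds_done++;
        swapcontext(&fiber_ctx[id], &fiber_ctx[1 - id]);   /* suspend mid-stack */
        if (canary != 0xCA11AB1E0000ull + (uint64_t)id) canary_fails++;
    }
    swapcontext(&fiber_ctx[id], &main_ctx);                /* back to the spawner */
}

/* Boot two fibers on heap stacks, enter fiber 0, and return the round count
 * once the first fiber to finish switches back to main. */
static long run_ping_pong(void) {
    void *stacks[2];
    rounds_done = 0;
    canary_fails = 0;
    for (int id = 0; id < 2; id++) {
        stacks[id] = malloc(STACK_BYTES);
        if (stacks[id] == NULL) abort();
        int rc = getcontext(&fiber_ctx[id]);
        assert(rc == 0);
        (void)rc;
        fiber_ctx[id].uc_stack.ss_sp = stacks[id];
        fiber_ctx[id].uc_stack.ss_size = STACK_BYTES;
        fiber_ctx[id].uc_link = &main_ctx;   /* safety net if a body ever returns */
        makecontext(&fiber_ctx[id], (void (*)(void))fiber_body, 1, id);
    }
    swapcontext(&main_ctx, &fiber_ctx[0]);
    /* Fiber 1 is still suspended inside its loop; it is never resumed, so
     * releasing both stacks here is safe. */
    free(stacks[0]);
    free(stacks[1]);
    return rounds_done;
}
```

As in the sx example, control returns to main as soon as fiber 0 finishes its third round, leaving the peer parked at its last hand-off. `swapcontext` also saves and restores the signal mask, which a dedicated naked switch avoids.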
150
examples/1807-concurrency-fiber-context-switch.sx
Normal file
@@ -0,0 +1,150 @@
// Stream B1 (fibers) B1.3a — the stackful context switch, in pure sx over the
// `abi(.naked)` primitive. `swap_context(from, to)` saves the callee-saved
// registers + SP/LR into `*from` and loads them from `*to`, then `ret`s onto
// `to`'s stack (SP-in ≠ SP-out by design — why it must be `.naked`, not `.c`).
// A fiber is bootstrapped by hand: its saved context starts with SP = the top
// of a fresh `alloc_bytes` stack, LR = a global-asm trampoline, and x19 = the
// `*Fiber` (the trampoline moves it to x0 and `bl`s the exported entry).
//
// This is the foundational switch + an indirect survival harness:
//   - a 2-fiber ping-pong (A ⇄ B, 3 rounds each) — resume-mid-stack across
//     switches; a per-fiber stack canary held live across every suspend must
//     survive (the compiler allocates it to a callee-saved reg / stack slot,
//     so a clobbered switch would corrupt it);
//   - a deep recursive chain (64 frames) suspended at the bottom and resumed —
//     every frame's stack-local is verified on the unwind.
//
// What it does NOT yet do (B1.3a-2): EXPLICITLY scribble every callee-saved GP
// (x19-x28) + FP (d8-d15) register with sentinels and check them in asm — the
// full §10.7 gate. This harness exercises preservation indirectly (via
// compiler-allocated live values), which catches a broken SP/LR or a dropped
// callee-saved, but not a single specific register the allocator didn't use.
//
// aarch64-pinned (the asm + the 13-slot save area are per-arch); runs
// end-to-end here, ir-only on a mismatch. The x86_64 sibling + `mmap` guard-
// page stacks are B1.3b.
#import "modules/std.sx";

// Saved context: x19..x28 (10), x29/fp, x30/lr, sp — 13 u64 slots.
FiberCtx :: struct { regs: [13]u64; }

Fiber :: struct {
    ctx: FiberCtx;
    peer: *FiberCtx;      // ping-pong hand-off target
    finish: *FiberCtx;    // where to switch when the body ends (the spawner)
    count: *i64;          // shared round counter
    verified: *i64;       // shared count of verified recursion frames
    rounds: i64;
    id: i64;
    mode: i64;            // 0 = ping-pong, 1 = deep recursion
    canary_fail: i64;
    depth_fail: i64;
}

// The switch: x0 = from, x1 = to (read straight from the ABI registers — a
// naked fn has no frame, so its params are never spilled).
swap_context :: (from: *FiberCtx, to: *FiberCtx) abi(.naked) {
    asm volatile {
        #string ASM
        stp x19, x20, [x0, #0]
        stp x21, x22, [x0, #16]
        stp x23, x24, [x0, #32]
        stp x25, x26, [x0, #48]
        stp x27, x28, [x0, #64]
        stp x29, x30, [x0, #80]
        mov x9, sp
        str x9, [x0, #96]
        ldp x19, x20, [x1, #0]
        ldp x21, x22, [x1, #16]
        ldp x23, x24, [x1, #32]
        ldp x25, x26, [x1, #48]
        ldp x27, x28, [x1, #64]
        ldp x29, x30, [x1, #80]
        ldr x9, [x1, #96]
        mov sp, x9
        ret
        ASM
    };
}

// First-entry trampoline: a fiber's bootstrapped LR points here. x19 holds the
// `*Fiber` (preset in the saved context); move it to x0 and call the body.
asm {
    #string T
    .global _fib_tramp
    _fib_tramp:
    mov x0, x19
    bl _fib_body
    brk #0
    T,
};

fib_tramp :: () extern;

// Descend `depth` frames, yield to the spawner at the bottom, then on resume
// verify every frame's stack-local survived the switch.
descend :: (self: *Fiber, depth: i64) -> i64 {
    if depth == 0 {
        swap_context(@self.ctx, self.finish);
        return 0;
    }
    marker : i64 = depth * 7 + 3;
    bad := descend(self, depth - 1);
    if marker == depth * 7 + 3 { self.verified.* = self.verified.* + 1; } else { bad = bad + 1; }
    return bad;
}

fib_body :: (self: *Fiber) export "fib_body" {
    if self.mode == 1 {
        self.depth_fail = descend(self, 64);
        swap_context(@self.ctx, self.finish);
        return;
    }
    canary : u64 = 0xCA11AB1E0000 + (xx self.id);
    i := 0;
    while i < self.rounds {
        self.count.* = self.count.* + 1;
        swap_context(@self.ctx, self.peer);
        if canary != 0xCA11AB1E0000 + (xx self.id) { self.canary_fail = self.canary_fail + 1; }
        i = i + 1;
    }
    swap_context(@self.ctx, self.finish);
}

STACK :: 131072;

boot :: (f: *Fiber) {
    base : *void = context.allocator.alloc_bytes(STACK);
    top : u64 = (xx base) + STACK;
    top = top - (top % 16);            // 16-byte aligned stack top (AAPCS)
    f.ctx.regs[0] = xx f;              // x19 = self
    f.ctx.regs[10] = 0;                // fp
    f.ctx.regs[11] = xx fib_tramp;     // lr → trampoline
    f.ctx.regs[12] = top;              // sp
    f.canary_fail = 0;
    f.depth_fail = 0;
}

main :: () -> i64 {
    main_ctx : FiberCtx = ---;
    count : i64 = 0;
    verified : i64 = 0;

    // Scenario 1: 2-fiber ping-pong with a per-fiber stack canary.
    a : Fiber = ---; a.id = 1; a.mode = 0; a.rounds = 3; a.count = @count; a.verified = @verified; a.finish = @main_ctx;
    b : Fiber = ---; b.id = 2; b.mode = 0; b.rounds = 3; b.count = @count; b.verified = @verified; b.finish = @main_ctx;
    a.peer = @b.ctx; b.peer = @a.ctx;
    boot(@a); boot(@b);
    swap_context(@main_ctx, @a.ctx);
    print("rounds: {}\n", count);
    print("canary fails: {}\n", a.canary_fail + b.canary_fail);

    // Scenario 2: a deep recursive chain suspended at the bottom, then resumed.
    c : Fiber = ---; c.id = 3; c.mode = 1; c.count = @count; c.verified = @verified; c.peer = @main_ctx; c.finish = @main_ctx;
    boot(@c);
    swap_context(@main_ctx, @c.ctx);   // descend to the bottom, yields back
    swap_context(@main_ctx, @c.ctx);   // resume → unwind + verify, then finish
    print("frames verified: {}\n", verified);
    print("depth fails: {}\n", c.depth_fail);
    return 0;
}
@@ -0,0 +1 @@
{ "target": "macos" }
@@ -0,0 +1 @@
0
16966
examples/expected/1807-concurrency-fiber-context-switch.ir
Normal file
File diff suppressed because one or more lines are too long
@@ -0,0 +1 @@
@@ -0,0 +1,4 @@
rounds: 6
canary fails: 0
frames verified: 64
depth fails: 0