The context switch is now proven on a second arch/ABI pair. A Win64 swap_context saves the complete Win64 callee-saved set: 8 GP (rbx,rbp,rdi,rsi,r12-r15) + rsp AND xmm6-xmm15 (10 XMM, 128-bit via movups -- Win64 has callee-saved XMM, unlike SysV/aarch64), plus a Win64 scribble_verify (264-byte frame, 32-byte shadow + 16-align at each call, COFF symbols, rsp-carried return address) driving the 2-fiber mutual scribble. Built --target x86_64-windows-gnu --self-contained (PE32+, output via the Win32 WriteFile boundary -- the 1660 pattern) and run on a Windows 7 x64 VM (UTM): printed '0 0 P' -- every GP + XMM callee-saved register survived the switch. Adversarially reviewed before the VM run (worker emitted the real .s and verified every call alignment, the frame offsets, the rsp/return-address round-trip, swap ordering, and COFF naming against the Win64 ABI -- no critical/minor bugs). Locked by examples/1810-concurrency-fiber-switch-win64.sx (pinned x86_64-windows-gnu, ir-only on this non-Windows host; the VM run is the runtime-correctness provenance). Good-swap-only mutual scribble (self-validating by construction; the in-process negative control was dropped to avoid an sx fn-ptr-convention issue -- detection of this exact logic was negative-controlled on aarch64 in 1808). Suite green 736/0. The B1.3 switch is proven on aarch64 + x86_64/Win64. Next: B1.4 (Io impls / M:1 scheduler).
# CHECKPOINT-FIBERS -- Stream B1 (fibers + Io + M:1 scheduler)

Companion to [PLAN-FIBERS.md](PLAN-FIBERS.md). Update after every step (one step at a time,
per the cadence rule). New corpus category: `18xx` concurrency.

## Last completed step
**B1.3b-1 -- the x86_64 / Win64 `swap_context` sibling -- VALIDATED at runtime on a Windows x64 VM.**
The context switch is now proven on a SECOND architecture + ABI. A Win64 `swap_context` saves the
COMPLETE Win64 callee-saved set -- 8 GP (rbx, rbp, rdi, rsi, r12-r15) + rsp **and xmm6-xmm15**
(10 XMM, 128-bit via `movups`; Win64 has full-width callee-saved XMM, whereas SysV x86_64 has none
and AAPCS64 preserves only the low 64 bits of v8-v15) -- plus a Win64
`scribble_verify` (32-byte shadow + 16-align at each `call`, COFF symbols, rsp-carried return
addr). Locked by `examples/1810-concurrency-fiber-switch-win64.sx` (pinned `x86_64-windows-gnu`,
ir-only here): the 2-fiber mutual scribble printed **`0 0 P`** when built
`--target x86_64-windows-gnu --self-contained` and **run on a Windows 7 x64 VM (UTM)** -- every GP + XMM
callee-saved register survived. **Adversarially reviewed before the VM run** (worker emitted the real `.s`
and verified every `call` alignment, the 264-byte frame offsets, the rsp/return-addr round-trip,
swap ordering, and COFF naming against the Win64 ABI -- no critical/minor bugs). The build→VM→run
loop was set up this session (cross-build needs `--self-contained`; output via the Win32
`WriteFile` boundary, the 1660 pattern). Suite green (736/0). Note: this is the GOOD-swap-only mutual
scribble (self-validating by construction; the in-process negative control was dropped to avoid an
sx fn-ptr-convention rabbit hole -- the detection of this exact logic was negative-controlled on
aarch64 in 1808). The SysV/Linux x86_64 sibling (different reg set: no callee-saved XMM, args
rdi/rsi) remains for a Linux x86_64 host.

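The register counts and the 16-byte alignment rule above can be checked with a little arithmetic. The sketch below is a Python model, not the sx source: the register lists come from the text, while the helper names and the "packed" context size are assumptions made for illustration (the real save-area layout is not shown in this document).

```python
# Illustrative model only. Register lists are taken from the text; the helper
# names and the packed-size figure are assumptions, not the sx ctx layout.
WIN64_GP = ["rbx", "rbp", "rdi", "rsi", "r12", "r13", "r14", "r15"]
WIN64_XMM = [f"xmm{n}" for n in range(6, 16)]          # xmm6-xmm15
SYSV_GP = ["rbx", "rbp", "r12", "r13", "r14", "r15"]   # no callee-saved XMM


def packed_ctx_bytes(gp, xmm):
    """Bytes for the GP registers plus rsp (8 each) and XMM registers (16 each)."""
    return (len(gp) + 1) * 8 + len(xmm) * 16


def rsp_mod16_at_call(frame_bytes):
    """rsp mod 16 at an inner `call`, for a function that was itself called.

    The caller's `call` pushed an 8-byte return address onto a 16-aligned
    stack, so rsp is 8 (mod 16) on entry; `sub rsp, frame_bytes` follows.
    """
    entry = -8 % 16
    return (entry - frame_bytes) % 16


print(packed_ctx_bytes(WIN64_GP, WIN64_XMM))  # 232
print(packed_ctx_bytes(SYSV_GP, []))          # 56, i.e. seven 8-byte slots
print(rsp_mod16_at_call(264))                 # 0: aligned at each call
print(rsp_mod16_at_call(256))                 # 8: would be misaligned
```

A frame size that is 8 modulo 16, such as 264, is what keeps every inner `call` 16-byte aligned under this model.
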
### Earlier -- B1.3b-2 -- mmap guard-page stacks (commit `dd532ab`)
Fiber stacks are `mmap`'d with a `PROT_NONE` GUARD PAGE at the low end (§8.1.1: a
fixed stack without a guard silently corrupts neighbors on overflow). `mmap` the
`[guard | usable]` region, `mprotect` the low 16KB page `PROT_NONE`; SP descends into the guard and
faults loudly at the boundary instead of corrupting a neighbor. Locked by
`examples/1809-concurrency-fiber-guard-stack.sx` (aarch64-macos-pinned): `guard armed: 1`
(`mprotect`→0) + `sum: 20100` (a fiber runs real recursion on the guarded stack + yields).
- **Guard FIRING validated** (manually, not corpus-pinned -- a deliberate overflow crash is
  host-fragile): a fiber recursing past its 128KB stack faults with `Bus error` at the guard page
  (`region+GUARD`); the sx crash handler turns it into exit 134. Documented in the example header.
- **x86_64 sibling:** was deferred here (couldn't run x86_64 on this arm64 host), then DONE as
  Win64 once a Windows 7 x64 VM became available -- see B1.3b-1 above (`examples/1810`, `0 0 P`).

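The `[guard | usable]` layout can be sketched as plain address arithmetic. This Python model only classifies addresses; it does not call `mmap` or `mprotect`. The 16KB guard and 128KB usable sizes are the ones quoted above; the function names and the example base address are hypothetical.

```python
# Address-arithmetic model of a guarded fiber stack. Sizes come from the text;
# the names and the base address are illustrative assumptions.
GUARD = 16 * 1024     # low page made PROT_NONE
USABLE = 128 * 1024   # usable stack above the guard


def stack_layout(region_base):
    """Return (guard_end, stack_top) for a region mapped at region_base."""
    guard_end = region_base + GUARD   # first usable byte ("region+GUARD")
    stack_top = guard_end + USABLE    # initial SP; the stack grows downward
    return guard_end, stack_top


def classify_access(region_base, addr):
    """Say which part of the region a memory access at addr would touch."""
    guard_end, stack_top = stack_layout(region_base)
    if guard_end <= addr < stack_top:
        return "usable"
    if region_base <= addr < guard_end:
        return "guard"                # would fault instead of corrupting a neighbor
    return "outside"


def frames_before_fault(region_base, frame_bytes):
    """Count fixed-size frames that fit before SP first lands in the guard."""
    _, sp = stack_layout(region_base)
    depth = 0
    while True:
        sp -= frame_bytes
        if classify_access(region_base, sp) != "usable":
            return depth
        depth += 1


print(frames_before_fault(0x10000000, 64))  # 2048
```

The model assumes each frame is smaller than the guard. A single frame larger than the guard could step over it, which this sketch does not cover.
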
### Earlier -- B1.3a-2 -- the context-switch STRESS GATE (design §10.7) -- DONE + adversarially reviewed
The explicit every-callee-saved-register scribble that B1.3a-1 owed. `swap_context` now saves the
COMPLETE AAPCS64 callee-saved set -- integer x19-x28 + fp/lr + sp AND FP **d8-d15** (per §6.1.2
only the low 64 bits of v8-v15 are callee-saved, so `d8-d15` is exactly sufficient; x18 is Apple's
reserved platform reg, untouched). A naked `scribble_verify(self_ctx, peer, base)` loads a unique
sentinel into all 18 callee-saved regs, yields, and on resume counts the ones that didn't survive
(honoring its own caller ABI via a 176-byte frame that saves+restores the caller's callee-saved;
base reloaded from the frame post-swap; the original lr round-trips through the swap). The gate is
a **2-fiber MUTUAL scribble** (A and B scribble DISTINCT sentinels into the same physical regs, so
each survives only if `swap_context` saved+restored it -- a lone fiber yielding to an idle peer
would NOT exercise preservation). Locked by `examples/1808-concurrency-fiber-switch-stress.sx`
(aarch64-pinned): `A mismatches: 0` / `B mismatches: 0`.
- **Validity proven by NEGATIVE controls:** dropping the d8-d15 save/restore → 8/8 mismatches
  (exactly the FP regs); dropping x27/x28 → 2/2. The gate genuinely catches a broken switch.
- **Adversarial review (worker, per the plan): no CRITICAL bugs.** Verified the callee-saved set
  is complete + correct, all frame offsets/16-alignment, the lr/sp dance, and swap read-ordering
  against AAPCS64. Applied its one recommendation: `boot` now zeroes the FP ctx slots [13..20] so a
  first switch-to loads 0 (not garbage) into d8-d15. Residual gaps it flagged (all spec-correct
  for a call-boundary swap, documented in the example header): NZCV/FPSR not swapped; **FPCR**
  (rounding mode -- thread-global, bleeds across fibers if changed) and **TPIDR_EL0/TLS** (errno,
  allocator thread-caches -- shared by same-thread fibers) not swapped; fp=0 bootstrap blocks
  unwind/signal walking past a fiber trampoline. These bite at the N×M:1 / signals stages, not the
  single-thread switch.
- Suite green **734/0**, master clean. WIP probes: `.sx-tmp/scribble2.sx` (+ `_broken`/`_gp`).

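Why the gate has to be mutual can be seen in a small simulation. The Python sketch below models the 18 scribbled registers as a dictionary and `swap_context` as "save a chosen set, then load the peer's copy of that set". It is a model of the idea only; the function names, sentinel values and round structure are assumptions, not the logic of `examples/1808`.

```python
# Toy model of the 2-fiber mutual scribble. Register names come from the text;
# everything else (names, sentinels, round order) is an illustrative assumption.
GP = [f"x{n}" for n in range(19, 29)]   # x19-x28
FP = [f"d{n}" for n in range(8, 16)]    # d8-d15
ALL = GP + FP                            # the 18 scribbled registers


def swap_context(regs, save_ctx, load_ctx, saved):
    """Save the registers in `saved` to save_ctx, then load them from load_ctx."""
    for r in saved:
        save_ctx[r] = regs[r]
    for r in saved:
        regs[r] = load_ctx[r]


def run_gate(saved, peer_scribbles=True):
    """Return (A mismatches, B mismatches) for a switch that saves `saved`."""
    regs = dict.fromkeys(ALL, 0)
    ctx_a = dict.fromkeys(ALL, 0)
    ctx_b = dict.fromkeys(ALL, 0)
    sent_a = {r: 0xA000 + i for i, r in enumerate(ALL)}
    sent_b = {r: 0xB000 + i for i, r in enumerate(ALL)}

    regs.update(sent_a)                        # A scribbles its sentinels
    swap_context(regs, ctx_a, ctx_b, saved)    # A yields to B
    if peer_scribbles:
        regs.update(sent_b)                    # B scribbles distinct sentinels
    swap_context(regs, ctx_b, ctx_a, saved)    # B yields back to A
    a_bad = sum(regs[r] != sent_a[r] for r in ALL)

    regs.update(sent_a)                        # A scribbles again
    swap_context(regs, ctx_a, ctx_b, saved)    # A yields so B can verify
    b_bad = sum(regs[r] != sent_b[r] for r in ALL) if peer_scribbles else 0
    return a_bad, b_bad


print(run_gate(ALL))                                             # (0, 0)
print(run_gate(GP))                                              # (8, 8)
print(run_gate([r for r in ALL if r not in ("x27", "x28")]))     # (2, 2)
print(run_gate(GP, peer_scribbles=False))                        # (0, 0)
```

The last line is the case the text warns about: with an idle peer, a switch that forgets d8-d15 still reports zero mismatches, so only the mutual scribble exposes it.
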
### Earlier -- B1.3a-1 -- the foundational stackful context switch (commit `b234b7d`)
Pure sx over `abi(.naked)`: naked `swap_context` (GP-only 13-slot save) + by-hand fiber bootstrap
(SP = `alloc_bytes` stack top, LR = global-asm trampoline, x19 = `*Fiber`). Locked by
`examples/1807-concurrency-fiber-context-switch.sx`: 2-fiber ping-pong (`rounds: 6` /
`canary fails: 0`) + 64-frame deep recursion (`frames verified: 64` / `depth fails: 0`). Indirect
register/stack survival; 1808 supersedes its switch with the complete GP+FP save area + the
explicit gate.

### Earlier -- B1.2 COMPLETE -- the async surface works end-to-end
All three surface blockers (0151, 0152, 0153) FIXED + committed; async examples landed + green.
- **0151 fixed** (`362674f`): generic `$T` infers through generic-struct / pointer / UFCS-pack
  params. Regression `0214` + `0215`.
- **0152 fixed** (`e5586f6`): `Atomic(bool)` load/store byte-promoted to `i8` in the codegen
  emitters. Regression `1705`.
- **0153 fixed** (`68c1991`): `inferGenericReturnType` now pins return-type resolution to the
  fn's DEFINING module (mirroring `monomorphizeFunction`), so a re-exported value-failable's
  `!E` resolves to the real `.error_set` TypeId -- the failable channel survives the re-export
  alias. Regression `1058-errors-reexport-value-failable-channel.sx`.
- **Async examples landed:** `examples/1805-concurrency-io-blocking-async.sx`
  (`context.io.async((a,b)->i64 => a+b, 40, 2).await() or {…}` → `sum: 42` / `double: 42` /
  `clock ok`) + `examples/1806-concurrency-io-cancel.sx` (`f.cancel()` → `await` raises
  `.Canceled` → `or` default; `ok: 7` / `canceled: -99`). Both green, snapshots captured.

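The behaviour the two async examples lock can be modelled in a few lines. This is a Python sketch of the semantics as described here (a blocking `Io` produces a ready future; `await` on a canceled future raises and the `or` arm supplies a default). The class and function names are hypothetical and do not mirror `library/modules/std/io.sx`.

```python
# Semantic model only: names are assumptions, not the sx std/io API.
class Canceled(Exception):
    """Stands in for the `.Canceled` error raised by `await`."""


class Future:
    def __init__(self, value):
        self.value = value
        self.canceled = False

    def cancel(self):
        self.canceled = True

    def await_(self):
        if self.canceled:
            raise Canceled()
        return self.value


def io_async(worker, *args):
    """Blocking Io model: run the worker to completion, return a ready Future."""
    return Future(worker(*args))


def await_or(future, default):
    """Model of `f.await() or { default }`."""
    try:
        return future.await_()
    except Canceled:
        return default


print("sum:", await_or(io_async(lambda a, b: a + b, 40, 2), 0))   # sum: 42

f = io_async(lambda: 7)
f.cancel()
print("canceled:", await_or(f, -99))                               # canceled: -99
```
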
### Earlier -- the three B1.2 surface fixes (committed)
Generic `$T` inference, `Atomic(bool)` byte-promotion, and re-export failable-channel pin --
details below.
- **0151 fix (committed):** four gaps closed on the inference + UFCS-dispatch path --
  (1) `extractTypeParam`/`matchTypeParam(Static)` got a `parameterized_type_expr` arm
  (recover the arg instance's recorded per-param bindings via `struct_instance_bindings` +
  the template's ordered `type_params`, recurse positionally; this also fixes `*Box($T)` --
  it recurses into its `Box($T)` pointee); (2) the `pointer_type_expr` arm now falls through
  to match the pointee against a non-pointer arg (auto-address-of: a `*Box($T)` param accepts
  a by-value `Box($T)`, e.g. a UFCS receiver `b.m()`); (3) `ExprTyper.inferType` got a
  `.lambda` arm building the closure type from the lambda's annotations (the UFCS binder types
  args from the raw AST before they're lowered, so it can now bind `Closure(..) -> $R` from
  the worker's declared return type); (4) a pack UFCS target routes through the SAME
  `lowerPackFnCall` the direct call uses, with the receiver spliced in as `args[0]`.
- Regression tests: `examples/0214-generics-ufcs-closure-return-pack.sx` (direct + UFCS
  closure-return pack) + `examples/0215-generics-infer-through-pointer.sx` (by-value /
  pointer / multi-param / nested / UFCS-auto-ref struct-head inference). Issue 0151 marked
  RESOLVED; repro moved into the suite.

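Gaps (1) and (2) of the 0151 fix amount to a positional, recursive match of a parameter type against an argument type. The following is a toy Python unifier that shows the shape of that recursion, including the auto-address-of fall-through. The type encoding and the function name are assumptions for illustration; the compiler itself recovers bindings from `struct_instance_bindings` rather than from structural tuples like these.

```python
# Toy type matcher. Encoding (an assumption for this sketch):
#   ("var", "T")              a generic parameter $T
#   ("name", "i64")           a concrete named type
#   ("ptr", inner)            *inner
#   ("app", "Box", (args,))   Box(args...)
def match_type_param(param, arg, bindings):
    """Bind generic variables in `param` from `arg`; return True on success."""
    kind = param[0]
    if kind == "var":
        bound = bindings.setdefault(param[1], arg)
        return bound == arg                      # a rebind must agree
    if kind == "ptr":
        if arg[0] == "ptr":
            return match_type_param(param[1], arg[1], bindings)
        # Auto-address-of: a by-value argument is matched against the pointee.
        return match_type_param(param[1], arg, bindings)
    if kind == "app":
        if arg[0] != "app" or arg[1] != param[1] or len(arg[2]) != len(param[2]):
            return False
        # Recurse positionally over the type arguments.
        return all(match_type_param(p, a, bindings)
                   for p, a in zip(param[2], arg[2]))
    return param == arg


I64 = ("name", "i64")
unbox_param = ("ptr", ("app", "Box", (("var", "T"),)))   # (b: *Box($T))

by_pointer, by_value = {}, {}
print(match_type_param(unbox_param, ("ptr", ("app", "Box", (I64,))), by_pointer), by_pointer)
print(match_type_param(unbox_param, ("app", "Box", (I64,)), by_value), by_value)
```

Both calls bind `T` to `i64`, which is the inference that `unbox :: (b: *Box($T)) -> $T` needed.
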
### Earlier -- B1.2 (Io capability) -- LANDED + adversarially reviewed
Commits `a1b14f0` (lock) + `45d869d` (Io capability) + `3eeb965` (issue 0151 lock).
- **LANDED + review-confirmed correct** (commit 45d869d): `Io :: protocol #inline`
  (spawn_raw/suspend_raw/ready/poll/now_ms/arm_timer) + `io` field on `Context`
  (`{allocator; data; io}`, io LAST); BOTH `__sx_default_context` materializers
  (protocol.zig + comptime_vm.zig) build an identical CBlockingIo→Io vtable (review verified
  byte-for-byte agreement; `context.io.now_ms()` dispatches at runtime AND comptime); the
  `push Context.{…}` omitted-field-**inherits-ambient** fix (review: correct, right fix, no
  bad blast radius); `library/modules/std/io.sx` (`Future($R)`, `CBlockingIo`,
  `async`/`await`/`cancel`); the `!`-protocol-impl-lint suppression; 37 `.ir` regens
  (review: pure layout/type-table, no error text, zero .exit/.stdout/.stderr change).
- **BLOCKED -- async surface non-functional:** `await`/`cancel` take `*Future($R)` and are
  **uncallable in EVERY form** (not just UFCS) -- sx can't infer a generic `$T` from a
  pointer-wrapped arg (`*Future($R)`). `async(...)` (create) works via explicit call and
  produces a correct `.ready` Future, but you can't `await` it. Root bug = **issue 0151
  (WIDENED)**: infer `$T` from `*T`-wrapped params + closure-return-via-pack + UFCS dispatch.
  Minimal repro: `unbox :: (b: *Box($T)) -> $T` fails to infer `T`.
- **No async example in the corpus** (1805 was removed because it needs the blocked surface)
  → the green suite does NOT cover async. Restore `1805` (async/await) + add `1806` (cancel)
  once 0151 is fixed.

### Earlier -- B1.1 (per-fiber `context` root) -- DONE. Zero compiler change (confirmed by probe).
The fiber-spawn context convention works end-to-end with ordinary language features:
- `snap := context` captures the spawner's `Context` as a value;
- the snapshot is stored in a struct (the stand-in `Fiber`);
- a trampoline running under a *different* ambient context installs the fiber's stored root
  with `push f.root { … }`, and the body reads the snapshot -- not the trampoline's ambient
  context -- because `context` is an implicit slot-0 `*Context` param (call-carried, rides the
  callee's own stack) and `push` allocates on the caller frame (no global, no TLS).
- Locked by `examples/1804-concurrency-context-snapshot.sx`: prints `fiber root: 42` (the
  installed snapshot wins over ambient 99) + `ambient after: 99` (the `push` scope restores
  the ambient context on exit). No fiber runtime yet (that's B1.3) -- this proves the plumbing
  it will build on. No `.build` pin (pure sx, host-independent).
- **Probe result:** the design doc's "lower as swappable indirection, never raw TLS" guarded
  a non-problem -- context was already param-carried, never TLS. No path re-reads
  `__sx_default_context` mid-stack, so there is **no compiler obligation** here.
- `zig build && zig build test` green: **726 ran, 0 failed**.

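The spawn convention in the list above can be mimicked by passing the context explicitly. In this Python sketch the first parameter plays the role of the implicit slot-0 `*Context`, and calling the body with the stored root plays the role of `push f.root { … }`. All names are hypothetical; the values 42 and 99 mirror the output of `examples/1804`.

```python
# Model of the call-carried context. Names are illustrative assumptions.
def body(ctx):
    # `context` modelled as an explicit first parameter.
    return ctx["data"]


def trampoline(ambient, fiber):
    seen_by_fiber = body(fiber["root"])   # models `push f.root { body() }`
    seen_after = body(ambient)            # after the push scope: ambient again
    return seen_by_fiber, seen_after


spawner_ctx = {"data": 42}
fiber = {"root": dict(spawner_ctx)}       # `snap := context`, captured by value
spawner_ctx["data"] = 7                   # later changes do not reach the snapshot

print(trampoline({"data": 99}, fiber))    # (42, 99)
```

Because nothing here is global or thread-local, the fiber body sees whichever context it is handed, which is the property the probe confirmed.
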
### Earlier -- B1.0 (`abi(.naked)` codegen) -- complete
Replaced the emit bail with real LLVM `naked` emission:
- `emit_llvm` declaration pass: for `func.is_naked`, add the LLVM `naked` + `noinline` +
  `nounwind` attributes and **skip** the `frame-pointer=all` attribute (incompatible with a
  frameless function). Pass 2 now emits the `.naked` body normally -- `naked` makes the
  backend emit it verbatim (the inline asm + its own `ret`) with no prologue/epilogue.
- IR shape (verified): `; Function Attrs: naked noinline nounwind` / `define internal i64
  @answer() #0 { entry: call void asm sideeffect "…ret…", ""() unreachable }` /
  `attributes #0 = { naked noinline nounwind }`. The caller invokes it as an ordinary
  `() -> i64` call (`.naked` is `call_conv == .default`).
- `examples/1800-concurrency-naked-asm.sx` -- now GREEN, aarch64-pinned (`.build {"target":
  "macos"}`): runs end-to-end → **exit 42** on this host, ir-only on a mismatch; `.ir`
  snapshot captured.
- `examples/1801-concurrency-naked-generic.sx` (renamed from `-bail`) -- the generic `.naked`
  now emits a correct naked `answer__i64` (exit 42), proving generic.zig produces a naked
  body, not a framed one. aarch64-pinned.
- `examples/1802-concurrency-naked-asm-x86.sx` -- x86_64 cross sibling (`.build {"target":
  "x86_64-linux"}`, ir-only here): `.ir` locks `naked` + `movl $42, %eax` / `ret`.
- Unit test `emit: abi(.naked) function gets the naked attribute (no frame-pointer)` in
  `emit_llvm.test.zig` (asserts `naked` present, `frame-pointer` absent).
- **B1.0c (review-hardening):** a param-bearing `.naked` fn emitted invalid LLVM (loud
  verifier error "cannot use argument of naked function") because the param-alloca loop
  wasn't gated. Fixed forward (this *enables* the B1.3 context-switch use case rather than
  rejecting it): gated the param-alloca loop on `fd.abi != .naked` in decl.zig (both paths) +
  generic.zig; a naked fn's args stay in registers (read by asm), declared-but-unused in
  LLVM. Locked by `examples/1803-concurrency-naked-asm-param.sx` (`add(a,b)` → x0+x1 → 42).
- `zig build && zig build test` green: **725 ran, 0 failed** + unit tests.

### Earlier -- B1.0a (lock + review hardening)
Plumbed `Function.is_naked` (set from `fd.abi == .naked` at both decl sites + generic.zig +
pack.zig); `funcWantsImplicitCtx` skips `.naked` (no synthetic ctx, like `.c`); all
body-lowering paths bypass `lowerValueBody` for `.naked` (asm body + `unreachable` cap -- no sx
return); `emit_llvm` Pass 2 bailed loudly (since flipped to real emission). Adversarial
review caught the generic/pack `is_naked` gap (a generic `.naked` silently shipped a framed
body); closed + locked. The review's `.naked`-lambda CRITICAL was a false positive
(unparseable -- `isLambda` breaks on the `abi` keyword).

## Current state
**B1.2 COMPLETE.** The full async surface (Io capability on Context + `async`/`await`/`cancel` +
blocking `CBlockingIo`) works end-to-end. Master GREEN (732/0 when B1.2 landed; 736/0 as of
B1.3b-1), installed `sx` clean. All four B1.2 surface bugs resolved or deferred:
- **0151 fixed** (`362674f`): generic `$T` through generic-struct / pointer / UFCS-pack params.
  Regression `0214` + `0215`.
- **0152 fixed** (`e5586f6`): `Atomic(bool)` byte-promoted to `i8` in the load/store emitters.
  Regression `1705`.
- **0153 fixed** (`68c1991`): `inferGenericReturnType` pins return-type resolution to the fn's
  defining module, so a re-exported value-failable keeps its `!` channel. Regression `1058`.
- Issue **0150** (`void` struct field → SIGTRAP) DEFERRED -- only `Future(void)` / `timeout`,
  which are B1.4.

The async examples are landed + green: `1805` (`async`/`await` + `now_ms` → `sum: 42` /
`double: 42` / `clock ok`) + `1806` (`cancel` → `await` raises `.Canceled` → `or` default).
The `18xx` concurrency category now covers naked-asm (1800-1803), context-snapshot (1804),
the async surface (1805-1806), and the B1.3 context-switch examples (1807-1810; see "Last
completed step").

### B1.2 Io capability -- what is LANDED + verified (commit 45d869d)
- `Io :: protocol #inline { spawn_raw; suspend_raw -> !; ready; poll; now_ms; arm_timer; }`
  in `core.sx` next to `Allocator`, with `SpawnOpts{ pin: PinTarget }` + `ParkToken{ handle }`.
  Six methods, each justified by a downstream consumer (B1.3-B1.5).
- `Context :: struct { allocator; data; io: Io; }` -- `io` appended LAST so `allocator` stays
  index 0 (the `call.zig:1229` hardcode) and `data` keeps index 1 (minimal VM-fallback churn).
- Both `__sx_default_context` materializers updated in lockstep + verified: `protocol.zig`
  `emitDefaultContextGlobal` (extended `ctx_fields` 2→3, built the `CBlockingIo→Io` inline
  7-word vtable `{null-ctx, fn0..fn5}` via `getOrCreateThunks("Io","CBlockingIo")`) and
  `comptime_vm.zig` `materializeDefaultContext` fallback (wrote the 6 thunk func-refs at
  `io_base = addr + 4*ps`, offset `+ (i+1)*ps`). The global path auto-followed the 3-field
  Context type. **`context.io.now_ms()` printed `clock ok` live -- the capability threads + the
  vtable dispatches correctly.**
- Stateless `CBlockingIo :: struct {}` + `impl Io for CBlockingIo` (mirror of `CAllocator`):
  blocking semantics -- `spawn_raw`/`ready`/`poll`/`arm_timer` no-op/0, `now_ms` → `time.mono_ms()`.
- **push-inherit-omitted fix** (`stmt.zig` `lowerPush`): a `push Context.{...}` now SEEDS the
  new slot from the ambient context (load+store), then overwrites ONLY the literal's named
  fields -- so omitted fields (now incl. `io`) are INHERITED, never zero-inited to a null
  vtable. Eliminates the omitted-field footgun globally (zero per-site churn across the 17
  partial-literal sites). This is the correct capability-bag semantics; it compiled clean.
- **`!`-protocol-method warning fix** (`error_analysis.zig` + a new `Lowering.impl_method_names`
  set populated in `protocols.zig` `registerImplBlock`): a protocol impl method may be declared
  `!` by contract (e.g. `Io.suspend_raw`) yet never raise; the "declared `!` but never errors --
  drop the `!`" hint is a false positive for impl methods, now suppressed for them.

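The push-inherit-omitted fix from the list above is easy to picture as "copy, then overwrite". The Python sketch below contrasts the seeded behaviour with a zero-initialising variant that drops omitted fields. The function names and the string stand-ins for field values are hypothetical; only the seed-then-overwrite order comes from the text.

```python
# Model of `push Context.{...}` with omitted fields. Names and the string
# placeholders for allocator/io are assumptions for illustration.
def push_context(ambient, **named):
    """Seed the new slot from the ambient context, then overwrite named fields."""
    new = dict(ambient)
    new.update(named)
    return new


def push_zero_init(ambient, **named):
    """The footgun being avoided: omitted fields start empty instead of inherited."""
    new = dict.fromkeys(ambient, None)
    new.update(named)
    return new


ambient = {"allocator": "CAllocator", "data": 0, "io": "CBlockingIo"}

print(push_context(ambient, data=7))     # io and allocator inherited
print(push_zero_init(ambient, data=7))   # io is None: a null vtable in the real thing
print(ambient)                           # the ambient context is left untouched
```
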
Status of the blockers that originally stopped B1.2 (historical snapshot from the session that
fixed 0151; 0152 and 0153 have since been fixed, see the top of "Current state"):
- **issue 0151 -- FIXED in that session** (generic `$T` through generic-struct / pointer /
  UFCS-pack params). `async`/`await`/`cancel` are callable. See "the three B1.2 surface fixes".
- **issue 0152 -- new at the time, and then the blocker** (`Atomic(bool)` → sub-byte i1 atomic;
  LLVM reject). Blocked the async examples via `Future.canceled: Atomic(bool)`. Filed;
  codegen-level fix (later landed as `e5586f6`).
- **issue 0150** -- `void` struct field SIGTRAP; only `Future(void)`/`timeout` (B1.4). DEFERRED.

Per the IMPASSABLE STOP rule at that point: 0151 fix shipped (suite green 728/0), 0152 filed,
STOPPED. The async examples resumed once 0152 landed.

### Earlier -- B1.0 + B1.1 complete
Stream A (atomics) is feature-complete (✅). Stream B1: **B1.0 + B1.1 complete.** The two
compiler-floor preconditions for the fiber runtime are in place: (1) `abi(.naked)` emits a
real LLVM `naked` function end-to-end (decl, generic, pack paths) -- the context-switch
substrate; (2) per-fiber `context` root needs **no compiler change** -- the spawn convention
(snapshot `context`, store, `push` it from the trampoline) is pure library sx. No
fibers/Io/scheduler code yet. Grounded floor facts:
- `context` is an implicit slot-0 `*Context` param + `push Context` is a stack `alloca` ⇒
  **fiber-local for free** (confirmed by the B1.1 probe -- never TLS, never re-read from the
  `__sx_default_context` global mid-stack). A spawn passes the snapshot as the fiber-entry
  fn's slot-0 ctx via `push f.root { entry(args) }`. Locked by `1804-...-context-snapshot`.
- Inline asm works end-to-end (lower→emit→JIT, aarch64 + x86_64) -- the `.naked` body reuses it.
- **`.naked` with PARAMS works** (B1.0c, the B1.3 substrate): the param-alloca loop is gated
  on `fd.abi != .naked` in decl.zig (both paths) + generic.zig -- a naked fn's args stay in
  ABI registers (read by the asm body), declared-but-unused in LLVM (verifier-legal).
  Example `1803-concurrency-naked-asm-param.sx` (`add(a,b)` reads x0/x1). **Unsupported (loud,
  not silent):** a `.naked` *variadic-pack* fn (pack.zig's param loop is intertwined with
  comptime-param/`#insert` handling, and a naked fn can't read a runtime-sized pack from
  registers anyway) → loud LLVM-verifier error for that nonsensical construct. Acceptable
  boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker.

## Next step
**→ B1.4 -- `Io` impls / the scheduler.** The switch substrate is proven on TWO arch/ABI pairs
(aarch64 native + x86_64/Win64 on the VM), with the §10.7 stress gate, guarded mmap stacks, and
adversarial review. That's enough to build the scheduler on. B1.4 builds the deterministic-sim
`Io` (calibrated against blocking `Io` before trusting it -- §8.1.3), then **B1.5** (M:1 scheduler)
replaces the hand-bootstrapped ping-pong with real `spawn`/`yield`/`resume` over the switch. The
§10.7 gate (1808) + guarded-stack path (1809) + the Win64 sibling (1810) must keep passing as the
switch is wrapped into the scheduler.

**Side thread (optional, low priority): the SysV/Linux x86_64 sibling.** A THIRD switch variant
for `x86_64-linux`: SysV callee-saved = rbx, rbp, r12-r15 + rsp (6 GP + sp; **no** callee-saved
XMM, unlike Win64) -- a 7-slot ctx, args rdi/rsi/rdx, the rsp-carried return addr. Needs a Linux
x86_64 host (or a working cross-run) to RUN + the mutual-scribble gate. Not blocking -- the switch
is already validated on two arch/ABI pairs.

**Deferred (do NOT block on these):** issue **0150** (`void` struct field SIGTRAP) -- only
`Future(void)`/`timeout` (B1.4). The **`::` callable-parameter feature** (named-fn async workers
`async(read_a, conn)`) -- WIP at `.sx-tmp/wip-callable-params/patch.diff` (parser done, inference
incomplete); a dedicated effort; lambda workers are the idiom meanwhile.

`Context` layout settled: `{ allocator; data; io; }` (allocator index 0 fixed by
`call.zig:1229`, io last). Io protocol + materializers + push-inherit are LANDED + reviewed.

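For the settled `Context` layout, the materializer offsets quoted under "Current state" (`io_base = addr + 4*ps`, thunk `i` at `+ (i+1)*ps`, a 7-word vtable) can be tabulated directly. The Python sketch below does only that arithmetic. It assumes an 8-byte pointer size and that `fn0..fn5` follow the protocol's declaration order; neither is stated outright in this document.

```python
# Offset table for the inline Io vtable. The formulae are quoted from the text;
# ps=8 and the fn0..fn5 ordering are assumptions for illustration.
IO_METHODS = ["spawn_raw", "suspend_raw", "ready", "poll", "now_ms", "arm_timer"]


def io_vtable_offsets(ctx_addr, ps=8):
    """Return {slot name: address} for the 7-word Io vtable inside a Context."""
    io_base = ctx_addr + 4 * ps
    slots = {"ctx": io_base}                     # word 0: the null ctx pointer
    for i, name in enumerate(IO_METHODS):
        slots[name] = io_base + (i + 1) * ps     # thunk i lives one word further on
    return slots


for name, addr in io_vtable_offsets(0).items():
    print(f"{name:12s} +{addr}")
```

With `ctx_addr = 0` the table runs from `+32` (the ctx word) to `+80` (`arm_timer`), so under these assumptions the vtable ends at byte 88 of the `Context`.
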
## Known issues / capability gaps
- **✅ issue 0153 -- FIXED** (re-exported generic value-failable `($R, !E)` kept its `!` channel:
  `inferGenericReturnType` now pins return-type resolution to the fn's defining module).
  Regression: `examples/1058`. Was the LAST B1.2 surface blocker.
- **✅ issue 0152 -- FIXED** (`Atomic(bool)` sub-byte i1 atomic → byte-promoted to i8 in the
  load/store emitters). Regression: `examples/1705`. Unblocked `Future.canceled`.
- **✅ issue 0151 -- FIXED** (generic `$T` through generic-struct / pointer / UFCS-pack params).
  Regression: `examples/0214` + `0215`. Was the original B1.2 surface blocker.
- **issue 0150** (deferred) -- a `void` struct field crashes the compiler (unsized-type SIGTRAP
  in LLVM `getTypeSizeInBits`). Blocks `Future(void)` → `timeout` (B1.4). Repro: `issues/0150-...`.
- (Note: **issue 0149**, filed by another session against an earlier dirty binary, was a
  manifestation of the pre-fix 0151 -- now moot.)
- **Orthogonal (not a B1 blocker):** default VALUES for comptime params don't bind on
  generic-struct methods (free-fn defaults DO work) -- inherited from Stream A. Only matters
  if a B2 lib type wants a defaulted comptime param; atomics/fibers require explicit, so
  unaffected.
- **Issue 0144 (open, independent):** calling an unrecognized bodiless `#builtin` silently
  returns 0 / exit 0 -- a silent-fallback footgun in the generic builtin-call path. Filed;
  leave for its own fix session unless prioritized. Not a B1 blocker.
- **Deferred design gap (documented):** the B1.4 event-loop `Io` does not yet cooperate with
  a platform UI run loop (CFRunLoop/NSRunLoop/ALooper); pinning gives thread-affinity, not
  run-loop integration -- a §6 app-target concern, out of B1 scope.

## Decisions (Stream B1 specifics; surface locked in design §4 / §4.6)
- **The async runtime is sx LIBRARY code.** The compiler provides only: the general
  primitives (inline asm ✅, `abi(.naked)` naked [B1.0], atomics ✅) + fiber-safe codegen
  (`context` already fiber-local -- B1.1). Schedulers, fibers, channels, futures, `Io`
  vtables, `mmap` stacks are all sx.
- **`abi(.naked)` is the real spelling of the design's `callconv(.naked)`** -- postfix slot,
  `name :: (sig) -> Ret abi(.naked) { asm { … }; }`. B1.0 = carry it into IR + emit LLVM
  `naked` + skip prologue/ctx (mirror the existing `.c` skip), NOT extend the enum (it's
  already there, just inert).
- **`.naked` ≠ `.c`:** a `.c` epilogue would restore SP from the wrong stack across a context
  switch (SP-in ≠ SP-out by design). `.naked` = no prologue/epilogue/frame; the asm emits its
  own `ret`. This is why the switch must be `.naked`.
- **Naming:** sx-facing name is **`naked`** (keyword `abi(.naked)`, field `is_naked`, the
  diagnostic), matching LLVM's `naked` attribute and the industry term (Zig/Rust/GCC/Clang).
  The ABI variant was renamed `.pure → .naked` (user direction): "pure" universally means
  *side-effect-free*, the opposite of a register-clobbering context switch.
- **B1.0 snapshot scope:** a `.naked` body is raw per-arch asm; LLVM's `naked` attr text is
  arch-invariant. **B1.0a** = one host example locked to the emit bail (host-independent --
  fires before instruction selection; no `.build` pin). **B1.0b** = pin aarch64 + add an
  x86_64 cross sibling (`.build` target-gated, ir-only on mismatch), like the asm corpus
  split. The `.ir` proves the `naked` attr + asm emitted, NOT register-save correctness
  (that's B1.3's stress harness).
- **B1.1 -- per-fiber context is library-only (CONFIRMED by probe):** push frames are
  stack-`alloca`'d and the implicit ctx rides slot 0, so the spawn convention -- snapshot
  `context`, store it, `push f.root { entry(args) }` from the trampoline -- installs the
  fiber's root with no compiler change. Verified: the body reads the snapshot over a different
  ambient context, and `push` restores ambient on exit (`1804-...-context-snapshot`). The
  design doc's "never raw TLS" guarded a non-problem (context was never TLS).
- **Test keystones (design §10):** the **B1.3 switch-stress harness** gates the
  context-switch (the one piece the deterministic `Io` can't test -- §8.1.1, §10.7); the
  **B1.4 deterministic-sim `Io`** (calibrated against blocking `Io` -- §8.1.3) gates all
  scheduling tests. Both must exist + be calibrated before the async tests they gate are
  trusted. `18xx` asserts program-emitted ordering contracts, not raw interleaving.

## Log
- **carve** -- wrote PLAN-FIBERS.md + CHECKPOINT-FIBERS.md. Grounded the B1 compiler floor:
  `ABI.naked` inert (type_resolver.zig:237), IR `Function` has no naked flag (inst.zig:605),
  attribute API pattern (emit_llvm.zig:1339 nounwind), `.c` ctx-skip precedent
  (decl.zig:515), `push Context` stack-alloca + slot-0 implicit ctx (stmt.zig:1263,
  lower.zig:259), `__sx_default_context` root (decl.zig:2667/2815), inline-asm corpus
  (1645/1651). Corrected the design's `callconv(.naked)` → real `abi(.naked)` spelling and
  the B1.0 snapshot story. B1.1 grounded as likely library-only. Baseline green (721/0).
- **B1.0a** -- plumbed `Function.is_naked` (set from `fd.abi == .naked` at both decl sites);
  `funcWantsImplicitCtx` skips `.naked` (no implicit ctx, like `.c`); both body-lowering
  paths bypass `lowerValueBody` for `.naked` (asm body + `unreachable` cap -- no sx return);
  `emit_llvm` Pass 2 bails loudly on `func.is_naked`. `examples/1800-concurrency-naked-asm.sx`
  locked to the bail (exit 1 + diagnostic). Suite green (722/0). (ABI variant later renamed
  `.pure → .naked` -- see the Naming decision above -- so all `is_*`/`abi(.*)`/example names
  here read `naked`.)
- **B1.0a review-hardening** -- adversarial review found generic/pack Function-creation paths
  left `is_naked` false (silent framed body for a generic `.naked` instance -- returned 42 but
  corrupted the stack). Fixed generic.zig + pack.zig (set `is_naked` + asm-only `unreachable`
  cap); locked by `examples/1801-concurrency-naked-generic-bail.sx`. The review's
  `.naked`-lambda CRITICAL was a false positive (unparseable -- `isLambda` breaks on `abi`). Suite
  green (723/0).
- **B1.0b** -- real `naked` emission: emit_llvm declaration pass adds LLVM `naked`/`noinline`/
  `nounwind` + skips `frame-pointer` for `func.is_naked`; Pass 2 emits the body verbatim (no
  prologue). `1800` green aarch64-pinned (exit 42 + `.ir`); renamed `1801` → `-generic`
  (generic `.naked` emits a naked body, exit 42); added x86_64 sibling `1802` (ir-only, `.ir`
  locks `naked` + `movl $42, %eax`). Unit test asserts `naked` present + `frame-pointer`
  absent. Suite green (724/0).
- **B1.0c** -- review-hardening: param-bearing `.naked` emitted invalid LLVM (loud verifier
  error). Gated the param-alloca loop on `fd.abi != .naked` (decl.zig both paths + generic.zig)
  -- naked args stay in registers, read by the asm body (the B1.3 context-switch shape).
  Locked by `examples/1803-concurrency-naked-asm-param.sx`. Pack `.naked` left unsupported
  (loud, nonsensical). **B1.0 complete.** Suite green (725/0).
- **rename** -- ABI variant `.pure → .naked` (keyword, `Function.is_naked`, diagnostics,
  examples 1800-1803 `*-pure-* → *-naked-*`, docs). "pure" universally means side-effect-free
  -- wrong for a register-clobbering switch; "naked" matches LLVM/Zig/Rust/GCC/Clang. Pure
  cosmetics, no semantic change. Suite green (725/0).
- **B1.1** -- per-fiber `context` root: **zero compiler change** (probe-confirmed). The spawn
  convention (snapshot `context` → store in a struct → `push f.root { entry() }` from the
  trampoline) installs the fiber's root via the implicit slot-0 `*Context` param; the body
  reads the snapshot, not the trampoline's ambient ctx, and the `push` scope restores ambient
  on exit. Locked by `examples/1804-concurrency-context-snapshot.sx` (prints `fiber root: 42`
  / `ambient after: 99`). Suite green (726/0). **Next: B1.2 (Io interface + context.io).**
- **B1.2 (BLOCKED)** — built the full `Io` capability (protocol on `Context`, stateless
`CBlockingIo` blocking default, both `__sx_default_context` materializers, push-inherit-omitted
fix, `!`-impl-method warning fix) and VERIFIED the core works live (`context.io.now_ms()` →
`clock ok`). Two independent compiler bugs blocked the `async`/`await`/`timeout` layer:
**0150** (`void` struct field → unsized SIGTRAP, blocks `Future(void)`) and **0151** (type-var
from a fn-ptr param's return type not bound in the body, blocks `async`'s `Future(R)`). Both
filed with standalone repros + investigation prompts. Per the STOP rule: reverted ALL B1.2
working changes (master green again, 726/0; the dirty binary had broken the photo project —
see the now-moot 0149), saved WIP to `.sx-tmp/b12-wip/`, STOPPED. Resume after 0150 + 0151.
- **0151 FIXED** — generic inference now binds `$T` through a generic-struct param head, a
pointer (`*Box($T)`, incl. UFCS auto-ref), and a closure-return-via-pack on the UFCS path.
Four gaps closed: `parameterized_type_expr` arm in `extractTypeParam`/`matchTypeParam(Static)`
(recovers the arg instance's recorded per-param bindings, recurses positionally); pointer arm
falls through to match a value arg (auto-address-of); `ExprTyper.inferType` `.lambda` arm
(closure type from annotations — UFCS types args from raw AST pre-lowering); pack UFCS target
routes through `lowerPackFnCall` with the receiver spliced in as `args[0]`. Issue 0151 marked
RESOLVED; repro → `examples/0214-generics-ufcs-closure-return-pack.sx`; widened cases →
`examples/0215-generics-infer-through-pointer.sx`. Suite green 728/0. The now-callable async
surface immediately exposed a SEPARATE codegen bug — **issue 0152** (`Atomic(bool)` → sub-byte
i1 atomic, LLVM reject; `Future.canceled` hits it). Filed with standalone repro + fix prompt.
Per the STOP rule: shipped the 0151 fix, filed 0152, STOPPED. Resume the async examples
(1805/1806) after 0152.
- **0152 FIXED** — the atomic load/store emitters (`src/backend/llvm/ops.zig`) byte-promote a
sub-byte (`bool`→`i1`) access to its `i8` storage type and `trunc`/`zext` the value at the
boundary (new `atomicByteType` helper). rmw/cmpxchg left as-is (a `bool` rmw/CAS is rejected
at the sx level — integer-only — so a sub-byte element never reaches them; comments record
this). Regression `examples/1705-atomics-bool-byte-promoted.sx` (load/store round-trip). Issue
0152 marked RESOLVED. Suite green 729/0. With `Atomic(bool)` working, the async surface
exposed the TRUE remaining blocker — **issue 0153**: a re-exported generic value-failable
`($R, !E)` loses its `!` channel at the call site (the earlier "secondary `or` PHI" symptom
was this, NOT an `Atomic` cascade — confirmed it persists after 0152). Narrowed to the
generic+re-export co-requirement (non-generic re-export OK; direct generic import OK; only the
combination drops `!`). Root cause: the monomorphized return-type's error-set, reached via the
re-export alias, resolves to a non-`.error_set` TypeId, so `errorChannelOf`
(`lower/error.zig:148`) misses the channel. Filed `issues/0153-...` with a minimal co-located
2-file repro + a single-file stdlib-`await` repro + investigation prompt. Per the STOP rule:
shipped the 0152 fix, filed 0153, STOPPED. Resume the async examples after 0153.
- **0153 FIXED → B1.2 COMPLETE** — `inferGenericReturnType` (`src/ir/generics.zig`) resolved the
return-type AST in the CALL-SITE module, so a re-exported error set (`LE :: lib.LE`) resolved
to a non-`.error_set` alias and the planned call-result was a plain tuple (channel lost). Fix:
pin the source to `fd.body.source_file` around the return-type resolution, exactly as
`monomorphizeFunction` does — the `!E` now resolves to the real `.error_set`. One-function
change; full suite green (732/0), no regression. Issue 0153 RESOLVED; repro →
`examples/1058-errors-reexport-value-failable-channel.sx` (+ companion `lib.sx`). With the
channel preserved, landed the async examples: **`1805`** (`async`/`await` + `now_ms` → `sum:
42` / `double: 42` / `clock ok`) + **`1806`** (`cancel` → `await` raises `.Canceled` → `or`
default; `ok: 7` / `canceled: -99`). **B1.2 (Io capability + M:1 async surface) is COMPLETE.**
Next: B1.3 (fiber runtime) on the `.naked` context-switch substrate.
- **B1.3a-1 — context switch works.** Implemented the stackful switch in pure sx over
`abi(.naked)`: `swap_context(from, to)` (save callee-saved x19-x28 + fp/lr + sp into `*from`,
load from `*to`, `ret` onto `to`'s stack) + by-hand fiber bootstrap (SP = top of an
`alloc_bytes` stack, LR = a `.global _fib_tramp` global-asm trampoline that does `mov x0, x19;
bl _fib_body`, x19 = `*Fiber`). Proven via a probe (main↔fiber), then locked by
`examples/1807-concurrency-fiber-context-switch.sx` (aarch64-pinned): a 2-fiber ping-pong
(`rounds: 6`, `canary fails: 0` — a per-fiber stack canary survives every switch) + a 64-frame
deep recursive chain suspended at the bottom and resumed (`frames verified: 64` / `depth fails:
0`). The `bl _fib_body` reaches the sx body via `export "fib_body"` (the 1655 asm→sx pattern);
runs under JIT, ir-only on a non-arm host (`.ir` captured — `swap_context` shows `naked noinline
nounwind`). Suite green 733/0. **Honest scope:** indirect register/stack survival only; the
EXPLICIT every-callee-saved + FP scribble (§10.7) is B1.3a-2, still owed. Next: B1.3a-2.
- **B1.3a-2 — the §10.7 stress gate, adversarially reviewed.** Extended `swap_context` to the
COMPLETE AAPCS64 callee-saved set (added FP d8-d15 → 21-slot ctx) and wrote a naked
`scribble_verify` that loads a unique sentinel into all 18 callee-saved regs, yields, and counts
non-survivors on resume (176-byte frame saves/restores the caller's callee-saved + base; lr
round-trips the swap). The gate is a 2-fiber MUTUAL scribble (each clobbers the other's regs, so
survival ⇒ the switch saved+restored them). Locked by
`examples/1808-concurrency-fiber-switch-stress.sx` (`A/B mismatches: 0`). Validity proven by
negative controls (drop d8-d15 → 8/8; drop x27/x28 → 2/2). **Spawned an adversarial-review
worker (per the plan + user request): NO critical bugs** — callee-saved set complete (x18 rightly
excluded; d8-d15 suffices per §6.1.2), offsets/alignment/lr-sp dance all verified. Applied its
one rec: `boot` zeroes FP ctx slots so first-entry loads 0, not garbage. Honest residual gaps
(spec-correct for a call-boundary swap; in the example header): FPCR/FPSR/NZCV + TPIDR/TLS not
swapped, fp=0 blocks unwind — relevant at N×M:1 / signals, not here. Suite green 734/0.
Next: B1.3b (x86_64 sibling + mmap guard-page stacks).
- **B1.3b — mmap guard-page stacks (x86_64 sibling deferred).** Fiber stacks now `mmap` a
`[guard | usable]` region and `mprotect` the low 16KB page `PROT_NONE`, so a stack overflow
faults at the guard boundary instead of silently corrupting a neighbor (§8.1.1). Locked by
`examples/1809-concurrency-fiber-guard-stack.sx` (aarch64-macos-pinned): `guard armed: 1`
(`mprotect`→0) + `sum: 20100` (a fiber runs real recursion on the guarded stack + yields).
Guard FIRING validated manually (overflow → `Bus error` at `region+GUARD`, exit 134 via the sx
crash handler) — not corpus-pinned because a deliberate-overflow crash is host-fragile (and a
mere "child faulted" fork test wouldn't prove the BOUNDARY catch). The x86_64 `swap_context`
sibling was DEFERRED: `--target x86_64-macos` mislinks on this arm64 host and `x86_64-linux`
can't run here, so it could only ship un-run/un-negative-controlled — which §10.7 forbids for the
highest-risk asm. SysV target notes (rbx/rbp/r12-r15/rsp, no callee-saved XMM, rsp-carried return
addr) recorded in Next step. Suite green **735/0**. Next: x86_64 sibling (needs an x86_64 host)
OR B1.4 (`Io` impls / scheduler) on the proven aarch64 substrate.
- **B1.3b-1 — x86_64 / Win64 switch sibling VALIDATED on real hardware.** The user provided a
Windows 7 x64 VM (UTM), so the x86_64 switch became RUNNABLE (as Win64). Validated the
cross-build→VM→run loop (`--target x86_64-windows-gnu --self-contained` → PE32+; output via the
Win32 `WriteFile` boundary, the 1660 pattern). Wrote a Win64 `swap_context` (8 GP rbx/rbp/rdi/
rsi/r12-r15 + rsp + **xmm6-xmm15** via `movups` — Win64 has callee-saved XMM) + a Win64
`scribble_verify` (264-byte frame, 32-byte shadow + 16-align at each `call`, COFF symbols,
rsp-carried return addr) driving the 2-fiber mutual scribble. **Adversarially reviewed (worker
emitted the real `.s`, verified every alignment/offset/round-trip against the Win64 ABI — no
critical/minor bugs), THEN run on the VM → `0 0 P`** (all 8 GP + 10 XMM callee-saved survived).
Locked by `examples/1810-concurrency-fiber-switch-win64.sx` (pinned `x86_64-windows-gnu`,
ir-only on this host; the VM run is the runtime-correctness provenance). Good-swap-only (the
in-process negative control was dropped to avoid an sx fn-ptr-convention rabbit hole; the
detection of this exact logic was negative-controlled on aarch64 in 1808). Suite green **736/0**.
The B1.3 context switch is now proven on TWO arch/ABI pairs. Next: **B1.4** (Io impls / M:1
scheduler) on the proven substrate. (Side thread: the SysV/Linux x86_64 sibling, when a Linux
x86_64 host is available.)
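
The B1.3a-2 mutual-scribble gate and its negative controls can be illustrated with a small Python model. This is only a sketch of the logic, not the sx/asm implementation: `swap_context`, `mutual_scribble`, the sentinel values, and the dict-based register file are hypothetical stand-ins, and the model assumes a fresh context loads zeros on first entry.

```python
# Toy model of the 2-fiber mutual scribble (hypothetical names throughout).
GP = [f"x{i}" for i in range(19, 29)]   # x19-x28, AAPCS64 callee-saved GP
FP = [f"d{i}" for i in range(8, 16)]    # d8-d15, AAPCS64 callee-saved FP
ALL_REGS = GP + FP                      # the 18 scribbled registers


def swap_context(cpu, from_ctx, to_ctx, saved):
    """Save the `saved` registers into from_ctx, then load them from to_ctx."""
    for reg in saved:
        from_ctx[reg] = cpu[reg]
    for reg in saved:
        cpu[reg] = to_ctx.get(reg, 0)   # assumption: a fresh ctx reads as 0


def mutual_scribble(saved):
    """Return (A mismatches, B mismatches) for a switch that saves `saved`."""
    cpu = {reg: 0 for reg in ALL_REGS}
    ctx_a, ctx_b = {}, {}
    sent_a = {reg: 0xA000 + i for i, reg in enumerate(ALL_REGS)}
    sent_b = {reg: 0xB000 + i for i, reg in enumerate(ALL_REGS)}

    cpu.update(sent_a)                          # A scribbles every register
    swap_context(cpu, ctx_a, ctx_b, saved)      # A yields to B (first entry)
    cpu.update(sent_b)                          # B scribbles over A's values
    swap_context(cpu, ctx_b, ctx_a, saved)      # B yields back to A
    miss_a = sum(cpu[r] != sent_a[r] for r in ALL_REGS)

    cpu.update(sent_a)                          # A scribbles again
    swap_context(cpu, ctx_a, ctx_b, saved)      # A yields to B
    miss_b = sum(cpu[r] != sent_b[r] for r in ALL_REGS)
    return miss_a, miss_b
```

In this model a switch that saves all 18 registers yields 0 mismatches for both fibers, while removing registers from `saved` makes exactly those registers fail on both sides. That is the reason survival under mutual clobbering implies the switch saved and restored them. The counts are properties of the model; the figures quoted in the entry come from the real asm run.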
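
The address arithmetic behind the B1.3b `[guard | usable]` layout can be sketched as follows. This models only the layout, not the `mmap`/`mprotect` calls; `stack_layout`, `hits_guard`, and the example base address are hypothetical, and the 16-byte alignment of the initial stack pointer is an assumption about the bootstrap.

```python
# Layout arithmetic for a [guard | usable] fiber stack (hypothetical helpers).
GUARD = 16 * 1024   # the 16KB guard size quoted in the B1.3b entry


def stack_layout(region, usable, guard=GUARD):
    """Return (guard_lo, guard_hi, initial_sp) for a region mapped at `region`."""
    guard_hi = region + guard            # first usable byte
    top = guard_hi + usable              # one past the last usable byte
    initial_sp = top & ~0xF              # assumption: SP starts 16-byte aligned
    return region, guard_hi, initial_sp


def hits_guard(addr, region, guard=GUARD):
    """True if `addr` falls inside the PROT_NONE guard range."""
    return region <= addr < region + guard


# Example: the stack grows down from initial_sp; the first access below
# guard_hi lands in the guard and faults instead of touching a neighbor.
lo, hi, sp = stack_layout(0x1_0000_0000, 64 * 1024)
```

Because the stack grows toward lower addresses, an overflow leaves the usable range at `guard_hi` (the entry's `region+GUARD`), so the fault is raised at that boundary.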