plan: correct grounded errors + harden async streams (post-metatype review)

Fold the adversarial-review corrections into the program plan + design-of-record:
- atomics is 100% net-new (no scaffolding; lower.zig 'ordering' is comparison-only)
- context is already an implicit *Context param (not TLS) — B1.1 rescoped
- abi(.pure) exists but is inert (no naked emission) — B1.0 rescoped
- B1.3 switch-stress harness is the first deliverable + mandatory stack guards
- Stream C gated on a named TSan/ASan + run-N stress harness, not a footnote
agra
2026-06-20 08:47:07 +03:00
parent f81d101fae
commit ad1687c692
2 changed files with 87 additions and 32 deletions

@@ -34,8 +34,12 @@ earlier if FFI/`#compiler`-collapse becomes a priority).
## Stream A — ATOMICS (N1) · `PLAN-ATOMICS.md` when carved
**Goal:** LLVM atomic codegen — the net-new emit primitive. Surface = `Atomic($T)`
wrapper + `Ordering` enum (locked, design §4.6). Some IR/inference scaffolding exists;
**lowering is absent**.
wrapper + `Ordering` enum (locked, design §4.6). **Grounding correction: this is 100%
net-new — there is NO atomics scaffolding.** `Atomic`/`Ordering` exist nowhere in
`library/` (the only `thread.sx` hit is the word "Atomically" in a comment), and the
only "ordering" in `lower.zig:1400-1418` is **comparison** ordering (`< <= > >=`),
entirely unrelated to memory ordering — do not mistake it for groundwork. A.0 must
build the type, the IR op, inference, AND lowering from zero.
**Phases:**
- A.0 `Atomic($T)` + `Ordering` lib types + `load`/`store` → LLVM `load atomic`/`store
@@ -57,24 +61,50 @@ The colorblind, stackful, pure-sx async runtime (design §4). Compiler floor is
the runtime is sx lib. Likely carved as two PLANs:
### B1 — Fibers + Io + M:1 (the runtime; `PLAN-FIBERS.md`)
- B1.0 **`callconv(.naked)`** — extend `CallConv {default, c}` (types.zig:169) + skip
prologue/epilogue lowering. (Net-new; gates the context-switch.)
- B1.1 **Repointable-`context` codegen** — lower `context` as a swappable indirection
(never raw TLS) + per-fiber stack-limit. **Prerequisite of B1.3, not a successor.**
- B1.0 **`abi(.naked)` — make the EXISTING `.pure` ABI actually naked.** The enum
already carries `.pure` (ast.zig:142, documented "pure/naked, no prologue/epilogue"),
but it is an **inert label today**: `type_resolver.zig:237` maps `.pure → .default`
CC and there is **zero naked-attribute emission in emit_llvm**. So B1.0 is NOT
"extend the enum" (done) — it is "emit the LLVM `naked` attr + skip prologue/epilogue
lowering for `.pure`," genuinely net-new. (Roadmap §7-step-4's "extend
`CallConv {default, c}`" is stale — CallConv was renamed ABI and already gained
`compiler`/`pure` in the compiler-API stream.) Gates the context-switch.
- B1.1 **Per-fiber `context` root + `push Context`-stack storage.** Grounding correction:
`context` is **already an implicit `*Context` parameter** (comptime_vm.zig:392,
lower.zig:257 "Implicit Context parameter machinery"), **not raw TLS** — so it already
rides the fiber stack and the design doc's "lower as swappable indirection, never raw
TLS" guards a non-problem. The **real, currently-unsized** scope is: (a) where a
freshly-spawned fiber's *root* `Context` comes from, and (b) where the `push Context`
stack frames live (if on the caller stack, fiber-local for free; if a global root,
that root must become per-fiber). **Ground the current mechanism FIRST** — B1.1's size
is unknown until then, and it may be much smaller than the prior "M" estimate.
**Prerequisite of B1.3, not a successor.**
- B1.2 **A1 — `Io` interface + `context.io` + `Future` + `cancel()` API** (protocol/
vtable threaded like `Allocator`).
- B1.3 **A2 — fiber runtime**: `callconv(.naked)` context-switch asm (per-arch),
bootstrap, `mmap` stacks. **sx lib, not a compiler builtin** (design §4 A2).
- B1.3 **A2 — fiber runtime**: `abi(.naked)` context-switch asm (per-arch), bootstrap,
`mmap` stacks **with mandatory guard pages** (NOT optional — a fixed-stack fiber that
overflows without a guard corrupts adjacent fiber memory silently; §8.1.1). **sx lib,
not a compiler builtin** (design §4 A2). **First deliverable of B1.3, before the
scheduler AND before the deterministic `Io`: a standalone 2-fiber ping-pong
switch-stress harness** (scribble every callee-saved reg + a stack canary before each
suspend, deep/recursive fiber chains, verify all survive post-resume — §10.7). It
needs no scheduler and is the *only* gate that catches a one-register slip; A2 is
untestable by the deterministic-`Io` harness (which tests *scheduling*, not the
*switch*), so this harness — not B1.4 — is A2's correctness gate.
- B1.4 **A3 — `Io` impls: blocking → deterministic-sim (KEYSTONE) → event-loop**
(kqueue/epoll/io_uring). Build the deterministic `Io` *before* the event loop — it
is the test harness (§10.1).
is the test harness for *scheduling* (§10.1). (Note: the **event loop does not yet
cooperate with a platform UI run loop** — CFRunLoop/NSRunLoop/ALooper; pinning gives
thread-affinity, not run-loop integration. Tracked as an open design gap for the §6
app targets, deferred out of B1.)
- B1.5 **A5·M:1 scheduler** — validates the whole colorblind stack end-to-end.
**Gates:** deterministic-`Io` **calibrated** against blocking `Io` (don't trust an
uncalibrated oracle — §8.1.3); corpus `18xx` under deterministic `Io`; **A2
switch-stress test** (scribble every callee-saved reg + canary, deep fiber chains,
verify post-resume — §10.7) + arch-gated run tests. A2 is the highest-corruption-risk
piece (§8.1.1).
**Gates:** the **B1.3 switch-stress harness is A2's gate** (register/canary survival,
not run/snapshot — §8.1.1, §10.7) + arch-gated run tests; deterministic-`Io`
**calibrated** against blocking `Io` (don't trust an uncalibrated oracle — §8.1.3);
corpus `18xx` under deterministic `Io` asserts a program-emitted **ordering contract**
(sequence markers), not raw interleaving, so scheduler-internal policy changes don't
churn every snapshot.
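
A minimal sketch of what asserting an ordering contract, rather than a raw interleaving, could look like. It is written in Python instead of sx, and the marker names and the `satisfies_contract` helper are hypothetical, not part of the corpus tooling. The check only requires that each declared "earlier" marker precedes its "later" marker, so two different schedules can pass the same contract.

```python
def satisfies_contract(log, happens_before):
    """Return True if every (earlier, later) pair appears in that order in log.

    log: list of marker strings emitted by the program under test.
    happens_before: iterable of (earlier, later) marker pairs (hypothetical contract).
    Markers are assumed to be unique within one run.
    """
    position = {marker: index for index, marker in enumerate(log)}
    for earlier, later in happens_before:
        if earlier not in position or later not in position:
            return False  # a required marker was never emitted
        if position[earlier] >= position[later]:
            return False  # markers emitted in the wrong order
    return True


# Two schedules that differ in raw interleaving but honor the same contract,
# plus one schedule that violates it.
contract = [("a:send", "b:recv"), ("b:recv", "b:done")]
run_fifo = ["a:send", "a:done", "b:recv", "b:done"]
run_lifo = ["a:send", "b:recv", "b:done", "a:done"]
bad_run = ["b:recv", "a:send", "a:done", "b:done"]
```

A snapshot of `run_fifo` would churn if the scheduler switched to the `run_lifo` policy; the contract check accepts both and still rejects `bad_run`.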
### B2 — Channels + cancellation + stdlib (`PLAN-CHANNELS.md`)
- B2.0 **N3 — channels** (`Channel($T)`; `recv → RecvResult($T)` tagged union built via
@@ -96,9 +126,19 @@ ordering; `RecvResult` exercises the metatype primitives.
- C.1 **M:N** — work-stealing (thread-safe steal queues + migration); **pinning** API
(`pin = .main | .any | .on(thread)`). M:N is **committed, not deferred** — just last.
**Gates:** data races aren't snapshottable → **stress harness** (run-N / TSan-style),
*loudly* out of corpus scope (§10.2). **Named `context`-fiber-local + errno migration
test** (M:1 can't exercise migration — §10.7).
**Gates:** data races aren't snapshottable, but "out of corpus scope" is **not** "no
plan" — Stream C is **blocked on a concrete, named stress harness landing FIRST** (a
gating artifact carved into `PLAN-PARALLEL.md`, not a footnote):
1. **Sanitizer build** — a `zig build`-integrated TSan (and ASan) variant of the
concurrency corpus; CI runs `18xx`/parallel examples under it.
2. **Run-N driver** — each parallel example executed N times (configurable, default
≥100) with interleaving perturbation (randomized ready-queue / yield injection); any
nondeterministic divergence or sanitizer report fails the build.
3. **Coverage-bound `log()`** — the harness emits, loudly, exactly which guarantees it
does and does NOT cover (per the REJECTED-PATTERNS rule against silent gaps).
This harness is the **only** correctness story for N×(M:1)/M:N; C.0/C.1 do not start
until it exists and is calibrated. Plus the **named `context`-fiber-local + errno
migration test** (M:1 can't exercise migration — §10.7).
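
The run-N driver in item 2 can be sketched as follows. This is a model under stated assumptions, not the harness itself: fibers are Python generators rather than sx fibers, the perturbation is a seeded random pick from the ready queue, and every name (`run_once`, `run_n`, `racy_increment`, `safe_increment`) is hypothetical.

```python
import random


def run_once(make_fibers, seed):
    """Run fibers to completion, picking the next ready fiber at random (seeded)."""
    rng = random.Random(seed)
    state = {"counter": 0}
    ready = [make(state) for make in make_fibers]
    while ready:
        fiber = ready.pop(rng.randrange(len(ready)))  # randomized ready-queue
        try:
            next(fiber)          # resume until the fiber's next yield point
            ready.append(fiber)  # still runnable
        except StopIteration:
            pass                 # fiber finished
    return state["counter"]


def run_n(make_fibers, n=100):
    """Return the set of distinct results observed over n perturbed runs."""
    return {run_once(make_fibers, seed) for seed in range(n)}


def racy_increment(state):
    for _ in range(3):
        seen = state["counter"]
        yield                    # injected yield between read and write
        state["counter"] = seen + 1


def safe_increment(state):
    for _ in range(3):
        state["counter"] += 1    # read-modify-write with no yield in between
        yield
```

A build gate in this style would fail whenever `run_n` returns more than one distinct result. Seeding each run keeps any failure reproducible from its seed.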
---
@@ -143,14 +183,20 @@ module hazard; S2 TLS + C-constructor JIT test per host OS (the exact prior-spik
## Cross-cutting (applies across streams)
- **Testing keystone:** the deterministic-sim `Io` (B1.4) must exist + be calibrated
before *any* async test is trusted (§10.1).
- **Top risks to watch (§8.1):** A2 context-switch correctness (B1.3), minted-enum →
match codegen (de-risked, metatype stream), deterministic-`Io` oracle calibration,
`context`-fiber-local/errno (C), S2 (E), C1 args-buffer layout (D).
- **The compiler floor stays small, but deep:** atomics, `callconv(.naked)`, repointable-
`context` codegen, `declare`/`define`/`type_info` (metatype stream), the S1 JIT spine.
Everything else — schedulers, fibers, channels, the bundler — is sx lib.
- **Testing keystone:** the deterministic-sim `Io` (B1.4) gates *scheduling* tests
(§10.1); the **B1.3 switch-stress harness gates the context-switch** (the one piece
the deterministic `Io` can't test). Both must exist + be calibrated before the async
tests they gate are trusted.
- **Top risks to watch (§8.1):** A2 context-switch correctness (B1.3 — gated by its own
stress harness, not the deterministic `Io`), minted-enum → match codegen (de-risked,
metatype stream), deterministic-`Io` oracle calibration, `context`-fiber-local/errno
(C — gated by the named stress harness), S2 (E), C1 args-buffer layout (D).
- **The compiler floor stays small, but deep — net-new pieces, grounded:** atomics
(100% net-new, no scaffolding), making `abi(.pure)` actually naked (the enum variant
exists but is inert today), per-fiber `context` root + push-stack storage (`context`
is already an implicit param, NOT TLS — so this is smaller/different than "repointable
codegen" implied), `declare`/`define`/`type_info` (metatype stream — **done**), the
S1 JIT spine. Everything else — schedulers, fibers, channels, the bundler — is sx lib.
## Carving protocol

@@ -74,7 +74,7 @@ is `<host>`").
| ID | Piece | State | Size |
|----|-------|-------|------|
| **N1** | **Atomics — NET-NEW compiler feature.** Atomic load/store/RMW (`add/sub/and/or/xor/swap` + `fetch_min`/`fetch_max`; no `nand`), `compare_exchange`/`_weak` (→ `?T`, **null = success**), and fences, with orderings (relaxed/acquire/release/acq_rel/seq_cst). LLVM provides all — an **emit** feature, not a runtime library. **Surface LOCKED = `Atomic($T)` wrapper + `Ordering` enum** (not `@atomic_*` — `@` is address-of in sx). | **lowering absent** — zero LLVM `atomicrmw`/`cmpxchg`/`fence` emission today; some IR/inference scaffolding exists | M |
| **N1** | **Atomics — NET-NEW compiler feature.** Atomic load/store/RMW (`add/sub/and/or/xor/swap` + `fetch_min`/`fetch_max`; no `nand`), `compare_exchange`/`_weak` (→ `?T`, **null = success**), and fences, with orderings (relaxed/acquire/release/acq_rel/seq_cst). LLVM provides all — an **emit** feature, not a runtime library. **Surface LOCKED = `Atomic($T)` wrapper + `Ordering` enum** (not `@atomic_*` — `@` is address-of in sx). | **fully net-new** — zero LLVM `atomicrmw`/`cmpxchg`/`fence` emission **and no atomics scaffolding**: `Atomic`/`Ordering` exist nowhere in `library/`, and the only "ordering" in `lower.zig:1400` is *comparison* ordering (`< <= > >=`), unrelated to memory ordering | M |
| **N2** | **OS threads + pthread Mutex/Cond + worker Pool** | **landed** — [std/thread.sx](../library/modules/std/thread.sx) (`pthread_create`/`join`/`detach`, in-place `Mutex`/`Cond`, bounded `Pool`). NOTE: pthread mutex **blocks the OS thread** — it is *not* fiber-aware (it would park every fiber on that thread); fiber-aware sync is N3, built on N1. | — |
| **N3** | **Fiber-aware sync** — mutex / channel / waitgroup that **suspend the fiber**, not the OS thread. Hybrid: atomic fast-path (N1) + fiber-suspend slow-path (A2/A5). Distinct from the pthread primitives in N2. | new library | M |
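
The `compare_exchange` convention in the N1 row (`?T`, null = success) can be modeled as below. This is a lock-based Python model for illustration only. It says nothing about the eventual LLVM lowering, it ignores memory orderings entirely, and `AtomicModel` and this `fetch_max` helper are hypothetical names. The plan lists `fetch_max` as a native RMW; it is built on CAS here only to show the retry loop.

```python
import threading


class AtomicModel:
    """Lock-based model of an atomic cell; NOT how the LLVM lowering works."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, desired):
        """Return None on success, or the value actually observed on failure."""
        with self._lock:
            if self._value == expected:
                self._value = desired
                return None
            return self._value


def fetch_max(cell, candidate):
    """CAS retry loop: raise the cell to candidate if larger; return the old value."""
    current = cell.load()
    while candidate > current:
        observed = cell.compare_exchange(current, candidate)
        if observed is None:      # null = success: the swap happened
            return current
        current = observed        # failure: retry against the observed value
    return current
```

Returning the observed value on failure lets the loop retry without a second `load`.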
@@ -99,7 +99,7 @@ suspends is decided by the `Io` *implementation*, transparently.
| ID | Piece | Notes | Size |
|----|-------|-------|------|
| **A1** | **`Io` interface + `context.io`** — a protocol/vtable threaded like `Allocator`. `io.async(fn,args) → Future`, `future.await`, cancellation. | leverages protocols + context | M |
| **A2** | **Stackful coroutine runtime — in sx lib, NOT a compiler builtin.** The context-switch is a `callconv(.naked)` sx fn with an inline-asm body (save callee-saved + SP/LR into `*from`, load from `*to`, `ret`); fiber bootstrap + stack alloc (`mmap`+guard via `extern`) also sx. The **compiler's** job is only (a) the general primitives — inline asm, `callconv(.naked)`, atomics — and (b) **fiber-safe codegen**: `context` lowered as a *repointable indirection* (never raw TLS) so the switch can repoint it, and stack-limit guards (if emitted) read from a swappable per-fiber location. Most arch-delicate sx in the tree (must match the platform callee-saved set + the compiler ABI), but it's inspectable sx, not a black box. | per-arch, arch-gated; co-validate vs codegen | M |
| **A2** | **Stackful coroutine runtime — in sx lib, NOT a compiler builtin.** The context-switch is an `abi(.naked)` sx fn with an inline-asm body (save callee-saved + SP/LR into `*from`, load from `*to`, `ret`); fiber bootstrap + stack alloc (`mmap`+guard via `extern`) also sx. The **compiler's** job is only (a) the general primitives — inline asm, `abi(.naked)`, atomics — and (b) **fiber-safe codegen**: `context` is **already an implicit `*Context` param** (not TLS — see §7 step 5), so the switch repoints it for free by swapping the per-fiber root; the open work is the per-fiber root + push-stack storage, and stack-limit guards (**mandatory, not optional** — fixed mmap stacks without a guard corrupt neighbors silently) reading from a swappable per-fiber location. Most arch-delicate sx in the tree (must match the platform callee-saved set + the compiler ABI), but it's inspectable sx, not a black box. | per-arch, arch-gated; co-validate vs codegen | M |
| **A3** | **Event-loop `Io` impls** — kqueue / epoll / io_uring drive readiness, then the (now-ready) syscall via C1. Plus a trivial **blocking `Io`**. | pure sx around syscall `extern`s | L |
| **A4** | **Stdlib I/O rework** — fs/socket/process take/use `context.io` instead of raw blocking syscalls, so existing calls participate in async. | mirrors the allocator-threading rule | M |
| **A5** | **Schedulers — M:1 → N×(M:1) → M:N, all sx std-lib `Io` vtables (committed; M:N last, not deferred).** M:1 first (minimal vehicle to validate the colorblind stack; covers I/O-bound). N×(M:1) = first parallel step (per-thread M:1 loops + `std/thread.sx` spawn; shared state uses N1 atomics — expected under parallelism, not a wart). M:N work-stealing last (most machinery: thread-safe steal queues + migration + errno/TLS discipline). All over N1 atomics + the A2 asm context-switch + `extern` syscalls. **pinning** API for thread-affine work (UI main thread, GL context). | see §4.3 | M (M:1) / M (N×M:1) / L (M:N) |
@@ -395,12 +395,21 @@ grounding) are explicit steps, not buried.
construct `TypeInfo` programmatically + `intern()`. **Residual = plumbing, not
capability:** name minted results by the instantiation's mangled name + input
validation.
4. **`callconv(.naked)`** — extend `CallConv {default, c}` (types.zig:169) + skip
4. **`abi(.naked)`** — *correction:* `CallConv` was renamed `ABI` and **already carries
`.pure`** (ast.zig:142, "pure/naked, no prologue/epilogue") during the compiler-API
stream — so this is NOT "extend the enum." `.pure` is an **inert label today**:
`type_resolver.zig:237` maps it to `.default` CC and emit_llvm emits **no** naked
attribute. The net-new work is making `.pure` actually emit LLVM `naked` + skip
prologue/epilogue lowering. Gates A2.
5. **Repointable-`context` codegen** — lower `context` as a swappable indirection
(never raw TLS) + per-fiber stack-limit. Compiler obligation; gates A2 *and*
cross-fiber `context.io` correctness. (Reviewer note: this is a **prerequisite**
of A2, not a successor.)
5. **Per-fiber `context` root + push-stack storage** — *correction:* `context` is
**already an implicit `*Context` parameter** (comptime_vm.zig:392, lower.zig:257
"Implicit Context parameter machinery"), **not raw TLS** — so the "lower as swappable
indirection, never raw TLS" framing guards a non-problem; it already rides the fiber
stack. The real, **currently-unsized** obligation is (a) where a freshly-spawned
fiber's *root* `Context` comes from and (b) where `push Context` frames live (caller
stack ⇒ fiber-local for free; a global root ⇒ must become per-fiber) + per-fiber
stack-limit. **Ground the current mechanism before sizing this.** Prerequisite of
A2, not a successor.
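
A sketch of why an implicit-parameter `context` rides the fiber stack, modeled in Python with generators as fibers. Everything here is an assumption for illustration: the `Context` class, its fields, and the choice to have the spawner hand each fiber its root explicitly (one possible answer to question (a), not the grounded mechanism).

```python
class Context:
    """Hypothetical stand-in for sx's Context; fields are illustrative."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def inner(ctx, log):
    log.append(ctx.name)
    yield                                   # suspend while the pushed context is live


def worker(ctx, log):
    # `ctx` models the implicit *Context parameter: it lives in this
    # fiber's own frames, so a switch never has to repoint a global.
    log.append(ctx.name)
    yield                                   # suspend; another fiber runs
    pushed = Context(ctx.name + "/pushed", parent=ctx)  # models `push Context`
    yield from inner(pushed, log)
    log.append(ctx.name)                    # outer context is back in scope


def run_round_robin(fibers):
    """Resume fibers in turn until all have finished."""
    while fibers:
        fiber = fibers.pop(0)
        try:
            next(fiber)
            fibers.append(fiber)
        except StopIteration:
            pass


log = []
run_round_robin([worker(Context("root-a"), log), worker(Context("root-b"), log)])
```

Each fiber keeps seeing its own root and its own pushed frame across interleaved suspends, with no shared pointer swapped at switch time. If the root were instead a single global, the two fibers would need per-fiber storage for it, which is the case (b) that item 5 asks to ground first.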
**Async runtime — sx lib over the primitives:**
6. **A1 — `Io` interface + `context.io` + `Future` + `cancel()` API.**