Fold the adversarial-review corrections into the program plan + design-of-record:
- atomics is 100% net-new (no scaffolding; lower.zig 'ordering' is comparison-only)
- context is already an implicit *Context param (not TLS) — B1.1 rescoped
- abi(.pure) exists but is inert (no naked emission) — B1.0 rescoped
- B1.3 switch-stress harness is the first deliverable + mandatory stack guards
- Stream C gated on a named TSan/ASan + run-N stress harness, not a footnote
PLAN-POST-METATYPE — program plan for the async-first roadmap (everything after metatype)
Sequences every remaining stream after PLAN-METATYPE.md. This is the
program-level plan; each stream below is carved into its own
PLAN-<STREAM>.md + CHECKPOINT-<STREAM>.md (full step detail + kickoff prompt)
when reached, exactly as metatype was. Rationale, the comptime type-construction
design, risk ranking (§8.1), and the testing strategy (§10) all live in the design-of-record:
../design/execution-evolution-roadmap.md.
Cadence (IMPASSABLE), every stream: no commit both adds a test AND makes it pass
(lock, or xfail→green); zig build && zig build test green after every step; never
regenerate snapshots while red. On an unrelated compiler bug → file issues/NNNN,
mark the stream checkpoint BLOCKED, stop (CLAUDE.md rule).
Ordering = async-first (design §7): the async story needs no JIT spine, so the
JIT/FFI cluster comes after. New corpus categories: 17xx atomics, 18xx concurrency.
Stream order (post-metatype)
| # | Stream | Roadmap steps | Depends on | Notes |
|---|---|---|---|---|
| A | Atomics | N1 (1) | — | independent foundation; gates B-parallel + channels |
| B | Async runtime | 4–12 | metatype, A (for channels) | the bulk; likely splits into B1 (runtime) + B2 (channels/cancel/stdlib) when carved |
| C | Parallel schedulers | 13–14 | A, B | N×(M:1) → M:N |
| D | Comptime JIT/FFI | 15–18 | — (independent of async) | S1 → C1 → C2 → C3 |
| E | Hot-reload (deferred) | 19–22 | D (S1/S2) | S2 → R1 → R2 → R3 |
A and D are independent of each other and of B's core; B is the spine of the async
story. Recommended execution order: A → B → C → D → E (async-first; D can slot
earlier if FFI/#compiler-collapse becomes a priority).
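The ordering claims above can be checked mechanically against the "Depends on" column. The following is a minimal Python sketch, not project tooling: the `DEPENDS_ON` map and `is_valid_order` helper are hypothetical names, metatype is treated as already complete, and B's dependency on A is treated as a hard edge even though the table scopes it to channels.

```python
# Dependencies copied from the "Depends on" column of the stream table.
# Assumption: metatype is already complete, so it is not listed as an edge.
DEPENDS_ON = {
    "A": set(),
    "B": {"A"},        # the table scopes this to channels; modeled as a hard edge
    "C": {"A", "B"},
    "D": set(),
    "E": {"D"},
}

def is_valid_order(order):
    """Return True if every stream appears after all of its dependencies."""
    done = set()
    for stream in order:
        if not DEPENDS_ON[stream] <= done:
            return False
        done.add(stream)
    return done == set(DEPENDS_ON)

print(is_valid_order("ABCDE"))  # recommended async-first order: True
print(is_valid_order("DABCE"))  # D slotted earlier: True
print(is_valid_order("ACBDE"))  # C before B violates the table: False
```

Both the recommended order and the "D first" variant satisfy the table, which is what the paragraph above asserts.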
Stream A — ATOMICS (N1) · PLAN-ATOMICS.md when carved
Goal: LLVM atomic codegen — the net-new emit primitive. Surface = Atomic($T)
wrapper + Ordering enum (locked, design §4.6). Grounding correction: this is 100%
net-new — there is NO atomics scaffolding. Atomic/Ordering exist nowhere in
library/ (the only thread.sx hit is the word "Atomically" in a comment), and the
only "ordering" in lower.zig:1400-1418 is comparison ordering (< <= > >=),
entirely unrelated to memory ordering — do not mistake it for groundwork. A.0 must
build the type, the IR op, inference, AND lowering from zero.
Phases:
- A.0 Atomic($T) + Ordering lib types + load/store → LLVM load atomic/store atomic with orderings.
- A.1 RMW: fetch_add/sub/and/or/xor + fetch_min/max → atomicrmw (no nand).
- A.2 compare_exchange/_weak → cmpxchg (returns ?T, null = success).
- A.3 swap + fence(.ordering).
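The return conventions in A.1 and A.2 are easy to invert, so here is a small Python model of them. It is a sketch under stated assumptions, not the sx surface: `AtomicModel` is a hypothetical stand-in for Atomic($T), a lock stands in for hardware atomicity, memory orderings are not modeled at all, and the non-null payload of compare_exchange being the observed value is an assumption (the plan only states that null means success). fetch_add returning the previous value mirrors LLVM atomicrmw.

```python
import threading

class AtomicModel:
    """Hypothetical stand-in for Atomic($T); a lock models atomicity."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def fetch_add(self, delta):
        # Like LLVM atomicrmw add: returns the value held BEFORE the update.
        with self._lock:
            old = self._value
            self._value = old + delta
            return old

    def compare_exchange(self, expected, desired):
        # Plan convention: None (null) means the exchange succeeded.
        # Assumption: on failure the observed value is returned.
        with self._lock:
            if self._value == expected:
                self._value = desired
                return None
            return self._value

def increment_with_cas(cell):
    """CAS retry loop built on the ?T return convention."""
    expected = cell.load()
    while True:
        observed = cell.compare_exchange(expected, expected + 1)
        if observed is None:
            return expected + 1
        expected = observed  # lost the race; retry with the fresh value

def hammer(cell, times):
    for _ in range(times):
        increment_with_cas(cell)

counter = AtomicModel(0)
threads = [threading.Thread(target=hammer, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.load())  # 4000
```

A model like this says nothing about weak-memory behavior, which matches the plan's own caveat that ordering semantics are out of snapshot scope.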
Gates: unit emit_llvm.test.zig (correct op + ordering emission); corpus 17xx
single-thread (deterministic); arch-gated x86_64 + aarch64 .ir (orderings lower
differently — x86 vs LL/SC). Out of snapshot scope, state loudly: ordering
semantics under weak memory (.ir proves the keyword emitted, not correctness).
Stream B — ASYNC RUNTIME (steps 4–12) · splits into PLAN-FIBERS.md + PLAN-CHANNELS.md
The colorblind, stackful, pure-sx async runtime (design §4). Compiler floor is small; the runtime is sx lib. Likely carved as two PLANs:
B1 — Fibers + Io + M:1 (the runtime; PLAN-FIBERS.md)
- B1.0 abi(.naked) — make the EXISTING .pure ABI actually naked. The enum already carries .pure (ast.zig:142, documented "pure/naked, no prologue/epilogue"), but it is an inert label today: type_resolver.zig:237 maps .pure → .default CC and there is zero naked-attribute emission in emit_llvm. So B1.0 is NOT "extend the enum" (done) — it is "emit the LLVM naked attr + skip prologue/epilogue lowering for .pure," genuinely net-new. (Roadmap §7-step-4's "extend CallConv {default, c}" is stale — CallConv was renamed ABI and already gained compiler/pure in the compiler-API stream.) Gates the context-switch.
- B1.1 Per-fiber context root + push Context-stack storage. Grounding correction: context is already an implicit *Context parameter (comptime_vm.zig:392, lower.zig:257 "Implicit Context parameter machinery"), not raw TLS — so it already rides the fiber stack and the design doc's "lower as swappable indirection, never raw TLS" guards a non-problem. The real, currently-unsized scope is: (a) where a freshly-spawned fiber's root Context comes from, and (b) where the push Context stack frames live (if on the caller stack, fiber-local for free; if a global root, that root must become per-fiber). Ground the current mechanism FIRST — B1.1's size is unknown until then, and it may be much smaller than the prior "M" estimate. Prerequisite of B1.3, not a successor.
- B1.2 A1 — Io interface + context.io + Future + cancel() API (protocol/vtable threaded like Allocator).
- B1.3 A2 — fiber runtime: abi(.naked) context-switch asm (per-arch), bootstrap, mmap stacks with mandatory guard pages (NOT optional — a fixed-stack fiber that overflows without a guard corrupts adjacent fiber memory silently; §8.1.1). sx lib, not a compiler builtin (design §4 A2). First deliverable of B1.3, before the scheduler AND before the deterministic Io: a standalone 2-fiber ping-pong switch-stress harness (scribble every callee-saved reg + a stack canary before each suspend, deep/recursive fiber chains, verify all survive post-resume — §10.7). It needs no scheduler and is the only gate that catches a one-register slip; A2 is untestable by the deterministic-Io harness (which tests scheduling, not the switch), so this harness — not B1.4 — is A2's correctness gate.
- B1.4 A3 — Io impls: blocking → deterministic-sim (KEYSTONE) → event-loop (kqueue/epoll/io_uring). Build the deterministic Io before the event loop — it is the test harness for scheduling (§10.1). (Note: the event loop does not yet cooperate with a platform UI run loop — CFRunLoop/NSRunLoop/ALooper; pinning gives thread-affinity, not run-loop integration. Tracked as an open design gap for the §6 app targets, deferred out of B1.)
- B1.5 A5·M:1 scheduler — validates the whole colorblind stack end-to-end.
Gates: the B1.3 switch-stress harness is A2's gate (register/canary survival,
not run/snapshot — §8.1.1, §10.7) + arch-gated run tests; deterministic-Io
calibrated against blocking Io (don't trust an uncalibrated oracle — §8.1.3);
corpus 18xx under deterministic Io asserts a program-emitted ordering contract
(sequence markers), not raw interleaving, so scheduler-internal policy changes don't
churn every snapshot.
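The difference between asserting an ordering contract and snapshotting raw interleaving can be illustrated with a toy model. This is a Python sketch only: generators stand in for fibers, and `producer`, `consumer`, `run`, and `contract_holds` are hypothetical names, not part of the sx runtime or the 18xx corpus.

```python
import random

def producer(log, box):
    log.append("producer:start")
    yield                      # suspension point
    box.append(42)
    log.append("producer:sent")

def consumer(log, box):
    log.append("consumer:start")
    while not box:             # wait cooperatively until the value exists
        yield
    log.append(f"consumer:got:{box[0]}")

def run(pick, reverse=False):
    """Run both fibers to completion; `pick` chooses the next ready fiber."""
    log, box = [], []
    ready = [producer(log, box), consumer(log, box)]
    if reverse:
        ready.reverse()        # a different scheduler-internal start order
    while ready:
        fiber = ready.pop(pick(len(ready)))
        try:
            next(fiber)
            ready.append(fiber)
        except StopIteration:
            pass
    return log

def contract_holds(log):
    """The program-emitted contract: sent happens before got."""
    return log.index("producer:sent") < log.index("consumer:got:42")

fifo = run(lambda n: 0)
swapped = run(lambda n: 0, reverse=True)
rng = random.Random(7)
seeded = run(lambda n: rng.randrange(n))

print(fifo != swapped)   # True: the raw interleaving differs between policies
print(all(contract_holds(log) for log in (fifo, swapped, seeded)))  # True
```

A snapshot of the raw log would churn when the start order changes, while the contract check stays green under all three policies.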
B2 — Channels + cancellation + stdlib (PLAN-CHANNELS.md)
- B2.0 N3 — channels (Channel($T); recv → RecvResult($T) tagged union built via metatype type-fn) + fiber-aware Mutex/WaitGroup (atomic fast-path from A).
- B2.1 A6 — cancellation = .canceled in the existing ! channel (model a); per-fiber atomic flag (A); every io.* a cancellation point; structured cancel-and-join; masked during cleanup. Rides ERR (try/onfail/defer).
- B2.2 A4 — stdlib I/O rework — fs/socket/process onto context.io.
Gates: 18xx under deterministic Io; cancellation cleanup asserted via stdout
ordering; RecvResult exercises the metatype primitives.
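What "cancellation cleanup asserted via stdout ordering" means can be shown with a small model. This is a hedged Python sketch, not the sx mechanism: generators model fibers, `try/finally` models defer, a `yield` models an io.* cancellation point, and `Canceled`, `worker`, and `cancel_and_join` are hypothetical names. Masking during cleanup is not modeled.

```python
class Canceled(Exception):
    """Hypothetical stand-in for the .canceled error value."""

def worker(name, out):
    out.append(f"{name}:start")
    try:
        while True:
            yield                      # models an io.* call: a cancellation point
    except Canceled:
        out.append(f"{name}:canceled")
        raise
    finally:
        out.append(f"{name}:cleanup")  # models defer; runs before the join returns

def cancel_and_join(fibers, out):
    """Deliver Canceled to each child and return only after all have unwound."""
    for fiber in fibers:
        try:
            fiber.throw(Canceled())
        except Canceled:
            pass                       # child finished unwinding
    out.append("parent:joined")

out = []
children = [worker("a", out), worker("b", out)]
for child in children:
    next(child)                        # run each child to its first suspension
cancel_and_join(children, out)
print("\n".join(out))
```

The assertion a corpus test would make is on the emitted order: every child's cleanup marker appears before the parent's joined marker.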
Stream C — PARALLEL SCHEDULERS (steps 13–14) · PLAN-PARALLEL.md
- C.0 N×(M:1) — per-thread M:1 loops + std/thread.sx spawn; shared state uses A atomics; errno-capture discipline + context-fiber-local become mandatory.
- C.1 M:N — work-stealing (thread-safe steal queues + migration); pinning API (pin = .main | .any | .on(thread)). M:N is committed, not deferred — just last.
Gates: data races aren't snapshottable, but "out of corpus scope" is not "no
plan" — Stream C is blocked on a concrete, named stress harness landing FIRST (a
gating artifact carved into PLAN-PARALLEL.md, not a footnote):
- Sanitizer build — a zig build-integrated TSan (and ASan) variant of the concurrency corpus; CI runs 18xx/parallel examples under it.
- Run-N driver — each parallel example executed N times (configurable, default ≥100) with interleaving perturbation (randomized ready-queue / yield injection); any nondeterministic divergence or sanitizer report fails the build.
- Coverage-bound log() — the harness emits, loudly, exactly which guarantees it does and does NOT cover (per the REJECTED-PATTERNS rule against silent gaps).
This harness is the only correctness story for N×(M:1)/M:N; C.0/C.1 do not start until it exists and is calibrated. Plus the named context-fiber-local + errno migration test (M:1 can't exercise migration — §10.7).
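The Run-N idea (same example, many perturbed schedules, fail on any divergence) can be sketched as follows. This is an illustrative Python model only: cooperative generators stand in for parallel tasks, and `schedules`, `run_once`, `run_n`, `racy_increment`, and `safe_increment` are hypothetical names. It does not model real threads, and it does not replace the TSan/ASan build.

```python
import random

def racy_increment(state):
    tmp = state["n"]       # read
    yield                  # perturbation point between read and write
    state["n"] = tmp + 1   # write back a possibly stale value

def safe_increment(state):
    yield
    state["n"] += 1        # read-modify-write with no suspension inside

def schedules(n):
    """Yield n pick-functions: two fixed extremes, then seeded random ones."""
    yield lambda size: 0            # FIFO: maximal interleaving
    yield lambda size: size - 1     # LIFO: each task runs to completion
    for seed in range(n - 2):
        rng = random.Random(seed)
        yield lambda size, rng=rng: rng.randrange(size)

def run_once(task_fn, pick, tasks=4):
    """Run `tasks` cooperative tasks; `pick` selects from the ready queue."""
    state = {"n": 0}
    ready = [task_fn(state) for _ in range(tasks)]
    while ready:
        task = ready.pop(pick(len(ready)))
        try:
            next(task)
            ready.append(task)
        except StopIteration:
            pass
    return state["n"]

def run_n(task_fn, n=100):
    """Return the set of distinct results; more than one means divergence."""
    return {run_once(task_fn, pick) for pick in schedules(n)}

print(run_n(safe_increment))            # {4}: no divergence, passes
print(len(run_n(racy_increment)) > 1)   # True: divergence, fails the build
```

Including the two fixed extreme schedules guarantees the racy example diverges (FIFO yields 1, LIFO yields 4) without relying on what the seeded schedules happen to produce.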
Stream D — COMPTIME JIT / FFI (steps 15–18) · PLAN-JIT.md
Independent of async; can move earlier if #compiler→extern / bundler cleanup is
prioritized.
- D.0 S1 — persistent JIT executor (long-lived ORC LLJIT + host-triple emitter + fragment cache, plumbed into the interp). Foundational for C1/C3.
- D.1 C1 — real comptime FFI = LLVM single ABI authority (per-signature JIT calling-thunks via S1 + trampoline fast-path). Adversarial layout cases (over-aligned/empty structs, aarch64 small-struct split, bool — §8.1.6).
- D.2 C2 — #compiler→extern collapse (hooks → exported C symbols via C1; delete compiler_call/Registry). Gate: bundler corpus byte-identical pre/post.
- D.3 C3 — comptime asm via host-JIT (un-bail inline_asm; lift→JIT→cache). 06xx host-arch #run asm + 11xx cross-arch loud-bail diagnostic.
- (S2 only if a path hits TLS/constructors — see Stream E.)
Gates: S1 lifecycle + cache unit tests; C1 behavior-lock trampoline cases →
xfail/green 12xx float/struct/aggregate returns.
Stream E — HOT-RELOAD (deferred) (steps 19–22) · PLAN-HOTRELOAD.md
Deferred; R1-vs-R2 chosen at pickup. Design constraint (not optional): runtime + long-lived fibers stay persistent, only leaf logic reloads (can't hot-swap code with live suspended fibers).
- E.0 S2 — ORC C++ shim (MachOPlatform + redirectable symbols). Highest risk (§8.1.5): only C++ in the tree, prior spike failed on _Thread_local, macOS-specific — Linux/Windows + non-Mac TLS/ctor JIT have no named plan yet.
- E.1 R1 — dylib hot-reload (only needs shipped export; sidesteps S2).
- E.2 R2 — JIT-resident hot-reload (S1 + S2; ORC indirection stubs).
- E.3 R3 — incremental compilation (perf enabler; coarse per-file v1 first).
Gates (when picked up): state-survival test; the live-suspended-fiber-into-stale-module hazard; S2 TLS + C-constructor JIT test per host OS (the exact prior-spike case).
Cross-cutting (applies across streams)
- Testing keystone: the deterministic-sim Io (B1.4) gates scheduling tests (§10.1); the B1.3 switch-stress harness gates the context-switch (the one piece the deterministic Io can't test). Both must exist + be calibrated before the async tests they gate are trusted.
- Top risks to watch (§8.1): A2 context-switch correctness (B1.3 — gated by its own stress harness, not the deterministic Io), minted-enum → match codegen (de-risked, metatype stream), deterministic-Io oracle calibration, context-fiber-local/errno (C — gated by the named stress harness), S2 (E), C1 args-buffer layout (D).
- The compiler floor stays small, but deep — net-new pieces, grounded: atomics (100% net-new, no scaffolding), making abi(.pure) actually naked (the enum variant exists but is inert today), per-fiber context root + push-stack storage (context is already an implicit param, NOT TLS — so this is smaller/different than "repointable codegen" implied), declare/define/type_info (metatype stream — done), the S1 JIT spine. Everything else — schedulers, fibers, channels, the bundler — is sx lib.
Carving protocol
When a stream is reached: copy this section into current/PLAN-<STREAM>.md, expand the
phases to xfail→green steps with file anchors (from the design doc's anchor list), add
a CHECKPOINT-<STREAM>.md, and write a Phase-0-scoped kickoff prompt (mirror
PLAN-METATYPE's). Update CHECKPOINT-METATYPE.md/this file's status as
streams complete.