Commit Graph

18 Commits

Author SHA1 Message Date
agra
2437cf5e59 fibers B1.3b-1: x86_64 / Win64 swap_context sibling, validated on a Win7 x64 VM
The context switch is now proven on a second arch/ABI pair. A Win64 swap_context
saves the complete Win64 callee-saved set: 8 GP (rbx,rbp,rdi,rsi,r12-r15) + rsp
AND xmm6-xmm15 (10 XMM, 128-bit via movups -- Win64 has callee-saved XMM, unlike
SysV/aarch64), plus a Win64 scribble_verify (264-byte frame, 32-byte shadow +
16-align at each call, COFF symbols, rsp-carried return address) driving the
2-fiber mutual scribble.

Built --target x86_64-windows-gnu --self-contained (PE32+, output via the Win32
WriteFile boundary -- the 1660 pattern) and run on a Windows 7 x64 VM (UTM):
printed '0 0 P' -- every GP + XMM callee-saved register survived the switch.
Adversarially reviewed before the VM run (worker emitted the real .s and
verified every call alignment, the frame offsets, the rsp/return-address
round-trip, swap ordering, and COFF naming against the Win64 ABI -- no
critical/minor bugs).

Locked by examples/1810-concurrency-fiber-switch-win64.sx (pinned
x86_64-windows-gnu, ir-only on this non-Windows host; the VM run is the
runtime-correctness provenance). Good-swap-only mutual scribble (self-validating
by construction; the in-process negative control was dropped to avoid an sx
fn-ptr-convention issue -- detection of this exact logic was negative-controlled
on aarch64 in 1808).

Suite green 736/0. The B1.3 switch is proven on aarch64 + x86_64/Win64. Next:
B1.4 (Io impls / M:1 scheduler).
2026-06-21 07:35:51 +03:00
agra
dd532ab7b2 fibers B1.3b: mmap guard-page fiber stacks (x86_64 switch sibling deferred)
Fiber stacks are now mmap'd with a PROT_NONE guard page at the low end: mmap a
[guard | usable] region and mprotect the low 16KB page PROT_NONE, so a stack
overflow faults at the guard boundary instead of silently corrupting a neighbor
(design 8.1.1 — fixed stacks without a guard corrupt silently on overflow).

Locked by examples/1809-concurrency-fiber-guard-stack.sx (aarch64-macos-pinned):
guard armed: 1 (mprotect -> 0) + sum: 20100 (a fiber runs real recursion on the
guarded stack and yields). The guard FIRING is validated manually (a fiber
recursing past its 128KB stack faults with Bus error at region+GUARD, exit 134
via the sx crash handler) — not corpus-pinned, since a deliberate-overflow crash
is host-fragile and a 'child faulted' fork test would not prove the boundary
catch specifically.

The x86_64 swap_context sibling is DEFERRED: sx build --target x86_64-macos
mislinks on this arm64 host (object x86_64, link step arm64) and x86_64-linux
can't run here, so it could only ship IR-only / unrun. For the highest-
corruption-risk asm, shipping un-run / un-negative-controlled code violates the
design 10.7 'correctness not existence' rule. SysV target notes (rbx/rbp/r12-r15
/rsp, no callee-saved XMM, rsp-carried return address) recorded for a future
x86_64 host. Suite green 735/0.
2026-06-21 06:51:29 +03:00
agra
ed1b6c396d fibers B1.3a-2: context-switch stress gate (explicit callee-saved scribble) + adversarial review
The design section-10.7 correctness gate the foundational switch owed: explicitly
scribble EVERY callee-saved register, switch, and verify each survived.

- Extended swap_context to the COMPLETE AAPCS64 callee-saved set: integer
  x19-x28 + fp/lr + sp AND the FP regs d8-d15 (21-slot context). Per AAPCS64
  6.1.2 only the low 64 bits of v8-v15 are callee-saved, so d8-d15 is exactly
  sufficient; x18 is Apple's reserved platform register, untouched.
- naked scribble_verify(self_ctx, peer, base): loads a unique sentinel into all
  18 callee-saved regs, bl swap_context to yield, and on resume counts the regs
  that did not survive. Honors its own caller ABI via a 176-byte frame that
  saves+restores the caller's callee-saved; base reloaded post-swap (x2 not
  preserved); the original lr round-trips through the swap.
- The gate is a 2-fiber MUTUAL scribble (A and B scribble DISTINCT sentinels into
  the same physical regs), so a value survives only if swap_context saved and
  restored it. A lone fiber yielding to an idle peer would NOT exercise
  preservation.

Locked by examples/1808-concurrency-fiber-switch-stress.sx (aarch64-pinned):
A/B mismatches: 0. Validity proven by negative controls: dropping the d8-d15
save/restore reports 8/8 mismatches (the FP regs); dropping x27/x28 reports 2/2.

Adversarial review (worker): no critical bugs — callee-saved set complete and
correct, all frame offsets / 16-alignment / the lr-sp dance verified against
AAPCS64. Applied its one recommendation: boot zeroes the FP ctx slots so a first
switch-to loads 0, not garbage, into d8-d15. Residual gaps (spec-correct for a
call-boundary swap, documented in the header): FPCR/FPSR/NZCV + TPIDR/TLS are not
swapped, fp=0 blocks unwind past a fiber trampoline — these matter at the N×M:1 /
signals stages, not the single-thread switch.

Suite green 734/0. Next: B1.3b (x86_64 sibling + mmap guard-page stacks).
2026-06-21 06:38:02 +03:00
agra
b234b7df6f fibers B1.3a-1: stackful context switch (naked swap_context + fiber bootstrap)
The first piece of the B1.3 fiber runtime — the stackful context switch, pure
sx over abi(.naked). swap_context(from, to) saves the callee-saved registers +
SP/LR into *from and loads them from *to, then rets onto to's stack (SP-in !=
SP-out by design — why it must be .naked). Fibers are bootstrapped by hand: the
saved context starts with SP = top of an alloc_bytes stack, LR = a global-asm
trampoline (mov x0, x19; bl _fib_body, reaching the sx body via export), and
x19 = the *Fiber.

Locked by examples/1807-concurrency-fiber-context-switch.sx (aarch64-pinned):
- 2-fiber ping-pong (A <-> B, 3 rounds each): rounds: 6, and a per-fiber stack
  canary held live across every suspend survives (canary fails: 0);
- a 64-frame deep recursive chain suspended at the bottom and resumed, verifying
  every frame's stack-local on the unwind (frames verified: 64, depth fails: 0).

Scope (honest): exercises register/stack preservation INDIRECTLY (compiler-
allocated live values + the canary). The EXPLICIT every-callee-saved GP
(x19-x28) + FP (d8-d15) sentinel scribble — the full design-section-10.7 gate —
is B1.3a-2, still owed. x86_64 sibling + mmap guard-page stacks are B1.3b.

Suite green 733/0. Runs under JIT, ir-only on a non-arm host.
2026-06-21 06:16:58 +03:00
agra
37d68e72be fibers B1.2 COMPLETE: async/await/cancel examples (1805/1806)
With the three surface blockers fixed (0151 generic inference, 0152
Atomic(bool), 0153 re-export failable channel), the M:1 async surface works
end-to-end on the blocking Io default. Landed the corpus examples:

- 1805-concurrency-io-blocking-async.sx: context.io.async(lambda, ..args)
  runs the worker inline, await() or {…} yields the result; context.io.now_ms()
  reads the monotonic clock. Prints sum: 42 / double: 42 / clock ok.
- 1806-concurrency-io-cancel.sx: f.cancel() marks the future canceled so a
  later await() raises error.Canceled out of its (R, !IoErr) channel, caught
  with or. Prints ok: 7 / canceled: -99.

B1.2 (Io capability on Context + async/await/cancel + blocking CBlockingIo) is
complete. Suite green 732/0. Next: B1.3 (fiber runtime).
2026-06-21 05:59:04 +03:00
agra
a7499d5f51 fibers B1.2: 0152 fixed → Atomic(bool) works; blocked on 0153 (re-export value-failable loses ! channel)
With 0151 + 0152 fixed, the async surface is callable and Atomic(bool) works.
Building the async examples isolated the TRUE remaining blocker (the earlier
'secondary or PHI' symptom, confirmed NOT an Atomic cascade): a re-exported
generic value-failable ($R, !E) fn loses its ! error channel at the call site
— the result types as a plain tuple, so await(...) or { ... } / try ...await()
fail / build a malformed i1 PHI. await/IoErr are re-exported via std.sx, so the
async surface hits it.

Narrowed to the generic + re-export co-requirement (non-generic re-export OK;
direct generic import OK). Filed issues/0153 with a minimal co-located 2-file
repro + a single-file stdlib-await repro + investigation prompt (root cause:
the monomorphized return-type's error-set, reached via the re-export alias,
resolves to a non-.error_set TypeId, so errorChannelOf misses the channel).
Per the STOP rule, paused B1.2's async examples pending the 0153 fix.
2026-06-21 05:45:27 +03:00
agra
ea1faf7b69 fibers B1.2: 0151 fixed → async surface callable; blocked on 0152 (Atomic(bool) i1 atomic)
issue 0151 (generic $T inference through generic-struct / pointer / UFCS-pack
params) is fixed and committed, so io.sx's async/await/cancel are now callable
in every form. Building the async examples then tripped a SEPARATE codegen bug:
Atomic(bool) emits a sub-byte (i1) atomic load/store that fails LLVM
verification (must be byte-sized). Future.canceled: Atomic(bool) hits it.

Filed issues/0152 with a standalone repro + investigation prompt (codegen fix
in src/backend/llvm/ops.zig — promote sub-byte atomics to i8 storage). Per the
STOP rule, paused B1.2's async examples (1805/1806) pending the 0152 fix.
Checkpoint updated: 0151 RESOLVED, async surface BLOCKED on 0152.
2026-06-21 05:27:41 +03:00
agra
0ab26c8a40 fibers B1.2: record review findings — async surface blocked on 0151 (widened)
Adversarial review of 45d869d: the Io infrastructure (both materializers,
push-inherit, 37 .ir regens, !-lint) is correct + landed; but await/cancel
(*Future($R)) are uncallable in EVERY form because sx can't infer a generic
$T from a pointer-wrapped arg. Widened issue 0151 to that root (repro:
unbox(b: *Box($T)) -> $T). Checkpoint: B1.2 partially landed; next = fix 0151
generic inference -> make await/cancel callable -> add 1805/1806 -> B1.3.
2026-06-21 00:43:09 +03:00
agra
eee905c73c fibers B1.2: lambda-only async (named-fn :: feature deferred)
User decision: ship B1.2 async with lambda workers (works today, zero
compiler change); defer named-fn workers, which need a new :: callable-
parameter language feature (3 failed worker attempts; partial WIP saved
at .sx-tmp/wip-callable-params/). Records the resolved lambda async idiom
+ resume plan; no compiler/library code changed.
2026-06-20 21:51:01 +03:00
agra
7bf65565bd fibers B1.2: UNBLOCKED — remove invalid issue 0151, correct the async idiom
The B1.2 "blockers" were not real:
- Issue 0151 was INVALID: its repro used the non-idiomatic `($A) -> $R`
  bare-fn-ptr form. The canonical higher-order pack idiom
  `Closure(..$args) -> $R` + `..$args` (see examples/0543-packs-canonical-map)
  infers $R fine and runs today with no compiler change. Removed 0151.
- The correct async idiom is verified working live (42 42 for homo + hetero
  args): async :: (io, worker: Closure(..$args) -> $R, ..$args) -> Future($R)
  with a lambda worker (annotated params) + a `result = ---; result.v = ...`
  build form. No compiler change needed.

Issue 0150 (void struct field -> SIGTRAP exit 133) IS a real bug but is only
reached via Future(void) (void-returning worker / timeout) — deferred to B1.4;
B1.2 supports non-void workers.

Updates the PLAN/CHECKPOINT B1.2 status to UNBLOCKED with the corrected idiom
and the resume plan. No compiler/library code changed in this commit.
2026-06-20 20:00:36 +03:00
agra
f0a918f3c8 fibers B1.2: record async-args = variadic pack (..$args: []Type) correction
User correction: async's args are a variadic heterogeneous comptime pack
(..$args: []Type, specs.md:1383), not a single $A. Orthogonal to 0151
(return type-var binding). Recorded for the B1.2 resume.
2026-06-20 19:03:43 +03:00
agra
e78320637f fibers B1.2: BLOCKED on compiler bugs 0150 + 0151 (Io design proven)
Stream B1 B1.2 (Io capability + context.io + Future + cancel) is blocked on
two newly-discovered, independent compiler bugs, both with standalone repros:

- 0150: a `void` struct field crashes the compiler with an unsized-type
  SIGTRAP in LLVM getTypeSizeInBits. Blocks `Future(void)` -> `timeout`.
- 0151: a type-var inferred from a fn-pointer parameter's RETURN type is not
  bound as a usable type in the function body (`unknown type 'R'`). Blocks the
  central `async(io, worker: ($A)->$R, arg)` free-fn's `Future(R)`.

The B1.2 design itself is validated end-to-end (the Io protocol threaded on
Context like Allocator, the stateless blocking CBlockingIo default, both
__sx_default_context materializers, and `context.io.now_ms()` all work live).
Only the async/await/timeout ergonomic layer hits the two bugs. Per the
IMPASSABLE STOP rule, all B1.2 working changes were reverted (master green,
726/0) and the work paused pending fixes; WIP is saved at .sx-tmp/b12-wip/.

Checkpoint + plan updated to mark B1.2 BLOCKED with full resume notes.
2026-06-20 18:54:04 +03:00
agra
bab4886346 fibers B1.1: per-fiber context root is library-only (no compiler change)
A fiber needs its own root Context (the spawner's snapshot), not the
ambient one. Probed whether that needs compiler support: it does not.
context is an implicit slot-0 *Context param (call-carried, rides the
callee's own stack) and push Context allocates on the caller frame —
never TLS, never re-read from the __sx_default_context global mid-stack.
So the spawn convention is pure library sx:

  snap := context;            // snapshot the spawner's context
  f := Fiber.{ root = snap }; // store it
  push f.root { entry(args) } // trampoline installs it as the fiber root

examples/1804-concurrency-context-snapshot.sx locks it: a trampoline
running under ambient ctx 99 installs a stored snapshot (42); the body
reads 42, and the push scope restores 99 on exit. No fiber runtime yet
(B1.3) — this proves the plumbing it builds on.

The design doc's "lower context as swappable indirection, never raw
TLS" guarded a non-problem — context was already param-carried.

Suite green (726/0).
2026-06-20 17:09:26 +03:00
agra
a7fe165684 fibers: rename ABI variant .pure -> .naked
"pure" universally means side-effect-free (GCC __attribute__((pure)),
FP purity, D's pure) — the opposite of a register-clobbering context
switch. The concept is "naked": no compiler-generated prologue/epilogue,
body is raw asm that emits its own ret. That is the established term
everywhere (LLVM's naked function attribute — which we literally emit —
plus Zig callconv(.naked), Rust #[naked], GCC/Clang __attribute__
((naked))). Rename the keyword + everything keyed off it so concept,
surface, field, and the emitted LLVM attribute all agree.

- ast.zig: ABI enum variant pure -> naked (+ doc).
- parser: accept abi(.naked); error text updated.
- IR Function.is_pure -> is_naked; type_resolver/decl/generic/pack/
  emit_llvm references updated; diagnostics say abi(.naked).
- examples 1800-1803 renamed *-pure-* -> *-naked-* (source + expected/
  snapshots; .ir/.exit/.stdout/.stderr are byte-identical — the emitted
  IR is unchanged, only the keyword spelling differs).
- docs (PLAN-FIBERS, CHECKPOINT-FIBERS, PLAN-POST-METATYPE, the design
  roadmap, the compiler-API checkpoint/design) updated; the naming
  rationale now records why .naked over .pure.

No semantic change — pure cosmetics. Suite green (725/0).
2026-06-20 17:01:09 +03:00
agra
b631590574 fibers B1.0c: support params in abi(.pure) (read from registers)
Adversarial review of B1.0b found a param-bearing abi(.pure) function
emitted invalid LLVM ("cannot use argument of naked function" — loud
verifier error, not silent) because the param-alloca loop spilled the
args to stack slots, which a naked function cannot have.

Fixed forward — this ENABLES the B1.3 context-switch use case rather
than rejecting it: gate the param-alloca loop on fd.abi != .pure in
decl.zig (both body-lowering paths) and generic.zig. A naked function's
args stay in their ABI registers and are read directly by the asm body
(e.g. swap_context reads from/to from x0/x1); the LLVM args are
declared-but-unused, which the verifier allows.

examples/1803-concurrency-pure-asm-param.sx: naked add(a, b) reads x0/x1
(add x0, x0, x1; ret) -> 40 + 2 = 42. aarch64-pinned.

Pack abi(.pure) (variadic + naked — nonsensical, can't read a runtime
pack from registers) left unsupported: pack.zig's param loop is
intertwined with comptime-param/#insert handling, so that case still
hits the loud verifier error. Documented in the checkpoint.

Also updates PLAN-FIBERS / CHECKPOINT-FIBERS for B1.0 completion.
B1.0 complete. Suite green (725/0).
2026-06-20 16:36:31 +03:00
agra
40424df1b8 fibers B1.0a: close generic/pack is_pure gap (review)
Adversarial review of dd363ca found is_pure was set only at the two
declareFunction decl sites. Generic monomorphization (generic.zig) and
pack expansion (pack.zig) create the IR Function via a different path
and left is_pure false, so a generic abi(.pure) instance bypassed the
emit bail and silently shipped a framed body — it returned 42 but
leaked the prologue's stack adjustment (the exact SP-in != SP-out
corruption the lock exists to prevent).

Both paths now set is_pure and route .pure bodies through the asm-only
+ unreachable cap, mirroring the decl path. Locked by
examples/1801-concurrency-pure-generic-bail.sx (generic .pure reaches
the loud bail).

The review's other CRITICAL (a .pure lambda) is a false positive:
isLambda's return-type scan (parser.zig:3652) breaks on the abi
keyword, so a .pure lambda is unparseable and parseLambda's abi
handling is never reached. Latent isLambda/parseLambda inconsistency,
not a B1 concern.

Suite green (723/0).
2026-06-20 14:45:29 +03:00
agra
dd363ca877 fibers B1.0a: plumb abi(.pure), emit bails (lock)
First implementation step of Stream B1 (fibers). Make the inert abi(.pure)
ABI carry an is_pure flag through lowering, with LLVM emission deliberately
bailing loudly until B1.0b — the lock half of the lock->green cadence.

- IR Function.is_pure, set from fd.abi == .pure at both declareFunction
  decl sites.
- funcWantsImplicitCtx skips .pure (no synthetic __sx_ctx, mirroring the
  .c skip): a pure fn reads args from ABI registers, an implicit ctx would
  occupy a register slot the asm doesn't expect.
- both body-lowering paths bypass lowerValueBody for .pure: lower the asm
  body as statements + cap with unreachable. A pure body has no sx return
  (the asm rets itself), so the implicit-return diagnostic must not fire.
- emit_llvm Pass 2 bails loudly when func.is_pure (build-gating nonzero
  exit) rather than emit a framed body, whose epilogue would corrupt a
  context switch's deliberate SP-in != SP-out.

examples/1800-concurrency-pure-asm.sx: one host example (no .build pin --
the bail fires before instruction selection, so it is host-independent),
locked to the bail snapshot. B1.0b flips emit to LLVM's naked attribute +
asm-only body and pins the example per-arch.

The sx-facing name is "pure" throughout (field, diagnostic); LLVM's naked
attribute is only the B1.0b lowering mechanism. Suite green (722/0).
2026-06-20 14:34:53 +03:00
agra
7044b8133b fibers: carve Stream B1 (PLAN-FIBERS + CHECKPOINT-FIBERS)
Carve the async-runtime fibers stream off PLAN-POST-METATYPE Stream B,
mirroring the atomics carve. Grounds the B1 compiler floor against the
tree:

- abi(.pure) exists in the ABI enum but is inert (type_resolver maps it
  to .default CC, emit emits no naked attr) -> B1.0 makes it emit LLVM
  naked + skip prologue/ctx. Corrected the design's callconv(.naked)
  spelling to the real abi(.pure).
- context is already an implicit *Context param (slot 0) + push Context
  is a stack alloca -> fiber-local for free; only shared root is the
  __sx_default_context global. B1.1 grounded as likely library-only
  (probe-first).
- B1.0 snapshot story corrected: naked body is raw per-arch asm -> two
  arch-gated examples (aarch64 + x86_64), not one host .ir.

Full xfail->green step detail + a B1.0a kickoff prompt. Baseline green
(721/0). No code change; first implementation step is B1.0a.
2026-06-20 14:16:39 +03:00