diff --git a/current/CHECKPOINT-FIBERS.md b/current/CHECKPOINT-FIBERS.md index b425122a..5c42d6cd 100644 --- a/current/CHECKPOINT-FIBERS.md +++ b/current/CHECKPOINT-FIBERS.md @@ -59,6 +59,58 @@ body); closed + locked. The review's `.naked`-lambda CRITICAL was a false positi (unparseable — `isLambda` breaks on the `abi` keyword). ## Current state +**B1.2 is BLOCKED on two newly-filed compiler bugs (issues 0150 + 0151). Master is GREEN +(726/0); all B1.2 working changes were REVERTED so the installed `zig-out/bin/sx` is clean.** +See the "B1.2 attempt (BLOCKED)" subsection below + "Known issues" for the full story. The +B1.2 *design* is validated end-to-end (the `Io` protocol on `Context`, the blocking +`CBlockingIo` default, and `context.io.now_ms()` all WORK — verified live); only the two +generic/void compiler gaps stop the `async`/`await`/`timeout` ergonomic layer. WIP saved at +`.sx-tmp/b12-wip/` (io.sx + the compiler+lib diff + the 1805 example) for a fast resume. + +### B1.2 attempt (BLOCKED — design proven, two compiler bugs filed) +What was built + verified WORKING (then reverted to keep master green): +- `Io :: protocol #inline { spawn_raw; suspend_raw -> !; ready; poll; now_ms; arm_timer; }` + in `core.sx` next to `Allocator`, with `SpawnOpts{ pin: PinTarget }` + `ParkToken{ handle }`. + Six methods, each justified by a downstream consumer (B1.3-B1.5). +- `Context :: struct { allocator; data; io: Io; }` — `io` appended LAST so `allocator` stays + index 0 (the `call.zig:1229` hardcode) and `data` keeps index 1 (minimal VM-fallback churn). +- Both `__sx_default_context` materializers updated in lockstep + verified: `protocol.zig` + `emitDefaultContextGlobal` (extended `ctx_fields` 2→3, built the `CBlockingIo→Io` inline + 7-word vtable `{null-ctx, fn0..fn5}` via `getOrCreateThunks("Io","CBlockingIo")`) and + `comptime_vm.zig` `materializeDefaultContext` fallback (wrote the 6 thunk func-refs at + `io_base = addr + 4*ps`, offset `+ (i+1)*ps`). 
The global path auto-followed the 3-field + Context type. **`context.io.now_ms()` printed `clock ok` live — the capability threads + the + vtable dispatches correctly.** +- Stateless `CBlockingIo :: struct {}` + `impl Io for CBlockingIo` (mirror of `CAllocator`): + blocking semantics — `spawn_raw`/`ready`/`poll`/`arm_timer` no-op/0, `now_ms` → `time.mono_ms()`. +- **push-inherit-omitted fix** (`stmt.zig` `lowerPush`): a `push Context.{...}` now SEEDS the + new slot from the ambient context (load+store), then overwrites ONLY the literal's named + fields — so omitted fields (now incl. `io`) are INHERITED, never zero-inited to a null + vtable. Eliminates the omitted-field footgun globally (zero per-site churn across the 17 + partial-literal sites). This is the correct capability-bag semantics; it compiled clean. +- **`!`-protocol-method warning fix** (`error_analysis.zig` + a new `Lowering.impl_method_names` + set populated in `protocols.zig` `registerImplBlock`): a protocol impl method may be declared + `!` by contract (e.g. `Io.suspend_raw`) yet never raise; the "declared `!` but never errors — + drop the `!`" hint is a false positive for impl methods, now suppressed for them. + +Where it BROKE (the two blockers — both INDEPENDENT of the Io design, both repro standalone): +- **issue 0150** — `Future(void)` (for `timeout -> Future(void)`) makes a `result: void` field; + a `void` struct field crashes the compiler with an unsized-type SIGTRAP in LLVM + `getTypeSizeInBits` (a bare `struct { v: void; }` repros it). `timeout` was DEFERRED (it is a + B1.4 stub needing `arm_timer` anyway) rather than routed around with a non-void shape. +- **issue 0151** — `async(io, worker: ($A) -> $R, arg: $A) -> Future($R)`: `$R` inferred from a + fn-pointer parameter's RETURN type type-checks the call but is NOT bound as a usable type in + the body, so `Future(R)` errors `unknown type 'R'`. 
A direct `arg: $A` binds fine — the gap is + specific to type-vars nested in a fn-ptr/closure param signature. This blocks the central + `async`/`await` free-fns. (Manifested as the "unresolved type reached LLVM emission" panic — + the same one another session filed against my dirty binary as issue 0149, now moot after the + revert.) + +Per the IMPASSABLE STOP rule: filed 0150 + 0151, reverted all B1.2 working changes (master +green again, photo project unbroken), STOPPED. Resume B1.2 once 0150 + 0151 land — the WIP in +`.sx-tmp/b12-wip/` makes it ~mechanical (the design is proven). + +### Earlier — B1.0 + B1.1 complete Stream A (atomics) is feature-complete (✅). Stream B1: **B1.0 + B1.1 complete.** The two compiler-floor preconditions for the fiber runtime are in place: (1) `abi(.naked)` emits a real LLVM `naked` function end-to-end (decl, generic, pack paths) — the context-switch @@ -80,16 +132,31 @@ fibers/Io/scheduler code yet. Grounded floor facts: boundary; a sharper sx diagnostic for it is a candidate polish, not a blocker. ## Next step -**B1.2 (A1 — `Io` interface + `context.io` + `Future` + `cancel()` API).** Per PLAN-FIBERS.md -"Phases → B1.2". Library-only: add an `Io` protocol as a `Context` field (mirror `Allocator` -at field 0; `Context` is currently `{ allocator, data }` — add `io`), plus the `Future` / -`cancel()` surface. Exercise the blocking-`Io` default with an `18xx` example (real suspend -lands in B1.3). No compiler change expected; if a protocol-in-context gap appears, file it. -NOTE: adding a field to `Context` shifts its layout — check whether any `Context` literal / -`push Context.{...}` site or the `__sx_default_context` builder needs the new field (the -allocator precedent shows the pattern). 
+**B1.2 is BLOCKED — resume only AFTER issues 0150 + 0151 are fixed.** Then re-land B1.2 from +the saved WIP (`.sx-tmp/b12-wip/`): the `Io` protocol + `Context.io` + both materializers + +the push-inherit fix + the `!`-impl-warning fix all WORK as-is; restore `timeout -> +Future(void)` (needs 0150) and `async`/`await` (needs 0151), add `examples/1805-concurrency- +io-blocking-async.sx` (lock→green) + `1806-concurrency-io-cancel.sx` (cancel→`await` raises +`.canceled`). Regen `.ir` snapshots ONLY after green (`-Dupdate-goldens`) — adding `Io` to the +prelude shifts many `.ir` type tables; confirm the diff is ONLY layout/numbering + the new +vtable, NO error text. The `Context` layout decision is settled: `{ allocator; data; io; }` +(allocator index 0 fixed by `call.zig:1229`, `io` last). + +NOTE for the resume: do NOT add `io = context.io` to the 17 partial `push Context.{...}` sites +— the push-inherit-omitted fix (in the WIP diff) makes omitted fields inherit from the ambient +context, which is the correct fix and was verified to compile. Use that, not per-site edits. ## Known issues / capability gaps +- **🔴 B1.2 BLOCKERS (both filed, both standalone-reproducible, both independent of the Io + design):** + - **issue 0150** — a `void` struct field crashes the compiler (unsized-type SIGTRAP in LLVM + `getTypeSizeInBits`). Blocks `Future(void)` → `timeout`. Repro: `issues/0150-...`. + - **issue 0151** — a type-var inferred from a fn-pointer parameter's RETURN type is not bound + in the function body (`unknown type 'R'`). Blocks `async(io, worker: ($A)->$R, arg)`'s + `Future(R)`. Repro: `issues/0151-...`. + - (Note: **issue 0149**, filed by another session against the dirty in-progress binary, was a + manifestation of 0151 — "unresolved type reached LLVM emission". Moot after the revert; its + real root cause is 0151.) 
- **Orthogonal (not a B1 blocker):** default VALUES for comptime params don't bind on generic-struct methods (free-fn defaults DO work) — inherited from Stream A. Only matters if a B2 lib type wants a defaulted comptime param; atomics/fibers require explicit, so @@ -177,3 +244,12 @@ allocator precedent shows the pattern). reads the snapshot, not the trampoline's ambient ctx, and the `push` scope restores ambient on exit. Locked by `examples/1804-concurrency-context-snapshot.sx` (prints `fiber root: 42` / `ambient after: 99`). Suite green (726/0). **Next: B1.2 (Io interface + context.io).** +- **B1.2 (BLOCKED)** — built the full `Io` capability (protocol on `Context`, stateless + `CBlockingIo` blocking default, both `__sx_default_context` materializers, push-inherit-omitted + fix, `!`-impl-method warning fix) and VERIFIED the core works live (`context.io.now_ms()` → + `clock ok`). Two independent compiler bugs blocked the `async`/`await`/`timeout` layer: + **0150** (`void` struct field → unsized SIGTRAP, blocks `Future(void)`) and **0151** (type-var + from a fn-ptr param's return type not bound in the body, blocks `async`'s `Future(R)`). Both + filed with standalone repros + investigation prompts. Per the STOP rule: reverted ALL B1.2 + working changes (master green again, 726/0; the dirty binary had broken the photo project — + see the now-moot 0149), saved WIP to `.sx-tmp/b12-wip/`, STOPPED. Resume after 0150 + 0151. diff --git a/current/PLAN-FIBERS.md b/current/PLAN-FIBERS.md index 1bbfb10a..99321712 100644 --- a/current/PLAN-FIBERS.md +++ b/current/PLAN-FIBERS.md @@ -1,8 +1,12 @@ # PLAN-FIBERS — Stream B1 (fibers + Io + M:1 scheduler) > **STATUS: 🚧 in progress.** B1.0 (`abi(.naked)` codegen) ✅ + B1.1 (per-fiber `context` -> root — zero compiler change, library convention) ✅ complete. Next step = **B1.2** (`Io` -> interface + `context.io` + `Future` + `cancel()`). +> root — zero compiler change, library convention) ✅ complete. 
**B1.2** (`Io` interface + +> `context.io` + `Future` + `cancel()`) is **🔴 BLOCKED on compiler issues 0150 (`void` +> struct field SIGTRAP) + 0151 (type-var from a fn-ptr-return not bound in body)** — the Io +> design is proven (the protocol-on-`Context` + blocking default + `context.io.now_ms()` work +> live), but `Future(void)`/`timeout` and the `async`/`await` generics hit those two bugs. +> Resume B1.2 after both land (WIP saved at `.sx-tmp/b12-wip/`); see `CHECKPOINT-FIBERS.md`. Carved from [PLAN-POST-METATYPE.md](PLAN-POST-METATYPE.md) Stream B (§B1) + the design-of-record [../design/execution-evolution-roadmap.md](../design/execution-evolution-roadmap.md) diff --git a/issues/0150-void-struct-field-unsized-llvm-trap.md b/issues/0150-void-struct-field-unsized-llvm-trap.md new file mode 100644 index 00000000..b948f74f --- /dev/null +++ b/issues/0150-void-struct-field-unsized-llvm-trap.md @@ -0,0 +1,72 @@ +# 0150 — a `void` struct field crashes the compiler (unsized-type SIGTRAP in LLVM) + +## Status +OPEN — surfaced by Stream B1 (fibers) B1.2: `Future(void)` (needed by +`timeout(io, ms) -> Future(void)`) instantiates a struct with a `result: void` +field, which hits this bug. Independent of the fibers work (a plain +`struct { v: void; }` reproduces it standalone). + +## Symptom +Declaring or instantiating any struct that has a field of type `void` aborts the +compiler with `SIGTRAP` (exit 133/134) — no sx diagnostic. The trap is LLVM's +`llvm_unreachable("Cannot getTypeInfo() on a type that is unsized!")`: + +``` +libLLVM`llvm::DataLayout::getTypeSizeInBits + 912 brk #0x1 (EXC_BREAKPOINT) +``` + +Reached via `declareFunction` → `toLLVMType(func.ret)` when a function returns +such a struct, or directly when laying out the struct. + +Observed: SIGTRAP, no output, no diagnostic. +Expected: either zero-size the `void` field (a `void`/zero-sized field is a +legitimate construct — cf. 
Zig) OR emit a clean type diagnostic +("a struct field may not have type `void`") — never a raw backend crash. + +## Reproduction +```sx +#import "modules/std.sx"; + +Holder :: struct { v: void; ok: bool; } + +main :: () -> i32 { + h : Holder = .{ ok = true }; + if h.ok { print("ok\n"); } + return 0; +} +``` +`./zig-out/bin/sx run repro.sx` → SIGTRAP (exit 133), no output. + +Also reproduces through a generic: `Box :: struct($T: Type) { v: T; }` then +`Box(void)` — i.e. any monomorphization that binds a struct field to `void`. + +## Suspected area +- `src/backend/llvm/types.zig` `toLLVMTypeInfo` (struct field loop ~line 111): + a `void` field's LLVM type is the unsized `void` type, then `getTypeSizeInBits` + on the enclosing struct traps. +- The type layout / size code (`src/ir/types.zig` `typeSizeBytes` and the LLVM + struct builder) should treat a `void` field as zero-sized (skip it in the LLVM + struct, size 0, align 1) — the same way a zero-field struct is handled. + +## Investigation prompt (paste into a fresh session) +> A `void` struct field crashes the sx compiler with an unsized-type SIGTRAP in +> LLVM `getTypeSizeInBits` (no diagnostic). Repro: `issues/0150-...` (run it → +> exit 133). Decide the semantics: a `void` field should be ZERO-SIZED (preferred +> — it is a legitimate construct, e.g. `Future(void).result`), laid out as +> nothing (size 0, align 1) and OMITTED from the LLVM struct body; OR, if +> zero-sized fields are out of scope, a clean front-end diagnostic ("a struct +> field may not have type `void`, found in field `` of ``") before +> emission — NEVER a backend trap. Likely sites: `src/backend/llvm/types.zig` +> `toLLVMTypeInfo` (skip `void` fields when building the LLVM struct element +> list) + `src/ir/types.zig` size/align (`typeSizeBytes`/align: a `void` field +> contributes 0). If choosing the diagnostic route, add it where struct fields +> are validated at type-resolution time. 
Verify: the repro prints `ok` (zero-size +> route) or emits the diagnostic + clean exit 1 (diagnostic route); then move the +> repro into `examples/` as a regression test. + +## Why this matters for B1 (fibers) +`Future($R)` with `$R = void` is the natural shape for `timeout(io, ms) -> +Future(void)` (B1.2 spec) and for any future-of-no-value. B1.2 deferred +`timeout` pending this fix rather than route around it with a substitute +non-void shape (which would hide the bug). Once 0150 lands, re-add `timeout` +with `Future(void)` (see the saved WIP at `.sx-tmp/b12-wip/io.sx`). diff --git a/issues/0150-void-struct-field-unsized-llvm-trap.sx b/issues/0150-void-struct-field-unsized-llvm-trap.sx new file mode 100644 index 00000000..3451e385 --- /dev/null +++ b/issues/0150-void-struct-field-unsized-llvm-trap.sx @@ -0,0 +1,13 @@ +// Repro for issue 0150 — a `void` struct field crashes the compiler with an +// unsized-type SIGTRAP (LLVM getTypeSizeInBits). Unpinned (no expected marker) +// because it currently aborts the compiler; pin it as a regression test once +// the fix lands. +#import "modules/std.sx"; + +Holder :: struct { v: void; ok: bool; } + +main :: () -> i32 { + h : Holder = .{ ok = true }; + if h.ok { print("ok\n"); } + return 0; +} diff --git a/issues/0151-typevar-from-fnptr-return-not-bound-in-body.md b/issues/0151-typevar-from-fnptr-return-not-bound-in-body.md new file mode 100644 index 00000000..d74438fa --- /dev/null +++ b/issues/0151-typevar-from-fnptr-return-not-bound-in-body.md @@ -0,0 +1,98 @@ +# 0151 — a type-var inferred from a fn-pointer parameter's RETURN type is not bound in the function body + +## Status +OPEN — blocks Stream B1 (fibers) B1.2's `async(io, worker: ($A) -> $R, arg: $A)` +free-fn (it needs `Future(R)` in its body). Independent of the fibers work +(reproduces with a tiny `Wrap($R)` standalone). 
+ +## Symptom +A generic free function whose type-var `$R` is introduced **inside a +fn-pointer parameter's return type** (`worker: ($A) -> $R`) infers `$R` fine for +the call's type-checking, but `R` is **not in scope as a usable type name in the +function body**. Referencing `Wrap(R)` (or any `R`) in the body errors: + +``` +error: unknown type 'R' + --> repro.sx:4:14 + | + 4 | w : Wrap(R) = .{ v = worker(arg) }; +``` + +By contrast a type-var introduced **directly** by a parameter (`arg: $A`) IS +usable in the body (`Wrap(A)` works — see the second repro). So the gap is +specific to type-vars that appear only nested in a fn-pointer (or closure) +parameter's signature. + +Observed: `error: unknown type 'R'` for the body reference. +Expected: `R` binds to the worker's return type (here `i64`), so `Wrap(R)` +resolves to `Wrap(i64)`, exactly as `$A` → `A` does. + +## Reproduction (fails) +```sx +#import "modules/std.sx"; + +Wrap :: struct($T: Type) { v: T; } + +runit :: (worker: ($A) -> $R, arg: $A) -> Wrap($R) { + w : Wrap(R) = .{ v = worker(arg) }; // error: unknown type 'R' + return w; +} + +dbl :: (n: i64) -> i64 { return n * 2; } + +main :: () -> i32 { + r := runit(dbl, 21); + print("{}\n", r.v); // want: 42 + return 0; +} +``` + +## Reproduction (works — shows the contrast) +```sx +#import "modules/std.sx"; +Wrap :: struct($T: Type) { v: T; } +runit :: (arg: $A) -> Wrap($A) { + w : Wrap(A) = .{ v = arg }; // OK — `A` (direct param type-var) binds + return w; +} +main :: () -> i32 { r := runit(21); print("{}\n", r.v); return 0; } // prints 21 +``` + +## Suspected area +Generic monomorphization / type-binding collection (the pass that walks a generic +function's parameter signatures to discover `$X` type-vars and record the +caller-inferred binding for use in the body). 
It descends into direct param types +(`arg: $A` → binds `A`) but does NOT descend into a fn-pointer / closure +parameter's nested signature (`($A) -> $R`) to also bind `$R` from the matched +argument function's return type (and likewise `$A` from its params, if only named +there). Look for where `$`-type-params are gathered from `fd.params` and the +per-instance `type_bindings` map is seeded — likely in `src/ir/generic.zig` +and/or the call-site argument→param type-var unifier in `src/ir/lower/call.zig`. +The unifier already infers `$R` well enough to type-check the call (the error is +only in the BODY), so the binding exists at the call site but isn't propagated +into the monomorphized body's `type_bindings`. + +## Investigation prompt (paste into a fresh session) +> A type-var that appears only inside a fn-pointer parameter's signature +> (`worker: ($A) -> $R`) is inferred at the call site (the call type-checks) but +> is NOT available as a type name in the generic function's BODY — `Wrap(R)` in +> the body errors `unknown type 'R'`, while a direct `arg: $A` makes `A` usable. +> Repro: `issues/0151-...` (the failing + the working contrast are both inline in +> the `.md`; the `.sx` is the failing one). Fix the generic type-binding pass so +> that when a generic fn is monomorphized, type-vars discovered inside a +> fn-pointer/closure parameter's nested signature (its params AND its return +> type) are added to the instance's `type_bindings` from the matched argument +> function's concrete signature — mirroring how direct param type-vars are bound. +> Suspected sites: `src/ir/generic.zig` (binding collection from `fd.params`) + +> the call-site unifier in `src/ir/lower/call.zig` (it already infers `$R` for +> overload/type-check, so reuse that result to seed the body bindings). Verify: +> the failing repro prints `42`; then move it to `examples/` as a regression test. 
+ +## Why this matters for B1 (fibers) +`async(io, worker: ($A) -> $R, arg: $A) -> Future($R)` is the central B1.2 +ergonomic free-fn; its body builds `Future(R)`. Without this fix `async` can't be +written in its spec-faithful form. Routing around it (an explicit `$R: Type` +param the caller must pass) would change the surface and HIDE the gap — not done. +The rest of the B1.2 Io surface (the `Io` protocol on `Context`, the blocking +`CBlockingIo` default, `context.io.now_ms()`) works; only the `async`/`await` +generics are blocked by this. Saved WIP: `.sx-tmp/b12-wip/`.
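
For comparison only, here is a minimal Rust sketch of the scoping behaviour this issue expects. It is an analogy, not sx code and not the fix: Rust declares its type parameters explicitly (`<A, R>`), whereas sx introduces them inline with `$A` / `$R`. The names `Wrap`, `runit`, and `dbl` mirror the failing repro above. The point is that a type parameter appearing only in the function-pointer parameter's return position is still an ordinary type name inside the body, so `Wrap<R>` resolves to `Wrap<i64>`.

```rust
// Analogy to the 0151 repro: `R` appears only as the return type of the
// `worker` function-pointer parameter, yet it is usable in the body.
struct Wrap<T> {
    v: T,
}

fn runit<A, R>(worker: fn(A) -> R, arg: A) -> Wrap<R> {
    // `R` is bound from the argument function's return type (here i64).
    let w: Wrap<R> = Wrap { v: worker(arg) };
    w
}

fn dbl(n: i64) -> i64 {
    n * 2
}

fn main() {
    let r = runit(dbl, 21);
    println!("{}", r.v); // prints 42
}
```

The sx fix should give `Wrap(R)` in the body of `runit` the same resolution that `Wrap(A)` already gets in the working contrast repro.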