fix: type-safe stores + Any unbox/eq; finish multi-return deferrals

Type-checking gaps (segfault/corruption → compile errors):

- 0197: reject a store into an annotated slot whose value has no modeled
  coercion AND a different byte width (a 16-byte string into a 4-byte i32
  overran the slot and segfaulted). New checkAssignable / noneReinterpretIsUnsafe
  (coerce.zig, width via the LLVM-accurate typeSizeBytes) wired into every store
  site: var/const-decl, single + multi assignment (identifier/field/index/
  element/deref), named-return defaults. Same-width reinterpretations (*T→[*]T,
  i64→isize, fn-ref) and explicit xx/cast stay allowed; cascades suppressed via
  externalErrorsExist. Examples 1205, 1206.
- 0198: an implicit `Any → T` unbox is now a compile error (it blindly
  reinterpreted the boxed payload — silent garbage for a wrong scalar, a segfault
  for an aggregate). xx and compiler-generated match/pack unboxes are unaffected.
  Example 1207.
- 0199: `Any == <concrete>` (one operand Any) aborted the LLVM verifier — the
  comparison arm now fires when either operand is Any, boxing the concrete side
  first. Example 0654.

Multi-return deferrals (PLAN-MULTIRET #6 + named-order + D3 + generic):

- Reorder named return elements by name instead of requiring slot order; error on
  unknown/duplicate/missing (value-only AND full-failable-tuple forms). Examples
  0210, 0214.
- Reject a bare-paren (A, B) multi-return signature in generic-arg position
  (return-position-only). Example 0215.
- Multi-return closure types / lambda literals work via the reused tuple
  machinery (destructure, single-bind+field, lambda arg). Example 0216.
- Generic multi-return: positional works (0217); 0200: the named-slot
  implicit-return form now works for generic free fns + struct methods —
  monomorphizeFunction now calls bindNamedReturnSlots. Example 0218.

readme.md documents the annotated-store coercion rule; CHECKPOINT-MULTIRET.md
updated. Full corpus green (850/0).
agra
2026-06-27 17:28:27 +03:00
parent 97772abf54
commit b322dcfe61
51 changed files with 1000 additions and 56 deletions


@@ -1,3 +1,30 @@
> **RESOLVED** (2026-06-27). Root cause: a value whose type has NO modeled
> coercion to the destination slot (`classify == .none`) was passed through the
> `coerceMode` `.no_op, .none => return val` arm UNCHANGED — a raw reinterpreting
> store. When the value's byte width differed from the slot's (a 16-byte `string`
> into a 4-byte `i32`), the store overran the slot and corrupted memory / SIGSEGV'd.
>
> Fix: a shared guard `checkAssignable` / `noneReinterpretIsUnsafe`
> (`src/ir/lower/coerce.zig`) rejects a `.none` store ONLY when the byte WIDTHS
> differ (`typeSizeBytes`, the LLVM-accurate ABI size — NOT the field-padded
> `sizeOf`). A same-width `.none` is a legitimate bit-compatible reinterpretation
> (`*T → [*]T`, `i64 → isize`, a bare fn-ref into a function slot) and stays
> allowed; an explicit `xx`/`cast` always passes (the escape hatch). Cascades are
> suppressed via `externalErrorsExist()` (the guard tallies its own diagnostics,
> so a pre-lowering error — an unknown annotation type — or a failed initializer
> doesn't trigger a pile-on, while independent mismatches each still report).
> Wired into EVERY annotated-slot store site: var-decl, body-local const-decl,
> scalar reassignment (local + global), struct/tuple field, array/slice/pointer
> element, pointer deref, multi-assignment targets, and named-return defaults.
> (`destructure-decl` infers target types from the RHS, so it has no annotation
> to mismatch.) Regression tests: `examples/diagnostics/1205` (var/const/reassign)
> + `examples/diagnostics/1206` (field/element/deref/multi-assign width overrun).
>
> NOTE: a sibling runtime-safety gap surfaced during the fix's adversarial
> review — unboxing an `Any` to a mismatched type is unchecked (silent-wrong /
> segfault). That is a DIFFERENT code path (`unbox_any`, not the `.none`
> passthrough) and is filed separately as **issue 0198**.
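
The width rule described in the note above can be sketched as a small Python model. This is an illustration only, not the code in `coerce.zig`: the function names mirror `checkAssignable` / `noneReinterpretIsUnsafe`, but the signatures, the string type names, and the size table are assumptions made for the example (the 8-byte pointer and `isize` widths assume a 64-bit target; the 16-byte `string` and 4-byte `i32` come from the issue text).

```python
# Hypothetical byte widths; the real compiler asks typeSizeBytes instead.
TYPE_SIZE_BYTES = {
    "i32": 4,
    "i64": 8,
    "isize": 8,   # assumes a 64-bit target
    "*T": 8,      # assumes a 64-bit target
    "[*]T": 8,    # assumes a 64-bit target
    "string": 16,
}


def none_reinterpret_is_unsafe(src: str, dst: str) -> bool:
    """A store with no modeled coercion is unsafe only if the widths differ."""
    return TYPE_SIZE_BYTES[src] != TYPE_SIZE_BYTES[dst]


def check_assignable(src, dst, classify, explicit_cast=False):
    """Return an error message for a rejected store, or None if it is allowed.

    `classify` stands in for the coercion classification; only "none"
    (no modeled coercion) is ever rejected, and only on a width mismatch.
    """
    if explicit_cast:          # xx / cast is the escape hatch
        return None
    if classify != "none":     # a modeled coercion handles the store
        return None
    if none_reinterpret_is_unsafe(src, dst):
        return (
            f"cannot store '{src}' ({TYPE_SIZE_BYTES[src]} bytes) into "
            f"'{dst}' slot ({TYPE_SIZE_BYTES[dst]} bytes)"
        )
    return None                # same-width reinterpretation stays legal


print(check_assignable("string", "i32", "none"))  # rejected: widths differ
print(check_assignable("i64", "isize", "none"))   # allowed: prints None
```

In this model the check is deliberately narrow: it never second-guesses a modeled coercion or an explicit cast, which matches the note's claim that same-width reinterpretations keep working.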
# 0197 — annotated assignment with an incompatible type is unchecked (segfaults)
**Symptom** — A variable / constant declared with an explicit type annotation and


@@ -0,0 +1,98 @@
> **RESOLVED** (2026-06-27). Fix: an IMPLICIT `Any → T` unbox is now a COMPILE
> ERROR (`coerceMode`'s `.unbox_any` arm, `mode == .implicit`, in
> `src/ir/lower/coerce.zig`). sx prevents this unsafe class at compile time —
> like the no-implicit-optional-unwrap rule — rather than with a runtime trap
> (the LLVM backend has no runtime-abort infra by design; compiled code relies on
> compile-time flow analysis). The escape hatches are unaffected: an explicit
> `xx some_any` (handled by `lowerXX`'s own unbox arm) and the compiler-generated
> type-dispatch / variadic-pack-extraction unboxes (which emit `.unbox_any`
> directly, not via `coerceMode`) all still work, as do `print`/`type_name`/`{}`
> formatting of an `Any`. So both 0198 cases are fixed: `s : S = some_any` (was a
> segfault) and `f : f64 = some_any` (was a silent `0.0`) now emit a clean
> compile error. Adversarial review found no false-positive (every legitimate
> `Any` pattern still works) and no surviving silent/segfault path. Regression
> test: `examples/diagnostics/1207-diagnostics-any-implicit-unbox-rejected.sx`.
>
> A SEPARATE pre-existing bug surfaced during the review — `Any == <concrete>`
> (one operand `Any`) aborts the LLVM verifier — filed as **issue 0199**.
# 0198 — unboxing an `Any` to a mismatched type is unchecked (silent-wrong / segfaults)
**Symptom** — Extracting a concrete value from an `Any` (the implicit
`Any → T` unbox, `classify == .unbox_any`) does NO runtime tag check: if the
boxed type does not match the unbox target `T`, the boxed bits are reinterpreted
blindly. For a scalar mismatch this silently produces garbage; for an aggregate
target it treats the boxed scalar as a pointer and dereferences it, **segfaulting**.
- Observed:
  - `Any(boxed i64 5) → i64` → `5` (correct).
  - `Any(boxed i64 5) → f64` → `0.000000` (silent garbage: raw bit reinterpret, no diagnostic).
  - `Any(boxed i64 5) → struct{a:i32; b:i32}` → **Segmentation fault** (the i64 `5`
    is treated as a struct pointer and dereferenced).
- Expected: a runtime trap / clean diagnostic on a tag mismatch (the `Any` box
carries a type tag in field 0 — `{i64 tag, i64 value}` — so a checked unbox is
feasible), OR at minimum no memory-unsafe dereference.
This is DISTINCT from issue 0197 (the compile-time `.none` annotated-assignment
gap, now fixed): here the static types `Any → T` are a *legal* unbox, so the
mismatch is only knowable at runtime via the tag. It was surfaced by the
adversarial review of the 0197 fix — the 0197 size guard correctly does NOT
fire here because `classify(Any, T) == .unbox_any`, not `.none`.
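
The `0.000000` result is what a raw bit reinterpretation would be expected to produce. The Python sketch below is not the compiler's `coerceFromI64`; it assumes, following the "raw bit reinterpret" wording above, that the 8 payload bytes of the boxed `i64` are read back as an IEEE-754 `f64`. The helper name is made up for the example.

```python
import struct


def reinterpret_i64_as_f64(value: int) -> float:
    """Read the 8 bytes of a little-endian i64 back as an f64 (no conversion)."""
    return struct.unpack("<d", struct.pack("<q", value))[0]


garbage = reinterpret_i64_as_f64(5)
# The bit pattern of the integer 5 is a tiny subnormal double, so fixed-point
# formatting with six decimals shows only zeros.
print(f"{garbage:f}")  # 0.000000
```

The value is not exactly zero, it is just far below what six decimal places can show, which is why the mismatch goes unnoticed at runtime.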
## Reproduction
```sx
#import "modules/std.sx";
S :: struct { a: i32; b: i32; }
main :: () -> i64 {
    x : Any = 5; // boxes an i64
    s : S = x; // Any → S unbox: NO tag check
    print("unreached\n");
    return 0;
}
```
`./zig-out/bin/sx run repro.sx` → `Segmentation fault`. `sx ir` lowers fine.
A non-crashing but silently-wrong variant: change `s : S = x;` to
`f : f64 = x;` — prints `0.000000` with no diagnostic.
## Investigation prompt
The unbox is lowered as `Op.unbox_any` (coerce.zig, the `.unbox_any` arm of
`coerceMode` / `lowerXX`) and emitted by `emitUnboxAny`
(`src/backend/llvm/ops.zig:2462`):
```zig
pub fn emitUnboxAny(self: Ops, instruction: *const Inst, un: UnaryOp) void {
    const any_val = self.e.resolveRef(un.operand);
    const any_kind = c.LLVMGetTypeKind(c.LLVMTypeOf(any_val));
    if (any_kind == c.LLVMStructTypeKind) {
        const raw = c.LLVMBuildExtractValue(self.e.builder, any_val, 1, "ua.raw"); // field 1 = boxed value (i64)
        const target_ty = self.e.toLLVMType(instruction.ty);
        self.e.mapRef(self.e.coerceFromI64(raw, target_ty)); // ← no tag check; struct target derefs the scalar
    } else {
        self.e.mapRef(c.LLVMGetUndef(self.e.toLLVMType(instruction.ty)));
    }
}
```
The `Any` box is `{ i64 type_tag, i64 value }`. Field 0 is the type tag (the
boxing site stores the source `TypeId`). The fix likely needs `emitUnboxAny` to
compare field 0 against `instruction.ty`'s tag and, on mismatch, trap with a
located runtime diagnostic (mirror the optional-unwrap / bounds-check trap
pattern) rather than `coerceFromI64`-ing arbitrary bits. For an aggregate target
the current `coerceFromI64` path is itself wrong (a >8-byte boxed value is
heap-stored as a pointer in field 1; a fits-in-8 scalar is stored inline) — the
unbox must distinguish the two by the boxed type, which the tag enables.
Decision needed: does sx want `Any` unbox to be CHECKED (trap on mismatch, the
safe default) or remain an unchecked escape hatch (then `xx`/an explicit
checked-cast builtin should be the only spelling, and the implicit
`x : T = some_any` unbox should at least not dereference a scalar as a pointer)?
See `specs.md` for the intended `Any` semantics before choosing.
Verification: run the repro; expect a clean trap/diagnostic (or a checked-cast
requirement), NOT a segfault, and the `f64` variant to not silently yield `0.0`.
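
For comparison with the compile-time rule that was ultimately adopted, the runtime option discussed in this prompt (compare field 0 against the target's tag before touching the payload) might look like the Python model below. It is a sketch under stated assumptions: the `AnyBox` class, the tag constants, and the exception type are all hypothetical, and only the `{tag, value}` layout is taken from the issue text.

```python
from typing import NamedTuple


class AnyBox(NamedTuple):
    type_tag: int  # field 0: tag of the boxed type
    value: int     # field 1: payload word


# Hypothetical tag ids; the real box stores the source TypeId.
TAG_I64 = 1
TAG_F64 = 2


class UnboxMismatch(Exception):
    """Stands in for a located runtime trap."""


def box_i64(v: int) -> AnyBox:
    return AnyBox(TAG_I64, v)


def checked_unbox(box: AnyBox, target_tag: int) -> int:
    """Return the payload only if the boxed tag matches the unbox target."""
    if box.type_tag != target_tag:
        raise UnboxMismatch(
            f"Any holds tag {box.type_tag}, cannot unbox as tag {target_tag}"
        )
    return box.value


x = box_i64(5)
print(checked_unbox(x, TAG_I64))  # 5
try:
    checked_unbox(x, TAG_F64)
except UnboxMismatch as err:
    print("trap:", err)
```

A check like this needs some way to abort at runtime, which is the trade-off the resolution note cites for rejecting the implicit unbox at compile time instead.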


@@ -0,0 +1,61 @@
> **RESOLVED** (2026-06-27). Fix: the `Any`-shaped `==`/`!=` arm in
> `src/ir/lower/expr.zig` now fires when EITHER operand is `.any` (was both). A
> concrete operand is boxed to `Any` (`builder.boxAny`) first, so both sides are
> 16-byte boxes; then both unbox to their `.i64` value words and compare — the
> same value-identity the both-`Any` path uses (tags not compared). An
> already-errored `.unresolved` / `.void` operand falls through (no cascade).
> Verified: `x == 5`, `x == 6`, `x != 6`, `5 == x` (reversed), bool `Any`, and the
> both-`Any` form all work; no verifier abort. Regression test:
> `examples/comptime/0654-comptime-any-eq-concrete.sx`. (Aggregate-`Any`
> comparison still uses value-word identity — the same limitation the both-`Any`
> path always had; orthogonal to this verifier fix.)
# 0199 — `Any == <concrete>` (one operand `Any`) fails LLVM verification
**Symptom** — An equality / inequality comparison where exactly ONE operand is
`Any` and the other is a concrete type is not handled: it falls through to a
plain `icmp` on a 16-byte `{tag, value}` aggregate vs a scalar and aborts the
LLVM verifier.
- Observed: `x : Any = 5; if x == 5 { ... }` →
`error: Both operands to ICmp are not of the same type! {i64,i64} vs i64`,
`LLVM verification failed`, exit 1 (loud — not a segfault / silent miscompile).
- Expected: either box the concrete operand to `Any` (then compare as `Any ==
Any`, the path that already works) consulting the tag, OR a clean located
compile diagnostic (e.g. "compare an 'Any' against a value of its boxed type,
or `xx` the Any first"). Not an LLVM verifier abort.
Distinct from issue 0198 (the implicit `Any → T` unbox). Surfaced by the
adversarial review of the 0198 fix. `Any == Any` works correctly.
## Reproduction
```sx
#import "modules/std.sx";
main :: () -> i64 {
    x : Any = 5;
    if x == 5 { return 1; } // error: ICmp operand type mismatch {i64,i64} vs i64
    return 0;
}
```
`./zig-out/bin/sx run repro.sx` → `LLVM verification failed`, exit 1.
## Investigation prompt
The `Any` equality path is in `src/ir/lower/expr.zig` (~3201-3215), gated on
`lhs_ty == .any and rhs_ty == .any` — it `unbox_any`s both sides to `.i64` and
`cmp_eq`s the value words. When only ONE side is `.any`, that guard is false and
the comparison falls through to the generic numeric/`icmp` path, which emits an
`icmp` between the 16-byte `Any` aggregate and the scalar → verifier abort.
The fix likely adds a mixed-operand arm: when exactly one operand is `.any` and
the other is a concrete type `T`, box the concrete operand to `Any`
(`self.builder.boxAny(concrete, T)`) and reuse the existing `Any == Any`
value-word comparison — OR, if comparing only the payload word is unsound across
types (a `5:i64` and a `5.0:f64` would compare equal by bits), gate on the tag
too / emit a diagnostic. Decide whether `Any == concrete` should compare by
(tag AND value) or be disallowed; mirror whatever `Any == Any` semantics are
documented. Verify: the repro compiles and `x == 5` is true, OR a clean
diagnostic is emitted — never an LLVM verifier abort.
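
The box-then-compare approach can be modeled in a few lines of Python. This is an illustrative sketch, not the code in `expr.zig`: `box_any` here is a stand-in for `builder.boxAny`, the tag constants are hypothetical, and storing a bool as the payload word `0`/`1` is an assumption. Following the resolution note, the comparison looks only at the value words and ignores the tags.

```python
from typing import NamedTuple, Union


class AnyBox(NamedTuple):
    type_tag: int  # field 0
    value: int     # field 1: payload word


# Hypothetical tag ids for the example.
TAG_I64 = 1
TAG_BOOL = 2


def box_any(v: Union[AnyBox, bool, int]) -> AnyBox:
    """Box a concrete operand; an operand that is already an Any passes through."""
    if isinstance(v, AnyBox):
        return v
    if isinstance(v, bool):  # check bool first: bool is a subclass of int
        return AnyBox(TAG_BOOL, int(v))
    return AnyBox(TAG_I64, v)


def any_eq(lhs, rhs) -> bool:
    """Equality that fires when either operand is an Any.

    Both sides are boxed first, then only the value words are compared.
    """
    return box_any(lhs).value == box_any(rhs).value


x = box_any(5)
print(any_eq(x, 5), any_eq(x, 6), any_eq(5, x))  # True False True
```

Because the tags are not consulted, a boxed `true` and the integer `1` compare equal in this model. That is the cross-type soundness question this prompt raises.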


@@ -0,0 +1,78 @@
> **RESOLVED** (2026-06-27). Root cause exactly as hypothesized: the generic
> monomorph path `monomorphizeFunction` (`src/ir/lower/generic.zig`) bound params
> and lowered the body via `lowerValueBody`, but NEVER called
> `bindNamedReturnSlots` — so `named_return_names` stayed null and the
> implicit-return synthesis (`lowerValueBody`, stmt.zig) didn't fire. (The
> non-generic decl path `lowerFunctionBodyInto` already called it.) Fix: call
> `bindNamedReturnSlots(fd, ret_ty, &scope)` in `monomorphizeFunction` after
> param-binding, with the same `named_return_names`/`named_return_defaults`
> save/restore. Covers generic free functions AND generic struct methods (the
> instance-method path shares the monomorph), with defaults and the failable
> error channel. Regression test: `examples/types/0218-types-multi-return-generic-named.sx`.
# 0200 — named-return locals don't synthesize the implicit return in a GENERIC multi-return function
**Symptom** — A generic function with a NAMED multi-return (`-> (first: $T, second: $U)`)
that relies on the implicit return (assigns the named slot locals, no explicit
`return`) fails to compile: the named-return-locals synthesis does not fire for
the monomorphized instance, so it reports "body produces no value".
- Observed: `pair :: (a: $T, b: $U) -> (first: T, second: U) { first = a; second = b; }` →
`error: function returns '(first: i64, second: bool)' but its body produces
no value — end it with a trailing expression (no ';') or an explicit 'return'`.
- Expected: the named slot locals (`first`, `second`) are bound and the implicit
return is synthesized from them, exactly as for a NON-generic named
multi-return.
Note the diagnostic shows the return type RESOLVED to concrete types
(`(first: i64, second: bool)`) — so binding/return-type resolution ran; only the
named-return-LOCALS path (`bindNamedReturnSlots` → `self.named_return_names`) did
not take effect for the generic instance.
WORKS (so this is narrow): the POSITIONAL generic multi-return with an explicit
return is fine — `(a: $T, b: $U) -> (T, U) { return a, b; }` and explicit-type
`pair(i32, bool, 7, true)` both run correctly. Only the named-slot IMPLICIT-return
form × generic monomorph is broken. Workaround: use an explicit `return a, b`.
## Reproduction
```sx
#import "modules/std.sx";
pair :: (a: $T, b: $U) -> (first: T, second: U) {
    first = a;
    second = b; // implicit return from named slots — never synthesized
}
main :: () -> i64 {
    x, y := pair(7, true);
    print("{} {}\n", x, y);
    return 0;
}
```
`./zig-out/bin/sx run repro.sx` → the "produces no value" error, exit 1.
## Investigation prompt
The implicit-return-from-named-slots synthesis (`lowerValueBody` in
`src/ir/lower/stmt.zig` ~line 172: `if (self.named_return_names) |names| { … }`)
only fires when `self.named_return_names` is set by `bindNamedReturnSlots`
(`src/ir/lower/stmt.zig` ~258). That binder is called from `lowerFunctionBodyInto`
(`src/ir/lower/decl.zig:2729`). `bindNamedReturnSlots` early-returns unless
`fd.return_type.?.data == .return_type_expr`.
The likely cause: the generic-FREE-function monomorph lowers the instance with a
SUBSTITUTED return-type node (the `$T`/`$U` resolved into a concrete
`tuple_type_expr` or a resolved TypeId), so `fd.return_type.data` is no longer
`.return_type_expr` → `bindNamedReturnSlots` early-returns → `named_return_names`
stays null → the implicit return isn't synthesized. Confirm by checking the
generic free-function instantiation path (search `instantiateGeneric` /
`lazyLowerFunction` / the monomorph that rewrites `fd` for free functions): does
it preserve the original `ReturnTypeExpr` AST node (binding via `type_bindings`),
or rewrite it? The fix likely keys `bindNamedReturnSlots` off the ORIGINAL
template `fd.return_type` (which carries `field_names`), or threads the
field-names through the monomorph. Generic STRUCT methods may have the same gap —
test `Box(T)` with a named multi-return method.
Verify: the repro prints `7 true`, exit 0. Add a positive generics example.
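
The failure mode, and the fix recorded in the resolution note at the top of this issue, can be illustrated with a toy Python model of the lowering state. Everything here is hypothetical scaffolding: the `Lowerer` class, the dict-shaped function description, and the `bind_slots` switch exist only to show why skipping the binder leaves `named_return_names` null and how a save/restore around the monomorph keeps the outer state intact.

```python
class LoweringError(Exception):
    pass


class Lowerer:
    def __init__(self):
        self.named_return_names = None  # set only by the binder

    def bind_named_return_slots(self, fn):
        names = fn.get("named_returns")
        if names:
            self.named_return_names = list(names)

    def lower_value_body(self, fn):
        # The implicit return is synthesized only when the names are bound.
        if self.named_return_names is not None:
            return "return " + ", ".join(self.named_return_names)
        raise LoweringError(f"{fn['name']}: body produces no value")

    def monomorphize(self, fn, bind_slots):
        saved = self.named_return_names  # save the caller's state
        self.named_return_names = None
        try:
            if bind_slots:               # the call the buggy path skipped
                self.bind_named_return_slots(fn)
            return self.lower_value_body(fn)
        finally:
            self.named_return_names = saved  # restore on every exit


PAIR = {"name": "pair", "named_returns": ["first", "second"]}
lowerer = Lowerer()
print(lowerer.monomorphize(PAIR, bind_slots=True))  # return first, second
```

Calling `monomorphize(PAIR, bind_slots=False)` raises the "produces no value" error in this model, mirroring the observed diagnostic.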