fix: reject direct assignment to a tagged-union variant member
A tagged union (enum-with-payload) is laid out { tag, payload }, but a
direct member write `s.rect = payload` lowered to a payload-only store
(union_gep into field 1) with no tag store — the discriminant went stale,
so a later match/== took the wrong arm with no diagnostic (issue 0136).
The read path already distinguishes tagged unions (enum_payload/enum_tag);
the write path treated them like plain unions.
A variant is set via construction (`s = .variant(payload)`, which writes
both tag and payload). A direct member write can't safely set the tag (the
active variant isn't known at the write site), so it is now rejected with a
diagnostic pointing to construction. A new diagTaggedUnionVariantWrite guard
— reusing the shared fieldLvalueResolve matcher, applied at both store sites
(lowerAssignment, lowerMultiAssign) — fires only for a whole-variant write
on a tagged union. Plain `union` writes and nested sub-field writes
(`s.rect.w = ...`) are unaffected.
Resolves issue 0136. Tests: examples/0185 (rejected), 0186 (nested write +
construction still work). specs.md / readme.md updated.
This commit is contained in:
139
issues/0136-tagged-union-member-write-does-not-set-tag.md
Normal file
139
issues/0136-tagged-union-member-write-does-not-set-tag.md
Normal file
@@ -0,0 +1,139 @@
|
||||
# 0136 — direct write to a tagged-union member updates the payload but not the tag
|
||||
|
||||
> **RESOLVED (2026-06-13).** Root cause: the read path distinguishes a tagged
|
||||
> union (`enum_payload`/`enum_tag`, field 1/field 0) but the write path treated
|
||||
> it like a plain union (`fieldLvalueResolve` → `.union_direct` → `union_gep`),
|
||||
> storing the payload (field 1) with no tag store — so the discriminant went
|
||||
> stale. Fix (chosen option 1 — reject; the spec only blesses construction /
|
||||
> read / match for tagged unions, and no corpus relied on member writes): a new
|
||||
> `diagTaggedUnionVariantWrite` guard (`src/ir/lower/stmt.zig`, reusing the
|
||||
> shared `fieldLvalueResolve` matcher, registered in `src/ir/lower.zig`) rejects
|
||||
> a direct whole-variant member assignment at both store sites (`lowerAssignment`
|
||||
> and `lowerMultiAssign`) with a diagnostic pointing to `s = .variant(...)`.
|
||||
> Plain `union` writes and nested sub-field writes (`s.rect.w = ...`) are
|
||||
> unaffected (they don't resolve to `.union_direct` on a tagged union).
|
||||
> Regression tests: `examples/0185-types-tagged-union-member-assign-rejected.sx`
|
||||
> (rejected), `examples/0186-types-tagged-union-nested-field-write.sx` (nested
|
||||
> write + construction still work). Spec/readme updated (enum section).
|
||||
|
||||
## Symptom
|
||||
|
||||
One-line: `s.rect = .{ ... }` on a tagged union (`enum`-with-payload) stores
|
||||
into the payload area but leaves the discriminant (tag) untouched, so a later
|
||||
`match`/`==` reads the STALE tag while the payload holds the new variant — a
|
||||
silent tag/payload desync with no diagnostic.
|
||||
|
||||
- **Observed:** after `s : Shape = .circle(1.0); s.rect = .{ w=4, h=2 };`, a
|
||||
`match` on `s` takes the `.circle` arm (tag never updated) even though the
|
||||
payload now holds the rect. The wrong-variant payload read (`s.rect`) returns
|
||||
the written bytes, masking the inconsistency.
|
||||
- **Expected:** either a compile error directing the user to construction
|
||||
(`s = .rect(...)`), or the member write sets the tag too so `s` becomes the
|
||||
`.rect` variant and `match` sees `.rect`.
|
||||
|
||||
Same-variant write is fine (`s.circle = 9.0` while the tag is already
|
||||
`circle`): the payload updates and the tag already matched. Only a write whose
|
||||
variant differs from the current tag desyncs — and the compiler can't know the
|
||||
runtime tag at the write site, so the danger is inherent to the operation.
|
||||
|
||||
## Reproduction
|
||||
|
||||
`issues/0136-tagged-union-member-write-does-not-set-tag.sx` (standalone, only
|
||||
`modules/std.sx`):
|
||||
|
||||
```sx
|
||||
#import "modules/std.sx";
|
||||
|
||||
Shape :: enum {
|
||||
circle: f32;
|
||||
rect: struct { w, h: f32; };
|
||||
}
|
||||
|
||||
main :: () {
|
||||
s : Shape = .circle(1.0); // tag = circle, payload = 1.0
|
||||
s.rect = .{ w = 4.0, h = 2.0 }; // writes rect payload; tag stays circle
|
||||
if s == {
|
||||
case .circle: print("tag=circle (STALE — wrote rect)\n");
|
||||
case .rect: print("tag=rect (correct)\n");
|
||||
}
|
||||
r := s.rect;
|
||||
print("rect.w={} rect.h={}\n", r.w, r.h);
|
||||
}
|
||||
```
|
||||
|
||||
Run: `./zig-out/bin/sx run issues/0136-tagged-union-member-write-does-not-set-tag.sx`
|
||||
→ today prints `tag=circle (STALE — wrote rect)` then `rect.w=4 rect.h=2`.
|
||||
|
||||
## Root cause (traced)
|
||||
|
||||
A tagged union is laid out `{ tag (field 0), payload (field 1) }`
|
||||
(`src/ir/types.zig` sizeOf: `tag_sz + max_field`). The READ and WRITE paths
|
||||
treat it asymmetrically:
|
||||
|
||||
- **Read** distinguishes a tagged union: `lowerFieldAccess`
|
||||
(`src/ir/lower/expr.zig`) emits `enum_payload` → `emitEnumPayload`
|
||||
(`src/backend/llvm/ops.zig:1392`) GEPs **field 1** (payload); `.tag` /
|
||||
`==` use `enum_tag` (field 0).
|
||||
- **Write** treats a tagged union like a plain union: `fieldLvalueResolve`
|
||||
(`src/ir/lower/stmt.zig`) maps a tagged-union member to `.union_direct` →
|
||||
`fieldLvaluePtr` emits `union_gep` → `emitUnionGep`
|
||||
(`src/backend/llvm/ops.zig:1430`) GEPs **field 1** (payload) and stores there.
|
||||
|
||||
So the payload OFFSET is correct (both use field 1 — not an out-of-bounds /
|
||||
clobber bug), but the write never emits the tag (field 0) store that
|
||||
construction does. Construction `s = .rect(...)` lowers via `enum_init`
|
||||
(`src/ir/inst.zig:170`), which writes BOTH tag and payload; the member-write
|
||||
path emits only the payload store.
|
||||
|
||||
## Investigation prompt
|
||||
|
||||
> A direct write to a tagged-union member (`s.rect = .{...}`) updates the
|
||||
> payload but not the discriminant, silently desyncing tag and payload. Repro:
|
||||
> `issues/0136-tagged-union-member-write-does-not-set-tag.sx` (today prints the
|
||||
> STALE tag; the fix should make it either a compile error or set the tag).
|
||||
>
|
||||
> Root: `fieldLvalueResolve` (`src/ir/lower/stmt.zig`) maps a tagged-union
|
||||
> member to `.union_direct`, so `fieldLvaluePtr` emits a bare `union_gep`
|
||||
> (payload pointer, field 1) and `lowerAssignment` stores only the payload. The
|
||||
> tag (field 0) is never written. Construction (`enum_init`) writes both.
|
||||
>
|
||||
> Two reasonable fixes (pick one — both beat the silent desync):
|
||||
> 1. **Reject** direct assignment to a tagged-union variant member with a
|
||||
> diagnostic ("set a tagged-union variant via `s = .rect(...)`; direct
|
||||
> member assignment can't set the discriminant"). Simplest and safe — the
|
||||
> construction syntax already covers the intent, and the compiler can't
|
||||
> know the runtime tag to validate a same-variant write anyway. Detect in
|
||||
> `lowerAssignment`'s `.field_access` arm when the object type is a
|
||||
> `tagged_union` and the field is a variant.
|
||||
> 2. **Set the tag too**: make `s.rect = payload` equivalent to
|
||||
> `s = .rect(payload)` — emit a tag store (the variant discriminant, which
|
||||
> `fieldLvalueResolve` already knows as the variant index) alongside the
|
||||
> payload store. More ergonomic but more plumbing (the assignment path
|
||||
> must emit two stores, and compound ops like `s.rect.w += 1` need
|
||||
> thought). Mirror how `enum_init` sets the tag.
|
||||
> Do NOT leave the silent payload-only write. Note the read/write asymmetry
|
||||
> (read uses `enum_payload`, write uses `union_gep`) is the structural root;
|
||||
> whichever fix, keep plain `union` (no tag) working — only `tagged_union`
|
||||
> needs the new behavior.
|
||||
>
|
||||
> Verification: the repro errors (option 1) or `match` sees `.rect` (option 2);
|
||||
> then `zig build && zig build test` green. Add coverage under
|
||||
> `examples/01xx-types-...` (a tagged-union member write: rejected, or
|
||||
> tag-setting round-trips through `match`).
|
||||
|
||||
## Notes
|
||||
|
||||
- PRE-EXISTING and orthogonal to issues 0133 / 0135. Surfaced while reviewing
|
||||
the 0133 fix (which unified the lvalue field resolver): the read path
|
||||
special-cases `tagged_union` but the write path does not. The 0133 fix itself
|
||||
is about PLAIN `union` (no tag) and did not introduce or change this — it
|
||||
preserved the existing `union_direct → union_gep` routing for tagged-union
|
||||
members.
|
||||
- Related read-side gap (probably same fix family, not verified here): reading
|
||||
the WRONG variant (`s.rect` while tag is `circle`) also returns raw payload
|
||||
bytes with no tag check (`emitEnumPayload` doesn't check the discriminant).
|
||||
A variant-safe accessor / checked read is a separate consideration.
|
||||
- Sites: write — `fieldLvalueResolve`/`fieldLvaluePtr` (`src/ir/lower/stmt.zig`),
|
||||
`emitUnionGep` (`src/backend/llvm/ops.zig:1430`); read — `emitEnumPayload`
|
||||
(`src/backend/llvm/ops.zig:1392`); construction — `enum_init`
|
||||
(`src/ir/inst.zig:170`); layout — `src/ir/types.zig` (tagged_union sizeOf).
|
||||
Reference in New Issue
Block a user