Files
sx/issues/0078-string-eq-operand-of-short-circuit-and-invalid-phi.md
agra 7c1b90519f fix(emit): PHI predecessor for and/or operand that emits sub-CFG (issue 0078)
A string `==`/`!=` used as an operand of a short-circuit `and`/`or` emitted
invalid LLVM (`PHI node entries do not match predecessors!`). String compares
expand into their own memcmp sub-CFG during LLVM emission, so the operand
finishes in a later basic block (`str.merge`) than the one the IR block
started in. `fixupPhiNodes` wired the short-circuit merge PHI's incoming edge
to `block_map[ir_block]` (the block the IR block started as), recording a
stale predecessor (`%entry`/`%and.rhs.0`).

Fix: record the builder's actual insertion block after emitting each IR
block's instructions (`term_block_map`, via `LLVMGetInsertBlock`) and use it
as the PHI predecessor. General — corrects the incoming block for any operand
that emitted intermediate basic blocks (string `==`, value `match`, …), not
just string `==`.

Regression: examples/0045-basic-string-eq-short-circuit.sx (string `==` on
both sides of `and` and of `or`, plus a match-value + enum-payload `==` shape).
Fails (LLVM abort) pre-fix, passes after.
2026-06-04 02:00:13 +03:00

5.8 KiB

0078 — string == as an and/or operand emits an invalid PHI

RESOLVED. Root cause was in the LLVM emitter, not the and/or lowering: fixupPhiNodes wired each short-circuit merge PHI's incoming edge to block_map[ir_block] — the LLVM block the IR block started as. But a single IR instruction can expand into its own sub-CFG during emission (string =='s str.memcmp/str.merge blocks; a value match's arm blocks), leaving the builder in a later block. The terminator — and therefore the real predecessor edge — lands in that later block, so the recorded predecessor was stale (%entry/%and.rhs.0 instead of %str.merge). Fix: in src/ir/emit_llvm.zig, record the builder's actual insertion block after emitting each IR block's instructions (term_block_map, captured via LLVMGetInsertBlock) and use that as the PHI predecessor in fixupPhiNodes. General — corrects the incoming block for ANY operand that emitted intermediate basic blocks, not just string ==. Mirrors the issue-0066 "stale PHI incoming-block after an operand emits new blocks" shape. Regression: examples/0045-basic-string-eq-short-circuit.sx.

Symptom

A string equality (a == "x") used as an operand of a short-circuit and / or emits LLVM IR that fails verification — the JIT (sx run) and AOT paths both abort before running:

LLVM verification failed: PHI node entries do not match predecessors!
  %bp = phi i1 [ false, %entry ], [ %str.eq10, %and.rhs.0 ]
label %entry
label %str.merge
Instruction does not dominate all uses!
  %str.eq10 = phi i1 [ false, %and.rhs.0 ], [ %str.ceq9, %str.memcmp6 ]
  %bp = phi i1 [ false, %entry ], [ %str.eq10, %and.rhs.0 ]

Integer/error-tag equality in the same position is fine — only the string == operand miscompiles, because string == lowers to its own multi-block memcmp with an internal PHI (str.eq ← {str.memcmp, short-circuit false}). When that result is then consumed by the and/or short-circuit merge, the predecessor set the outer PHI records does not match the actual CFG: the string-compare's merge block becomes a predecessor of the and merge, but the outer PHI still lists the original entry/and.rhs edges. The inner str.eq PHI also ends up referenced from a block it does not dominate.

Reproduction

#import "modules/std.sx";
main :: () {
    a := "k";
    b := "v";
    r := a == "k" and b == "v";   // string == as an `and` operand
    print("{}\n", r);
}
$ ./zig-out/bin/sx run repro.sx
LLVM verification failed: PHI node entries do not match predecessors!
...

a == "k" or b == "v" reproduces it identically (or.rhs in place of and.rhs). A single a == "k" (no and/or) compiles and runs fine, as does x == 1 and y == 2 (integer operands). So the trigger is specifically a string ==/!= as an operand of a short-circuit and/or — the operand emits its own str.memcmp/str.merge sub-CFG, and the short-circuit PHI then records a stale predecessor block.

A related match.merge-predecessor variant of the same PHI mismatch also appears in a LARGER function that mixes several enum-payload accesses (v.str/v.int_) and match expressions with multiple and/or operations (it surfaced while writing examples/0714-modules-json-reader.sx). It did NOT reduce to a small standalone repro — each construct compiles fine in isolation, and a single payload-access operand (true and e.a == 1) or a preceding match expression followed by an and of locals both compile — which points at cumulative basic-block bookkeeping in the and/or lowering rather than a single local pattern. The string-== case above is the reliable minimal reproduction; the broader fix should address PHI predecessor tracking for any and/or operand that emits intermediate basic blocks.

Expected

r should be true (both compares hold) and the program print true. Generally: a string ==/!= result must be usable as an operand of and/or exactly like any other bool.

Workaround (until fixed)

Don't combine string equality with and/or in one expression; split into separate statements / separate boolean locals:

ok_k := a == "k";
ok_v := b == "v";
r := ok_k and ok_v;   // each string-eq materialized before the short-circuit

Background / where to look

The string == lowering (search str.eq / str.memcmp / str.merge block names in src/ir/lower.zig) produces a value via a PHI that joins the memcmp-equal block and the early-out (length-mismatch / short-circuit) block. The boolean and/or lowering builds its own and.rhs / and.merge (resp. or.*) blocks and a merge PHI. When the LHS (or RHS) of the and/or is itself a string compare, the outer short-circuit lowering must take the string-compare's actual current block (its merge block) as the incoming predecessor for the outer PHI — not the block that was current before the string compare emitted its sub-CFG. The mismatch above is the classic "PHI incoming-block is stale after the operand emitted new basic blocks" bug: the fix is to re-read the builder's current insertion block when wiring the and/or PHI incoming edges, rather than caching it before lowering the operand. This mirrors the shape of the match-arm PHI fix in issue 0066.

Discovered while writing the std.json reader regression example (examples/0714-modules-json-reader.sx, flow step F2.2): an assertion key == "k" and val.str == "v" triggered it. The reader library code itself does not use this pattern; the example was rewritten to assert the two string equalities separately.

Verification (once fixed)

./zig-out/bin/sx run repro.sx      # prints: true

Add a regression example (next free examples/NNNN-*.sx slot) that uses a string == on both sides of an and and on both sides of an or, and the full suite + zig build test must stay green.