A string `==`/`!=` used as an operand of a short-circuit `and`/`or` emitted invalid LLVM (`PHI node entries do not match predecessors!`). String compares expand into their own memcmp sub-CFG during LLVM emission, so the operand finishes in a later basic block (`str.merge`) than the one the IR block started in. `fixupPhiNodes` wired the short-circuit merge PHI's incoming edge to `block_map[ir_block]` (the block the IR block started as), recording a stale predecessor (`%entry`/`%and.rhs.0`). Fix: record the builder's actual insertion block after emitting each IR block's instructions (`term_block_map`, via `LLVMGetInsertBlock`) and use it as the PHI predecessor. General — corrects the incoming block for any operand that emitted intermediate basic blocks (string `==`, value `match`, …), not just string `==`. Regression: examples/0045-basic-string-eq-short-circuit.sx (string `==` on both sides of `and` and of `or`, plus a match-value + enum-payload `==` shape). Fails (LLVM abort) pre-fix, passes after.
0078 — string == as an and/or operand emits an invalid PHI
RESOLVED. Root cause was in the LLVM emitter, not the and/or lowering:
fixupPhiNodes wired each short-circuit merge PHI's incoming edge to
block_map[ir_block], the LLVM block the IR block started as. But a single
IR instruction can expand into its own sub-CFG during emission (the
str.memcmp/str.merge blocks of a string ==; the arm blocks of a value
match), leaving the builder in a later block. The terminator, and therefore
the real predecessor edge, lands in that later block, so the recorded
predecessor was stale (%entry/%and.rhs.0 instead of %str.merge). Fix: in
src/ir/emit_llvm.zig, record the builder's actual insertion block after
emitting each IR block's instructions (term_block_map, captured via
LLVMGetInsertBlock) and use that as the PHI predecessor in fixupPhiNodes.
General: corrects the incoming block for ANY operand that emitted
intermediate basic blocks, not just string ==. Mirrors the issue-0066
"stale PHI incoming-block after an operand emits new blocks" shape.
Regression: examples/0045-basic-string-eq-short-circuit.sx.
Symptom
A string equality (a == "x") used as an operand of a short-circuit
and / or emits LLVM IR that fails verification — the JIT (sx run)
and AOT paths both abort before running:
LLVM verification failed: PHI node entries do not match predecessors!
%bp = phi i1 [ false, %entry ], [ %str.eq10, %and.rhs.0 ]
label %entry
label %str.merge
Instruction does not dominate all uses!
%str.eq10 = phi i1 [ false, %and.rhs.0 ], [ %str.ceq9, %str.memcmp6 ]
%bp = phi i1 [ false, %entry ], [ %str.eq10, %and.rhs.0 ]
Integer/error-tag equality in the same position is fine — only the
string == operand miscompiles, because string == lowers to its own
multi-block memcmp with an internal PHI (str.eq ← {str.memcmp,
short-circuit false}). When that result is then consumed by the and/or
short-circuit merge, the predecessor set the outer PHI records does not
match the actual CFG: the string-compare's merge block becomes a
predecessor of the and merge, but the outer PHI still lists the original
entry/and.rhs edges. The inner str.eq PHI also ends up referenced
from a block it does not dominate.
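The rule the verifier enforces here can be sketched as a toy model in Python: a PHI's incoming blocks must be exactly the CFG predecessors of the block that holds it. This is an illustration only, not the compiler's code. The CFG below is an assumed shape for the repro (string compare on both sides of the and), and the .rhs-suffixed block names are hypothetical stand-ins for the numbered blocks the emitter actually produces.

```python
# Toy model of the PHI/predecessor rule. The CFG shape and the ".rhs"
# block names are assumptions made for illustration.

def predecessors(cfg, block):
    """Return the sorted names of all blocks with an edge into `block`."""
    return sorted(src for src, dsts in cfg.items() if block in dsts)

def phi_matches_cfg(cfg, block, phi_incoming):
    """A PHI is well formed only if its incoming blocks equal the predecessors."""
    return sorted(phi_incoming) == predecessors(cfg, block)

# Each string compare branches from its starting block to either the memcmp
# block or straight to its merge block (the early-out edge).
cfg = {
    "entry": ["str.memcmp", "str.merge"],
    "str.memcmp": ["str.merge"],
    "str.merge": ["and.rhs.0", "and.merge"],   # short-circuit branch lives here
    "and.rhs.0": ["str.memcmp.rhs", "str.merge.rhs"],
    "str.memcmp.rhs": ["str.merge.rhs"],
    "str.merge.rhs": ["and.merge"],
    "and.merge": [],
}

stale_incoming = ["entry", "and.rhs.0"]          # blocks the operands started in
actual_incoming = ["str.merge", "str.merge.rhs"]  # blocks the operands ended in

stale_ok = phi_matches_cfg(cfg, "and.merge", stale_incoming)
actual_ok = phi_matches_cfg(cfg, "and.merge", actual_incoming)
```

In this model stale_ok is False and actual_ok is True, which corresponds to the "PHI node entries do not match predecessors!" failure above.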
Reproduction
#import "modules/std.sx";
main :: () {
    a := "k";
    b := "v";
    r := a == "k" and b == "v"; // string == as an `and` operand
    print("{}\n", r);
}
$ ./zig-out/bin/sx run repro.sx
LLVM verification failed: PHI node entries do not match predecessors!
...
a == "k" or b == "v" reproduces it identically (or.rhs in place of
and.rhs). A single a == "k" (no and/or) compiles and runs fine, as
does x == 1 and y == 2 (integer operands). So the trigger is specifically
a string ==/!= as an operand of a short-circuit and/or — the
operand emits its own str.memcmp/str.merge sub-CFG, and the
short-circuit PHI then records a stale predecessor block.
A related match.merge-predecessor variant of the same PHI mismatch also
appears in a LARGER function that mixes several enum-payload accesses
(v.str/v.int_) and match expressions with multiple and/or
operations (it surfaced while writing
examples/0714-modules-json-reader.sx). It did NOT reduce to a small
standalone repro — each construct compiles fine in isolation, and a single
payload-access operand (true and e.a == 1) or a preceding match
expression followed by an and of locals both compile — which points at
cumulative basic-block bookkeeping in the and/or lowering rather than a
single local pattern. The string-== case above is the reliable minimal
reproduction; the broader fix should address PHI predecessor tracking for
any and/or operand that emits intermediate basic blocks.
Expected
r should be true (both compares hold) and the program should print true.
Generally: a string ==/!= result must be usable as an operand of
and/or exactly like any other bool.
Workaround (until fixed)
Don't combine string equality with and/or in one expression; split
into separate statements / separate boolean locals:
ok_k := a == "k";
ok_v := b == "v";
r := ok_k and ok_v; // each string-eq materialized before the short-circuit
Background / where to look
The string == lowering (search str.eq / str.memcmp / str.merge
block names in src/ir/lower.zig) produces a value via a PHI that joins
the memcmp-equal block and the early-out (length-mismatch / short-circuit)
block. The boolean and/or lowering builds its own and.rhs /
and.merge (resp. or.*) blocks and a merge PHI. When the LHS (or RHS)
of the and/or is itself a string compare, the outer short-circuit
lowering must take the string-compare's actual current block (its merge
block) as the incoming predecessor for the outer PHI — not the block that
was current before the string compare emitted its sub-CFG. The mismatch
above is the classic "PHI incoming-block is stale after the operand
emitted new basic blocks" bug: the fix is to re-read the builder's current
insertion block when wiring the and/or PHI incoming edges, rather than
caching it before lowering the operand. This mirrors the shape of the
match-arm PHI fix in issue 0066.
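The difference between caching the starting block and re-reading the insertion block can be sketched with a toy emitter in Python. This is a minimal model under stated assumptions, not the emit_llvm.zig code: Builder, emit_instruction, emit_function and the toy IR are hypothetical, and only the names block_map and term_block_map come from this write-up. Reading builder.current after a block's instructions plays the role of LLVMGetInsertBlock.

```python
import itertools

class Builder:
    """Hypothetical stand-in for an IR builder: tracks only the insertion block."""
    def __init__(self):
        self.current = None

    def position_at_end(self, block_name):
        self.current = block_name

def emit_instruction(builder, inst, counter):
    """A string compare opens its own memcmp/merge blocks; others stay put."""
    if inst == "str_eq":
        n = next(counter)
        builder.position_at_end(f"str.memcmp{n}")
        builder.position_at_end(f"str.merge{n}")  # builder is left here

def emit_function(ir_blocks):
    builder = Builder()
    counter = itertools.count()
    block_map = {}       # IR block -> block it started as
    term_block_map = {}  # IR block -> block its terminator lands in
    for name, insts in ir_blocks:
        builder.position_at_end(name)
        block_map[name] = builder.current
        for inst in insts:
            emit_instruction(builder, inst, counter)
        # Re-read the insertion block AFTER the instructions were emitted.
        term_block_map[name] = builder.current
    return block_map, term_block_map

# Toy IR for: a == "k" and b == "v"
ir_blocks = [
    ("entry", ["str_eq"]),
    ("and.rhs.0", ["str_eq"]),
    ("and.merge", ["phi"]),
]
block_map, term_block_map = emit_function(ir_blocks)

phi_sources = ["entry", "and.rhs.0"]  # IR-level predecessors of and.merge
stale = [block_map[b] for b in phi_sources]
fixed = [term_block_map[b] for b in phi_sources]
```

Here stale lists the starting blocks and fixed lists the blocks the builder ended in. For IR blocks that emit no extra blocks the two maps agree, so looking up the terminator block is a safe default for every PHI edge.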
Discovered while writing the std.json reader regression example
(examples/0714-modules-json-reader.sx, flow step F2.2): an assertion
key == "k" and val.str == "v" triggered it. The reader library code
itself does not use this pattern; the example was rewritten to assert the
two string equalities separately.
Verification (once fixed)
./zig-out/bin/sx run repro.sx # prints: true
Add a regression example (next free examples/NNNN-*.sx slot) that uses a
string == on both sides of an and and on both sides of an or. The full
suite and zig build test must stay green.