Two genuine defects behind the 0128 filing (whose original repros were
both poisoned by binding getenv, which std already declares -> *u8):
1. Re-declaring a C symbol was silent first-wins: every call through
the later declaration was typed by the older signature. Foreign
registration now dedupes — equal signatures share one FuncId,
conflicting ones are diagnosed.
2. Foreign -> string / -> ?string returns read garbage: C returns one
char*, but the LLVM signature declared the fat {ptr,i64} (len =
register garbage), and ?string was mis-declared SRET (the hidden
out-pointer landed in the callee's first arg register). cstrRetKind
now classifies such returns, declares them as plain ptr (never
sret), and the call site synthesizes {ptr, strlen} via a
branch-guarded strlen (NULL -> {null,0} / optional null), wrapping
{string, i1} for ?string.
?[:0]u8 itself resolves fine (it is ?string); the spelling works in
return, param, local, and alias positions.
Regression: examples/1221 (plain + optional non-null + NULL paths) and
examples/1172 (conflict diagnostic); both FAIL pre-fix. The extern
dedupe collapses duplicate libc decls, so affected .ir snapshots were
regenerated. zig build test 426/426; run_examples 602/602;
distribution suite 21/21.
Surface rename of the signed integer family: s1..s64 become i1..i64
(u1..u64, usize, isize unchanged). 'string' keeps the s-prefix arm in
name classification; width parsing moves to the i-prefix arm next to
isize.
Internal TypeId tags follow the surface (.s8/.s16/.s32/.s64 ->
.i8/.i16/.i32/.i64), as do mono-key mangle fragments (ptr_i64,
tu_i64_bool) and all display/diagnostic formatting (i{d}).
Migrated in the same sweep: stdlib + examples + issue repros + FFI C
companions (shared symbol names like ffi_id_i64), expected
stdout/stderr/ir snapshots, specs.md, readme.md, CLAUDE.md/AGENTS.md,
implementation_plan.md, docs/, issue writeups. Vendored stb_image and
historical flow state left untouched.
zig build test: 426/426; examples suite: 595/595.
An alloca built at its use site re-executes on every pass through that
block, and LLVM reclaims allocas only at ret — so loop-body locals,
nested-loop index slots, and emitter spill temps (ig.tmp, sret slots, ABI
coercion temps, byval materialization) grew the stack per iteration and
long loops segfaulted on stack exhaustion.
New LLVMEmitter.buildEntryAlloca inserts after existing entry-block
allocas and restores the builder position; every LLVMBuildAlloca site
reachable during instruction emission now routes through it.
Initialization stores stay at the use site (per-iteration re-init is
unchanged), and entry slots become mem2reg-promotable. The 35 .ir
snapshot diffs are pure alloca position moves (type multisets verified
identical per file).
Regression: examples/0047-basic-loop-local-stack-reuse.sx (segfaulted
pre-fix on both the 1M-iteration body-local loop and the 3M-iteration
nested loop).
Sweep all src/**.zig comments that cite resolved issues (issue NNNN /
fix-NNNN / KB-N): the invariant or mechanism each comment states is
kept; the historical citation is dropped, per the no-conclusion-comments
rule. Pure-history parentheticals are removed outright. References to
the 16 still-open issues (0030, 0041-0056) are untouched, as are test
NAMES carrying regression provenance (matching the sanctioned
"Regression (issue NNNN)" example-header convention).
Also removes the issues/0019-import-non-transitive-c-scope/ fixture dir
— the issue is superseded and its behavior is covered by
examples/0706-modules-import-non-transitive.sx (the .md writeup stays).
issues/0030's repro .sx stays: that issue is an open feature request.
Gate: zig build OK; zig build test 426/426; run_examples 541/0; zero
expected/ snapshot churn.
`type_name` / `type_is_unsigned` on an `Any` argument unconditionally read
the Any's payload as a TypeId index. That is correct only when the Any holds
a Type value (`{ .any, tid }`); for an Any holding a runtime *value*
(`av : Any = 6`, tag s64, payload 6) it returned `types[6]` — `type_name(av)`
gave "u8" and `type_is_unsigned(av)` gave true.
Both backends now branch on the Any's runtime type-tag: tag == `.any` → the
box is a Type value, use the payload as the TypeId; otherwise the tag IS the
held value's type. So `type_name(av)` → "s64", `type_is_unsigned(av)` → false,
while `type_name(type_of(x))` still names the held type. The `{}` formatter is
unchanged (it already passed `type_of(val)`, a proper Type value).
- src/ir/interp.zig: shared `Value.reflectTypeId` tag-branching resolver; the
`type_name` / `type_is_unsigned` interp arms route through it.
- src/backend/llvm/ops.zig: shared `Ops.reflectArgTypeId` emits
extractvalue-tag / icmp-eq-.any / select for the runtime path; both
reflection arms route through it. The two backends agree.
- examples/0164-types-reflection-any-tag.sx: regression pinning type_name /
type_is_unsigned / print on an Any holding a value vs a Type.
- src/ir/interp.test.zig: unit test for `reflectTypeId`.
- 22 .ir snapshots: the new select appears in every std-importing program's
IR (any_to_string embeds these builtins) — benign, verified structurally
identical apart from the three new instructions.
- issues/0090, specs.md: documented the Any-tag rule.
Resolves issue 0090. The `{}` integer formatter mis-rendered both ends of
the 64-bit range:
- `int_to_string` computed the magnitude as `0 - n`, which overflows for
`s64::MIN` (its magnitude is unrepresentable as a positive s64) — the
value stayed negative, the digit loop ran zero times, so only `-`
printed. It now extracts digits straight from `n` (per-digit
`|n % 10|`, `n` truncating toward zero), never negating MIN.
- `any_to_string`'s `case int:` formatted every integer as s64, so a u64
all-ones value printed as `-1`. There was no `uint` type-category to
distinguish signedness. Added an additive `type_is_unsigned(T)`
reflection builtin (static fold + dynamic interp/LLVM paths, mirroring
`type_name`), backed by the new `TypeTable.isUnsignedInt` predicate, and
a `uint_to_string` formatter (unsigned decimal via long-division over
four 16-bit limbs). `case int:` routes through `type_is_unsigned(type)`.
The 16-bit-limb split is factored into a shared `decompose_u16x4`, now
reused by `int_to_hex_string` (no second unsigned-math routine).
Regression: examples/0046-basic-int-formatter-extremes pins both extremes
plus a width spread; unit tests cover `isUnsignedInt`. Docs (specs.md
representation note, readme std API) updated for unsigned/extreme `{}`
behavior. IR snapshots refreshed for the two new std functions.
Writing a Vector lane (`v.x = …`, `.y/.z/.w` + colour aliases) panicked
with "unresolved type reached LLVM emission". The store path had no
vector branch: a `.field_access` target on a Vector fell through to
struct-field lookup, matched nothing, left `field_ty = .unresolved`, and
built a `ptrTo(.unresolved)` that tripped the LLVM emission guard. The
read path resolved the lane fine — the two had diverged (issue-0083
two-resolver class).
Extract a shared `Lowering.vectorLaneIndex` resolver and route BOTH paths
through it. The read path (`lowerFieldAccessOnType`) delegates to it,
dropping its silent `else 0` fallback. A new vector branch in
`lowerAssignment` GEPs a typed pointer to the lane (`structGepTyped`) and
stores via `storeOrCompound` (plain + compound). `emitStructGep` now
addresses a vector base type with a `[0, lane]` GEP. A non-lane field now
reports field-not-found on both paths instead of silent-lane-0 / panic.
Regression: examples/1506-vectors-lane-store.sx (panicked pre-fix, now
reads back written values) + a vectorLaneIndex unit test. Resolves issue
0086; spec documents element assignment.
emitCmpNe lowered float `!=` to `LLVMRealONE` (ordered not-equal), which
is false when either operand is NaN. That made `nan != nan` false in
native code — breaking the canonical `x != x` NaN test, making `!=`
non-complementary with `==` for NaN, and disagreeing with the interpreter.
Change the float predicate to `LLVMRealUNE` (unordered not-equal): true
if either operand is NaN OR they are unequal. For all non-NaN operands
`UNE` ≡ `ONE`, so only NaN-involving comparisons change (toward correct).
The integer predicate (`LLVMIntNE`) and `emitCmpEq` (`OEQ`) are unchanged,
so `nan == nan` stays false and `!=` is now the exact complement of `==`.
- Regression: examples/0150-types-float-ne-unordered-nan.sx (fails before,
passes after; also pins #run/comptime == runtime agreement).
- specs.md: documents float comparison / NaN semantics (Operators).
- Resolves issue 0091 (issues/0091-float-ne-ordered-nan.md).
The `type_name` / `type_eq` reflection builtins resolved their Type arg's IR
type via `getRefIRType(...) orelse TypeId.s64`, then gated `== .any`. A failed
must-succeed lookup silently became `.s64` (`!= .any`), classifying a boxed
`Any` arg as bare i64 and reading the wrong value with no diagnostic.
Add the sibling classifier `LLVMEmitter.reflectArgRepr`, which routes the
lookup through `argIRTypeOrFail` (the issue-0074 `.unresolved` resolver) and
returns `{ boxed, bare, unresolved }`. The three emit sites in ops.zig
(`type_name` + `type_eq` x2) now switch on it: `.boxed` extracts the Any value
field, `.bare` uses the value directly, `.unresolved` hits a hard `@panic`
tripwire — never silently treated as bare. Real args always resolve, so the
happy path is byte-identical (suite stays 361/0, zero snapshot churn).
Secondary `lower.zig` `null_literal`/`undef_literal => target_type orelse .void`
confirmed intentional (typeless-literal default deliberately handled by
emitConstNull/emitConstUndef as null-ptr / undef-i64) — left with an invariant
comment, not the `.unresolved` tripwire.
Regression test in emit_llvm.test.zig asserts the loud path: fail-before with
`orelse .s64` yields `.bare`; pass-after yields `.unresolved`.
Four FFI call-arg lowering sites resolved an argument's IR type via
`getRefIRType(arg_ref) orelse .void` — a silent fallback to the load-bearing
real type `.void`. A failed lookup there is a codegen invariant violation, but
`.void` is treated by downstream `toLLVMType` → `abiCoerceParamType` →
`coerceArg` as a legitimate void-typed foreign argument, corrupting the call
ABI with no diagnostic.
Add one shared resolver `LLVMEmitter.argIRTypeOrFail` that returns the
dedicated `.unresolved` sentinel on a failed lookup — never `.void`/`.s64` — so
the failure cannot masquerade as a real type and trips `toLLVMType`'s existing
hard `@panic` tripwire at the call site. Route all four sites through it:
- src/ir/emit_llvm.zig JNI constructor (NewObject) arg loop
- src/backend/llvm/ops.zig objc_msgSend arg loop
- src/backend/llvm/ops.zig JNI non-virtual call arg loop
- src/backend/llvm/ops.zig JNI Call<Type>Method arg loop
Happy path is byte-identical (every real arg already has a resolved type); FFI
examples stay green with zero snapshot churn.
Regression test (fail-before/pass-after) in src/ir/emit_llvm.test.zig asserts an
unresolvable FFI arg ref now yields `.unresolved`, not the old silent `.void`.
Move the final inline emitInst handler groups (terminators, box/unbox-Any,
reflection, switch-branch, closure-creation, vector, block-param, misc) into
the Ops facade in src/backend/llvm/ops.zig. emitInst is now pure dispatch:
every arm delegates to self.ops().*, leaving only setInstDebugLocation plus
one-line delegations.
Widen the shared infra the moved bodies reach (emitFailableMainRet, getBlock,
anyTag, isSignedTypeEx, coerceToI64/coerceToI64Signed/coerceFromI64,
emitFieldValueGet) to pub on LLVMEmitter; helper and ref-tracking sections
stay put. Pure relocation: emitted LLVM IR byte-identical, zero snapshot churn.
Relocate the struct, enum, union, array/slice, tuple, and optional
opcode handler bodies out of emitInst into the existing Ops facade.
Each moved arm now delegates via self.ops().emit<Op>(...); shared infra
stays on LLVMEmitter, with resolveAggregate/resolveGepStructType widened
to pub as the GEP handlers require. Pure relocation, behavior-preserving:
zero snapshot churn (361/0).
Relocate the Calls (objc_msg_send / jni_msg_send / call / call_indirect)
and Call-extensions (call_builtin / compiler_call / call_closure) emitInst
handler groups out of emit_llvm.zig into the existing Ops facade. Each
emitInst arm now delegates via self.ops().emit<Op>(...). Behavior-preserving
pure relocation; emitted LLVM IR is byte-identical (361/0 examples, no
snapshot churn).
Shared call infra stays on LLVMEmitter, widened pub only as the moved
bodies require: extractSlicePtr, loadJniFn, getObjcMsgSendValue, the math
F32/F64 declarators + types, getOrDeclareWrite/getWriteType, ffiCtors,
materializeByvalArg, emitCStringGlobal, emitJniConstructor, and the Jni
slot-offset constants. emitJniConstructor remains in emit_llvm.zig (A7.3
decision); the moved jni arm calls it via self.e.emitJniConstructor(...).
Relocate the `// ── Memory ──`, `// ── Globals ──`, `// ── Conversions ──`,
and `// ── Pointer ops ──` opcode handler bodies out of `emitInst` in
src/ir/emit_llvm.zig into the existing `Ops` facade in
src/backend/llvm/ops.zig. Each `emitInst` arm now delegates via
`self.ops().emit<Op>(...)`. Widen `emitConversion`, `coerceArg`, and
`getRefIRType` to `pub` (the only helpers the moved bodies call).
Pure relocation: zero snapshot churn.
Move the Constants/Arithmetic/Bitwise/Comparisons/Logical opcode handler
bodies out of emitInst into a new Ops facade in src/backend/llvm/ops.zig.
emitInst's scalar arms now delegate via self.ops().*; the shared infra they
call (mapRef/resolveRef/matchBinOpTypes/emitCmp/emitCmpOrdered/emitStrCmp/
emitStringConstant/reflection + isFloatOrVecFloat/isSignedType) stays on
LLVMEmitter, widened to pub as needed. Pure relocation: zero snapshot churn.