sx/issues/0090-int-formatter-extremes.md
agra 5f64ee4426 fix(ir): reflection builtins on an Any read its runtime tag, not payload [F0.8]
`type_name` / `type_is_unsigned` on an `Any` argument unconditionally read
the Any's payload as a TypeId index. That is correct only when the Any holds
a Type value (`{ .any, tid }`); for an Any holding a runtime *value*
(`av : Any = 6`, tag s64, payload 6) it returned `types[6]` — `type_name(av)`
gave "u8" and `type_is_unsigned(av)` gave true.

Both backends now branch on the Any's runtime type-tag: tag == `.any` → the
box is a Type value, use the payload as the TypeId; otherwise the tag IS the
held value's type. So `type_name(av)` → "s64", `type_is_unsigned(av)` → false,
while `type_name(type_of(x))` still names the held type. The `{}` formatter is
unchanged (it already passed `type_of(val)`, a proper Type value).

- src/ir/interp.zig: shared `Value.reflectTypeId` tag-branching resolver; the
  `type_name` / `type_is_unsigned` interp arms route through it.
- src/backend/llvm/ops.zig: shared `Ops.reflectArgTypeId` emits
  extractvalue-tag / icmp-eq-.any / select for the runtime path; both
  reflection arms route through it. The two backends agree.
- examples/0164-types-reflection-any-tag.sx: regression pinning type_name /
  type_is_unsigned / print on an Any holding a value vs a Type.
- src/ir/interp.test.zig: unit test for `reflectTypeId`.
- 22 .ir snapshots: the new select appears in every std-importing program's
  IR (any_to_string embeds these builtins) — benign, verified structurally
  identical apart from the three new instructions.
- issues/0090, specs.md: documented the Any-tag rule.
2026-06-05 12:09:52 +03:00


0090 — integer formatter can't render i64::MIN or unsigned all-ones

STATUS: RESOLVED (F0.8). Both extremes now render correctly: s64.min → -9223372036854775808, u64.max → 18446744073709551615.

Root cause.

  • Symptom 1 (i64::MIN): std.int_to_string computed the magnitude as 0 - n, which overflows for s64::MIN (its magnitude is unrepresentable as a positive s64) — the value stayed negative, the while v > 0 loop ran zero times, and only the - was emitted.
  • Symptom 2 (unsigned all-ones): any_to_string's case int: arm formatted every integer as s64 (int_to_string(xx val)); there was no way to tell a u64 from an s64, so an all-ones u64 printed as -1.
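
Both symptoms can be reproduced outside the compiler. The following is a minimal Python model, not the original std code: wrap_s64 and old_int_to_string are hypothetical names, and the wrap helper stands in for fixed-width s64 arithmetic, which Python integers do not have.

```python
MASK64 = (1 << 64) - 1
S64_MIN = -(1 << 63)

def wrap_s64(x: int) -> int:
    """Reduce x to the two's-complement s64 range, as 64-bit arithmetic would."""
    x &= MASK64
    return x - (1 << 64) if x >= (1 << 63) else x

def old_int_to_string(n: int) -> str:
    """Model of the pre-fix formatter: negate first, then peel digits while v > 0."""
    if n == 0:
        return "0"
    sign = ""
    v = n
    if n < 0:
        sign = "-"
        v = wrap_s64(0 - n)   # for S64_MIN this wraps straight back to S64_MIN
    digits = ""
    while v > 0:              # never entered when v is still negative
        digits = chr(ord("0") + v % 10) + digits
        v //= 10
    return sign + digits

print(old_int_to_string(-42))                 # -42
print(old_int_to_string(S64_MIN))             # -
print(old_int_to_string(wrap_s64(MASK64)))    # -1  (an all-ones u64 read as s64)
```

The second line of output is Symptom 1 (sign only, no digits); the third is Symptom 2.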

Fix per file.

  • library/modules/std.sx: int_to_string now extracts digits straight from n (taking |n % 10| per digit; n / 10 truncates toward zero), so it never negates s64::MIN. Added uint_to_string (unsigned decimal via long division by 10 over four 16-bit limbs) and decompose_u16x4 (the shared 16-bit-limb split, now reused by int_to_hex_string too). any_to_string's case int: arm routes through the new type_is_unsigned(type) query to pick the unsigned or the signed formatter. Declared type_is_unsigned :: ($T: Type) -> bool #builtin;.
  • src/ir/types.zig: TypeTable.isUnsignedInt (canonical signedness predicate; single source of truth).
  • src/ir/inst.zig: type_is_unsigned BuiltinId.
  • src/ir/calls.zig: register type_is_unsigned as a .bool reflection builtin.
  • src/ir/lower.zig: tryLowerReflectionCall arm (static fold + dynamic callBuiltin).
  • src/ir/interp.zig: interp arm (reads the boxed TypeId / type_of aggregate shape).
  • src/ir/emit_llvm.zig + src/backend/llvm/reflection.zig + src/backend/llvm/ops.zig: lazy [N x i1] __sx_type_is_unsigned table built from isUnsignedInt; the runtime arm GEPs into it at the TypeId.
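
The two std formatters described in the list above can be sketched as follows. This is a Python model under stated assumptions, not the std.sx source: trunc_divmod is a hypothetical helper that emulates truncating division (Python's own // floors), the limb order (most significant first) is an assumption, and the function bodies only follow the prose description.

```python
MASK64 = (1 << 64) - 1

def trunc_divmod(a: int, b: int):
    """Quotient truncated toward zero; the remainder takes the sign of a."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b

def int_to_string(n: int) -> str:
    """Signed decimal that never negates n, so s64 MIN needs no special case."""
    if n == 0:
        return "0"
    out = []
    v = n
    while v != 0:
        v, r = trunc_divmod(v, 10)
        out.append(chr(ord("0") + abs(r)))   # |n % 10| per digit
    if n < 0:
        out.append("-")
    return "".join(reversed(out))

def decompose_u16x4(bits: int):
    """Split a 64-bit pattern into four 16-bit limbs (assumed order: high to low)."""
    return [(bits >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]

def uint_to_string(bits: int) -> str:
    """Unsigned decimal via schoolbook long division by 10 over the limbs."""
    limbs = decompose_u16x4(bits & MASK64)
    if not any(limbs):
        return "0"
    out = []
    while any(limbs):
        rem = 0
        for i in range(4):
            cur = (rem << 16) | limbs[i]   # at most 9 * 65536 + 65535
            limbs[i] = cur // 10
            rem = cur % 10
        out.append(chr(ord("0") + rem))    # final remainder is the next digit
    return "".join(reversed(out))

print(int_to_string(-(1 << 63)))   # -9223372036854775808
print(uint_to_string(MASK64))      # 18446744073709551615
```

Every intermediate in the limb loop stays a small non-negative number, which is presumably why the limb split is used where only signed 64-bit arithmetic is available.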

Regression test. examples/0046-basic-int-formatter-extremes.sx pins both extremes plus a width spread (s8/s16/s32 + u8/u16/u32/u64, mins/maxes, 0, ordinary values). Unit tests: isUnsignedInt in src/ir/types.test.zig.

Follow-up (F0.8 attempt 2): strict $T: Type on all 7 reflection builtins. The stress-review of the additive type_is_unsigned builtin found that it (and the whole reflection family) silently accepted a non-type argument: type_is_unsigned(6) reinterpreted 6 as a TypeId index and returned the signedness of types[6] (u8 → true); size_of(6) / size_of(true) sized the argument's type_of (8); type_name(6) returned types[6]'s name. Per Agra's ruling, all 7 type-introspection builtins (size_of, align_of, field_count, type_name, type_eq, type_is_unsigned, is_flags) now STRICTLY require a type at compile time: a value argument is rejected with "<builtin> expects a type, got '<type>'".

  • src/ir/lower.zig: one shared guard, reflectionTypeArgGuard (run at the top of tryLowerReflectionCall), classifies each arg via reflectionArgIsType. A spelled / compile-time type or generic type param (the isStaticTypeArg shapes) is ACCEPTED, and so is a runtime Type value (static type .any: type_of(x), a []Type element list[i], or a Type-typed local / field / param); anything else is rejected. The existing runtime path for type_name / type_is_unsigned is preserved (the formatter calls type_is_unsigned(type_of(val)) at runtime). The 5 comptime-only builtins stay comptime-only (runtime reflection deferred).
  • Negative regression: examples/1144-diagnostics-reflection-builtin-needs-type.sx (reject cases across all 7, exit 1). Unit test: reflectionArgIsType in src/ir/lower.test.zig.
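
The shape of the guard can be illustrated with a small Python model. Everything here is hypothetical apart from the diagnostic text and the list of 7 builtins: the Arg record and its kind labels stand in for whatever the lowering pass actually inspects.

```python
from dataclasses import dataclass

REFLECTION_BUILTINS = (
    "size_of", "align_of", "field_count", "type_name",
    "type_eq", "type_is_unsigned", "is_flags",
)

@dataclass
class Arg:
    kind: str        # hypothetical labels: "static_type", "runtime_type", "value"
    type_name: str   # the argument's own type, quoted in the diagnostic

def reflection_arg_is_type(arg: Arg) -> bool:
    """Accept spelled / compile-time types and runtime Type values."""
    return arg.kind in ("static_type", "runtime_type")

def reflection_type_arg_guard(builtin: str, args):
    """Return the diagnostic for the first non-type argument, or None if all pass."""
    if builtin not in REFLECTION_BUILTINS:
        return None   # not a reflection builtin; the guard does not apply
    for arg in args:
        if not reflection_arg_is_type(arg):
            return f"{builtin} expects a type, got '{arg.type_name}'"
    return None

print(reflection_type_arg_guard("type_is_unsigned", [Arg("value", "s64")]))
print(reflection_type_arg_guard("type_name", [Arg("runtime_type", "Type")]))
```

The first call yields the rejection message; the second returns None, matching the preserved runtime path for type_name / type_is_unsigned.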

Follow-up (F0.8 attempt 3): reflection builtins on an Any consult the Any's runtime TYPE-TAG, not its payload. The attempt-2 guard correctly accepts an Any argument (the formatter passes val: Any), but the dynamic type_name / type_is_unsigned path still read the Any's payload as a TypeId index unconditionally, which is correct only when the Any holds a Type value. For an Any holding a value (av : Any = 6, runtime tag s64, payload 6) it reported types[6] (u8): type_name(av) → "u8", type_is_unsigned(av) → true. Per Agra's ruling ("Any is a type AND a value, so it's expected to work"), both builtins now branch on the Any's runtime tag: tag == .any → the box is a Type value, use the payload as the TypeId; otherwise the tag IS the held value's type. So type_name(av) → "s64", type_is_unsigned(av) → false, while type_name(type_of(x)) still names the held type. The formatter is unchanged (it already passed type_of(val), a proper Type value).

  • src/ir/interp.zig: shared Value.reflectTypeId (the tag-branching resolver); the type_name / type_is_unsigned interp arms route through it.
  • src/backend/llvm/ops.zig: shared Ops.reflectArgTypeId emits extractvalue tag / icmp eq tag, .any / select for the runtime path; both reflection arms route through it. The two backends agree.
  • Regression: examples/0164-types-reflection-any-tag.sx pins type_name / type_is_unsigned / print on an Any holding a value vs. a Type value. Unit test: reflectTypeId in src/ir/interp.test.zig.
  • Out of scope (kept comptime-only / deferred): the 5 comptime-only builtins (size_of/align_of/field_count/is_flags/type_eq). type_eq has no dynamic emit path (it folds at lower time), so it is unaffected.
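
The tag-branching rule is small enough to model directly. In this Python sketch an Any is a (tag, payload) pair; the type table is hypothetical except that index 6 is u8, as the notes above state, and the function names are illustrative rather than the Zig ones.

```python
# Hypothetical type table; only "index 6 is u8" comes from the notes above.
TYPES = ["void", "bool", "s8", "s16", "s32", "s64", "u8", "u16", "u32", "u64", "any"]
S64 = TYPES.index("s64")
ANY = TYPES.index("any")

def reflect_type_id(tag: int, payload: int) -> int:
    """tag == any: the box holds a Type value, so the payload is the TypeId.
    Otherwise the tag itself is the held value's type."""
    return payload if tag == ANY else tag

def old_type_name(tag: int, payload: int) -> str:
    """Pre-fix behaviour: always treat the payload as a TypeId."""
    return TYPES[payload]

def type_name(tag: int, payload: int) -> str:
    return TYPES[reflect_type_id(tag, payload)]

def type_is_unsigned(tag: int, payload: int) -> bool:
    return TYPES[reflect_type_id(tag, payload)].startswith("u")

av = (S64, 6)            # av : Any = 6
type_value = (ANY, S64)  # type_of(x) for an s64 x

print(old_type_name(*av))        # u8
print(type_name(*av))            # s64
print(type_is_unsigned(*av))     # False
print(type_name(*type_value))    # s64
```

The conditional in reflect_type_id corresponds to the icmp / select pair described for the LLVM path: compare the tag against .any, then select the payload or the tag.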

STATUS (original): OPEN. Pre-existing + orthogonal; surfaced (not introduced) by NL.1. Manager-verified independent of the numeric-limit accessors. Scheduled separately.

Symptom

print("{}", x) mis-renders the integer extremes the s64-based formatter can't represent:

  • i64::MIN (-9223372036854775808) prints a bare - (the minus sign with NO digits).
  • An unsigned all-ones value (e.g. u64.max = 18446744073709551615) prints -1 (the i64 bit-reinterpretation), not the unsigned decimal.

Reproduction (no numeric-limit accessor needed — pre-existing)

#import "modules/std.sx";
main :: () {
    x := -9223372036854775807 - 1;   // i64::MIN
    print("min={}\n", x);            // prints "min=-"  (should be -9223372036854775808)
}

u64.max (via the NL.1 accessor, or any all-ones u64) prints -1 for the same root reason.

Root cause (suspected)

The integer-to-string path is s64-based (std.int_to_string / the {} formatter takes s64): it negates the value to print the sign, but -i64::MIN overflows, and it has no unsigned-aware path, so an all-ones u64 is read as -1. Needs a width- and signedness-aware integer formatter (format by the value's actual integer TYPE: unsigned types print the unsigned decimal; signed MIN is handled without negating).

Investigation prompt

Make the {} integer formatter type-aware: render an unsigned integer as its unsigned decimal (all 64 bits for u64), and handle signed MIN without the -MIN overflow (e.g. format the magnitude via unsigned arithmetic, or special-case MIN). Verify: i64::MIN prints -9223372036854775808; u64.max prints 18446744073709551615; existing numeric output (incl. the NL.1 examples, which assert via bit-reinterpret) stays green. Likely area: the formatter / int_to_string in the std print path and/or the comptime {} lowering.
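
One of the options named in the prompt, formatting the magnitude via unsigned arithmetic, can be sketched in Python as below. This is an illustration of that option only; format_by_type is a hypothetical name, the is_unsigned flag stands in for a type query, and Python's str() does the digit extraction so the sketch can focus on sign handling.

```python
MASK64 = (1 << 64) - 1

def format_by_type(bits: int, is_unsigned: bool) -> str:
    """Format a 64-bit pattern as unsigned or signed decimal without
    ever negating a signed value."""
    bits &= MASK64
    if is_unsigned or bits < (1 << 63):
        return str(bits)                # non-negative: the bits are the value
    magnitude = (~bits + 1) & MASK64    # two's-complement negate in unsigned space
    return "-" + str(magnitude)

print(format_by_type(1 << 63, False))   # -9223372036854775808
print(format_by_type(MASK64, False))    # -1
print(format_by_type(MASK64, True))     # 18446744073709551615
```

The magnitude of s64 MIN is 2^63, which fits in 64 unsigned bits, so no special case is required on this route.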