sx/issues/0090-int-formatter-extremes.md
agra d8076b9333 lang: rename signed integer types sN -> iN
Surface rename of the signed integer family: s1..s64 become i1..i64
(u1..u64, usize, isize unchanged). 'string' keeps the s-prefix arm in
name classification; width parsing moves to the i-prefix arm next to
isize.

Internal TypeId tags follow the surface (.s8/.s16/.s32/.s64 ->
.i8/.i16/.i32/.i64), as do mono-key mangle fragments (ptr_i64,
tu_i64_bool) and all display/diagnostic formatting (i{d}).

Migrated in the same sweep: stdlib + examples + issue repros + FFI C
companions (shared symbol names like ffi_id_i64), expected
stdout/stderr/ir snapshots, specs.md, readme.md, CLAUDE.md/AGENTS.md,
implementation_plan.md, docs/, issue writeups. Vendored stb_image and
historical flow state left untouched.

zig build test: 426/426; examples suite: 595/595.
2026-06-12 09:31:53 +03:00


0090 — integer formatter can't render i64::MIN or unsigned all-ones

STATUS: RESOLVED (F0.8). Both extremes now render correctly: i64.min → -9223372036854775808, u64.max → 18446744073709551615.

Root cause.

  • Symptom 1 (i64::MIN): std.int_to_string computed the magnitude as 0 - n, which overflows for i64::MIN (its magnitude is unrepresentable as a positive i64) — the value stayed negative, the while v > 0 loop ran zero times, and only the - was emitted.
  • Symptom 2 (unsigned all-ones): any_to_string's case int: arm formatted every integer as i64 (int_to_string(xx val)); there was no way to tell a u64 from an i64, so an all-ones u64 printed as -1.
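
The two symptoms can be reproduced in a small model. The sketch below is Python standing in for 64-bit two's-complement arithmetic, not the std.sx source: `wrap_i64`, `buggy_int_to_string`, and `buggy_format_u64` are hypothetical names, and wrapping on overflow is assumed because the writeup says the value "stayed negative".

```python
# Model of the pre-fix behaviour; all names here are illustrative only.
MASK64 = (1 << 64) - 1
I64_MIN = -(1 << 63)

def wrap_i64(x: int) -> int:
    """Reduce a Python int to the value a 64-bit two's-complement register would hold."""
    x &= MASK64
    return x - (1 << 64) if x >= (1 << 63) else x

def buggy_int_to_string(n: int) -> str:
    if n == 0:
        return "0"
    sign = ""
    v = n
    if n < 0:
        sign = "-"
        v = wrap_i64(0 - n)      # for I64_MIN this wraps straight back to I64_MIN
    digits = ""
    while v > 0:                 # never entered while v is still negative
        digits = chr(ord("0") + v % 10) + digits
        v //= 10
    return sign + digits

def buggy_format_u64(bits: int) -> str:
    # Assumed shape of the old `case int:` arm: every integer is read as i64.
    return buggy_int_to_string(wrap_i64(bits))

print(repr(buggy_int_to_string(I64_MIN)))   # '-'
print(repr(buggy_format_u64(MASK64)))       # '-1'
```

Ordinary negatives such as -42 still format correctly in this model, which is consistent with the bug going unnoticed until the extremes were exercised.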

Fix per file.

  • library/modules/std.sx: int_to_string now extracts digits straight from n (taking |n % 10| per digit, with n truncating toward zero on each division) so it never negates i64::MIN. Added uint_to_string (unsigned decimal via long-division-by-10 over four 16-bit limbs) and decompose_u16x4 (the shared 16-bit-limb split, now reused by int_to_hex_string too). any_to_string's case int: routes through the new type_is_unsigned(type) query to pick the unsigned vs signed formatter. Declared type_is_unsigned :: ($T: Type) -> bool #builtin;.
  • src/ir/types.zig: TypeTable.isUnsignedInt (canonical signedness predicate; single source of truth).
  • src/ir/inst.zig: type_is_unsigned BuiltinId.
  • src/ir/calls.zig: register type_is_unsigned as a .bool reflection builtin.
  • src/ir/lower.zig: tryLowerReflectionCall arm: static fold + dynamic callBuiltin.
  • src/ir/interp.zig: interp arm (reads the boxed TypeId / type_of aggregate shape).
  • src/ir/emit_llvm.zig + src/backend/llvm/reflection.zig + src/backend/llvm/ops.zig: lazy [N x i1] __sx_type_is_unsigned table built from isUnsignedInt; runtime arm GEPs in at the TypeId.
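
The two formatter techniques named in the std.sx entry can be sketched as follows. This is a Python model under stated assumptions, not the library code: the function names mirror the writeup, but their bodies, the most-significant-first limb order, and the `trunc_divmod10` helper are illustrative guesses. Python's `//` and `%` floor rather than truncate, so the helper emulates truncating division.

```python
MASK64 = (1 << 64) - 1

def trunc_divmod10(n: int):
    """Quotient and remainder of n / 10 with truncation toward zero.
    (Emulation only: a fixed-width language would use its native / and %.)"""
    q, r = abs(n) // 10, abs(n) % 10
    return (-q, -r) if n < 0 else (q, r)

def int_to_string(n: int) -> str:
    """Signed decimal without ever computing -n as a whole value."""
    if n == 0:
        return "0"
    digits = ""
    v = n
    while v != 0:
        v, r = trunc_divmod10(v)
        digits = chr(ord("0") + abs(r)) + digits   # |n % 10| per digit
    return ("-" if n < 0 else "") + digits

def decompose_u16x4(bits: int):
    """Split a 64-bit pattern into four 16-bit limbs (order assumed: high to low)."""
    return [(bits >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]

def uint_to_string(bits: int) -> str:
    """Unsigned decimal via schoolbook long division by 10 over the limbs."""
    limbs = decompose_u16x4(bits & MASK64)
    if not any(limbs):
        return "0"
    digits = ""
    while any(limbs):
        rem = 0
        for i in range(4):
            cur = (rem << 16) | limbs[i]   # at most 9 * 65536 + 65535
            limbs[i] = cur // 10           # quotient limb still fits in 16 bits
            rem = cur % 10
        digits = chr(ord("0") + rem) + digits
    return digits

print(int_to_string(-(1 << 63)))   # -9223372036854775808
print(uint_to_string(MASK64))      # 18446744073709551615
```

One plausible reason for working in 16-bit limbs is that every intermediate stays small and non-negative, so the routine works even where only signed 64-bit arithmetic is available; the writeup does not state the motivation.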

Regression test. examples/0046-basic-int-formatter-extremes.sx pins both extremes plus a width spread (i8/i16/i32 + u8/u16/u32/u64, mins/maxes, 0, ordinary values). Unit tests: isUnsignedInt in src/ir/types.test.zig.

Follow-up (F0.8 attempt 2) — strict $T: Type on all 7 reflection builtins. The stress-review of the additive type_is_unsigned builtin found it (and the whole reflection family) silently accepted a non-type argument: type_is_unsigned(6) reinterpreted 6 as a TypeId index and returned the signedness of types[6] (u8 → true); size_of(6)/(true) sized its typeof (8); type_name(6) returned types[6]'s name. Per Agra's ruling, all 7 type-introspection builtins — size_of, align_of, field_count, type_name, type_eq, type_is_unsigned, is_flags — now STRICTLY require a type (compile-time): a value argument is rejected with "<builtin> expects a type, got '<type>'".

  • src/ir/lower.zig: one shared guard, reflectionTypeArgGuard (run at the top of tryLowerReflectionCall), classifies each arg via reflectionArgIsType: a spelled / compile-time type or generic type param (the isStaticTypeArg shapes), or a runtime Type value (static type .any: type_of(x), a []Type element list[i], a Type-typed local / field / param) is ACCEPTED; anything else is rejected. The existing runtime path for type_name / type_is_unsigned is preserved (the formatter calls type_is_unsigned(type_of(val)) at runtime). The 5 comptime-only builtins stay comptime-only (runtime reflection deferred).
  • Negative regression: examples/1144-diagnostics-reflection-builtin-needs-type.sx (reject cases across all 7, exit 1). Unit test: reflectionArgIsType in src/ir/lower.test.zig.
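
The shape of the guard can be modeled in a few lines. The sketch below is a deliberately simplified Python stand-in, not the lower.zig code: `TypeRef` and `ValueArg` are hypothetical classes, the real classification covers more argument shapes (generic type params, runtime Type values), and the static type of the literal 6 is assumed to be i64. Only the diagnostic wording and the list of seven builtins come from the writeup.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeRef:
    """Hypothetical stand-in for an argument that denotes a type."""
    name: str

@dataclass(frozen=True)
class ValueArg:
    """Hypothetical stand-in for an ordinary value and its static type."""
    value: object
    static_type: str

REFLECTION_BUILTINS = ("size_of", "align_of", "field_count", "type_name",
                       "type_eq", "type_is_unsigned", "is_flags")

def reflection_arg_is_type(arg) -> bool:
    # Simplified: the real check accepts several type-denoting shapes.
    return isinstance(arg, TypeRef)

def reflection_type_arg_guard(builtin: str, args) -> None:
    """Reject the call before any lowering if an argument is not a type."""
    for arg in args:
        if not reflection_arg_is_type(arg):
            raise TypeError(f"{builtin} expects a type, got '{arg.static_type}'")

reflection_type_arg_guard("size_of", [TypeRef("u8")])   # accepted
try:
    reflection_type_arg_guard("type_is_unsigned", [ValueArg(6, "i64")])
except TypeError as err:
    print(err)   # type_is_unsigned expects a type, got 'i64'
```

Running the guard once at the top of the lowering entry point, rather than inside each builtin's arm, keeps all seven builtins consistent by construction.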

Follow-up (F0.8 attempt 3): reflection builtins on an Any consult the Any's runtime TYPE-TAG, not its payload. The attempt-2 guard correctly accepts an Any argument (the formatter passes val: Any), but the dynamic type_name / type_is_unsigned path still read the Any's payload as a TypeId index unconditionally, which is correct only when the Any holds a Type value. For an Any holding a value (av : Any = 6, runtime tag i64, payload 6) it reported types[6] (u8): type_name(av) → "u8", type_is_unsigned(av) → true. Per Agra's ruling ("Any is a type AND a value, so it's expected to work"), both builtins now branch on the Any's runtime tag: tag == .any → the box is a Type value, use the payload as the TypeId; otherwise the tag IS the held value's type. So type_name(av) → "i64", type_is_unsigned(av) → false, while type_name(type_of(x)) still names the held type. The formatter is unchanged (it already passed type_of(val), a proper Type value).

  • src/ir/interp.zig — shared Value.reflectTypeId (the tag-branching resolver); the type_name / type_is_unsigned interp arms route through it. src/backend/llvm/ops.zig — shared Ops.reflectArgTypeId emits extractvalue tag / icmp eq tag, .any / select for the runtime path; both reflection arms route through it. The two backends agree.
  • Regression: examples/0164-types-reflection-any-tag.sx pins type_name / type_is_unsigned / print on an Any holding a value vs. a Type value. Unit test: reflectTypeId in src/ir/interp.test.zig.
  • Out of scope (kept comptime-only / deferred): the 5 comptime-only builtins (size_of/align_of/field_count/is_flags/type_eq). type_eq has no dynamic emit path (it folds at lower time), so it is unaffected.
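
The tag-branching rule is small enough to model directly. In the Python sketch below, the type table is hypothetical: only "index 6 is u8" comes from the writeup, and every other slot (including where i64 and Any sit) is invented for illustration. `AnyBox` and the function names are likewise stand-ins, not the interp.zig or ops.zig definitions.

```python
from dataclasses import dataclass

# (name, is_unsigned) per TypeId. Hypothetical layout except for index 6.
TYPES = [("void", False), ("bool", False), ("i8", False), ("i16", False),
         ("i32", False), ("i64", False), ("u8", True), ("any", False)]
I64 = 5   # assumed index
ANY = 7   # assumed index

@dataclass(frozen=True)
class AnyBox:
    tag: int       # runtime type tag of the box
    payload: int   # held value, or a TypeId when the box holds a Type value

def reflect_type_id_old(box: AnyBox) -> int:
    return box.payload                      # pre-fix: payload read unconditionally

def reflect_type_id(box: AnyBox) -> int:
    # tag == ANY: the box is a Type value, so the payload is the TypeId.
    # Otherwise the tag itself is the held value's type.
    return box.payload if box.tag == ANY else box.tag

def type_name(box: AnyBox) -> str:
    return TYPES[reflect_type_id(box)][0]

def type_is_unsigned(box: AnyBox) -> bool:
    return TYPES[reflect_type_id(box)][1]

av = AnyBox(tag=I64, payload=6)             # av : Any = 6
tv = AnyBox(tag=ANY, payload=6)             # a Type value naming types[6]
print(TYPES[reflect_type_id_old(av)][0])    # u8  (the reported bug)
print(type_name(av), type_is_unsigned(av))  # i64 False
print(type_name(tv), type_is_unsigned(tv))  # u8 True
```

Routing both builtins through one resolver mirrors the writeup's approach of a single shared helper per backend, so the interpreter and the LLVM path cannot drift apart on this rule.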

STATUS (original): OPEN. Pre-existing + orthogonal; surfaced (not introduced) by NL.1. Manager-verified independent of the numeric-limit accessors. Scheduled separately.

Symptom

print("{}", x) mis-renders the integer extremes the i64-based formatter can't represent:

  • i64::MIN (-9223372036854775808) prints a bare - (the minus sign with NO digits).
  • An unsigned all-ones value (e.g. u64.max = 18446744073709551615) prints -1 (the i64 bit-reinterpretation), not the unsigned decimal.

Reproduction (no numeric-limit accessor needed — pre-existing)

#import "modules/std.sx";
main :: () {
    x := -9223372036854775807 - 1;   // i64::MIN
    print("min={}\n", x);            // prints "min=-"  (should be -9223372036854775808)
}

u64.max (via the NL.1 accessor, or any all-ones u64) prints -1 for the same root reason.

Root cause (suspected)

The integer-to-string path is i64-based (std.int_to_string / the {} formatter takes i64): it negates the value to print the sign, but -i64::MIN overflows, and it has no unsigned-aware path, so an all-ones u64 is read as -1. Needs a width/signedness-aware integer formatter (format by the value's actual integer TYPE: unsigned types print the unsigned decimal; signed MIN is handled without negating).

Investigation prompt

Make the {} integer formatter type-aware: render an unsigned integer as its unsigned decimal (all 64 bits for u64), and handle signed MIN without the -MIN overflow (e.g. format the magnitude via unsigned arithmetic, or special-case MIN). Verify: i64::MIN prints -9223372036854775808; u64.max prints 18446744073709551615; existing numeric output (incl. the NL.1 examples, which assert via bit-reinterpret) stays green. Likely area: the formatter / int_to_string in the std print path and/or the comptime {} lowering.
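
The first option in the prompt, formatting the magnitude via unsigned arithmetic, can be sketched as below. This is a Python model of one possible approach and not the fix that was adopted: `magnitude_u64` and `format_int` are hypothetical names, the `is_unsigned` flag stands in for whatever type query the formatter would use, and Python's `str()` replaces the digit loop.

```python
MASK64 = (1 << 64) - 1

def magnitude_u64(n: int) -> int:
    """|n| as an unsigned 64-bit value, via two's-complement negate in the unsigned domain."""
    bits = n & MASK64
    return ((~bits) + 1) & MASK64 if n < 0 else bits

def format_int(bits: int, is_unsigned: bool) -> str:
    """Format a 64-bit pattern according to the signedness of its type."""
    bits &= MASK64
    if is_unsigned:
        return str(bits)
    signed = bits - (1 << 64) if bits >> 63 else bits   # reinterpret as i64
    return ("-" if signed < 0 else "") + str(magnitude_u64(signed))

print(format_int(1 << 63, is_unsigned=False))   # -9223372036854775808
print(format_int(MASK64, is_unsigned=True))     # 18446744073709551615
print(format_int(MASK64, is_unsigned=False))    # -1
```

The magnitude of i64::MIN is 2^63, which fits in an unsigned 64-bit value even though it does not fit in a signed one, so no special case for MIN is needed on this route.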