`type_name` / `type_is_unsigned` on an `Any` argument unconditionally read
the Any's payload as a TypeId index. That is correct only when the Any holds
a Type value (`{ .any, tid }`); for an Any holding a runtime *value*
(`av : Any = 6`, tag s64, payload 6) it returned `types[6]` — `type_name(av)`
gave "u8" and `type_is_unsigned(av)` gave true.

Both backends now branch on the Any's runtime type-tag: tag == `.any` → the
box is a Type value, use the payload as the TypeId; otherwise the tag IS the
held value's type. So `type_name(av)` → "s64", `type_is_unsigned(av)` → false,
while `type_name(type_of(x))` still names the held type. The `{}` formatter is
unchanged (it already passed `type_of(val)`, a proper Type value).

- src/ir/interp.zig: shared `Value.reflectTypeId` tag-branching resolver; the
  `type_name` / `type_is_unsigned` interp arms route through it.
- src/backend/llvm/ops.zig: shared `Ops.reflectArgTypeId` emits
  extractvalue-tag / icmp-eq-.any / select for the runtime path; both
  reflection arms route through it. The two backends agree.
- examples/0164-types-reflection-any-tag.sx: regression pinning type_name /
  type_is_unsigned / print on an Any holding a value vs a Type.
- src/ir/interp.test.zig: unit test for `reflectTypeId`.
- 22 .ir snapshots: the new select appears in every std-importing program's
  IR (any_to_string embeds these builtins) — benign, verified structurally
  identical apart from the three new instructions.
- issues/0090, specs.md: documented the Any-tag rule.
# 0090 — integer formatter can't render i64::MIN or unsigned all-ones

> STATUS: RESOLVED (F0.8). Both extremes now render correctly:
> `s64.min` → `-9223372036854775808`, `u64.max` → `18446744073709551615`.
>
> **Root cause.**
> - Symptom 1 (i64::MIN): `std.int_to_string` computed the magnitude as
>   `0 - n`, which overflows for `s64::MIN` (its magnitude is
>   unrepresentable as a positive s64) — the value stayed negative, the
>   `while v > 0` loop ran zero times, and only the `-` was emitted.
> - Symptom 2 (unsigned all-ones): `any_to_string`'s `case int:` arm
>   formatted every integer as s64 (`int_to_string(xx val)`); there was no
>   way to tell a `u64` from an `s64`, so an all-ones u64 printed as `-1`.
>
> **Fix per file.**
> - `library/modules/std.sx` — `int_to_string` now extracts digits straight
>   from `n` (taking `|n % 10|` per digit, `n` truncates toward zero) so it
>   never negates `s64::MIN`. Added `uint_to_string` (unsigned decimal via
>   long-division-by-10 over four 16-bit limbs) and `decompose_u16x4` (the
>   shared 16-bit-limb split, now reused by `int_to_hex_string` too).
>   `any_to_string`'s `case int:` routes through the new
>   `type_is_unsigned(type)` query to pick the unsigned vs signed formatter.
>   Declared `type_is_unsigned :: ($T: Type) -> bool #builtin;`.
> - `src/ir/types.zig` — `TypeTable.isUnsignedInt` (canonical signedness
>   predicate; single source of truth).
> - `src/ir/inst.zig` — `type_is_unsigned` BuiltinId.
> - `src/ir/calls.zig` — register `type_is_unsigned` as a `.bool` reflection
>   builtin.
> - `src/ir/lower.zig` — `tryLowerReflectionCall` arm: static fold +
>   dynamic `callBuiltin`.
> - `src/ir/interp.zig` — interp arm (reads the boxed TypeId / `type_of`
>   aggregate shape).
> - `src/ir/emit_llvm.zig` + `src/backend/llvm/reflection.zig` +
>   `src/backend/llvm/ops.zig` — lazy `[N x i1]` `__sx_type_is_unsigned`
>   table built from `isUnsignedInt`; runtime arm GEPs in at the TypeId.
>
> **Regression test.** `examples/0046-basic-int-formatter-extremes.sx`
> pins both extremes plus a width spread (s8/s16/s32 + u8/u16/u32/u64,
> mins/maxes, 0, ordinary values). Unit tests: `isUnsignedInt` in
> `src/ir/types.test.zig`.
>
> **Follow-up (F0.8 attempt 2) — strict `$T: Type` on all 7 reflection
> builtins.** The stress-review of the additive `type_is_unsigned` builtin
> found it (and the whole reflection family) silently accepted a non-type
> argument: `type_is_unsigned(6)` reinterpreted `6` as a TypeId index and
> returned the signedness of `types[6]` (`u8` → true); `size_of(6)`/`(true)`
> sized its `typeof` (8); `type_name(6)` returned `types[6]`'s name.
> Per Agra's ruling, all 7 type-introspection builtins — `size_of`,
> `align_of`, `field_count`, `type_name`, `type_eq`, `type_is_unsigned`,
> `is_flags` — now STRICTLY require a type (compile-time): a value argument
> is rejected with `"<builtin> expects a type, got '<type>'"`.
> - `src/ir/lower.zig` — one shared guard, `reflectionTypeArgGuard` (run at
>   the top of `tryLowerReflectionCall`), classifies each arg via
>   `reflectionArgIsType`: a spelled / compile-time type or generic type
>   param (the `isStaticTypeArg` shapes), or a runtime `Type` value (static
>   type `.any` — `type_of(x)`, a `[]Type` element `list[i]`, a `Type`-typed
>   local / field / param) is ACCEPTED; anything else is rejected. The
>   existing runtime path for `type_name` / `type_is_unsigned` is preserved
>   (the formatter calls `type_is_unsigned(type_of(val))` at runtime). The 5
>   comptime-only builtins stay comptime-only (runtime reflection deferred).
> - Negative regression: `examples/1144-diagnostics-reflection-builtin-needs-type.sx`
>   (reject cases across all 7, exit 1). Unit test: `reflectionArgIsType` in
>   `src/ir/lower.test.zig`.
>
> **Follow-up (F0.8 attempt 3) — reflection builtins on an `Any` consult the
> Any's runtime TYPE-TAG, not its payload.** The attempt-2 guard correctly
> accepts an `Any` argument (the formatter passes `val: Any`), but the dynamic
> `type_name` / `type_is_unsigned` path still read the Any's payload as a
> TypeId index unconditionally — correct only when the Any holds a *Type
> value*. For an Any holding a *value* (`av : Any = 6`, runtime tag `s64`,
> payload `6`) it reported `types[6]` (`u8`): `type_name(av)` → `"u8"`,
> `type_is_unsigned(av)` → `true`. Per Agra's ruling ("Any is a type AND a
> value, so it's expected to work"), both builtins now branch on the Any's
> runtime tag: tag `== .any` → the box is a Type value, use the payload as the
> TypeId; otherwise the tag IS the held value's type. So `type_name(av)` →
> `"s64"`, `type_is_unsigned(av)` → `false`, while `type_name(type_of(x))`
> still names the held type. The formatter is unchanged (it already passed
> `type_of(val)`, a proper Type value).
> - `src/ir/interp.zig` — shared `Value.reflectTypeId` (the tag-branching
>   resolver); the `type_name` / `type_is_unsigned` interp arms route through
>   it. `src/backend/llvm/ops.zig` — shared `Ops.reflectArgTypeId` emits
>   `extractvalue tag` / `icmp eq tag, .any` / `select` for the runtime path;
>   both reflection arms route through it. The two backends agree.
> - Regression: `examples/0164-types-reflection-any-tag.sx` pins `type_name` /
>   `type_is_unsigned` / `print` on an Any holding a value vs. a Type value.
>   Unit test: `reflectTypeId` in `src/ir/interp.test.zig`.
> - Out of scope (kept comptime-only / deferred): the 5 comptime-only builtins
>   (`size_of`/`align_of`/`field_count`/`is_flags`/`type_eq`). `type_eq` has no
>   dynamic emit path (it folds at lower time), so it is unaffected.

> STATUS (original): OPEN. Pre-existing + orthogonal; surfaced (not introduced) by NL.1.
> Manager-verified independent of the numeric-limit accessors. Scheduled separately.

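The Any-tag rule from the attempt 3 follow-up can be sketched outside the compiler. The block below is a minimal model in Rust, not the project's Zig code: `AnyBox`, `TYPES`, `type_value`, and the table layout are hypothetical names chosen for illustration, and the table only mirrors the one fact the write-up states (index 6 is `u8`).

```rust
/// Hypothetical TypeId: an index into a type table.
type TypeId = usize;

/// Illustrative type table of (name, is_unsigned) pairs.
/// Only "index 6 is u8" comes from the write-up; the rest is made up.
const TYPES: [(&str, bool); 9] = [
    ("void", false),
    ("bool", false),
    ("s8", false),
    ("s16", false),
    ("s32", false),
    ("s64", false),
    ("u8", true),
    ("u16", true),
    ("any", false),
];
const S64: TypeId = 5;
const ANY: TypeId = 8;

/// A boxed Any: a runtime type tag plus a raw payload.
#[derive(Clone, Copy)]
struct AnyBox {
    tag: TypeId,
    payload: u64,
}

/// Tag-branching resolver: a tag of ANY means the box holds a Type value,
/// so the payload is the TypeId; otherwise the tag is the held value's type.
fn reflect_type_id(a: AnyBox) -> TypeId {
    if a.tag == ANY {
        a.payload as TypeId
    } else {
        a.tag
    }
}

fn type_name(a: AnyBox) -> &'static str {
    TYPES[reflect_type_id(a)].0
}

fn type_is_unsigned(a: AnyBox) -> bool {
    TYPES[reflect_type_id(a)].1
}

/// Builds a Type value, i.e. the `{ .any, tid }` shape.
fn type_value(tid: TypeId) -> AnyBox {
    AnyBox { tag: ANY, payload: tid as u64 }
}
```

With `av = AnyBox { tag: S64, payload: 6 }` the resolver returns the tag, so `type_name(av)` is `"s64"`; with `type_value(6)` it returns the payload, so the name is `"u8"`. The old behaviour corresponds to always taking the `a.payload` branch.
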
## Symptom

`print("{}", x)` mis-renders the integer extremes the s64-based formatter can't
represent:
- `i64::MIN` (`-9223372036854775808`) prints a bare `-` (the minus sign with NO
  digits).
- An unsigned all-ones value (e.g. `u64.max` = 18446744073709551615) prints `-1`
  (the i64 bit-reinterpretation), not the unsigned decimal.

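Both symptoms can be reproduced with a negate-then-peel-digits formatter. The following Rust sketch is an assumption about the shape of the old code, not a copy of it: `buggy_int_to_string` is a hypothetical name, and `wrapping_sub` stands in for sx's overflow behaviour, which is assumed here to wrap like two's complement.

```rust
/// Hypothetical model of the old formatter: negate to get the magnitude,
/// then peel decimal digits while the value is positive.
fn buggy_int_to_string(n: i64) -> String {
    if n == 0 {
        return "0".to_string();
    }
    let mut out = String::new();
    let mut v = n;
    if n < 0 {
        out.push('-');
        // For i64::MIN this wraps straight back to i64::MIN (still negative).
        v = 0i64.wrapping_sub(n);
    }
    let mut digits: Vec<u8> = Vec::new();
    // Runs zero times when v is still negative.
    while v > 0 {
        digits.push(b'0' + (v % 10) as u8);
        v /= 10;
    }
    digits.reverse();
    out.push_str(std::str::from_utf8(&digits).unwrap());
    out
}
```

Calling it with `i64::MIN` yields only the sign, and passing an all-ones `u64` reinterpreted as `i64` (`u64::MAX as i64`) yields `-1`, matching the two bullets above.
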
## Reproduction (no numeric-limit accessor needed — pre-existing)

```sx
#import "modules/std.sx";
main :: () {
    x := -9223372036854775807 - 1; // i64::MIN
    print("min={}\n", x); // prints "min=-" (should be -9223372036854775808)
}
```

`u64.max` (via the NL.1 accessor, or any all-ones u64) prints `-1` for the same
root reason.

## Root cause (suspected)

The integer-to-string path is `s64`-based (`std.int_to_string` / the `{}` formatter
takes `s64`): it negates the value to print the sign, but `-i64::MIN` overflows, and
it has no unsigned-aware path so an all-ones u64 is read as `-1`. Needs a
width/signedness-aware integer formatter (format by the value's actual integer TYPE:
unsigned types print the unsigned decimal; signed `MIN` is handled without negating).

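One way to handle signed `MIN` without negating is to take digits directly from the signed value, relying on division and remainder that truncate toward zero. The Rust sketch below illustrates that idea under those assumptions; it reuses the name `int_to_string` for readability but is not the `std.sx` implementation.

```rust
/// Signed decimal formatting that never negates `n`.
fn int_to_string(n: i64) -> String {
    if n == 0 {
        return "0".to_string();
    }
    let mut digits: Vec<u8> = Vec::new();
    let mut v = n;
    while v != 0 {
        // `%` truncates toward zero, so the remainder lies in -9..=9;
        // its absolute value is the next decimal digit.
        let d = (v % 10).unsigned_abs() as u8;
        digits.push(b'0' + d);
        // `/` also truncates toward zero, so v moves toward 0 from either side.
        v /= 10;
    }
    if n < 0 {
        digits.push(b'-');
    }
    // Digits were collected least-significant first.
    digits.reverse();
    String::from_utf8(digits).unwrap()
}
```

No intermediate value ever exceeds the magnitude of `n`, so `i64::MIN` goes through the same loop as every other input. Formatting the magnitude through an unsigned type is the other common option.
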
## Investigation prompt

Make the `{}` integer formatter type-aware: render an unsigned integer as its
unsigned decimal (all 64 bits for u64), and handle signed `MIN` without the
`-MIN` overflow (e.g. format the magnitude via unsigned arithmetic, or special-case
MIN). Verify: `i64::MIN` prints `-9223372036854775808`; `u64.max` prints
`18446744073709551615`; existing numeric output (incl. the NL.1 examples, which
assert via bit-reinterpret) stays green. Likely area: the formatter / `int_to_string`
in the std print path and/or the comptime `{}` lowering.
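For the unsigned half, the resolution note describes long division by 10 over four 16-bit limbs. A rough Rust sketch of that technique follows. The function names echo the ones in the note, but the bodies are an independent illustration. The limbs are held in `u32` here so the intermediate `remainder * 65536 + limb` fits without widening, which is an assumption of this sketch.

```rust
/// Splits a u64 into four 16-bit limbs, most significant first.
/// Each limb is stored in a u32 to leave headroom for the division step.
fn decompose_u16x4(n: u64) -> [u32; 4] {
    [
        ((n >> 48) & 0xFFFF) as u32,
        ((n >> 32) & 0xFFFF) as u32,
        ((n >> 16) & 0xFFFF) as u32,
        (n & 0xFFFF) as u32,
    ]
}

/// Unsigned decimal via repeated long division of the limb array by 10.
fn uint_to_string(n: u64) -> String {
    if n == 0 {
        return "0".to_string();
    }
    let mut limbs = decompose_u16x4(n);
    let mut digits: Vec<u8> = Vec::new();
    while limbs.iter().any(|&l| l != 0) {
        let mut rem: u32 = 0;
        for limb in limbs.iter_mut() {
            // rem < 10, so cur < 10 * 65536 and the quotient fits in 16 bits.
            let cur = (rem << 16) | *limb;
            *limb = cur / 10;
            rem = cur % 10;
        }
        // The final remainder of one pass is the next least-significant digit.
        digits.push(b'0' + rem as u8);
    }
    digits.reverse();
    String::from_utf8(digits).unwrap()
}
```

Each pass divides the whole 64-bit value by 10 using only small-width arithmetic, so it needs no unsigned 64-bit division. A type-aware `{}` formatter would then choose this path when the value's type is unsigned and the signed path otherwise.
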