Files
sx/issues/0076-stack-struct-addrof-passed-by-value.md
agra f49a49cd07 fix(diagnostics): reject reserved/builtin type names used as identifiers (issue 0076)
A value binding (local/global `var` or a parameter) spelled as a
reserved/builtin type name parses as a `.type_expr` rather than an
`.identifier` (parser.zig, via `Type.fromName`), so the address-of
family in lower.zig never saw a scoped local and mis-lowered it —
loading the aggregate and passing it by value to a `ptr` parameter
(LLVM verifier abort, or a silent `*self`-mutation-losing copy).

Add a declaration-site diagnostic in semantic_diagnostics.zig
(`UnknownTypeChecker.checkBindingName`): reject any parameter name or
`var` binding name (`:=` / typed-local / global forms) whose spelling
collides with a reserved type name. `isReservedTypeName` defers to the
parser's own classifier (`types.Type.fromName`) so the rejected set
never drifts from the set that would parse as a type — the named
builtins (bool/string/void/f32/f64/usize/isize/Any) and `[su]N` over
sx's 1-64 range. Bare value names (`s`, `self`, `index`) are untouched.
No lowering special-case; the `.identifier`-only address-of paths are
correct once type-shaped names can never be bound. The rejected
attempt-1 `bareVarName` approach was never landed.

Tests:
- 0125-types-type-named-var-rejected: `:=` form (s2) rejected
  (repurposed from the old test that asserted the now-illegal behavior).
- 1119-diagnostics-reserved-type-name-as-identifier: parameter (u8),
  typed-local (s64, bool), `:=` (string) forms rejected.
- 0135-types-self-streaming-nonreserved: positive — `*self` streaming
  with non-reserved names accumulates correctly via both call styles.
- 0904-optionals: renamed incidental locals s1/s2 -> filled/empty.
2026-06-03 19:00:39 +03:00

126 lines
6.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 0076 — builtin/reserved type name wrongly accepted as an identifier
> **Status: RESOLVED.**
>
> **Root cause:** the language accepted a value binding (local/global `var` or a
> parameter) spelled as a reserved/builtin type name. The parser turns such a
> spelling into a `.type_expr` rather than an `.identifier` (`parser.zig`, via
> `Type.fromName`), so the address-of family in `src/ir/lower.zig` never saw a
> scoped local and fell through to value lowering — loading the whole aggregate
> and passing it by value to a `ptr` parameter (LLVM verifier abort, or a silent
> `*self`-mutation-losing copy).
>
> **Fix:** a declaration-site diagnostic in the existing semantic pass
> `src/ir/semantic_diagnostics.zig` (`UnknownTypeChecker`). New
> `checkBindingName` rejects any parameter name or `var` binding name (local or
> global, `:=` / typed-local forms) whose spelling collides with a reserved type
> name; `isReservedTypeName` defers to the parser's own classifier
> (`types.Type.fromName`) so the rejected set never drifts from the set that
> would parse as a type — the named builtins (`bool`, `string`, `void`, `f32`,
> `f64`, `usize`, `isize`, `Any`) and `[su]N` over sx's 164 range. Bare value
> names (`s`, `self`, `index`) are untouched. No lowering special-case is added;
> the `.identifier`-only address-of paths are correct once type-shaped names can
> never be bound. The rejected `bareVarName` approach was never landed.
>
> **Regression tests:**
> - `examples/0125-types-type-named-var-rejected.sx` — `:=` form (`s2`) rejected.
> - `examples/1119-diagnostics-reserved-type-name-as-identifier.sx` — parameter
> (`u8`), typed-local (`s64`, `bool`), and `:=` (`string`) forms rejected.
> - `examples/0135-types-self-streaming-nonreserved.sx` — positive: `*self`
> streaming with non-reserved names (`hasher`, `ctx`) accumulates correctly via
> both `update(@h, …)` and `h.update(…)`.
>
> Pre-existing example `examples/0904-...` declared locals `s1`/`s2` (incidental
> names); renamed to `filled`/`empty`. Scope: main-file decls only, matching the
> pass's existing trusted-imports convention.
## Symptom (how it first surfaced)
A local variable whose name is lexically a type — e.g. `s2` (the `sN`
arbitrary-width signed-int syntax: `Type.fromName("s2")``s(2)`), or `u8`,
`s64`, etc. — is accepted as a variable. Because such a name parses as a
`.type_expr` (not `.identifier`), the address-of family of lowering sites
(`@s2`, the autoref `s2.update(...)` receiver, a bare `f(s2)` at a `*T` param,
global function-pointer args) does NOT recognize it as a scoped local and falls
through to value lowering — loading the whole aggregate and passing it **by
value** to a `ptr` parameter:
```
LLVM verification failed: Call parameter type does not match function signature!
call void @update(ptr @__sx_default_context,
{ [8 x i64], [64 x i8], i64, i64 } %load, ...)
```
For some struct shapes it compiles but silently passes a **copy** (callee
`*self` mutations lost). A non-type-shaped name (`hasher`, `ctx`) never triggers
any of this — the `.identifier` paths already work correctly.
## Root cause
The language is **accepting reserved/builtin type names as identifiers** in the
first place. `sN`/`uN` (arbitrary-width ints) and the named builtins
(`bool`, `string`, `void`, `f32`, `f64`, `s8`/`s16`/`s32`/`s64`,
`u8`/`u16`/`u32`/`u64`, …) are reserved type names; declaring a variable with
such a name is meaningless and produces the mis-lowering above. Patching each
address-of site to tolerate the name (the rejected `bareVarName` approach) is
whack-a-mole — there is always another site, and it entrenches a name that
should never have been allowed.
## Proper fix (the required direction)
Emit a **diagnostic error** when an identifier is declared with a name that
collides with a **builtin/reserved type name** — including the arbitrary-width
`[su][0-9]+` (`sN`/`uN`) family AND the named builtins (`bool`, `string`,
`void`, `f32`, `f64`, the fixed-width int types, etc.). Scope ruling (Agra):
**all builtin/reserved type names** are rejected as identifiers. (User-defined
struct/type-name shadowing, if intentionally supported elsewhere, is out of
scope for this issue — this is specifically about builtin/reserved type names.)
Diagnostic at the declaration site, e.g.:
`error: 'u8' is a reserved type name and cannot be used as an identifier`
with the declaration's span.
Suspected area: name binding / declaration handling — where a `:=` / typed
local / parameter name is introduced. Reject the name there, before it ever
reaches lowering. Do NOT add lowering special-cases for type-shaped names; the
`.identifier`-only checks at the address-of sites are then correct as-is (no
type-shaped name can reach them).
## Reproduction
```sx
#import "modules/std.sx";
Sha256 :: struct { h:[8]u64; block:[64]u8; block_len:s64=0; total_len:u64=0; }
init :: () -> Sha256 { s:Sha256=---; s.block_len=0; s.total_len=0; s }
update :: (self:*Sha256, data:string) { self.total_len += data.len; }
main :: () -> s32 { s2 := init(); update(@s2, "."); print("total_len={}\n", s2.total_len); return 0; }
```
`./zig-out/bin/sx run <file>` today → LLVM verifier abort.
**Expected after fix:** a clean compile-time diagnostic that `s2` is a reserved
type name and cannot be an identifier (exit non-zero, readable error — NOT an
LLVM abort, NOT a silent copy). The same program with a non-reserved name
(`hasher := init(); update(@hasher, ".")`) must compile and print `total_len=1`.
## Verification
1. Pinned diagnostics test(s) asserting the error for representative reserved
names used as identifiers: `s2`, `u8`, `s64`, `bool`, `string` (declaration
forms: `:=`, typed local, and a parameter name). Capture the diagnostic text
in `expected/`.
2. A positive test: the same `*self` streaming pattern with NON-reserved names
(`hasher`, `ctx`) compiles and accumulates state correctly via both
`update(@h, ...)` and `h.update(...)` — proving the `.identifier` paths are
correct and no lowering special-case is needed.
3. `zig build && zig build test && bash tests/run_examples.sh` all green. If any
existing example/test declares a variable with a reserved type name, it is now
illegal — fix the test's variable name (do NOT weaken the diagnostic). Report
how many such sites existed.
## Provenance
Discovered by the `distribution` flow (P1.2 pure-sx SHA-256), whose minimal repro
happened to name a local `s2`. Real SHA-256 code with names like `hasher`/`ctx`
is unaffected on the current compiler — so the P1.2 "blocker" was a
naming artifact, and this issue is really a missing-diagnostic correctness bug.