The reserved-type-name binding diagnostic fired correctly but underlined the enclosing statement / if / while / for / match / protocol / #objc_class block because every binding-name check reused the parent `node.span`. Thread each binding name's own span through the AST and parser, and pass it to `checkBindingNames`: - ast: add name spans to VarDecl, DestructureDecl, If/WhileExpr, ForExpr (capture + index), MatchArm, Catch/OnFailStmt, Protocol/ForeignMethodDecl. - parser: populate each span at the binding site from the name token's loc; destructure reuses each target identifier's own span. - semantic_diagnostics: every checkBindingName call now passes the binding's own span — no site falls back to node.span. fn/lambda params already used Param.name_span. Carets now land on the offending identifier itself. New regression examples/1125 asserts the protocol default-body and sx-defined #objc_class method param spans; 0125/1119-1124 expected updated to the precise carets.
171 lines
9.5 KiB
Markdown
171 lines
9.5 KiB
Markdown
# 0076 — builtin/reserved type name wrongly accepted as an identifier
|
||
|
||
> **Status: RESOLVED.**
|
||
>
|
||
> **Root cause:** the language accepted a value binding (local/global `var` or a
|
||
> parameter) spelled as a reserved/builtin type name. The parser turns such a
|
||
> spelling into a `.type_expr` rather than an `.identifier` (`parser.zig`, via
|
||
> `Type.fromName`), so the address-of family in `src/ir/lower.zig` never saw a
|
||
> scoped local and fell through to value lowering — loading the whole aggregate
|
||
> and passing it by value to a `ptr` parameter (LLVM verifier abort, or a silent
|
||
> `*self`-mutation-losing copy).
|
||
>
|
||
> **Fix:** a declaration-site diagnostic in the existing semantic pass
|
||
> `src/ir/semantic_diagnostics.zig` (`UnknownTypeChecker`). `checkBindingName`
|
||
> rejects any binding name whose spelling collides with a reserved type name;
|
||
> `isReservedTypeName` defers to the parser's own classifier
|
||
> (`types.Type.fromName`) so the rejected set never drifts from the set that
|
||
> would parse as a type — the named builtins (`bool`, `string`, `void`, `f32`,
|
||
> `f64`, `usize`, `isize`, `Any`) and `[su]N` over sx's 1–64 range. Bare value
|
||
> names (`s`, `self`, `index`) are untouched. No lowering special-case is added;
|
||
> the `.identifier`-only address-of paths are correct once type-shaped names can
|
||
> never be bound. The rejected `bareVarName` approach was never landed.
|
||
>
|
||
> **Coverage is structural (attempt 4).** Earlier landings hand-walked a subset
|
||
> of binding-bearing nodes with a silent `else => {}`, so each review found a new
|
||
> leaking syntactic form (destructure names, `impl` method params/locals, `if` /
|
||
> `while` / `for` / match-arm / `catch` / `onfail` captures) that bypassed the
|
||
> check and hit the original LLVM verifier abort. `checkBindingNames` is now an
|
||
> **exhaustive `switch` over every `Node.Data` tag with NO `else` arm**: a future
|
||
> binding-bearing node type fails to compile until it is handled here, so
|
||
> coverage is enforced by the compiler rather than by a hand-maintained list. The
|
||
> check stays in the pre-lowering semantic pass (NOT moved to the `Scope.put`
|
||
> scope-registration choke point) because lowering is lazy — an UNCALLED
|
||
> function's bindings never reach `Scope.put`, yet they must still be rejected at
|
||
> their declaration (e.g. `examples/1119`'s never-called `takes_u8`).
|
||
>
|
||
> **Span precision (attempt 5).** Every binding form now carries its own
|
||
> name span in the AST (`VarDecl.name_span`, `DestructureDecl.name_spans`,
|
||
> `IfExpr`/`WhileExpr.binding_span`, `ForExpr.capture_span`/`index_span`,
|
||
> `MatchArm.capture_span`, `CatchExpr`/`OnFailStmt.binding_span`,
|
||
> `Protocol`/`ForeignMethodDecl.param_name_spans`), populated by the parser at
|
||
> each binding site. `checkBindingNames` passes that span to the diagnostic, so
|
||
> the caret underlines the offending identifier itself instead of the enclosing
|
||
> statement / `if` / `match` / `protocol` / `#objc_class` block. No call site
|
||
> falls back to the parent `node.span`. Regular `fn`/lambda params already used
|
||
> `Param.name_span`.
|
||
>
|
||
> **Regression tests:**
|
||
> - `examples/0125-types-type-named-var-rejected.sx` — `:=` form (`s2`) rejected.
|
||
> - `examples/1119-diagnostics-reserved-type-name-as-identifier.sx` — parameter
|
||
> (`u8`), typed-local (`s64`, `bool`), and `:=` (`string`) forms rejected.
|
||
> - `examples/1121-diagnostics-reserved-name-control-flow.sx` — destructure name,
|
||
> `if` / `while` optional bindings, `for` capture + index names, match-arm
|
||
> capture.
|
||
> - `examples/1122-diagnostics-reserved-name-impl-method.sx` — `impl`-block method
|
||
> reserved param AND reserved local.
|
||
> - `examples/1123-diagnostics-reserved-name-catch-onfail.sx` — `catch` and
|
||
> `onfail` error-tag bindings.
|
||
> - `examples/1124-diagnostics-imported-reserved-destructure.sx` — destructure
|
||
> name reserved in an IMPORTED module (renders against that module's source).
|
||
> - `examples/1125-diagnostics-reserved-name-method-param.sx` — protocol
|
||
> default-body method param AND sx-defined `#objc_class` method param, each
|
||
> caret landing on the parameter token.
|
||
> - `examples/0135-types-self-streaming-nonreserved.sx` — positive: `*self`
|
||
> streaming with non-reserved names (`hasher`, `ctx`) accumulates correctly via
|
||
> both `update(@h, …)` and `h.update(…)`.
|
||
>
|
||
> Pre-existing example `examples/0904-...` declared locals `s1`/`s2` (incidental
|
||
> names); renamed to `filled`/`empty`.
|
||
>
|
||
> **Coverage extension (issue 0077).** The first landing scoped the binding
|
||
> check to main-file decls (matching the unknown-type check's trusted-imports
|
||
> convention); an imported module could still declare `s2 := …` and hit the
|
||
> original LLVM verifier abort. The reserved-name binding diagnostic now runs
|
||
> over EVERY compiled module — imported user modules (descending the
|
||
> `namespace_decl` an `mod :: #import` wraps) AND the stdlib `library/` — and
|
||
> the two `u1` locals in `library/modules/ui/renderer.sx` were renamed
|
||
> accordingly. The unknown-type check (issue 0064) stays main-file-only. See
|
||
> issue 0077 for the imported-module facet and its pinned regression test
|
||
> `examples/1120-diagnostics-imported-reserved-type-name.sx`.
|
||
|
||
## Symptom (how it first surfaced)
|
||
|
||
A local variable whose name is lexically a type — e.g. `s2` (the `sN`
|
||
arbitrary-width signed-int syntax: `Type.fromName("s2")` → `s(2)`), or `u8`,
|
||
`s64`, etc. — is accepted as a variable. Because such a name parses as a
|
||
`.type_expr` (not `.identifier`), the address-of family of lowering sites
|
||
(`@s2`, the autoref `s2.update(...)` receiver, a bare `f(s2)` at a `*T` param,
|
||
global function-pointer args) does NOT recognize it as a scoped local and falls
|
||
through to value lowering — loading the whole aggregate and passing it **by
|
||
value** to a `ptr` parameter:
|
||
|
||
```
|
||
LLVM verification failed: Call parameter type does not match function signature!
|
||
call void @update(ptr @__sx_default_context,
|
||
{ [8 x i64], [64 x i8], i64, i64 } %load, ...)
|
||
```
|
||
|
||
For some struct shapes it compiles but silently passes a **copy** (callee
|
||
`*self` mutations lost). A non-type-shaped name (`hasher`, `ctx`) never triggers
|
||
any of this — the `.identifier` paths already work correctly.
|
||
|
||
## Root cause
|
||
|
||
The language is **accepting reserved/builtin type names as identifiers** in the
|
||
first place. `sN`/`uN` (arbitrary-width ints) and the named builtins
|
||
(`bool`, `string`, `void`, `f32`, `f64`, `s8`/`s16`/`s32`/`s64`,
|
||
`u8`/`u16`/`u32`/`u64`, …) are reserved type names; declaring a variable with
|
||
such a name is meaningless and produces the mis-lowering above. Patching each
|
||
address-of site to tolerate the name (the rejected `bareVarName` approach) is
|
||
whack-a-mole — there is always another site, and it entrenches a name that
|
||
should never have been allowed.
|
||
|
||
## Proper fix (the required direction)
|
||
|
||
Emit a **diagnostic error** when an identifier is declared with a name that
|
||
collides with a **builtin/reserved type name** — including the arbitrary-width
|
||
`[su][0-9]+` (`sN`/`uN`) family AND the named builtins (`bool`, `string`,
|
||
`void`, `f32`, `f64`, the fixed-width int types, etc.). Scope ruling (Agra):
|
||
**all builtin/reserved type names** are rejected as identifiers. (User-defined
|
||
struct/type-name shadowing, if intentionally supported elsewhere, is out of
|
||
scope for this issue — this is specifically about builtin/reserved type names.)
|
||
|
||
Diagnostic at the declaration site, e.g.:
|
||
`error: 'u8' is a reserved type name and cannot be used as an identifier`
|
||
with the declaration's span.
|
||
|
||
Suspected area: name binding / declaration handling — where a `:=` / typed
|
||
local / parameter name is introduced. Reject the name there, before it ever
|
||
reaches lowering. Do NOT add lowering special-cases for type-shaped names; the
|
||
`.identifier`-only checks at the address-of sites are then correct as-is (no
|
||
type-shaped name can reach them).
|
||
|
||
## Reproduction
|
||
|
||
```sx
|
||
#import "modules/std.sx";
|
||
Sha256 :: struct { h:[8]u64; block:[64]u8; block_len:s64=0; total_len:u64=0; }
|
||
init :: () -> Sha256 { s:Sha256=---; s.block_len=0; s.total_len=0; s }
|
||
update :: (self:*Sha256, data:string) { self.total_len += data.len; }
|
||
main :: () -> s32 { s2 := init(); update(@s2, "."); print("total_len={}\n", s2.total_len); return 0; }
|
||
```
|
||
|
||
`./zig-out/bin/sx run <file>` today → LLVM verifier abort.
|
||
**Expected after fix:** a clean compile-time diagnostic that `s2` is a reserved
|
||
type name and cannot be an identifier (exit non-zero, readable error — NOT an
|
||
LLVM abort, NOT a silent copy). The same program with a non-reserved name
|
||
(`hasher := init(); update(@hasher, ".")`) must compile and print `total_len=1`.
|
||
|
||
## Verification
|
||
|
||
1. Pinned diagnostics test(s) asserting the error for representative reserved
|
||
names used as identifiers: `s2`, `u8`, `s64`, `bool`, `string` (declaration
|
||
forms: `:=`, typed local, and a parameter name). Capture the diagnostic text
|
||
in `expected/`.
|
||
2. A positive test: the same `*self` streaming pattern with NON-reserved names
|
||
(`hasher`, `ctx`) compiles and accumulates state correctly via both
|
||
`update(@h, ...)` and `h.update(...)` — proving the `.identifier` paths are
|
||
correct and no lowering special-case is needed.
|
||
3. `zig build && zig build test && bash tests/run_examples.sh` all green. If any
|
||
existing example/test declares a variable with a reserved type name, it is now
|
||
illegal — fix the test's variable name (do NOT weaken the diagnostic). Report
|
||
how many such sites existed.
|
||
|
||
## Provenance
|
||
|
||
Discovered by the `distribution` flow (P1.2 pure-sx SHA-256), whose minimal repro
|
||
happened to name a local `s2`. Real SHA-256 code with names like `hasher`/`ctx`
|
||
is unaffected on the current compiler — so the P1.2 "blocker" was a
|
||
naming artifact, and this issue is really a missing-diagnostic correctness bug.
|