Files
sx/issues/0076-stack-struct-addrof-passed-by-value.md
agra 6433eb6155 fix(diagnostics): point reserved-type-name binding errors at the binding (issue 0076)
The reserved-type-name binding diagnostic fired correctly but underlined the
enclosing statement / if / while / for / match / protocol / #objc_class block
because every binding-name check reused the parent `node.span`.

Thread each binding name's own span through the AST and parser, and pass it to
`checkBindingNames`:

- ast: add name spans to VarDecl, DestructureDecl, If/WhileExpr, ForExpr
  (capture + index), MatchArm, Catch/OnFailStmt, Protocol/ForeignMethodDecl.
- parser: populate each span at the binding site from the name token's loc;
  destructure reuses each target identifier's own span.
- semantic_diagnostics: every checkBindingName call now passes the binding's
  own span — no site falls back to node.span. fn/lambda params already used
  Param.name_span.

Carets now land on the offending identifier itself. New regression
examples/1125 asserts the protocol default-body and sx-defined #objc_class
method param spans; 0125/1119-1124 expected updated to the precise carets.
2026-06-03 22:06:56 +03:00

171 lines
9.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 0076 — builtin/reserved type name wrongly accepted as an identifier
> **Status: RESOLVED.**
>
> **Root cause:** the language accepted a value binding (local/global `var` or a
> parameter) spelled as a reserved/builtin type name. The parser turns such a
> spelling into a `.type_expr` rather than an `.identifier` (`parser.zig`, via
> `Type.fromName`), so the address-of family in `src/ir/lower.zig` never saw a
> scoped local and fell through to value lowering — loading the whole aggregate
> and passing it by value to a `ptr` parameter (LLVM verifier abort, or a silent
> `*self`-mutation-losing copy).
>
> **Fix:** a declaration-site diagnostic in the existing semantic pass
> `src/ir/semantic_diagnostics.zig` (`UnknownTypeChecker`). `checkBindingName`
> rejects any binding name whose spelling collides with a reserved type name;
> `isReservedTypeName` defers to the parser's own classifier
> (`types.Type.fromName`) so the rejected set never drifts from the set that
> would parse as a type — the named builtins (`bool`, `string`, `void`, `f32`,
> `f64`, `usize`, `isize`, `Any`) and `[su]N` over sx's 164 range. Bare value
> names (`s`, `self`, `index`) are untouched. No lowering special-case is added;
> the `.identifier`-only address-of paths are correct once type-shaped names can
> never be bound. The rejected `bareVarName` approach was never landed.
>
> **Coverage is structural (attempt 4).** Earlier landings hand-walked a subset
> of binding-bearing nodes with a silent `else => {}`, so each review found a new
> leaking syntactic form (destructure names, `impl` method params/locals, `if` /
> `while` / `for` / match-arm / `catch` / `onfail` captures) that bypassed the
> check and hit the original LLVM verifier abort. `checkBindingNames` is now an
> **exhaustive `switch` over every `Node.Data` tag with NO `else` arm**: a future
> binding-bearing node type fails to compile until it is handled here, so
> coverage is enforced by the compiler rather than by a hand-maintained list. The
> check stays in the pre-lowering semantic pass (NOT moved to the `Scope.put`
> scope-registration choke point) because lowering is lazy — an UNCALLED
> function's bindings never reach `Scope.put`, yet they must still be rejected at
> their declaration (e.g. `examples/1119`'s never-called `takes_u8`).
>
> **Span precision (attempt 5).** Every binding form now carries its own
> name span in the AST (`VarDecl.name_span`, `DestructureDecl.name_spans`,
> `IfExpr`/`WhileExpr.binding_span`, `ForExpr.capture_span`/`index_span`,
> `MatchArm.capture_span`, `CatchExpr`/`OnFailStmt.binding_span`,
> `Protocol`/`ForeignMethodDecl.param_name_spans`), populated by the parser at
> each binding site. `checkBindingNames` passes that span to the diagnostic, so
> the caret underlines the offending identifier itself instead of the enclosing
> statement / `if` / `match` / `protocol` / `#objc_class` block. No call site
> falls back to the parent `node.span`. Regular `fn`/lambda params already used
> `Param.name_span`.
>
> **Regression tests:**
> - `examples/0125-types-type-named-var-rejected.sx` — `:=` form (`s2`) rejected.
> - `examples/1119-diagnostics-reserved-type-name-as-identifier.sx` — parameter
> (`u8`), typed-local (`s64`, `bool`), and `:=` (`string`) forms rejected.
> - `examples/1121-diagnostics-reserved-name-control-flow.sx` — destructure name,
> `if` / `while` optional bindings, `for` capture + index names, match-arm
> capture.
> - `examples/1122-diagnostics-reserved-name-impl-method.sx` — `impl`-block method
> reserved param AND reserved local.
> - `examples/1123-diagnostics-reserved-name-catch-onfail.sx` — `catch` and
> `onfail` error-tag bindings.
> - `examples/1124-diagnostics-imported-reserved-destructure.sx` — destructure
> name reserved in an IMPORTED module (renders against that module's source).
> - `examples/1125-diagnostics-reserved-name-method-param.sx` — protocol
> default-body method param AND sx-defined `#objc_class` method param, each
> caret landing on the parameter token.
> - `examples/0135-types-self-streaming-nonreserved.sx` — positive: `*self`
> streaming with non-reserved names (`hasher`, `ctx`) accumulates correctly via
> both `update(@h, …)` and `h.update(…)`.
>
> Pre-existing example `examples/0904-...` declared locals `s1`/`s2` (incidental
> names); renamed to `filled`/`empty`.
>
> **Coverage extension (issue 0077).** The first landing scoped the binding
> check to main-file decls (matching the unknown-type check's trusted-imports
> convention); an imported module could still declare `s2 := …` and hit the
> original LLVM verifier abort. The reserved-name binding diagnostic now runs
> over EVERY compiled module — imported user modules (descending the
> `namespace_decl` an `mod :: #import` wraps) AND the stdlib `library/` — and
> the two `u1` locals in `library/modules/ui/renderer.sx` were renamed
> accordingly. The unknown-type check (issue 0064) stays main-file-only. See
> issue 0077 for the imported-module facet and its pinned regression test
> `examples/1120-diagnostics-imported-reserved-type-name.sx`.
## Symptom (how it first surfaced)
A local variable whose name is lexically a type — e.g. `s2` (the `sN`
arbitrary-width signed-int syntax: `Type.fromName("s2")``s(2)`), or `u8`,
`s64`, etc. — is accepted as a variable. Because such a name parses as a
`.type_expr` (not `.identifier`), the address-of family of lowering sites
(`@s2`, the autoref `s2.update(...)` receiver, a bare `f(s2)` at a `*T` param,
global function-pointer args) does NOT recognize it as a scoped local and falls
through to value lowering — loading the whole aggregate and passing it **by
value** to a `ptr` parameter:
```
LLVM verification failed: Call parameter type does not match function signature!
call void @update(ptr @__sx_default_context,
{ [8 x i64], [64 x i8], i64, i64 } %load, ...)
```
For some struct shapes it compiles but silently passes a **copy** (callee
`*self` mutations lost). A non-type-shaped name (`hasher`, `ctx`) never triggers
any of this — the `.identifier` paths already work correctly.
## Root cause
The language is **accepting reserved/builtin type names as identifiers** in the
first place. `sN`/`uN` (arbitrary-width ints) and the named builtins
(`bool`, `string`, `void`, `f32`, `f64`, `s8`/`s16`/`s32`/`s64`,
`u8`/`u16`/`u32`/`u64`, …) are reserved type names; declaring a variable with
such a name is meaningless and produces the mis-lowering above. Patching each
address-of site to tolerate the name (the rejected `bareVarName` approach) is
whack-a-mole — there is always another site, and it entrenches a name that
should never have been allowed.
## Proper fix (the required direction)
Emit a **diagnostic error** when an identifier is declared with a name that
collides with a **builtin/reserved type name** — including the arbitrary-width
`[su][0-9]+` (`sN`/`uN`) family AND the named builtins (`bool`, `string`,
`void`, `f32`, `f64`, the fixed-width int types, etc.). Scope ruling (Agra):
**all builtin/reserved type names** are rejected as identifiers. (User-defined
struct/type-name shadowing, if intentionally supported elsewhere, is out of
scope for this issue — this is specifically about builtin/reserved type names.)
Diagnostic at the declaration site, e.g.:
`error: 'u8' is a reserved type name and cannot be used as an identifier`
with the declaration's span.
Suspected area: name binding / declaration handling — where a `:=` / typed
local / parameter name is introduced. Reject the name there, before it ever
reaches lowering. Do NOT add lowering special-cases for type-shaped names; the
`.identifier`-only checks at the address-of sites are then correct as-is (no
type-shaped name can reach them).
## Reproduction
```sx
#import "modules/std.sx";
Sha256 :: struct { h:[8]u64; block:[64]u8; block_len:s64=0; total_len:u64=0; }
init :: () -> Sha256 { s:Sha256=---; s.block_len=0; s.total_len=0; s }
update :: (self:*Sha256, data:string) { self.total_len += data.len; }
main :: () -> s32 { s2 := init(); update(@s2, "."); print("total_len={}\n", s2.total_len); return 0; }
```
`./zig-out/bin/sx run <file>` today → LLVM verifier abort.
**Expected after fix:** a clean compile-time diagnostic that `s2` is a reserved
type name and cannot be an identifier (exit non-zero, readable error — NOT an
LLVM abort, NOT a silent copy). The same program with a non-reserved name
(`hasher := init(); update(@hasher, ".")`) must compile and print `total_len=1`.
## Verification
1. Pinned diagnostics test(s) asserting the error for representative reserved
names used as identifiers: `s2`, `u8`, `s64`, `bool`, `string` (declaration
forms: `:=`, typed local, and a parameter name). Capture the diagnostic text
in `expected/`.
2. A positive test: the same `*self` streaming pattern with NON-reserved names
(`hasher`, `ctx`) compiles and accumulates state correctly via both
`update(@h, ...)` and `h.update(...)` — proving the `.identifier` paths are
correct and no lowering special-case is needed.
3. `zig build && zig build test && bash tests/run_examples.sh` all green. If any
existing example/test declares a variable with a reserved type name, it is now
illegal — fix the test's variable name (do NOT weaken the diagnostic). Report
how many such sites existed.
## Provenance
Discovered by the `distribution` flow (P1.2 pure-sx SHA-256), whose minimal repro
happened to name a local `s2`. Real SHA-256 code with names like `hasher`/`ctx`
is unaffected on the current compiler — so the P1.2 "blocker" was a
naming artifact, and this issue is really a missing-diagnostic correctness bug.