sx/issues/0089-backtick-raw-identifier.md
agra b9a29c39c5 docs(lang): fix invalid protocol method-signature snippets — (self) -> () -> s64 [F0.6]
A protocol method signature omits the receiver; a bare `self` has no type, so
`protocol { … :: (self) … }` fails at parse with 'expected :'. Correct the three
member-exemption doc snippets (readme.md, specs.md, issues/0089) to the valid
signature form, matching examples/0158's `Speaker :: protocol { s2 :: () -> s64; }`.
2026-06-04 23:02:09 +03:00

# 0089 — backtick raw-identifier escape + `#import c` foreign-name exemption from the reserved-type-name rule
> **✅ RESOLVED** (foundation step F0.6). Two mechanisms, per Agra's design
> ruling; the final shape is the **universal raw identifier** (attempt 4):
> `` `name `` is THE LITERAL identifier `name`, usable in EVERY position — value,
> declaration, AND type — meaning only "treat this token as a plain identifier,
> never the reserved keyword/type." The backtick is never part of the name's text.
>
> 1. **Backtick raw identifier.** The lexer recognises a leading backtick
> (`` `s2 ``) and emits an `.identifier` token whose span excludes the backtick,
> carrying a `Token.is_raw` flag ([src/lexer.zig], [src/token.zig]). The flag
> threads through `ast.Identifier`, `ast.TypeExpr`, and EVERY binding / capture /
> declaration node ([src/ast.zig]): `VarDecl` / `ConstDecl` / `Param` / `FnDecl`
> plus `IfExpr` / `WhileExpr` optional bindings, `ForExpr` capture + index,
> `MatchArm` capture, `CatchExpr` / `OnFailStmt` tag bindings, `DestructureDecl`
> per-name, protocol-default / foreign-class method params, AND every
> type-introducing decl — `StructDecl` / `EnumDecl` / `UnionDecl` /
> `ErrorSetDecl` / `ProtocolDecl` / `ForeignClassDecl` / `UfcsAlias` /
> `NamespaceDecl` / `ImportDecl` / `CImportDecl` / `LibraryDecl`.
>
> - **Value position.** The parser skips `Type.fromName` for a raw identifier
> in expression position ([src/parser.zig] `parsePrimary`), so `` `s2 `` is a
> value identifier; a later bare reference resolves to the binding.
> - **Type position.** `parseTypeExpr` sets the raw flag on the type ATOM and
> lets it flow through the SAME continuations as a bare name (attempt 5), so a
> raw reference parameterizes a reserved-spelled template (`` `s2(s64) ``) and
> composes under the pointer / optional / slice wrappers; `ParameterizedTypeExpr`
> carries `is_raw` and `resolveParameterizedWithBindings` skips the `Vector`
> intrinsic when raw. Resolution skips the builtin classifier
> (`TypeResolver.resolveNamed`'s `skip_builtin`, threaded from `te.is_raw` in
> [src/ir/lower.zig] and [src/ir/type_bridge.zig]) and looks up a
> `` `s2 ``-declared type (struct / enum / union / alias), else a NORMAL
> "unknown type 's2'" error (`UnknownTypeChecker.reportIfUnknownType` skips the
> builtin-name exemption when raw). A bare `s2` in type position is still the
> builtin int. The SECOND (editor/LSP) classifier in [src/sema.zig]
> (`Type.fromTypeExpr` / `resolveTypeNode` / `resolveTypeNameStr`) honors
> `is_raw` too, so a backtick reserved-name annotation resolves to the user type
> in hover/completion, not the builtin (no two-resolver divergence). The raw bit
> is carried STRUCTURALLY through every COMPOUND shape's inner-name metadata —
> `PointerTypeInfo` / `OptionalTypeInfo` / `SliceTypeInfo` / `ManyPointerTypeInfo`
> / `ArrayTypeInfo` each store a REQUIRED `is_raw` ([src/types.zig], no default,
> so a future construction site cannot drop it) that every `resolveTypeNameStr`
> call passes as its `skip_builtin` — so `` *`s2 ``, `` ?`s2 ``, `` [N]`s2 ``,
> `` []`s2 ``, `` [*]`s2 `` field-access / unwrap / index / deref in the editor
> index all reach the user type instead of reclassifying the inner `s2` to the
> builtin (the divergence the DIRECT-only attempt left for compound forms).
> - **Declaration position.** A bare reserved-name declaration of EVERY kind
> still errors (issue 0076 preserved); the backtick form is exempt. The check
> and the exemption are made structurally symmetric:
> `checkBindingName` / `checkDeclName` ([src/ir/semantic_diagnostics.zig]) take
> `is_raw` as a REQUIRED argument and skip inside the check, so no call site can
> validate a name without also honoring the exemption (the desync that recurred
> across the earlier attempts). On the PARSER side the
> symmetry is enforced structurally for the bug-prone node: `ConstDecl`'s
> `name_span` + `is_raw` carry NO default (attempt 5), so the compiler rejects
> any construction site — including the two struct-body const forms (untyped
> `` `s2 :: 5 `` and typed `` `s2 : T : v ``) that previously dropped both —
> that omits them. `FnDecl` is built at every parser site through `parseFnDecl`,
> whose `name_is_raw` is a REQUIRED parameter (the equivalent guarantee); the
> type decls likewise route through parse-functions taking `name_is_raw`.
> - **Member-name positions are exempt** (Agra ruling, attempt 7). A struct
> **field** name, a union **tag** name, and a protocol **method-signature**
> name accept a bare reserved spelling: these sit in a member slot and are
> reached via `obj.name` / dispatched by string, so they are never
> type-classified and never mis-lower — the binding-name walk's `struct_decl`
> / `union_decl` / `enum_decl` / `protocol_decl` arms
> ([src/ir/semantic_diagnostics.zig]) check only the *type* name (and method
> *params*), not field / tag / variant / method-signature names. The backtick
> is optional there (`obj.s2` and `` obj.`s2 `` resolve to the same member).
> This bare member-name exemption covers only the **identifier-classified**
> reserved spellings — `s1`..`s64`, `u1`..`u64`, `bool`, `string`, `void`,
> `usize`, `isize`, `Any` — which all lex as ordinary identifiers. The two
> **keyword-classified** spellings, `f32` and `f64`, are lexer keywords
> ([src/token.zig]), and a member-name slot requires an identifier token
> ([src/parser.zig]); a bare `f32` / `f64` is therefore rejected at parse
> (`expected field name in struct`) even in a member position, and still needs
> the backtick there too — `` struct { `f32: s64; } `` / `` union { `f64: … } ``
> / `` protocol { `f32 :: () -> s64; } `` work as field / tag / method names.
> The exemption stops at member *definitions*: an `impl` method is a real
> function reached through the `impl_block` → `fn_decl` arm, so a
> reserved-spelled impl method needs the backtick (`` `s2 :: (self) ``), no
> more exempt than a free function (cf. `examples/1122`). Pinned by
> `examples/0158-types-reserved-name-member-exempt.sx`.
> 2. **`#import c` foreign-name exemption.** `c_import.zig` synthesizes
> `#foreign` decls with `Param.is_raw = true` (and `is_raw = true` on the
> synthesized `FnDecl`), so generated C names that collide with reserved type names
> (`s1`, `s2`) import unedited and a reserved-name foreign fn is bare-callable.
>
> **Bare-callable foreign / backtick fn.** `lowerCall` rewrites a `.type_expr`
> callee to an identifier when a function **of RAW provenance** of that name is in
> scope ([src/ir/lower.zig]) — scoped to the callee `FnDecl`'s `is_raw` flag, so it
> only ever fires for a backtick / `#import c` foreign fn (the decl check guarantees
> no bare reserved-name fn exists). `s2(4)` therefore resolves to the function; this
> displaces nothing, since `TypeName(val)` is not a cast form.
>
> **Regression tests.** `examples/0151-types-backtick-raw-identifier.sx` (every
> VALUE position), `examples/0152-types-backtick-control-flow.sx` (every
> control-flow / capture form), `examples/0153-types-backtick-const-fn-decl.sx`
> (backtick `::` const + fn decl, bare + backtick call),
> `examples/0154-types-backtick-raw-type-reference.sx` (raw in TYPE position —
> struct / enum / union / alias decl + reference; bare `s2` still the int),
> `examples/0155-types-backtick-typed-const-union-tag.sx` (typed const + union tag),
> `examples/0156-types-backtick-struct-const.sx` (struct-body const, untyped + typed),
> `examples/0157-types-backtick-parameterized-raw-type.sx` (raw parameterized type +
> pointer/field wrappers),
> `examples/0158-types-reserved-name-member-exempt.sx` (bare reserved-name struct
> fields / union tag / protocol method signature — read & written bare and via
> backtick; impl method definition takes the backtick),
> `examples/1054-errors-backtick-reserved-binding.sx` (`catch`/`onfail` tag
> bindings), `examples/1220-ffi-c-import-reserved-name-params.{sx,h,c}` (foreign
> param + fn-name exemption, bare-callable foreign fn); negatives
> `examples/1119`/`1121`/`1123` (bare reserved binding across forms),
> `examples/1140-diagnostics-reserved-name-const-fn-decl.sx` (bare const + fn decl),
> `examples/1141-diagnostics-reserved-name-type-decl.sx` (bare struct / enum / union
> / error / typed-const decl),
> `examples/1142-diagnostics-reserved-name-struct-const.sx` (bare struct-body const,
> caret on the name). Backtick lexer + `resolveNamed(skip_builtin)` unit tests in
> `src/lexer.zig` / `src/ir/type_resolver.test.zig`; the editor/LSP raw-type
> resolution (the second classifier) is pinned in `src/sema.test.zig` — the direct
> case plus raw provenance through every compound shape (`` *`s2 `` field access,
> `` ?`s2 `` unwrap, `` [N]`s2 `` index, parameterized `` `s2(s64) ``), each with a
> bare-spelling control that stays the builtin (fail-before verified).
>
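
The type-position rule from the "Type position" bullet can be modelled in a few lines of Python. This is a behavioural sketch only, not the Zig code behind `TypeResolver.resolveNamed`: `resolve_named`, `BUILTIN_TYPES`, `user_types`, and `UnknownTypeError` are hypothetical names, and the builtin set is a small illustrative subset. The `skip_builtin` argument plays the role of the value threaded from `te.is_raw`.

```python
# Illustrative subset of builtin type names; the real set is larger.
BUILTIN_TYPES = {"s2", "s64", "bool"}


class UnknownTypeError(Exception):
    """Stands in for the ordinary "unknown type" diagnostic."""


def resolve_named(name, skip_builtin, user_types):
    """Resolve a type name.

    skip_builtin models the raw flag: when True the builtin classifier is
    bypassed, so only a user-declared type can match.
    """
    if not skip_builtin and name in BUILTIN_TYPES:
        return ("builtin", name)
    if name in user_types:
        return ("user", user_types[name])
    raise UnknownTypeError(f"unknown type '{name}'")


# A user type declared with the backtick form, e.g. a struct named s2.
user_types = {"s2": "struct"}

bare = resolve_named("s2", False, user_types)  # bare s2: the builtin int
raw = resolve_named("s2", True, user_types)    # raw s2: the user-declared struct
```

With no user-declared `s2` in scope, the raw lookup in this sketch raises `UnknownTypeError`, mirroring the note that a raw reference falls through to a normal unknown-type error rather than the builtin.
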
> The original report is preserved below.
---
## Symptom
Importing non-sx source whose names collide with sx reserved type names is
rejected. `library/modules/stb_truetype.sx` is a `#import c { ... }` block over a
vendored C header (`vendors/stb_truetype/stb_truetype.h`); the C identifiers `s1` and
`s2`, which collide with sx's reserved signed-int type names (`s1`..`s64`), produce:
```
error: 's1' is a reserved type name and cannot be used as an identifier
error: 's2' is a reserved type name and cannot be used as an identifier
```
The user cannot hand-edit these — they are generated from the vendored C header.
Separately, sx-authored code has NO way to use a reserved-name-spelled identifier,
even deliberately.
## Root cause
The parser classifies any reserved-type-name spelling (`s2`, `u8`, `f64`, …) as a
`.type_expr` via `name_class.Type.fromName`, never as an `.identifier`. The F0.1 /
issue-0076 fix added `UnknownTypeChecker.checkBindingName`
(`src/ir/semantic_diagnostics.zig`) to reject a value binding / param spelled as
a reserved type name (the `.type_expr`-vs-`.identifier` mismatch otherwise breaks
address-of / autoref lowering). F0.1 deliberately extended this check to imported
declarations — which is what now fires on the C-imported `s1`/`s2`.
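
The failure mode can be reproduced with a small Python model. It is an illustrative sketch of the pre-fix behaviour, not the compiler's Zig code: `is_reserved_type_name`, `classify`, and `check_binding_name` are hypothetical stand-ins for `Type.fromName` and `UnknownTypeChecker.checkBindingName`, and the reserved set is assumed to be exactly the spellings listed in the resolution note above.

```python
import re

# s1..s64 and u1..u64, per the list in the resolution note.
_INT_NAME = re.compile(r"^[su]([1-9][0-9]?)$")
_OTHER_RESERVED = {"bool", "string", "void", "usize", "isize", "Any", "f32", "f64"}


def is_reserved_type_name(name):
    """True if the spelling is one of the reserved type names."""
    m = _INT_NAME.match(name)
    if m and 1 <= int(m.group(1)) <= 64:
        return True
    return name in _OTHER_RESERVED


def classify(name):
    """Model of the parser step: a reserved spelling never becomes an identifier."""
    return "type_expr" if is_reserved_type_name(name) else "identifier"


def check_binding_name(name):
    """Pre-fix check: reject every binding spelled as a reserved type name.

    Returns the diagnostic text, or None when the name is acceptable.
    """
    if is_reserved_type_name(name):
        return (f"error: '{name}' is a reserved type name "
                "and cannot be used as an identifier")
    return None


# The check has no notion of where the name came from, so a C-imported
# parameter named s1 is rejected exactly like a hand-written one.
diagnostic = check_binding_name("s1")
```

Because the model carries no provenance for the name, there is nowhere to hang an exemption; that is the gap the two mechanisms in the next section fill.
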
## Desired behaviour (Agra ruling)
External / imported source does NOT need to conform to sx naming standards. Two
mechanisms:
1. **Auto-exempt imports.** `#import c` (and other foreign) declarations are
treated as RAW identifiers: foreign names are never type-classified and never
reserved-checked, so generated bindings "just work" with zero user edits.
2. **Backtick raw-identifier for sx code.** A leading backtick makes the following
identifier raw — an identifier that is NEVER type-classified, so it bypasses the
reserved-name rule:
```sx
`s2 := 2.5; // OK — identifier "s2", distinct from the s2 signed-int type
s2 := 2.5; // ERROR — bare s2 is still the reserved type name
```
The escape is a prefix form: a single leading backtick on the identifier. The raw
identifier's TEXT is `s2` (the backtick is not part of the name). A bare `s2` used as
a TYPE remains the signed-int type.
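
A minimal Python sketch of how a lexer could honour the prefix form: the backtick sets a flag and is excluded from both the token text and its span. This is an assumption-laden illustration rather than the sx lexer; `Token`, `lex_identifier`, and the field names are hypothetical, and keyword handling (for example `f32` / `f64`) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str
    text: str     # identifier text, never including the backtick
    start: int    # span start, after any leading backtick
    end: int      # span end (exclusive)
    is_raw: bool  # True when the identifier was written with a backtick


def lex_identifier(src, pos):
    """Lex one identifier at src[pos], accepting an optional leading backtick."""
    if pos >= len(src):
        raise ValueError("unexpected end of input")
    is_raw = src[pos] == "`"
    start = pos + 1 if is_raw else pos
    end = start
    while end < len(src) and (src[end].isalnum() or src[end] == "_"):
        end += 1
    if end == start or src[start].isdigit():
        raise ValueError("expected identifier")
    return Token("identifier", src[start:end], start, end, is_raw)


raw_tok = lex_identifier("`s2 := 2.5;", 0)
bare_tok = lex_identifier("s2 := 2.5;", 0)
```

Both tokens carry the same text, `s2`; only `is_raw` differs, which is what lets later stages decide whether to apply type classification and the reserved-name check.
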
## Reproduction
sx-side (minimal):
```sx
#import "modules/std.sx";
main :: () {
`s2 := 2.5; // must compile: identifier s2 = 2.5
print("{}\n", `s2); // 2.5
}
```
Import-side: a `#import c` block over a header declaring `int s1, s2;` (or
`library/modules/stb_truetype.sx` itself) must NOT emit the reserved-type-name error.
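
The import-side expectation can be sketched in Python as a check that takes the raw flag as a required argument, with synthesized foreign parameters always marked raw. This is a hypothetical model, not the code in `c_import.zig` or `semantic_diagnostics.zig`: `check_binding_name`, `synthesize_foreign_params`, `diagnostics`, and the dict layout are all assumptions, and `RESERVED` is a small illustrative subset.

```python
RESERVED = {"s1", "s2", "s64", "bool"}  # illustrative subset only


def check_binding_name(name, *, is_raw):
    """Return a diagnostic string, or None if the name is acceptable.

    is_raw is keyword-only with no default, so a caller cannot run the
    check without also stating whether the exemption applies.
    """
    if is_raw:
        return None
    if name in RESERVED:
        return (f"error: '{name}' is a reserved type name "
                "and cannot be used as an identifier")
    return None


def synthesize_foreign_params(c_names):
    """Model of the importer: every generated parameter is marked raw."""
    return [{"name": n, "is_raw": True} for n in c_names]


def diagnostics(params):
    """Run the reserved-name check over a list of parameters."""
    found = []
    for p in params:
        d = check_binding_name(p["name"], is_raw=p["is_raw"])
        if d is not None:
            found.append(d)
    return found


imported = diagnostics(synthesize_foreign_params(["s1", "s2"]))
handwritten = diagnostics([{"name": "s2", "is_raw": False}])
```

In this model the imported names produce no diagnostics while a bare, hand-written `s2` still does, and calling `check_binding_name("s2")` without `is_raw` fails immediately with a `TypeError` instead of silently skipping the exemption.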