sx/issues/0089-backtick-raw-identifier.md
agra 724a919fc1 feat(lang): raw provenance through ALL sema compound type metadata — finish universal raw identifier in the LSP classifier [F0.6]
The codegen-side resolver was already raw-aware for the universal model;
the sema/LSP editor index (the second classifier) only honored the DIRECT
raw type. A COMPOUND raw type (*`s2, ?`s2, [N]`s2, []`s2, [*]`s2)
stores its inner type-name as a bare string on the Type info struct, and
every resolution site re-read it with skip_builtin=false — so the index
reclassified a user type named `s2` as the builtin int, diverging from
codegen (issue-0083 class, LSP surface only; codegen unchanged).

Structural cure: every compound info struct (Pointer/Optional/Slice/
ManyPointer/Array) carries a REQUIRED is_raw bit (no default — a future
construction site cannot drop it). is_raw is set at every construction
site (resolveTypeNode arms, fieldType arms, variadic slice, .ptr/slice_expr
derivation, for-loop by-ref, substType) and passed as skip_builtin at every
resolution site (elementTypeOf, field-access pointer unwrap, index, deref,
optional unwrap/null-coalesce, if/while optional binding, match subject).
Optional-unwrap + deref sites converted from Type.fromName/pointerPointeeType
(builtin-only, divergent) to resolveTypeNameStr(name, is_raw); the now-dead
pointerPointeeType removed.

Tests: src/sema.test.zig gains pointer/optional/array raw-vs-bare
regressions (raw → user type, bare → builtin control) — each FAILS on
pre-fix sema, PASSES after — plus a parameterized-raw coverage test.
2026-06-04 21:46:31 +03:00

0089 — backtick raw-identifier escape + #import c foreign-name exemption from the reserved-type-name rule

RESOLVED (foundation step F0.6). Two mechanisms, per Agra's design ruling; the final shape is the universal raw identifier (attempt 4): `name is THE LITERAL identifier name, usable in EVERY position — value, declaration, AND type — meaning only "treat this token as a plain identifier, never the reserved keyword/type." The backtick is never part of the name's text.

  1. Backtick raw identifier. The lexer recognises a leading backtick (`s2) and emits an .identifier token whose span excludes the backtick, carrying a Token.is_raw flag ([src/lexer.zig], [src/token.zig]). The flag threads through ast.Identifier, ast.TypeExpr, and EVERY binding / capture / declaration node ([src/ast.zig]): VarDecl / ConstDecl / Param / FnDecl plus IfExpr / WhileExpr optional bindings, ForExpr capture + index, MatchArm capture, CatchExpr / OnFailStmt tag bindings, DestructureDecl per-name, protocol-default / foreign-class method params, AND every type-introducing decl — StructDecl / EnumDecl / UnionDecl / ErrorSetDecl / ProtocolDecl / ForeignClassDecl / UfcsAlias / NamespaceDecl / ImportDecl / CImportDecl / LibraryDecl.

    • Value position. The parser skips Type.fromName for a raw identifier in expression position ([src/parser.zig] parsePrimary), so `s2 is a value identifier; a later bare reference resolves to the binding.
    • Type position. parseTypeExpr sets the raw flag on the type ATOM and lets it flow through the SAME continuations as a bare name (attempt 5), so a raw reference parameterizes a reserved-spelled template (`s2(s64)) and composes under the pointer / optional / slice wrappers; ParameterizedTypeExpr carries is_raw and resolveParameterizedWithBindings skips the Vector intrinsic when raw. Resolution skips the builtin classifier (TypeResolver.resolveNamed's skip_builtin, threaded from te.is_raw in [src/ir/lower.zig] and [src/ir/type_bridge.zig]) and looks up a `s2-declared type (struct / enum / union / alias), else a NORMAL "unknown type 's2'" error (UnknownTypeChecker.reportIfUnknownType skips the builtin-name exemption when raw). A bare s2 in type position is still the builtin int. The SECOND (editor/LSP) classifier in [src/sema.zig] (Type.fromTypeExpr / resolveTypeNode / resolveTypeNameStr) honors is_raw too, so a backtick reserved-name annotation resolves to the user type in hover/completion, not the builtin (no two-resolver divergence). The raw bit is carried STRUCTURALLY through every COMPOUND shape's inner-name metadata — PointerTypeInfo / OptionalTypeInfo / SliceTypeInfo / ManyPointerTypeInfo / ArrayTypeInfo each store a REQUIRED is_raw ([src/types.zig], no default, so a future construction site cannot drop it) that every resolveTypeNameStr call passes as its skip_builtin — so *`s2, ?`s2, [N]`s2, []`s2, [*]`s2 field-access / unwrap / index / deref in the editor index all reach the user type instead of reclassifying the inner s2 to the builtin (the divergence the DIRECT-only attempt left for compound forms).
    • Declaration position. A bare reserved-name declaration of EVERY kind still errors (issue 0076 preserved); the backtick form is exempt. The check and the exemption are made structurally symmetric: checkBindingName / checkDeclName ([src/ir/semantic_diagnostics.zig]) take is_raw as a REQUIRED argument and skip inside the check — no call site can validate a name without also honoring the exemption, which is what kept the two from desyncing across the earlier attempts. On the PARSER side the symmetry is enforced structurally for the bug-prone node: ConstDecl's name_span + is_raw carry NO default (attempt 5), so the compiler rejects any construction site — including the two struct-body const forms (untyped `s2 :: 5 and typed `s2 : T : v) that previously dropped both — that omits them. FnDecl is built at every parser site through parseFnDecl, whose name_is_raw is a REQUIRED parameter (the equivalent guarantee); the type decls likewise route through parse-functions taking name_is_raw.
  2. #import c foreign-name exemption. c_import.zig synthesizes foreign #foreign decls with Param.is_raw = true (and the synthesized FnDecl is_raw = true), so generated C names that collide with reserved type names (s1, s2) import unedited and a reserved-name foreign fn is bare-callable.
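The lexer and resolver behaviour described in the two items above can be modelled in a few lines. The Python sketch below is illustrative only: the real implementation is Zig, and every name in it (lex_identifier, resolve_named, resolve_pointee, BUILTIN_TYPES, the "builtin:" / "user:" result strings) is an assumption made for this example rather than the project's API. The identifier character class is also simplified.

```python
from dataclasses import dataclass

# Hypothetical subset of the reserved builtin type names (illustration only).
BUILTIN_TYPES = {"s1", "s2", "s64", "u8", "f64"}


@dataclass(frozen=True)
class Token:
    kind: str
    text: str     # identifier text; never includes the backtick
    start: int    # span start (after the backtick when raw)
    end: int      # span end, exclusive
    is_raw: bool


def lex_identifier(src: str, pos: int) -> Token:
    """Lex one identifier at pos; a leading backtick marks it raw."""
    is_raw = src.startswith("`", pos)
    start = pos + 1 if is_raw else pos
    end = start
    while end < len(src) and (src[end].isalnum() or src[end] == "_"):
        end += 1
    if end == start:
        raise ValueError(f"expected identifier at offset {start}")
    return Token("identifier", src[start:end], start, end, is_raw)


def resolve_named(name: str, user_types: dict, skip_builtin: bool) -> str:
    """Resolve a type name; skip_builtin bypasses the builtin classifier."""
    if not skip_builtin and name in BUILTIN_TYPES:
        return f"builtin:{name}"
    if name in user_types:
        return f"user:{user_types[name]}"
    # Raw and undeclared: an ordinary unknown-type error, not a builtin.
    raise LookupError(f"unknown type '{name}'")


@dataclass(frozen=True)
class PointerTypeInfo:
    pointee_name: str
    is_raw: bool  # deliberately no default: every construction site must pass it


def resolve_pointee(info: PointerTypeInfo, user_types: dict) -> str:
    # The stored raw bit is forwarded as skip_builtin at the resolution site.
    return resolve_named(info.pointee_name, user_types, skip_builtin=info.is_raw)
```

In this model, a pointer built with is_raw=True over a user-declared s2 resolves to the user type, while the same spelling built with is_raw=False stays the builtin. Leaving is_raw without a default means a construction site that forgets it fails immediately instead of silently falling back to the builtin classification.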

Bare-callable foreign / backtick fn. lowerCall rewrites a .type_expr callee to an identifier when a function of RAW provenance of that name is in scope ([src/ir/lower.zig]) — scoped to the callee FnDecl's is_raw flag, so it only ever fires for a backtick / #import c foreign fn (the decl check guarantees no bare reserved-name fn exists). s2(4) resolves to the function (TypeName(val) is not a cast).
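The callee rewrite can be pictured as a small guard in front of call lowering. This is a hedged Python model, not the Zig code in src/ir/lower.zig: the Callee and FnDecl shapes, the lower_callee name, and the dictionary used as a scope are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FnDecl:
    name: str
    is_raw: bool  # True for a backtick-declared or #import c foreign fn


@dataclass(frozen=True)
class Callee:
    kind: str     # "type_expr" or "identifier"
    name: str


def lower_callee(callee: Callee, fns_in_scope: Dict[str, FnDecl]) -> Callee:
    """Rewrite a type-classified callee to an identifier only when a
    function of raw provenance with that name is in scope."""
    if callee.kind != "type_expr":
        return callee
    fn = fns_in_scope.get(callee.name)
    if fn is not None and fn.is_raw:
        return Callee("identifier", callee.name)
    # No raw fn of that name: leave the callee type-classified.
    return callee
```

Keying the rewrite on the declaration's is_raw flag, rather than on the spelling alone, keeps it from firing for any callee that does not name a backtick or foreign function.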

Regression tests.

    • examples/0151-types-backtick-raw-identifier.sx (every VALUE position)
    • examples/0152-types-backtick-control-flow.sx (every control-flow / capture form)
    • examples/0153-types-backtick-const-fn-decl.sx (backtick :: const + fn decl, bare + backtick call)
    • examples/0154-types-backtick-raw-type-reference.sx (raw in TYPE position: struct / enum / union / alias decl + reference; bare s2 still the int)
    • examples/0155-types-backtick-typed-const-union-tag.sx (typed const + union tag)
    • examples/0156-types-backtick-struct-const.sx (struct-body const, untyped + typed)
    • examples/0157-types-backtick-parameterized-raw-type.sx (raw parameterized type + pointer/field wrappers)
    • examples/1054-errors-backtick-reserved-binding.sx (catch/onfail tag bindings)
    • examples/1220-ffi-c-import-reserved-name-params.{sx,h,c} (foreign param + fn-name exemption, bare-callable foreign fn)

Negatives.

    • examples/1119/1121/1123 (bare reserved binding across forms)
    • examples/1140-diagnostics-reserved-name-const-fn-decl.sx (bare const + fn decl)
    • examples/1141-diagnostics-reserved-name-type-decl.sx (bare struct / enum / union / error / typed-const decl)
    • examples/1142-diagnostics-reserved-name-struct-const.sx (bare struct-body const, caret on the name)

Unit tests. Backtick lexer + resolveNamed(skip_builtin) unit tests live in src/lexer.zig / src/ir/type_resolver.test.zig. The editor/LSP raw-type resolution (the second classifier) is pinned in src/sema.test.zig: the direct case plus raw provenance through every compound shape (*`s2 field access, ?`s2 unwrap, [N]`s2 index, parameterized `s2(s64)), each with a bare-spelling control that stays the builtin (fail-before verified).

The original report is preserved below.


Symptom

Importing non-sx source whose names collide with sx reserved type names is rejected. library/modules/stb_truetype.sx is a #import c { ... } block over a vendored C header (vendors/stb_truetype/stb_truetype.h); C identifiers s1, s2 (which collide with sx's signed-int type keywords s1..sN) produce:

error: 's1' is a reserved type name and cannot be used as an identifier
error: 's2' is a reserved type name and cannot be used as an identifier

The user cannot hand-edit these — they are generated from the vendored C header. Separately, sx-authored code has NO way to deliberately use a reserved-name-spelled identifier even when it wants to.

Root cause

The parser classifies any reserved-type-name spelling (s2, u8, f64, …) as a .type_expr via name_class.Type.fromName, never as an .identifier. The F0.1 / issue-0076 fix added UnknownTypeChecker.checkBindingName (src/ir/semantic_diagnostics.zig) to reject a value binding / param spelled as a reserved type name (the .type_expr-vs-.identifier mismatch otherwise breaks address-of / autoref lowering). F0.1 deliberately extended this check to imported declarations — which is what now fires on the C-imported s1/s2.

Desired behaviour (Agra ruling)

External / imported source does NOT need to conform to sx naming standards. Two mechanisms:

  1. Auto-exempt imports. #import c (and other foreign) declarations are treated as RAW identifiers: foreign names are never type-classified and never reserved-checked, so generated bindings "just work" with zero user edits.

  2. Backtick raw-identifier for sx code. A leading backtick makes the following identifier raw — an identifier that is NEVER type-classified, so it bypasses the reserved-name rule:

    `s2 := 2.5;   // OK — identifier "s2", distinct from the s2 signed-int type
    s2 := 2.5;    // ERROR — bare s2 is still the reserved type name
    

    Prefix form (single leading backtick on the identifier). The raw identifier's TEXT is s2 (the backtick is not part of the name). A bare s2 used as a TYPE remains the signed-int type.
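The two mechanisms reduce to one rule: a name of raw provenance is never reserved-checked. The Python sketch below models that rule under stated assumptions. RESERVED is a hypothetical subset of the reserved names, and check_binding_name / foreign_params are illustrative stand-ins, not the actual checker or importer. Making is_raw a required keyword argument mirrors the idea that no caller should be able to validate a name without also stating its provenance.

```python
from typing import List, Tuple

# Hypothetical subset of the reserved type names (illustration only).
RESERVED = {"s1", "s2", "u8", "f64"}


def check_binding_name(name: str, *, is_raw: bool) -> List[str]:
    """Return diagnostics for a binding name; raw names are exempt."""
    if is_raw:
        return []
    if name in RESERVED:
        return [
            f"error: '{name}' is a reserved type name "
            "and cannot be used as an identifier"
        ]
    return []


def foreign_params(c_names: List[str]) -> List[Tuple[str, bool]]:
    """Model of the import side: every generated foreign name is raw."""
    return [(name, True) for name in c_names]


def check_all(params: List[Tuple[str, bool]]) -> List[str]:
    diagnostics: List[str] = []
    for name, is_raw in params:
        diagnostics.extend(check_binding_name(name, is_raw=is_raw))
    return diagnostics
```

Under this model, names produced by foreign_params pass check_all with no diagnostics, while a bare sx-authored s2 binding still produces the reserved-type-name error.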

Reproduction

sx-side (minimal):

#import "modules/std.sx";
main :: () {
    `s2 := 2.5;            // must compile: identifier s2 = 2.5
    print("{}\n", `s2);    // 2.5
}

Import-side: a #import c block over a header declaring int s1, s2; (or stb_truetype.sx) must NOT emit the reserved-type-name error.