mem: implicit-context foundation + many compiler fixes

The session-long set of changes that lay the groundwork for the
Jai-literal implicit-Context-parameter refactor. Lots of accumulated
work; the new arrival is the implicit-ctx foundation (steps 1+2 of
the plan in current/CHECKPOINT-MEM.md):

  Step 1 — `CAllocator :: struct {}` stateless allocator in
    library/modules/allocators.sx, delegating directly to
    libc_malloc/libc_free. `ConstantValue` in src/ir/inst.zig gains a
    `func_ref: FuncId` leaf so nested aggregates can carry function
    pointers (the inline Allocator value's fn-ptr fields). Switch
    sites updated in emit_llvm.zig, print.zig, interp.zig.

  Step 2 — `emitDefaultContextGlobal` in src/ir/lower.zig synthesises
    a static `__sx_default_context` global with a nested-aggregate
    init_val pointing at the CAllocator → Allocator thunks. The
    second-pass `initVtableGlobals` in emit_llvm.zig is generalised
    to handle `.aggregate` init_vals (re-emits after func_map is
    populated so func_ref leaves resolve to real symbols).

Also folded in from earlier work this session:

  - Phase 1.1: `xx value` heap-copy in `buildProtocolValue` routes
    through `context.allocator` via the new `allocViaContext` helper.
  - interp.zig: `marshalForeignArg` double-offset bug fixed —
    `heapSlice` already adds `hp.offset` to the slice ptr, so the
    extra `+ hp.offset` was scribbling memcpy/memset into adjacent
    heap state, corrupting `heap.items[0]`. Symptom: `build_format`
    at comptime produced zero bytes, all `print` calls failed.
  - Lazy lowering: `lazyLowerFunction` now declares foreign-body
    functions as extern stubs in the local (comptime) module so
    cross-module foreign calls resolve.
  - Allocator API: all stdlib allocators on one-line `init() -> *T`
    (CAllocator/GPA: libc-backed; Arena/TrackingAllocator: parent-
    backed; BufAlloc: embeds state at head of user buffer).
  - issues 0038 (transitive #import), 0039 (chess + stdlib migration
    fallout), 0040 (generic struct method dot-dispatch), 0041
    (pointer types as type-arg), 0042 (alias name resolution) — all
    fixed; regression tests in examples/.
  - Diagnostic: `emitError` now embeds the lowering's
    `current_source_file` and enclosing function in the literal
    message; SX_TRACE_UNRESOLVED=1 dumps a Zig stack trace at the
    emit site so misattributed spans can't hide where the failure
    is.
  - tools/verify-step.sh (all-platforms gate) and tools/scratch.sh
    (interp/codegen parity tester) added.

Test suite: 152 example tests pass; chess builds + screenshots on
macOS / iOS sim / Android.
agra
2026-05-24 22:59:20 +03:00
parent 0ba41b2980
commit 29784c22a8
63 changed files with 3448 additions and 1207 deletions


@@ -0,0 +1,141 @@
# issue-0041 — Pointer types don't parse as expressions / type-argument positions
## Symptom
A pointer type like `*u8` or `*void` does not parse in positions
where a type expression is expected as a *value*, e.g.:
- As an argument to a `$T: Type` builtin: `size_of(*u8)`,
`align_of(*u8)`.
- On the RHS of a type alias: `Ptr :: *u8;`.
In each case the parser emits `error: unexpected token in expression`
at the column of the `*`.
Pointer types DO parse correctly in dedicated type-annotation
positions: function parameters (`(p: *u8)`), struct fields
(`field: *u8;`), variable annotations (`p: *u8 = ...;`). So the bug
is a parsing inconsistency between "type-annotation context" and
"expression context where a type is expected".
This is pre-existing — it affects `size_of` (already shipping) and
was just made more visible by adding `align_of` in Phase 0.6 of the
MEM plan. Not a regression introduced by 0.6, but a real limitation
worth pinning down because:
- Phase 1+ of the MEM plan will need `size_of(*T)` / `align_of(*T)`
in user-facing allocator helpers if we want to stay terse — e.g.
serializing a pointer-typed field in `field_value_int` patterns.
- It's a discoverability cliff. New users WILL write `size_of(*u8)`,
see "unexpected token", and have to learn the workaround.
## Reproduction
```sx
#import "modules/std.sx";
main :: () -> s32 {
n := size_of(*u8); // error: unexpected token in expression
print("{}\n", n);
0;
}
```
Also fails on the alias form:
```sx
#import "modules/std.sx";
Ptr :: *u8; // error: unexpected token in expression
main :: () -> s32 { 0; }
```
Both `sx run` and `sx build` reject identically.
## Confirmed working workarounds
A pointer type's size and alignment can only be reached indirectly,
through a type-annotation position where `*u8` does parse:
```sx
// Workaround A: wrapper struct holding the pointer field, then
// pull alignment from the wrapping struct (clumsy).
Wrap :: struct { p: *u8; }
n := align_of(Wrap); // 8 on a 64-bit target: the pointer's alignment.
// Attempt B: explicit *void
n := size_of(*void); // ALSO fails: same parse error.
```
Attempt B is NOT functional: `*void` hits the same parse error as
`*u8`. The alias form (`Ptr :: *u8;`) fails as well, as shown under
Reproduction, so the wrap-in-struct trick is the only workaround
confirmed here for code that needs pointer size/alignment.
There is no clean way today to write `size_of(*u8)`. The whole
class of "ptr type as type-expression value" is unsupported.
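
Workaround A relies on a struct with a single pointer field taking on that field's size and alignment. The rule can be checked outside sx; the sketch below uses Python's `ctypes` purely as an analogy for C-style layout. The `Wrap` class mirrors the sx struct above and is otherwise hypothetical, and nothing here exercises the sx compiler.

```python
import ctypes


class Wrap(ctypes.Structure):
    # Single pointer-typed field, mirroring `Wrap :: struct { p: *u8; }`.
    _fields_ = [("p", ctypes.POINTER(ctypes.c_uint8))]


# Size and alignment of a bare pointer on the host platform.
ptr_size = ctypes.sizeof(ctypes.c_void_p)
ptr_align = ctypes.alignment(ctypes.c_void_p)

# Size and alignment of the wrapper struct.
wrap_size = ctypes.sizeof(Wrap)
wrap_align = ctypes.alignment(Wrap)

print("size matches:", wrap_size == ptr_size)
print("alignment matches:", wrap_align == ptr_align)
```

The actual numbers depend on the host (8 is what a 64-bit target gives), which is why the sketch compares the wrapper against the bare pointer instead of hard-coding a value.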
## Investigation prompt
> Pointer types parse via a dedicated `parseTypeExpr` (or similar)
> path that the parser invokes in type-annotation positions (param
> lists, field declarations, variable annotations). The expression
> grammar used in argument positions (e.g. inside `size_of(...)`)
> dispatches through `parseExpr` instead, which treats `*` as
> "either prefix unary deref or infix multiplication" — neither
> matches the desired "type literal" interpretation.
>
> The fix likely belongs in the call-argument parser path: when
> the callee is a builtin that takes `$T: Type`, OR more broadly
> whenever the parser sees a `*` at the start of an expression
> followed by an identifier that resolves to a type, it should
> dispatch to `parseTypeExpr` instead of `parsePrefixUnary`.
>
> Implementation sketch:
> - Check `src/parser.zig` for the expression entry point that
> handles `*` prefix. Today it likely returns a `unary_op
> { op = deref, operand = … }` AST node.
> - Look at how lower.zig's `resolveTypeArg` consumes the AST node
> for `size_of(s32)` — what AST shape does it expect for a type
> literal? Probably an `identifier` whose name resolves to a type.
> - The fix should extend `resolveTypeArg` to also accept a
> `unary_op { op = deref, ... }` and treat it as "pointer to
> resolved type" — equivalent to `Ptr$T` in spec terms.
> - For the type-alias case (`Ptr :: *u8;`), the RHS of a `::`
> const decl is parsed as an expression. The parser needs to
> recognize that the LHS-determined shape (type-level alias)
> should bias the RHS parser toward `parseTypeExpr`. Or: extend
> the constant-fold path to interpret `unary_op { deref, T }` as
> a type literal when used as a type.
>
> Verification:
> 1. Add `examples/issue-0041.sx` with the repro above and
> `tests/expected/issue-0041.txt` capturing the expected output
> (`size_of(*u8) → 8`).
> 2. Confirm `bash tests/run_examples.sh` still passes everything
> else (151 tests currently).
> 3. Run `tools/verify-step.sh` to confirm chess on three platforms.
> 4. Also bake into `examples/50-smoke.sx` near the existing
> `align_of` lines — add `align_of(*u8)`, `size_of(*u8)`,
> `align_of(*void)` and regen.
>
> Hazard: any change to expression parsing affects a huge surface.
> Watch for these contexts to make sure they still work post-fix:
> - `a * b` (multiplication)
> - `*p` (prefix deref read)
> - `*p = …` (prefix deref write)
> - `func(a, *b)` (deref as argument)
> A surgical "is the next token a built-in type identifier" lookahead
> at the `*` site is probably less invasive than a wholesale
> type-expression-in-expression-position rewrite.
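
The surgical lookahead described above can be modelled in a few dozen lines. This is a Python sketch of the idea, not the code in `src/parser.zig`: the token set, the tuple-shaped AST nodes, and the `BUILTIN_TYPES` list are all assumptions made for illustration, and error handling is omitted.

```python
import re

# Assumed set of built-in type names; the real list lives in the compiler.
BUILTIN_TYPES = {"u8", "s32", "s64", "f32", "f64", "void"}


def tokenize(src):
    # Identifiers plus the few punctuation tokens this sketch needs.
    return re.findall(r"[A-Za-z_]\w*|[*(),]", src)


class Parser:
    def __init__(self, src):
        self.toks = tokenize(src)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expr(self):
        # Infix `*` stays multiplication: it is only seen after an operand.
        left = self.parse_prefix()
        while self.peek() == "*":
            self.next()
            left = ("mul", left, self.parse_prefix())
        return left

    def parse_prefix(self):
        if self.peek() == "*":
            self.next()
            # The lookahead: `*` followed by a built-in type name is a
            # pointer-type literal; anything else is still a deref.
            if self.peek() in BUILTIN_TYPES:
                return ("ptr_type", self.next())
            return ("deref", self.parse_prefix())
        return self.parse_primary()

    def parse_primary(self):
        tok = self.next()
        if tok == "(":
            inner = self.parse_expr()
            self.next()  # consume ")"
            return inner
        if self.peek() == "(":
            self.next()  # consume "("
            args = []
            while self.peek() not in (")", None):
                args.append(self.parse_expr())
                if self.peek() == ",":
                    self.next()
            self.next()  # consume ")"
            return ("call", tok, args)
        return ("ident", tok)


def parse(src):
    return Parser(src).parse_expr()


print(parse("size_of(*u8)"))
print(parse("a * b"))
print(parse("*p"))
print(parse("func(a, *b)"))
```

In this model three of the four hazard contexts keep their old shape (`mul`, `deref`, and deref-as-argument), and only `*` directly followed by a built-in type name changes meaning. The sketch has no assignment, so `*p = …` is not covered, and a one-token lookahead handles neither `**u8` nor pointers to user-defined types.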
## Plan-level impact
None for Phase 0.6 — `align_of` shipped and works for every shape
that `size_of` works for (primitives, structs, type aliases through
non-pointer types). The 50-smoke test addition uses only
non-pointer types, so it's stable.
Phase 1+ should bake an `align_of(*u8)` test once the parser fix
lands, since the allocator API will want to round-trip pointer
alignments at some call sites.


@@ -0,0 +1,135 @@
# issue-0042 — Const-decl type aliases (`MyInt :: s32;`) silently return `.s64` from `size_of` / `align_of`
## Symptom
A type alias declared via `Foo :: SomeType;` is registered in the
lowering's `type_alias_map` but is **never consulted** when the alias
name is later used as a type argument to `size_of` / `align_of`. The
fallback returns `.s64` (8 bytes) — which coincidentally produces a
correct result for any alias whose underlying type is 8 bytes
(`*T`, `f64`, function pointers, `s64`, `u64`), silently masking the
bug for years.
Observed:
```
size_of(s32) = 4 ← direct, correct
size_of(MyInt) = 8 ← via alias, WRONG (expected 4)
```
Where `MyInt :: s32;`.
## Reproduction
```sx
#import "modules/std.sx";
MyInt :: s32;
main :: () -> s32 {
print("direct: {}\n", size_of(s32)); // 4
print("alias: {}\n", size_of(MyInt)); // 8 — should be 4
0;
}
```
`./zig-out/bin/sx run` against unmodified master prints:
```
direct: 4
alias: 8
```
## Why this surfaces now
issue-0041 work extends the const-decl alias path to register
pointer, optional, array, slice, many-pointer, and function-type
aliases (`Ptr :: *u8;`, `Maybe :: ?u8;`, `Arr :: [3]u8;`,
`Cb :: (s32) -> s32;`). Every one of those aliases ends up in
`type_alias_map`, then `size_of(<alias>)` falls through the same
`.identifier` branch that ignores the map — returning `.s64` (8).
For pointer and function-type aliases this is coincidentally right
(8 bytes). For optional, array, etc. it produces silently-wrong
sizes (`size_of(Maybe) = 8` instead of 2;
`size_of(Arr) = 8` instead of 3).
The issue-0041 work cannot land without this being fixed — the
test snapshots would pin in the wrong values and the new feature
would ship subtly broken.
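
The masking effect is plain arithmetic: any alias whose real size equals the 8-byte fallback looks correct. A small Python tally, using only the sizes quoted in this issue (the dictionary layout and variable names are illustrative, not compiler code):

```python
FALLBACK_SIZE = 8  # size of `.s64`, what the `.identifier` branch falls back to

# Underlying sizes in bytes, as quoted in this issue.
alias_sizes = {
    "MyInt :: s32": 4,
    "Ptr :: *u8": 8,
    "Maybe :: ?u8": 2,
    "Arr :: [3]u8": 3,
    "Cb :: (s32) -> s32": 8,
}

# Aliases where the fallback happens to equal the real size.
masked = sorted(d for d, size in alias_sizes.items() if size == FALLBACK_SIZE)
# Aliases where the fallback reports a wrong size.
wrong = sorted(d for d, size in alias_sizes.items() if size != FALLBACK_SIZE)

print("coincidentally right:", masked)
print("silently wrong:", wrong)
```

Only the pointer and function-type aliases land in the first list, so a test suite built around those two shapes would never have caught the bug.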
## Investigation prompt
> The bug lives in `src/ir/lower.zig`, in `resolveTypeArg`
> ([line ~7132](src/ir/lower.zig#L7132)). The `.identifier`
> branch looks like:
>
> ```zig
> .identifier => |id| {
> if (self.type_bindings) |tb| {
> if (tb.get(id.name)) |ty| return ty;
> }
> const name_id = self.module.types.internString(id.name);
> return self.module.types.findByName(name_id) orelse .s64;
> },
> ```
>
> It checks `type_bindings` (generic-monomorphization) and
> `findByName` (registered named types), but never consults
> `self.type_alias_map` — which is where the const-decl alias
> registration in `lower.zig:425` puts entries. The neighbouring
> `.type_expr` branch (line ~7143) DOES check `type_alias_map`:
>
> ```zig
> .type_expr => |te| {
> if (self.type_alias_map.get(te.name)) |alias_ty| return alias_ty;
> return type_bridge.resolveAstType(node, &self.module.types);
> },
> ```
>
> Why two branches: an `.identifier` AST node is what `parsePrimary`
> emits for non-keyword names; `.type_expr` is what it emits for
> built-in primitive names recognised by `Type.fromName` (`s32`,
> `u8`, etc.) and for the `f32`/`f64`/`Type` keywords. User-defined
> alias names like `MyInt` and `Ptr` flow through `.identifier`.
>
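> **Likely fix:** mirror the `type_alias_map.get` lookup in the
> `.identifier` branch so the alias map is consulted before the
> `.s64` fallback. The version below places it after `type_bindings`
> and before `findByName`; the order relative to `findByName` should
> follow whatever precedence is established elsewhere.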
>
> ```zig
> .identifier => |id| {
> if (self.type_bindings) |tb| {
> if (tb.get(id.name)) |ty| return ty;
> }
> if (self.type_alias_map.get(id.name)) |alias_ty| return alias_ty;
> const name_id = self.module.types.internString(id.name);
> return self.module.types.findByName(name_id) orelse .s64;
> },
> ```
>
> **Verification:**
> 1. Add the repro above as `examples/issue-0042.sx`.
> 2. `bash tests/run_examples.sh --update` to capture expected
> output (`alias: 4`, not `alias: 8`).
> 3. Make sure existing snapshots that test type aliases (search
> `examples/` for `::` patterns followed by `size_of`) don't
> change in unexpected ways.
>
> **Possible adjacency:** `align_of` presumably goes through the
> same call path and should be checked alongside `size_of`. Type-alias
> chains (`A :: s32; B :: A;`) raise a second question: does B
> resolve through A's alias entry? Both are worth pinning down with
> a test once the primary fix lands.
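
The before/after behaviour of the lookup, including the alias-chain question, can be modelled outside the compiler. This is a simplified Python sketch under stated assumptions: `Lowering`, `declare_alias`, and `BUILTIN_SIZES` are hypothetical names, `type_bindings` is left out, built-in names share one path with identifiers (the real code routes them through `.type_expr`), and aliases are assumed to be resolved at registration time.

```python
# Sizes for a few built-in types; a stand-in for the compiler's type table.
BUILTIN_SIZES = {"s32": 4, "s64": 8, "u8": 1, "f64": 8}
FALLBACK = "s64"


class Lowering:
    def __init__(self, consult_alias_map):
        self.consult_alias_map = consult_alias_map
        self.type_alias_map = {}  # alias name -> resolved built-in type name

    def declare_alias(self, name, target):
        # Assumption: the RHS is resolved when the alias is registered,
        # so the map always stores a fully resolved type.
        self.type_alias_map[name] = self.resolve_type_arg(target)

    def resolve_type_arg(self, name):
        if self.consult_alias_map and name in self.type_alias_map:
            return self.type_alias_map[name]
        if name in BUILTIN_SIZES:  # stand-in for findByName
            return name
        return FALLBACK  # models `orelse .s64`

    def size_of(self, name):
        return BUILTIN_SIZES[self.resolve_type_arg(name)]


# Before the fix: the alias map is populated but never consulted.
before = Lowering(consult_alias_map=False)
before.declare_alias("MyInt", "s32")
size_before = before.size_of("MyInt")

# After the fix: the alias map is consulted, and chains resolve too.
after = Lowering(consult_alias_map=True)
after.declare_alias("MyInt", "s32")
after.declare_alias("A", "s32")
after.declare_alias("B", "A")
size_after = after.size_of("MyInt")
size_chain = after.size_of("B")

print(size_before, size_after, size_chain)
```

In this model the chain works only because `declare_alias` resolves its target through the same function. Whether the registration at `lower.zig:425` behaves that way is exactly what the suggested follow-up test should establish.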
## Plan-level impact
Blocks issue-0041 (compound-type-as-expression). Once 0042 is
fixed, 0041 work can resume from the testing phase (the parser and
lowering edits for 0041 are already in place; only the alias
lookup is broken).
## Suggested fix order
1. Land 0042's `.identifier` alias-map lookup.
2. Resume 0041 from the test step — re-run `examples/issue-0041.sx`
and verify `size_of(Maybe) = 2`, `size_of(Arr) = 3`, etc.
3. Regenerate snapshots and proceed with the 0041 finishing
steps (50-smoke, rename, etc.).