sx/current/CHECKPOINT-MEM.md
agra e843b7769d checkpoint: refresh MEM after silent-arm sweep + raw-ptr work
CHECKPOINT-MEM.md "Next step" still pointed at Phase 1.2 from the
old MEM plan — but four commits have landed since: matchContextAllocCall
drop, typed raw-pointer stores, call-conv mismatch detection, and the
silent-arms sweep. The "Current state" section also still listed
matchContextAllocCall as preserved and tied test counts to 152.

Updates:
- "Last completed step" now points at the silent-arm sweep + typed
  Store work.
- "Current state" rewritten: matchContextAllocCall is GONE, interp
  raw-pointer paths enumerated, val_ty threading mentioned,
  call-conv check called out.
- "Phase 0.3 audit findings" rewritten as historical context — chess
  no longer touches any pattern-match bypass; protocol dispatch runs.
- "Next step" recommends Phase 1.3 (closure env through context),
  notes Phase 1.2 was considered and skipped.
- Three new log entries for the four post-Step-9 commits.
2026-05-25 12:06:59 +03:00


Memory Module — Progress + Issues Log

Tracking checkpoint for the mem.sx Zig-aligned implementation (plan: ~/.claude/plans/tidy-doodling-cray.md).

Last completed step

  • Interp silent-arm sweep + typed raw-pointer stores. Every else => arm in the interp now bails with a bailDetail("...") reason that surfaces through the host diagnostic as op=X/X: <reason>. inst.Store carries val_ty: TypeId so comptime raw-pointer stores honour the declared destination width (no more 8-byte-everywhere assumption). New CLAUDE.md REJECTED PATTERN forbids silent unimplemented arms going forward. 154/154 example tests + chess on macOS / iOS sim / Android green.
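The rule behind the sweep can be sketched in a few lines of Python. This is a toy model, not the interp code: `InterpBail`, `bail_detail`, `eval_deref` and the value kinds are hypothetical names, and the `op=<name>: <reason>` message only loosely follows the diagnostic shape described above.

```python
# Toy model of the "no silent else arm" rule. All names here are
# hypothetical; only the idea (every fallthrough names its reason)
# comes from the checkpoint text.

class InterpBail(Exception):
    """Raised when the interpreter reaches an unimplemented arm."""
    def __init__(self, op, reason):
        super().__init__(f"op={op}: {reason}")
        self.op = op
        self.reason = reason


def bail_detail(op, reason):
    """Abort evaluation with a named reason instead of a default value."""
    raise InterpBail(op, reason)


def eval_deref(value):
    """Dereference a tagged value; every unsupported kind is reported."""
    kind, payload = value
    if kind == "slot_ptr":
        return payload["cell"]
    if kind == "aggregate":
        return payload  # load of an aggregate is passthrough
    # A silent arm would have been `return value` here.
    bail_detail("deref", f"unsupported value kind '{kind}'")
```

With this shape, an unexpected value kind fails loudly at the arm that could not handle it, rather than surfacing later as a wrong result.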

Current state

Phase 0.0c shipped (allocator API on one-line init returning *T; TrackingAllocator added). 148/148 tests pass.

Phase 0.2 verified across all known patterns. Phase 0.3 documented.

Phase 0 of the MEM plan is COMPLETE.

Fixed this session via paired sessions: issue-0038 (transitive #import), issue-0039 (chess + stdlib migration to explicit imports), issue-0040 (generic struct method dot-dispatch). All landed and re-verified. Full gate green: 151/151 example tests + chess on macOS / iOS sim / Android with screenshots.

Phase 0 spike outcomes:

  • 0.2 xx cast patterns — verified across 5 additional shapes beyond the issue-0037 regression.
  • 0.3 chess allocator audit — documented; no chess-side migration needed.
  • 0.6 align_of($T) builtin — landed. Mirrors size_of in inst.zig BuiltinId, lower.zig (registry + return-type + reflection handler), interp.zig (fallback), sema.zig (allowed-builtins list), lsp/server.zig (both completion tables), library/modules/std.sx. Smoke coverage added in examples/50-smoke.sx (u8/s32/s64/Point).
  • 0.7 #import transitivity — surfaced and fixed via issue-0038.
  • 0.8 #foreign("c") rename syntax — confirmed #foreign libc "name".
  • 0.9 generic UFCS dispatch — surfaced and fixed via issue-0040. Dot-call now works for struct methods with $T: Type parameters, so the plan's a.create(MyType) shape is viable.
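For readers who want to sanity-check what a `size_of` / `align_of` pair should report on the host ABI, Python's `ctypes` exposes the same two queries. This is a minimal sketch: the `Point` layout below is an assumption (two 32-bit ints), since the real struct in examples/50-smoke.sx is not shown here.

```python
import ctypes


class Point(ctypes.Structure):
    # Hypothetical stand-in for the Point used in the smoke test;
    # the real field layout is not given in this document.
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


def size_and_align(ctype):
    """Return (size_of, align_of) for a ctypes type on the host ABI."""
    return ctypes.sizeof(ctype), ctypes.alignment(ctype)


for name, ty in [("u8", ctypes.c_uint8), ("s32", ctypes.c_int32),
                 ("s64", ctypes.c_int64), ("Point", Point)]:
    size, align = size_and_align(ty)
    print(f"{name}: size_of={size} align_of={align}")
```

The alignment of a struct is the largest alignment among its fields, and its size is always a multiple of that alignment.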

issue-0041 (pointer type as type-arg) and issue-0042 (alias names in resolveTypeArg) FIXED. Regression tests at examples/issue-0041.sx and examples/issue-0042.sx. Scratch verification: size_of(*u8)=8, size_of(Ptr where Ptr::*u8)=8, size_of(?u8)=2, size_of(Maybe where Maybe::?u8)=2 — all clean on interp + codegen.

Also landed during 0041/0042: the silent .s64 fallback in resolveTypeArg is gone — unresolved type names now emit a real diagnostic. Surfaced and removed two bogus size_of(Complex) / size_of(Sx) calls in examples/10-generic-struct.sx that were relying on the silent default. Caller-side speculative paths in buildTypeBindings + inferGenericReturnType now gate the call with type_bridge.isTypeShapedAstNode before invoking resolveTypeArg. CLAUDE.md REJECTED PATTERNS gained a section forbidding silent orelse default returns in compiler code.

Parser regressions introduced by the 0041 work fixed in src/parser.zig:hasFnBodyAfterArrow:

  • (s: string) -> [:0]u8 { ... } — added .colon to the return-type token walk.
  • (x) -> Closure(...) -> R { ... } — added .arrow so nested return-type arrows continue the walk.
  • name :: (self: T) -> Ret; inside struct #compiler — recognise trailing ; as a method-decl terminator. Was silently dropping every BuildOptions.* method from fn_ast_map.

Full gate green: 151/151 example tests + chess on macOS / iOS sim / Android with screenshots.

Implicit Context refactor SHIPPED. Every default-conv sx function now carries __sx_ctx: *void at LLVM slot 0. context.X lookups resolve through the lowering's current_ctx_ref — a one-indirection load, no per-access walk, thread-safe by construction (each call stack carries its own Context chain). push Context.{...} allocates a fresh slot and rebinds current_ctx_ref for the body's lexical scope. The context LLVM global is gone; the only runtime Context is the static __sx_default_context (a CAllocator backing libc malloc/free), installed at FFI-inbound entries (main, Java_*, JNI_OnLoad) and used by the interp for #run evaluation.
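The calling discipline described above can be modelled in Python by passing the context explicitly as the first argument. This is only an illustration of the scoping behaviour: `Context`, `DEFAULT_CONTEXT` and the function names are hypothetical stand-ins, and the allocator is reduced to a string label.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    allocator: str          # stand-in for an Allocator protocol value
    data: object = None


# Plays the role of the static default context.
DEFAULT_CONTEXT = Context(allocator="c_allocator")


def leaf(ctx):
    # A `context.allocator` read becomes one load through ctx.
    return ctx.allocator


def middle(ctx):
    # Calls between ordinary functions forward the caller's ctx unchanged.
    return leaf(ctx)


def with_pushed_allocator(ctx, allocator):
    # Models a push block: a fresh Context is bound for the body only;
    # the caller's ctx is never mutated.
    pushed = replace(ctx, allocator=allocator)
    return middle(pushed)


def entry():
    # Models an FFI-inbound entry installing the default context.
    ctx = DEFAULT_CONTEXT
    before = middle(ctx)
    inside = with_pushed_allocator(ctx, "tracker")
    after = middle(ctx)
    return before, inside, after
```

Because no shared global is written, two call stacks running `entry()` concurrently cannot observe each other's pushed context, which is the thread-safety property the paragraph claims.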

What landed:

  • Function.has_implicit_ctx flag + Module.has_implicit_ctx per-compilation switch (gated by Context :: struct {...} being present in the dep graph).
  • Param prepend + call-site forwarding across every sx-to-sx path: direct calls, indirect through fn-pointer vars, protocol dispatch, closure trampolines, lambda trampolines, bare-fn trampolines, generic monomorphizations, comptime functions.
  • ConstantValue.func_ref: FuncId variant so the static initializer for __sx_default_context can reference the CAllocator thunks.
  • emit_llvm two-pass global emission: aggregates that name funcs by FuncId resolve them after func_map is populated.
  • Interp: defaultContextValue() builds the Context aggregate on demand; interp.call bootstraps slot_ptr(0) when an entry function with implicit ctx is invoked sans args; materializeCtxArg derefs the caller's slot_ptr at every sx-to-sx boundary so callees can treat ref 0 as the Context aggregate; .load of an aggregate is passthrough; .global_addr of __sx_default_context returns the aggregate directly.
  • matchContextAllocCall is GONE (commit d415bcc). Comptime now runs the full Allocator-protocol dispatch chain — the same IR codegen emits — by reusing the parent module instead of spinning up a separate ct_module. The interp gained raw-pointer paths (index_gep, index_get, store, marshalForeignArg, asString) so context.allocator.alloc bottoms out at host libc_malloc and the returned pointer survives downstream sx ops.
  • inst.Store carries val_ty: TypeId so the interp's raw-pointer store honours the destination width — no more "assume 8 bytes" silent clobber. Regression test at examples/132-comptime-typed-store-widths.sx exercises every primitive width (u8/u16/u32/u64, s8..s64, bool, f32, f64) via comptime checksums compared to runtime checksums.
  • Call-convention mismatch at bare-fn → fn-pointer coercion is now a compile error (commit f886d5f). The chess-debug sweep that surfaced the bug also moved #foreign decls to default callconv(.c) and fixed every consumer-side sx callback exposed to a C API. Regression test: examples/131-callconv-mismatch-diagnostic.sx.
  • Interp silent-arm sweep (commit e9df33a). Every else => arm has a named bail reason via bailDetail / typeErrorDetail. .deref and .unbox_any used to silently pass through arbitrary Value kinds — now enumerated. #run const errors no longer swallow into void_val; emit_llvm surfaces them via std.debug.
  • C-side callback into sx requires callconv(.c) on the sx fn (and on any fn-pointer TYPE the user casts a C fn-pointer through). Tests adjusted: examples/61-objc-roundtrip.sx, examples/62-objc-class.sx, examples/95..97-objc-block*.sx, examples/ffi-06-callback.sx.

154/154 example tests pass (two new regression tests added: 131 and 132). Chess green on macOS / iOS sim / Android.
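The typed-store item in the list above is easiest to see on a small byte buffer. The following Python sketch uses the standard `struct` module to contrast an always-8-bytes store with a width-aware one; the `FORMATS` table and function names are illustrative and not taken from the interp.

```python
import struct

# struct format codes for the primitive widths named above
# (little-endian, standard sizes). The table itself is illustrative.
FORMATS = {
    "u8": "<B", "u16": "<H", "u32": "<I", "u64": "<Q",
    "s8": "<b", "s16": "<h", "s32": "<i", "s64": "<q",
    "bool": "<?", "f32": "<f", "f64": "<d",
}


def store_untyped(mem, offset, value):
    """Old behaviour: always write 8 bytes, whatever the destination."""
    struct.pack_into("<q", mem, offset, value)


def store_typed(mem, offset, value, val_ty):
    """New behaviour: write exactly the width that val_ty declares."""
    struct.pack_into(FORMATS[val_ty], mem, offset, value)


def demo():
    # Two adjacent u8 fields; the second holds a sentinel.
    clobbered = bytearray(16)
    clobbered[1] = 0xAA
    store_untyped(clobbered, 0, 7)      # overwrites bytes 0..7

    intact = bytearray(16)
    intact[1] = 0xAA
    store_typed(intact, 0, 7, "u8")     # overwrites byte 0 only
    return clobbered[1], intact[1]
```

In the untyped case the neighbouring field is zeroed as a side effect, which is the silent clobber the `val_ty` threading removes.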

ISSUE-MEM-002 (the context.allocator.alloc(size) pattern-match bypass) is FULLY CLOSED. User-typed context.allocator.X flows through the real protocol vtable at codegen and runs the same chain at comptime in the interp. No remaining shortcut.

Next step

Phase 1.3 (closure env allocation through context) and Phase 1.4 (codegen serializer for all interp Value variants) are unblocked. Phase 1.2 (free / malloc through context) was considered and skipped: context.allocator.alloc/dealloc already work directly; wrapper-only malloc/free would be lossy renames.

Suggested next move: Phase 1.3. Closure trampolines in lower.zig:lowerLambda call .heap_alloc directly for the env pointer; routing through context.allocator.alloc means closures respect push Context.{ allocator = ... } and get leak-tracked by TrackingAllocator. Contained change. Regression test pattern: mirror examples/130-xx-value-routes-through-context-allocator.sx with a closure that captures a variable, install a tracker via push, verify the tracker's counter incremented.
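The intended end state of Phase 1.3 can be sketched as a Python model in which the closure's environment block is requested from the context's allocator. Everything here is hypothetical scaffolding (`CAllocator`, `make_adder`, the 8-byte env layout); it only mirrors the regression pattern proposed above: install a tracker, build a capturing closure, check the counter.

```python
class CAllocator:
    """Stand-in for the libc-backed default allocator."""
    def alloc(self, size):
        return bytearray(size)


class TrackingAllocator:
    """Counts allocations, then forwards to a parent allocator."""
    def __init__(self, parent):
        self.parent = parent
        self.count = 0

    def alloc(self, size):
        self.count += 1
        return self.parent.alloc(size)


class Context:
    def __init__(self, allocator):
        self.allocator = allocator


def make_adder(ctx, captured):
    # The env block for the captured variable is requested from the
    # context allocator rather than from a direct heap allocation.
    env = ctx.allocator.alloc(8)
    env[0:8] = captured.to_bytes(8, "little", signed=True)

    def call(x):
        return x + int.from_bytes(env[0:8], "little", signed=True)
    return call


def run():
    tracker = TrackingAllocator(CAllocator())
    pushed = Context(tracker)           # models pushing a tracker context
    add5 = make_adder(pushed, 5)
    return add5(10), tracker.count
```

If the env were allocated by a direct heap call instead, `tracker.count` would stay at 0, which is exactly what the proposed regression test is meant to catch.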

Phase 0.3 audit findings — chess allocator usage (closed)

After Step 5 / matchContextAllocCall removal, every consumer call to context.allocator.X flows through the real protocol vtable. This section is left for history — the audit drove which sites needed migration, but no chess code actually needed any allocator-API change. The sites that used to bypass the protocol via the .heap_alloc pattern-match now dispatch through the inline Allocator value naturally.

  • ~/projects/game/main.sx — 7 sites of context.allocator.alloc(size_of(T)) for platform/GPU/pipeline state. Now real protocol dispatch.
  • ~/projects/game/chess/game.sx: ChessGameState.init captures context.allocator into a parent_allocator: Allocator field and restores it via push Context.{ allocator = ... }. Worked before, still works.
  • ~/projects/game/chess/pieces.sx — declares its own free bound to libc and calls it on a libc-malloc'd buffer (from a foreign reader). Intentional C-interop bypass — no change needed.
  • ~/projects/game/quick.sx — quicksort demo. Same flow as main.sx.

Log

  • 2026-05-25 (late) — Interp silent-arm sweep (e9df33a). Every else => arm has a bailDetail reason; .deref / .unbox_any previously silently passed through arbitrary Value kinds, now enumerated. #run const errors surface a real diagnostic via emit_llvm instead of becoming void_val. CLAUDE.md REJECTED PATTERNS gained the "silent unimplemented arms" section (4de565b). 154/154 + chess green.
  • 2026-05-25 (mid) — Typed raw-pointer stores (f2b3868). inst.Store carries val_ty: TypeId, threaded by builder.store and consumed by storeAtRawPtr to write exactly the declared destination width. Regression at examples/132-comptime-typed-store-widths.sx exercises every primitive width via comptime/runtime checksum comparison. index_get raw-pointer arm added (was bailing). Comptime init errors no longer swallow into zero.
  • 2026-05-25 (mid) — Drop matchContextAllocCall (d415bcc). Comptime now runs the full Allocator-protocol dispatch chain by reusing the parent module instead of spinning up a fresh ct_module. Interp gained .int / .byte_ptr arms in index_gep, store, marshalForeignArg, asString. Closes ISSUE-MEM-002 fully. JNI stub binding extended to call current_ctx_ref → &__sx_default_context (used to be gated on isExportedEntryName).
  • 2026-05-25 (mid) — Reject call-conv mismatch at bare-fn → fn-pointer coercion (f886d5f). #foreign decls now default to callconv(.c). Library audit (619aff8) — all C-side callbacks already followed the rule; documented the one remaining gap (xx <sx_fn> : *void cast to opaque, ambiguous from cast alone). Regression at examples/131-callconv-mismatch-diagnostic.sx.
  • 2026-05-25 (early) — Implicit-Context refactor SHIPPED end-to-end. All 9 plan steps (lets-see-options-for-merry-dijkstra.md) landed. context is no longer an LLVM global; every sx function carries __sx_ctx at slot 0; context.X reads load through current_ctx_ref; push Context.{...} is alloca + rebind; FFI-inbound entries install &__sx_default_context; interp bootstraps the default Context on top-level call. 152/152 + unit tests green. Commits: 29784c2 (Steps 1-2), 92c6b47 (Step 3), 4bf5908 (Steps 5-7), b69a2ea (Step 8).
  • 2026-05-24 — Phase 1.1 shipped: buildProtocolValue heap-copy now routes through context.allocator.alloc via the new allocViaContext helper. Regression at examples/130-xx-value-routes-through-context-allocator.sx proves a Tracer installed via push Context sees the alloc (Tracer.count = 1) — interp + codegen parity. 152/152 + chess green.
  • 2026-05-24 — issue-0041 and issue-0042 both fixed end-to-end. Also removed the silent .s64 fallback in resolveTypeArg, guarded the two upstream callers (buildTypeBindings, inferGenericReturnType) with type_bridge.isTypeShapedAstNode, and fixed three parser regressions introduced by the 0041 work ([:0]u8 return type, nested return-arrows, struct #compiler trailing-; method decls). Full verify-step.sh gate green. CLAUDE.md REJECTED PATTERNS gained the silent-fallback-defaults prohibition. Stream now READY for Phase 1.
  • 2026-05-24 — Phase 0.6 shipped (align_of($T) builtin). Touchpoints: inst.zig BuiltinId, lower.zig registry + return-type table + reflection handler, interp.zig fallback, sema.zig builtin allowlist, lsp/server.zig both completion tables, library/modules/std.sx. Smoke coverage added in examples/50-smoke.sx. 151/151 + chess green on all platforms. Then size_of(*u8) parse error was investigated — filed as issue-0041 (pre-existing, affects both size_of and align_of). Stream paused on 0041. Also tightened CLAUDE.md IMPASSIBLE RULES to close the "pre-existing / non-blocking" loophole that almost let this session roll past the issue filing.
  • 2026-05-24 — issue-0040 filed. Phase 0.9 verified that obj.method(Type) with a $T: Type parameter fails to dispatch via dot, while explicit static (T.method(obj, Type)) and pipe (obj |> T.method(Type)) both work. Root cause pinpoint: src/ir/lower.zig:5066-5123 has branches for generic-template struct methods (5082) and non-generic qualified (5106), but no branch for a non-template struct with a generic method. Stream paused on 0040.
  • 2026-05-24 — Phase 0.8 #foreign("c") syntax verified. Form is name :: <sig> #foreign libc ["c_name"] with the optional string literal supplying a rename. Confirmed via libc_strlen :: (s: *u8) -> usize #foreign libc "strlen"; scratch test (interp + codegen parity).
  • 2026-05-24 — issue-0039 fix verified, full tools/verify-step.sh gate green again: 150/150 example tests pass and chess builds + screenshots OK on macOS / iOS sim / Android.
  • 2026-05-24 — issue-0038 fix verified; spike now errors as expected on transitive references. 149/149 example tests pass (+1 vs pre-fix). Chess build broken as predicted — three-bucket triage written up in issues/0039-chess-needs-explicit-imports-post-0038.md. Stream paused on 0039.
  • 2026-05-24 — Phase 0.7 spike at /tmp/sx-import-spike/ (a.sx → b.sx → c.sx) showed a.sx calls c_only_fn() and reads c_only_const directly. Filed as issue-0038. Stream is paused per the IMPASSIBLE RULE — no workaround, the libc hide-by-internal-module strategy in the plan depends on the language semantics matching the assumption.
  • 2026-05-24 — Phase 0.2 sanity sweep landed. Five additional xx cast patterns exercised through tools/scratch.sh; all show interp/codegen parity. Combined with the issue-0037 regression at examples/126-xx-recover-then-dispatch.sx, the cast story is now considered tight for the cases the MEM plan relies on.
  • 2026-05-24 — Phase 0.3 chess allocator audit recorded above. No chess-side migration needed; ISSUE-MEM-002 (context.allocator bypass) is the only thing the chess codebase is exposed to and Phase 1 already owns it.
  • 2026-05-24 — Phase 0.0a (tools/verify-step.sh) shipped. Confirmed working: 145/145 example tests + chess builds + screenshots on all 3 platforms. Initial 3-second screencap delay was too short for Android — increased to 6 seconds; iOS sim + macOS to 5 seconds.
  • 2026-05-24 — Phase 0.0b (tools/scratch.sh) shipped. Verified with a hello-world snippet: interp + codegen agree.
  • 2026-05-24 — Phase 0.0c initial implementation of TrackingAllocator in library/modules/allocators.sx. Build + 145 tests pass after snapshot regen. Chess green on all 3 platforms. Manual scratch.sh test confirms counters increment correctly when called directly on the tracker variable.
  • 2026-05-24 — ISSUE-MEM-007 fixed in src/ir/lower.zig. Root cause: emitProtocolDispatch keyed the auto-unbox path on mi.ret_type == void_ptr, but *void is overloaded — both Self-disguised-as-*void AND a literal -> *void return appear as the same TypeId. With target_type leaking from the enclosing function's return type (e.g. main -> s32), every *void return was loaded as s32, yielding 0 → null. Fix: stash ret_is_self on ProtocolMethodInfo during registerProtocolDecl (set when the AST return-type node is the Self type-expr), and gate the unbox on that flag. Regression at examples/99-protocol-void-pointer-return.sx. Sister symptom (SIGTRAP inside struct-static method storing an Allocator field) also fixed by the same change.
  • 2026-05-24 — issue-0037 fixed in src/ir/lower.zig. Root cause: lowerXX had a Concrete→Protocol erasure branch but no inverse Protocol→pointer recovery — the cast fell through to coerceToType, which couldn't match the (struct, pointer) shape and returned the operand unchanged. Result: a 24-byte protocol struct stored into an 8-byte ptr alloca, corrupting adjacent stack (the protocol value's own slot was the next victim, so the next dispatch loaded garbage and crashed). Fix: when src is a protocol value and dst is a pointer, emit struct_get of field 0 (ctx), then bitcast *void → dst. Regression at examples/126-xx-recover-then-dispatch.sx.

Known issues (discovered during execution)

ISSUE-MEM-001: Type inference defaults p := malloc(64) to s64

Severity: medium (workaround exists; bites unexpectedly).

Symptom: Writing p := malloc(64) (no explicit type) infers p: s64 instead of p: *void. Subsequent free(p) then fails LLVM verification with "Call parameter type does not match function signature!" because free expects ptr but receives i64.

Workaround: Explicit type annotation p : *void = malloc(64); or xx malloc(64); at the call site.

Reproduction:

main :: () -> s32 {
    p := malloc(64);   // p inferred as s64
    free(p);            // LLVM verify fails: ptr expected, i64 given
    0;
}

Root cause: Likely in the inference path for := declarations when the RHS is a *void-returning #builtin. The compiler defaults the binding to s64 instead of matching the return type. To investigate in a future session.

Status: Open. Not blocking mem.sx work but worth fixing as a quality-of-life issue. File as examples/issue-NNNN.sx when addressed.

ISSUE-MEM-002: context.allocator.alloc/dealloc bypasses protocol dispatch (FIXED)

Severity: was high (broke the entire Phase 1 premise); resolved.

Symptom: Any code that goes through context.allocator.alloc(size) or context.allocator.dealloc(ptr) is pattern-matched in src/ir/lower.zig:5137-5159 and lowered directly to .heap_alloc/.heap_free IR — which calls libc malloc/free. The protocol-value vtable is bypassed entirely.

This means a push Context.{ allocator = my_tracker } block followed by context.allocator.alloc(size) does NOT call the tracker's alloc method. The tracker sees zero allocations even though many occurred.

Workaround: Call the allocator directly via a variable rather than via context.allocator:

push Context.{ allocator = tracker, data = null } {
    p := tracker.alloc(64);     // works — dispatches through tracker
    tracker.dealloc(p);
}

But this defeats the purpose of context-allocator overriding for user code that doesn't know about the tracker.

Fix: Phase 1 of the mem.sx plan removes this pattern-match and replaces it with proper context dispatch through the Allocator protocol. After Phase 1, context.allocator.alloc(size) correctly dispatches to whatever allocator is currently set in context.

Status: Resolved 2026-05-25 (commit d415bcc dropped matchContextAllocCall; see "Current state"). No longer blocks auto-tracker-wrap (Phase 5 --leak-check).

ISSUE-MEM-003: 08-types test depends on undefined memory contents

Severity: low (flaky test exposes itself when allocator code changes).

Symptom: examples/08-types.sx declares a struct field c : u8 = ---; (uninitialized) and prints the struct. The expected snapshot captures a specific value (formerly c: 176, now c: 8) which depends on whatever's in undefined memory at that moment. Any allocator change shifts the value.

Workaround: Regenerate snapshot via bash tests/run_examples.sh --update when this test fails for unrelated reasons.

Fix: Test should be rewritten to NOT depend on undefined memory content — perhaps verify the field is one of N specific values, or just don't print uninitialized data. Out of scope for mem.sx.

Status: Open. Document and live with it for now.

ISSUE-MEM-004: Adding code to allocators.sx shifts JNI/Obj-C IR snapshots

Severity: medium (eats time per step).

Symptom: Every additive change to allocators.sx (new struct, new method) cascades into ~11 IR snapshot diffs across the tests/expected/ffi-{jni,objc}-*.ir files. Each step needs --update + git diff review. The diffs are usually benign (additive declarations) but one (ffi-objc-call-06-sret-return.ir) reorganises ~1500 lines because it uses reflection on the full type registry.

Workaround: Per-step --update + diff review.

Fix: Phase 0.1 spike — extend normalize_ir() in tests/run_examples.sh to strip allocator-related declarations and constant-pool renumberings from snapshots. Should make additive changes invisible at the snapshot level while preserving JNI/Obj-C ceremony signal.

Status: Mitigated; structural fix in Phase 0.1.
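One possible shape for the proposed normalization, written as a Python model rather than as the actual shell function: drop lines that declare allocator-related symbols, then renumber string constants by first appearance. The regexes assume LLVM-style text with `@.str.N` constant names and symbols containing `Allocator`; both are assumptions, since neither the snapshot format nor the existing normalize_ir() is shown in this log.

```python
import re

# Assumed patterns; the real snapshot format may differ.
ALLOCATOR_DECL = re.compile(r"^(?:declare|define|@).*Allocator")
CONST_NAME = re.compile(r"@\.str\.\d+")


def normalize_ir(text):
    """Drop allocator declarations and renumber string constants."""
    kept = [ln for ln in text.splitlines()
            if not ALLOCATOR_DECL.search(ln)]
    mapping = {}

    def renumber(match):
        # Number constants by first appearance so that inserting an
        # unrelated constant earlier in the pool does not shift them.
        name = match.group(0)
        if name not in mapping:
            mapping[name] = f"@.str.{len(mapping)}"
        return mapping[name]

    return "\n".join(CONST_NAME.sub(renumber, ln) for ln in kept)
```

The trade-off is that a genuine regression inside an allocator declaration would also be hidden, so a filter like this would need to stay narrow enough to keep the JNI/Obj-C signal the snapshots exist for.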

ISSUE-MEM-005: Two-line create(@storage) pattern for allocators

Severity: low (cosmetic but real DX friction).

Symptom: Current pattern requires a separate storage decl + a create call:

g_gpa : GPA = ---;
libc := GPA.create(@g_gpa);

Two lines per allocator. Plan committed to one-line via heap-copy:

libc := GPA.create();   // heap-copies via xx value

Fix in progress: Phase 0.0c — switch create methods on at least TrackingAllocator (and possibly GPA/Arena/BufAlloc) to use xx value heap-copy. Add instance(a) -> *T accessor for types where users need the underlying state (TrackingAllocator).

Status: In progress (active work).

ISSUE-MEM-007: Protocol dispatch on *void-returning methods returns null (FIXED)

Severity: was CRITICAL; resolved.

Root cause: emitProtocolDispatch in src/ir/lower.zig gated its auto-unbox path on mi.ret_type == void_ptr, but the same TypeId covers both Self-disguised-as-*void and a literal -> *void. With target_type leaking from the surrounding function (e.g. main -> s32), every protocol call returning *void got its result loaded as sizeof(target_type) bytes — for s32 that's the first 4 bytes of the malloc'd block, which were zero, comparing equal to null.

Fix: Tag ProtocolMethodInfo with ret_is_self: bool, set in registerProtocolDecl when the AST return-type is the Self type-expr. emitProtocolDispatch only takes the auto-unbox path when ret_is_self is true. Literal *void returns are now passed through unchanged.
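The difference between the old and new gate can be reproduced with a small Python model. The class mirrors the `ret_is_self` idea, but `finish_dispatch`, the flat `memory` buffer and the `use_flag` switch are hypothetical devices for putting both behaviours side by side.

```python
from dataclasses import dataclass


@dataclass
class ProtocolMethodInfo:
    name: str
    ret_type: str        # "void_ptr" covers both Self and literal *void
    ret_is_self: bool    # set when the declared return type is Self


def finish_dispatch(mi, raw_ptr, memory, target_size, use_flag):
    """Decide what to do with the pointer a protocol call returned.

    use_flag=False models the old gate (ret_type == void_ptr);
    use_flag=True models the fixed gate (ret_is_self).
    """
    should_unbox = mi.ret_is_self if use_flag else mi.ret_type == "void_ptr"
    if should_unbox:
        # Load target_size bytes through the pointer.
        return int.from_bytes(memory[raw_ptr:raw_ptr + target_size], "little")
    return raw_ptr       # literal *void: pass the pointer through


def demo():
    memory = bytearray(64)            # freshly allocated block, all zero
    alloc = ProtocolMethodInfo("alloc", "void_ptr", ret_is_self=False)
    ptr = 16                          # address the allocator returned
    old = finish_dispatch(alloc, ptr, memory, 4, use_flag=False)
    new = finish_dispatch(alloc, ptr, memory, 4, use_flag=True)
    return old, new
```

Under the old gate the caller receives the block's first four bytes (zero, indistinguishable from null); under the flag-based gate it receives the pointer itself.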

Regression: examples/99-protocol-void-pointer-return.sx. The sister symptom (SIGTRAP from protocol dispatch inside a struct static method that stores an Allocator field) was the same root cause and is also fixed.

Status: Resolved 2026-05-24.

ISSUE-MEM-006: Android screencap needs 6+ seconds; iOS sim + macOS 5+

Severity: low (fixed in tooling).

Symptom: Initial verify-step.sh used sleep 3 before screencap. Android needed longer for the side panel to render; without it, the screenshot showed only the chess board and a black strip where the side panel should be.

Fix: Updated verify-step.sh delays: macOS 5s, iOS sim 5s, Android 6s. Documented inline in the script.

Status: Resolved.