agra a8fb1233e3 mem: Step 9 — checkpoint reconciliation for implicit-Context refactor

Update CHECKPOINT-MEM.md to reflect that Steps 1-9 of
lets-see-options-for-merry-dijkstra.md are shipped. Notes that
ISSUE-MEM-002 is closed in the user-call path (matchContextAllocCall
remains as a documented comptime escape hatch), Phase 1.2/1.3/1.4
are unblocked, and points future sessions at the MEM plan
(tidy-doodling-cray.md) for the next phase.

2026-05-25 09:15:08 +03:00

Memory Module — Progress + Issues Log

Tracking checkpoint for the mem.sx Zig-aligned implementation (plan: ~/.claude/plans/tidy-doodling-cray.md).

Last completed step

Implicit-Context refactor (Steps 1-9 of ~/.claude/plans/lets-see-options-for-merry-dijkstra.md) — all shipped. See "Current state" for what landed; "Log" for commits.

Current state

Phase 0.0c shipped (allocator API on one-line init returning *T; TrackingAllocator added). 148/148 tests pass.

Phase 0.2 verified across all known patterns. Phase 0.3 documented.

Phase 0 of the MEM plan is COMPLETE.

Fixed this session by paired sessions: issue-0038 (transitive #import), issue-0039 (chess + stdlib migration to explicit imports), issue-0040 (generic struct method dot-dispatch). All landed and re-verified. Full gate green: 151/151 example tests + chess on macOS / iOS sim / Android with screenshots.

Phase 0 spike outcomes:

0.2 xx cast patterns — verified across 5 additional shapes beyond the issue-0037 regression.
0.3 chess allocator audit — documented; no chess-side migration needed.
0.6 align_of($T) builtin — landed. Mirrors size_of in inst.zig BuiltinId, lower.zig (registry + return-type + reflection handler), interp.zig (fallback), sema.zig (allowed-builtins list), lsp/server.zig (both completion tables), library/modules/std.sx. Smoke coverage added in examples/50-smoke.sx (u8/s32/s64/Point).
0.7 #import transitivity — surfaced and fixed via issue-0038.
0.8 #foreign("c") rename syntax — confirmed #foreign libc "name".
0.9 generic UFCS dispatch — surfaced and fixed via issue-0040. Dot-call now works for struct methods with $T: Type parameters, so the plan's a.create(MyType) shape is viable.

issue-0041 (pointer type as type-arg) and issue-0042 (alias names in resolveTypeArg) FIXED. Regression tests at examples/issue-0041.sx and examples/issue-0042.sx. Scratch verification: size_of(*u8)=8, size_of(Ptr where Ptr::*u8)=8, size_of(?u8)=2, size_of(Maybe where Maybe::?u8)=2 — all clean on interp + codegen.

Also landed during 0041/0042: the silent .s64 fallback in resolveTypeArg is gone — unresolved type names now emit a real diagnostic. Surfaced and removed two bogus size_of(Complex) / size_of(Sx) calls in examples/10-generic-struct.sx that were relying on the silent default. Caller-side speculative paths in buildTypeBindings + inferGenericReturnType now gate the call with type_bridge.isTypeShapedAstNode before invoking resolveTypeArg. CLAUDE.md REJECTED PATTERNS gained a section forbidding silent orelse default returns in compiler code.
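The difference between the old silent fallback and the new diagnostic can be modeled in a few lines. This is a Python sketch of the idea only, not the compiler's Zig code; `KNOWN_TYPES`, `resolve_type_arg_silent`, `resolve_type_arg_strict`, and `UnresolvedTypeError` are hypothetical names introduced for illustration.

```python
# Hypothetical model of a type-name resolver; the real resolveTypeArg
# lives in the Zig compiler and is not reproduced here.
KNOWN_TYPES = {"u8": 1, "s32": 4, "s64": 8}  # type name -> size in bytes


class UnresolvedTypeError(Exception):
    """Stands in for a compiler diagnostic."""


def resolve_type_arg_silent(name):
    # Old behaviour (modeled): an unknown name quietly becomes s64.
    return KNOWN_TYPES.get(name, KNOWN_TYPES["s64"])


def resolve_type_arg_strict(name):
    # New behaviour (modeled): an unknown name is reported, never defaulted.
    if name not in KNOWN_TYPES:
        raise UnresolvedTypeError(f"unresolved type name: {name}")
    return KNOWN_TYPES[name]
```

In this model the silent variant answers 8 for any unknown name, which is how a bogus size_of call can go unnoticed; the strict variant surfaces the same call as an error.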

Parser regressions introduced by the 0041 work fixed in src/parser.zig:hasFnBodyAfterArrow:

(s: string) -> [:0]u8 { ... } — added .colon to the return-type token walk.
(x) -> Closure(...) -> R { ... } — added .arrow so nested return-type arrows continue the walk.
name :: (self: T) -> Ret; inside struct #compiler — recognise trailing ; as a method-decl terminator. Was silently dropping every BuildOptions.* method from fn_ast_map.

Full gate green: 151/151 example tests + chess on macOS / iOS sim / Android with screenshots.

Implicit Context refactor SHIPPED. Every default-conv sx function now carries __sx_ctx: *void at LLVM slot 0. context.X lookups resolve through the lowering's current_ctx_ref — a one-indirection load, no per-access walk, thread-safe by construction (each call stack carries its own Context chain). push Context.{...} allocates a fresh slot and rebinds current_ctx_ref for the body's lexical scope. The context LLVM global is gone; the only runtime Context is the static __sx_default_context (a CAllocator backing libc malloc/free), installed at FFI-inbound entries (main, Java_*, JNI_OnLoad) and used by the interp for #run evaluation.
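The threading scheme described above can be sketched as plain argument passing. This is a Python model under stated assumptions, not sx or compiler code: the `Context` dataclass, the string-valued `allocator` field, and the function names `leaf`, `with_arena`, and `main` are all hypothetical stand-ins.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    allocator: str      # stands in for an Allocator protocol value
    data: object = None


# Modeled on __sx_default_context: the one static Context.
DEFAULT_CONTEXT = Context(allocator="c_allocator")


def current_allocator(ctx):
    # A context.X read is a single load through the ctx the function was handed.
    return ctx.allocator


def leaf(ctx):
    return current_allocator(ctx)


def with_arena(ctx):
    # Models push Context.{...}: build a fresh Context and hand it to the body only.
    pushed = replace(ctx, allocator="arena")
    inside = leaf(pushed)
    # Once the body ends, the caller's ctx is what later calls see again.
    after = leaf(ctx)
    return inside, after


def main():
    # Models an FFI-inbound entry: install the default context, forward it as arg 0.
    return with_arena(DEFAULT_CONTEXT)
```

Because every call receives its context as an argument, there is no shared mutable global to race on, which is the sense in which the design is thread-safe by construction.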

What landed:

Function.has_implicit_ctx flag + Module.has_implicit_ctx per-compilation switch (gated by Context :: struct {...} being present in the dep graph).
Param prepend + call-site forwarding across every sx-to-sx path: direct calls, indirect through fn-pointer vars, protocol dispatch, closure trampolines, lambda trampolines, bare-fn trampolines, generic monomorphizations, comptime functions.
ConstantValue.func_ref: FuncId variant so the static initializer for __sx_default_context can reference the CAllocator thunks.
emit_llvm two-pass global emission: aggregates that name funcs by FuncId resolve them after func_map is populated.
Interp: defaultContextValue() builds the Context aggregate on demand; interp.call bootstraps slot_ptr(0) when an entry function with implicit ctx is invoked sans args; materializeCtxArg derefs the caller's slot_ptr at every sx-to-sx boundary so callees can treat ref 0 as the Context aggregate; .load of an aggregate is passthrough; .global_addr of __sx_default_context returns the aggregate directly.
matchContextAllocCall is preserved as a comptime escape hatch: the ct_module spun up by evalComptimeString doesn't get the full Allocator/CAllocator/Context type registration, so the protocol-dispatch IR can't run in the interp. Codegen also benefits from the direct libc malloc/free in the trivial default-context case.
C-side callback into sx requires callconv(.c) on the sx fn (and on any fn-pointer TYPE the user casts a C fn-pointer through). Tests adjusted: examples/61-objc-roundtrip.sx, examples/62-objc-class.sx, examples/95..97-objc-block*.sx, examples/ffi-06-callback.sx.

152/152 example tests pass. 11 JNI/ObjC IR snapshots were regenerated for the ctx-prepended thunk signatures. examples/75-push-context-with-arena.sx still demonstrates push-as-stack-discipline. 08-types.txt was regenerated for the undefined-init drift (the test prints =--- fields).

ISSUE-MEM-002 (the context.allocator.alloc(size) pattern-match bypass) is closed in the sense that user-typed context.allocator.X now flows through the real protocol vtable at codegen time. The pattern-match remains in the lowering as a comptime escape hatch (documented at matchContextAllocCall).

Next step

Phase 1.2 (free → context.allocator.dealloc) and Phase 1.3 (closure env allocation through context) are unblocked. Phase 1.4 (codegen serializer for all interp Value variants) is also unblocked.

The plan at ~/.claude/plans/lets-see-options-for-merry-dijkstra.md is fully shipped. Next session can pick Phase 1.2 from the MEM plan at ~/.claude/plans/tidy-doodling-cray.md.

Phase 0.3 audit findings — chess allocator usage

Sites where chess code touches an allocator API:

~/projects/game/quick.sx — standalone quicksort demo. Uses GPA.init() + Arena.init(context.allocator, …) (already on the new one-line init API). Calls context.allocator.alloc(N*size) which goes through ISSUE-MEM-002's .heap_alloc pattern-match — i.e. bypasses the protocol. Will be fixed by Phase 1.
~/projects/game/main.sx — 7 sites of context.allocator.alloc(size_of(T)) for platform/GPU/pipeline state. Same ISSUE-MEM-002 bypass; Phase 1 cleanup applies.
~/projects/game/chess/game.sx — ChessGameState.init captures context.allocator into a parent_allocator: Allocator field, then restores it via push Context.{ allocator = self.parent_allocator, data = … } in select_square. This is exactly the protocol-store + protocol-restore shape covered by Phase 0.2 sanity tests — known working.
~/projects/game/chess/pieces.sx — declares its own free :: (ptr: *void) #foreign plus read_file_bytes and calls free(xx bytes) directly on a libc-malloc'd buffer (returned by a foreign read helper). Bypasses the allocator protocol intentionally — this is C interop for buffers owned by C side. No change needed.

Net: chess is well-positioned. After Phase 1 lands, the seven context.allocator.alloc sites in main.sx + the one in quick.sx will start flowing through the protocol vtable instead of .heap_alloc, which means --leak-check (Phase 5) will start counting them for free. No chess code needs migration on the allocator API itself.

Log

2026-05-25 — Implicit-Context refactor SHIPPED end-to-end. All 9 plan steps (lets-see-options-for-merry-dijkstra.md) landed. context is no longer an LLVM global; every default-conv sx function carries __sx_ctx at slot 0; context.X reads load through current_ctx_ref; push Context.{...} is alloca + rebind; FFI-inbound entries install &__sx_default_context; interp bootstraps the default Context on top-level call; matchContextAllocCall preserved as comptime escape hatch. 152/152 + unit tests green. Four commits: 29784c2 (Steps 1-2), 92c6b47 (Step 3), 4bf5908 (Steps 5-7), b69a2ea (Step 8).
2026-05-24 — Phase 1.1 shipped: buildProtocolValue heap-copy now routes through context.allocator.alloc via the new allocViaContext helper. Regression at examples/130-xx-value-routes-through-context-allocator.sx proves a Tracer installed via push Context sees the alloc (Tracer.count = 1) — interp + codegen parity. 152/152 + chess green.
2026-05-24 — issue-0041 and issue-0042 both fixed end-to-end. Also removed the silent .s64 fallback in resolveTypeArg, guarded the two upstream callers (buildTypeBindings, inferGenericReturnType) with type_bridge.isTypeShapedAstNode, and fixed three parser regressions introduced by the 0041 work ([:0]u8 return type, nested return-arrows, struct #compiler trailing-; method decls). Full verify-step.sh gate green. CLAUDE.md REJECTED PATTERNS gained the silent-fallback-defaults prohibition. Stream now READY for Phase 1.
2026-05-24 — Phase 0.6 shipped (align_of($T) builtin). Touchpoints: inst.zig BuiltinId, lower.zig registry + return-type table + reflection handler, interp.zig fallback, sema.zig builtin allowlist, lsp/server.zig both completion tables, library/modules/std.sx. Smoke coverage added in examples/50-smoke.sx. 151/151 + chess green on all platforms. Then size_of(*u8) parse error was investigated — filed as issue-0041 (pre-existing, affects both size_of and align_of). Stream paused on 0041. Also tightened CLAUDE.md IMPASSIBLE RULES to close the "pre-existing / non-blocking" loophole that almost let this session roll past the issue filing.
2026-05-24 — issue-0040 filed. Phase 0.9 verified that obj.method(Type) with a $T: Type parameter fails to dispatch via dot, while explicit static (T.method(obj, Type)) and pipe (obj |> T.method(Type)) both work. Root cause pinpoint: src/ir/lower.zig:5066-5123 has branches for generic-template struct methods (5082) and non-generic qualified (5106), but no branch for a non-template struct with a generic method. Stream paused on 0040.
2026-05-24 — Phase 0.8 #foreign("c") syntax verified. Form is name :: <sig> #foreign libc ["c_name"] with the optional string literal supplying a rename. Confirmed via libc_strlen :: (s: *u8) -> usize #foreign libc "strlen"; scratch test (interp + codegen parity).
2026-05-24 — issue-0039 fix verified, full tools/verify-step.sh gate green again: 150/150 example tests pass and chess builds + screenshots OK on macOS / iOS sim / Android.
2026-05-24 — issue-0038 fix verified; spike now errors as expected on transitive references. 149/149 example tests pass (+1 vs pre-fix). Chess build broken as predicted; three-bucket triage written up in issues/0039-chess-needs-explicit-imports-post-0038.md. Stream paused on 0039.
2026-05-24 — Phase 0.7 spike at /tmp/sx-import-spike/ (a.sx → b.sx → c.sx) showed a.sx calls c_only_fn() and reads c_only_const directly. Filed as issue-0038. Stream is paused per the IMPASSIBLE RULE: no workaround, because the libc hide-by-internal-module strategy in the plan depends on the language semantics matching the assumption.
2026-05-24 — Phase 0.2 sanity sweep landed. Five additional xx cast patterns exercised through tools/scratch.sh; all show interp/codegen parity. Combined with the issue-0037 regression at examples/126-xx-recover-then-dispatch.sx, the cast story is now considered tight for the cases the MEM plan relies on.
2026-05-24 — Phase 0.3 chess allocator audit recorded above. No chess-side migration needed; ISSUE-MEM-002 (context.allocator bypass) is the only thing the chess codebase is exposed to and Phase 1 already owns it.
2026-05-24 — Phase 0.0a (tools/verify-step.sh) shipped. Confirmed working: 145/145 example tests + chess builds + screenshots on all 3 platforms. Initial 3-second screencap delay was too short for Android — increased to 6 seconds; iOS sim + macOS to 5 seconds.
2026-05-24 — Phase 0.0b (tools/scratch.sh) shipped. Verified with a hello-world snippet: interp + codegen agree.
2026-05-24 — Phase 0.0c initial implementation of TrackingAllocator in library/modules/allocators.sx. Build + 145 tests pass after snapshot regen. Chess green on all 3 platforms. Manual scratch.sh test confirms counters increment correctly when called directly on the tracker variable.
2026-05-24 — ISSUE-MEM-007 fixed in src/ir/lower.zig. Root cause: emitProtocolDispatch keyed the auto-unbox path on mi.ret_type == void_ptr, but *void is overloaded — both Self-disguised-as-*void AND a literal -> *void return appear as the same TypeId. With target_type leaking from the enclosing function's return type (e.g. main -> s32), every *void return was loaded as s32, yielding 0 → null. Fix: stash ret_is_self on ProtocolMethodInfo during registerProtocolDecl (set when the AST return-type node is the Self type-expr), and gate the unbox on that flag. Regression at examples/99-protocol-void-pointer-return.sx. Sister symptom (SIGTRAP inside struct-static method storing an Allocator field) also fixed by the same change.
2026-05-24 — issue-0037 fixed in src/ir/lower.zig. Root cause: lowerXX had a Concrete→Protocol erasure branch but no inverse Protocol→pointer recovery — the cast fell through to coerceToType, which couldn't match the (struct, pointer) shape and returned the operand unchanged. Result: a 24-byte protocol struct stored into an 8-byte ptr alloca, corrupting adjacent stack (the protocol value's own slot was the next victim, so the next dispatch loaded garbage and crashed). Fix: when src is a protocol value and dst is a pointer, emit struct_get of field 0 (ctx), then bitcast *void → dst. Regression at examples/126-xx-recover-then-dispatch.sx.
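
The stack corruption in the issue-0037 entry above can be reproduced on a byte buffer. This is a Python model, not compiler output: the stack layout (pointer slot directly followed by the protocol value), the three 8-byte words, and all function names are assumptions for illustration. Only "24-byte protocol struct", "8-byte ptr", and "field 0 is ctx" come from the log.

```python
import struct

PTR_SIZE = 8
PROTO_SIZE = 24  # modeled as three 8-byte words; field 0 is ctx


def make_stack(proto_words):
    # Assumed layout: [ptr slot: 8 bytes][protocol value: 24 bytes]
    stack = bytearray(PTR_SIZE + PROTO_SIZE)
    struct.pack_into("<3Q", stack, PTR_SIZE, *proto_words)
    return stack


def buggy_cast_store(stack):
    # Pre-fix: the operand comes back unchanged, so all 24 bytes are
    # stored at the 8-byte pointer slot and spill into the neighbour.
    proto = bytes(stack[PTR_SIZE:PTR_SIZE + PROTO_SIZE])
    stack[0:PROTO_SIZE] = proto


def fixed_cast_store(stack):
    # Post-fix: take field 0 (ctx) and store just that pointer-sized word.
    (ctx,) = struct.unpack_from("<Q", stack, PTR_SIZE)
    struct.pack_into("<Q", stack, 0, ctx)


def read_proto(stack):
    return struct.unpack_from("<3Q", stack, PTR_SIZE)
```

In the buggy path the protocol value's own words are overwritten by the spill, so a later dispatch would read shifted fields; in the fixed path the protocol value is untouched and the pointer slot holds ctx.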

Known issues (discovered during execution)

ISSUE-MEM-001: Type inference defaults `p := malloc(64)` to `s64`

Severity: medium (workaround exists; bites unexpectedly).

Symptom: Writing p := malloc(64) (no explicit type) infers p: s64 instead of p: *void. Subsequent free(p) then fails LLVM verification with "Call parameter type does not match function signature!" because free expects ptr but receives i64.

Workaround: Explicit type annotation p : *void = malloc(64); or xx malloc(64); at the call site.

Reproduction:

main :: () -> s32 {
    p := malloc(64);   // p inferred as s64
    free(p);            // LLVM verify fails: ptr expected, i64 given
    0;
}

Root cause: Likely in the inference path for := declarations when the RHS is a *void-returning #builtin. The compiler defaults the binding to s64 instead of matching the return type. To investigate in a future session.

Status: Open. Not blocking mem.sx work but worth fixing as a quality-of-life issue. File as examples/issue-NNNN.sx when addressed.

ISSUE-MEM-002: `context.allocator.alloc/dealloc` bypasses protocol dispatch

Severity: high (breaks the entire Phase 1 premise; documented fix in Phase 1).

Symptom: Any code that goes through context.allocator.alloc(size) or context.allocator.dealloc(ptr) is pattern-matched in src/ir/lower.zig:5137-5159 and lowered directly to .heap_alloc/.heap_free IR — which calls libc malloc/free. The protocol-value vtable is bypassed entirely.

This means a push Context.{ allocator = my_tracker, ... } block followed by context.allocator.alloc(size) does NOT call the tracker's alloc method. The tracker sees zero allocations even though many occurred.

Workaround: Call the allocator directly via a variable rather than via context.allocator:

push Context.{ allocator = tracker, data = null } {
    p := tracker.alloc(64);     // works — dispatches through tracker
    tracker.dealloc(p);
}

But this defeats the purpose of context-allocator overriding for user code that doesn't know about the tracker.

Fix: Phase 1 of the mem.sx plan removes this pattern-match and replaces it with proper context dispatch through the Allocator protocol. After Phase 1, context.allocator.alloc(size) correctly dispatches to whatever allocator is currently set in context.
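
A small model makes the before/after behaviour concrete. This is a Python sketch, not the lowering code: `lower_bypass` and `lower_dispatch` are hypothetical names for the two lowerings, and the Python classes only borrow the names CAllocator and TrackingAllocator from the sx library.

```python
class CAllocator:
    def alloc(self, size):
        return bytearray(size)  # stands in for libc malloc


class TrackingAllocator:
    def __init__(self, parent):
        self.parent = parent
        self.count = 0

    def alloc(self, size):
        self.count += 1
        return self.parent.alloc(size)


def lower_bypass(ctx_allocator, size):
    # Modeled pre-fix lowering: the call shape is recognised and sent
    # straight to the heap, ignoring which allocator is installed.
    return bytearray(size)


def lower_dispatch(ctx_allocator, size):
    # Modeled post-fix lowering: call through whatever allocator is in context.
    return ctx_allocator.alloc(size)
```

With a tracker installed, the bypass leaves `count` at 0 while the dispatching version increments it, which is the same observation the Tracer regression in the Log relies on.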

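Status: Closed in the user-call path by the Implicit-Context refactor (see "Current state"); the pattern-match remains only as the comptime escape hatch documented at matchContextAllocCall. Before the fix, this blocked auto-tracker-wrap (Phase 5 --leak-check).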

ISSUE-MEM-003: 08-types test depends on undefined memory contents

Severity: low (flaky test exposes itself when allocator code changes).

Symptom: examples/08-types.sx declares a struct field c : u8 = ---; (uninitialized) and prints the struct. The expected snapshot captures a specific value (formerly c: 176, now c: 8) which depends on whatever's in undefined memory at that moment. Any allocator change shifts the value.

Workaround: Regenerate snapshot via bash tests/run_examples.sh --update when this test fails for unrelated reasons.

Fix: Test should be rewritten to NOT depend on undefined memory content — perhaps verify the field is one of N specific values, or just don't print uninitialized data. Out of scope for mem.sx.

Status: Open. Document and live with it for now.

ISSUE-MEM-004: Adding code to allocators.sx shifts JNI/Obj-C IR snapshots

Severity: medium (eats time per step).

Symptom: Every additive change to allocators.sx (new struct, new method) cascades into ~11 IR snapshot diffs across the tests/expected/ffi-{jni,objc}-*.ir files. Each step needs --update + git diff review. The diffs are usually benign (additive declarations) but one (ffi-objc-call-06-sret-return.ir) reorganises ~1500 lines because it uses reflection on the full type registry.

Workaround: Per-step --update + diff review.

Fix: Phase 0.1 spike — extend normalize_ir() in tests/run_examples.sh to strip allocator-related declarations and constant-pool renumberings from snapshots. Should make additive changes invisible at the snapshot level while preserving JNI/Obj-C ceremony signal.

Status: Mitigated; structural fix in Phase 0.1.

ISSUE-MEM-005: Two-line `create(@storage)` pattern for allocators

Severity: low (cosmetic but real DX friction).

Symptom: Current pattern requires a separate storage decl + a create call:

g_gpa : GPA = ---;
libc := GPA.create(@g_gpa);

Two lines per allocator. Plan committed to one-line via heap-copy:

libc := GPA.create();   // heap-copies via xx value

Fix: Phase 0.0c. The plan was to switch create methods on at least TrackingAllocator (and possibly GPA/Arena/BufAlloc) to use xx value heap-copy, and to add an instance(a) -> *T accessor for types where users need the underlying state (TrackingAllocator).

Status: Shipped per "Current state" (allocator API on one-line init returning *T; TrackingAllocator added).

ISSUE-MEM-007: Protocol dispatch on `*void`-returning methods returns null (FIXED)

Severity: was CRITICAL; resolved.

Root cause: emitProtocolDispatch in src/ir/lower.zig gated its auto-unbox path on mi.ret_type == void_ptr, but the same TypeId covers both Self-disguised-as-*void and a literal -> *void. With target_type leaking from the surrounding function (e.g. main -> s32), every protocol call returning *void got its result loaded as sizeof(target_type) bytes — for s32 that's the first 4 bytes of the malloc'd block, which were zero, comparing equal to null.

Fix: Tag ProtocolMethodInfo with ret_is_self: bool, set in registerProtocolDecl when the AST return-type is the Self type-expr. emitProtocolDispatch only takes the auto-unbox path when ret_is_self is true. Literal *void returns are now passed through unchanged.
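
The gating change can be sketched as two predicates over the method info. This is a Python model with hypothetical function names (`should_unbox_old`, `should_unbox_new`); only ProtocolMethodInfo, ret_type, and ret_is_self are taken from the description above, and the string `"*void"` stands in for the shared TypeId.

```python
from dataclasses import dataclass

VOID_PTR = "*void"  # one TypeId covers both Self-as-*void and a literal *void


@dataclass
class ProtocolMethodInfo:
    ret_type: str
    ret_is_self: bool = False  # set at registration when the AST return type is Self


def should_unbox_old(mi):
    # Pre-fix gate: cannot tell the two meanings of *void apart.
    return mi.ret_type == VOID_PTR


def should_unbox_new(mi):
    # Post-fix gate: only a Self return takes the auto-unbox path.
    return mi.ret_is_self
```

Under the old predicate a method that literally returns *void is unboxed like a Self return; under the new one it passes through unchanged.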

Regression: examples/99-protocol-void-pointer-return.sx. The sister symptom (SIGTRAP from protocol dispatch inside a struct static method that stores an Allocator field) was the same root cause and is also fixed.

Status: Resolved 2026-05-24.

ISSUE-MEM-006: Android screencap needs 6+ seconds; iOS sim + macOS 5+

Severity: low (fixed in tooling).

Symptom: Initial verify-step.sh used sleep 3 before screencap. Android needed longer for the side panel to render; with only 3 seconds, the screenshot showed only the chess board and a black strip where the side panel should be.

Fix: Updated verify-step.sh delays: macOS 5s, iOS sim 5s, Android 6s. Documented inline in the script.

Status: Resolved.
