Stream A (atomics) foundation. Net-new atomic load/store codegen path, wired end-to-end except LLVM emission, which deliberately bails loudly so the example locks to a clean diagnostic (A.0b turns it green; cadence: no commit both adds a test and makes it pass).
- library/modules/std/atomic.sx: Ordering enum, Atomic($T) transparent wrapper (init/load/store, seq_cst-only for now), atomic_load/atomic_store #builtin intrinsics. Opt-in import, NOT in the universal std facade (Ordering in the prelude grows every program's type table + churns 37 .ir snapshots).
- IR: atomic_load/atomic_store ops + AtomicOrdering (all 5) + structs (inst.zig); print arms; comptime_vm arms reuse load/store (single-thread correct); recognizer tryLowerAtomicIntrinsic (const-ordering + scalar-size guards, both loud); emit dispatch -> emitAtomicLoad/Store bail via comptime_failed.
- examples/1700-atomics-load-store.sx locked to the bail diagnostic.
Full ordering surface (a.load(.acquire)) blocked on comptime-constant ordering propagation (comptime enum value params): A.0.5, migrated not legacy.
PLAN-ATOMICS — Stream A (atomics lowering)
Carved from PLAN-POST-METATYPE.md Stream A + the design-of-record ../design/execution-evolution-roadmap.md §3 (N1) + §4.6 (locked surface). Progress in CHECKPOINT-ATOMICS.md.
Goal: net-new LLVM atomic codegen. Surface = a pure-sx Atomic($T) generic struct +
an Ordering enum (ordinary sx), with the actual atomic operations recognized as
#builtin intrinsics at lower-time and emitted as new IR ops. This is 100% net-new —
no atomics scaffolding exists (the only lower.zig "ordering" is comparison ordering
< <= >=, unrelated to memory ordering — do not mistake it for groundwork).
Cadence (INVIOLABLE): no commit both adds a test AND makes it pass (lock-to-bail, then
flip to green); zig build && zig build test green after every step; never regen snapshots
while red; scope regens with -Dname=examples/NNNN-…sx -Dupdate-goldens + review the diff.
New corpus category: 17xx atomics.
Design (grounded against the tree)
Representation — minimal compiler surface
- Ordering is an ordinary sx enum, zero compiler coupling:

      Ordering :: enum { relaxed; acquire; release; acq_rel; seq_cst; }  // tags 0..4

- Atomic($T) is an ordinary sx generic struct (mirrors List :: struct ($T: Type) at list.sx:5), a transparent 1-field wrapper: atomicity is a property of the operation, not the storage, so Atomic(i64) has the exact layout/size/align of i64. NO new IR type, NO type-system coupling:

      Atomic :: struct ($T: Type) {
          value: T;
          init  :: (v: T) -> Atomic(T) { return .{ value = v }; }
          load  :: (self: *Atomic(T), o: Ordering) -> T { return atomic_load(T, @self.value, o); }
          store :: (self: *Atomic(T), v: T, o: Ordering) { atomic_store(T, @self.value, v, o); }
      }

- The operations are #builtin intrinsic free functions, recognized by name at lower-time (the established pattern: size_of/type_info in tryLowerReflectionCall, recognized BEFORE arg lowering):

      atomic_load  :: ($T: Type, ptr: *T, o: Ordering) -> T #builtin;
      atomic_store :: ($T: Type, ptr: *T, v: T, o: Ordering) #builtin;

  Explicit $T first arg follows the size_of($T)/field_name($T, idx) mixed type+value precedent (lowest-risk; the reflection path already resolves type args).
Ordering is compile-time-only by construction — and that forces a capability gap
LLVM atomic ordering is an instruction attribute, not a runtime operand, so the
ordering MUST be known at emit time. The lower-time handler reads the ordering arg's
variant name statically (it must be a constant enum literal .seq_cst) and bakes it
into the IR op as a Zig enum field (AtomicOrdering). A non-literal ordering is a loud
diagnostic, never a silent default (REJECTED-PATTERNS).
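The literal-only guard described above can be modeled in a few lines. This is a minimal Python sketch of the rule, not the recognizer itself: the node classes `EnumLiteral` and `Identifier`, the `LowerError` exception, and the function name `read_ordering_literal` are all hypothetical stand-ins for whatever the lowering pass actually uses.

```python
from dataclasses import dataclass
from enum import Enum


class AtomicOrdering(Enum):
    """The five orderings the IR op can carry."""
    RELAXED = "relaxed"
    ACQUIRE = "acquire"
    RELEASE = "release"
    ACQ_REL = "acq_rel"
    SEQ_CST = "seq_cst"


@dataclass(frozen=True)
class EnumLiteral:
    """Hypothetical AST node for a literal such as `.seq_cst`."""
    variant: str


@dataclass(frozen=True)
class Identifier:
    """Hypothetical AST node for a name such as the runtime parameter `o`."""
    name: str


class LowerError(Exception):
    """Stands in for a loud compiler diagnostic."""


def read_ordering_literal(arg) -> AtomicOrdering:
    """Return the ordering baked into the op, or fail loudly.

    There is deliberately no default: a non-literal argument is an error,
    never a silent fallback to seq_cst.
    """
    if not isinstance(arg, EnumLiteral):
        raise LowerError(f"atomic ordering must be a constant enum literal, got {arg!r}")
    try:
        return AtomicOrdering(arg.variant)
    except ValueError:
        raise LowerError(f"unknown Ordering variant .{arg.variant}") from None
```

Under this model a forwarded parameter (`Identifier("o")`) is rejected exactly like any other non-literal, which is the situation the next paragraph runs into.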
Discovered gap (grounded): a generic Atomic(T) method load(self, o: Ordering) would
forward o — a runtime parameter — to the intrinsic, where it is NOT a literal. And
comptime enum value params don't exist ($o: Ordering → o is "unresolved" in the
body; resolveValueParamArg folds integer constraints only). A runtime dispatch hack
(if o == { case .acquire: atomic_load(…, .acquire) … }) also fails: load with a
release/acq_rel ordering is invalid LLVM, so the arms can't be uniform. Therefore the
full ordering surface is blocked on a net-new capability (comptime-constant ordering
propagation — either comptime enum value params, or compiler-recognized Atomic method
calls). That capability is its own step (A.0.5), sequenced before ordering-bearing ops.
sx tag → LLVM ordering is EXPLICIT (non-contiguous!)
LLVM's LLVMAtomicOrdering is not 0..4: Monotonic=2, Acquire=4, Release=5, AcquireRelease=6, SequentiallyConsistent=7 ([Core.h:338-354]). The sx Ordering tags
(relaxed=0…seq_cst=4) map via an explicit switch, never an identity cast:
relaxed→Monotonic, acquire→Acquire, release→Release, acq_rel→AcquireRelease, seq_cst→SequentiallyConsistent.
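As a sanity model of that mapping, here is a small Python sketch. The numeric values are the ones quoted above; the class and function names (`SxOrdering`, `LLVMAtomicOrdering`, `to_llvm_ordering`) are illustrative only, and the real switch would live in the Zig emitter.

```python
from enum import IntEnum


class SxOrdering(IntEnum):
    """sx Ordering tags, contiguous 0..4."""
    RELAXED = 0
    ACQUIRE = 1
    RELEASE = 2
    ACQ_REL = 3
    SEQ_CST = 4


class LLVMAtomicOrdering(IntEnum):
    """The LLVM-C values quoted above (note the gap at 3)."""
    MONOTONIC = 2
    ACQUIRE = 4
    RELEASE = 5
    ACQUIRE_RELEASE = 6
    SEQUENTIALLY_CONSISTENT = 7


# Explicit table: every sx tag is listed, nothing is derived arithmetically.
_SX_TO_LLVM = {
    SxOrdering.RELAXED: LLVMAtomicOrdering.MONOTONIC,
    SxOrdering.ACQUIRE: LLVMAtomicOrdering.ACQUIRE,
    SxOrdering.RELEASE: LLVMAtomicOrdering.RELEASE,
    SxOrdering.ACQ_REL: LLVMAtomicOrdering.ACQUIRE_RELEASE,
    SxOrdering.SEQ_CST: LLVMAtomicOrdering.SEQUENTIALLY_CONSISTENT,
}


def to_llvm_ordering(o: SxOrdering) -> LLVMAtomicOrdering:
    """Map an sx tag to the LLVM ordering; a missing entry raises KeyError."""
    return _SX_TO_LLVM[o]
```

No sx tag equals its LLVM value, and no constant offset works either (the offsets are 2, 3, 3, 3, 3), so an identity or add-a-constant cast would be wrong for at least one ordering.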
LLVM-C API (verified present in llvm-c/Core.h, no new extern decls needed)
- Atomic load = LLVMBuildLoad2 + LLVMSetOrdering(v, ord) + LLVMSetAlignment(v, size) (alignment is mandatory on atomic load/store: the LLVM verifier rejects atomics without it). There is no LLVMBuildAtomicLoad/Store (the Explore agent was wrong).
- Atomic store = LLVMBuildStore + LLVMSetOrdering + LLVMSetAlignment.
- (Later) LLVMBuildAtomicRMW(B, op, ptr, val, ord, singleThread), LLVMBuildAtomicCmpXchg(B, ptr, cmp, new, succOrd, failOrd, singleThread), LLVMBuildFence(B, ord, singleThread, name), LLVMSetWeak. singleThread = 0 (multi-thread / cross-thread ordering).

Atomic-eligible T = integer / pointer / float of size 1·2·4·8(·16). Reject non-scalar / bad-size T loudly (diagnostic), do not silently emit.
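The eligibility rule and the "alignment = size" convention above can be sketched as a single check. This is a Python model under stated assumptions: the kind strings, `AtomicTypeError`, and `atomic_alignment` are hypothetical names, and 16-byte support is treated as allowed here even though the text parenthesizes it.

```python
# Sizes (in bytes) that the text lists as atomic-eligible; 16 is the
# parenthesized optional case and is accepted in this sketch.
ATOMIC_SIZES = (1, 2, 4, 8, 16)
SCALAR_KINDS = ("integer", "pointer", "float")


class AtomicTypeError(Exception):
    """Stands in for the loud diagnostic on an ineligible T."""


def atomic_alignment(kind: str, size_bytes: int) -> int:
    """Return the alignment to pass to LLVMSetAlignment, or fail loudly.

    Mirrors the rule in the text: scalar kinds only, power-of-two sizes
    from the allowed list, alignment equal to the size.
    """
    if kind not in SCALAR_KINDS:
        raise AtomicTypeError(f"atomic T must be integer/pointer/float, got {kind}")
    if size_bytes not in ATOMIC_SIZES:
        raise AtomicTypeError(f"atomic T has unsupported size {size_bytes}")
    return size_bytes
```

For example, `atomic_alignment("integer", 8)` yields 8 (matching `align 8` on an i64), while a struct or a 3-byte scalar raises instead of emitting anything.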
Comptime VM treats atomics as ordinary load/store
Comptime is single-threaded, so seq_cst is trivially satisfied — the
comptime_vm arms for atomic_load/atomic_store
reuse the ordinary load/store paths (correct, NOT a bail). sx run JITs via LLVM so
runtime atomics execute the real ops; the VM arm only matters for #run/const-init.
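A toy interpreter makes the "reuse, not bail" point concrete. This Python sketch is only a model of the idea: `ComptimeVM`, its dict-backed memory, and the string op names are assumptions for illustration and do not reflect the structure of comptime_vm.zig.

```python
class ComptimeVM:
    """Single-threaded toy VM: atomic ops share the plain load/store paths."""

    def __init__(self):
        self.memory = {}  # address -> value

    def load(self, addr):
        return self.memory[addr]

    def store(self, addr, value):
        self.memory[addr] = value

    def execute(self, op, *operands):
        # The atomic arms carry an extra ordering operand, which a
        # single-threaded evaluator can ignore: with one thread, every
        # ordering (including seq_cst) is trivially satisfied.
        if op in ("load", "atomic_load"):
            return self.load(operands[0])
        if op in ("store", "atomic_store"):
            self.store(operands[0], operands[1])
            return None
        raise NotImplementedError(op)
```

Usage: after `vm.execute("atomic_store", 0x10, 42, "seq_cst")`, both `vm.execute("atomic_load", 0x10, "seq_cst")` and a plain `vm.execute("load", 0x10)` observe the same value.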
Files the new IR op variants force (exhaustive switches)
atomic_load / atomic_store variants must be handled in every Op switch or the Zig
build fails (this is the desired tripwire):
- inst.zig:159 - add atomic_load: AtomicLoad, atomic_store: AtomicStore + the structs (mirror Store at inst.zig:286).
- lower/call.zig:1672 - recognize the intrinsics, emit the ops (new tryLowerAtomicIntrinsic, called alongside tryLowerReflectionCall at call.zig:80).
- print.zig:231 - print arms (sx-IR / ir-dump).
- emit_llvm.zig:1566 - dispatch arms → ops.zig.
- backend/llvm/ops.zig:325 - emitAtomicLoad/emitAtomicStore (mirror emitLoad/emitStore).
- comptime_vm.zig:659 - arms reusing load/store.
- Any other .op switch the Zig compiler flags (module.zig / program_index.zig): let the build tell you.
Test snapshots — the arch-.ir requirement is a MISCONCEPTION for atomics
sx ir = emitIR, which emits LLVM IR (respects --target);
sx ir-dump is the sx-IR printer. At the LLVM-IR level, load atomic i64, ptr %x seq_cst, align 8 is arch-invariant — identical text for x86_64 and aarch64. The
x86-lock/MOV vs aarch64-ldar/stlr divergence happens only at instruction selection
(sx asm), which the corpus does not snapshot. So:
- A single host .ir snapshot proves the achievable gate (the load atomic <ordering> keyword + correct ordering + alignment emitted). PLAN-POST §A / design §10.3's "arch-gated x86_64 + aarch64 .ir" would capture byte-identical files, so drop it.
- Optionally add ONE cross-arch ir-only example (.build {"target":"x86_64-linux"} on an aarch64 host) purely as a cross-target-emission-doesn't-crash smoke test; note in its header that the IR body is identical to host.
- State loudly (out of snapshot scope, parallel to the ordering-semantics caveat): asm-level arch lowering AND weak-memory ordering semantics are NOT proven by .ir; those need the Stream-C stress harness, not the corpus.
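To make the achievable gate concrete, here is a hedged Python sketch of what a check over one line of a host .ir snapshot could assert. The helper name `check_atomic_load_line` and the regular expression are assumptions for illustration; the corpus harness may compare whole files instead.

```python
import re

# Matches the textual LLVM IR form quoted above, e.g.
#   %v = load atomic i64, ptr %x seq_cst, align 8
_ATOMIC_LOAD = re.compile(r"\bload atomic (\S+), ptr (\S+) (\w+), align (\d+)")


def check_atomic_load_line(line: str, ordering: str, align: int) -> bool:
    """True if the line is an atomic load with the given ordering and alignment."""
    m = _ATOMIC_LOAD.search(line)
    if m is None:
        return False  # plain (non-atomic) load, or not a load at all
    return m.group(3) == ordering and int(m.group(4)) == align
```

Nothing in the matched text names a target triple or a machine instruction, which is why one host snapshot is enough for this gate and why it says nothing about the lock/MOV or ldar/stlr selection.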
Phases
A.0 — Atomic($T) + Ordering + seq_cst-only load/store ← START HERE
Scope (descoped per the discovered gap above): ship the net-new atomic load/store
codegen with a seq_cst literal baked in the method bodies — load(self) -> T /
store(self, v) (NO ordering param yet). The intrinsic still carries the full
AtomicOrdering field (always .seq_cst here); the recognizer + emit handle all five
orderings already, so A.0.5 only has to plumb the constant through. Explicit orderings
(a.load(.acquire)) land in A.0.5. seq_cst-only is correct (conservative-strongest), not a
silent fallback.
Two-commit cadence (lock-to-bail → green):
- A.0a (lock): land the lib + IR plumbing with emit deliberately bailing:
  - New library/modules/std/atomic.sx: Ordering enum, Atomic($T) struct (value + init/load/store), atomic_load/atomic_store #builtin decls. Opt-in import (#import "modules/std/atomic.sx"), NOT carried by the universal std.sx facade (mirrors trace). Rationale (grounded): adding the concrete Ordering enum to the universal prelude registers it into EVERY program's global type table, growing @__sx_type_is_unsigned (378→380) and shifting all string-global numbering → churned 37 unrelated .ir snapshots + bloats every binary. Atomics is a deliberate concurrency capability, so consumers import it explicitly.
  - Add IR ops atomic_load/atomic_store + AtomicOrdering + the two op structs (inst.zig); print arms; comptime_vm arms (reuse load/store); lower recognition (tryLowerAtomicIntrinsic) incl. the const-ordering-literal guard + non-scalar-T reject.
  - emit_llvm/ops.zig arms bail loudly for now: emitAtomicLoad/Store call the emitter's bail-with-diagnostic path ("atomic load/store LLVM emission not yet implemented") so the Zig build is exhaustive but the example is red-by-diagnostic.
  - Add examples/1700-atomics-load-store.sx (construct Atomic(i64).init, store, load, print). Seed marker; capture snapshot = the emit-bail diagnostic (nonzero exit). zig build && zig build test green (matches the locked bail snapshot). Commit.
- A.0b (green): replace the emit bail with real emission: LLVMBuildLoad2 + LLVMSetOrdering + LLVMSetAlignment / LLVMBuildStore + LLVMSetOrdering + LLVMSetAlignment, ordering via the explicit sx-tag→LLVM switch. Regen 1700 to success output + capture its host .ir (asserts load atomic/store atomic + ordering). Add a unit test in emit_llvm.test.zig (correct op + ordering + alignment emission). Review the diff (no stray error text). Commit.
A.0.5 — comptime-constant ordering propagation (the capability gap)
Enable a.load(.acquire) etc. — i.e. an Ordering that reaches the intrinsic as a
compile-time constant through a method. Two candidate designs (pick at pickup):
- (a) comptime enum value params: make $o: Ordering resolve in the body to its variant tag (extend comptime_value_bindings / the typer beyond integers). General, reusable; larger typer change.
- (b) compiler-recognized Atomic methods: special-case Atomic(T).load/store/… calls (read the literal ordering arg at the method call site), bounded coupling to the std Atomic type (cf. how Vector is special-cased). Smaller; less general.

Also enforce per-op ordering validity (load: relaxed/acquire/seq_cst; store: relaxed/release/seq_cst; CAS's dual orderings) as compile errors, which is exactly what the constant-ordering path buys. Retrofit the ordering param onto load/store here.
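The per-op validity sets listed above translate directly into a lookup. A minimal Python sketch, with the hypothetical names `VALID_ORDERINGS`, `OrderingError`, and `check_ordering`; the CAS dual-ordering rule is left out here because it depends on both orderings at once.

```python
# Orderings each operation accepts, as listed in the text above.
VALID_ORDERINGS = {
    "load": frozenset({"relaxed", "acquire", "seq_cst"}),
    "store": frozenset({"relaxed", "release", "seq_cst"}),
}


class OrderingError(Exception):
    """Stands in for a compile error on an invalid op/ordering pair."""


def check_ordering(op: str, ordering: str) -> None:
    """Raise if `ordering` is not valid for `op`; return None otherwise."""
    allowed = VALID_ORDERINGS[op]
    if ordering not in allowed:
        raise OrderingError(
            f"{op} does not accept ordering .{ordering} "
            f"(valid: {', '.join(sorted(allowed))})"
        )
```

This check is only possible because the ordering is a compile-time constant: `a.load(.release)` can be rejected at the call site, whereas a runtime ordering value could not be validated until execution.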
A.1 — RMW: fetch_add/sub/and/or/xor + fetch_min/max → atomicrmw (no nand)
One IR op atomic_rmw carrying an RmwKind (maps to LLVMAtomicRMWBinOp*). Signed vs
unsigned min/max picks Max/Min vs UMax/UMin from T's signedness. Same lock→green
cadence; 17xx examples.
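The signedness-dependent selection can be sketched as follows. This Python model returns the LLVM-C enumerator names as strings; the function name `select_rmw_binop` and the short kind strings are assumptions, and nand is rejected to match the "no nand" scope above.

```python
# Kinds whose LLVM binop does not depend on the signedness of T.
_SIMPLE = {
    "add": "LLVMAtomicRMWBinOpAdd",
    "sub": "LLVMAtomicRMWBinOpSub",
    "and": "LLVMAtomicRMWBinOpAnd",
    "or": "LLVMAtomicRMWBinOpOr",
    "xor": "LLVMAtomicRMWBinOpXor",
}

# (kind, is_signed) -> binop for the min/max family.
_MIN_MAX = {
    ("min", True): "LLVMAtomicRMWBinOpMin",
    ("max", True): "LLVMAtomicRMWBinOpMax",
    ("min", False): "LLVMAtomicRMWBinOpUMin",
    ("max", False): "LLVMAtomicRMWBinOpUMax",
}


def select_rmw_binop(kind: str, is_signed: bool) -> str:
    """Pick the atomicrmw operation for fetch_<kind> on a T of given signedness."""
    if kind in _SIMPLE:
        return _SIMPLE[kind]
    if (kind, is_signed) in _MIN_MAX:
        return _MIN_MAX[(kind, is_signed)]
    raise ValueError(f"unsupported RMW kind: {kind}")
```

So fetch_max on an unsigned T selects the UMax variant, while fetch_add selects the same operation regardless of signedness.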
A.2 — compare_exchange/_weak → cmpxchg (returns ?T, null = success)
atomic_cmpxchg op (ptr, cmp, new, success_ord, failure_ord, weak). LLVM cmpxchg
returns {T, i1}; lower to ?T where null = success (extract the i1, invert).
Validate the two orderings in the compiler (design §4.6): failure ordering may not be
release/acq_rel nor stronger than success — loud diagnostic. _weak sets LLVMSetWeak.
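Both halves of A.2 (ordering validation and the `{T, i1}` to `?T` lowering) can be modeled in a short single-threaded Python sketch. Everything named here is hypothetical: `check_cmpxchg_orderings`, `cmpxchg`, `compare_exchange`, and in particular the `_STRENGTH` ranking, which is one assumed reading of "stronger than" (acquire and release ranked equal) that the design doc would need to confirm. Python's `None` stands in for sx null.

```python
from typing import Optional, Tuple

# ASSUMPTION: a simple strength ranking; acquire and release are treated as equal.
_STRENGTH = {"relaxed": 0, "acquire": 1, "release": 1, "acq_rel": 2, "seq_cst": 3}


def check_cmpxchg_orderings(success: str, failure: str) -> None:
    """Raise ValueError for an invalid (success, failure) ordering pair."""
    if failure in ("release", "acq_rel"):
        raise ValueError(f"failure ordering may not be .{failure}")
    if _STRENGTH[failure] > _STRENGTH[success]:
        raise ValueError(f".{failure} is stronger than success ordering .{success}")


def cmpxchg(cell: list, expected: int, new: int) -> Tuple[int, bool]:
    """Model of the LLVM result pair {T, i1}: (old value, success flag)."""
    old = cell[0]
    ok = old == expected
    if ok:
        cell[0] = new
    return old, ok


def compare_exchange(cell: list, expected: int, new: int) -> Optional[int]:
    """Lowered surface: None on success, otherwise the observed value."""
    old, ok = cmpxchg(cell, expected, new)
    return None if ok else old
```

With `cell = [5]`, `compare_exchange(cell, 5, 9)` returns `None` and leaves `[9]`; a second attempt with the stale expected value 5 returns the observed 9 and changes nothing, which is the value a retry loop would feed back in.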
A.3 — swap + fence(.ordering)
swap = atomic_rmw with Xchg kind (folds into A.1's op). fence = a new atomic_fence
op (ordering only) → LLVMBuildFence. 17xx examples.
Gates (per the corrected snapshot story)
- unit emit_llvm.test.zig: each op emits the right LLVM builder + ordering + alignment.
- corpus 17xx single-thread deterministic (sx run, JIT executes real atomics).
- host .ir snapshot per op proves the keyword/ordering/alignment lowered.
- OUT of snapshot scope, stated loudly: asm-level arch divergence (sx asm) and weak-memory ordering semantics. Those are Stream-C stress harness territory, not the corpus.
Kickoff prompt (A.0a — paste into a fresh session)
Implement Stream A step A.0a (atomics lock commit) per current/PLAN-ATOMICS.md. Verify zig build && zig build test is green first. Then:
(1) create library/modules/std/atomic.sx with the Ordering enum, Atomic($T) struct, and atomic_load/atomic_store #builtin decls; keep it an opt-in import (#import "modules/std/atomic.sx") and do NOT wire it into the universal library/modules/std.sx facade (see the A.0a rationale).
(2) Add the atomic_load/atomic_store IR ops + AtomicOrdering + op structs in src/ir/inst.zig; handle them in every exhaustive Op switch the Zig build flags (print.zig, comptime_vm.zig reuse load/store, emit_llvm dispatch).
(3) Add tryLowerAtomicIntrinsic in src/ir/lower/call.zig (recognize the two builtins, bake the const ordering literal into the op, loud-reject non-literal ordering AND non-scalar/bad-size T).
(4) Make emitAtomicLoad/emitAtomicStore in src/backend/llvm/ops.zig BAIL loudly ("not yet implemented") this commit.
(5) Add examples/1700-atomics-load-store.sx, seed the marker, capture the bail diagnostic as the locked snapshot, confirm zig build test green, commit.
STOP: A.0b (real emission) is the next step. Do NOT implement emission in the same commit that adds the example.