atomics A.1b: real RMW emission (atomicrmw) + unit test (green)

emitAtomicRmw: LLVMBuildAtomicRMW (binop from RmwKind; signed Min/Max vs unsigned UMin/UMax from val_ty; singleThread=0; LLVM supplies ABI alignment). examples/1701 green (add/sub/and/or/xor/min/max return old values, results verified). Unit test 'emit: atomic rmw (add + signed/unsigned min)' locks 'atomicrmw add' + signed 'min' vs unsigned 'umin'. Suite green (716/0).
2026-06-20 10:19:44 +03:00
parent 718f27e27f
commit 05311646aa
6 changed files with 98 additions and 51 deletions
--- a/current/CHECKPOINT-ATOMICS.md
+++ b/current/CHECKPOINT-ATOMICS.md
@@ -4,24 +4,24 @@ Companion to [PLAN-ATOMICS.md](PLAN-ATOMICS.md). Update after every step (one st
 time, per the cadence rule). New corpus category: `17xx`.

 ## Last completed step
-**A.0c (guard hardening) — DONE.** Adversarial review of A.0 found two CRITICAL silent-wrong
-defects (raw LLVM verifier errors leaking via the public intrinsics) + a latent align
-fallback; all fixed:
- **Scalar-kind guard** (call.zig): the size-only check admitted same-sized aggregates
-  (`[8]u8`, 8-byte struct) → invalid `load atomic [8 x i8]`. Now an allowlist switch
-  (integer/float/bool/pointer/enum/vector) rejects non-scalars loudly.
- **Per-op ordering validity** (call.zig): a load can't `release`/`acq_rel`, a store can't
-  `acquire`/`acq_rel` (LLVM rejects). Now a loud diagnostic, not invalid IR.
- **`val_ty` align fallback** (ops.zig): the `else .i64` (align-8) default would over-align a
-  sub-8 store → now bails loudly if `val_ty` is missing (a compiler bug, not a guess).
- Locked by `examples/1130-diagnostics-atomic-nonscalar.sx` +
-  `1131-diagnostics-atomic-bad-ordering.sx`. Suite green (713, 0 failed).
+**A.1 (RMW) — DONE** (A.1a lock + A.1b green). `fetch_add/sub/and/or/xor` + `fetch_min/max`
+→ LLVM `atomicrmw` (returns OLD value). New IR op `atomic_rmw` + `RmwKind` (no `nand`);
+`LLVMBuildAtomicRMW` with binop from kind, signed/unsigned `Min/Max` from `val_ty`. RMW
+restricted to INTEGER T (float fadd / pointer RMW out of scope, rejected loudly); all five
+orderings valid for RMW. comptime_vm does real single-thread RMW. `examples/1701` (add/sub/
+and/or/xor/min/max) green; unit test locks `atomicrmw add` + signed `min` vs unsigned `umin`.
+Suite green (716/0).

-**Capability landed (worker, commit 3c4305f):** comptime enum value params — `$o: Ordering`
-now binds the variant tag and resolves in the body; during LOWERING the tag is readable via
-`self.comptimeIntNamed("<param>")` → `?i64`. This unblocks A.0.5 (full ordering surface).
-Worker limitation: `.tagged_union` constraints were skipped — user wants this lifted (any
-value should be comptime-able); separate generalization worker spawned.
+## Next step
+**A.2 — CAS**: `compare_exchange`/`_weak` → LLVM `cmpxchg` (returns **`?T`, null = success**).
+New IR op `atomic_cmpxchg` (ptr, cmp, new, success_ord, failure_ord, weak). Validate the two
+orderings (failure may not be release/acq_rel nor stronger than success). `_weak` → `LLVMSetWeak`.
+
+### Earlier — A.0c (guard hardening)
+Adversarial review of A.0 found two CRITICAL silent-wrong defects (raw LLVM verifier errors
+via the public intrinsics) + a latent align fallback; all fixed: scalar-kind allowlist +
+per-op ordering validity (call.zig), `val_ty` align bail (ops.zig). Locked by examples
+1130/1131. Suite green (713/0).

 ### Earlier — A.0b (green)
 Real atomic load/store emission: `LLVMBuildLoad2`/`LLVMBuildStore`
@@ -46,19 +46,6 @@ Full atomic load/store plumbing with LLVM emission deliberately bailing loudly;
  scalar-size guard, both loud; emit dispatch arms (emit_llvm.zig) → `emitAtomicLoad`/
  `emitAtomicStore` (ops.zig) currently BAIL via `comptime_failed`.

-## Current state
- A.0a committed; suite green.
- The recognizer + IR + emit already handle ALL FIVE orderings; only the surface bakes a
-  `.seq_cst` literal (the methods can't yet forward a runtime/comptime ordering — gap below).
- emit bodies are the ONLY placeholder; A.0b swaps them for real builders.
-
-## Next step
-**A.0b** — replace the `emitAtomicLoad`/`emitAtomicStore` bail with real
-`LLVMBuildLoad2`+`SetOrdering`+`SetAlignment` / `LLVMBuildStore`+`SetOrdering`+
-`SetAlignment` (explicit sx-tag→LLVM ordering switch); regen 1700 → green (7/42/43) + host
-`.ir`; add `emit_llvm.test.zig` unit. Then adversarial review, then the comptime-enum worker
-+ A.0.5 migration to the full ordering surface.
-
 ## A.0.5 — full ordering surface (DONE)
 `Atomic($T).load($o: Ordering)` / `store(v, $o)` — ordering is a COMPTIME value param,
 explicit (Rust-style, no default; design §4.6). `a.load(.acquire)` emits `load atomic …
@@ -76,12 +63,14 @@ sidesteps it cleanly by requiring explicit ordering (matches the design). Candid
 follow-up, not an atomics blocker.

 ## Known issues / capability gaps
- **Comptime-constant ordering propagation MISSING (blocks the full surface).** A runtime
-  `Ordering` method param can't reach LLVM (orderings are instruction attributes, not
-  operands), and comptime enum value params don't exist (`$o: Ordering` → `o` unresolved in
-  body; `resolveValueParamArg` folds integers only). So A.0 ships seq_cst-only; A.0.5 closes
-  the gap (worker: implement comptime enum value params) and MIGRATES the methods — NO
-  legacy left by stream end.
+- **RESOLVED:** comptime-constant ordering propagation — landed via comptime value params
+  (3c4305f / d7a6857 / d95ba0a); A.0.5 migrated the methods, no seq_cst-only legacy.
+- **Orthogonal gap (not an atomics blocker):** default VALUES for comptime params don't bind
+  on generic-struct methods (free-fn defaults DO work). Atomics requires explicit ordering
+  (design-aligned), so it's unaffected. Candidate future fix.
+- **Cosmetic:** an invalid ordering passed through a method (`a.load(.release)`) reports the
+  diagnostic at the lib forward site (`atomic.sx`), not the user's call. Loud + correct, but
+  the span could be improved by threading the call-site span. Polish.
 - **Latent (observed, not yet filed):** calling an *unrecognized* bodiless `#builtin`
  silently returns 0 / no-ops with exit 0 (that's how 1700 behaved before recognition
  landed) — a silent-fallback footgun in the generic builtin-call path, independent of
@@ -118,3 +107,6 @@ follow-up, not an atomics blocker.
  (explicit). Recognizer resolves comptime-bound ordering via `comptimeIntNamed`. 1700
  migrated to explicit orderings (acquire/release/relaxed/seq_cst). Unblocked by
  comptime-value-param workers (3c4305f/d7a6857/d95ba0a). Suite green (715/0).
+- **A.1** — RMW: atomic_rmw op + RmwKind + recognizer (rmwKindFromName, integer-only) +
+  7 fetch_* methods/intrinsics. A.1a bail-lock → A.1b real LLVMBuildAtomicRMW (signed|unsigned
+  min/max). comptime_vm real RMW. 1701 + unit test. Suite green (716/0).
--- a/examples/expected/1701-atomics-rmw.exit
+++ b/examples/expected/1701-atomics-rmw.exit
@@ -1 +1 @@
-1
+0
--- a/examples/expected/1701-atomics-rmw.stderr
+++ b/examples/expected/1701-atomics-rmw.stderr
@@ -1,7 +1 @@
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
-error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)
+
--- a/examples/expected/1701-atomics-rmw.stdout
+++ b/examples/expected/1701-atomics-rmw.stdout
@@ -1 +1,10 @@
-
+old add: 10
+old sub: 15
+now: 12
+old and: 240
+old or:  48
+old xor: 51
+now: 60
+old min: 20
+old max: 8
+now: 15
--- a/src/backend/llvm/ops.zig
+++ b/src/backend/llvm/ops.zig
@@ -396,13 +396,33 @@ pub const Ops = struct {
        }
    }

-    // A.1a (Stream A) lock: emission BAILS LOUDLY until A.1b wires the real
-    // LLVMBuildAtomicRMW (binop from kind; signed/unsigned Min/Max from val_ty).
+    // atomicrmw returns the OLD value. The binop comes from the kind; min/max
+    // pick signed vs unsigned from val_ty. singleThread stays 0. LLVM gives the
+    // op the type's ABI alignment automatically (no explicit SetAlignment needed,
+    // unlike plain load/store).
+    fn rmwBinOp(kind: ir_inst.RmwKind, is_unsigned: bool) c.LLVMAtomicRMWBinOp {
+        return switch (kind) {
+            .add => c.LLVMAtomicRMWBinOpAdd,
+            .sub => c.LLVMAtomicRMWBinOpSub,
+            .@"and" => c.LLVMAtomicRMWBinOpAnd,
+            .@"or" => c.LLVMAtomicRMWBinOpOr,
+            .xor => c.LLVMAtomicRMWBinOpXor,
+            .min => if (is_unsigned) c.LLVMAtomicRMWBinOpUMin else c.LLVMAtomicRMWBinOpMin,
+            .max => if (is_unsigned) c.LLVMAtomicRMWBinOpUMax else c.LLVMAtomicRMWBinOpMax,
+        };
+    }
+
    pub fn emitAtomicRmw(self: Ops, instruction: *const Inst, a: AtomicRmw) void {
-        _ = a;
-        std.debug.print("error: atomic rmw LLVM emission not yet implemented (Stream A, A.1b)\n", .{});
-        self.e.comptime_failed = true;
-        self.e.mapRef(c.LLVMGetUndef(self.e.toLLVMType(if (instruction.ty == .void) .i64 else instruction.ty)));
+        const ptr = self.e.resolveRef(a.ptr);
+        const val = self.e.resolveRef(a.operand);
+        const ptr_kind = c.LLVMGetTypeKind(c.LLVMTypeOf(ptr));
+        if (ptr_kind == c.LLVMPointerTypeKind and instruction.ty != .void) {
+            const is_unsigned = self.e.ir_mod.types.isUnsignedInt(a.val_ty);
+            const result = c.LLVMBuildAtomicRMW(self.e.builder, rmwBinOp(a.kind, is_unsigned), ptr, val, llvmOrdering(a.ordering), 0);
+            self.e.mapRef(result);
+        } else {
+            self.e.mapRef(c.LLVMGetUndef(self.e.toLLVMType(if (instruction.ty == .void) .i64 else instruction.ty)));
+        }
    }

    pub fn emitAtomicStore(self: Ops, a: AtomicStore) void {
--- a/src/ir/emit_llvm.test.zig
+++ b/src/ir/emit_llvm.test.zig
@@ -248,6 +248,38 @@ test "emit: atomic load/store (seq_cst, aligned)" {
    try std.testing.expect(std.mem.indexOf(u8, ir_str, "align 8") != null);
 }

+test "emit: atomic rmw (add + signed/unsigned min)" {
+    const alloc = std.testing.allocator;
+    var module = Module.init(alloc);
+    defer module.deinit();
+
+    var b = Builder.init(&module);
+
+    _ = b.beginFunction(str(&module, "f"), &.{}, .i64);
+    const entry = b.appendBlock(str(&module, "entry"), &.{});
+    b.switchToBlock(entry);
+
+    const si = b.alloca(.i64);
+    const ui = b.alloca(.u64);
+    const five = b.constInt(5, .i64);
+    _ = b.emit(.{ .atomic_rmw = .{ .ptr = si, .operand = five, .val_ty = .i64, .ordering = .seq_cst, .kind = .add } }, .i64);
+    _ = b.emit(.{ .atomic_rmw = .{ .ptr = si, .operand = five, .val_ty = .i64, .ordering = .seq_cst, .kind = .min } }, .i64);
+    _ = b.emit(.{ .atomic_rmw = .{ .ptr = ui, .operand = five, .val_ty = .u64, .ordering = .seq_cst, .kind = .min } }, .u64);
+    b.ret(five, .i64);
+    b.finalize();
+
+    var emitter = LLVMEmitter.init(alloc, &module, "test_rmw", .{});
+    defer emitter.deinit();
+    emitter.emit();
+
+    try std.testing.expect(emitter.verify());
+
+    const ir_str = emitter.dumpToString();
+    try std.testing.expect(std.mem.indexOf(u8, ir_str, "atomicrmw add") != null);
+    try std.testing.expect(std.mem.indexOf(u8, ir_str, "atomicrmw min") != null); // signed i64
+    try std.testing.expect(std.mem.indexOf(u8, ir_str, "atomicrmw umin") != null); // unsigned u64
+}
+
 test "emit: comparison and branch" {
    const alloc = std.testing.allocator;
    var module = Module.init(alloc);