test(asm): Phase 0.0 — corpus target-gating + .build JSON config

Adds per-example build/run directives to the corpus runner via an optional
`expected/<name>.build` JSON sidecar (`BuildConfig { aot, target }`), replacing
the standalone `.aot` marker. Threads `--target` into the run/build/ir spawns
and gates the execute path on host arch+os match; a cross-target example fails
loudly ("ir-only mode not yet implemented") pending Phase 0.1.

- corpus_run.test.zig: BuildConfig + std.json parse (unknown-key => error),
  hostMatchesTarget (shorthand-expand + arch/os token match, arm64->aarch64),
  withTarget argv helper; unit tests for the JSON parse and hostMatchesTarget.
- migrate 1226/1227 `.aot` markers -> `.build` { "aot": true }.
- lock fixture 1638-platform-target-host (`.build` { "target": "macos" }).

Test-infra only; no compiler code. zig build test green (646 corpus, 444 unit).
agra
2026-06-15 17:37:35 +03:00
parent d6a9c4f0c4
commit c88f4fbcef
12 changed files with 282 additions and 19 deletions


@@ -6,19 +6,46 @@ commit, one step at a time per the cadence rule (no commit may both add a test
and make it pass).
## Last completed step
None — plan authored, not yet started.
**0.0** — corpus runner target-gating + `<name>.build` JSON config. Added
`BuildConfig` (`std.json.parseFromSliceLeaky`→`struct { aot, target }`,
unknown-key ⇒ `error.UnknownField`) replacing the standalone `.aot` marker;
migrated the 2 existing `.aot` markers (1226/1227) to `{ "aot": true }`; threaded
`--target` into the `run`/`build`/`ir` spawns via `withTarget`; added
`hostMatchesTarget` (shorthand-expand + arch/os token match, `arm64`→`aarch64`)
gating the execute path. Cross-target mismatch fails **loudly** (placeholder until
0.1's ir-only branch) — verified the bail fires (`target=linux,
host=aarch64-macos`). Locked with `examples/1638-platform-target-host.sx`
(`.build` `{ "target": "macos" }`, runs natively + asserts stdout) + unit tests
for the JSON parse and `hostMatchesTarget`. `zig build test` green (646 corpus, 0
failed; 444 unit, 0 failed). Files: `src/corpus_run.test.zig`,
`examples/1638-*`, `examples/expected/{1226,1227,1638}-*`.
## Current state
Design fully converged (`docs/inline-asm-design.md`). Feasibility confirmed:
`llvm_api.c.*` exposes `LLVMGetInlineAsm` / `LLVMBuildCall2` /
`LLVMAppendModuleInlineAsm` (LLVM@19). No code written.
Phase 0 step 0.0 landed (test-infra only — no compiler code touched). The corpus
runner now reads `expected/<name>.build` and threads/gates `--target`; an
arch-pinned example whose target matches the host **executes**, a mismatch
currently **bails loudly** ("cross-target ir-only mode not yet implemented"). Phase
A-E feasibility already confirmed against the live tree (`LLVMGetInlineAsm` /
`LLVMBuildCall2` / `LLVMAppendModuleInlineAsm` in LLVM@19 `Core.h`; ERR-stream
`extractvalue`→tuple machinery in `emit_llvm.zig:726-927`; lib-less `extern`, 60
sites; `--target` a global CLI flag).
## Next step
**A.0** — add the `kw_asm` keyword (`src/token.zig` Tag + `StaticStringMap`) and a
unit lex test. Then A.1 (parse `asm { … }`→`AsmExpr`, lowering bails loudly).
**0.1** — implement the **mismatch ⇒ ir-only** branch in `sweepRoot` (replace the
loud placeholder bail): when `cfg.target` doesn't match the host, skip
run/build/exec and assert only `.exit`+`.ir`+`.stderr` from `sx ir --target`;
require an `.ir` snapshot (loud failure if absent). Lock with
`examples/16xx-platform-target-cross.sx` (asm-free `() -> i64 { return 0; }`),
`.build` `{ "target": "x86_64-linux" }`, + a checked-in `.ir` snapshot. Then 0.2
(CLAUDE.md §Testing/§Test-layout docs for `.build`), then Phase A (`kw_asm`). See
`PLAN-ASM.md` Phase 0.
## Log
- (init) Plan + design doc written; ASM stream opened.
- (0.0) Corpus runner target-gating: `<name>.build` JSON config (replaces `.aot`
marker), `--target` threading, `hostMatchesTarget` execute-gate, loud
cross-target placeholder bail. Migrated 1226/1227 `.aot`→`.build`; locked with
1638 fixture + unit tests. `zig build test` green.
## Known issues
None yet.


@@ -22,8 +22,93 @@ outputs return a tuple; templates are pure AT&T (via LLVM).
## Cadence (IMPASSIBLE)
No commit may both add a test AND make it pass. Each feature step is either a
behavior-locking PASSING test, or an xfail test the *next* commit turns green.
Arch-pinned tests live in `examples/16xx-platform-asm-*` (must declare `target=`).
Never regenerate snapshots while red.
Arch-pinned tests live in `examples/16xx-platform-asm-*` and declare their target
via the `"target"` key of the `expected/<name>.build` sidecar (Phase 0). Never
regenerate snapshots while red.
## Phase 0 — corpus target-gating (test-infra prerequisite; no compiler code)
**Why first.** The flagship v1 examples are `x86_64` (syscall-write, divmod,
cpuid) but the dev host is `aarch64`-Darwin, and the corpus runner
([src/corpus_run.test.zig](../src/corpus_run.test.zig)) currently (a) never threads
a per-example `--target` and (b) has no host-arch gate — its only skip is "marker
has no `.sx`". So D.0's `…-syscall-write` markers asserting exit/stdout describe
output the harness *cannot* produce on this host, which would violate the cadence
rule (the "next commit turns it green" can never happen). Phase 0 closes that gap.
It touches **only the runner + two fixtures** — zero compiler code, zero risk to
A-E, and unblocks every arch-pinned asm example.
**Marker taxonomy (the cleanup).** The runner currently spreads per-example
*directives* across standalone boolean/value sidecars (`.aot` now, `.target`
proposed, more later). Replace that sprawl with **one optional config file,
`expected/<name>.build`**, holding all build/run directives; the output snapshots
(`.exit` / `.stdout` / `.stderr` / `.ir`) stay separate — they are
machine-regenerated data, not config. `.exit` remains the **test-discovery key**
(every test has one; `.build` is optional).
**`.build` format** — JSON, parsed with `std.json`:
```json
{ "aot": true, "target": "x86_64-linux" }
```
Parse via `std.json.parseFromSlice(BuildConfig, …)` into
`struct { aot: bool = false, target: ?[]const u8 = null }`. Field defaults cover
omitted keys; `std.json`'s default `ignore_unknown_fields = false` makes an
**unknown key a loud `error.UnknownField`** (surfaced as a runner failure, never a
silent ignore — CLAUDE.md no-silent-default rule). Extensible: future `"cpu"`,
`"link"`, `"cwd"` are just new optional struct fields, no new sidecar file and no
custom parser.
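A minimal sketch of the parse described above, assuming the non-leaky `std.json.parseFromSlice` variant named in this section (the runner itself may use a different variant). The test name and the sample `"cpu"` key are illustrative only; `"cpu"` stands in for any key the struct does not yet declare.

```zig
const std = @import("std");

// Same shape as the struct described above: defaults cover omitted keys.
const BuildConfig = struct {
    aot: bool = false,
    target: ?[]const u8 = null,
};

test "sketch: .build parse with defaults and a rejected unknown key" {
    const gpa = std.testing.allocator;

    // Only `target` is given, so `aot` falls back to its default.
    const parsed = try std.json.parseFromSlice(
        BuildConfig,
        gpa,
        "{ \"target\": \"x86_64-linux\" }",
        .{},
    );
    defer parsed.deinit(); // frees the arena owned by the Parsed(T) wrapper
    try std.testing.expect(!parsed.value.aot);
    try std.testing.expectEqualStrings("x86_64-linux", parsed.value.target.?);

    // `ignore_unknown_fields` defaults to false, so an undeclared key is an
    // error rather than a silent ignore.
    try std.testing.expectError(
        error.UnknownField,
        std.json.parseFromSlice(BuildConfig, gpa, "{ \"cpu\": \"native\" }", .{}),
    );
}
```

Adding a future directive under this scheme would mean declaring one more optional field on the struct, after which the same key stops being rejected.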
**What the directives do:**
1. **`target = <triple|shorthand>`** threads `--target <value>` into every `sx`
invocation for that example (`run` / `build` / `ir`; `--target` is a global
flag, confirmed [main.zig:39](../src/main.zig#L39)), AND **host-match selects
the mode.** The runner parses the leading `arch` + `os` tokens of the resolved
triple and compares them to `@import("builtin").target` (normalizing
`arm64`→`aarch64`):
- **match** → *execute* exactly as today (`sx run`, or `aot` build+exec) with
the target threaded, plus the `.ir` diff if an `.ir` snapshot exists. ⇒ an
x86_64 example gives **real end-to-end coverage on an x86_64 CI runner**.
- **mismatch** → **ir-only**: run *only* `sx ir <file> --target <t>`; assert
`.exit` (the ir command's exit), `.ir` (normalized stdout), and `.stderr`
(diagnostics, normally empty). Do **not** run/build/exec; do **not** assert
`.stdout`. An `.ir` snapshot is **required** in ir-only mode — its absence is
a loud runner failure ("arch-pinned <name>: ir-only mode requires an .ir
snapshot"), never a silent pass. Robust even if `sx ir` treats `--target` as
a partial no-op: the `inline_asm` op carries the template + constraint string
verbatim, so the IR snapshot still locks the exact thing §II.11 flags as
silently-miscompiling (the constraint assembler + template rewrite).
2. **`aot`** is the existing JIT-vs-build+exec switch, just relocated from the
standalone `.aot` marker into `.build`.
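The match/mismatch rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: `Mode`, `selectMode`, and `checkIrSnapshot` are hypothetical names, not the runner's, and the host-match result is passed in as a plain `bool` rather than computed.

```zig
const std = @import("std");

// Hypothetical names for illustration; none of these exist in the runner.
const Mode = enum { execute, ir_only };

/// An example with no `target`, or one whose target names the host, executes
/// as before; a pinned target that does not match the host is ir-only.
fn selectMode(target: ?[]const u8, host_matches: bool) Mode {
    if (target == null) return .execute;
    return if (host_matches) .execute else .ir_only;
}

/// ir-only mode requires an `.ir` snapshot, so its absence is an error;
/// execute mode treats the snapshot as optional.
fn checkIrSnapshot(mode: Mode, ir_snapshot: ?[]const u8) error{MissingIrSnapshot}!void {
    if (mode == .ir_only and ir_snapshot == null) return error.MissingIrSnapshot;
}

test "sketch: mode selection and the ir-only snapshot requirement" {
    try std.testing.expectEqual(Mode.execute, selectMode(null, false));
    try std.testing.expectEqual(Mode.execute, selectMode("macos", true));
    try std.testing.expectEqual(Mode.ir_only, selectMode("x86_64-linux", false));

    try checkIrSnapshot(.execute, null);
    try checkIrSnapshot(.ir_only, "placeholder ir text");
    try std.testing.expectError(error.MissingIrSnapshot, checkIrSnapshot(.ir_only, null));
}
```

Returning an error for the missing snapshot, instead of skipping the example, mirrors the rule that an arch-pinned example must never pass silently.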
**Negative compile-error examples need NO `.build`.** `…-missing-volatile`
(no-output-without-`volatile`) is a Sema diagnostic raised before codegen/JIT, so
plain `sx run` reports it identically on any host — it stays a normal example with
no config file.
**update-goldens interaction:** in ir-only mode, `-Dupdate-goldens` writes `.exit`
(ir exit) + `.ir` (+ `.stderr` if non-empty) and skips `.stdout`. Execute mode
(incl. `aot`) is unchanged. `.build` is hand-authored — update-goldens never
writes it.
| Step | Commit | What | Files |
|---|---|---|---|
| 0.0 | lock | Add `BuildConfig` + `std.json` parse of `expected/<name>.build` (unknown-key ⇒ `error.UnknownField`); **migrate** the 2 existing `.aot` markers → `.build` (content `{ "aot": true }`) and delete them; thread `target`'s `--target` into the spawned argv; add `hostMatchesTarget(value) bool` (arch+os token parse, `arm64`→`aarch64`) gating the **execute** path. Lock with `examples/16xx-platform-target-host.sx` (trivial `main`) + a `.build` `{ "target": "<host arch triple>" }` (still runs+passes) and unit `test`s for the JSON parse + `hostMatchesTarget`. | `src/corpus_run.test.zig`, `examples/expected/1226-*.{aot→build}`, `…/1227-*`, + fixture |
| 0.1 | lock | Implement the **mismatch ⇒ ir-only** branch (skip run/build/exec; assert `.exit`+`.ir`+`.stderr` from `sx ir --target`; require `.ir`). Lock with `examples/16xx-platform-target-cross.sx` (asm-free `() -> i64 { return 0; }`) + `.build` `{ "target": "x86_64-linux" }` + a checked-in `.ir` snapshot — exercises ir-only on the arm64 host. | `src/corpus_run.test.zig` + fixture |
| 0.2 | docs | Update CLAUDE.md §"Test layout"/§"Testing" to document `.build` (format + `aot`/`target` keys) replacing the standalone `.aot` marker prose (lines ~435, ~492). | `CLAUDE.md` |
Both 0.0 and 0.1 are **lock** commits: the runner change and the fixture that
exercises it land together and pass the moment they land (the mechanism works
immediately — nothing is left red), which is the cadence rule's "lock in current
behavior" flavor, not a feature red→green. No asm lowering is gated on either.
**Phase 0 verification:** `zig build test` green; deliberately corrupt the
cross-target `.ir` fixture and confirm the runner reports an IR mismatch (proves
ir-only actually asserts, isn't a no-op); delete it and confirm the
"requires an .ir snapshot" failure fires.
**Estimated runner delta:** ~70-90 lines (sidecar read + `--target` argv threading
+ `hostMatchesTarget` + the ir-only branch + update-mode tweak). Within the
"no step > ~500 new lines" rule; well under the read budget.
## Phase A — keyword + AST + parser (parses; no codegen)
| Step | Commit | What | Files |


@@ -0,0 +1,9 @@
// Phase 0 (ASM stream) test-infra lock: exercises the `<name>.build` JSON
// config + `--target` threading + the host-match EXECUTE path of the corpus
// runner. The companion `.build` pins the HOST target (`{ "target": "macos" }`
// resolves to the host arch+os), so the runner threads `--target` and still
// runs the example natively — its stdout is asserted as usual.
#import "modules/std.sx";
main :: () {
print("target-host ok\n");
}


@@ -0,0 +1 @@
{ "aot": true }


@@ -0,0 +1 @@
{ "aot": true }


@@ -0,0 +1 @@
{ "target": "macos" }


@@ -0,0 +1 @@
0


@@ -0,0 +1 @@


@@ -0,0 +1 @@
target-host ok


@@ -1,4 +1,5 @@
const std = @import("std");
const builtin = @import("builtin");
const corpus_paths = @import("corpus_paths");
// End-to-end example/issue regression runner. For every
@@ -159,6 +160,83 @@ fn readOptional(io: std.Io, gpa: std.mem.Allocator, abs_path: []const u8) ?[]u8
return std.Io.Dir.readFileAlloc(.cwd(), io, abs_path, gpa, .limited(MAX_OUTPUT)) catch null;
}
/// Per-example build/run directives, parsed from an optional `<name>.build`
/// JSON sidecar (replaces the old standalone `.aot` marker). Output snapshots
/// (.exit/.stdout/.stderr/.ir) stay separate — they are regenerated data, not
/// config. An unknown key is `error.UnknownField` (std.json default
/// `ignore_unknown_fields = false`), surfaced as a loud test failure — never a
/// silent ignore. Future directives (`cpu`, `link`, `cwd`) are just new
/// optional fields here, no new sidecar file.
const BuildConfig = struct {
aot: bool = false,
target: ?[]const u8 = null,
};
fn parseBuildConfig(a: std.mem.Allocator, text: []const u8) !BuildConfig {
return std.json.parseFromSliceLeaky(BuildConfig, a, text, .{});
}
/// Expand the `sx --target` shorthands we care about to a triple whose `arch`
/// (token 0) and OS substring are stable — versions/vendors are irrelevant to
/// host matching. Mirrors the table in `main.zig`; unknown values pass through
/// (already a triple, or LLVM rejects later).
fn expandTargetShorthand(raw: []const u8) []const u8 {
const eql = std.mem.eql;
if (eql(u8, raw, "macos") or eql(u8, raw, "macos-arm")) return "aarch64-apple-macos";
if (eql(u8, raw, "macos-x86")) return "x86_64-apple-macos";
if (eql(u8, raw, "linux") or eql(u8, raw, "linux-x86")) return "x86_64-linux-gnu";
if (eql(u8, raw, "linux-arm")) return "aarch64-linux-gnu";
if (eql(u8, raw, "windows")) return "x86_64-windows-msvc";
if (eql(u8, raw, "ios") or eql(u8, raw, "ios-arm")) return "aarch64-apple-ios";
if (eql(u8, raw, "ios-sim") or eql(u8, raw, "ios-sim-arm")) return "aarch64-apple-ios-simulator";
if (eql(u8, raw, "wasm") or eql(u8, raw, "wasm32") or eql(u8, raw, "emscripten")) return "wasm32-unknown-emscripten";
return raw;
}
/// `arm64` and `aarch64` name the same ISA; normalize so a `.build` may spell
/// either.
fn normalizeArch(arch: []const u8) []const u8 {
return if (std.mem.eql(u8, arch, "arm64")) "aarch64" else arch;
}
/// Canonical OS name detected from a triple by substring, mapped to the
/// `std.Target.Os.Tag` spelling used by `@tagName`. Order matters: `ios` is
/// checked before `macos`/`darwin` (Apple triples share the `apple` vendor).
fn tripleOsName(triple: []const u8) ?[]const u8 {
const has = std.mem.indexOf;
if (has(u8, triple, "ios") != null) return "ios";
if (has(u8, triple, "android") != null) return "linux"; // linux-android
if (has(u8, triple, "macos") != null or has(u8, triple, "darwin") != null) return "macos";
if (has(u8, triple, "linux") != null) return "linux";
if (has(u8, triple, "windows") != null) return "windows";
if (has(u8, triple, "emscripten") != null or has(u8, triple, "wasi") != null) return "emscripten";
return null;
}
/// True when `value` (a `.build` target shorthand or triple) names the host's
/// arch AND OS — i.e. an example built for it can actually execute here. A
/// mismatch routes the example to ir-only mode (Phase 0.1).
fn hostMatchesTarget(value: []const u8) bool {
const triple = expandTargetShorthand(value);
const dash = std.mem.indexOfScalar(u8, triple, '-') orelse return false;
const arch = normalizeArch(triple[0..dash]);
if (!std.mem.eql(u8, arch, @tagName(builtin.cpu.arch))) return false;
const os = tripleOsName(triple) orelse return false;
return std.mem.eql(u8, os, @tagName(builtin.os.tag));
}
/// `base` with `--target <t>` appended when `target` is set, else `base`
/// unchanged. `--target` is a global `sx` flag (main.zig), valid after any
/// subcommand.
fn withTarget(a: std.mem.Allocator, base: []const []const u8, target: ?[]const u8) ![]const []const u8 {
const t = target orelse return base;
const v = try a.alloc([]const u8, base.len + 2);
@memcpy(v[0..base.len], base);
v[base.len] = "--target";
v[base.len + 1] = t;
return v;
}
/// Run every `<root>/expected/*.exit` test. Appends a formatted diagnostic to
/// `failures` (owned by `fail_gpa`) for each mismatch. Returns the number of
/// tests actually run (markers whose `.sx` is missing are skipped).
@@ -218,14 +296,32 @@ fn sweepRoot(
const err_raw = readOptional(io, a, try std.fmt.allocPrint(a, "{s}/{s}.stderr", .{ exp_dir, name })) orelse "";
const ir_raw = readOptional(io, a, try std.fmt.allocPrint(a, "{s}/{s}.ir", .{ exp_dir, name }));
// An `<name>.aot` marker switches the example from JIT `sx run` to a
// build+execute flow: `sx build` links the sx object with any C
// `#source` companions into a native binary, which is then executed.
// This is the ONLY way to exercise a C-ABI symbol exported FROM sx
// (an `export` fn): in JIT mode the sx symbol lives in JIT memory and
// is invisible to a dlopen'd C dylib's flat-namespace lookup, so a
// C→sx-by-name call can only be linked ahead-of-time.
const is_aot = readOptional(io, a, try std.fmt.allocPrint(a, "{s}/{s}.aot", .{ exp_dir, name })) != null;
// Per-example directives live in an optional `<name>.build` JSON sidecar
// (BuildConfig). `aot` switches the JIT `sx run` to a build+execute flow:
// `sx build` links the sx object with any C `#source` companions into a
// native binary, which is then executed — the ONLY way to exercise a
// C-ABI symbol exported FROM sx (an `export` fn): in JIT mode the sx
// symbol lives in JIT memory and is invisible to a dlopen'd C dylib's
// flat-namespace lookup, so a C→sx-by-name call can only be linked
// ahead-of-time. `target` threads `--target` and gates host execution.
const cfg: BuildConfig = if (readOptional(io, a, try std.fmt.allocPrint(a, "{s}/{s}.build", .{ exp_dir, name }))) |raw|
parseBuildConfig(a, raw) catch |err| {
try failures.append(fail_gpa, try std.fmt.allocPrint(fail_gpa, "{s}: invalid .build config ({s})", .{ name, @errorName(err) }));
continue;
}
else
.{};
const is_aot = cfg.aot;
// An example pinned to a non-host target cannot execute here; it routes
// to ir-only mode (Phase 0.1). Until that lands, a mismatch must fail
// loudly — never silently pass.
if (cfg.target) |t| {
if (!hostMatchesTarget(t)) {
try failures.append(fail_gpa, try std.fmt.allocPrint(fail_gpa, "{s}: cross-target ir-only mode not yet implemented (target={s}, host={s}-{s})", .{ name, t, @tagName(builtin.cpu.arch), @tagName(builtin.os.tag) }));
continue;
}
}
var act_exit: u32 = undefined;
var act_out: []const u8 = undefined;
@@ -239,7 +335,7 @@ fn sweepRoot(
// linker error on stderr.
const bin_path = try std.fmt.allocPrint(a, "/tmp/sx_aot_{s}", .{name});
const build_res = std.process.run(a, io, .{
.argv = &.{ corpus_paths.sx_exe, "build", rel_path, "-o", bin_path },
.argv = try withTarget(a, &.{ corpus_paths.sx_exe, "build", rel_path, "-o", bin_path }, cfg.target),
.cwd = .{ .path = repo_root },
.timeout = deadline(io),
}) catch |err| {
@@ -270,7 +366,7 @@ fn sweepRoot(
} else {
// --- sx run ---
const run_res = std.process.run(a, io, .{
.argv = &.{ corpus_paths.sx_exe, "run", rel_path },
.argv = try withTarget(a, &.{ corpus_paths.sx_exe, "run", rel_path }, cfg.target),
.cwd = .{ .path = repo_root },
.timeout = deadline(io),
}) catch |err| {
@@ -292,7 +388,7 @@ fn sweepRoot(
var act_ir: ?[]const u8 = null;
if (ir_raw != null) {
const ir_res = std.process.run(a, io, .{
.argv = &.{ corpus_paths.sx_exe, "ir", rel_path },
.argv = try withTarget(a, &.{ corpus_paths.sx_exe, "ir", rel_path }, cfg.target),
.cwd = .{ .path = repo_root },
.timeout = deadline(io),
}) catch |err| {
@@ -419,3 +515,43 @@ test "issues corpus: every pinned issues/*.sx repro runs and matches its snapsho
defer for (failures.items) |f| std.testing.allocator.free(f);
try reportFailures("issues", ran, failures.items);
}
test "parseBuildConfig: defaults, fields, unknown key" {
var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
defer arena.deinit();
const a = arena.allocator();
const empty = try parseBuildConfig(a, "{}");
try std.testing.expect(!empty.aot);
try std.testing.expect(empty.target == null);
const aot = try parseBuildConfig(a, "{ \"aot\": true }");
try std.testing.expect(aot.aot);
try std.testing.expect(aot.target == null);
const tgt = try parseBuildConfig(a, "{ \"target\": \"x86_64-linux\" }");
try std.testing.expect(!tgt.aot);
try std.testing.expectEqualStrings("x86_64-linux", tgt.target.?);
// Unknown key is a loud error, not a silent ignore.
try std.testing.expectError(error.UnknownField, parseBuildConfig(a, "{ \"bogus\": 1 }"));
}
test "hostMatchesTarget: host arch+os matches, cross-arch does not" {
const arch = @tagName(builtin.cpu.arch);
const os = @tagName(builtin.os.tag);
// A triple built from the host's own arch + os must match.
var buf: [64]u8 = undefined;
const host_triple = std.fmt.bufPrint(&buf, "{s}-unknown-{s}", .{ arch, os }) catch unreachable;
try std.testing.expect(hostMatchesTarget(host_triple));
// A different arch never matches (same os).
const other_arch = if (builtin.cpu.arch == .x86_64) "aarch64" else "x86_64";
var buf2: [64]u8 = undefined;
const cross = std.fmt.bufPrint(&buf2, "{s}-unknown-{s}", .{ other_arch, os }) catch unreachable;
try std.testing.expect(!hostMatchesTarget(cross));
// `arm64` normalizes to `aarch64`.
try std.testing.expectEqualStrings("aarch64", normalizeArch("arm64"));
}