test: run example corpus in zig build test; sx ir → stdout
`zig build test` now runs the full examples/ + issues/ regression corpus alongside the Zig unit tests, driven by a pure-Zig test (src/corpus_run.test.zig), with no shell script in the build path. It spawns the installed `sx` per example (subprocess-isolated, per-run timeout), diffs stdout/stderr/exit and optional `sx ir` snapshots, and fails the build on any mismatch. The file list is enumerated at runtime, so new examples are covered with no test edit.

- `sx ir` / `ir-dump` now write to stdout (fd 1) instead of stderr, so the dumps can be piped/redirected.
- `zig build test -Dupdate-goldens` regenerates snapshots in-build, byte-identical to the legacy `run_examples.sh --update`; on mismatch the runner prints how to regenerate.
- run_examples.sh kept (still used by tools/verify-step.sh) and made portable to a bare macOS: timeout/gtimeout fallback, bash 3.2-safe empty-array handling.
- CLAUDE.md: document the new workflow.
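The timeout/gtimeout fallback mentioned for run_examples.sh could look roughly like the sketch below. This is not the script's actual code: `pick_timeout` and `run_guarded` are hypothetical names chosen for illustration. The assumed behaviour, taken from the CLAUDE.md text in this commit, is that the runner uses GNU `timeout` or `gtimeout` when present and otherwise runs without a wall-clock guard.

```sh
#!/usr/bin/env bash
# Sketch only: pick_timeout and run_guarded are hypothetical names,
# not the helpers actually defined in run_examples.sh.

# Print the name of an available timeout wrapper, or nothing if none exists.
# GNU coreutils provides `timeout`; Homebrew coreutils on macOS installs `gtimeout`.
pick_timeout() {
  if command -v timeout >/dev/null 2>&1; then
    echo timeout
  elif command -v gtimeout >/dev/null 2>&1; then
    echo gtimeout
  fi
}

# run_guarded SECONDS CMD [ARGS...]
# Runs CMD under the wrapper when one is found, otherwise runs it unguarded.
run_guarded() {
  secs=$1
  shift
  tool=$(pick_timeout)
  if [ -n "$tool" ]; then
    "$tool" "$secs" "$@"
  else
    "$@"
  fi
}

run_guarded 5 echo "example finished"
```

Resolving the wrapper at call time keeps the sketch short; a real runner would more likely resolve it once at startup and reuse the result for every example.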
54
CLAUDE.md
@@ -405,7 +405,10 @@ any can be advanced independently.
 After any code change:
 ```sh
 zig build # must compile
-zig build test # must pass
+zig build test # must pass — runs the Zig unit tests
+# AND the full examples/ + issues/
+# regression corpus (a failing example
+# fails the build)
 ```
 
 After completing a phase's final step, run the phase's end-to-end verification command listed in `current/PLAN.md`.
@@ -415,9 +418,27 @@ After completing a phase's final step, run the phase's end-to-end verification c
 After any compiler change:
 
 1. **Build**: `zig build && zig build test`
-2. **Run regression tests**: `bash tests/run_examples.sh`
-- Every test must show `ok` (currently 324)
-- Zero failures, zero timeouts
+- `zig build test` runs the unit tests **and** the example/issue corpus as
+one suite — a failing example fails the build. The corpus is driven by a
+pure-Zig test (`src/corpus_run.test.zig`) that spawns the installed `sx`
+binary per example (subprocess-isolated, with a per-run timeout), so no
+shell script is involved.
+2. **Regenerate snapshots**: `zig build test -Dupdate-goldens`
+- Flips the corpus test to write each example's expected
+`.exit`/`.stdout`/`.stderr` (+ `.ir` where one already exists) from
+freshly-normalized output instead of asserting against it. This is the
+preferred way to update snapshots — no shell script needed.
+- A test is still keyed off its `expected/<name>.exit` marker, so seed an
+empty marker first for a brand-new example (see "Adding a feature").
+3. **Standalone corpus run** (optional): `bash tests/run_examples.sh`
+- Runs the corpus independent of `zig build test` (used by
+`tools/verify-step.sh`). `--update` still regenerates snapshots and
+produces byte-identical output to `-Dupdate-goldens`.
+- Every test must show `ok` (currently 626); zero failures, zero timeouts.
+- Uses GNU `timeout`/`gtimeout` when present (Homebrew coreutils on macOS)
+and runs without a per-test wall-clock guard when neither is found.
+- The two normalizers (`normalize`/`normalize_ir` in the script and the
+mirrors in `src/corpus_run.test.zig`) must stay in lockstep.
 
 ### Test layout
 
@@ -445,12 +466,12 @@ dirs) under the same `XXXX-` prefix.
 
 ### Snapshot integrity
 
-**Never run `--update` while tests are failing.** The `--update` flag blindly overwrites expected output with whatever the compiler produces — including error messages. If you update snapshots during a broken state, the test suite will "pass" against garbage output and real regressions become invisible.
+**Never regenerate snapshots while tests are failing.** `-Dupdate-goldens` (and the legacy `--update`) blindly overwrite expected output with whatever the compiler produces — including error messages. If you regenerate during a broken state, the test suite will "pass" against garbage output and real regressions become invisible.
 
 Safe workflow:
-1. Fix the code until `bash tests/run_examples.sh` passes against the **existing** snapshots.
-2. Only run `--update` when you've intentionally changed output (new feature, new test, changed formatting).
-3. After `--update`, review the diff (`git diff examples/expected/ issues/expected/`) to confirm no error messages or empty output were captured.
+1. Fix the code until `zig build test` passes against the **existing** snapshots.
+2. Only run `zig build test -Dupdate-goldens` when you've intentionally changed output (new feature, new test, changed formatting).
+3. After regenerating, review the diff (`git diff examples/expected/ issues/expected/`) to confirm no error messages or empty output were captured.
 
 ### Adding a new language feature
 
@@ -461,19 +482,20 @@ There is no monolithic smoke file — each feature is its own focused example.
 2. Run it: `./zig-out/bin/sx run examples/XXXX-<category>-<name>.sx`
 3. Seed the marker and capture expected output:
 `: > examples/expected/XXXX-<category>-<name>.exit` then
-`bash tests/run_examples.sh --update`
-4. Verify all tests still pass: `bash tests/run_examples.sh`
+`zig build test -Dupdate-goldens`
+4. Verify all tests still pass: `zig build test`
 
 ### Test file roles
 
 | File | Purpose |
 |------|---------|
 | `examples/XXXX-category-name.sx` | Focused feature example — one feature per file. |
-| `examples/expected/XXXX-category-name.{exit,stdout,stderr}` | Expected exit code + the two output streams. Regenerate with `--update`. |
+| `examples/expected/XXXX-category-name.{exit,stdout,stderr}` | Expected exit code + the two output streams. Regenerate with `zig build test -Dupdate-goldens`. |
 | `examples/expected/XXXX-category-name.ir` | Optional `sx ir` snapshot — present only where lowering shape is locked. |
 | `issues/NNNN-slug.md` | Open-issue / bug-report writeup (mark RESOLVED in a banner when fixed; the `.md` stays). |
 | `issues/NNNN-slug.sx` (+ `issues/NNNN-slug/`) | The issue's minimal repro, co-located with the `.md`. A repro with an `issues/expected/NNNN-slug.exit` marker runs in the suite; unpinned ones don't. |
-| `tests/run_examples.sh` | Test runner. Scans `examples/` and `issues/`; compares stdout/stderr/exit (+ optional IR) per test. |
+| `src/corpus_run.test.zig` | The corpus runner inside `zig build test` — spawns `sx` per example, diffs stdout/stderr/exit (+ optional IR); regenerates snapshots under `-Dupdate-goldens`. |
+| `tests/run_examples.sh` | Standalone shell runner (used by `tools/verify-step.sh`); same compare + `--update` as the Zig test. |
 
 ### Unit test file convention
 
@@ -496,8 +518,8 @@ All Zig unit tests live in separate `*.test.zig` files alongside the source they
 open bug, `issues/NNNN-slug.{md,sx}` (repro co-located with the writeup).
 2. Run it: `./zig-out/bin/sx run <path>.sx`
 3. Seed the marker (`: > <root>/expected/<name>.exit`) and capture expected:
-`bash tests/run_examples.sh --update`
-4. Verify: `bash tests/run_examples.sh`
+`zig build test -Dupdate-goldens`
+4. Verify: `zig build test`
 
 ### Resolving an open issue
 
@@ -505,8 +527,8 @@ When a bug filed under `issues/NNNN-slug.{md,sx}` is fixed:
 
 1. Move the repro into the feature suite as a regression test:
 `git mv issues/NNNN-slug.sx examples/XXXX-<category>-<name>.sx`.
-2. Seed `examples/expected/XXXX-<category>-<name>.exit`, capture with `--update`,
-and review the diff.
+2. Seed `examples/expected/XXXX-<category>-<name>.exit`, capture with
+`zig build test -Dupdate-goldens`, and review the diff.
 3. Tighten the example's comment header to describe the feature (keep a one-line
 `Regression (issue NNNN)` note for provenance).
 4. Mark `issues/NNNN-slug.md` RESOLVED with a short banner (root cause + fix +
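The "Safe workflow" steps in the Snapshot integrity hunk can be sketched as a small guard that refuses to regenerate while the suite is failing. This is an illustration, not part of the commit: `safe_regen`, `TEST_CMD`, and `REGEN_CMD` are hypothetical names, and the default commands are the ones quoted in the updated CLAUDE.md.

```sh
#!/usr/bin/env bash
# Hypothetical helper: regenerate snapshots only when the suite already passes.
# TEST_CMD / REGEN_CMD are overridable so the logic can be tried without zig.
TEST_CMD=${TEST_CMD:-"zig build test"}
REGEN_CMD=${REGEN_CMD:-"zig build test -Dupdate-goldens"}

safe_regen() {
  # Step 1: the existing snapshots must pass before anything is overwritten.
  if ! $TEST_CMD; then
    echo "refusing to regenerate: suite is failing" >&2
    return 1
  fi
  # Step 2: regenerate; step 3 (reviewing the diff) is left to a human.
  $REGEN_CMD || return 1
  echo "regenerated; review: git diff examples/expected/ issues/expected/"
}
```

Calling `safe_regen` from the repository root would run the two `zig build` commands in order. The guard only encodes the ordering, so the review in step 3 stays a manual check.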