distribution/.agents/runs/2026-06-02-orchestration-planning/opus-review.md

Opus Consultation - Distribution Platform Plan Review

Context

Read-only review of the distribution platform plan package (PLAN.md, .agents/ORCHESTRATION.md, .agents/subplans/*.md, checkpoints) against the real state of the sx repository, from four lenses: design, technical sequencing, implementation feasibility, and platform risk.

Verdict in one line: the plan is well-written and the product model is mostly right, but it is sequenced as if sx were a mature systems language. It is not. sx is self-described "highly experimental" with ~37 open compiler issues and five active fix-streams, and its own governance forbids working around compiler bugs. The plan's Phase 0-1 (build a full HTTP/TLS/SQLite/crypto/JSON/archive stdlib in sx first) is the single largest risk and is underestimated by roughly an order of magnitude.

This file is a consultation report, not an implementation plan. Findings are severity-ordered with file references. The last sections cover where Opus would overrule, refine, or ask, and a layout read of the rejected mock (Opus's authority domain).


Findings (severity-ordered)

P0 - The two repos have incompatible governance; nobody owns the sx changes

  • Distribution ORCHESTRATION.md:24 makes Opus the only code writer, on named git branches, under lease timeouts. But the work in subplan 01 is sx compiler + stdlib work, and sx/CLAUDE.md:5-65 ("IMPASSIBLE RULES") mandates: the moment you hit a compiler bug, STOP, file an issue, end the session, wait for a fix in another session - do not work around it.
  • Building HTTP/TLS/SQLite/zip/crypto on an experimental compiler with 37 open issues will hit unimplemented/buggy paths repeatedly (FFI itself is an active, buggy stream: see sx/issues/0043,0052,0057). Under sx's own rules, the distribution "Opus implementation" phase will keep hard-stopping.
  • The two checkpoint systems (distribution .agents/runs/* vs sx current/CHECKPOINT-*.md) have no defined handoff. There is no answer to "who advances sx, in which repo's process, under whose lease."
  • This is the gating decision for the whole program and it is not addressed anywhere in the plan.

P0 - Distribution workspace is not a git repo (already flagged, still true)

  • subplan 08:194-198, CHECKPOINT.md:48-53, checkpoint.json:8-11. The entire orchestration is branch-based; it cannot start. Snarky P0 is correct. Trivial to fix (git init + baseline commit) but it blocks everything, so it should be the literal first action.

P1 - Phase 0 "language prerequisites" are largely gold-plating, not prerequisites

  • PLAN.md:159-190, subplan 01:23-52 require pub exports, alias imports, and pub print :: core.print namespace re-export before product code.
  • pub/module-member visibility does not exist in sx (specs.md:550 only has import-scoped impl visibility) and is on no active sx workstream. So the plan's first "blocking" slice is net-new parser/AST/resolver/module-graph compiler work that nobody is scheduled to do.
  • But sx already has namespaced imports (math :: #import "...") and import-scoped impl visibility. The product can be built without pub. Treating it as a prerequisite front-loads hard compiler work and delays any product signal. (See "Opus would overrule," below.)

P1 - The server substrate that Phase 1 assumes does not exist, and the one primitive that does is wrong-platform

  • Confirmed absent in library/modules/: HTTP, TLS, SQLite, JSON, SHA-256/crypto, base64/hex, RFC3339/time, CLI parser, config, zip/tar/gzip. Subplan 01 Slices 6-8 (subplan 01:104-145) describe all of these as if they are small deliverables. Each is a real subsystem.
  • library/modules/socket.sx:1-2,23-29 is the only networking code: raw POSIX, macOS-only (sin_len, macOS SO_REUSEADDR=0x4), blocking read/write, no event loop. The deployment target is Linux Docker on UGREEN NAS (PLAN.md:147-158), where sockaddr_in has no sin_len and constants differ. The HTTP foundation must be (re)written Linux-correct before distd can run where it ships.
  • Implication: every Phase-1 "slice" is actually a from-scratch systems library on an unstable compiler. This is the bulk of the project, mislabeled as foundation.
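
To make the platform gap concrete, here is a minimal sketch (in Python rather than sx, purely for illustration) of the two sockaddr_in layouts. The helper names are hypothetical and none of this is taken from socket.sx. The Linux SO_REUSEADDR value shown is the one used on common architectures such as x86-64 and arm64; treat it as an assumption for anything else.

```python
import socket
import struct

AF_INET = 2  # same numeric value on macOS and Linux

# SO_REUSEADDR differs: 0x4 on macOS (as socket.sx hardcodes),
# 2 on common Linux architectures (assumption: x86-64 / arm64).
SO_REUSEADDR_BY_PLATFORM = {"darwin": 0x0004, "linux": 2}


def pack_sockaddr_in_bsd(port: int, addr: bytes) -> bytes:
    """BSD/macOS layout: sin_len (u8), sin_family (u8), port, addr, 8 pad bytes."""
    return struct.pack("=BB", 16, AF_INET) + struct.pack("!H", port) + addr + bytes(8)


def pack_sockaddr_in_linux(port: int, addr: bytes) -> bytes:
    """Linux layout: sin_family (u16, host byte order), port, addr, 8 pad bytes."""
    return struct.pack("=H", AF_INET) + struct.pack("!H", port) + addr + bytes(8)


if __name__ == "__main__":
    loopback = socket.inet_aton("127.0.0.1")
    print(pack_sockaddr_in_bsd(8080, loopback).hex())
    print(pack_sockaddr_in_linux(8080, loopback).hex())
```

Both structs are 16 bytes and agree from byte 2 onward; only the first two bytes differ. A struct built for one platform is therefore accepted by the compiler but misread by the other platform's kernel, which is why the socket layer has to be made Linux-correct rather than merely recompiled.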

P1 - IPA/APK validation needs a ZIP reader that does not exist

  • subplan 05:19-48 requires reading the zip structure of IPA (Info.plist) and APK (AndroidManifest) to validate bundle id / version / signature. IPA and APK are ZIP containers. There is no std.archive/zip reader (subplan 01:137), and AndroidManifest.xml inside an APK is binary XML (AXML), not text - a non-trivial parser on its own. This is materially harder than the one-line "manifest package id" bullet implies.
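
A minimal sketch of why the manifest step is two problems, not one. It is written in Python for illustration only; the function name and return strings are hypothetical. It opens the container as a ZIP and checks whether AndroidManifest.xml starts with the Android binary-XML chunk header (chunk type 0x0003, header size 8, little-endian) rather than text.

```python
import io
import struct
import zipfile

AXML_CHUNK_TYPE = 0x0003   # RES_XML_TYPE in Android's resource chunk format
AXML_HEADER_SIZE = 8


def manifest_kind(apk_bytes: bytes) -> str:
    """Classify AndroidManifest.xml inside an APK (which is a ZIP container)."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as zf:
        try:
            data = zf.read("AndroidManifest.xml")
        except KeyError:
            return "missing"
    if len(data) >= 4:
        chunk_type, header_size = struct.unpack_from("<HH", data, 0)
        if chunk_type == AXML_CHUNK_TYPE and header_size == AXML_HEADER_SIZE:
            return "binary-axml"   # needs a real AXML parser to get the package id
    if data.lstrip().startswith(b"<"):
        return "text-xml"
    return "unknown"
```

Even this classification step already depends on a working ZIP central-directory reader and inflate. Extracting the package id from the "binary-axml" case additionally needs a string-pool and chunk walker, which the sketch deliberately does not attempt.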

P1 - Release/Channel model is internally contradictory and cannot support rollback as specified

  • Snarky already flagged the redundancy (Release.channel at subplan 02:43-54 vs Channel.current_release_id at subplan 02:56-61). Concur.
  • Beyond that: Channel stores only current_release_id with no promotion history. dist release rollback "moves a channel pointer to the previous valid release" (subplan 03:45-47) - but "previous" is unknowable from a single pointer. Rollback requires a promotion/channel-history table (or deriving it from audit events). The data model as drawn cannot do what the CLI promises.
  • No release state machine is defined (draft/validating/published/rejected/superseded). Snarky P1 #3 is correct and should block domain implementation.
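
To illustrate the rollback point, here is a minimal sketch of a channel that keeps promotion history instead of a single pointer. It is Python, not sx, and every name in it (Channel, promoted, audit, is_valid) is hypothetical. It shows one possible shape, not the plan's data model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Channel:
    name: str
    # Stack of promoted release ids, oldest first; the last entry is current.
    promoted: List[str] = field(default_factory=list)
    # Append-only (event, release_id) log, kept separately from the stack.
    audit: List[Tuple[str, str]] = field(default_factory=list)

    def current(self) -> Optional[str]:
        return self.promoted[-1] if self.promoted else None

    def promote(self, release_id: str) -> None:
        self.promoted.append(release_id)
        self.audit.append(("promote", release_id))

    def rollback(self, is_valid: Callable[[str], bool]) -> str:
        """Move the pointer to the most recent earlier release that is still valid."""
        for i in range(len(self.promoted) - 2, -1, -1):
            if is_valid(self.promoted[i]):
                del self.promoted[i + 1:]
                self.audit.append(("rollback", self.promoted[i]))
                return self.promoted[i]
        raise LookupError("no previous valid release to roll back to")
```

With only current_release_id stored, the loop in rollback has nothing to walk. The history (or an equivalent audit-derived view) is what makes "previous valid release" a defined value.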

P1 - The multi-agent orchestration conflicts with the operator's own standing rules

  • ORCHESTRATION.md + subplan 08 are built on multiple named subagents (Codex/Snarky/Opus) invoked through an opus-runner MCP plugin (ORCHESTRATION.md:136-146), which is unbootstrapped ("may need Codex reload before MCP tools are available," CHECKPOINT.md:52-53).
  • The operator's global rules forbid subagent fan-out and require work in a single visible thread. The orchestration design is therefore in direct tension with how the operator actually wants work done. Either the global rule is waived for this project, or the multi-agent machinery should be dropped in favor of sequential single-thread phases with the same run-dir artifacts. This needs an explicit decision before more tooling is built.

P2 - Milestone 1 is the whole project, not a milestone

  • PLAN.md:259-272 bundles: sx language work + full stdlib + server + SQLite + APK and IPA validators + channels + install pages + admin UI + Docker/NAS. On an experimental language. Snarky P2 #8 is correct; I'd escalate it to P1 for scheduling purposes. Replace with a "walking skeleton" (below).

P2 - Validation policy mixes requirements with wishful tooling

  • subplan 05 is full of "if tool support exists," "when provided," "malware scan placeholder" (05:55-89). Concur with Snarky P2 #7: for v1 mark each check required | warning | informational | not-supported and delete the rest. Notarization/authenticode/malware-scan are out of reach for v1 and should be not-supported, not aspirational statuses.
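
One way to pin the policy down is a flat table in which each check carries exactly one of the four levels. This is a Python sketch for illustration: the check names are hypothetical, and marking the bundle-id check as required is an assumption. Only the not-supported entries come from the recommendation above.

```python
from enum import Enum
from typing import Dict


class CheckLevel(Enum):
    REQUIRED = "required"
    WARNING = "warning"
    INFORMATIONAL = "informational"
    NOT_SUPPORTED = "not-supported"


# Hypothetical v1 policy table; check names are illustrative only.
V1_POLICY: Dict[str, CheckLevel] = {
    "bundle_id_matches": CheckLevel.REQUIRED,      # assumption
    "notarization": CheckLevel.NOT_SUPPORTED,
    "authenticode": CheckLevel.NOT_SUPPORTED,
    "malware_scan": CheckLevel.NOT_SUPPORTED,
}


def release_blocked(results: Dict[str, bool]) -> bool:
    """A release is blocked only by a failed check whose level is REQUIRED."""
    return any(
        V1_POLICY.get(name) is CheckLevel.REQUIRED and not passed
        for name, passed in results.items()
    )
```

With a table like this there is no room for "if tool support exists": a check is either in the table with a level or it is not part of v1.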

P2 - iOS Enterprise vs MDM are conflated

  • subplan 04:50-52 / 05:24-28 treat "Enterprise/MDM" as one mode serving "a signed HTTPS manifest plist for enrolled devices." Those are two different mechanisms: itms-services:// in-house (Apple Enterprise Program, cert-trust on device) vs MDM-managed InstallApplication. The plist/host requirements differ. The product intent (don't imply a normal iPhone can sideload) is correct and is the strongest part of the plan - just split the two modes.
  • Also: distd cannot self-satisfy the HTTPS requirement (no TLS); it relies on the reverse proxy. dist doctor's "HTTPS base URL" check (subplan 03:14-16) can therefore only validate config/reachability, not terminate TLS itself. State that explicitly so Enterprise mode isn't marked "ready" when it isn't.
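
For reference, the in-house (itms-services://) path can be sketched as follows. This is Python for illustration with hypothetical function names and example URLs. It covers only the Enterprise manifest path; the MDM InstallApplication path is a separate mechanism and is not modeled here.

```python
import plistlib
from urllib.parse import quote, urlsplit


def build_manifest(ipa_url: str, bundle_id: str, bundle_version: str, title: str) -> bytes:
    """Build an in-house distribution manifest plist pointing at an HTTPS-hosted IPA."""
    if urlsplit(ipa_url).scheme != "https":
        raise ValueError("IPA URL must be https")
    manifest = {
        "items": [{
            "assets": [{"kind": "software-package", "url": ipa_url}],
            "metadata": {
                "bundle-identifier": bundle_id,
                "bundle-version": bundle_version,
                "kind": "software",
                "title": title,
            },
        }]
    }
    return plistlib.dumps(manifest)


def install_link(manifest_url: str) -> str:
    """Build the itms-services link that an install page would expose."""
    if urlsplit(manifest_url).scheme != "https":
        raise ValueError("manifest URL must be https")
    return "itms-services://?action=download-manifest&url=" + quote(manifest_url, safe="")
```

Both URLs must be HTTPS as seen by the device. Since distd has no TLS, a check like the one above can only confirm that the configured base URL uses https, not that the reverse proxy actually serves it.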

P2 - Subplan 01 still points implementers at deleted PLAN sections

  • subplan 01:18-21 tells implementers to read Standard Library API Surface and Detailed Std Struct And Method Sketches, removed from PLAN.md. Snarky P1 #4 confirmed; fix or the first sx session starts by chasing ghosts.

P3 - CI manifest schema / idempotency undefined

  • subplan 03:48-63 lists fields but no JSON schema, example, idempotency key, or rerun semantics. "CI is the primary writer" and "releases are immutable" (PLAN.md:38-42) make rerun-of-same-build behavior a v1 correctness concern, not a later detail. Define an idempotency key (app+version+build digest) now.
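
A minimal sketch of what such a key and its rerun semantics could look like, in Python for illustration. The function names, the length-prefixed encoding, and the in-memory registry are all assumptions, not part of the plan.

```python
import hashlib
from typing import Dict, Tuple


def idempotency_key(app: str, version: str, build_digest: str) -> str:
    """Derive a stable key from app + version + build digest.

    Fields are length-prefixed so ("ab", "c") and ("a", "bc") cannot collide.
    """
    h = hashlib.sha256()
    for part in (app, version, build_digest.lower()):
        raw = part.encode("utf-8")
        h.update(len(raw).to_bytes(4, "big"))
        h.update(raw)
    return h.hexdigest()


def publish(registry: Dict[str, dict], app: str, version: str, build_digest: str) -> Tuple[str, bool]:
    """Return (key, created). A rerun of the same build is a no-op, not a new release."""
    key = idempotency_key(app, version, build_digest)
    created = key not in registry
    if created:
        registry[key] = {"app": app, "version": version, "build_digest": build_digest}
    return key, created
```

The open design question the plan still has to answer is what happens when app and version match but the digest differs. With immutable releases that is arguably a conflict to reject rather than an update, but the sketch does not decide it.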

Where Opus would overrule, refine, or ask Snarky

Overrule - drop Phase 0 language work from the critical path. pub/alias-imports/namespace-re-export (PLAN.md:194-196, subplan 01:23-52) are ergonomics, not blockers. sx already has namespaced imports and impl visibility. Building net-new compiler features before the product compiles once is exactly backwards on an unstable compiler. Opus would build the product against the language as-is and only add pub if a concrete name-leak actually bites. This reorders the whole plan and Opus would hold this line against a "do it properly first" push.

Overrule - Milestone 1 is replaced by a walking skeleton. Opus's M1: dist ci publish --local-store writes a content-addressed artifact to disk + an in-memory/JSON-file domain model + one platform's metadata read (APK zip entry only) + a JSON output contract. No HTTP, no SQLite, no TLS, no UI. This exercises the real sx pain (fs, zip, hashing, JSON, CLI) on the smallest surface and produces a runnable artifact in days, not a quarter. SQLite/HTTP/admin-UI become M2+, each gated on the prior actually running on Linux.
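
The content-addressed store at the core of that skeleton is small. Here is a minimal Python sketch for illustration; the function name and the sha256/<2-char prefix>/<digest> directory layout are assumptions, not something the plan specifies.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_artifact(store_root: Path, src: Path) -> Path:
    """Copy src into the store under its SHA-256 digest and return the stored path."""
    store_root.mkdir(parents=True, exist_ok=True)
    h = hashlib.sha256()
    # Stream into a temp file in the same directory tree so the final rename is atomic.
    with tempfile.NamedTemporaryFile(dir=store_root, delete=False) as tmp, open(src, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            tmp.write(chunk)
    digest = h.hexdigest()
    dest = store_root / "sha256" / digest[:2] / digest
    dest.parent.mkdir(parents=True, exist_ok=True)
    if dest.exists():
        os.unlink(tmp.name)           # same content already stored; keep the original
    else:
        os.replace(tmp.name, dest)    # atomic move within the same filesystem
    return dest
```

Storing the same bytes twice yields the same path, which is what makes reruns of a publish cheap and makes the artifact id verifiable from the file alone. In sx the same shape needs only file I/O, a directory create, a rename, and a streaming hash, which is why it is a good first probe of the stdlib.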

Refine - make the SHA-256 and zip decisions explicitly FFI-first. Pure-sx streaming SHA-256 and inflate are real projects and will trip compiler bugs. Bind libsqlite3, a crypto lib (CommonCrypto on mac / OpenSSL on Linux), and miniz/libzip via #foreign first; only reimplement in sx if FFI proves worse. This trades "pure sx" purity for actually shipping, and keeps bug-surface on the linker, not the comptime evaluator.

Refine - model promotion history, not just a channel pointer. Add a channel_release history (or derive rollback from audit). Without it, rollback is undefined.

Ask Snarky to decide before any code:

  1. Is the goal to ship a NAS release console or to mature sx? They imply opposite sequencing. If shipping is the goal, the sx-first framing is wrong.
  2. Is the multi-agent/opus-runner machinery actually required, given the operator's single-thread rule? If not, delete it and keep only run-dir artifacts.
  3. Linux-first or macOS-first for distd runtime? The only socket code is macOS-only; the deploy target is Linux. Pick one and make the socket layer match it before "server skeleton."
  4. For v1, is FFI-to-system-libs acceptable, or is "pure sx stdlib" a hard product constraint? This single answer changes the size of the project by months.

Layout read on the rejected mock (Opus authority domain)

The mock (index.html, styles.css) is not weak at the system level: warm off-white token palette, semantic green/amber/red with soft fills, Inter, a real shadow token, a clean CSS-grid shell (224px sidebar + 64px topbar, styles.css:49-69). The information architecture (apps list + detail, release, install, tokens, audit, settings) matches the brief. So the rejection is almost certainly about finish and tone, not structure:

  1. Placeholder iconography reads as a wireframe. Single letters as icons everywhere: brand "d," nav glyphs A/R/I/T/L/S (index.html:41-64), top actions literally rendering "R" for sync and "!" for notifications (index.html:26-31), 2-letter app monograms, and a QR faked as empty <span>s (index.html:439-446). Nothing signals "unfinished" harder than letter-in-a-box icons. A minimal real stroke-icon set would lift perceived quality more than any layout change.
  2. Tone fights the brief. The brief wants "operational SaaS density, quiet, no oversized marketing hero" (subplan 06:64-72). The mock leans airy/card-y with big metric numbers and a phone-mockup hero on the Install tab (index.html:362-382) - that's the marketing-flavored hero the brief warns against. A release console likely wants a denser, cooler, more devtools/terminal register than warm paper.
  3. Six fake views via class-toggle inflate surface area that all has to look finished at once; it spreads the polish too thin.

Opus's design direction for opus/redesign-distribution-mock: keep the 3-zone shell and IA; (a) replace every letter-glyph with a real minimal icon set; (b) shift to a cooler, denser operator palette and a tighter type/spacing scale; (c) demote the Install phone-hero to an inline preview, not a centerpiece; (d) make the first viewport land on real working content (apps + latest release), per subplan 06:94-99; (e) keep the iOS TestFlight/Enterprise/Artifact distinction loud - it's the one thing the current mock gets exactly right (index.html:384-437).


Recommended re-scope (in order)

  1. git init the distribution workspace; baseline commit. (Unblocks everything.)
  2. Snarky decides the four questions above (ship-vs-mature, agents-yes/no, Linux-vs-mac, FFI-vs-pure-sx). These are gating.
  3. Drop Phase 0 language work from the critical path.
  4. Build the walking-skeleton M1 (dist ci publish --local-store, content-addressed fs store, JSON contract, APK zip-entry read) against sx as-is, FFI-first for hash/zip.
  5. Only then add SQLite (FFI), then the HTTP server (Linux-correct sockets), then install pages, then admin UI - each gated on the prior running on the actual Linux/NAS target.
  6. In parallel and independently: the mock redesign (no backend dependency), under Opus design direction above.

Verification (how to sanity-check the re-scope, read-only)

  • Confirm pub truly absent: sx ir / compile a probe with pub x :: 1 and observe parser rejection (don't land it - just confirm scope of work).
  • Confirm Linux socket gap: attempt sx build --target linux of a trivial socket.sx consumer and observe sin_len mismatch.
  • Confirm error model present (it is): sx/CHECKPOINT-ERR.md:15, plus the examples/10xx-errors-* suite passing (341 tests, CHECKPOINT-ERR.md:42-43).
  • For the redesign: build the static mock and screenshot the first viewport on desktop + mobile widths to validate the "working content first, no hero" acceptance (subplan 06:94-99).