Files
distribution/.agents/runs/2026-06-02-observability-redesign/opus-proposal.md

10 KiB

Opus Proposal - Local Agent Observability Dashboard Redesign

1. Verdict

Redesign is warranted, and the problem is structural, not cosmetic. The current page is a symmetric four-metric grid feeding two equal-weight panels. That layout treats every fact as equally important, which is wrong for a sequential, single-active orchestration model.

Keep the hard constraints: static HTML/CSS/vanilla JS, single /api/status poll, no build step, and no dependencies. The redesign needs no breaking API changes, only small additive, server-computed fields to make liveness and lease expiry honest rather than guessed client-side.

The reorientation:

  • Replace the equal metric grid with a "Now" command band that answers active run, phase, branch, and next action in one strip.
  • Anchor the band with one system-health token.
  • Make agent liveness a live ticking signal: lease countdown plus heartbeat age.
  • Demote the progress tail from page-dominator to bounded, auto-sticking log.
  • Remove landing-page tells: heavy shadows, shouting weights, eyebrow plus large heading, and cards-inside-cards.

Implementation is blocked until the workspace is a git repo. This proposal is approvable now and implementable once git init and a baseline commit exist.

2. Information Architecture

Three tiers, in strict priority order. Tiers 0 through 2 should sit above the fold on desktop.

Tier 0 - Chrome/status bar:

  • Workspace basename.
  • Connection state.
  • Freshness, ticking as "updated 3s ago".
  • Tail selector.
  • Refresh.
  • Auto-refresh pause/play.

Tier 1 - Now band:

  • System token: WORKING, BLOCKED, STALLED, STALE, or IDLE.
  • Active run: run id, phase stepper, branch chip.
  • Next action: prominent prose label.

Tier 2 - Operational split:

  • Left rail around 340px:
    • Active agent pinned at the top with status dot, role, lease countdown, and heartbeat age.
    • Agents for the active run.
    • Compact runs list with active run highlighted.
  • Right column:
    • Conditional blockers strip at the top when blockers exist.
    • Progress tail with path, "last N of M", and last-modified age.

Remove the standalone workspace metric and the four-up summary grid. Workspace moves to the status bar; the grid is absorbed into the Now band and health token.

3. Visual Layout Direction

Desktop layout:

+--------------------------------------------------------------------+
| distribution        updated 3s ago        Tail [80]  refresh pause |
+--------------------------------------------------------------------+
| WORKING | run 2026-06-02-obs...  4/10 Opus proposal | NEXT ACTION |
|         | branch none                                  Wait on ... |
+----------------------------+---------------------------------------+
| ACTIVE AGENT               | BLOCKERS, if any                      |
| opus running               +---------------------------------------+
| lease 28:14 heartbeat 4s   | PROGRESS runs/.../progress.log        |
|                            | +-----------------------------------+ |
| AGENTS                     | | 14:02:11 proposal requested       | |
| codex running              | | 14:02:48 reading brief.md         | |
| snarky completed           | | ...                               | |
|                            | +-----------------------------------+ |
| RUNS                       |                         expand  new  |
| active run                 |                                       |
+----------------------------+---------------------------------------+

Grid and sizing:

  • Shell: width: min(1440px, 100%), padding 20px.
  • Status bar: sticky, top 0, about 44px, hairline bottom border.
  • Now band: single panel with grid-template-columns: 150px minmax(0, 1fr) minmax(300px, 1fr).
  • Main split: grid-template-columns: minmax(300px, 340px) minmax(0, 1fr).
  • Progress body: max-height: clamp(320px, 52vh, 720px); overflow: auto.
  • Expanded progress body: about 80vh.

Visual language:

  • Use one elevation level. Prefer hairline borders over heavy shadows.
  • Delete nested bordered boxes; internal sections use dividers and spacing.
  • Reduce blanket font-weight: 800. Labels around 600, values 400, system token 700.
  • Use monospace for machine facts: run ids, branches, paths, timestamps, counts, lease values, heartbeat values.
  • Keep prose in the sans stack.
  • Keep the warm-paper base and the current semantic colors, but assign strict meaning:
    • green: healthy, live, completed
    • blue: working or active focus
    • amber: warning, expiring lease, aging heartbeat, stale data
    • red: blocked, expired, dead, fetch error
    • muted grey: idle, recorded, none, absent
  • Use 8px status dots for rows. Reserve pills for the system token and active or recorded run status.
  • Keep the progress log dark as the one large terminal-like surface.

Phase stepper:

  • Render the ten canonical sequential-flow steps as a segmented bar.
  • Filled segments are done, current segment is ringed, future segments are hollow.
  • Show N / 10 - <label>.
  • If phase text is unknown, degrade to label-only text.

4. Interactions And States

System health token priority:

BLOCKED  red    blocker_count > 0
STALLED  red    active agent lease_expired and not blocked
STALE    amber  data age is stale or active heartbeat is stale
WORKING  blue   active-run agent is running with healthy unexpired lease
IDLE     grey   no active run or no leased agent

Use aria-live="polite" for health transitions.

Liveness:

  • Add a 1-second local ticker independent from the 5-second poll.
  • Lease countdown displays mm:ss remaining.
  • Lease at or below two minutes becomes amber.
  • Expired lease becomes red and displays elapsed expiry age.
  • Heartbeat age displays as heartbeat 4s.
  • Heartbeat over 60 seconds becomes amber.
  • Heartbeat over 180 seconds becomes red.
  • Data freshness in the status bar ticks locally.
  • The ticker only updates timestamp text nodes and never re-renders lists.

Polling and errors:

  • Keep the last good render on fetch error.
  • Flip the status bar to red error/stale state.
  • Do not replace the progress tail with an error string.
  • Pause timers while the tab is hidden and resume with an immediate refetch on focus.

Selection and affordances:

  • Run rows should be selectable.
  • Selecting a run refocuses progress and agent details if the API supports ?run=.
  • Progress auto-sticks to bottom only if the user was already at bottom.
  • If new content arrives while scrolled up, show a "new" jump button.
  • Add expand/collapse for the progress log.
  • Add click-to-copy for run id, branch, and progress path with a brief copied state.

Empty and degraded states:

  • No runs.
  • No active run.
  • No progress file.
  • No lease recorded: show lease -, muted.
  • No heartbeat recorded: show heartbeat -, muted.
  • Unknown phase: label-only stepper.
  • Checkpoint parse error: surface it in the blockers strip.

5. API Or Data Shape Changes

All requests are additive and optional. The dashboard must render correctly if any are absent.

Per agent:

{
  "lease_remaining_ms": 1694000,
  "heartbeat_age_ms": 4000,
  "lease_expired": true,
  "is_active": true
}

Summary:

{
  "active_role": "opus",
  "health": "working"
}

Active progress:

{
  "modified_at": "2026-06-02T14:05:03Z",
  "total_lines": 342,
  "truncated": true
}

New optional query parameter:

/api/status?tail=N&run=<id>

When run is supplied, active_progress reflects that run instead of the active one. Default behavior remains unchanged.

Optional manager-written fields in state.json:

{
  "phase_index": 4,
  "phase_total": 10,
  "updated_at": "2026-06-02T14:05:03Z"
}

Suggested server tweak: in readProgress, pick the newest candidate file by mtime rather than the first file in a fixed list.

6. Mobile Behavior

At widths up to 980px, switch to one column.

Stacking order:

  1. Sticky status bar, compressed to workspace basename.
  2. Now band, collapsed to rows: token, run plus stepper, next action.
  3. Blockers, if any.
  4. Active agent.
  5. Progress tail, collapsed by default to about 42vh.
  6. Runs accordion, collapsed.

At widths up to 560px:

  • Toolbar becomes a compact wrapping row: tail selector, refresh, pause.
  • Tap targets stay at least 44px.
  • The segmented phase bar degrades to text.
  • Mono ids and paths ellipsize.
  • Prose wraps.
  • No text overlap.

7. Implementation Notes

Likely write scope:

  • .agents/observability/index.html
  • .agents/observability/styles.css
  • .agents/observability/app.js
  • .agents/scripts/observe.mjs for additive API fields

Implementation guidance:

  • No frameworks and no build step.
  • Use a small inline SVG sprite or simple text/icon fallbacks.
  • Keep two timers:
    • setInterval(poll, 5000) rebuilds data.
    • setInterval(tick, 1000) updates timestamp/countdown text only.
  • Centralize constants:
    • POLL_MS
    • DATA_STALE_MS
    • LEASE_WARN_MS
    • HEARTBEAT_WARN_MS
    • HEARTBEAT_DEAD_MS
  • Add one pure computeHealth(data) helper.
  • Add formatAge(ms) and formatCountdown(ms) helpers.
  • Prefer document.createElement and textContent for rows instead of template-literal innerHTML.
  • Preserve progress scroll position and bottom-stick behavior across renders.
  • Remember selected run id.
  • Treat every optional API field as missing-safe.
  • Add role="status" to freshness and aria-live="polite" to health and blocker changes.
  • Gate any pulse animation behind prefers-reduced-motion.
  • Keep the CSS token approach and add a mono font token.

8. Concerns

  1. Implementation is currently blocked by process. The workspace is not a git repository, and Opus implementation must happen on a named branch from a clean baseline.
  2. Liveness is only honest if the manager writes lease_expires_at and heartbeat_at. Missing values must render as unknown, not healthy.
  3. Heartbeat cadence is unspecified. Proposed defaults are warning at 60 seconds and dead at 180 seconds, but the manager should pick a real cadence.
  4. Phase stepper depends on canonical phase strings. Unknown phases must degrade gracefully.
  5. Server-computed millisecond deltas are preferred to avoid client clock drift.
  6. Polling cannot show sub-5-second events, which is acceptable for human-paced orchestration.
  7. Avoid adding timelines, charts, or history graphs in the first pass.
  8. Non-active run progress depends on the optional ?run= parameter. Without it, run selection degrades to metadata-only.