Opus Proposal - Local Agent Observability Dashboard Redesign
1. Verdict
Redesign is warranted, and the problem is structural, not cosmetic. The current page is a symmetric four-metric grid feeding two equal-weight panels. That layout treats every fact as equally important, which is wrong for a sequential, single-active orchestration model.
Keep the hard constraints: static HTML/CSS/vanilla JS, single /api/status
poll, no build step, and no dependencies. The redesign needs no breaking API
changes, only small additive, server-computed fields to make liveness and lease
expiry honest rather than guessed client-side.
The reorientation:
- Replace the equal metric grid with a "Now" command band that answers active run, phase, branch, and next action in one strip.
- Anchor the band with one system-health token.
- Make agent liveness a live ticking signal: lease countdown plus heartbeat age.
- Demote the progress tail from page-dominator to bounded, auto-sticking log.
- Remove landing-page tells: heavy shadows, shouting weights, eyebrow plus large heading, and cards-inside-cards.
Implementation is blocked until the workspace is a git repo. This proposal is
approvable now and implementable once git init and a baseline commit exist.
2. Information Architecture
Three tiers, in strict priority order. Tiers 0 through 2 should sit above the fold on desktop.
Tier 0 - Chrome/status bar:
- Workspace basename.
- Connection state.
- Freshness, ticking as "updated 3s ago".
- Tail selector.
- Refresh.
- Auto-refresh pause/play.
Tier 1 - Now band:
- System token: WORKING, BLOCKED, STALLED, STALE, or IDLE.
- Active run: run id, phase stepper, branch chip.
- Next action: prominent prose label.
Tier 2 - Operational split:
- Left rail around 340px:
  - Active agent pinned at the top with status dot, role, lease countdown, and heartbeat age.
  - Agents for the active run.
  - Compact runs list with active run highlighted.
- Right column:
  - Conditional blockers strip at the top when blockers exist.
  - Progress tail with path, "last N of M", and last-modified age.
Remove the standalone workspace metric and the four-up summary grid. Workspace moves to the status bar; the grid is absorbed into the Now band and health token.
3. Visual Layout Direction
Desktop layout:
+--------------------------------------------------------------------+
| distribution   updated 3s ago          Tail [80]   refresh   pause |
+--------------------------------------------------------------------+
| WORKING | run 2026-06-02-obs...  4/10 Opus proposal | NEXT ACTION  |
|         | branch none                               | Wait on ...  |
+----------------------------+---------------------------------------+
| ACTIVE AGENT               | BLOCKERS, if any                      |
| opus  running              +---------------------------------------+
| lease 28:14  heartbeat 4s  | PROGRESS  runs/.../progress.log       |
|                            | +-----------------------------------+ |
| AGENTS                     | | 14:02:11 proposal requested       | |
| codex   running            | | 14:02:48 reading brief.md         | |
| snarky  completed          | | ...                               | |
|                            | +-----------------------------------+ |
| RUNS                       |                      expand     new   |
| active run                 |                                       |
+----------------------------+---------------------------------------+
Grid and sizing:
- Shell: width: min(1440px, 100%), padding 20px.
- Status bar: sticky, top 0, about 44px, hairline bottom border.
- Now band: single panel with grid-template-columns: 150px minmax(0, 1fr) minmax(300px, 1fr).
- Main split: grid-template-columns: minmax(300px, 340px) minmax(0, 1fr).
- Progress body: max-height: clamp(320px, 52vh, 720px); overflow: auto.
- Expanded progress body: about 80vh.
Visual language:
- Use one elevation level. Prefer hairline borders over heavy shadows.
- Delete nested bordered boxes; internal sections use dividers and spacing.
- Reduce blanket font-weight: 800. Labels around 600, values 400, system token 700.
- Use monospace for machine facts: run ids, branches, paths, timestamps, counts, lease values, heartbeat values.
- Keep prose in the sans stack.
- Keep the warm-paper base and the current semantic colors, but assign strict meaning:
  - green: healthy, live, completed
  - blue: working or active focus
  - amber: warning, expiring lease, aging heartbeat, stale data
  - red: blocked, expired, dead, fetch error
  - muted grey: idle, recorded, none, absent
- Use 8px status dots for rows. Reserve pills for the system token and active or recorded run status.
- Keep the progress log dark as the one large terminal-like surface.
Phase stepper:
- Render the ten canonical sequential-flow steps as a segmented bar.
- Filled segments are done, current segment is ringed, future segments are hollow.
- Show N / 10 - <label>.
- If phase text is unknown, degrade to label-only text.
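The stepper rules above can be sketched as a small pure helper. This is a minimal sketch, not the dashboard's actual code: the names phaseSegments and phaseLabel are hypothetical, and it assumes the phase index is 1-based (so 4 means the fourth of ten steps).

```javascript
// Hypothetical helper: returns one state per segment, or null when the
// phase index is unknown so the caller can fall back to label-only text.
// Assumes a 1-based phase index.
function phaseSegments(phaseIndex, phaseTotal = 10) {
  const valid =
    Number.isInteger(phaseIndex) && phaseIndex >= 1 && phaseIndex <= phaseTotal;
  if (!valid) return null;
  return Array.from({ length: phaseTotal }, (_, i) => {
    const step = i + 1;
    if (step < phaseIndex) return "done";       // filled segment
    if (step === phaseIndex) return "current";  // ringed segment
    return "future";                            // hollow segment
  });
}

// Hypothetical helper: "N / total - label", or just the label when unknown.
function phaseLabel(phaseIndex, label, phaseTotal = 10) {
  return phaseSegments(phaseIndex, phaseTotal)
    ? `${phaseIndex} / ${phaseTotal} - ${label}`
    : label;
}
```

Returning null for an unknown phase keeps the degrade-to-text decision in one place instead of scattering index checks through the render code.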
4. Interactions And States
System health token priority:
BLOCKED   red     blocker_count > 0
STALLED   red     active agent lease_expired and not blocked
STALE     amber   data age is stale or active heartbeat is stale
WORKING   blue    active-run agent is running with a healthy, unexpired lease
IDLE      grey    no active run or no leased agent
Use aria-live="polite" for health transitions.
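The priority order above could be expressed as one pure function. The sketch below is illustrative only: the payload locations (data.summary.blocker_count, data.agents[]), the status value "running", and the DATA_STALE_MS value are assumptions, and the heartbeat threshold reuses the proposed 60-second warning default.

```javascript
// Assumed thresholds; DATA_STALE_MS is hypothetical (three missed 5s polls).
const DATA_STALE_MS = 15000;
const HEARTBEAT_WARN_MS = 60000;

// Pure helper: maps one /api/status payload plus the local data age to a
// health token. Checks run in strict priority order, first match wins.
function computeHealth(data, dataAgeMs = 0) {
  const summary = (data && data.summary) || {};
  const agents = data && Array.isArray(data.agents) ? data.agents : [];
  const active = agents.find((a) => a && a.is_active === true);

  if ((summary.blocker_count || 0) > 0) return "BLOCKED";
  if (active && active.lease_expired === true) return "STALLED";

  const heartbeatStale =
    Boolean(active) &&
    Number.isFinite(active.heartbeat_age_ms) &&
    active.heartbeat_age_ms > HEARTBEAT_WARN_MS;
  if (dataAgeMs > DATA_STALE_MS || heartbeatStale) return "STALE";

  // A missing lease_expired field is treated as unknown, never as healthy.
  const leased =
    Boolean(active) &&
    active.status === "running" &&
    active.lease_expired === false;
  return leased ? "WORKING" : "IDLE";
}
```

Because the function takes the data age as an argument instead of reading a clock, the 1-second ticker can re-evaluate it without refetching.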
Liveness:
- Add a 1-second local ticker independent from the 5-second poll.
- Lease countdown displays mm:ss remaining.
- Lease at or below two minutes becomes amber.
- Expired lease becomes red and displays elapsed expiry age.
- Heartbeat age displays as heartbeat 4s.
- Heartbeat over 60 seconds becomes amber.
- Heartbeat over 180 seconds becomes red.
- Data freshness in the status bar ticks locally.
- The ticker only updates timestamp text nodes and never re-renders lists.
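The liveness thresholds above might translate into formatting helpers like the following. This is a sketch under assumptions: leaseView and heartbeatView are hypothetical names, the tone strings simply mirror the semantic colors, and a negative remaining value is assumed to mean the lease expired that long ago.

```javascript
const LEASE_WARN_MS = 2 * 60 * 1000;   // amber at or below two minutes
const HEARTBEAT_WARN_MS = 60 * 1000;   // amber above 60 seconds
const HEARTBEAT_DEAD_MS = 180 * 1000;  // red above 180 seconds

// Formats a non-negative duration as mm:ss.
function formatCountdown(ms) {
  const total = Math.max(0, Math.floor(ms / 1000));
  const mm = String(Math.floor(total / 60)).padStart(2, "0");
  const ss = String(total % 60).padStart(2, "0");
  return `${mm}:${ss}`;
}

// Formats an age as "4s", "1m 1s", or "2h 5m".
function formatAge(ms) {
  const s = Math.max(0, Math.floor(ms / 1000));
  if (s < 60) return `${s}s`;
  if (s < 3600) return `${Math.floor(s / 60)}m ${s % 60}s`;
  return `${Math.floor(s / 3600)}h ${Math.floor((s % 3600) / 60)}m`;
}

// Hypothetical view helper: text plus tone for the lease readout.
function leaseView(remainingMs) {
  if (!Number.isFinite(remainingMs)) return { text: "lease -", tone: "muted" };
  if (remainingMs <= 0) {
    return { text: `expired ${formatAge(-remainingMs)} ago`, tone: "red" };
  }
  const tone = remainingMs <= LEASE_WARN_MS ? "amber" : "green";
  return { text: `lease ${formatCountdown(remainingMs)}`, tone };
}

// Hypothetical view helper: text plus tone for the heartbeat readout.
function heartbeatView(ageMs) {
  if (!Number.isFinite(ageMs)) return { text: "heartbeat -", tone: "muted" };
  let tone = "green";
  if (ageMs > HEARTBEAT_DEAD_MS) tone = "red";
  else if (ageMs > HEARTBEAT_WARN_MS) tone = "amber";
  return { text: `heartbeat ${formatAge(ageMs)}`, tone };
}
```

Between polls, the ticker could subtract the time elapsed since the last response from the lease value (and add it to the heartbeat age) before calling these helpers, then write only the resulting text into the existing nodes.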
Polling and errors:
- Keep the last good render on fetch error.
- Flip the status bar to red error/stale state.
- Do not replace the progress tail with an error string.
- Pause timers while the tab is hidden and resume with an immediate refetch on focus.
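One way to keep the last good render on failure is to treat each poll as a state transition. The sketch below is an assumption-laden illustration: the view-state shape ({ data, fetchedAt, connection, error }) and the names applyPollResult and pollOnce are hypothetical.

```javascript
// Hypothetical reducer: success replaces the data; failure keeps the old
// data and fetchedAt so the last good render stays on screen while the
// status bar flips to an error state.
function applyPollResult(state, result, nowMs) {
  if (result.ok) {
    return { data: result.data, fetchedAt: nowMs, connection: "ok", error: null };
  }
  return { ...state, connection: "error", error: String(result.error) };
}

// Hypothetical wrapper around fetch. fetchFn and nowFn are injected so the
// function can run outside a browser; in the page they would be fetch and
// Date.now.
async function pollOnce(state, fetchFn, nowFn, url = "/api/status") {
  try {
    const res = await fetchFn(url);
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    const data = await res.json();
    return applyPollResult(state, { ok: true, data }, nowFn());
  } catch (err) {
    return applyPollResult(state, { ok: false, error: err.message }, nowFn());
  }
}
```

Since fetchedAt is not advanced on failure, the freshness label keeps aging and can cross the stale threshold on its own.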
Selection and affordances:
- Run rows should be selectable.
- Selecting a run refocuses progress and agent details if the API supports ?run=.
- Progress auto-sticks to bottom only if the user was already at bottom.
- If new content arrives while scrolled up, show a "new" jump button.
- Add expand/collapse for the progress log.
- Add click-to-copy for run id, branch, and progress path with a brief copied state.
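The auto-stick rule can be sketched with two small functions that only touch the standard scrollTop, scrollHeight, and clientHeight properties. The function names and the 4px tolerance are assumptions for illustration.

```javascript
const STICK_TOLERANCE_PX = 4; // hypothetical slack for sub-pixel rounding

// True when the scroll container is at, or within tolerance of, the bottom.
function isAtBottom(el) {
  return el.scrollHeight - el.scrollTop - el.clientHeight <= STICK_TOLERANCE_PX;
}

// Call after replacing the log text, passing the isAtBottom() result that
// was captured before the update. Scrolls down only if the user had been at
// the bottom; otherwise reports whether the "new" jump button should appear.
function afterLogUpdate(el, wasAtBottom, hasNewContent) {
  if (wasAtBottom) {
    el.scrollTop = el.scrollHeight; // browsers clamp this to the max offset
    return { showNewButton: false };
  }
  return { showNewButton: Boolean(hasNewContent) };
}
```

The important detail is the ordering: measure before the DOM update, scroll after it, so a user who has scrolled up to read is never yanked back down.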
Empty and degraded states:
- No runs.
- No active run.
- No progress file.
- No lease recorded: show lease -, muted.
- No heartbeat recorded: show heartbeat -, muted.
- Unknown phase: label-only stepper.
- Checkpoint parse error: surface it in the blockers strip.
5. API Or Data Shape Changes
All requests are additive and optional. The dashboard must render correctly if any are absent.
Per agent:
{
"lease_remaining_ms": 1694000,
"heartbeat_age_ms": 4000,
"lease_expired": false,
"is_active": true
}
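On the server side, these per-agent fields could be derived from the manager-written timestamps at response time. The sketch below assumes the stored fields are ISO 8601 strings named lease_expires_at and heartbeat_at; the function name livenessFields is hypothetical, and letting lease_remaining_ms go negative after expiry is a design assumption, not something this proposal specifies.

```javascript
// Hypothetical server helper: computes millisecond deltas against the
// server clock so the client never compares its own clock to the
// manager's. Fields whose source timestamp is missing or unparseable are
// omitted, so the client renders them as unknown rather than healthy.
function livenessFields(agent, nowMs) {
  const out = {};
  const leaseAt = Date.parse(agent.lease_expires_at);
  if (Number.isFinite(leaseAt)) {
    out.lease_remaining_ms = leaseAt - nowMs; // negative once expired
    out.lease_expired = leaseAt <= nowMs;
  }
  const beatAt = Date.parse(agent.heartbeat_at);
  if (Number.isFinite(beatAt)) {
    out.heartbeat_age_ms = Math.max(0, nowMs - beatAt);
  }
  return out;
}
```

For example, with a server time of 2026-06-02T14:05:03Z, a lease expiring at 14:33:17Z yields lease_remaining_ms of 1694000 and lease_expired of false.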
Summary:
{
"active_role": "opus",
"health": "working"
}
Active progress:
{
"modified_at": "2026-06-02T14:05:03Z",
"total_lines": 342,
"truncated": true
}
New optional query parameter:
/api/status?tail=N&run=<id>
When run is supplied, active_progress reflects that run instead of the
active one. Default behavior remains unchanged.
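On the client, the request URL could be assembled with the standard URLSearchParams API so that omitted values leave the default behavior untouched. The function name buildStatusUrl and the run id in the usage note are hypothetical.

```javascript
// Hypothetical helper: builds the poll URL, adding only the parameters
// that were actually supplied. URLSearchParams handles the encoding.
function buildStatusUrl(tail, runId) {
  const params = new URLSearchParams();
  if (Number.isInteger(tail) && tail > 0) params.set("tail", String(tail));
  if (typeof runId === "string" && runId !== "") params.set("run", runId);
  const query = params.toString();
  return query ? `/api/status?${query}` : "/api/status";
}
```

With a made-up run id, buildStatusUrl(80, "example-run") produces /api/status?tail=80&run=example-run, while buildStatusUrl(80) produces the same request the dashboard issues today.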
Optional manager-written fields in state.json:
{
"phase_index": 4,
"phase_total": 10,
"updated_at": "2026-06-02T14:05:03Z"
}
Suggested server tweak: in readProgress, pick the newest candidate file by
mtime rather than the first file in a fixed list.
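The mtime-based selection might look like the sketch below. It is not the existing readProgress code: pickNewest is a hypothetical name, and the stat function is injected so the logic stays testable. In observe.mjs it could be (p) => fs.statSync(p) from node:fs, whose result exposes mtimeMs.

```javascript
// Hypothetical helper: returns the candidate path with the newest mtime,
// or null when none can be stat'ed. Ties keep the earlier candidate, so
// the fixed list order still acts as a tiebreaker.
function pickNewest(candidates, statFn) {
  let best = null;
  for (const path of candidates) {
    let mtimeMs;
    try {
      mtimeMs = statFn(path).mtimeMs;
    } catch {
      continue; // missing or unreadable candidate: skip it
    }
    if (best === null || mtimeMs > best.mtimeMs) best = { path, mtimeMs };
  }
  return best ? best.path : null;
}
```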
6. Mobile Behavior
At widths up to 980px, switch to one column.
Stacking order:
- Sticky status bar, compressed to workspace basename.
- Now band, collapsed to rows: token, run plus stepper, next action.
- Blockers, if any.
- Active agent.
- Progress tail, collapsed by default to about 42vh.
- Runs accordion, collapsed.
At widths up to 560px:
- Toolbar becomes a compact wrapping row: tail selector, refresh, pause.
- Tap targets stay at least 44px.
- The segmented phase bar degrades to text.
- Mono ids and paths ellipsize.
- Prose wraps.
- No text overlap.
7. Implementation Notes
Likely write scope:
- .agents/observability/index.html
- .agents/observability/styles.css
- .agents/observability/app.js
- .agents/scripts/observe.mjs for additive API fields
Implementation guidance:
- No frameworks and no build step.
- Use a small inline SVG sprite or simple text/icon fallbacks.
- Keep two timers:
  - setInterval(poll, 5000) rebuilds data.
  - setInterval(tick, 1000) updates timestamp/countdown text only.
- Centralize constants:
  - POLL_MS
  - DATA_STALE_MS
  - LEASE_WARN_MS
  - HEARTBEAT_WARN_MS
  - HEARTBEAT_DEAD_MS
- Add one pure computeHealth(data) helper.
- Add formatAge(ms) and formatCountdown(ms) helpers.
- Prefer document.createElement and textContent for rows instead of template-literal innerHTML.
- Preserve progress scroll position and bottom-stick behavior across renders.
- Remember selected run id.
- Treat every optional API field as missing-safe.
- Add role="status" to freshness and aria-live="polite" to health and blocker changes.
- Gate any pulse animation behind prefers-reduced-motion.
- Keep the CSS token approach and add a mono font token.
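The two-timer guidance, combined with pausing while the tab is hidden, could be wrapped in a small controller. This is a sketch under assumptions: createTimers is a hypothetical name, and the timer API is injected (defaulting to globalThis) only so the logic can be exercised without a browser.

```javascript
const POLL_MS = 5000; // data refetch interval
const TICK_MS = 1000; // local timestamp/countdown refresh interval

// Hypothetical controller for the poll and tick timers. start() refetches
// immediately and is a no-op when already running; stop() clears both.
function createTimers(poll, tick, timerApi = globalThis) {
  let pollId = null;
  let tickId = null;

  function start() {
    if (pollId !== null) return;
    poll(); // immediate refetch on start or resume
    pollId = timerApi.setInterval(poll, POLL_MS);
    tickId = timerApi.setInterval(tick, TICK_MS);
  }

  function stop() {
    if (pollId === null) return;
    timerApi.clearInterval(pollId);
    timerApi.clearInterval(tickId);
    pollId = null;
    tickId = null;
  }

  return { start, stop, isRunning: () => pollId !== null };
}

// Browser wiring (not executed here), using the Page Visibility API:
// const timers = createTimers(poll, tick);
// document.addEventListener("visibilitychange", () => {
//   if (document.hidden) timers.stop(); else timers.start();
// });
// timers.start();
```

The same start() and stop() pair can back the auto-refresh pause/play control, so there is only one place where intervals are created or cleared.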
8. Concerns
- Implementation is currently blocked by process. The workspace is not a git repository, and Opus implementation must happen on a named branch from a clean baseline.
- Liveness is only honest if the manager writes lease_expires_at and heartbeat_at. Missing values must render as unknown, not healthy.
- Heartbeat cadence is unspecified. Proposed defaults are warning at 60 seconds and dead at 180 seconds, but the manager should pick a real cadence.
- Phase stepper depends on canonical phase strings. Unknown phases must degrade gracefully.
- Server-computed millisecond deltas are preferred to avoid client clock drift.
- Polling cannot show sub-5-second events, which is acceptable for human-paced orchestration.
- Avoid adding timelines, charts, or history graphs in the first pass.
- Non-active run progress depends on the optional ?run= parameter. Without it, run selection degrades to metadata-only.