
Stage 6: Trace Mesh Visualization 🕸️

See the organism working. Every agent as a node, every delegation as an edge, every hop counted, all in a living force-directed graph at /mesh.

Status: ✅ Complete
Lives in: src/xray/mesh-view.ts
URL: http://localhost:3210/mesh
Data endpoint: GET /mesh/data?limit=N (JSON)

Why This Exists ​

After Stages 4b/5/5a/5c, the organism has real depth: Axiom delegates, plugins dispatch, the architect spawns, inboxes chain into each other. The single-trace timeline at /trace?id=X shows one request, but it doesn't answer "what's the shape of this organism?"

The Mesh does. It aggregates recent traces into a force-directed graph where:

  • Nodes are agents: user, axiom, and every tenant agent that showed up in a trace
  • Edges are message flow: who invoked or dispatched to whom, accumulated across all traces in the window
  • Node size scales with activity count
  • Edge width scales with hop count

You look at it once and you know: what's busy, what's idle, who talks to whom, whether the architect is dispatching, whether messages are cycling.
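The page does not document which scale it uses for node size and edge width. As a rough sketch, a square-root scale keeps very busy agents from swamping the layout; the function names and constants below are assumptions for illustration, not the values in mesh-view.ts.

```typescript
// Assumed visual constants (hypothetical, not taken from mesh-view.ts).
const MIN_RADIUS = 6;
const MIN_WIDTH = 1;

// Node radius grows with the square root of the activity count.
function nodeRadius(activity: number): number {
  return MIN_RADIUS + 2 * Math.sqrt(Math.max(0, activity));
}

// Edge stroke width grows with the square root of the hop count.
function edgeWidth(count: number): number {
  return MIN_WIDTH + Math.sqrt(Math.max(0, count));
}
```

With this choice an agent with 100 events is drawn noticeably larger than one with 25, but not four times larger.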

The Extraction ​

Each trace is a sequence of events with a seq index. We walk each trace in order, tracking the current executing agent, and emit edges whenever an agent hands off to another:

| Event Type | Produces Edge? | How |
| --- | --- | --- |
| agent_called | ✅ | currentAgent → data.agent, then current becomes data.agent |
| skill_called (send_to_inbox) | ✅ | currentAgent → input.to_agent |
| skill_called (delegate_to_agent) | ✅ | currentAgent → input.target_agent |
| skill_called (create_task) | ✅ | currentAgent → input.assignee |
| task_created | ✅ | currentAgent → data.assignee |
| response_sent | ➖ | accrues tokensUsed to currentAgent |

Every trace starts with user as the current agent (the entry point), so the graph is rooted in real human input.
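The walk described above can be sketched roughly as follows. The event shape is an assumption: the field names data.skill and data.input are hypothetical stand-ins for wherever the real trace schema keeps the skill name and its input, and extractEdges is an illustrative name rather than the function in mesh-view.ts.

```typescript
// Hypothetical event shape; the real trace schema may differ.
interface TraceEvent {
  seq: number;
  type: string;
  data: {
    agent?: string;
    assignee?: string;
    skill?: string;
    input?: Record<string, string>;
    tokensUsed?: number;
  };
}

interface Hop { source: string; target: string; }

// Which input field names the target agent for each edge-producing skill.
const SKILL_TARGET_FIELD: Record<string, string> = {
  send_to_inbox: "to_agent",
  delegate_to_agent: "target_agent",
  create_task: "assignee",
};

function extractEdges(events: TraceEvent[]): { edges: Hop[]; tokens: Map<string, number> } {
  const ordered = [...events].sort((a, b) => a.seq - b.seq);
  let current = "user"; // every trace is rooted at the human entry point
  const edges: Hop[] = [];
  const tokens = new Map<string, number>();

  for (const ev of ordered) {
    if (ev.type === "agent_called" && ev.data.agent) {
      edges.push({ source: current, target: ev.data.agent });
      current = ev.data.agent; // the called agent is now executing
    } else if (ev.type === "skill_called" && ev.data.skill) {
      const field = SKILL_TARGET_FIELD[ev.data.skill];
      const target = field ? ev.data.input?.[field] : undefined;
      if (target) edges.push({ source: current, target });
    } else if (ev.type === "task_created" && ev.data.assignee) {
      edges.push({ source: current, target: ev.data.assignee });
    } else if (ev.type === "response_sent") {
      // No edge; only accrue token usage to the executing agent.
      tokens.set(current, (tokens.get(current) ?? 0) + (ev.data.tokensUsed ?? 0));
    }
  }
  return { edges, tokens };
}
```

Note that only agent_called changes the current agent in this sketch; skill calls and task creation emit an edge but leave the executor unchanged, matching the table.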

The UI ​

A single page using d3-force from a CDN: no bundler, no build step, consistent with the rest of Orbita's SSR HTML pages.

  • Drag nodes to reposition; layout resettles via force simulation
  • Scroll/pinch to zoom, click-drag background to pan
  • Click a node → sidebar shows activity, traces touched, tokens in/out; connected neighbors highlight, the rest dim
  • Click an edge → sidebar lists every traceId that flowed through it (each links to /trace?id=X)
  • Limit selector: 25 / 50 / 100 / 250 traces
  • Auto-refresh every 5s so a running organism is visibly alive
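The node-click behavior boils down to computing a one-hop neighborhood. A minimal sketch, assuming edges carry source and target agent names as in the JSON response; highlightSet, nodeOpacity, and the 0.15 dim level are hypothetical choices for illustration.

```typescript
interface EdgeRef { source: string; target: string; }

// Returns the clicked node plus every node one hop away, in either direction.
function highlightSet(selected: string, edges: EdgeRef[]): Set<string> {
  const keep = new Set<string>([selected]);
  for (const e of edges) {
    if (e.source === selected) keep.add(e.target);
    if (e.target === selected) keep.add(e.source);
  }
  return keep;
}

// Nodes in the set stay fully opaque; everything else dims.
function nodeOpacity(name: string, keep: Set<string>): number {
  return keep.has(name) ? 1 : 0.15; // 0.15 is an arbitrary dim level
}
```

The resulting opacity could then be applied to each node element when the selection changes.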

Totals Panel ​

Top-level counts for the current window:

| Metric | Meaning |
| --- | --- |
| agents | distinct nodes in the mesh |
| edges | distinct agent→agent pairs that exchanged at least one message |
| traces | number of traces in the window |
| events | total trace events scanned |
| tokens in/out | Claude API token totals (aggregated across response_sent events) |
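These totals can be derived from the /mesh/data response alone. The sketch below assumes the token totals equal the sum of the per-node tokensIn and tokensOut values, which the source does not state explicitly; computeTotals is an illustrative name.

```typescript
interface MeshNode { name: string; activity: number; tokensIn: number; tokensOut: number; traces: number; }
interface MeshEdge { source: string; target: string; count: number; traceIds: string[]; }
interface MeshData {
  nodes: MeshNode[];
  edges: MeshEdge[];
  traceCount: number;
  eventCount: number;
  generatedAt: string;
}

function computeTotals(data: MeshData) {
  return {
    agents: data.nodes.length,
    edges: data.edges.length,
    traces: data.traceCount,
    events: data.eventCount,
    // Assumption: panel totals are the per-node token counts summed.
    tokensIn: data.nodes.reduce((sum, n) => sum + n.tokensIn, 0),
    tokensOut: data.nodes.reduce((sum, n) => sum + n.tokensOut, 0),
  };
}
```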

Data API ​

The UI uses a plain JSON endpoint, so the same data can power other tools, exports, or dashboards:

```bash
curl -s "http://localhost:3210/mesh/data?limit=100" | jq .
```

Response shape:

```json
{
  "nodes": [
    { "name": "axiom", "activity": 42, "tokensIn": 18432, "tokensOut": 2103, "traces": 12 }
  ],
  "edges": [
    { "source": "user", "target": "axiom", "count": 12, "traceIds": ["abc123…", "def456…"] }
  ],
  "traceCount": 12,
  "eventCount": 284,
  "generatedAt": "2026-04-05T12:34:56.789Z"
}
```
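To show how entries in the edges array might be accumulated, here is a sketch that folds individual hops into one record per agent pair. It assumes count is incremented once per hop and traceIds holds each contributing trace once; the Hop type and aggregateEdges name are hypothetical.

```typescript
interface Hop { traceId: string; source: string; target: string; }
interface MeshEdge { source: string; target: string; count: number; traceIds: string[]; }

function aggregateEdges(hops: Hop[]): MeshEdge[] {
  const byPair = new Map<string, MeshEdge>();
  for (const hop of hops) {
    // JSON-encode the pair so agent names containing separators cannot collide.
    const key = JSON.stringify([hop.source, hop.target]);
    let edge = byPair.get(key);
    if (!edge) {
      edge = { source: hop.source, target: hop.target, count: 0, traceIds: [] };
      byPair.set(key, edge);
    }
    edge.count += 1; // assumption: one increment per hop, not per trace
    if (!edge.traceIds.includes(hop.traceId)) edge.traceIds.push(hop.traceId);
  }
  return Array.from(byPair.values());
}
```

Direction matters here: user → axiom and axiom → user are kept as separate edges.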

What This Unblocks ​

  • Debugging: a cycle between two agents is visible instantly as a thick loop
  • Demos: prospects see the organism "firing" as they talk to it
  • Cost attribution: per-agent token totals surface expensive agents
  • Dead-code detection: agents with zero activity across 100+ traces probably aren't wired up
  • Architect QA: after an architect run, you can confirm the new agent actually participates in real flows
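Two of these checks are easy to script against the JSON data as well. The sketch below finds two-agent cycles (pairs with an edge in both directions) and ranks agents by total tokens. It assumes the edges array holds one entry per distinct directed pair, and both function names are hypothetical.

```typescript
interface NodeCost { name: string; tokensIn: number; tokensOut: number; }
interface DirectedEdge { source: string; target: string; count: number; }

// Pairs where both a→b and b→a exist; each pair is reported once, with a < b.
function findMutualPairs(edges: DirectedEdge[]): Array<[string, string]> {
  const seen = new Set(edges.map((e) => JSON.stringify([e.source, e.target])));
  const pairs: Array<[string, string]> = [];
  for (const e of edges) {
    if (e.source < e.target && seen.has(JSON.stringify([e.target, e.source]))) {
      pairs.push([e.source, e.target]);
    }
  }
  return pairs;
}

// Most expensive agents first, by tokens in plus tokens out.
function rankByTokens(nodes: NodeCost[]): NodeCost[] {
  return [...nodes].sort(
    (a, b) => (b.tokensIn + b.tokensOut) - (a.tokensIn + a.tokensOut),
  );
}
```

Self-loops and longer cycles are not covered by this sketch; it only catches direct back-and-forth between two agents.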

Design Choices ​

Why d3 and not Vue Flow? Orbita has zero frontend build step. All pages are SSR HTML with a script tag. d3-force via CDN preserves that. Vue Flow would require bundling.

Why auto-refresh instead of SSE? The mesh is a snapshot view, not a stream. 5-second polling is cheap, simple, and plays well with browser caching. A future iteration could switch to SSE for sub-second updates if demos demand it.

Why a /mesh/data JSON endpoint? The graph data is useful beyond the UI: CLI scripts, tests, cost exports. Splitting the data from the render keeps the contract clean.

Why infer edges from trace events instead of a separate "flow" table? Trace events are already the source of truth. A separate table would need to be kept in sync. Inference is free and stays correct by construction.

Failure Modes ​

  • Empty mesh: occurs when there are no traces yet (fresh install). Just talk to Axiom once and it populates.
  • Orphan nodes: if an agent name appears in a skill input but never as an executor, it'll show as a leaf with only inbound edges. This is correct behavior (passive agents look like this).
  • Large meshes (>100 agents): force layout becomes slow. Current workaround: lower the trace-limit selector. Future: add zoom-based clustering.
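If you want to list the passive, inbound-only agents programmatically rather than spotting them in the graph, a small filter over the edge list is enough. This is a sketch with a hypothetical function name, assuming edges carry source and target agent names.

```typescript
interface Link { source: string; target: string; }

// Agents that receive at least one message but never send one.
function inboundOnlyNodes(names: string[], edges: Link[]): string[] {
  const senders = new Set(edges.map((e) => e.source));
  const receivers = new Set(edges.map((e) => e.target));
  return names.filter((n) => receivers.has(n) && !senders.has(n));
}
```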

Not Implemented (pending) ​

  • Filter by agent or time window (beyond the trace-limit selector)
  • Edge labels showing dominant message types
  • Per-edge token/cost attribution
  • Persisted graph snapshots for historical comparison
  • SSE push instead of polling

Orbita: We don't build software. We grow organisms.