Stage 5c: Claude CLI Architect 🏛️

The organism writes new code for itself. Say "build me a gym tracker" → Claude CLI scaffolds an entire self-contained agent, sandboxed to its own folder, and reloads it into the running system.

Status: ✅ Complete (Paperclip-pattern)
Trigger skill: build_with_architect (structural, BUILD mode)
Lives in: src/architect/
Live view: http://localhost:3210/architect-live

The Paradigm

Paperclip proved that Claude CLI can be used as an invisible worker: spawn it with a single SKILL.md, hand it the environment, let it discover context via HTTP, and it will build what you asked for — without you typing in its terminal.

We wanted that, but sandboxed. Our first run got burned: Claude CLI happily edited src/tools/factories.ts, src/genesis/build-catalog.ts, and test files because the instruction wasn't strict enough. Every patch rippled through the core.

So we redesigned around one hard rule:

Architect-generated agents write only inside tenant/agents/<new-agent>/. Nowhere else.

Everything else follows from that rule.

Architecture

┌────────────────────────────────────────────────────────┐
│  User → Axiom (API): "build me a gym tracker agent"   │
└───────────────────────────┬────────────────────────────┘

                            ▼  build_with_architect (structural)
┌────────────────────────────────────────────────────────┐
│  src/architect/build-agent.ts                          │
│   - publishes events → /architect-live (SSE)           │
│   - spawns claude CLI                                  │
└───────────────────────────┬────────────────────────────┘

                            ▼  spawn with env
┌────────────────────────────────────────────────────────┐
│  claude CLI                                            │
│   - ORBITA_API_URL, ORBITA_SESSION_ID,                 │
│     ORBITA_PROJECT_ROOT                                │
│   - SKILL.md added via --add-dir                       │
│   - stream-json output parsed by cli-runner.ts         │
└───────────────────────────┬────────────────────────────┘

                            ▼  curl discovery
┌────────────────────────────────────────────────────────┐
│  /architect/*  (src/architect/api.ts)                  │
│   - /conventions   (write rules)                       │
│   - /dna           (universal laws)                    │
│   - /agents        (don't duplicate)                   │
│   - /catalog       (reuse existing skills)             │
│   - /examples/tool (the exact pattern to follow)       │
│   - /check-path    (sandbox validator)                 │
│   - /reload        (rescan tenant tools)               │
│   - /done          (report completion)                 │
└───────────────────────────┬────────────────────────────┘

                            ▼  writes ONLY inside
┌────────────────────────────────────────────────────────┐
│  tenant/agents/<new-agent>/                            │
│   ├── dna.md            (copied from root)             │
│   ├── instruction.md                                   │
│   ├── config.json                                      │
│   └── tools/                                           │
│       ├── <tool>.skill.ts   (CatalogEntry export)      │
│       └── <tool>.skill.md   (human-readable)           │
└────────────────────────────────────────────────────────┘

                            ▼  POST /architect/reload
┌────────────────────────────────────────────────────────┐
│  tenant-tool-loader.ts                                 │
│   - scans tenant/agents/*/tools/*.skill.ts             │
│   - imports, validates shape, registers in catalog     │
│   - cache-busts via ?t=<now> so reload actually works  │
└────────────────────────────────────────────────────────┘
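
The "stream-json output parsed by cli-runner.ts" step in the diagram can be pictured as a line-delimited JSON parser. The following is a minimal sketch, assuming the CLI emits one JSON object per line on stdout; `createLineParser` and `StreamEvent` are hypothetical names and this is not the actual cli-runner.ts code.

```typescript
// Hypothetical sketch: not the actual cli-runner.ts implementation.
// Assumes the CLI's stream-json output is one JSON object per line.

type StreamEvent = Record<string, unknown>;

interface LineParser {
  /** Feed a raw stdout chunk; returns the events completed by this chunk. */
  push(chunk: string): StreamEvent[];
  /** Call once the process exits to parse any unterminated final line. */
  flush(): StreamEvent[];
}

function createLineParser(): LineParser {
  let buffer = "";

  const parseLines = (lines: string[]): StreamEvent[] => {
    const events: StreamEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        const parsed: unknown = JSON.parse(trimmed);
        if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
          events.push(parsed as StreamEvent);
        }
      } catch {
        // Non-JSON noise (for example a stray log line) is skipped rather
        // than aborting the whole run.
      }
    }
    return events;
  };

  return {
    push(chunk: string): StreamEvent[] {
      buffer += chunk;
      const lines = buffer.split("\n");
      // The last element is either "" or a partial line; keep it for later.
      buffer = lines.pop() ?? "";
      return parseLines(lines);
    },
    flush(): StreamEvent[] {
      const rest = buffer;
      buffer = "";
      return parseLines([rest]);
    },
  };
}
```

Buffering matters because a process's stdout arrives in arbitrary chunks, so a single JSON line may be split across two reads. Each parsed event could then be handed to the event stream for the live view.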

The Sandbox (hard constraint)

Architect runs are trusted as narrowly as possible. The SKILL.md defines:

Writable:

$ORBITA_PROJECT_ROOT/tenant/agents/<new-agent-name>/**

Read-only (writes explicitly forbidden):

  • src/** — factories, build-catalog, loaders, anything
  • skills/** — framework-owned skill docs
  • test/** — tests
  • agents/** — framework agents (Axiom)
  • doc/** — documentation
  • dna.md — root DNA (copied, never modified)
  • package.json, tsconfig.json, .gitignore
  • Other agents' folders

If the architect asks about a path outside its own folder, /architect/check-path returns {allowed: false}, and the SKILL.md tells it to stop and report rather than attempt workarounds.

The sandbox is not enforced at the OS level — Claude CLI has full filesystem access. Enforcement is via the SKILL.md instruction + post-hoc git diff review. In practice the instruction holds if it's written loudly enough.
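
To make the /architect/check-path rule concrete, here is a minimal sketch of a sandbox validator. It is an illustration under stated assumptions, not the handler in src/architect/api.ts: `checkPath`, `normalizeAbsolute`, the `reason` field, and the agent-name pattern are all hypothetical, and only POSIX-style paths are handled.

```typescript
// Hypothetical sketch of a sandbox path check; the real /architect/check-path
// handler may differ. POSIX-style paths only, symlinks are not resolved.

/** Collapse ".", ".." and duplicate slashes in an absolute POSIX path. */
function normalizeAbsolute(path: string): string {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

interface PathCheck {
  allowed: boolean;
  reason?: string; // assumption: the document only shows the `allowed` field
}

function checkPath(projectRoot: string, agentName: string, candidate: string): PathCheck {
  // Assumed naming rule: restrict agent names so the name itself cannot
  // smuggle in "../" segments.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(agentName)) {
    return { allowed: false, reason: "invalid agent name" };
  }
  const root = normalizeAbsolute(projectRoot);
  const sandbox = normalizeAbsolute(`${root}/tenant/agents/${agentName}`);
  // Relative candidates are resolved against the project root.
  const target = normalizeAbsolute(
    candidate.startsWith("/") ? candidate : `${root}/${candidate}`,
  );

  // Require a path strictly inside the sandbox; the trailing slash stops
  // "gym-tracker-evil" from matching the "gym-tracker" prefix.
  if (target.startsWith(sandbox + "/")) return { allowed: true };
  return { allowed: false, reason: `outside ${sandbox}` };
}
```

Normalizing before comparing is the important design choice: a plain prefix check on the raw string would accept `tenant/agents/gym-tracker/../../../src/...`. A production version would likely lean on `node:path` and `fs.realpath` so that symlinks inside the agent folder cannot point back into the core.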

Self-Contained Agents

An architect-built agent is completely portable. Zip the folder → complete agent. Delete the folder → agent gone, cleanly, no framework edits to revert.

tenant/agents/gym-tracker/
├── dna.md                            ← copy of root dna.md
├── instruction.md                    ← persona + behavior
├── config.json                       ← skills list, grants, profile
└── tools/
    ├── log_weight.skill.ts           ← CatalogEntry export
    ├── log_weight.skill.md           ← human-readable doc
    ├── calculate_bmi.skill.ts
    ├── calculate_bmi.skill.md
    ├── get_weight_history.skill.ts
    └── get_weight_history.skill.md

Tool Contract

Every *.skill.ts inside tenant/agents/*/tools/ MUST export a constant named skill of type CatalogEntry:

```typescript
import type { CatalogEntry, SkillContext } from "../../../../src/genesis/skill-catalog.js";
import type { SkillDefinition } from "../../../../src/tools/registry.js";

// Builds the runtime skill for one agent; ctx carries that agent's name and services.
function buildLogWeight(ctx: SkillContext): SkillDefinition {
  return {
    name: "log_weight",
    description: "Log a daily weight entry",
    parameters: {
      weight_kg: { type: "number", description: "Weight in kg" },
      date: { type: "string", description: "ISO date; defaults to today" },
    },
    required: ["weight_kg"],
    execute: async (input) => {
      // Honor the documented default: fall back to today's date (UTC) when none is given.
      const date = input.date ?? new Date().toISOString().slice(0, 10);
      const record = await ctx.services.data!.insert(
        ctx.agentName, "weight_logs",
        { date, weight_kg: input.weight_kg },
        ctx.agentName,
      );
      return JSON.stringify({ success: true, entry: record });
    },
  };
}

// The loader looks for this named export.
export const skill: CatalogEntry = {
  name: "log_weight",
  description: "Log a daily weight entry",
  structural: false,
  create: (ctx) => buildLogWeight(ctx),
};
```

At startup (and on /architect/reload), tenant-tool-loader.ts auto-discovers these and registers them in the shared catalog. No edits to factories.ts or build-catalog.ts.
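
The "validates shape" and cache-busting steps can be sketched as follows. This is a hedged illustration only: the actual checks in tenant-tool-loader.ts are not shown in this document, and `validateSkillExport` and `cacheBustedSpecifier` are hypothetical names. The four fields checked are the ones visible in the example above.

```typescript
// Hypothetical sketch of the loader's validation step; not the actual
// tenant-tool-loader.ts code.

/** Returns a list of problems; an empty list means the export looks valid. */
function validateSkillExport(mod: unknown): string[] {
  if (typeof mod !== "object" || mod === null) return ["module is not an object"];
  const skill = (mod as Record<string, unknown>).skill;
  if (typeof skill !== "object" || skill === null) return ["missing `skill` export"];

  const s = skill as Record<string, unknown>;
  const problems: string[] = [];
  if (typeof s.name !== "string" || s.name === "") {
    problems.push("name must be a non-empty string");
  }
  if (typeof s.description !== "string") problems.push("description must be a string");
  if (typeof s.structural !== "boolean") problems.push("structural must be a boolean");
  if (typeof s.create !== "function") problems.push("create must be a function");
  return problems;
}

/**
 * Appends a timestamp query so that a dynamic import() of the same file is
 * treated as a new module instead of being served from the module cache.
 */
function cacheBustedSpecifier(fileUrl: string, now: number): string {
  return `${fileUrl}?t=${now}`;
}
```

A loader built on these helpers might call `await import(cacheBustedSpecifier(url, Date.now()))`, run `validateSkillExport` on the result, and register the entry only when the problem list is empty. One trade-off of query-string cache busting is that earlier copies of a module stay in memory for the life of the process.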

Live Viewing

Every architect run publishes events to an in-memory SSE stream:

  • http://localhost:3210/architect-live → HTML UI, live log
  • http://localhost:3210/architect-stream/events → raw SSE

Events include: session-start, stream-json lines from the CLI (tool calls, text deltas, usage), session-done with cost/duration/turns, errors.
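
For readers unfamiliar with the wire format, here is a small sketch of how one of these events could be framed for Server-Sent Events. `formatSseFrame` is a hypothetical helper, and whether live-view.ts uses the `event:` and `id:` fields or a single JSON payload is an assumption.

```typescript
// Hypothetical sketch: formats one event as a Server-Sent Events frame.
// The field layout used by the real live-view.ts is not documented here.

function formatSseFrame(eventName: string, payload: unknown, id?: number): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${eventName}`);
  // JSON.stringify emits no raw newlines, so a single data line is enough.
  lines.push(`data: ${JSON.stringify(payload ?? null)}`);
  // A blank line terminates the frame.
  return lines.join("\n") + "\n\n";
}
```

On the browser side, an `EventSource` pointed at the raw SSE URL would receive each frame as a message whose `data` can be passed to `JSON.parse`.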

A 500-event ring buffer lets you open the live view mid-run and still see history.
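
The replay behavior can be sketched as a bounded history plus a listener list. This is a minimal illustration, assuming late subscribers receive the buffered events before live ones; the class and method names are hypothetical and the real event-stream.ts may be structured differently.

```typescript
// Hypothetical sketch of an in-memory pub/sub with bounded history.

type Listener<T> = (event: T) => void;

class EventStream<T> {
  private history: T[] = [];
  private listeners: Listener<T>[] = [];
  private readonly capacity: number;

  constructor(capacity: number = 500) {
    this.capacity = capacity;
  }

  publish(event: T): void {
    this.history.push(event);
    // Drop the oldest event once the buffer is over capacity.
    if (this.history.length > this.capacity) this.history.shift();
    for (const listener of this.listeners) listener(event);
  }

  /** Replays buffered history, then delivers live events. Returns an unsubscribe function. */
  subscribe(listener: Listener<T>): () => void {
    for (const event of this.history) listener(event);
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
}
```

An array with `shift()` is not a true circular buffer, but at 500 entries the cost is negligible and the code stays easy to read.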

Files in src/architect/

| File | Purpose |
| --- | --- |
| api.ts | HTTP handler for /architect/* discovery endpoints |
| build-agent.ts | Orchestrator: spawns CLI, publishes events |
| cli-runner.ts | Spawns claude process, parses stream-json output |
| event-stream.ts | In-memory pub/sub with 500-event buffer |
| live-view.ts | SSE endpoint + HTML live viewer |

Running It

Mode must be BUILD (structural skills gated):

```bash
# In Orbita talk:
#   You:   "switch to build mode"
#   Axiom: [sets BUILD]
#
#   You:   "use the architect to build a gym tracker agent that logs weight and calculates BMI"
#   Axiom: [calls build_with_architect with the spec]

# Meanwhile, watch the live log (macOS; use xdg-open on Linux):
open http://localhost:3210/architect-live
```

By the time the architect reports completion via /architect/done, the reload has already happened, so the new agent is usable immediately.

Observed Behavior (first real run)

Building the gym-tracker agent end-to-end:

  • 103 tool calls by Claude CLI
  • $0.92 total cost
  • 333 seconds wall time
  • 6 files created, all inside tenant/agents/gym-tracker/
  • 0 core files touched

Initial exploration wasted turns on /agents, /status, /reload (root paths that don't exist) before finding /architect/*. SKILL.md was patched to front-load the API prefix and forbid guessing paths.

What This Unblocks

  • True self-extension — organism writes new code in response to conversation
  • BYOA / BYOS — customers describe their domain, system generates the agents
  • Scales per-domain — GST, HIPAA, payroll, fitness: each becomes its own self-contained agent folder
  • Portable marketplace — share a zipped agent folder; it drops into any Orbita

Guardrails (pending — see roadmap)

The quality gates described in the original roadmap (static validation, QA review, dependency management, isolated agent tests, integration smoke test) are not yet implemented. Current trust model:

  • Sandbox via SKILL.md instruction
  • BUILD-mode governance gate
  • Manual review (git diff) after a run
  • Reload failures are logged

Gates 1–5 land incrementally as production usage grows.
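
Until those gates exist, the manual git diff review can be partly scripted. The sketch below is one possible approach, not part of the current codebase: `findSandboxViolations` is a hypothetical helper that takes repository-relative paths (for example the output of `git diff --name-only`) and returns those outside the agent's folder.

```typescript
// Hypothetical sketch of a post-run review helper; not part of src/architect/.
// Input: repository-relative paths, one per changed file.

function findSandboxViolations(changedPaths: string[], agentName: string): string[] {
  const prefix = `tenant/agents/${agentName}/`;
  return changedPaths
    .map((p) => p.trim())
    .filter((p) => p !== "")
    // Flag anything outside the agent folder, plus any path that still
    // contains a ".." segment and so cannot be trusted as-is.
    .filter((p) => !p.startsWith(prefix) || p.split("/").includes(".."));
}
```

Because `git diff --name-only` does not list untracked files, a review script would also need something like `git ls-files --others --exclude-standard` to catch newly created files outside the sandbox. An empty result would match the "0 core files touched" outcome reported above.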

Orbita — We don't build software. We grow organisms.