Will DiMaio
01 · In Progress · 2026 · Systems Architecture

Exploring agent infrastructure for long-horizon work

ODIN

An autonomous agent system that decomposes complex goals into parallelized subtasks, self-monitors via OODA loops, and recovers from failure without human intervention — built because single-shot LLM calls collapse under multi-day work and existing agent frameworks stop at demos.

5
Specialist agents
<2s
OODA loop target
Session-spanning context
1
Operator in the loop
01 · Key Insight
WHY OODA OVER REACT

ODIN runs OODA — Observe, Orient, Decide, Act — instead of the now-standard ReAct loop. The split between observation and orientation is what lets the agent abandon failing strategies, not just retry them. ReAct keeps trying the same approach until it works or runs out of tokens; OODA forces the agent to re-orient on every cycle, which means a stuck plan gets re-decomposed instead of brute-forced.

02 · The Problem

Why this exists.

Single-model LLM calls are great at well-scoped one-shot tasks and terrible at long-horizon work. The moment a problem requires sustained reasoning across hours, days, or distinct disciplines — research, planning, coding, verification, reporting — a single chat session collapses under its own context window. Information gets dropped. Decisions get re-litigated. The model forgets what it already tried.

Existing agent frameworks largely treat this as a prompt-engineering problem and stop at impressive demos. They lack the substrate that production work actually needs: durable scheduling, structured handoffs between cooperating agents, a context graph that survives a process restart, and a human supervisor who can intervene without burning the whole network down. The interesting work in this space isn't a smarter prompt — it's the operating system underneath.

03 · The Approach

How it works.

ODIN treats agents as cooperating processes, not chat sessions. The orchestrator decomposes a goal into a typed plan, dispatches subtasks to specialist agents (Research, Plan, Code, Verify, Report), and supervises the network as it works. Each agent has its own scoped capability set and its own tool surface — the planner doesn't write code, the verifier doesn't talk to the outside world.
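A typed plan with dependency-gated dispatch could look roughly like this. Everything here is an assumption about shape, not ODIN's actual types — the real plan objects aren't shown in this write-up — but it captures the idea that subtasks carry explicit roles, dependencies, and success criteria, and that the orchestrator only dispatches work whose prerequisites are done.

```python
# Hypothetical plan types; ODIN's real schema is not public, so all names
# here are illustrative assumptions.
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    RESEARCH = "research"
    PLAN = "plan"
    CODE = "code"
    VERIFY = "verify"
    REPORT = "report"


@dataclass
class Subtask:
    id: str
    role: Role
    depends_on: list[str] = field(default_factory=list)
    success_criteria: str = ""


@dataclass
class Plan:
    goal: str
    subtasks: list[Subtask]

    def ready(self, done: set[str]) -> list[Subtask]:
        # A subtask is dispatchable once all of its dependencies completed.
        return [t for t in self.subtasks
                if t.id not in done and all(d in done for d in t.depends_on)]


plan = Plan(goal="ship feature", subtasks=[
    Subtask("t1", Role.RESEARCH),
    Subtask("t2", Role.PLAN, depends_on=["t1"]),
    Subtask("t3", Role.CODE, depends_on=["t2"]),
    Subtask("t4", Role.VERIFY, depends_on=["t3"]),
])
assert [t.id for t in plan.ready({"t1"})] == ["t2"]
```

Because the plan is data rather than prose, it can be inspected, diffed, and replayed — the property the Features section below relies on.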

Underneath, everything runs on the Amplify runtime — the engine I built to handle the unglamorous parts: a durable scheduler, a cross-session context graph, tool routing with capability negotiation, structured handoffs between agents, and full observability. Amplify is the substrate, ODIN is the product surface. Together they make it possible for a single human operator to supervise dozens of concurrent agent networks without losing the thread.

The operator is a first-class citizen, not an afterthought. Every decision the network makes flows through a console where a human can inspect rationale, approve risky moves, or roll back a branch of the agent graph. ODIN is autonomous, but it's not opaque.

04 · Agent Architecture

Five specialist agents, one orchestrator.

01 / 05

Orchestrator

Decomposes the goal, dispatches subtasks to specialists, supervises the network, and arbitrates handoffs. Holds the plan; never executes inside it.

02 / 05

Research agent

Read-only specialist. Walks the context graph, scrapes external sources, gathers facts before any irreversible action runs. The first agent in almost every plan.

03 / 05

Plan agent

Takes a goal and a set of facts and produces a typed plan — subtasks, dependencies, success criteria. Outputs are reviewable artifacts, not free-form prose.

04 / 05

Code agent

The only agent allowed to write into its sandbox. Capabilities scoped per task: a documentation task can't touch production secrets, a refactor can't deploy.

05 / 05

Verify agent

Runs tests, checks invariants, validates outputs against the plan's success criteria. If verification fails, the plan re-orients — that's where the OODA loop earns its keep.

05 · Features

What it does.

01 / 04 · Orchestrator

Goal decomposition with explicit plans

ODIN turns a single high-level goal into an explicit plan — a typed graph of subtasks, dependencies, and success criteria — before a single specialist agent runs. Plans are first-class objects: they can be inspected, edited, replayed, and diffed. This is what makes long-horizon work auditable instead of magical.

02 / 04 · Amplify runtime

Cross-session context graph

Every fact an agent learns is written to a typed context graph that outlives any individual session. When an agent picks up a task three hours later — or after a process restart — it walks the graph instead of starting cold. This is the single biggest reason long-horizon work actually completes: the network never forgets what it already knows.
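A toy version of that durability property, assuming a simple fact-store design (ODIN's real graph schema and storage layer are not described here): facts are written through to disk, and a fresh session reconstructs the graph instead of starting cold.

```python
# Toy persistent fact store; the schema and file format are assumptions,
# not ODIN's actual context-graph implementation.
import json
import os
import tempfile


class ContextGraph:
    def __init__(self, path: str):
        self.path = path
        self.facts: dict[str, dict] = {}
        if os.path.exists(path):  # survive a process restart: reload prior facts
            with open(path) as f:
                self.facts = json.load(f)

    def write(self, key: str, fact: dict) -> None:
        self.facts[key] = fact
        with open(self.path, "w") as f:  # durable on every write
            json.dump(self.facts, f)


path = os.path.join(tempfile.mkdtemp(), "graph.json")
ContextGraph(path).write("api.rate_limit", {"value": 100, "source": "docs"})

# A "new session" hours (or one crash) later walks the same graph:
assert ContextGraph(path).facts["api.rate_limit"]["value"] == 100
```

A production version would want typed edges and atomic writes, but the contract is the same: knowledge belongs to the graph, not the session.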

03 / 04 · Specialist agents

Tool routing with capability negotiation

Agents declare what tools they need; the runtime decides what they're allowed to call. A research agent can read; a verifier can run tests; a code agent can write inside its sandbox. Capabilities are negotiated at handoff, not granted globally — so a single compromised step can't escalate across the whole network.
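Negotiation-at-handoff reduces to an intersection: grant only what is both requested and permitted for the role. The allow-lists below are illustrative assumptions, not ODIN's actual policy table.

```python
# Hypothetical per-role allow-lists; ODIN's real capability policy is not shown.
ALLOWED: dict[str, set[str]] = {
    "research": {"read_graph", "fetch_url"},
    "verify":   {"read_graph", "run_tests"},
    "code":     {"read_graph", "write_sandbox"},
}


def negotiate(role: str, requested: set[str]) -> set[str]:
    # Grant the intersection of what's requested and what the role may do.
    # Nothing is granted globally, so a compromised step can't escalate.
    return requested & ALLOWED.get(role, set())


assert negotiate("research", {"fetch_url", "write_sandbox"}) == {"fetch_url"}
assert negotiate("verify", {"deploy"}) == set()
```

The second assertion is the security property in miniature: a request for an unknown capability simply grants nothing.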

04 / 04 · Operator console

Live supervision with rollback

The operator console is a real interface, not a log viewer. Every decision the network makes flows through a feed where a human can pause, approve, or roll back any branch of the plan. ODIN is built to run unattended for hours — but built so that intervening costs seconds, not minutes.
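One way to see why rollback can cost seconds: if the decision feed is a log keyed by plan branch, rolling back is just dropping that branch's entries. This is a toy model — the console's real data structures are an assumption here.

```python
# Toy decision feed with branch-level rollback; an illustrative model only,
# not the console's actual implementation.
class DecisionFeed:
    def __init__(self) -> None:
        self.log: list[tuple[str, str]] = []  # (branch, decision)

    def record(self, branch: str, decision: str) -> None:
        self.log.append((branch, decision))

    def rollback(self, branch: str) -> None:
        # Rolling back a branch drops every decision taken under it.
        self.log = [(b, d) for b, d in self.log if b != branch]


feed = DecisionFeed()
feed.record("main", "decompose goal")
feed.record("risky", "call external API")
feed.rollback("risky")
assert feed.log == [("main", "decompose goal")]
```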

06 · Achievements

What's shipped so far.

ARCHITECTURE
5 agents

Research, Plan, Code, Verify, Report — each with scoped capabilities negotiated at handoff. No agent has more authority than its current task requires.

FAILURE MODE
Self-recovery

Failed plans don't retry — they re-orient. The OODA split lets the orchestrator abandon strategies that aren't working and re-decompose the goal from scratch.

DECISION TRAIL
Auditable

Every plan, handoff, and verification result is a typed object in the context graph. The operator can replay any decision branch and inspect why a path was taken.