Agent Loops

An agent loop is a programmatic wrapper that repeatedly prompts an AI coding agent, reads the result, checks whether the task is done, and either stops or prompts again. Matt Van Horn summarizes the shift as moving from being the human who types prompts inside the loop to being the person who writes the loop; Addy Osmani calls the adjacent practice "loop engineering," where the engineer designs the system that prompts the agents instead of manually driving every turn.^{source: addy-osmani-loop-engineering-2026.md}
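The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `prompt_agent` and `is_done` are hypothetical callables standing in for whatever agent CLI or API and completion check a real loop would use, and the fake agent exists only to make the example runnable.

```python
from typing import Callable


def run_loop(
    prompt_agent: Callable[[str], str],  # hypothetical: sends a prompt, returns the agent's output
    is_done: Callable[[str], bool],      # hypothetical: decides whether the task is finished
    task: str,
    max_turns: int = 5,
) -> tuple[str, int, bool]:
    """Prompt the agent until is_done(result) holds or max_turns is reached.

    Returns (last result, turns used, whether the task was judged done).
    """
    prompt = task
    result = ""
    for turn in range(1, max_turns + 1):
        result = prompt_agent(prompt)
        if is_done(result):
            return result, turn, True
        # Feed the previous attempt back so the next turn can build on it.
        prompt = f"{task}\n\nPrevious attempt:\n{result}\n\nThe task is not done yet; continue."
    return result, max_turns, False


def make_fake_agent() -> Callable[[str], str]:
    """Stand-in agent for illustration: reports success on its third call."""
    calls = {"n": 0}

    def agent(prompt: str) -> str:
        calls["n"] += 1
        return "DONE" if calls["n"] >= 3 else f"attempt {calls['n']}"

    return agent


final, turns, done = run_loop(make_fake_agent(), lambda out: out == "DONE", "fix the failing test")
```

The person "writing the loop" is, in this sketch, the person choosing `is_done` and `max_turns` rather than typing each prompt.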

Van Horn's article distinguishes several historical layers that are often collapsed under the word "loop": ReAct-style reason/tool/observe loops, AutoGPT-style self-prompting, Geoffrey Huntley's ralph loop with fixed anchor context, productized /goal loops, and newer orchestration loops that run on schedules, supervise other agents, and preserve durable state. In that newest layer, the unit of work is the loop itself rather than a single prompt or task.^{source: matt-van-horn-wtf-is-a-loop-2026.md}

A practical loop is more than rebranded cron: it is cron plus a model-driven decision-maker in the body. Scheduling can be ordinary infrastructure; the novel part is that the agent inspects current state, chooses the next action, executes it, evaluates feedback, and may dispatch other agents. This connects directly to harness-engineering: the value is in the context, tools, feedback, permissions, state, observability, and recovery around the model.^{source: matt-van-horn-wtf-is-a-loop-2026.md}

Verification is the central safety boundary. Van Horn's synthesis argues that open-ended loops without review generate compounding mistakes, while useful loops write, run, read results, and correct themselves. Serious loops therefore need self-verification, review tools, maximum iteration counts, no-progress detection, and token or dollar budget ceilings.^{source: matt-van-horn-wtf-is-a-loop-2026.md}
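The stopping guards listed above can be combined into one loop body. The sketch below is a minimal illustration, not anything the source prescribes: `StepResult`, `guarded_loop`, the default limits, and the use of a workspace fingerprint for no-progress detection are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepResult:
    verified: bool   # did the verifier (tests, review tool) pass after this step?
    state_hash: str  # fingerprint of the workspace after the step (assumed available)
    cost_usd: float  # spend attributed to this step


def guarded_loop(
    step: Callable[[], StepResult],  # hypothetical: one write/run/read iteration
    max_iterations: int = 10,
    budget_usd: float = 5.0,
    max_stalled: int = 2,
) -> str:
    """Run step() until verification passes or one of the guards trips."""
    spent = 0.0
    stalled = 0
    last_hash: Optional[str] = None
    for _ in range(max_iterations):
        result = step()
        spent += result.cost_usd
        if result.verified:
            return "verified"
        if spent >= budget_usd:
            return "budget_exceeded"
        # No-progress detection: count consecutive steps that leave the state unchanged.
        stalled = stalled + 1 if result.state_hash == last_hash else 0
        last_hash = result.state_hash
        if stalled >= max_stalled:
            return "no_progress"
    return "max_iterations"
```

Returning a named reason rather than a bare boolean lets the caller log why a loop stopped, which matters when a human reviews the run afterwards.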

The source also reframes skillification as the compounding asset inside the loop. A loop with no reusable skills simply re-derives work and burns money; a loop that calls named, tested skills turns repeated workflows into reusable infrastructure. This makes loops part of ai-assisted-software-development's broader shift from hand-writing code toward designing context, stopping conditions, validators, and durable automation.

Osmani frames loop engineering as one layer above harness-engineering. A harness makes one agent useful; a loop adds heartbeat, delegation, persistent state, and self-feeding task discovery. His five recurring building blocks are scheduled automations, isolated worktrees, reusable skills, connectors/plugins, and sub-agents, plus a sixth element: a memory surface, such as a markdown file or a Linear board, that outlives any single conversation.^{source: addy-osmani-loop-engineering-2026.md}

The same source sharpens the human boundary. Better loops do not remove the engineer; they make verification, comprehension, cost control, and review bandwidth more important. Osmani warns that two people can build the same loop and get opposite outcomes: one uses it to move faster on work they understand, while another uses it to avoid understanding and accelerates cognitive-surrender.^{source: addy-osmani-loop-engineering-2026.md}

Brown's test-time-compute-evaluations argument adds an external measurement consequence: if loops, scaffolds, best-of-N, and longer rollouts can turn more inference into better performance, then an agent loop is not just an implementation detail but part of the measured capability. Benchmarks and safety evaluations need to say what token, cost, or time budget the loop was allowed to spend.^{source: nvidia-enpire-agentic-robot-policy-self-improvement-2026.md}

ENPIRE extends the same loop shape into robot learning. Instead of edit → test → log, the physical loop is reset scene → run policy rollout → automatically verify outcome → inspect traces/videos/logs → change policy or training code. This shows that agent-loops are a general optimization structure, but they only become useful in the real world when the task is packaged as an agent-operable environment.^{source: nvidia-enpire-agentic-robot-policy-self-improvement-2026.md}

Ronacher sharpens the boundary between two loop layers: the coding agent already has an internal tool loop, but the new frontier is the outer harness loop that decides whether "done" is actually done. That outer loop can inject another message, continue the same session, start a fresh session with changed context, or route work to another machine. This makes stopping conditions and human role design first-class engineering problems rather than incidental UI choices.^{source: armin-ronacher-the-coming-loop-2026.md}
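One way to picture the outer harness loop's choices is as an explicit decision function. The sketch below is hypothetical throughout: the `Report` fields, the thresholds, and the ordering of the rules are assumptions for illustration, not Ronacher's design. Only the set of possible actions comes from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class NextAction(Enum):
    STOP = auto()
    INJECT_MESSAGE = auto()   # continue the same session with a follow-up message
    FRESH_SESSION = auto()    # start over with changed context
    ROUTE_ELSEWHERE = auto()  # hand the work to another machine


@dataclass
class Report:
    agent_says_done: bool
    checks_passed: bool     # external verification, e.g. a test run
    context_used: float     # fraction of the context window consumed, 0.0 to 1.0
    needs_other_host: bool  # e.g. the check needs hardware this machine lacks


def decide(report: Report, context_limit: float = 0.8) -> NextAction:
    """Outer-loop policy: "done" only counts when external checks agree."""
    if report.agent_says_done and report.checks_passed:
        return NextAction.STOP
    if report.needs_other_host:
        return NextAction.ROUTE_ELSEWHERE
    if report.context_used >= context_limit:
        return NextAction.FRESH_SESSION
    return NextAction.INJECT_MESSAGE
```

The first rule is the important one: the agent's own claim of completion is never sufficient to stop the loop.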

His caution is that loops amplify the local habits of present-day models. If each iteration responds to a failure by adding another fallback or defensive branch, long-lived code can become more complex while appearing more robust. Loops therefore work best today for porting, benchmarking, security triage, research, and short-lived experiments, where outputs are either mechanically checkable or not meant to become durable architecture; long-lived production code needs stronger invariants, review, and loop-dependency controls.^{source: armin-ronacher-the-coming-loop-2026.md}

The Liberman brothers add a market-design implication: agents are not loyal to provider brands. Because an agent can spend orders of magnitude more tokens than a chat session, its harness may eventually route itself toward cheaper model and compute providers when asked to optimize cost. That makes ai-compute-infrastructure and decentralized-ai-compute part of loop design: model choice, price discovery, and fallback routing become runtime behavior, not a one-time vendor decision.^{source: forbes-liberman-brothers-ai-infrastructure-2026.md}
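Treating provider choice as runtime behavior might look like the following sketch. The `Provider` record, its price field, and the cheapest-first policy are assumptions for illustration; a real harness would likely also weigh quality, latency, and rate limits.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Provider:
    name: str
    usd_per_million_tokens: float  # assumed to be known or discovered at runtime
    call: Callable[[str], str]     # hypothetical client; raises an exception on failure


def route(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers from cheapest to most expensive; return (provider name, reply)."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.usd_per_million_tokens):
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:
            # Fall back to the next-cheapest provider and remember why this one failed.
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Nothing in this routing logic refers to a brand, which is the point of the paragraph above: the harness only sees price and availability.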

PRINCE applies a similar loop shape to enterprise research: clarify intent → think and plan → retrieve from RAG/Text-to-SQL tools → reflect on data sufficiency → write the answer with citations. The important production addition is recoverability: checkpointed state, node-level retries, provider fallback, and user-initiated retry prevent one failed step from forcing the whole workflow to restart.^{source: martin-fowler-bayer-reliable-agentic-ai-systems-2026.md}
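Checkpointed state with node-level retries can be sketched as below. This is not PRINCE's implementation: the JSON checkpoint file, the `run_workflow` function, and the two stand-in nodes are hypothetical, chosen only to show how a failed step can be retried or resumed without rerunning earlier ones.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Node = Callable[[dict], dict]  # takes a copy of the state, returns updates to merge in


def run_workflow(nodes: list[tuple[str, Node]], checkpoint: Path, max_retries: int = 2) -> dict:
    """Run nodes in order, retrying each one and checkpointing after each success."""
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())  # resume a previous run
    else:
        state = {"completed": []}
    for name, node in nodes:
        if name in state["completed"]:
            continue  # finished in an earlier run; do not redo it
        for attempt in range(max_retries + 1):
            try:
                updates = node(dict(state))
                break
            except Exception:
                if attempt == max_retries:
                    raise  # the checkpoint still holds every earlier node's work
        state.update(updates)
        state["completed"].append(name)
        checkpoint.write_text(json.dumps(state))
    return state


calls = {"clarify": 0, "retrieve": 0}


def clarify(state: dict) -> dict:
    calls["clarify"] += 1
    return {"intent": "quarterly sales"}


def retrieve(state: dict) -> dict:
    calls["retrieve"] += 1
    if calls["retrieve"] == 1:
        raise TimeoutError("transient tool failure")  # simulated first-call failure
    return {"rows": 3}


with tempfile.TemporaryDirectory() as tmp:
    final_state = run_workflow([("clarify", clarify), ("retrieve", retrieve)], Path(tmp) / "state.json")
```

In the usage above, the simulated retrieval failure is retried at the node level, so the clarification step runs only once.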
