Physical Autoresearch

Physical autoresearch is agent-driven experimentation in the real world: an AI coding agent proposes policy or training changes, executes them on physical hardware, receives automatic feedback, and uses the result to choose the next experiment. ENPIRE frames this as the missing abstraction for robot policy self-improvement: reset the scene, execute a policy, verify the outcome, and refine the next iteration.^{source: nvidia-enpire-agentic-robot-policy-self-improvement-2026.md}

The key distinction from ordinary coding-agent work is that the scarce resource is not only tokens or test runtime, but physical rollouts on robots. ENPIRE therefore treats robot availability, GPU use, wall-clock time, logs, and videos as part of the research loop. This turns harness-engineering into infrastructure for embodied trial-and-error rather than merely software development scaffolding.^{source: nvidia-enpire-agentic-robot-policy-self-improvement-2026.md}

Jim Fan's follow-up makes the hidden engineering cost explicit: physical autoresearch only looks simple after the loopcraft exists. Before pressing Enter, the team has to build hard safety envelopes, torque-limited compliant manipulation, frozen success criteria, real-time reward classifiers, telemetry, and resource-aware scheduling. In other words, the research loop is not LLM + robot; it is LLM + robot fleet + safety harness + immutable /done + telemetry + human supervision.^{source: jim-fan-physical-autoresearch-loopcraft-2026.md}

The approach also extends test-time-compute-evaluations. ENPIRE proposes Mean Robot Utilization (MRU) and Mean Token Utilization (MTU) to evaluate multi-agent physical autoresearch. Larger robot fleets can reach success sooner, but they consume more tokens and coordination effort; capability should therefore be read as a curve over robots, GPUs, tokens, and time rather than as a single final success rate.^{source: nvidia-enpire-agentic-robot-policy-self-improvement-2026.md}

Physical autoresearch depends on agent-operable-environments: without automatic reset and verification, the agent cannot close the loop. With them, the real world starts to resemble a CI system where experiments are slow, expensive, embodied, and safety-constrained, but still machine-operable.

Physical Autoresearch

Resources