What is the fixed-budget ratchet for AI research?

Give every AI experiment an identical wall-clock budget (e.g. 5 minutes). Measure with a vocabulary-size-independent metric like bits-per-byte, so runs stay comparable even when the tokenizer changes. If results improve, commit the change to git and make it the new baseline; if not, git reset to discard it. This "ratchet" ensures the baseline only ever improves — at 5 minutes per run, that is roughly 12 experiments per hour and 100+ overnight.
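The two moving parts above — a vocabulary-independent metric and an improve-or-discard decision — can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation; the function names and the nats-based loss input are assumptions:

```python
import math

def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    # Convert the summed negative log-likelihood over the eval set
    # (in nats) into bits per byte of raw text. Normalizing by bytes
    # rather than tokens makes the metric independent of vocabulary
    # size, so tokenizer changes remain fairly comparable.
    return total_nll_nats / (math.log(2) * total_bytes)

def ratchet_step(baseline_bpb: float, candidate_bpb: float) -> str:
    # Keep the change only if the metric strictly improves
    # (lower bits-per-byte is better); otherwise discard it,
    # so the committed baseline never regresses.
    return "git commit" if candidate_bpb < baseline_bpb else "git reset"
```

In an automated pipeline, the returned decision would drive the actual `git commit` / `git reset` calls after each fixed-budget run.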

When to use this

When comparing diverse changes (architecture, optimizer, batch size) that would otherwise be hard to evaluate fairly.

When to skip this

When experiments need variable training time to reach convergence, or when your metric requires longer evaluation windows.

Level: Advanced. Prerequisites: GPU, git, automated training pipeline