What is the fixed-budget ratchet for AI research?
Give every experiment an identical wall-clock budget (e.g. 5 minutes) and measure with a vocabulary-size-independent metric such as bits-per-byte. If the result improves on the baseline, commit it to git and make it the new baseline; if not, revert with git reset. This "ratchet" guarantees the tracked metric never regresses, and at 5 minutes per run it supports roughly 12 experiments per hour, or 100+ overnight.
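A minimal sketch of one ratchet step, assuming a training pipeline that runs for the fixed budget and returns a bits-per-byte score (lower is better). The `git` helper and its `dry_run` flag are hypothetical stand-ins, not part of any described tooling:

```python
import subprocess

def git(*args, dry_run=True):
    # Hypothetical helper: with dry_run=False it shells out to real git.
    cmd = ["git", *args]
    if not dry_run:
        subprocess.run(cmd, check=True)
    return " ".join(cmd)

def ratchet_step(baseline_bpb, new_bpb, dry_run=True):
    """Keep the change if bits-per-byte improved, otherwise revert."""
    if new_bpb < baseline_bpb:
        # Improvement: commit and make this the new baseline.
        git("commit", "-am", f"bpb {new_bpb:.4f}", dry_run=dry_run)
        return new_bpb
    # Regression or no change: discard the working tree.
    git("reset", "--hard", dry_run=dry_run)
    return baseline_bpb
```

An overnight run is then just a loop: run the next experiment for the fixed budget, evaluate, and call `ratchet_step` with the result; the baseline can only move downward.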
When to use this
When comparing diverse changes (architecture, optimizer, batch size) that would otherwise be hard to evaluate fairly.
When to skip this
When experiments need variable training time to reach convergence, or when your metric requires longer evaluation windows.
Difficulty: Advanced. Prerequisites: GPU, git, automated training pipeline