Researchers ran AI agents overnight on training-optimization and model-compression experiments, using a bash-wrapped Codex loop that A/B-tested each proposed change.
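A minimal sketch of what such an overnight loop might look like, in Python rather than bash for self-containedness. All names (`run_trial`, `ab_loop`, the config keys) are hypothetical illustrations, not the researchers' actual harness, and the trial function is a stub standing in for a real training run:

```python
def run_trial(config):
    # Stub standing in for launching an agent-written experiment
    # and reading back its validation loss. The shape of this toy
    # objective (best near lr=3e-4) is purely illustrative.
    loss = abs(config["lr"] - 3e-4)
    if config["warmdown_steps"] == 0:
        loss += 0.01  # toy penalty: no warmdown schedule
    return loss

def ab_loop(baseline, candidates, rounds=3):
    """A/B loop: keep the baseline config unless a candidate
    beats it on the same metric; repeat for several rounds."""
    best, best_loss = baseline, run_trial(baseline)
    for _ in range(rounds):
        for cand in candidates:
            loss = run_trial(cand)
            if loss < best_loss:
                best, best_loss = cand, loss
    return best

baseline = {"lr": 1e-3, "warmdown_steps": 0}
candidates = [{"lr": 3e-4, "warmdown_steps": 500},
              {"lr": 5e-4, "warmdown_steps": 200}]
winner = ab_loop(baseline, candidates)
print(winner["lr"])  # → 0.0003
```

The key property the real loop shares with this sketch is that every agent-proposed change is gated on a measured comparison against the incumbent, which is what keeps an unattended overnight run from silently regressing.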
Tightly scoped agents independently discovered convergent optimizations such as learning-rate warmdown, but with loose guardrails they drifted to unrelated tasks within hours. The researchers also compressed a 2.5 TB Kimi-k2.5 model to fit on 8x RTX 3090s via expert pruning, quantization, and dynamic expert swapping; all code is open-sourced.
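The compression recipe combines dropping rarely activated MoE experts with swapping the survivors in and out of limited GPU memory. The sketch below is an assumption-laden toy, not the released code: `prune_experts` and `ExpertCache` are invented names, the usage counts are fabricated for illustration, and real swapping would move quantized weight tensors rather than strings:

```python
from collections import Counter, OrderedDict

def prune_experts(usage_counts, keep_fraction=0.5):
    """Expert pruning: rank experts by activation count and
    keep only the most-used fraction."""
    ranked = [eid for eid, _ in Counter(usage_counts).most_common()]
    keep = max(1, int(len(ranked) * keep_fraction))
    return set(ranked[:keep])

class ExpertCache:
    """LRU cache modelling dynamic expert swapping: only `slots`
    experts fit in GPU memory at once; the rest stay in host
    RAM or on disk until a token routes to them."""
    def __init__(self, slots):
        self.slots = slots
        self.resident = OrderedDict()  # expert id -> weights (stubbed)
        self.swaps = 0

    def fetch(self, eid):
        if eid in self.resident:
            self.resident.move_to_end(eid)     # mark as recently used
            return self.resident[eid]
        if len(self.resident) >= self.slots:
            self.resident.popitem(last=False)  # evict least-recently-used
        self.swaps += 1                        # simulated host->GPU copy
        self.resident[eid] = f"weights_{eid}"  # stand-in for real tensors
        return self.resident[eid]

# Fabricated activation counts: experts 2 and 4 are almost never used.
usage = {0: 900, 1: 850, 2: 3, 3: 700, 4: 1, 5: 600}
kept = prune_experts(usage, keep_fraction=0.5)
cache = ExpertCache(slots=2)
for eid in [0, 1, 0, 3, 1]:   # simulated routing decisions
    if eid in kept:
        cache.fetch(eid)
print(sorted(kept), cache.swaps)  # → [0, 1, 3] 4
```

Pruning shrinks the weight set that must be stored at all, quantization (not shown) shrinks each surviving expert, and the swapping layer is what lets a model larger than aggregate VRAM still serve tokens on consumer GPUs.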