Claude Code Agents
Directory of Claude Code agents and tools
An agent generates Python code for a search reranker using the ESCI dataset and a keyword search tool, aiming to improve NDCG from a BM25 baseline of 0.30.
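For reference, the NDCG figures quoted here can be reproduced with a short metric function. The sketch below is one common formulation, not necessarily the one used in the original evaluation: it assumes linear gains, a log2 position discount, a cutoff `k`, and an ideal ordering computed only from the graded results that were returned. The names `dcg` and `ndcg` are illustrative.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k graded results (linear gain)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg(gains: Sequence[float], k: int = 10) -> float:
    """NDCG@k: DCG of the ranking divided by DCG of the ideal ordering.

    Assumption: the ideal ordering is built from the same list of gains,
    i.e. only documents that appear in the ranking are considered.
    """
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0
```

A ranking that already places the most relevant items first scores 1.0, and a query with no relevant results scores 0.0 under this definition.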
The initial code-dumping approach yields inconsistent NDCG of around 0.33 across test queries because it is prone to overfitting. An iterative optimization process instead applies small code patches, evaluates each one on holdout queries to enforce generalization, and rejects overfit changes, producing a deployable reranker function.
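The accept-or-reject step of such a loop might look like the sketch below. It is a minimal illustration under stated assumptions, not the agent's actual implementation: `Scorer`, `accept_patch`, and `min_gain` are hypothetical names, and each scorer is assumed to return the per-query NDCG obtained with a given version of the reranker.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical type: maps a query to the NDCG achieved by one reranker version.
Scorer = Callable[[str], float]


def accept_patch(
    baseline: Scorer,
    candidate: Scorer,
    train_queries: Sequence[str],
    holdout_queries: Sequence[str],
    min_gain: float = 0.0,
) -> bool:
    """Keep a patch only if it improves mean NDCG on both query sets.

    A patch that helps the queries it was tuned on but not the holdout
    queries is treated as overfit and rejected.
    """
    train_gain = mean(candidate(q) for q in train_queries) - mean(
        baseline(q) for q in train_queries
    )
    holdout_gain = mean(candidate(q) for q in holdout_queries) - mean(
        baseline(q) for q in holdout_queries
    )
    return train_gain > min_gain and holdout_gain > min_gain
```

Requiring a gain on the holdout set as well as the tuning set is what lets the loop discard patches that merely memorize the queries the agent has already seen.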