Turning my laptop into a Search Relevance Judge with local LLMs
Local LLMs can evaluate hundreds of result pairs a minute on a MacBook, enabling a new era of rapid search relevance improvements
LLMs generate inconsistent lexical labels for identical inputs, but these labels are often semantically similar.
Clustering generated labels using vector embeddings and disjoint set union enables consistent classification, reducing the growth of unique labels from linear to sublinear as dataset size increases. This vectorization approach is initially slower and slightly more expensive, but at large volumes it becomes significantly cheaper and faster than LLM-only classification.
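The clustering idea above can be sketched with a small union-find over label embeddings: any two labels whose embeddings are sufficiently similar are merged into one cluster, and every label maps to a canonical representative. The toy two-dimensional vectors and the 0.95 cosine threshold here are illustrative assumptions standing in for a real embedding model, not values from the article:

```python
from itertools import combinations

class DSU:
    """Disjoint set union (union-find) with path compression."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def cluster_labels(labels, embeddings, threshold):
    """Merge labels whose embedding similarity exceeds the threshold;
    return a map from each label to its cluster's canonical label."""
    dsu = DSU(len(labels))
    for i, j in combinations(range(len(labels)), 2):
        if cosine(embeddings[i], embeddings[j]) >= threshold:
            dsu.union(i, j)
    return {labels[i]: labels[dsu.find(i)] for i in range(len(labels))}

# Hand-made toy vectors stand in for real embedding-model output.
labels = ["irrelevant", "not relevant", "relevant"]
vecs = [[1.0, 0.1], [0.98, 0.15], [0.1, 1.0]]
canon = cluster_labels(labels, vecs, threshold=0.95)
# "irrelevant" and "not relevant" collapse into one cluster;
# "relevant" stays in its own cluster.
```

In a real pipeline the embeddings would come from a sentence-embedding model, and the pairwise loop would be replaced by an approximate nearest-neighbor index so the cost stays manageable at large volumes.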