Measure twice, deploy once: Evaluation of retrieval systems
Paul Verhaar, Maarten Koopmans
Improving retrieval systems—especially in RAG pipelines—requires a clear understanding of what’s working and what isn’t. The only scalable way to do that is through meaningful metrics. In this talk, we share insights from building a platform-agnostic search and retrieval product, and how we balance performance against cost. Bigger models often give better results… but at what price? We explain how to assess what’s “good enough” and why the choice of benchmark really matters.