Iryna Kondrashchenko
Iryna is a data scientist and co-founder of DataForce Solutions GmbH, a company specialized in delivering end-to-end data science and AI services. She contributes to several open-source libraries, and strongly believes that open-source products foster a more inclusive tech industry, equipping individuals and organizations with the necessary tools to innovate and compete.
Session
Evaluating large language models (LLMs) in real-world applications goes far beyond standard benchmarks. When LLMs are embedded in complex pipelines, choosing the right models, prompts, and parameters becomes an ongoing challenge.
In this talk, we will present a practical, human-in-the-loop evaluation framework that enables systematic improvement of LLM-powered systems based on expert feedback. By combining domain expert insights and automated evaluation methods, it is possible to iteratively refine these systems while building transparency and trust.
This talk will be valuable for anyone who wants to ensure their LLM applications can handle real-world complexity - not just perform well on generic benchmarks.