PyData Seattle 2025

Sebastian Duerr

Seb is a Lead AI Engineer with a Master’s in Information Systems. Originally from Germany, he is now based in the US. After beginning a PhD, he moved into consulting and served as Chief Product Officer at a major Austrian bank. He later pursued NLP research with MIT, co‑founded and exited a startup, and built AI/NLP systems in production. He has taught 20+ academic courses and published seven peer‑reviewed articles, and he is known for translating complex concepts into practical solutions that bridge technical rigor and stakeholder needs.


Session

11-08
14:35
45min
Evaluation is all you need
Sebastian Duerr

LLM apps fail without reliable, reproducible evaluation. This talk maps the open‑source evaluation landscape, compares leading techniques (RAGAS, G-Eval, graders) and frameworks (DeepEval, Phoenix, Langfuse, OpenAI Evals), and shows how to combine unit tests, RAG‑specific evals, and observability to ship higher‑quality systems.
Attendees leave with a decision checklist, code patterns, and a production‑ready playbook.

Talk Track 3