Sebastian Duerr
Seb is a Senior Member of Technical Staff at Cerebras Systems. Originally from Germany and now US‑based, he holds a Master’s in Information Systems. After beginning a PhD, he moved into consulting and served as Chief Product Officer at a major Austrian bank. He later pursued NLP research with MIT, co-founded and exited a startup, and built many AI/NLP systems in production. He has taught 20+ academic courses and published seven peer‑reviewed articles, and he is known for translating complex concepts into practical solutions that bridge technical rigor with stakeholder needs.
Session
LLM apps fail without reliable, reproducible evaluation. This talk maps the open‑source evaluation landscape, compares leading techniques (RAGAS, Evaluation‑Driven Development) and frameworks (DeepEval, Phoenix, Langfuse, and Braintrust), and shows how to combine tests, RAG‑specific evals, and observability to ship higher‑quality systems.
Attendees leave with a decision checklist, code patterns, and a production‑ready playbook.