Bauke Brenninkmeijer
I’m an experienced AI engineer who has built ML and AI projects across a variety of industries. After working for several start-ups, I transitioned to corporate, where I learned about the challenges large organisations face when executing tasks that require multiple skillsets, and whether those skillsets should be centralised or not. I’m focused on scalable solutions that drive clear and quantifiable business impact. I’m a builder who sometimes does research. Or a researcher who sometimes builds. My next chapter brings me back to the world of startups, where one person covers every skill. Simple but chaotic. My current focus is on building LLM applications and managing the exponential complexity of frameworks and LLM providers. Find me to chat about MCP and multi-agent orchestration. Or anything else LLM-related.
Session
Grounding Large Language Models in your specific data is crucial, but notoriously challenging. Retrieval-Augmented Generation (RAG) is the common pattern, yet practical implementations are often brittle, suffering from poor retrieval, ineffective chunking, and context limitations, leading to inaccurate or irrelevant answers. The emergence of massive context windows (1M+ tokens) seems to offer a simpler path: just put all your data in the prompt! But does it truly solve the "needle in a haystack" problem, or does it introduce new challenges like prohibitive costs and information getting lost in the middle? This talk dives deep into the engineering realities. We'll dissect common RAG failure modes, explore techniques for building robust RAG systems (advanced retrieval, re-ranking, query transformations), and critically evaluate the practical viability, costs, and limitations of leveraging long context windows for complex data tasks in Python. You'll leave understanding the real trade-offs, ready to make informed architectural decisions for building reliable, data-grounded GenAI applications.
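As a taste of the retrieve-then-rerank pattern the talk covers, here is a minimal, dependency-free sketch. The documents, tokenizer, and scoring functions are illustrative stand-ins I've made up for this example; a real system would use embedding models for retrieval and a cross-encoder for re-ranking.

```python
import math
from collections import Counter

# Toy corpus standing in for your chunked documents.
DOCS = [
    "RAG grounds LLM answers in retrieved context",
    "Chunking strategy strongly affects retrieval quality",
    "Long context windows can lose information in the middle",
    "Re-ranking reorders candidates with a stronger relevance model",
]

def tokenize(text: str) -> list[str]:
    return text.lower().split()

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity over raw term counts (stand-in for embedding similarity).
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    # First stage: cheap similarity search over the whole corpus.
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Second stage: re-score only the top candidates with a (here: toy)
    # stronger relevance signal -- exact query-term overlap ratio.
    q_terms = set(tokenize(query))
    return sorted(
        candidates,
        key=lambda d: len(q_terms & set(tokenize(d))) / len(q_terms),
        reverse=True,
    )

if __name__ == "__main__":
    query = "why does retrieval quality depend on chunking"
    top = rerank(query, retrieve(query, DOCS))
    print(top[0])  # the chunking document ranks first
```

The two-stage shape is the point: the retriever keeps the search cheap over many chunks, while the re-ranker spends more compute on only the few candidates that survive the first pass.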