PyData Virginia 2025

Krishna Rekapalli

Krishna is a Senior Data Scientist at IBM's Watsonx.ai Solution Architecture Center of Excellence, specializing in designing and implementing enterprise-scale LLM-powered AI solutions and agentic workflows. With over 7 years of experience building machine learning applications, they bring extensive expertise in hybrid cloud architectures, geospatial data analysis, and artificial intelligence. At IBM, they work directly with clients to architect and deploy production-ready AI solutions, focusing on practical implementation challenges and scalable architectures.


Sessions

04-19
11:00
90min
Building Rich RAG Systems with Docling: Unlock Information from Tables, Images, and Complex Documents
Krishna Rekapalli

Traditional PDF extraction tools often struggle with complex layouts, tables, and images. Docling, an open-source Python library developed at IBM, excels at extracting structured information from these elements, enabling the creation of richer, more accurate vector databases. This hands-on tutorial will guide participants through building a Retrieval Augmented Generation (RAG) system using Docling.

Participants will learn how to use Docling's advanced capabilities to build RAG systems that can understand and retrieve information from complex document elements that traditional tools might miss. Along the way, they will handle complex documents, extract structured information, and create an efficient vector database for semantic search. The session will cover best practices for document parsing, chunking strategies, and integration with popular LLM frameworks.
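As a rough illustration of the parse, chunk, and retrieve flow described above, here is a minimal sketch; it is not the tutorial's actual material. The Docling call is limited to the library's basic converter usage (`DocumentConverter().convert(...)` followed by `export_to_markdown()`). Everything else is an assumption made for illustration: the helper names `chunk_by_heading`, `embed`, `cosine`, and `retrieve` are hypothetical, the sample Markdown is made up, and the bag-of-words scoring is a toy stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def convert_to_markdown(source: str) -> str:
    """Convert a document (path or URL) to Markdown with Docling.

    Requires `pip install docling`; imported lazily so the rest of this
    sketch runs without it.
    """
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return result.document.export_to_markdown()


def chunk_by_heading(markdown: str) -> list[str]:
    """Hypothetical chunker: split at headings so a table stays with its section."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Made-up Markdown standing in for Docling output, so the sketch runs offline.
SAMPLE_MD = """# Overview
This report summarizes quarterly results.

# Revenue
| Quarter | Revenue |
|---------|---------|
| Q1      | 10      |
| Q2      | 12      |

# Methods
Data was collected from internal systems.
"""

chunks = chunk_by_heading(SAMPLE_MD)
top = retrieve("revenue by quarter", chunks, k=1)
```

In a real pipeline, `SAMPLE_MD` would be replaced by the output of `convert_to_markdown(...)` on an actual document, and the toy scoring would give way to an embedding model plus a vector store. Splitting at headings is only one possible chunking strategy; it is used here because it keeps each extracted table in a single chunk.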

Room 120