Sankalp Gilda
As a Staff MLE, Sankalp gets fired up by complex technical challenges, diving deep into time series, constrained optimization, and high-performance computing. He's currently exploring the practical frontier of Generative AI, applying LLMs and multimodal techniques to improve how knowledge graphs are built from diverse sources. This talk focuses on a crucial component of that work: efficiently mapping and aligning extracted concepts to standard knowledge bases like Wikidata. Off-duty, his adventures shift from algorithmic to atmospheric (skydiving) and aquatic (scuba diving), often accompanied by his adventure-loving dog.
Session
Many projects build knowledge graphs with custom schemas but struggle to align them with standard hubs like Wikidata. Manual mapping is tedious and error-prone, while fully automated methods often lack accuracy. This talk introduces wikidata-mapper, a Python tool leveraging Large Language Models (LLMs via DSPy) to suggest semantic mappings between simple YAML ontology schemas and Wikidata identifiers (QIDs/PIDs). We demonstrate its interactive workflow, including confidence-based auto-acceptance, batch suggestion/review modes for scalability, and a novel hierarchy suggestion feature. Learn how this tool combines LLM power with human oversight to efficiently ground custom knowledge representations in Wikidata, using libraries like inquirer, tenacity, and platformdirs. Ideal for KG practitioners, data engineers, and anyone needing to integrate custom schemas with public knowledge bases.
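
To make the idea of confidence-based auto-acceptance concrete, here is a minimal sketch of how suggestions might be split between automatic acceptance and human review. It is not taken from wikidata-mapper: the Suggestion class, the triage function, the 0.9 threshold, and the confidence scores are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """One proposed mapping (hypothetical structure, not the tool's own)."""
    term: str          # name taken from the custom YAML schema
    wikidata_id: str   # suggested QID (item) or PID (property)
    confidence: float  # model-reported score in [0, 1]


def triage(suggestions, threshold=0.9):
    """Split suggestions into auto-accepted and needs-review lists.

    The threshold value is an assumption; a real tool would likely
    expose it as a user setting.
    """
    accepted, review = [], []
    for s in suggestions:
        if s.confidence >= threshold:
            accepted.append(s)
        else:
            review.append(s)
    return accepted, review


# Illustrative input: the identifiers are real Wikidata IDs,
# the confidence scores are invented for the example.
suggestions = [
    Suggestion("Person", "Q5", 0.97),
    Suggestion("birthplace", "P19", 0.93),
    Suggestion("worksFor", "P108", 0.62),
]

accepted, review = triage(suggestions)
print("auto-accepted:", [s.term for s in accepted])
print("needs review: ", [s.term for s in review])
```

In this sketch, anything below the threshold is queued for a person to confirm or reject, which is one simple way to combine LLM suggestions with human oversight.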