Kirsten Lum
At Schemantic.io and Storytellers.ai, I oversee all aspects of data science, product, and engineering, with more than 40 patent claims underlying our tech. A nearly decade-long analytics veteran of Amazon and Expedia, I have led dozens of leaders across applied science, economics, analytics, data architecture, instrumentation, customer segmentation, customer retention, marketing operations, and impact measurement at global scale.
Session
AI initiatives don’t stall because of weak models or scarce GPUs—they stall because organizations (and their LLMs) can’t reliably find, connect, and trust their own tabular data. Traditional catalogs promised order but turned into graveyards of stale metadata: manually curated, impossible to maintain, and blind to the messy realities of enterprise-scale environments.
What’s needed is a semantic foundation that doesn’t just document data, but deterministically maps it—every valid join, entity, and lineage verifiable against the data itself.
This talk explores methods designed for that reality: statistical profiling to reveal true distributions, functional type detection to identify natural keys and relationships, deterministic join validation to separate signal from noise, and entity-centric mapping that organizes data around business concepts rather than table names. These approaches automate what was once brittle and manual, keeping catalogs alive, current, and grounded in evidence.
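As a rough illustration of two of the ideas named above (natural key detection and deterministic join validation), here is a minimal Python sketch. It is not the speaker's method or any product's implementation: the function names (`is_candidate_key`, `join_coverage`, `validate_join`), the `min_coverage` threshold, and the sample `customers` and `orders` tables are all hypothetical, and the rule used (a join is valid only if the parent column is a non-null unique key and every non-null child value is found in it) is one simple assumption among many possible definitions.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def is_candidate_key(rows: List[Row], column: str) -> bool:
    """Return True if `column` is non-null and unique across all rows."""
    values = [row.get(column) for row in rows]
    if not values or any(v is None for v in values):
        return False
    return len(set(values)) == len(values)


def join_coverage(child: List[Row], child_col: str,
                  parent: List[Row], parent_col: str) -> Optional[float]:
    """Fraction of non-null child values that also appear in the parent column.

    Returns None when the child column has no non-null values to check.
    """
    parent_values = {row.get(parent_col) for row in parent} - {None}
    child_values = [row.get(child_col) for row in child
                    if row.get(child_col) is not None]
    if not child_values:
        return None
    matched = sum(1 for v in child_values if v in parent_values)
    return matched / len(child_values)


def validate_join(child: List[Row], child_col: str,
                  parent: List[Row], parent_col: str,
                  min_coverage: float = 1.0) -> bool:
    """Accept a join only if the parent column is a key and coverage is high enough.

    `min_coverage` is a hypothetical knob; 1.0 means every child value must match.
    """
    if not is_candidate_key(parent, parent_col):
        return False
    coverage = join_coverage(child, child_col, parent, parent_col)
    return coverage is not None and coverage >= min_coverage


# Hypothetical sample tables, checked against the data itself rather than names.
customers = [
    {"customer_id": 1, "region": "EU"},
    {"customer_id": 2, "region": "US"},
    {"customer_id": 3, "region": "EU"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
    {"order_id": 12, "customer_id": 3},
]

orders_to_customers = validate_join(orders, "customer_id", customers, "customer_id")
orders_id_to_customers = validate_join(orders, "order_id", customers, "customer_id")
```

In this sketch, `orders.customer_id` joins cleanly to `customers.customer_id`, while `orders.order_id` is rejected because none of its values appear in the parent key. A check on real tables would likely also need type normalization, sampling for large data, and tolerance for dirty values, none of which is shown here.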