04-18, 15:30–16:05 (US/Eastern), Auditorium 4
The team behind DVC has spent years tackling data versioning challenges. With the rise of AI, new complexities have emerged, especially with multimodal datasets that combine images, video, audio, and text. This talk shows why multimodal data versioning is different and how Pydantic provides a powerful way to structure and integrate metadata.
The team behind DVC has spent years tackling data versioning challenges. With the rise of AI, new complexities have emerged, especially with multimodal datasets that combine images, video, audio, and text. Simply tracking files is no longer enough: metadata such as bounding boxes, poses, text annotations, and embeddings is now central to dataset management, and using LLMs for auto-annotation is becoming a daily routine. This talk shows why multimodal data versioning is different, how Pydantic provides a powerful way to structure and integrate metadata, and how this approach is implemented in the open-source library DataChain.
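To make the idea of structured metadata concrete, here is a minimal sketch of how annotations such as bounding boxes and embeddings might be described with Pydantic models. The class names (`BBox`, `Detection`, `ImageAnnotation`), the field names, and the sample record are hypothetical and chosen only for illustration; they are not taken from DataChain or from the talk.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class BBox(BaseModel):
    # Hypothetical layout: top-left corner plus size, in pixels.
    x: int = Field(ge=0)
    y: int = Field(ge=0)
    width: int = Field(gt=0)
    height: int = Field(gt=0)


class Detection(BaseModel):
    label: str
    confidence: float = Field(ge=0.0, le=1.0)
    box: BBox


class ImageAnnotation(BaseModel):
    # One record per file: the file reference plus its metadata.
    path: str
    detections: List[Detection] = Field(default_factory=list)
    embedding: List[float] = Field(default_factory=list)


# A made-up record, e.g. as produced by an auto-annotation step.
raw = {
    "path": "images/cat_001.jpg",
    "detections": [
        {
            "label": "cat",
            "confidence": 0.93,
            "box": {"x": 10, "y": 20, "width": 120, "height": 80},
        }
    ],
    "embedding": [0.1, 0.2, 0.3],
}

# Nested dictionaries are validated and converted into typed objects.
annotation = ImageAnnotation(**raw)

# Invalid metadata is rejected instead of silently entering the dataset.
bad = dict(raw, detections=[dict(raw["detections"][0], confidence=1.7)])
try:
    ImageAnnotation(**bad)
    rejected = False
except ValidationError:
    rejected = True
```

The point of such a schema is that annotations become typed, validated records rather than loose JSON, so they can be versioned and queried alongside the files they describe.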
We’ll also cover efficient dataset operations at scale: computing diffs across millions of files, managing expensive GPU-based metadata computations such as embeddings, and performing incremental dataset updates. The audience will learn practical techniques for building scalable, high-performance AI workflows with modern dataset management.
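As a rough illustration of dataset diffs and incremental updates, the sketch below compares two dataset snapshots and recomputes an expensive value only for content it has not seen before. It assumes a snapshot can be represented as a mapping from file path to content checksum; the helpers `diff_snapshots` and `update_embeddings` and the stand-in `fake_embed` function are hypothetical and are not DataChain or DVC APIs.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a snapshot maps each file path to a checksum of its content.
Snapshot = Dict[str, str]


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, removed, modified) paths between two snapshots."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and old[p] != new[p])
    return added, removed, modified


def update_embeddings(
    cache: Dict[str, List[float]],
    snapshot: Snapshot,
    embed: Callable[[str], List[float]],
) -> int:
    """Compute embeddings only for checksums missing from the cache.

    Returns the number of embeddings that were actually computed.
    """
    computed = 0
    for path, checksum in snapshot.items():
        if checksum not in cache:
            cache[checksum] = embed(path)
            computed += 1
    return computed


def fake_embed(path: str) -> List[float]:
    # Stand-in for an expensive GPU model call.
    return [float(len(path))]


v1 = {"a.jpg": "c1", "b.jpg": "c2"}
v2 = {"a.jpg": "c1", "b.jpg": "c3", "c.jpg": "c4"}

added, removed, modified = diff_snapshots(v1, v2)

cache: Dict[str, List[float]] = {}
first_run = update_embeddings(cache, v1, fake_embed)   # computes both files
second_run = update_embeddings(cache, v2, fake_embed)  # skips unchanged a.jpg
```

Keying the cache by checksum rather than by path means an unchanged or merely renamed file never triggers a second expensive computation.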
No previous knowledge expected
Creator of the open-source tool DVC. Ex-Data Scientist at Microsoft. PhD in Computer Science. Now co-founder of datachain.ai.