Versioning Multimodal Data: Metadata & Beyond PyData Virginia 2025


04-18, 15:30–16:05 (US/Eastern), Auditorium 4

The team behind DVC has spent years tackling data versioning challenges. With the rise of AI, we’ve seen new complexities emerge, especially with multimodal datasets such as images, video, audio, and text. This talk shows why multimodal data versioning is different and how Pydantic provides a powerful way to structure and integrate metadata.

The team behind DVC has spent years tackling data versioning challenges. With the rise of AI, we’ve seen new complexities emerge, especially with multimodal datasets such as images, video, audio, and text. Simply tracking files is no longer enough: metadata, including bounding boxes, poses, text annotations, and embeddings, is now central to dataset management, and using LLMs for auto-annotation is becoming a daily routine. This talk shows why multimodal data versioning is different, how Pydantic provides a powerful way to structure and integrate metadata, and how this approach is implemented in the open-source library DataChain.
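As a rough illustration of the idea of structuring metadata with Pydantic, the sketch below validates raw annotation records against a typed schema. The `BBox` and `ImageMeta` models, their field names, and the `parse_records` helper are hypothetical names chosen for this example; they are not taken from the talk or from DataChain's API.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class BBox(BaseModel):
    """Axis-aligned bounding box in pixel coordinates (hypothetical schema)."""
    x: float = Field(ge=0)
    y: float = Field(ge=0)
    width: float = Field(gt=0)
    height: float = Field(gt=0)
    label: str


class ImageMeta(BaseModel):
    """Metadata attached to one image file (hypothetical schema)."""
    path: str
    caption: Optional[str] = None
    boxes: List[BBox] = Field(default_factory=list)
    embedding: Optional[List[float]] = None


def parse_records(records):
    """Split raw dicts into validated models and (path, error) pairs."""
    valid, rejected = [], []
    for record in records:
        try:
            valid.append(ImageMeta(**record))
        except ValidationError as err:
            rejected.append((record.get("path"), str(err)))
    return valid, rejected


# Example input: the second record has a negative box width and is rejected.
raw = [
    {"path": "img/001.jpg", "caption": "a dog",
     "boxes": [{"x": 10, "y": 20, "width": 50, "height": 40, "label": "dog"}]},
    {"path": "img/002.jpg",
     "boxes": [{"x": 5, "y": 5, "width": -3, "height": 10, "label": "cat"}]},
]
valid, rejected = parse_records(raw)
```

Because the schema is ordinary Python classes, the same definitions can in principle describe annotations produced by hand, by a model, or by an LLM, and malformed records are caught before they enter a dataset version.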

We’ll also cover efficient dataset operations at scale: computing diffs across millions of files, managing expensive GPU-based metadata computations such as embeddings, and performing incremental dataset updates. The audience will learn practical tricks for building scalable, high-performance AI workflows with modern dataset management techniques.
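The diff and incremental-update operations mentioned above can be sketched in plain Python. This is a minimal illustration that assumes each dataset version is a manifest mapping file path to content hash; `diff_versions`, `update_embeddings`, and the stand-in `fake_embed` function are hypothetical and do not reflect how DVC or DataChain implement these operations.

```python
from typing import Callable, Dict, List, NamedTuple


class DatasetDiff(NamedTuple):
    added: List[str]
    modified: List[str]
    removed: List[str]


def diff_versions(old: Dict[str, str], new: Dict[str, str]) -> DatasetDiff:
    """Compare two {path: content_hash} manifests."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and old[p] != new[p])
    return DatasetDiff(added, modified, removed)


def update_embeddings(
    cache: Dict[str, List[float]],
    new: Dict[str, str],
    diff: DatasetDiff,
    embed: Callable[[str], List[float]],
) -> Dict[str, List[float]]:
    """Recompute embeddings only for added or modified files."""
    # Keep cached vectors for files still present; removed files are dropped.
    updated = {p: v for p, v in cache.items() if p in new}
    for path in diff.added + diff.modified:
        updated[path] = embed(path)  # the expensive step, e.g. a GPU model
    return updated


# Two manifest versions: b.jpg changed, c.jpg was removed, d.jpg was added.
v1 = {"a.jpg": "h1", "b.jpg": "h2", "c.jpg": "h3"}
v2 = {"a.jpg": "h1", "b.jpg": "h2x", "d.jpg": "h4"}
diff = diff_versions(v1, v2)

calls = []


def fake_embed(path: str) -> List[float]:
    """Stand-in for a real embedding model; records which files it saw."""
    calls.append(path)
    return [float(len(path))]


cache = {"a.jpg": [0.1], "b.jpg": [0.2], "c.jpg": [0.3]}
updated = update_embeddings(cache, v2, diff, fake_embed)
```

Here only `d.jpg` and `b.jpg` reach the embedding function, while the cached vector for the unchanged `a.jpg` is reused. At the scale of millions of files the manifests would more likely live in a database than in dictionaries, but the set logic is the same.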

Prior Knowledge Expected: No previous knowledge expected

Dmitry Petrov

Creator of the open-source tool DVC. Ex-Data Scientist at Microsoft. PhD in Computer Science. Now co-founder of datachain.ai.
