2025-12-09, General Track
Hybrid Execution is a new capability in the open-source Modin library that lets developers write familiar pandas code while Modin automatically selects the most efficient execution backend. Small datasets run locally for fast, interactive development, while larger workloads are transparently pushed down to distributed backends for scalable, high-performance execution. This approach speeds up prototyping and iteration, and it keeps pipelines working without rewrites as data volumes grow.
pandas is one of the most widely used tools in the Python ecosystem, but scaling it beyond memory limits has traditionally required significant refactoring or switching to other tools. In this talk, we introduce Hybrid Execution, a new capability powered by Modin that allows pandas code to seamlessly switch between local, in-memory execution and distributed backends. This approach preserves the familiar pandas API while enabling users to scale their workflows without rewriting code. We'll explore how Hybrid Execution works under the hood, how Modin enables backend flexibility, and what it means for building interactive, scalable data pipelines with pandas.
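The idea of switching between local and distributed execution can be illustrated with a toy dispatcher. This is a minimal sketch in plain Python and does not use Modin's API or reflect how Modin actually decides. The names `ROW_THRESHOLD`, `Backend`, `choose_backend`, and `hybrid_sum` are hypothetical, and the row-count rule is an assumed stand-in for whatever cost model a real system would apply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical cutoff chosen for illustration only; a real system would
# weigh data size, operation type, and data-movement cost.
ROW_THRESHOLD = 100_000


@dataclass
class Backend:
    """A named execution strategy for a single operation (here: sum)."""
    name: str
    run: Callable[[List[int]], int]


def local_sum(values: List[int]) -> int:
    # In-memory execution: process everything in one pass.
    return sum(values)


def distributed_sum(values: List[int], chunk_size: int = 50_000) -> int:
    # Stand-in for a distributed engine: compute partial sums per chunk,
    # then combine them. A real backend would run chunks on workers.
    partials = [
        sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return sum(partials)


BACKENDS: Dict[str, Backend] = {
    "local": Backend("local", local_sum),
    "distributed": Backend("distributed", distributed_sum),
}


def choose_backend(n_rows: int) -> Backend:
    # Small inputs stay local; larger inputs go to the distributed path.
    key = "local" if n_rows < ROW_THRESHOLD else "distributed"
    return BACKENDS[key]


def hybrid_sum(values: List[int]) -> Tuple[str, int]:
    # The caller writes one function call; the backend is picked for them.
    backend = choose_backend(len(values))
    return backend.name, backend.run(values)


if __name__ == "__main__":
    print(hybrid_sum([1, 2, 3]))
    print(hybrid_sum(list(range(200_000)))[0])
```

The point of the sketch is that the calling code is identical for both input sizes. Only the dispatcher knows which backend ran, which mirrors the claim that user code does not need to be rewritten as data grows.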