PyData Amsterdam 2025

Streamlining data pipeline development with Ordeq
2025-09-25, Nebula

In this talk, we will introduce Ordeq, a cutting-edge data pipeline development framework used by data engineers, scientists and analysts across ING. Ordeq helps you modularise pipeline logic and abstract IO, elevating projects from proofs of concept to maintainable production-level applications. We will demonstrate how Ordeq integrates seamlessly with popular data processing tools like Spark, Polars, Matplotlib and DSPy, and with orchestration tools such as Airflow. Additionally, we will showcase how you can use Ordeq on public cloud offerings like GCP. Ordeq has zero dependencies and is available under the MIT license.


The talk is aimed at data engineers, scientists, analysts and machine learning engineers who would like to kick-start their next data pipeline project. Our talk will:
- Introduce Ordeq, its key motivations, and design philosophy
- Compare Ordeq against existing tools like dbt, Hamilton, Kedro and LangChain
- Conduct a technical deep-dive into the core components of Ordeq
- Showcase Ordeq's seamless integration with popular data processing tools like Spark, Polars, Matplotlib and DSPy
- Demonstrate Ordeq in several case studies
- Conclude with a Q&A session

Simon Brugman is a Lead Data Scientist based in Amsterdam, currently working at ING Wholesale Banking Advanced Analytics. He is the original developer behind the widely adopted open-source tool pandas-profiling (10k+ GitHub stars, millions of downloads) and has open-sourced, among others, popmon and an entity-matching-model under the ING umbrella. Simon has also contributed to popular Python and Rust projects including ruff and uv. He likes to spend time working within the Python and Rust ecosystems, particularly on effective developer tooling (linters and compilers), data tooling (data profiling and model monitoring) and, recently, LLM failure modes.

Together with Niels Neerhoff, he will present Ordeq at PyData Amsterdam this year. This open-source project bundles years of experience and learnings from the broader community into a lightweight framework for effective, reproducible, and maintainable pipelines. We’ve found it useful for everything from short data science experiments to production-level data pipelines.

Niels has been a software engineer at ING for over four years and currently focuses on data products for ESG. Previously, he ran his own company, delivering machine learning models to small and medium-sized businesses. Outside of work, Niels is passionate about cycling and music.