PyData Amsterdam 2025

Simon Brugman

Simon Brugman is a Lead Data Scientist based in Amsterdam, currently working at ING Wholesale Banking Advanced Analytics. He is the original developer of the widely adopted open-source tool pandas-profiling (10k+ GitHub stars, millions of downloads) and has, among other projects, open-sourced popmon and an entity-matching-model under the ING umbrella. Simon has also contributed to popular Python and Rust projects, including ruff and uv. He likes to spend time working within the Python and Rust ecosystems, particularly on effective developer tooling (linters and compilers), data tooling (data profiling and model monitoring) and, more recently, LLM failure modes.

Together with Niels Neerhoff, he will present ordeq at PyData Amsterdam this year. This open-source project bundles years of experience and learnings from the broader community into a lightweight framework for effective, reproducible, and maintainable pipelines. We’ve found it useful for everything from short data science experiments to production-level data pipelines.


Session

09-25
13:40
50min
Streamlining data pipeline development with Ordeq
Simon Brugman, Niels Neerhoff

In this talk, we will introduce Ordeq, a data pipeline development framework used by data engineers, scientists and analysts across ING. Ordeq helps you modularise pipeline logic and abstract IO, elevating projects from proofs of concept to maintainable production-level applications. We will demonstrate how Ordeq integrates with popular data processing tools such as Spark, Polars, Matplotlib and DSPy, and with orchestration tools such as Airflow. Additionally, we will showcase how you can use Ordeq on public cloud offerings such as GCP. Ordeq has zero dependencies and is available under the MIT license.

Nebula