PyData Eindhoven 2025

From Data Lake Entanglement to Data Mesh Decoupling: Scaling a Self-Service Data Platform
2025-12-09, Planck-Bohr

Our data platform journey started with a classic data lake: easy to ingest, hard to evolve. As domains scaled, tight coupling across source systems, pipelines, and data products slowed everything down. In this talk, we share how we re-architected toward a domain-oriented data mesh using PySpark, Delta Lake, and DQX to achieve true decoupling. Expect practical lessons on designing independent data products, managing lineage and governance, and scaling self-service without chaos.


  1. Understand what architectural decoupling actually means.
  2. Identify how hidden coupling creeps into data lakes and data meshes.
  3. Learn technical design patterns for decoupling ingestion, transformation, and ownership using open tools.
  4. Understand how to scale self-service data mesh principles without breaking governance or lineage integrity.

Prior Knowledge Expected: Beginner - No prior knowledge needed

Geert Jongen is a System Architect for Data & Analytics at Vanderlande, where he designs and evolves the company’s data platform toward a federated data mesh. Before that, he worked as a data consultant at Pipple for five years. With a background in data engineering, analytics, and data science, he focuses on building scalable data architectures that empower teams through autonomy and governance.