2025-12-09 –, Ernst-Curie
DataBallPy is an open-source Python package that gives you a quick start on analysing football-related questions. In this talk, we introduce the core features of DataBallPy using code examples and visualisations. The second part of the talk showcases a practical example of how the Royal Belgian Football Association (RBFA) has used components of DataBallPy to analyse the effectiveness and efficiency of pressuring the opponent in over 200 games. Taken together, this talk gives you a clear starting point for answering your own football-related questions.
Answering complex data-driven questions about football tactics is a lot of fun. Parsing football data, synchronising tracking and event data, and determining which individual player actually has possession of the ball is tedious work that limits the time left for the questions you wanted to answer in the first place. DataBallPy is an open-source Python package designed to abstract away these repetitive tasks, allowing analysts and developers to focus on building models and answering their questions.
In the first half of this talk, we introduce DataBallPy’s architecture and core functionality, and explain how it differs from existing packages like Kloppy. Using code examples, we will show how to perform multiple preprocessing steps and visualise the data, and we will showcase some of the implemented features. We also value transparency and education, so the documentation explains in detail how each feature is implemented. In short, DataBallPy gives you a quick start in answering your questions.
The second half of the talk presents a practical case study: how the Royal Belgian Football Association (RBFA) uses open-source packages, including DataBallPy, to analyse the efficiency and effectiveness of pressuring the opponent. The RBFA was inspired by the Common Data Format (CDF) (Anzer et al., 2025) to document and store data, and DataBallPy's functionality was likewise made to work with the CDF. The research builds on Bekkers’ pressing model, which uses the Time To Intercept (TTI) concept to estimate how quickly a defender can reach a specific target location on the pitch and uses that estimate to represent pressure. During the talk, we will show how the RBFA answered the following two questions: (1) Does external load differ significantly between successful and unsuccessful pressing actions? and (2) How does external load vary across pressing actions initiated from high-, mid-, and low-block defensive positions?
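To give a feel for the Time To Intercept idea mentioned above, here is a minimal sketch in Python. It is not Bekkers’ pressing model and not DataBallPy's implementation. It assumes a deliberately simplified motion model: the defender keeps their current velocity during a fixed reaction time and then runs in a straight line at a constant maximum speed. The function name `time_to_intercept` and the default values for `reaction_time` and `max_speed` are illustrative assumptions.

```python
import math


def time_to_intercept(pos, vel, target, reaction_time=0.7, max_speed=7.0):
    """Simplified time (in seconds) for a player to reach a target location.

    Assumptions (hypothetical, for illustration only):
    - pos and target are (x, y) coordinates in metres, vel is (vx, vy) in m/s;
    - the player drifts at the current velocity during reaction_time;
    - afterwards the player runs straight to the target at max_speed.
    """
    # Position reached at the end of the reaction time.
    x = pos[0] + vel[0] * reaction_time
    y = pos[1] + vel[1] * reaction_time

    # Remaining straight-line distance to the target.
    distance = math.hypot(target[0] - x, target[1] - y)

    return reaction_time + distance / max_speed


# A defender at the origin, jogging at 2 m/s towards a target 8.4 m away.
defender = (0.0, 0.0)
velocity = (2.0, 0.0)
target = (8.4, 0.0)
tti = time_to_intercept(defender, velocity, target)
print(f"Time to intercept: {tti:.2f} s")
```

In this example the defender covers 1.4 m during the reaction time and the remaining 7 m at 7 m/s, giving a time to intercept of roughly 1.7 seconds. A shorter time means the defender can apply pressure at that location sooner.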
PhD student at the University of Groningen investigating one-on-one dribbles.
24-year-old Sport Science and Business Administration student at the University of Groningen, currently doing an internship as a Data Scientist at the Royal Belgian Football Association (RBFA). Passionate about working with (soccer) data, and driven to excel in the fast-paced, performance-driven world of football.