2025-12-09, Ernst-Curie
FootballBERT introduces a new way of representing football players — not as static IDs or statistical aggregates that fluctuate wildly over short periods, but as contextual embeddings learned directly from match data.
Built on a Transformer architecture and trained with a Masked Player Prediction (MPP) objective, FootballBERT captures how a player’s identity emerges from teammates, opponents, and coaches’ tactical demands, much as BERT learns word meaning from the surrounding sentence.
Openly released on Hugging Face, FootballBERT is a plug-and-play foundation model whose embeddings can be integrated into any downstream system, paving the way for player-aware analytics across performance modeling, recruitment and prediction.
In football analytics, player identity is still often encoded with one-hot vectors or individual statistics, which are highly volatile over short time periods. Such approaches ignore the relational context between teammates and opponents, as well as coaches’ tactical demands.
FootballBERT changes that paradigm. Inspired by NLP breakthroughs, it applies the Transformer architecture to lineup data, learning contextual player embeddings through a Masked Player Prediction (MPP) objective. Each embedding encodes a player’s identity through patterns of whom they play with, whom they play against, and under which tactical setups.
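To make the MPP idea concrete, here is a minimal sketch of how one training example could be built from a lineup, by analogy with masked-token prediction in BERT. It is an illustration under stated assumptions, not FootballBERT's actual preprocessing: the function name `mask_players`, the reserved `MASK_ID`, the 15% masking rate, and the player ids are all hypothetical, and the `-100` label simply follows the common convention for positions that a loss function should skip.

```python
import random

MASK_ID = 0     # hypothetical id reserved for the [MASK] token
IGNORE = -100   # label for positions the model is not asked to predict

def mask_players(lineup, mask_prob=0.15, rng=None):
    """Return (inputs, labels) for one masked-player-prediction example."""
    rng = rng or random.Random()
    inputs = list(lineup)
    labels = [IGNORE] * len(lineup)
    # Choose which positions to hide; always hide at least one player.
    n_mask = max(1, round(mask_prob * len(lineup)))
    for pos in rng.sample(range(len(lineup)), n_mask):
        labels[pos] = lineup[pos]  # target: the hidden player's id
        inputs[pos] = MASK_ID      # the model only sees the mask token
    return inputs, labels

# 22 hypothetical player ids: both starting elevens of one match.
match_lineup = list(range(101, 123))
inputs, labels = mask_players(match_lineup, rng=random.Random(7))
```

A Transformer encoder would then read `inputs` and be trained to recover the ids stored in `labels`, so the only signal available for identifying a hidden player is the surrounding teammates and opponents.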
In this talk, I’ll walk through:
Why embedding player identity into dense vectors matters for football analytics.
How FootballBERT is trained end-to-end from raw lineup data and positional features across 170K+ matches.
Empirical insights on what these embeddings capture — especially their ability to generalize across leagues.
How such representations enable downstream applications like BALLER-Transfer Portal, an AI model that predicts how any player would perform in any tactical context with unprecedented granularity.
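As a small illustration of the first point above, the sketch below contrasts one-hot ids with dense vectors using cosine similarity. The three player names and their 3-dimensional vectors are toy values written by hand for this example. They are not FootballBERT embeddings, and the helpers `cosine` and `one_hot` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def one_hot(index, size):
    """One-hot vector with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# Toy dense vectors (hand-written, purely illustrative).
dense = {
    "winger_a": [0.9, 0.1, 0.2],
    "winger_b": [0.8, 0.2, 0.1],
    "centre_back": [0.1, 0.9, 0.3],
}
players = list(dense)
onehot = {name: one_hot(i, len(players)) for i, name in enumerate(players)}

# Distinct one-hot ids are always orthogonal: no notion of similarity.
sim_onehot = cosine(onehot["winger_a"], onehot["winger_b"])

# Dense vectors can place similar players close together.
sim_wingers = cosine(dense["winger_a"], dense["winger_b"])
sim_mixed = cosine(dense["winger_a"], dense["centre_back"])
```

With one-hot ids every pair of distinct players scores exactly zero, whereas dense vectors allow graded comparisons, which is what makes them usable as features in downstream models.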
This session bridges deep learning, transformers, and sports data science — showing how domain-specific foundation models like FootballBERT can power the next generation of context-aware analytics.
Whether you’re an ML researcher, AI engineer, or football data scientist, you’ll leave with a concrete understanding of how to build, train, and apply transformer models beyond text — starting from real-world, messy data.
Achraff Adjileye is a research engineer passionate about football analytics and artificial intelligence. He is the founder of the BALLER project, which aims to build a foundation model for football analytics and power the next generation of context-aware football analysis, much as GPT revolutionized text understanding.
His vision: Football is the ultimate team sport, yet most analytics treat players as isolated individuals. Players are often represented by radar charts of individual statistics, ignoring the rich collective context that shapes their identity. While this approach transformed data-driven scouting, it is inherently prone to misinterpretation, leading to costly mistakes in transfers and strategic decisions. Achraff works every day to create a football analytics world that respects the collective DNA of the beautiful game.