PyData Tel Aviv 2025

08:00

08:00

60min

Breakfast and Registration

Blue

08:00

60min

Breakfast and Registration

Red

08:00

60min

Breakfast and Registration

Green

09:00

09:00

15min

Opening Words

Blue

09:00

15min

Opening Words

Red

09:00

15min

Opening Words

Green

09:15

A New Kind of Learning Systems

Large language models are amazing, but I argue that the real promise is in systems that are built around them. Such systems include, but are not limited to, what is often referred to as "Agents", and they bring with them both opportunities and non-trivial challenges.

09:45

09:45

15min

break

Blue

09:45

15min

break

Red

09:45

15min

break

Green

10:00

Admiral-Driven ML Framework for Marine Operations and Resource Allocation

Unmanned vessels, from ships to mini-submarines, are shaping the new age of marine warfare. However, this transformation, occurring in a traditional and high-risk environment, must both improve on the current state and genuinely transform it. To this end, we developed an admiral-driven machine learning framework for marine operations and resource allocation. The framework produces admiral-approved solutions while still using state-of-the-art machine learning methods to reach a mathematical optimum for different needs. In this talk, we discuss the process we followed to develop this framework, with real-world examples, sampled data from a secret operation, and the code that could make it all happen.

From Quiz to Conversation: Engineering Production-Ready Onboarding Agents

Everyone is excited about conversational AI. Everyone is implementing their own chatbots, until they have to make a conversation behave in production.
Replacing a rigid, quiz-style signup with a dynamic onboarding chat sounds great, but executing it is far from simple. It requires designing a conversation that adapts dynamically, collects data, runs actions, and completes all of this within a reasonable timeframe. The real headache isn’t “adding an LLM”; it’s engineering an agent that can make decisions and act on them using automatic tool triggering, which includes presenting actual assets during the conversation.
In this talk, I’ll show how we treated onboarding as a conversation-engineering problem and shipped a production agent using our internal Python-based Agents SDK. We'll walk through the core building blocks and key considerations of creating and maintaining a real-world onboarding agent in production. By the end of the talk you will learn how to structure conversation flows that adapt dynamically, and how to engineer your agent to reliably achieve specific conversational goals.

Tabular data Transformed? Tab-PFN Brings Deep Learning to the Table

Tab-PFN is a recently introduced transformer-based model for classification and regression on tabular data. Published after nearly a decade without significant advances in modeling of such data, it has generated substantial interest due to its promise of exceptional performance across diverse knowledge domains. Developed by researchers at the Machine Learning Lab in Freiburg, Germany, Tab-PFN aims to set a new standard for predictive modeling in structured data.
Despite its potential, Tab-PFN remains relatively unknown among data scientists in Israel. In this talk, we will explore the principles behind Tab-PFN, demonstrate its application across real-world use cases, benchmark its performance against established "gold standard" models for tabular data, and discuss its limitations.

10:30

10:30

15min

break

Blue

10:30

15min

break

Red

10:30

15min

break

Green

10:45

How to Build AI Agents and Keep Your Sanity

Talking about AI agents is easy, but building ones that actually work is very hard.

In this talk, I’ll introduce the concept of AI agents and outline key implementation challenges—compound errors, business context, and high cost and latency. I’ll share a practical framework for building agents, including core components, planning strategies, and evaluation methods, and show how we at AI21 Labs apply these to develop effective, reliable agent-based products.

Is This Feature Actually Interesting? ML & LLMs for Automated Insight Discovery

Machine learning excels at prediction, but often leaves data scientists manually sifting through feature importance lists to find truly interesting insights. This talk introduces "InterFeat," an automated pipeline that goes beyond predictive power to identify features that are novel, plausible, and useful, i.e., "Interesting". I'll demonstrate how combining classical ML, knowledge graphs, literature mining, and Large Language Models (LLMs) can operationalize the elusive concept of "interestingness." Using a case study on real-world biomedical data (UK Biobank), I'll show how this framework automatically surfaces potentially groundbreaking hypotheses (validated by doctors) that traditional methods miss. Attendees will learn a practical approach to accelerate discovery in their own complex datasets.

Tailoring Language Models with Python: Practical SLM Fine-Tuning for Data Scientists

A hands-on guide to fine-tuning small language models (SLMs) using Python tools like Axolotl, PEFT, Accelerate, and HuggingFace Transformers. Learn practical workflows that empower data teams to build performant, private, and domain-specific LLMs.

11:15

11:15

15min

break

Blue

11:15

15min

break

Red

11:15

15min

break

Green

11:30

Faster Pandas: Speed Up Your Code, Shrink Your Cloud Bill

What if you could make your Pandas code faster, leaner, and cheaper to run?
In this talk, you'll learn how to measure performance, hunt bottlenecks, and optimize your code.

Let Your Data Tell Its Story: Building a Lightweight In-House Data Lineage Solution

What if your data could tell you its own story—where it came from, how it moved, and how it was used? In this talk, we’ll show you how to build a lightweight, in-house data lineage tool that brings that story to life. By capturing how data flows through your pipelines and systems, you gain instant visibility into dependencies, usage, and downstream impact of data changes and failures. Whether you're tracking broken pipelines or auditing data usage, keeping track of your data’s lineage gives you the context needed to take quick, informed action.

Revealing the Unseen: Leveraging XAI for Deeper Data Insights

A wise man once told me, “It is not only the what that matters - but the WHY.” In today’s rapidly evolving landscape of fraud, relying on opaque machine learning models is no longer feasible. This talk will explore how we can—and should—harness eXplainable AI (XAI) to demystify these “black-boxed” models, providing transparency and valuable insights into the decision-making processes behind fraud detection.
We will discuss how, at PayPal, we leveraged GenAI to personalize model explanations for each business need and overcame significant production scalability challenges while scaling to over half a billion accounts. By changing our perspective, we found solutions rooted in a fundamental computer science principle that unlocked new efficiencies and transparency.

12:00

12:00

60min

Lunch

Blue

12:00

60min

Lunch

Red

12:00

60min

Lunch

Green

13:00

Mining Parliamentary Gold: Building Hebrew ASR from 9,000 Hours of Knesset Debates

Yanir Marmor, Yoad Snapir

How we transformed 16 years of challenging Knesset recordings, featuring overlapping speech, audience heckling, and non-verbatim protocols, into an 8,825-hour Hebrew ASR dataset, then fine-tuned Whisper models to achieve a 10% WER reduction through smart data engineering over brute-force pre-training. Attendees will learn about the challenges and potential of "public-data gold mining."
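For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the speakers' evaluation code; the function name and the example strings are assumptions made for illustration, and the abstract does not say whether the 10% reduction is relative or absolute.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over four words.
wer = word_error_rate("the committee is adjourned", "the comity is")
```

Real evaluations usually normalize text (punctuation, casing, and for Hebrew, diacritics) before scoring, which this sketch omits.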

Talk Less, Graph More: NLP, Networks and Musicals

Hamilton isn't just a groundbreaking musical: it's a web of lyrical motifs, callbacks, high fives over our heads, and repeated phrases that echo across songs and acts.
In this talk, we explore how Natural Language Processing and graph theory can be combined to uncover the hidden structure of the corpus.

You Just Play: Using Data to learn and teach any guitar riff

Tabbers is a platform made by guitarists, for guitarists, enabling them to teach each other guitar riffs simply by playing.

Our core technology turns guitar-playing videos into rich, instructional content. We use audio and video analysis to generate tablature, color-coded finger visualizations, and other enhancements that make any riff easy to learn.

It is useful whether you’re a tutor teaching a targeted passage, a social media content creator, or an amateur player exploring phrasing and analyzing techniques.

Tabbers aims to reduce the friction between inspiration and understanding.
Our motto: You just play - we do the rest.

13:15

Putting the "fun" in function calling

Roi Tabach

Let's get back to the basics with an energized 5-minute talk about what tool use in LLMs really is, how it started, how it's measured, the Gorilla paper that started it all, and the difference between function calling and structured output.
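One way to picture the last distinction is sketched below, with hard-coded JSON strings standing in for model responses. This is an illustrative assumption and not material from the talk: the tool `get_weather`, the registry `TOOLS`, and both JSON shapes are hypothetical and do not follow any particular vendor's API. In function calling, the model names a tool and its arguments and the application executes it; in structured output, the model's answer itself is data in an agreed shape.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry of tools the application is willing to run on the model's behalf.
TOOLS = {"get_weather": get_weather}


def handle_function_call(model_output: str) -> str:
    """Function calling: the model picks a tool, the application runs it."""
    call = json.loads(model_output)
    tool = TOOLS[call["name"]]  # KeyError if the model names an unknown tool
    return tool(**call["arguments"])


def parse_structured_output(model_output: str, required_keys: set) -> dict:
    """Structured output: the answer is data that must match an agreed shape."""
    data = json.loads(model_output)
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Hard-coded strings stand in for what a model might return.
call_result = handle_function_call(
    '{"name": "get_weather", "arguments": {"city": "Tel Aviv"}}'
)
record = parse_structured_output(
    '{"city": "Tel Aviv", "sentiment": "positive"}', {"city", "sentiment"}
)
```

In the first case the interesting work happens after parsing, when the tool runs; in the second, parsing and validating the payload is the whole job.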

Recreational Image Reconstruction with Decision Trees

Daniel Anderson

A couple of months ago I got married! Very exciting, and a great excuse to complicate things by writing code.

One of the ways Python was involved is in creating abstract backgrounds for copy such as the invitation, menus and so on, using pictures we took in our travels. In the talk I'll walk through the journey of finding the technique to artistically recreate images with decision trees.

Spell My Name With... a Story

Names hold stories. And not just any stories, but stories of communities. They can represent a shared history, folklore, or even values. But in any case, they showcase the representation people have of their society and their inner self.
With that in mind, I took the CBS’s files on birth names and used Python to derive meaning, comparing OpenAI with several common naming sites, and came up with a single definition. Having one meaning for each name, I clustered the names into groups, trying to answer:
• What themes can we find in Israeli names?
• How have these themes changed throughout the years (and how does this represent changes in Israeli society)?
• Are there differences in themes between the more and the less common names?
In this talk I will try to answer these questions and reflect on the changes we’ve gone through, from Sarah and Moshe to Tamar and David.

13:30

13:30

15min

break

Blue

13:30

15min

break

Red

13:30

15min

break

Green

13:45

13:45

30min

TBD

Blue

13:45

30min

TBD

Red

13:45

30min

TBD

Green

14:15

14:15

15min

break

Blue

14:15

15min

break

Red

14:15

15min

break

Green

14:30

Autonomous LLM-driven research - from data to human-verifiable research papers

Lukas Hafner, Tal Ifargan

AI has led to major accelerations across various domains, and is poised to become a cornerstone of scientific discovery in the future. Yet, it remains unclear whether AI systems can perform fully autonomous research while also adhering to key scientific values, such as transparency, traceability and verifiability. Translating human scientific practices into a code workflow, we built data-to-paper, an automation platform that guides interacting LLM agents through a complete stepwise research process, from annotated data to comprehensive research papers. The platform can write, correct and execute code, perform literature research, produce simple figures, and write and compile a scientific manuscript. To enhance accuracy and enforce good scientific practices during the process, data-to-paper features both programmatic guardrails and LLM-based feedback. As a key feature, data-to-paper programmatically back-traces the information flow during the process, resulting in “data-chained” manuscripts in which each data element is linked to its source and which are highly readable and explainable for a human user. The platform can run fully autonomously but also allows human intervention. When tested on diverse datasets, the platform autonomously produced correct papers in 80%-90% of runs for simple datasets and research goals, yet human interventions became critical for more complex tasks. Data-to-paper is the first peer-reviewed, open-source system to present an agentic workflow of an LLM-driven AI-scientist. It demonstrates a potential for AI-driven acceleration of scientific discovery in data-driven research and beyond, while setting, through “data-chaining”, a new standard for verifiability and traceability for the coming era of AI-driven science.

Building the Future of AI Trip Planning: LLMs, Inference Optimization, and Agentic Designs at Booking.com

Moran Beladev, Chana Ross

In this practical talk, we share how Booking.com built its AI Trip Planner—an LLM-powered experience that personalizes travel planning at scale. We’ll walk through real-world design decisions, technical challenges, and infrastructure optimizations involved in delivering real-time hotel and destination recommendations using large language models (LLMs).
We’ll cover key challenges like moderating user input, classifying intent, structuring dialogues, and generating grounded responses. Through prompt engineering and custom model development, we tailored LLM interactions to our product needs while ensuring speed and relevance.
To address inference latency, we implemented speculative decoding and integrated Medusa-1, a novel architecture that predicts multiple tokens in parallel, achieving a 1.8x speedup with no loss in quality. We’ll detail its design and training trade-offs.
Beyond acceleration, we’ll highlight our move toward agentic AI systems—modular components that orchestrate LLMs, retrieval services, and Booking.com APIs to solve complex travel queries. For example:
• A Question-Answering Agent that fuses LLMs, real-time data, and APIs for context-aware answers.
• An Itinerary-Building Agent that generates dynamic, multi-step travel plans by integrating user preferences and live availability.

Finally, we’ll show how we evaluate quality in production using LLM-based evaluations, including Judge LLMs for automatic assessment, dialog quality and more.

Integrating LLMs with Traditional Data Analysis

Maria Murashova

The rise of Large Language Models has transformed data analysis, but these powerful tools aren't replacements for traditional statistical methods - they're complements. This talk explores practical strategies for building hybrid systems that leverage the strengths of both approaches. I'll demonstrate architectural patterns for effective integration, share performance comparisons, and present a real-world case study of a production sentiment analysis system that combines traditional ML for data preprocessing with LLMs for nuanced text understanding. Attendees will gain practical knowledge for enhancing their data pipelines with LLM capabilities while maintaining statistical rigor and computational efficiency.

15:00

15:00

15min

break

Blue

15:00

15min

break

Red

15:00

15min

break

Green

15:15

Evaluating Your AI Agent: How Do You Properly Measure Performance?

Linoy Cohen, Shirli Di Castro Shashua

AI agents are becoming the next big thing. But deploying an agent without truly understanding its performance, limits, and potential failure points is a high-stakes gamble. How do you ensure your agent is not just functional, but genuinely reliable, robust, and safe?
This talk explores the practical challenges of evaluating AI agents effectively. We'll discover how to define meaningful success metrics, implement comprehensive testing strategies that reflect real-world complexity, and meaningfully incorporate human feedback. You'll leave with a practical framework to confidently assess your agent's capabilities and ensure reliable performance when stakes are high.

From Pandas Chaos to Production Gold: Mastering ML Features with Feast

Yuval Gorchover

Production ML failures often stem from one overlooked issue: features that work perfectly in development break during inference. Through hands-on demonstration, this session shows how to eliminate feature drift using Feast's Python-based open source architecture. Learn to build reliable feature pipelines that maintain consistency across training and serving environments, ensuring your models perform as expected when deployed at scale.

Marimo: A new notebook

Reuven M. Lerner

Jupyter notebooks have long been the standard among people working with data. But over the years, we've seen that they have a number of limitations, too. Marimo is a completely new Python-specific notebook, similar in many ways to Jupyter but with an eye toward easier code sharing, predictable execution order, and distribution. In this talk, I'll introduce Marimo, show you a number of its features, and indicate how (and why) you might start to adopt it in your organization for coding, analysis, and even data dashboards.

15:45

15:45

15min

break

Blue

15:45

15min

break

Red

15:45

15min

break

Green

16:00

A Game-Theoretic Perspective on the Recommender (Eco-)System

Recommender systems play a critical role in modern digital platforms, driving personalized user experiences across e-commerce, content streaming, and social media. At the heart of these systems is the matching of users with relevant content, prompting content creators to optimize their work for visibility. This optimization leads to a competitive dynamic where creators continually adjust their content strategies until reaching a stable state, often at the expense of creative integrity. The strategic behavior of creators, in turn, influences the content available to end users, making the design of the recommendation mechanism a crucial factor in shaping user satisfaction. The platform operating the recommendation system must then balance creators' incentives, user satisfaction, and ecosystem stability to sustain long-term engagement. Understanding these interdependencies and the incentives of all participants is crucial for designing robust and sustainable recommendation systems.

In this talk, we will explore the recommendation ecosystem through the lens of game theory, addressing a fundamental question: how can we design recommendation mechanisms that guarantee the convergence of natural dynamics among strategic content creators to a stable state? We will go over basic concepts in game theory (no background needed :)) and apply them to derive a formal (yet simple) framework in which the stability of different recommendation mechanisms can be analyzed and compared. We will also present empirical evidence highlighting the trade-offs between content creators' welfare, user satisfaction, and ecosystem stability, along with practical tools for system designers to manage them effectively in alignment with business goals.

The talk is based on joint work with Idan Pipano, Itamar Reinman, and Moshe Tennenholtz:
https://arxiv.org/abs/2305.16695
https://arxiv.org/abs/2405.11517

Do You Want to Build a Snowman? Leveraging DBT, SQL and Python to Build Production Data Science Pipelines in Snowflake

As data scientists and analysts working in a data-led product team, we often found ourselves struggling to move our carefully crafted heuristics, insightful features, and promising machine learning models from initial experimentation into reliable, production-ready systems.

In this talk, I’ll share how we tackled this by building a scalable solution leveraging DBT and Snowflake.
This talk is meant for all data-oriented professions, and while a background in building data pipelines is helpful, it is not required to understand the talk.

Learning How to Learn in the AI Era (Using Agents as a Use Case)

Mor Hananovitz, Ortal Ashkenazi

Today, the key skill isn’t mastering every line of code - it’s keeping up. This talk shows how understanding core concepts, using AI tools, and writing effective prompts can accelerate learning and development in a fast-moving AI landscape.

The use case will be demonstrated using LangChain and LangGraph LLM frameworks, via Cursor as the IDE with native LLM infrastructure.

16:30

16:30

15min

Closing Words

Blue

16:30

15min

Closing Words

Red

16:30

15min

Closing Words

Green