PyData Tel Aviv 2025

08:00
08:00
60min
Breakfast and Registration
AI
08:00
60min
Breakfast and Registration
ML+analytics
08:00
60min
Breakfast and Registration
Eng
09:00
09:00
15min
Opening Words
AI
09:00
15min
Opening Words
ML+analytics
09:00
15min
Opening Words
Eng
09:15
09:15
30min
Keynote: Yoav Goldberg
AI
09:15
30min
Keynote: Yoav Goldberg
ML+analytics
09:15
30min
Keynote: Yoav Goldberg
Eng
09:45
09:45
15min
break
AI
09:45
15min
break
ML+analytics
09:45
15min
break
Eng
10:00
10:00
30min
From Quiz to Conversation: Engineering Production-Ready Onboarding Agents
Noa Radin

Everyone is excited about conversational AI. Everyone is implementing their own chatbots, until they have to make a conversation behave in production.
Replacing a rigid, quiz-style signup with a dynamic onboarding chat sounds great, but executing it is far from simple. It requires designing a conversation that adapts dynamically, collects data, runs actions, and completes all of this within a reasonable timeframe. The real headache isn’t “adding an LLM”; it’s engineering an agent that can make decisions and act on them using automatic tool triggering, which includes presenting actual assets during the conversation.
In this talk, I’ll show how we treated onboarding as a conversation-engineering problem and shipped a production agent using our internal Python-based Agents SDK. We'll walk through the core building blocks and key considerations of creating and maintaining a real-world onboarding agent in production. By the end of the talk you will learn how to structure conversation flows that adapt dynamically, and how to engineer your agent to reliably achieve specific conversational goals.

Eng
10:00
30min
Tabular Data Transformed? Tab-PFN Brings Deep Learning to the Table
Noa Henig

Tab-PFN is a recently introduced transformer-based model for classification and regression on tabular data. Published after nearly a decade without significant advances in modeling of such data, it has generated substantial interest due to its promise of exceptional performance across diverse knowledge domains. Developed by researchers at the Machine Learning Lab in Freiburg, Germany, Tab-PFN aims to set a new standard for predictive modeling in structured data.
Despite its potential, Tab-PFN remains relatively unknown among data scientists in Israel. In this talk, we will explore the principles behind Tab-PFN, demonstrate its application across real-world use cases, benchmark its performance against established "gold standard" models for tabular data, and discuss its limitations.

ML+analytics
10:00
30min
The Animal Kingdom Through AI Eyes: Emotions, Movement, and Disease Detection
Teddy Lazebnik

Wouldn't it be amazing to know what your cat feels? Is your dog a lover or a fighter? When does a cow’s moo signal sickness? Today, in the age of Artificial Intelligence (AI), these questions are no longer science fiction. The desire to understand animals is as old as humanity itself, with stories of people speaking to animals etched on cave walls. In this talk, we present three studies where machine learning (and deep learning) were used to answer these very questions, taking a bold step toward fulfilling this ancient dream. There will be code and data on-screen so you will be able to try on your own pet.

AI
10:30
10:30
15min
break
AI
10:30
15min
break
ML+analytics
10:30
15min
break
Eng
10:45
10:45
30min
How to Build AI Agents and Keep Your Sanity
Shuki Cohen, AI Evangelist at AI21 Labs

Talking about AI agents is easy, but building ones that actually work is very hard.

In this talk, I’ll introduce the concept of AI agents and outline key implementation challenges—compound errors, business context, and high cost and latency. I’ll share a practical framework for building agents, including core components, planning strategies, and evaluation methods, and show how we at AI21 Labs apply these to develop effective, reliable agent-based products.

Eng
10:45
30min
Is This Feature Actually Interesting? ML & LLMs for Automated Insight Discovery
Dan Ofer

Machine learning excels at prediction, often leaving data scientists manually sifting through feature importance lists to find truly interesting insights. This talk introduces "InterFeat," an automated pipeline that goes beyond predictive power to identify features that are novel, plausible, and useful, i.e., "interesting". We'll demonstrate how combining classical ML, knowledge graphs, literature mining, and Large Language Models (LLMs) can operationalize the elusive concept of "interestingness." Using a case study on real-world biomedical data (UK Biobank), I'll show how this framework automatically surfaces potentially groundbreaking hypotheses (validated by doctors) that traditional methods miss. Attendees will learn a practical approach to accelerate discovery in their own complex datasets.

ML+analytics
10:45
30min
Tailoring Language Models with Python: Practical SLM Fine-Tuning for Data Scientists
Sigal Shaked

A hands-on guide to fine-tuning small language models (SLMs) using Python tools like transformers, unsloth, and trl. Learn practical workflows that empower data teams to build performant, private, and domain-specific LLMs.

AI
11:15
11:15
15min
break
AI
11:15
15min
break
ML+analytics
11:15
15min
break
Eng
11:30
11:30
30min
Faster Pandas: Speed Up Your Code, Shrink Your Cloud Bill
Miki Tebeka

What if you could make your Pandas code faster, leaner, and cheaper to run?
In this talk, you'll learn how to measure performance, hunt bottlenecks, and optimize your code.

ML+analytics
11:30
30min
Let Your Data Tell Its Story: Building a Lightweight In-House Data Lineage Solution
אילנה מקובר

What if your data could tell you its own story—where it came from, how it moved, and how it was used? In this talk, we’ll show you how to build a lightweight, in-house data lineage tool that brings that story to life. By capturing how data flows through your pipelines and systems, you gain instant visibility into dependencies, usage, and downstream impact of data changes and failures. Whether you're tracking broken pipelines or auditing data usage, keeping track of your data’s lineage gives you the context needed to take quick, informed action.
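
As a rough sketch of the idea, lineage can be modeled as a directed graph in which an edge points from a dataset to each dataset derived from it, so that downstream impact becomes a graph traversal. The table names and the `build_graph` and `downstream` helpers below are hypothetical illustrations, not the speaker's implementation.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source dataset, derived dataset).
EDGES = [
    ("raw_events", "clean_events"),
    ("clean_events", "daily_sessions"),
    ("clean_events", "user_features"),
    ("user_features", "churn_model_input"),
]


def build_graph(edges):
    """Map each dataset to the datasets built directly from it."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    return graph


def downstream(graph, start):
    """Return every dataset affected if `start` changes or fails (BFS order)."""
    seen = []
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(graph.get(node, []))
    return seen


graph = build_graph(EDGES)
print(downstream(graph, "clean_events"))
```

Reversing the edge direction in `build_graph` would answer the opposite question, namely which upstream sources a given dataset depends on.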

Eng
11:30
30min
Revealing the Unseen: Leveraging XAI for Deeper Data Insights
Gal Benor

A wise man once told me, “It is not only the what that matters - but the WHY.” In today’s rapidly evolving landscape of fraud, relying on opaque machine learning models is no longer feasible. This talk will explore how we can—and should—harness eXplainable AI (XAI) to demystify these “black-boxed” models, providing transparency and valuable insights into the decision-making processes behind fraud detection.
We will discuss how, at PayPal, we leveraged GenAI to personalize model explanations for each business need and overcame significant production scalability challenges while scaling to over 1 billion accounts. By changing our perspective, we found solutions rooted in a fundamental computer science principle that unlocked new efficiencies and transparency.

AI
12:00
12:00
60min
Lunch
AI
12:00
60min
Lunch
ML+analytics
12:00
60min
Lunch
Eng
13:00
13:00
15min
Mining Parliamentary Gold: Building Hebrew ASR from 9,000 Hours of Knesset Debates
Yanir Marmor, Yoad Snapir

How we transformed 16 years of challenging Knesset recordings, featuring overlapping speech, audience heckling, and non-verbatim protocols, into an 8,825-hour Hebrew ASR dataset, then fine-tuned Whisper models to achieve a 10% WER reduction through smart data engineering over brute-force pre-training. Attendees will learn about the challenges and potential of "public-data gold mining."
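
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal, standard-library illustration; the function name `word_error_rate` and the sample sentences are made up for this example and are not taken from the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word out of five reference words.
print(word_error_rate("the committee will meet tomorrow",
                      "the committee will meat tomorrow"))
```

Because the denominator is the reference length, many insertions can push WER above 1.0.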

Eng
13:00
15min
Tabbers - Turn Guitar-Playing Video Clips into Lessons
Lior Kupfer

Tabbers is a suite of tools that transforms guitar-playing videos into rich, instructional content.
We use audio and video analysis to generate tablature, color-coded finger visualizations, and additional enrichments that make the material accessible and easy to learn.

This is a great tool for you whether you’re a beginner trying to understand phrasing, an advanced player analyzing technique, or a tutor trying to teach a specific phrase.

Tabbers aims to reduce the friction between inspiration and understanding. Our motto is: You just play, we do the rest.

AI
13:00
15min
Talk Less, Graph More: NLP, Networks and Musicals
Tal Mizrachi

Hamilton isn't just a groundbreaking musical; it's a web of lyrical motifs, callbacks, high fives over our heads, and repeated phrases that echo across songs and acts.
In this talk, we explore how Natural Language Processing and graph theory can be combined to uncover the hidden structure of the corpus.

ML+analytics
13:15
13:15
15min
Lightning Talk: Fun Intro to Function Calling
Roi Tabach

Let's get back to the basics with an energized 5-minute talk about what tool use in LLMs really is, how it started, how it's measured, the Gorilla paper that started it all, and the difference between function calling and structured output.
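
To make the distinction concrete, here is a minimal sketch with no real LLM involved. In function calling, the model's reply names a tool and its arguments, and the application executes it; in structured output, the reply is just data that must match an expected shape. The JSON strings stand in for model replies, and `get_weather`, `TOOLS`, and both helper functions are hypothetical names chosen for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Stand-in tool; a real one would call a weather service."""
    return f"Sunny in {city}"


# Registry of tools the application is willing to run.
TOOLS = {"get_weather": get_weather}


def handle_function_call(model_reply: str) -> str:
    """Function calling: the reply selects a tool, and our code runs it."""
    call = json.loads(model_reply)
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])


def parse_structured_output(model_reply: str, required_keys: set) -> dict:
    """Structured output: the reply is data; we only validate its shape."""
    data = json.loads(model_reply)
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Hard-coded strings standing in for model replies.
print(handle_function_call(
    '{"name": "get_weather", "arguments": {"city": "Tel Aviv"}}'))
print(parse_structured_output(
    '{"city": "Tel Aviv", "sentiment": "positive"}', {"city", "sentiment"}))
```

In the first case the application acts on the reply by running code; in the second it only checks and consumes the data.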

Eng
13:15
15min
Recreational Image Reconstruction with Decision Trees
Daniel Anderson

A couple of months ago I got married! Very exciting, and a great excuse to complicate things by writing code.

One of the ways Python was involved is in creating abstract backgrounds for copy such as the invitation, menus and so on, using pictures we took in our travels. In the talk I'll walk through the journey of finding the technique to artistically recreate images with decision trees.

ML+analytics
13:15
15min
Spell My Name With... a Story
Ira Yaari

Names hold stories. And not just any stories, but stories of communities. They can represent a shared history, folklore, or even values. But in any case, they showcase the representation people have of their society and their inner self.
With that in mind, I took the CBS’s files on birth names and used Python to derive meaning, comparing OpenAI with several common naming sites, and came up with a single definition. Having one meaning for each name, I clustered the names into groups, trying to answer:
• What themes can we find in Israeli names?
• How have these themes changed over the years (and how does this reflect changes in Israeli society)?
• Are there differences in themes between the more common and the less common names?
In this talk I will try to answer these questions and reflect on the changes we’ve gone through, from Sarah and Moshe to Tamar and David.

AI
13:30
13:30
15min
break
AI
13:30
15min
break
ML+analytics
13:30
15min
break
Eng
13:45
13:45
30min
TBD
AI
13:45
30min
TBD
ML+analytics
13:45
30min
TBD
Eng
14:15
14:15
15min
break
AI
14:15
15min
break
ML+analytics
14:15
15min
break
Eng
14:30
14:30
30min
Autonomous LLM-driven research - from data to human-verifiable research papers
Lukas Hafner, Tal Ifargan

AI has led to major accelerations across various domains and is poised to become a cornerstone of scientific discovery. Yet it remains unclear whether AI systems can perform fully autonomous research while also adhering to key scientific values, such as transparency, traceability and verifiability. Translating human scientific practices into a code workflow, we built data-to-paper, an automation platform that guides interacting LLM agents through a complete stepwise research process, from annotated data to comprehensive research papers. The platform can write, correct and execute code, perform literature research, produce simple figures, and write and compile a scientific manuscript. To enhance accuracy and enforce good scientific practices during the process, data-to-paper features both programmatic guardrails and LLM-based feedback. As a key feature, data-to-paper programmatically back-traces the information flow during the process, resulting in “data-chained” manuscripts in which each data element is linked to its source and which are highly readable and explainable for a human user. The platform can run fully autonomously but also allows human intervention. When tested on diverse datasets, the platform autonomously produced correct papers in 80%-90% of runs for simple datasets and research goals, yet human intervention became critical for more complex tasks. Data-to-paper is the first peer-reviewed, open-source system to present an agentic workflow of an LLM-driven AI-scientist. It demonstrates the potential for AI-driven acceleration of scientific discovery in data-driven research and beyond, while setting, through “data-chaining”, a new standard for verifiability and traceability for the coming era of AI-driven science.

AI
14:30
30min
Building the Future of AI Trip Planning: LLMs, Inference Optimization, and Agentic Designs at Booking.com
Moran Beladev, Chana Ross

In this practical talk, we share how Booking.com built its AI Trip Planner—an LLM-powered experience that personalizes travel planning at scale. We’ll walk through real-world design decisions, technical challenges, and infrastructure optimizations involved in delivering real-time hotel and destination recommendations using large language models (LLMs).
We’ll cover key challenges like moderating user input, classifying intent, structuring dialogues, and generating grounded responses. Through prompt engineering and custom model development, we tailored LLM interactions to our product needs while ensuring speed and relevance.
To address inference latency, we implemented speculative decoding and integrated Medusa-1, a novel architecture that predicts multiple tokens in parallel, achieving a 1.8x speedup with no loss in quality. We’ll detail its design and training trade-offs.
Beyond acceleration, we’ll highlight our move toward agentic AI systems—modular components that orchestrate LLMs, retrieval services, and Booking.com APIs to solve complex travel queries. For example:
• A Question-Answering Agent that fuses LLMs, real-time data, and APIs for context-aware answers.
• An Itinerary-Building Agent that generates dynamic, multi-step travel plans by integrating user preferences and live availability.

Finally, we’ll show how we evaluate quality in production using LLM-based evaluations, including Judge LLMs for automatic assessment, dialog quality and more.

Eng
14:30
30min
Integrating LLMs with Traditional Data Analysis
Maria Murashova

The rise of Large Language Models has transformed data analysis, but these powerful tools aren't replacements for traditional statistical methods - they're complements. This talk explores practical strategies for building hybrid systems that leverage the strengths of both approaches. I'll demonstrate architectural patterns for effective integration, share performance comparisons, and present a real-world case study of a production sentiment analysis system that combines traditional ML for data preprocessing with LLMs for nuanced text understanding. Attendees will gain practical knowledge for enhancing their data pipelines with LLM capabilities while maintaining statistical rigor and computational efficiency.

ML+analytics
15:00
15:00
15min
break
AI
15:00
15min
break
ML+analytics
15:00
15min
break
Eng
15:15
15:15
30min
Evaluating Your AI Agent: How Do You Properly Measure Performance?
Linoy Cohen, Shirli Di Castro Shashua

AI agents are becoming the next big thing. But deploying an agent without truly understanding its performance, limits, and potential failure points is a high-stakes gamble. How do you ensure your agent is not just functional, but genuinely reliable, robust, and safe?
This talk explores the practical challenges of evaluating AI agents effectively. We'll discover how to define meaningful success metrics, implement comprehensive testing strategies that reflect real world complexity, and meaningfully incorporate human feedback. You'll leave with a practical framework to confidently assess your agent's capabilities and ensure reliable performance when stakes are high.

AI
15:15
30min
From Pandas Chaos to Production Gold: Mastering ML Features with Feast
Yuval Gorchover

Production ML failures often stem from one overlooked issue: features that work perfectly in development break during inference. Through hands-on demonstration, this session shows how to eliminate feature drift using Feast's Python-based open source architecture. Learn to build reliable feature pipelines that maintain consistency across training and serving environments, ensuring your models perform as expected when deployed at scale.

Eng
15:15
30min
Marimo: A new notebook
Reuven M. Lerner

Jupyter notebooks have long been the standard among people working with data. But over the years, we've seen that they have a number of limitations, too. Marimo is a completely new Python-specific notebook, similar in many ways to Jupyter but with an eye toward easier code sharing, predictable execution order, and distribution. In this talk, I'll introduce Marimo, show you a number of its features, and indicate how (and why) you might start to adopt it in your organization, whether for coding, analysis, or even data dashboards.

ML+analytics
15:45
15:45
15min
break
AI
15:45
15min
break
ML+analytics
15:45
15min
break
Eng
16:00
16:00
30min
A Game-Theoretic Perspective on the Recommender (Eco-)System
Omer Madmon

Recommender systems play a critical role in modern digital platforms, driving personalized user experiences across e-commerce, content streaming, and social media. At the heart of these systems is the matching of users with relevant content, prompting content creators to optimize their work for visibility. This optimization leads to a competitive dynamic where creators continually adjust their content strategies until reaching a stable state, often at the expense of creative integrity. The strategic behavior of creators, in turn, influences the content available to end users, making the design of the recommendation mechanism a crucial factor in shaping user satisfaction. The platform operating the recommendation system must then balance creators' incentives, user satisfaction, and ecosystem stability to sustain long-term engagement. Understanding these interdependencies and the incentives of all participants is crucial for designing robust and sustainable recommendation systems.

In this talk, we will explore the recommendation ecosystem through the lens of game theory, addressing a fundamental question: how can we design recommendation mechanisms that guarantee the convergence of natural dynamics among strategic content creators to a stable state? We will go over basic concepts in game theory (no background needed :)) and apply them to derive a formal (yet simple) framework in which the stability of different recommendation mechanisms can be analyzed and compared. We will also present empirical evidence highlighting the trade-offs between content creators' welfare, user satisfaction, and ecosystem stability, along with practical tools for system designers to manage them effectively in alignment with business goals.

The talk is based on joint work with Idan Pipano, Itamar Reinman, and Moshe Tennenholtz:
https://arxiv.org/abs/2305.16695
https://arxiv.org/abs/2405.11517

ML+analytics
16:00
30min
Do You Want to Build a Snowman? Leveraging DBT, SQL and Python to Build Production Data Science Pipelines in Snowflake
Ori Cohen

As data scientists and analysts working in a data-led product team, we often found ourselves struggling to move our carefully crafted heuristics, insightful features, and promising machine learning models from initial experimentation into reliable, production-ready systems.

In this talk, I’ll share how we tackled this by building a scalable solution leveraging DBT and Snowflake.
This talk is meant for all data-oriented professions, and while a background in building data pipelines is helpful, it is not required to understand the talk.

Eng
16:00
30min
Learning How to Learn in the AI Era (Using Agents as a Use Case)
Mor Hananovitz, Ortal Ashkenazi

Today, the key skill isn’t mastering every line of code - it’s keeping up. This talk shows how understanding core concepts, using AI tools, and writing effective prompts can accelerate learning and development in a fast-moving AI landscape.

The use case will be demonstrated using LangChain and LangGraph LLM frameworks, via Cursor as the IDE with native LLM infrastructure.

AI
16:30
16:30
15min
Closing Words
AI
16:30
15min
Closing Words
ML+analytics
16:30
15min
Closing Words
Eng