PyData Seattle 2025

Allison Ding is a developer advocate for GPU-accelerated AI APIs, libraries, and tools at NVIDIA, with a specialization in large language models (LLMs) and advanced data science techniques. She brings over nine years of hands-on experience as a data scientist, focusing on managing and delivering end-to-end data science solutions. Her academic background includes a strong emphasis on natural language processing (NLP) and generative AI. Allison holds a master’s degree in Applied Statistics from Cornell University and a master’s degree in Computer Science from San Francisco Bay University.

Scaling Large-Scale Interactive Data Visualization with Accelerated Computing

Allison Wang

Software Engineer at Databricks. Apache Spark Committer.

Polars on Spark: Unlocking Performance with Arrow Python UDFs

Andy Terrel

I lead CUDA Python Product Management, working to make CUDA a Python native.

I received my Ph.D. from the University of Chicago in 2010, where I built domain-specific languages to generate high-performance code for physics simulations with the PETSc and FEniCS projects. After spending a brief time as a research professor at the University of Texas and Texas Advanced Computing Center, I have been a serial startup executive, including a founding team member of Anaconda.

I am a leader in the Python open data science community (PyData). A contributor to Python's scientific computing stack since 2006, I am most notably a co-creator of the popular Dask distributed computing framework, the Conda package manager, and the SymPy symbolic computing library. I was a founder of the NumFOCUS foundation. At NumFOCUS, I served as the president and director, leading the development of programs supporting open-source codes such as Pandas, NumPy, and Jupyter.

Building Inference Workflows with Tile Languages
GPU Accelerated Python

Anquida Adams

Anquida Adams is a keynote speaker, consultant, and leadership developer dedicated to cultivating healthier, more inclusive cultures within leaders, teams, and communities. With a foundation in culture development and Equity, Inclusion, and Diversity (EID), she brings over a decade of expertise helping organizations build environments that foster belonging, accountability, and innovation through human-centered leadership.

Through her signature frameworks—Identity Intelligence™️, Human Energetic Systems™️, Human Emotional Set-point Systems™️, and Socio-Emotional/Psychological Intelligence™️—Anquida equips executives and teams with tools to foster self-awareness, relational balance, and authentic leadership. Her work bridges the science of behavior with the art of human connection, guiding leaders to understand how identity, emotions, and social systems influence organizational health and culture.

Anquida’s executive and team programs emphasize inclusive, emotionally intelligent leadership, equipping organizations to navigate change, conflict, and transformation with resilience and well-being. Her philosophy—“We are the subculture of the larger culture; how are you showing up?”—reflects her belief that cultural change begins with individual accountability and self-leadership.

As a Social Relations Coach and Organizational Developer, Anquida explores how interpersonal, team-based, and institutional relationships shape opportunity, collaboration, and long-term sustainability. She draws upon theories such as social exchange theory, social constructionism, and symbolic interactionism to develop practical strategies for enhancing communication, strengthening networks, resolving conflict, and achieving deeper cultural alignment.

Her expertise spans all dimensions of human systems:

- Interpersonal Relationships: Building trust, empathy, and resilience across professional and personal contexts.

- Teams & Social Networks: Strengthening collaboration and innovation through relationship-based leadership.

- Social Interactions: Navigating cooperation, conflict, power, and negotiation with clarity and fairness.

- Organizational & Social Structure: Addressing systemic patterns, hierarchy, and culture that impact equity and inclusion.

- Social Impact: Connecting leadership practices to community well-being, organizational performance, and sustainability.

Anquida’s unique approach integrates socio-emotional intelligence with leadership development, helping organizations recognize that leadership is not just about strategy—it is about relationships, behavior, and co-creating thriving, equitable cultures.

Educated at Mississippi State University, she earned a Bachelor's degree in Sociology, Gender Studies, and Leadership Skills in 2010. After relocating to Seattle, she pursued a Master’s degree in Leadership and Organizational Development Systems (2014–2015). She also holds a Green Belt Lean Six Sigma Certification (University of Washington Tacoma, 2016) and completed the Minority Business Executive Program at UW Seattle (2018).

Her leadership in EID and community engagement includes serving as:

-Women In Bio Seattle Chapter DEI Chair (2024–Present)

-Chapter of Mothers (2024–Present)

-2025 PNW Climate Week Eastside Lead, managing over 35 sessions

-Space Week Planning Committee (Lead Space and Disability Inclusion, 2025)

-Mayor Appointed Seattle Disabilities Commission, Former Commissioner and Chair of Inclusion, Development & Outreach

-Rotary Club World Disability Advocacy Board Member (2022–23), now Honorary Member and President Advisor

-AFP, Event Planning Committee Member

-Elected 43LD Executive Board Officer, KCDCC Alternate/ KCDCC Co-Committee Chair of Membership Appreciation / WA Dem Caucus 2020/ Chair of Affirmative Action Committee (WADemCaucus2020), Election Observer

-Board of Directors, Metropolitan Democratic Club

-King County Board Developmental Disabilities/VC of the Board/ and Legislative Committee Co-Lead

-Techstars Seattle Startup Week VP of Tracks (2018)

-Kent Chamber of Commerce, Chamber Member/ DEI Committee Member

-Fremont Chamber of Commerce, Board Member, DEI Committee Lead

-GSBA Chamber of Commerce Ambassador and Scholar Interviewer Volunteer

-Seattle Chamber of Commerce, YPN Board Member

-Year Up, Student Mentor

Diversity Panel: Data for All: Empowering Underrepresented Voices in Data Science and Analytics

Avik Basu

Avik is a seasoned machine learning engineer and data scientist who is passionate about developing tech that enhances people's lives. With deep expertise in scientific Python and a proven track record of building impactful ML solutions, he focuses on creating systems that address real-world challenges and improve people's lives.

Beyond Just Prediction: Causal Thinking in Machine Learning

Aziza Mirsaidova

Aziza is an Applied Scientist at Oracle (AI Science) in Generative AI Evaluations, working specifically with multi-modal, text, and code generation. Previously she worked on content moderation and AI safety at Microsoft’s Responsible & OpenAI research team. She holds a Master of Science in Artificial Intelligence from Northwestern University. Aziza is interested in developing tools and methods that embed human-like reasoning capabilities into AI systems and applying these technologies to socially-driven tasks. Aziza is based in Seattle, and after work she gets busy training for her next marathon or hiking somewhere around the PNW.

Prompt Variation as a Diagnostic Tool: Exposing Contamination, Memorization, and True Capability in LLMs

Bernardo Dionisi

Hi, I’m Bernardo. I earned my PhD at Duke University, where I studied the economics of innovation. That work drew me into the practical challenges of data—how to make pipelines reliable, how to integrate validation naturally, and more recently, how these tools can be combined with AI.

Know Your Data(Frame) with Paguro: Declarative and Composable Validation and Metadata using Polars

Bill Engels

Bill Engels is a Principal Data Scientist with PyMC Labs, with 10 years of experience in industry and an MS in Statistics from Portland State University. He enjoys all phases of data analysis and is particularly interested in Bayesian modeling and Gaussian processes.

Actually using GPs in practice with PyMC

C.A.M. Gerlach

Python and Spyder core developer, specializing in docs, infra, and UI. Python Docs Team and PEP Editor. Star✦Fleet Commander. Former NASA-funded lightning UAH researcher.

Democratizing (Py)Data: Remote computing for all
Newcomer Sprint!

Carl Kadie

Carl Kadie leads the FaST-LMM open-source Python project for genomics. He also contributes to other Python and Rust projects, including a visualizer for the Turing Machine bbchallenge.org website. Previously, Carl was a Principal Applied Scientist at Microsoft and Microsoft Research, where he worked in machine learning, statistics, and genomics, with publications in Science and Nature.
(On the side, he writes fun articles about Python, Rust, and scientific programming on Medium.)

Explore Solvable and Unsolvable Equations with SymPy
How to Optimize your Python Program for Slowness: Inspired by New Turing Machine Results

Carlos Garcia Jurado Suarez

My name is Carlos Garcia Jurado Suarez, and I’m a Software and Machine Learning Engineering Consultant at CodePointers, helping research organizations.

I have over 25 years of experience as an engineer, applied scientist, and manager at organizations of all sizes: Big Tech (Microsoft Research, Meta), early and growth stage startups and academia. My expertise and passion are in Machine Learning and Scientific Computing, and in particular bridging the research and engineering worlds.

I hold master's degrees in Computer Science and in Applied Mathematics, both from the University of Washington, as well as a bachelor's degree in Physics from ITESM, in Monterrey, Mexico.

Wrangling Internet-scale Image Datasets

Catherine Nelson

Catherine Nelson is an experienced data scientist and ML engineer, and the author of two O'Reilly books: Software Engineering for Data Scientists (2024) and Building Machine Learning Pipelines (2020). Previously, she was a Principal Data Scientist at SAP Concur, where she deployed NLP models to production and created innovative features including ML-powered carbon emissions analytics. She is currently consulting for startups on AI evaluation and developer relations. Catherine holds a PhD in Geophysics from Durham University and a Masters in Earth Sciences from Oxford University.

Robert Masson is a Senior Principal Data Scientist at Atlassian, using data to inform strategic decisions at the company. He previously worked 11 years as a data scientist at Meta and 3 years as a quant at a hedge fund. Robert has a PhD in Mathematics from the University of Chicago.

Going From Notebooks to Production Code

Chang She

Chang She is CEO/Co-founder at LanceDB building modern data infrastructure for AI. Previously he architected the ML and experimentation stack at TubiTV as VP of Engineering. In the mythical pre-pandemic epoch, Chang was the 2nd major contributor to Pandas, CTO/Co-founder of DataPad, and a recovering financial quant.

Keynote: Chang She - Never Send a Human to do an Agent's Search

Daniel Chen

Lecturer at the University of British Columbia and Data Science Educator at Posit, PBC

LLMs, Chatbots, and Dashboards: Visualize and Analyze Your Data with Natural Language

David Aronchick

I am CEO and co-founder of Expanso and the Bacalhau Project, helping to deploy and organize our community building the next generation of the Internet.

Previously, I was co-director of Research Development at Protocol Labs, led Open Source Machine Learning Strategy at Azure, led product management for Kubernetes on behalf of Google, launched Google Kubernetes Engine, and co-founded the Kubeflow project and the SAME project. I have also worked at Microsoft, Amazon, and Chef and co-founded three startups.

When not spending too much time in service of electrons, I can be found on a mountain (on skis), traveling the world (via restaurants), or participating in kid activities, of which there are a lot more than I remember from when I was that age.

Taming the Data Tsunami: An Open-Source Playbook to Get Ready for ML

Denny Lee

Denny Lee is a long-time Apache Spark™ and MLflow contributor, Unity Catalog and Delta Lake maintainer, and a Product Management Director and Principal Developer Advocate at Databricks. He is a hands-on distributed systems and data sciences engineer with extensive experience developing internet-scale data platforms and predictive analytics and AI systems. He has previously built enterprise DW/BI and big data systems at Microsoft, including Azure Cosmos DB, Project Isotope (HDInsight), and SQL Server. He was also the Senior Director of Data Sciences Engineering at SAP Concur. He also has a Masters of Biomedical Informatics from Oregon Health and Sciences University and has implemented powerful data solutions for enterprise Healthcare customers. His current technical focuses include AI, Distributed Systems, Delta Lake, Apache Spark, Deep Learning, Machine Learning, and Genomics.

Building Agents with Agent Bricks and MCP

Devin Petersohn

Devin Petersohn is a Software Engineer at Snowflake, focusing on dataframes and distributed systems. Prior to working at Snowflake, Devin did a PhD at UC Berkeley, where he created a dataframe project called Modin, and wrote his thesis on dataframes. Devin is passionate about making complex distributed systems more accessible, and has contributed to multiple open source projects.

We don't dataframe shame: A love letter to dataframes

Eloisa Elias T

Eloisa stands as a trailblazer in Seattle’s tech scene, recognized as the Pacific Northwest’s premier open-source event host. As a data scientist and the visionary founder of PyData Seattle under NumFOCUS, she has cultivated a vibrant community for data enthusiasts. She also chairs PyLadies Seattle, empowering women in Python programming.

Her leadership extends to founding the Women in Data Science (WiDS) and Women’s conferences in Seattle, fostering inclusive spaces for learning and collaboration. As a Women Techmakers Ambassador and Databricks MVP, Eloisa amplifies her impact, inspiring countless individuals through mentorship and advocacy.

Serving on the Technical Board at NumFOCUS, Eloisa shapes the strategic direction of open-source scientific computing. Her tireless collaboration with nonprofit tech organizations, city and state governments, and enterprises drives impactful diversity and inclusion programs, uplifting women and underrepresented minorities in tech.

Diversity Panel: Data for All: Empowering Underrepresented Voices in Data Science and Analytics
Panel: Building Data-Driven Startups with User-Centric Design
Newcomer Sprint!

Esteban Ginez

I am a software engineer living in Seattle, with extensive experience at Amazon working on devices and cloud computing. At Oracle, I worked on developer tooling and novel compiler research for Java. Currently, I apply my classical computing and compiler expertise to quantum computing infrastructure challenges at Q-CTRL, where I oversee the technical direction of the quantum computing teams.

Subgraph Isomorphism at Scale with data science tools

Everett Kleven

Everett Kleven is a Solutions Engineer and Public Speaker at Daft, an open-source distributed query engine providing simple and reliable data processing for any modality and scale. Previously a Big Data TPM at Lucid Motors and Flight Controls Engineer at Boeing, he stewards community engagement curating technical content and demos on the latest advancements in multimodal AI. Everett holds advanced degrees in Aerospace Engineering, Mechanical Engineering, and Applied Physics from Washington University in St. Louis and Whitworth University in Spokane WA.

Why Models Break Your Pipelines (and How to Make Them First-Class Citizens)

FTC 18225 High Definition

We are FTC 18225 High Definition, a 5x worlds qualifier robotics team that participates in the FIRST Tech Challenge. Since our founding in 2020, we've been focused on bringing STEM and robotics to as many communities as possible through various robotics clubs and STEM advocacy efforts. We are excited to host this workshop at PyData Seattle!

Building Intelligent DIY Robots: From Hardware to Vision Systems

Fangchen Li

pandas core developer

Newcomer Sprint!

Heejoon Ahn

Heejoon is a Senior Data Scientist at Northwell Health working in healthcare research with a focus on analyzing fitness tracker data in personalized clinical trials. Heejoon has her MS from Dartmouth College in Quantitative Biomedical Sciences (Health Data Science) and is passionate about the interdisciplinary field of computational biology. She is passionate about advancements in AI and data governance. Heejoon is also an Ambassador of the Open Data Science Conference Seattle chapter to help other data professionals share their ideas and the latest methodologies in the data science field.

Diversity Panel: Data for All: Empowering Underrepresented Voices in Data Science and Analytics

Ivan Perez Avellaneda

I’m a data-driven problem solver with a Ph.D. in Electrical Engineering and a strong foundation in mathematics, economics, and nonlinear systems. Currently working as an Analytics Engineer at Monaghan Medical Corporation, I apply advanced analytics and modeling techniques to improve operational efficiency and strategic decision-making.
My doctoral research focused on data-driven reachability analysis of nonlinear systems, bridging control theory, optimization, and AI safety. I’m passionate about translating complex mathematical frameworks into scalable, intelligent solutions—whether in business, finance, or engineering domains.
With experience across academia, healthcare, and financial sectors, I’ve applied tools from machine learning, operations research, and statistical modeling to solve real-world challenges. I’ve also co-taught university-level courses in mathematics and control systems, reinforcing my commitment to clear communication and technical leadership.
Main Interests:
- Nonconvex and constrained optimization
- Optimal control and calculus of variations
- Machine learning, AI interpretability, natural language processing (NLP)
- Time-series analysis and predictive modeling
- Symbolic computation and formal methods
Education:
- Ph.D. in Electrical Engineering – University of Vermont (2023)
- M.Sc. in Economics – Pontifical Catholic University of Peru (2018)
- B.Sc. in Mathematics – Pontifical Catholic University of Peru (2016)

The Problem of Address Matching: a Journey through NLP and AI

Jack Ye

Jack Ye is a software engineer at LanceDB. He is a PMC member of Apache Iceberg and contributor to various open source projects in the data infra domain such as Apache Spark and Trino. Before joining LanceDB, Jack was a tech lead at AWS for products including SageMaker Lakehouse, S3 Tables, EMR and Athena integration with Iceberg and Delta Lake.

Supercharging Multimodal Feature Engineering with Lance and Ray

Jake Stevens-Haas

Recent Ph.D. in Applied Mathematics, with original research on ML for physics. Along the way, began maintaining pysindy, a library for Sparse Identification of Nonlinear Dynamics and occasionally contributing to NumFOCUS projects. What I love about open source is the opportunity for people of all backgrounds, countries, and ages to learn production-quality software development skills. I've felt welcomed and grateful for all my interactions with the community. Formerly, I was an officer in the U.S. Navy, sailing ships around the world, so research and engineering was a huge career transition.

I live in Seattle, where I play hockey and read a bunch.

I'm looking for work at the nexus of ML research and engineering.

Newcomer Sprint!

Jim Dowling

Dr. Jim Dowling is the CEO and a co-founder of Hopsworks. He has previously worked at MySQL and as an Associate Prof at KTH Stockholm. Jim organizes the annual feature store summit and is a co-organizer of PyData Stockholm. Jim has written a book for O'Reilly called "Building ML systems with a feature store: batch, real-time, and LLM systems".

Real-Time Context Engineering for Agents

Jiten Oswal

Jiten Oswal is an engineering leader and AI systems architect based in San Francisco with over 14 years of experience building large-scale data and AI infrastructure. He spent several years at Salesforce, earning 6 U.S. patents in cloud data and AI systems. Later, he founded and led an intelligent automation startup as CTO. Currently, he’s the founding AI Lead Engineer at an AI startup developing an open-source AI Agent Platform for Enterprises.

Building Bazel Packages for AI/ML: SciPy, PyTorch, and Beyond

John Carney

Hi, I’m John, a consultant helping deliver value from data, especially AI and ML.
I’ve worked across Data Engineering, Machine Learning Engineering, Data Science, and AI Engineering, from startups to corporate giants and from greenfield projects to mature business processes. I specialise in delivering value from data projects, having delivered tens of millions of pounds in value across various projects.

From Manchester in the UK, I co-founded PyDataMCR in 2018, and along with the team I've been running it ever since. In 2023 I took on the role of conference co-chair of PyData London for the 2024 conference, and I'm still doing that. In the summer of 2025 I helped create, and became chair of, the PyData Strategic Committee.

I got my PhD in genetics in 2016, on “Adapting UK Winter Wheat to predicted climate change scenarios”. It was during this research project that I was introduced both to programming and the PyData stack.

Building valuable Deterministic products in a Probabilistic world

John Tigue

Founder/CTO of Connoiter, producing liberally licensed open source DataMap tooling and driving the effort to have a widely useful DataMap data schema in order to promote interoperability and reduce bit rot.

How to make datamap web-apps of embedding vectors via open source tooling

Joseph Holsten

Automator & Operator; Head of Engineering at Guardrail Technologies; contributor to some codebase you’ve used today.

Newcomer Sprint!

Josh Starmer

Keynote: Josh Starmer - Communicating Concepts, Clearly Explained!!! (Or, why I don’t worry about AI taking my job and sense of purpose away from me.)

Joshua Ahmed

Bio – Joshua Ahmed | Founder & CEO, RealEngineers

Joshua Ahmed is the founder and CEO of RealEngineers, a platform that gives recruiters the ability to evaluate hardware engineers with the same clarity and confidence as technical experts — without needing an engineering degree.

A former RF Engineer at Lockheed Martin, Joshua worked on mission data and sensor fusion tools for the F-35 program. There, he saw how exceptional engineers were often overlooked — not because of skill, but because recruiters lacked tools to understand their work.

That frustration became the seed for RealEngineers — a “HackerRank for hardware engineers” that replaces artificial coding tests with real engineering proof. The platform is now evolving into a GitHub-style home for physical engineering disciplines, where engineers can showcase their real-world projects and recruiters can instantly see what great engineering looks like.

Joshua’s mission is to make technical talent visible, measurable, and transferable — starting with the 70% of engineers the internet forgot.

Panel: Building Data-Driven Startups with User-Centric Design

Justin Castilla

Justin started his Software Engineering career as a Web Development Boot Camp Instructor where he developed a passion for exciting others with new concepts and empowering individuals with the tools needed to excel in their own right. As an Advocate at Redis, Justin created numerous videos breaking down Data Structures into easy-to-understand, relatable examples with real-world use cases. Now at Elastic, he has expanded into the realm of enhanced search, monitoring, and observability capabilities.

In his spare time, Justin enjoys hiking around the Pacific Northwest, building hobby electronics, and collecting vintage music synthesizers. His love of hardware and software has led him into a deep exploration of IoT for practical applications as well as performance art!

There and back again... by ferry or I-5?

Jyotinder Singh

I'm a software engineer working on model optimization techniques in the Keras team at Google. I spend my time writing code in OSS, publishing new issues of my newsletter, or making YouTube videos!

Practical Quantization in Keras: Running Large Models on Small Devices

Khuyen Tran

Khuyen Tran transforms how data scientists learn and work. She is the author of Production-Ready Data Science: From Prototyping to Production with Python, a comprehensive guide that helps data professionals bridge the gap between experimentation and deployment.

As founder of CodeCut, she publishes daily Python tips in her newsletter that reach over 10,000 views per month and has built a community of 110,000 LinkedIn followers.

Previously an MLOps Engineer and Senior Data Engineer at Accenture, she built enterprise data solutions for clients worldwide.

Multi-Series Forecasting at Scale with StatsForecast

Micheleen Harris

Micheleen’s day job is working as a data scientist and bioinformatician, partnering with healthcare and academic customers to help make experiences and treatments better for patients. Micheleen has her MSc from NYU in Bioinformatics and loves to be at that intersection of bio and tech. She is passionate about learning, sharing her experiences with new ML and DS methods, and collaborating with others who enjoy data science. Micheleen is an organizer for two Seattle area community meetups: Seattle Women in Machine Learning and Data Science (a chapter of WiMLDS global) and Seattle AI Workshops. She believes in giving everyone a voice in tech, especially those who aren’t often given a chance. In her spare time she can be found volunteering, hiking, skiing, and painting.

Diversity Panel: Data for All: Empowering Underrepresented Voices in Data Science and Analytics

Nicholas Merchant

Machine Learning Engineer with experience training billion-parameter generative models and building high-throughput data pipelines across 1000+ GPUs. Specializes in scalable PyTorch training, structured dataset curation, and distributed infra for large-scale multimodal systems.

Wrangling Internet-scale Image Datasets

Noor Aftab

Noor Aftab is the Global Program Lead at Amazon Web Services (AWS), where she drives strategic programs for Amazon S3, supporting some of the world’s most complex data, AI, and analytics workloads. With a foundation in software engineering and data science, she brings over a decade of experience building and scaling cloud-native solutions, AI/ML systems, and developer-focused programs.

She serves as Vice President of the Society of Women Engineers (SWE) Pacific Northwest section, championing technical leadership and mentoring initiatives across engineering communities. Noor is also Chair of the NumFOCUS Code of Conduct Working Group and User Group Leader for IBM Women in AI, where she fosters inclusive, resilient communities across 300+ open-source projects.

A frequent keynote speaker, Noor has presented at PyData Global, SciPy, ODSC, TEDx, IEEE, and 13+ global venues, delivering talks that connect technical depth with real-world adoption of AI and cloud. She has authored and led initiatives such as the IEEE Hour of Power AI training program, empowering engineers and professionals with practical AI skills.

Her contributions to technology and leadership have been recognized with awards, including the Australia Alumni Excellence Award and Asia Pacific HRM Congress Award, with media features in the BBC, Martha’s Vineyard Times, and Hindustan Times.

GitHub: aftabn81 | Website: www.nooraftab.com

The Missing 78%: What We Learned When Our Community Doubled Overnight

Ojas Ankurbhai Ramwala

Ojas A. Ramwala is a final-year Ph.D. candidate at the University of Washington, Seattle, in the Department of Biomedical Informatics and Medical Education, School of Medicine. His research focuses on enhancing the clinical translation of mammography-based deep learning algorithms for breast cancer screening. His work aims to explore how to validate the generalizability of AI models in large and diverse cohorts, establish explainability methods faithful to the AI model architecture to interpret algorithm predictions, and develop robust deep learning algorithms to predict challenging clinical outcomes.

As an inquisitive research enthusiast, his interests include developing and applying Artificial Intelligence and Deep Learning techniques for Biomedical Signal and Image Processing, Bioinformatics, and Genomics. He spent a year at New York University, studying Bioinformatics, where he pursued research at the NYU Center for Genomics and Systems Biology.

Previously, Ojas was at the National Institute of Technology - Surat, India, in the Electronics Engineering Department. He has been fortunate to work as a Research Intern at the Council of Scientific and Industrial Research (CSIR-CSIO), the Indian Space Research Organization (ISRO-IIRS), and the Indian Institute of Science (IISc).

Explainable AI for Biomedical Image Processing

Oli Dinov

Reliable Enterprise AI @ AI By The Bay

Diversity Panel: Data for All: Empowering Underrepresented Voices in Data Science and Analytics

Pedro Albuquerque

Hi everyone — I’m Pedro Albuquerque, Principal Data Scientist at AppOrchid. I work where machine learning, econometrics, and applied research meet, with a big focus on interpretable, trustworthy AI. Over the past 15+ years I’ve built and shipped data products in industry (AppOrchid, FleetOps, Convoy, ServiceNow/ElementAI) and stayed active in academia (2,000+ citations, multiple peer-reviewed papers). I’ve also taught in math, CS, and business departments and founded a lab that mentored 30+ students on ML for finance, business, and social impact.

Generalized Additive Models: Explainability Strikes Back

Pedro Luraschi

Pedro is the cofounder and COO of Hal9, an AI platform startup incubated at the Allen Institute for AI (AI2) in Seattle. Hal9 empowers entrepreneurs to build and launch their own AI-powered products without the complexity of managing technical infrastructure. In this work, Pedro has collaborated with professionals including doctors, real estate agents, artists, toy designers, and educators to bring their AI ideas to life.

Before Hal9, Pedro founded Baud, an emerging technology product development studio, and Cardbots, an edTech startup that helped children develop creative and technical skills for the future. With a background in robotics and philosophy, Pedro has spent over a decade teaching high school and college students, combining hands-on technical expertise with a human-centered approach to innovation.

Panel: Building Data-Driven Startups with User-Centric Design

Rachel Wagner-Kaiser

Rachel Wagner-Kaiser, Ph.D., is a director and data scientist at KPMG. She has 15 years of experience in data and AI, and specializes in building AI and natural language processing solutions for real-world problems constrained by limited or messy data. Rachel works across industries to lead KPMG’s technical teams to design, build, deploy, and maintain NLP solutions. Her expertise has helped companies organize and decode their unstructured data to solve a variety of business problems and drive value through automation. She is also the author of the upcoming book “Teaching Computers to Read”.

Newcomer Sprint!

Rajesh

👋 Hi everyone! I’m Rajesh, a Software Engineer based in Tempe, Arizona 🌵 with 7.5+ years of experience. Currently at Jenius Bank, I’ve been building AI/ML solutions for finance clients 💳🤖 over the past 3 years.

Always excited to chat about AI engineering and where the future of AI is headed 🚀✨. Let’s connect on LinkedIn! 👉 linkedin.com/in/rajeshsk

Securing Retrieval-Augmented Generation: How to Defend Vector Databases Against 2025 Threats

Ramesh Oswal

Ramesh Oswal is a Senior Motion Planning Engineer at Aurora, with experience from Luminar and Noble.AI. He has expertise in AI/ML for Autonomous Systems and Education. He has also served as a review committee member for NeurIPS 2024, CNCF 2024, and CNCF 2023.

Building Bazel Packages for AI/ML: SciPy, PyTorch, and Beyond

Ravi Kumar Yadav

I’m currently a full-stack machine learning engineer at Walmart E-commerce, where I get to tackle exciting challenges in the world of online retail. Before that, I was a data scientist at Bank of America, building real-time fraud detection models using deep neural networks and big data – talk about high stakes!

My research interests lie in the fascinating areas of graph embedding, neural architecture search, and fast optimization methods for neural networks. I love pushing the boundaries of what’s possible with AI.

But my passion for technology extends beyond my day job. I’m also deeply invested in two side projects:

AI-Powered Vision for IoT: I’m exploring the potential of the NVIDIA Jetson Nano to create innovative machine learning vision applications for the Internet of Things.
ML Design Patterns: I’m developing reusable design patterns to solve common machine learning problems, making AI development more efficient and accessible.

And when I need a break from the digital world, I head to my garden. I’m an avid grower of Cayenne peppers – the hotter, the better!

My journey to AI was paved with diverse experiences. Earlier in my career, I worked on NLP-based automated evaluation of text data, gaining valuable insights into the power of language processing. I hold a master’s degree in computer science from North Carolina State University – Raleigh (graduated in Spring 2016) and a bachelor’s degree in electronics and communication engineering.

Building a Deep Research Agentic Workflow

Robert Masson

Going From Notebooks to Production Code

Roman Lutz

Roman Lutz is a Responsible AI Engineer on Microsoft's AI Red Team, specializing in the safety and security of generative AI and open source software. He is a maintainer of PyRIT, Microsoft’s open-source AI red teaming toolkit, and has helped shape projects like Fairlearn and the Responsible AI Dashboard. Roman’s work bridges technical rigor with a commitment to transparency and accountability, empowering practitioners to build more robust and ethical AI systems. He shares his projects and insights at romanlutz.github.io.

Red Teaming AI: Getting Started with PyRIT for Safer Generative AI Systems

Sarah Kaiser

Sarah has spent most of her career developing technology in the lab, from virtual reality hardware to satellites. She got her PhD in Physics by starting plasma fires with lasers, Python, and Jupyter Notebooks. She has also written tech books for folks of all ages, including ABCs of Engineering and Learn Quantum Computing with Python and Q#. As a Cloud Developer Advocate for Python at Microsoft and a Python Software Foundation Fellow, she finds all kinds of new ways to build and break OSS tools for data science and machine learning. When not at her split ergo keyboard, she loves boating in the Seattle area, laser cutting everything, and playing with her German Shepherd, Chewie.

There's no place like home: using AI agents in Jupyter notebooks

Saurabh Garg

I'm currently focused on building a frictionless Machine Learning Platform at Outerbounds, where our mission is to let data scientists and ML engineers stay focused on AI/ML development—while we manage the infrastructure that powers it.

My background is in large-scale distributed systems, with experience spanning cloud infrastructure and identity/authorization systems. I've worked on infrastructure teams at Oracle Cloud and Outerbounds, and on IAM/authorization platforms at Atlassian and Databricks.

At Atlassian, I was part of the team that built a CQRS-based permissions system deployed across six AWS regions, handling 100K+ read requests with sub-3ms P99 latencies.

At Databricks, I founded and led a 6-engineer team focused on authorization. We transitioned the platform from a monolithic client-based model to a service-oriented architecture, integrating with ~35 internal services and achieving P99 latencies under 1 second for over 10K requests per second.

Outside of engineering, I enjoy spending time with my daughter, and I'm always up for a game of cricket or table tennis.

Optimizing AI/ML Workloads: Resource Management and Cost Attribution

Sebastian Duerr

Seb is a Senior Member of Technical Staff at Cerebras Systems with a Master’s in Information Systems, originally from Germany and now US‑based. After beginning a PhD, he moved into consulting and served as Chief Product Officer at a major Austrian bank. He later pursued NLP research with MIT, co-founded and exited a startup, and built many AI/NLP systems in production. He has taught 20+ academic courses and published seven peer‑reviewed articles, known for translating complex concepts into practical solutions that bridge technical rigor with stakeholder needs.

Evaluation is all you need

Shujing Yang

Software Engineer at Databricks

Polars on Spark: Unlocking Performance with Arrow Python UDFs

Stephen Cheng

Stephen Cheng is a software engineer at Parakeet Health, an AI-powered voice agent startup that serves medical providers, where he works on infrastructure and backend. He has also worked at Uber and Microsoft.

Scaling Background Noise Filtration for AI Voice Agents

Trent Nelson

Principal Software Engineer @ NVIDIA.

Unlocking Parallel PyTorch Inference (and More!) with Python Free-Threading

Weston Pace

Weston is an open source software engineer at LanceDB. He is on the PMC for Apache Arrow and Substrait and has spent an unhealthy amount of time thinking about how best to read data from cloud storage. Recently he has been helping develop the Lance file and table formats and studying how random access, multimodal data, and search can be integrated into the modern data lake.

Data Loading for Data Engineers

Yibei Hu

Multi-Series Forecasting at Scale with StatsForecast

Yinhan Liu

Cofounder & CEO of FlickBloom

Panel: Building Data-Driven Startups with User-Centric Design

Yujian Tang

Yujian Tang is the founder of OSS4AI. His work primarily focuses on helping developers and founders access information, resources, and community.

Panel: Building Data-Driven Startups with User-Centric Design

Zaheera Valani

At Databricks, Zaheera Valani is a VP of Engineering & Site Lead for our growing Seattle and Bellevue offices. She leads the Databricks SQL Experiences, Partner Ecosystem, and AI/BI teams. Prior to Databricks, Zaheera was the Vice President of Product Development at Tableau, leading the Data Management organization. She started her career as a software engineer on Microsoft Excel. She is passionate about data, analytics, and engineering, and has grown teams and shipped widely adopted data and analytics products during her 20+ year career in technology.

Keynote: Zaheera Valani - Driving Data Democratization with the Databricks Data Intelligence Platform

Nidhin Pattaniyil

ML Engineer at Walmart

Building a Deep Research Agentic Workflow