PyData Virginia 2025

A Beginner's Guide to Variational Inference
04-19, 11:00–12:30 (US/Eastern), Room 140

When Bayesian modeling scales up to large datasets, traditional MCMC methods can become impractical due to their computational demands. Variational Inference (VI) offers a scalable alternative, trading exactness for speed while retaining the essence of Bayesian inference.

In this tutorial, we’ll explore how to implement and compare VI techniques in PyMC, including Automatic Differentiation Variational Inference (ADVI) and the newer Pathfinder algorithm.

Starting with simple models like linear regression, we’ll gradually introduce more complex, real-world applications, comparing the performance of VI against Markov Chain Monte Carlo (MCMC) to understand the trade-offs in speed and accuracy.

This tutorial will arm participants with practical tools to deploy VI in their workflows and help answer pressing questions like "What do I do when MCMC is too slow?" and "How does VI compare to MCMC in terms of approximation quality?"


Description

This tutorial is for data scientists, statisticians, and machine learning practitioners who are comfortable with Python and the basics of probability.

We’ll break down the mechanics of VI and its application in PyMC in an approachable way, starting with intuitive explanations and building up to practical examples.

Participants will learn how to apply ADVI and Pathfinder in PyMC and evaluate their results against MCMC, gaining insights into when and why to choose VI.

Takeaways

Participants will leave understanding:

  • The fundamentals of VI and how it differs from MCMC.
  • How to implement ADVI and Pathfinder in PyMC.
  • Practical considerations when selecting and evaluating inference methods.

Background Knowledge Required

  • Basic understanding of probability and Bayesian inference.
  • Familiarity with Python. Prior PyMC experience is helpful but not required.

Materials Distribution

All materials, including notebooks and datasets, will be available on GitHub.

Outline

  1. Introduction: Why Variational Inference? (10 min)
    - The limitations of MCMC for large datasets.
    - Overview of VI: How it works and why it’s faster.

  2. Variational Inference Basics (20 min)
    - Key concepts: Evidence Lower Bound (ELBO), optimization, and approximation families.
    - Intuitive explanation of ADVI and Pathfinder.

  3. Implementing VI with PyMC (15 min)
    - Step-by-step walkthrough of VI with a linear model.
    - Comparing ADVI, Pathfinder, and MCMC.

  4. Evaluating VI Approximations (10 min)
    - How to measure the quality of VI approximations (ELBO, simulation-based calibration, etc.).
    - Practical trade-offs between speed and accuracy.

  5. Scaling Up: Complex Models and Real-World Applications (25 min)
    - Applying VI to hierarchical and large-scale models.
    - Tips for debugging and optimizing VI workflows.

  6. Open Discussion and Q&A (10 min)
    - Address audience-specific use cases and questions.
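The ELBO from part 2 can be made concrete with a toy conjugate example in plain NumPy (a hedged sketch; the model and names are illustrative). With a N(0, 1) prior, a unit-variance Gaussian likelihood, and a Gaussian variational family, the ELBO has a closed form, and maximizing it recovers the exact posterior:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=1.5, scale=1.0, size=50)  # data from N(theta, 1)
n = y.size

def elbo(mu, s):
    """Closed-form ELBO for prior N(0,1), likelihood N(theta,1), q = N(mu, s)."""
    exp_loglik = (-0.5 * n * np.log(2 * np.pi)
                  - 0.5 * np.sum((y - mu) ** 2) - 0.5 * n * s**2)
    exp_logprior = -0.5 * np.log(2 * np.pi) - 0.5 * (mu**2 + s**2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * s**2)
    return exp_loglik + exp_logprior + entropy

# Grid search over the variational parameters (mu, s)
mus = np.linspace(0, 3, 301)
ss = np.linspace(0.05, 1.0, 191)
grid = np.array([[elbo(m, s) for s in ss] for m in mus])
i, j = np.unravel_index(grid.argmax(), grid.shape)

# The exact posterior here is N(sum(y)/(n+1), 1/(n+1)),
# so the ELBO optimum should land on its mean and standard deviation.
post_mean = y.sum() / (n + 1)
post_sd = (1.0 / (n + 1)) ** 0.5
print(mus[i], ss[j], post_mean, post_sd)
```

In real models the ELBO has no closed form, which is where ADVI's stochastic gradient estimates come in; the exact-recovery here happens only because the true posterior lies inside the variational family.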


Chris is a Principal Quantitative Analyst at PyMC Labs and an Adjunct Associate Professor at the Vanderbilt University Medical Center, with 20 years of experience as a data scientist in academia, industry, and government. He is interested in computational statistics, machine learning, Bayesian methods, and applied decision analysis. He hails from Vancouver, Canada, and received his Ph.D. from the University of Georgia.