PyData Seattle 2025

Generalized Additive Models: Explainability Strikes Back
2025-11-07, Talk Track 3

Generalized Additive Models (GAMs)

Generalized Additive Models (GAMs) strike a rare balance: they combine the flexibility of complex models with the clarity of simple ones.

They often achieve performance comparable to black-box models, yet remain:
- Easy to interpret
- Computationally efficient
- Aligned with the growing demand for transparency in AI

With recent U.S. AI policy guidance, such as the Blueprint for an AI Bill of Rights (White House, 2022), and increasing pressure from decision-makers for explainable models, GAMs are emerging as a natural choice across industries.


Audience

This talk is aimed at attendees with some background in Python and statistics, including:
- Data scientists
- Machine learning engineers
- Researchers


Takeaway

By the end, you’ll understand:
- The intuition behind GAMs
- How to build and apply them in practice
- How to interpret and explain GAM predictions and results in Python


Prerequisites

You should be comfortable with:
- Basic regression concepts
- Model regularization
- The bias–variance trade-off
- Python programming


Why GAMs Matter

In machine learning, practitioners often face a trade-off:

  • Simple models (e.g., linear or logistic regression) are transparent but often too rigid, and they risk underfitting.
  • Black-box models (e.g., deep neural networks, gradient-boosted ensembles such as XGBoost) are powerful but costly, opaque, and often difficult to trust. Their credibility rests mainly on empirical testing, which may not translate into business interpretability or regulatory compliance.

Generalized Additive Models (GAMs) resolve this tension.
They model each feature's effect on the target as a flexible, nonlinear function, preserving interpretability while achieving performance comparable to more complex models, especially on structured/tabular data.
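
To make the additive structure concrete: a GAM predicts through g(E[y]) = β0 + f1(x1) + ... + fp(xp), where g is a link function and each fj is a smooth function (typically built from splines) learned from the data. The snippet below is a minimal sketch using pyGAM, one of the Python libraries covered later in the talk; the synthetic data and the choice of two spline terms are illustrative assumptions, not part of the talk's case studies.

    # Minimal pyGAM sketch (assumes pyGAM and NumPy are installed; the data is synthetic)
    import numpy as np
    from pygam import LinearGAM, s

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(500, 2))                      # two features
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.2, size=500)

    # Additive structure: E[y] = beta_0 + f1(x1) + f2(x2), one spline term per feature
    gam = LinearGAM(s(0) + s(1)).fit(X, y)

    # Interpretability comes from inspecting each fitted smooth f_j on its own
    for i, term in enumerate(gam.terms):
        if term.isintercept:
            continue
        XX = gam.generate_X_grid(term=i)                       # grid over feature i
        pdep = gam.partial_dependence(term=i, X=XX)            # f_i evaluated on that grid
        print(f"term {i}: learned smooth evaluated at {len(pdep)} grid points")

Each partial-dependence curve can then be plotted directly, which is the visualization step in the hands-on part of the talk.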

Key Advantages

  • Performance: Research (Hastie & Tibshirani, 1990; Lou, Caruana, & Gehrke, 2012) and empirical studies show that GAMs often rival tree-based and boosting methods. For example, StitchFix demonstrated that GAMs achieved nearly the same AUROC as random forests in customer acquisition, but with much lower scoring times.
  • Industry adoption: Companies increasingly adopt GAMs, as the marginal accuracy gains of black-box models rarely justify their costs. Transparent models reduce computational overhead and simplify validation pipelines.
  • Regulation & Trust: In regulated domains like healthcare and finance, interpretability is a requirement. The U.S. Blueprint for an AI Bill of Rights (White House, 2022) and standards like the NIST AI Risk Management Framework (AI RMF) emphasize transparency, fairness, and accountability in AI. GAMs offer practitioners a practical path to align with these standards while preserving predictive power.

Supporting Evidence

  • 📊 StitchFix (Customer Acquisition): GAMs matched random forests in predictive performance while requiring a fraction of the scoring time, making them far more deployable.
  • Forecasting (SeasonalNaive vs. LagLlama): A simple SeasonalNaive baseline outperformed LagLlama (a deep foundation model for forecasting) by 42% in accuracy while running roughly 1,000× faster, underscoring how interpretable, computationally efficient models can surpass state-of-the-art approaches.
  • Trustworthy AI Standards (NIST AI RMF, ISO/IEC 23894, ISO/IEC 42001): These frameworks stress explainability, robustness, fairness, and accountability as cornerstones of trustworthy AI. GAMs inherently support these values by being interpretable, auditable, and easier to govern compared to opaque architectures.

Together, these findings reinforce that interpretable ≠ weak. GAMs and similar models demonstrate that simplicity can coexist with power, compliance, and efficiency, making them a responsible choice for modern AI.


Outline & Time Breakdown

  • 0–10 min: Setting the Stage
    - The trade-off: simple vs. complex models
    - Model intuition
    - Real-world examples of why interpretability matters

  • 10–20 min: Understanding GAMs
    - The math (intuitively explained)
    - Tools in Python: pyGAM, statsmodels, pyro
    - Smooth functions, splines, and additive structures

  • 20–30 min: Hands-On Examples
    - Building a GAM in Python
    - Benchmarking against logistic regression & random forests
    - Visualizing terms for interpretability

  • 30–40 min: Applications & Case Studies
    - Healthcare: risk prediction with trust
    - Finance: credit scoring with compliance
    - Business: churn modeling with interpretability

  • 40–45 min: Limitations & What’s Next
    - When not to use GAMs
    - Extensions: Explainable Boosting Machines (EBMs)
    - Open questions in interpretable ML

  • 45–50 min: Q&A


References

  • Hastie, T., & Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall.
  • Lou, Y., Caruana, R., & Gehrke, J. (2012). Intelligible Models for Classification and Regression. KDD.
  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. Springer.
  • Caruana, R. et al. (2015). Intelligible Models with Pairwise Interactions. KDD.
  • White House (2022). Blueprint for an AI Bill of Rights.
  • Larsen, K. (StitchFix). GAM vs. Random Forest Performance in Direct Mail Customer Acquisition. GitHub.
  • Nixtla AI (2023). SeasonalNaive vs. LagLlama: Large-Scale Forecasting Benchmark. arXiv.
  • Nannapaneni, S. (2025). Trustworthy and Responsible AI Modeling. AppOrchid Inc.
  • Wood, S. N. (2017). Generalized Additive Models: An Introduction with R (2nd ed.). CRC Press.

Prior Knowledge Expected: Previous knowledge expected

Dear Program Committee,

I am currently a Principal Data Scientist at AppOrchid, where I lead projects at the intersection of machine learning, econometrics, and applied research, with a strong focus on interpretable and trustworthy AI. Over the past 15+ years, I have built a career bridging industry and academia, delivering data-driven solutions at organizations such as FleetOps, Convoy, and ServiceNow (ElementAI). My academic contributions include 2,000+ citations and multiple peer-reviewed publications (Google Scholar profile).

As an Associate Professor, I taught in the Mathematics, Computer Science, and Business departments, designing and delivering courses in econometrics, statistical inference, and operational research. I also founded the Laboratory of Machine Learning in Finance and Organizations, mentoring more than 30 students and researchers on projects applying ML to finance, business, and social impact.

Beyond research and teaching, I am an experienced speaker and educator, known for communicating complex ideas in clear and engaging ways. Across conferences, lectures, and industry events, I have consistently emphasized explainability, transparency, and practical impact—principles that directly align with the growing demand for trustworthy AI.

With the rise of policy and risk-management frameworks such as the U.S. Blueprint for an AI Bill of Rights (2022) and the NIST AI Risk Management Framework, the need for interpretable models like Generalized Additive Models (GAMs) has never been greater. My session will demonstrate how GAMs provide a rare balance of performance, interpretability, and compliance, supported by real-world case studies and hands-on examples in Python.

I believe my background uniquely positions me to deliver a session that is both technically rigorous and directly relevant to today’s regulatory, business, and academic landscapes.

Sincerely,
Pedro Henrique Melo Albuquerque