Jyotinder Singh PyData Seattle 2025

Jyotinder Singh

I'm a software engineer working on model optimization techniques on the Keras team at Google. I spend my time writing open-source code, publishing new issues of my newsletter, or making YouTube videos!

Session

11-08

10:10

45min

Practical Quantization in Keras: Running Large Models on Small Devices

Jyotinder Singh

Large language models are often too large to run on personal machines and instead require specialized hardware with massive memory. Quantization provides a way to shrink models, speed them up, and reduce their memory usage, all while retaining most of their accuracy.

This talk introduces the fundamentals of neural network quantization and its key techniques, and demonstrates how to apply them using Keras's extensible quantization framework.
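As a rough illustration of the basic idea (not material from the talk, and not the Keras API), the sketch below quantizes a handful of float weights to 8-bit integers using a single symmetric "abs-max" scale, then maps them back. The function names `quantize_absmax` and `dequantize` and the sample weights are hypothetical, chosen only for this example.

```python
def quantize_absmax(weights):
    """Map floats to the int8 range [-127, 127] with one shared scale.

    Illustrative helper (an assumption for this sketch, not a Keras function).
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in quantized]


# Made-up sample weights, for illustration only.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)

# The round-trip error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four (for float32), which is where the memory saving comes from; the price is a small rounding error of at most half the scale per weight.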

Room 313
