Waris Gill
I am a final-year PhD student in the Computer Science department at Virginia Tech. Currently, I am interning at Redis as a Machine Learning Engineer.
Sessions
Large Language Models (LLMs) have opened new frontiers in natural language processing but often come with high inference costs and slow response times in production. In this talk, we’ll show how semantic caching with vector embeddings, particularly for frequently asked questions, can mitigate these issues in a RAG architecture. We’ll also discuss how we used contrastive fine-tuning to improve the embedding model’s ability to accurately identify duplicate questions. Attendees will leave with strategies for reducing infrastructure costs, improving RAG latency, and strengthening the reliability of their LLM-based applications. Basic familiarity with NLP or foundation models is helpful but not required.
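To make the core idea concrete, here is a minimal sketch of FAQ-style semantic caching in front of a RAG pipeline. It is not the implementation discussed in the talk: the embedding model name, the similarity threshold, the in-memory cache, and the `answer_with_rag()` placeholder are all illustrative assumptions, and a production setup would use a vector database rather than Python lists.

```python
# Hedged sketch: semantic cache keyed on question embeddings (cosine similarity).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

# Cache of previously seen question embeddings and their generated answers.
cache_embeddings: list[np.ndarray] = []
cache_answers: list[str] = []

SIMILARITY_THRESHOLD = 0.85  # illustrative cutoff for "same question"


def answer_with_rag(question: str) -> str:
    """Placeholder for the full retrieve-then-generate pipeline."""
    return f"(expensive LLM answer for: {question})"


def answer(question: str) -> str:
    query_vec = model.encode(question, normalize_embeddings=True)

    # Look for a semantically similar cached question; with normalized
    # vectors, cosine similarity reduces to a dot product.
    if cache_embeddings:
        sims = np.stack(cache_embeddings) @ query_vec
        best = int(np.argmax(sims))
        if sims[best] >= SIMILARITY_THRESHOLD:
            return cache_answers[best]  # cache hit: skip retrieval and generation

    # Cache miss: run the full RAG pipeline, then store the result.
    result = answer_with_rag(question)
    cache_embeddings.append(query_vec)
    cache_answers.append(result)
    return result


print(answer("How do I reset my password?"))
print(answer("What's the way to reset my password?"))  # likely served from cache
```

The second piece of the abstract, contrastive fine-tuning for duplicate-question detection, can also be sketched briefly. This version assumes the sentence-transformers `fit` API with an in-batch-negatives contrastive loss; the training pairs are made up for illustration and are not the speaker's data or method.

```python
# Hedged sketch: contrastive fine-tuning of an embedding model on duplicate-question pairs.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed base model

# Each example pairs two phrasings of the same question; other pairs in the
# batch serve as negatives under MultipleNegativesRankingLoss.
train_examples = [
    InputExample(texts=["How do I reset my password?",
                        "What's the way to reset my password?"]),
    InputExample(texts=["How do I cancel my subscription?",
                        "Can I stop my subscription?"]),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```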