Tyler Hutcherson
Tyler leads the Applied AI Engineering group at Redis, working hands-on with customers and partners on real-time GenAI and ML workloads. Previously, he led ML engineering at early-stage eCommerce startups building novel search and recommendation systems. He graduated from the University of Virginia with a BS in Physics and an MS in Data Science. His passions include MLOps system design and working with LLMs to solve real problems. He also enjoys dispelling myths and building bridges in the tech community through knowledge and resource sharing.
Tyler and his wife Cynthia reside in Richmond, VA, where they enjoy hosting friends and family and soaking in the city's history, landmarks, nature, food, and creative scene.
Sessions
Large Language Models (LLMs) have opened new frontiers in natural language processing but often come with high inference costs and slow response times in production. In this talk, we’ll show how semantic caching using vector embeddings—particularly for frequently asked questions—can mitigate these issues in a RAG architecture. We’ll also discuss how we used contrastive fine-tuning methods to boost embedding model performance to accurately identify duplicate questions. Attendees will leave with strategies for reducing infrastructure costs, improving RAG latency, and strengthening the reliability of their LLM-based applications. Basic familiarity with NLP or foundation models is helpful but not required.
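The core idea behind semantic caching can be illustrated with a minimal sketch: embed each incoming question, compare it against previously answered questions by cosine similarity, and serve the cached answer when the similarity clears a threshold. The `embed` function below is a toy placeholder (letter-frequency vectors) standing in for a real embedding model such as the fine-tuned encoder discussed in the talk; the class name, threshold value, and API are illustrative assumptions, not the talk's actual implementation.

```python
import numpy as np

# Toy placeholder for a real sentence-embedding model (assumption for
# illustration only): unit-normalized letter-frequency vectors.
def embed(text: str) -> np.ndarray:
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

class SemanticCache:
    """Cache answers keyed by embedding similarity rather than exact match."""

    def __init__(self, threshold: float = 0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, query: str):
        q = embed(query)
        for emb, answer in self.entries:
            # Dot product of unit vectors == cosine similarity.
            if float(np.dot(q, emb)) >= self.threshold:
                return answer  # cache hit: skip the expensive LLM call
        return None  # cache miss: caller falls through to the RAG pipeline

    def put(self, query: str, answer: str):
        self.entries.append((embed(query), answer))

cache = SemanticCache(threshold=0.95)
cache.put("What is semantic caching?",
          "Reusing answers for semantically similar queries.")
print(cache.get("what is semantic caching"))     # near-duplicate -> cache hit
print(cache.get("How do I reset my password?"))  # unrelated -> None
```

In production this linear scan would be replaced by an approximate nearest-neighbor lookup in a vector database, and the threshold would be tuned against a labeled set of duplicate and non-duplicate question pairs, which is exactly where fine-tuning the embedding model pays off.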