2025-11-08 –, Talk Track 3
Modern LLM applications rely heavily on embeddings and vector databases for retrieval-augmented generation (RAG). But in 2025, researchers and OWASP flagged vector databases as a new attack surface — from embedding inversion (recovering sensitive training text) to poisoned vectors that hijack prompts. This talk demystifies these threats for practitioners and shows how to secure your RAG pipeline with real-world techniques like encrypted stores, anomaly detection, and retrieval validation. Attendees will leave with a practical security checklist for keeping embeddings safe while still unlocking the power of retrieval.
Objective:
This talk explores why embeddings and vector databases — the backbone of modern retrieval-augmented generation — are now on OWASP’s Top 10 LLM Risks for 2025 (LLM08: Vector Database Weaknesses). We’ll highlight the latest research on embedding attacks and show how to translate it into reproducible defenses practitioners can use today.
Outline & Central Thesis:
0–10 min: Why embeddings matter (RAG pipelines explained simply)
10–20 min: Threats in 2025
- Embedding inversion: attackers recover sensitive text from embeddings
- Vector poisoning: malicious entries steer model outputs
- Supply-chain risks: poisoned community datasets or models
20–30 min: Practical defenses
- Encrypted / access-controlled vector stores
- Retrieval filtering and context sanitization
- Anomaly detection for malicious embeddings
- Monitoring & auditing inserts into vector DBs
30–35 min: Mapping to OWASP 2025 LLM08
35–40 min: Takeaways & Q&A
Key Takeaways:
- How embeddings can leak private data or act as a Trojan horse in your RAG pipeline
- The latest OWASP-backed security principles for vector databases
- A step-by-step checklist for securing embeddings in production LLM apps
Background Knowledge Expected:
Attendees should understand LLM basics and the idea of embeddings/RAG. No prior security expertise required — the talk is designed for data scientists, ML engineers, and developers building practical AI applications.
No previous knowledge expected
👋 Hi everyone! I’m Rajesh, a Software Engineer based in Tempe, Arizona 🌵 with 7.5+ years of experience. Currently at Jenius Bank, I’ve been building AI/ML solutions for finance clients 💳🤖 over the past 3 years.
Always excited to chat about AI engineering and where the future of AI is headed 🚀✨. Let’s connect on LinkedIn! 👉 linkedin.com/in/rajeshsk