04-18, 11:30–12:05 (US/Eastern), Auditorium 5
Text-and-image models like CLIP have brought us to a new frontier of visual search. Whether it's searching by circling a section of a photo or powering image generators like DALL-E, the gap between pixels and tokens has never been smaller. This talk discusses how we are improving search and empowering designers with these models at Eezy, a stock art marketplace.
Objective:
Describe where and how we have improved the search experience in our product with open-source multimodal models and libraries, with real-world examples of what we have shipped (and decided not to ship) to production.
Outline:
1. Cover the architecture of our open-source hybrid search stack at Eezy (Elasticsearch, FAISS, PyTorch)
2. Demo the capabilities and limitations of OpenCLIP for retrieval embeddings (a minimal retrieval sketch follows this outline)
3. Highlight meaningful stops on our product roadmap from the last 2 years of deploying features into production.
4. Describe notable missteps and surprises uncovered along the way, so people see it's not all roses in the AI-powered future.
5. Demo BORGES, a novel search framework that lets users search with multiple queries for nuanced navigation of the catalog to find exactly what they need (a generic multi-query sketch follows below)
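
To make items 1–2 concrete, here is a minimal sketch of text-to-image retrieval with OpenCLIP and FAISS. The model tag, the flat inner-product index, and the random stand-in catalog are illustrative assumptions, not our production configuration.

```python
import faiss
import numpy as np
import open_clip
import torch

# Assumption: model/pretrained tags are illustrative, not Eezy's actual setup.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def embed_text(query: str) -> np.ndarray:
    """Encode a text query into a unit-normalized CLIP vector."""
    with torch.no_grad():
        feats = model.encode_text(tokenizer([query]))
        feats = feats / feats.norm(dim=-1, keepdim=True)
    return feats.cpu().numpy().astype("float32")

# Build a small inner-product index over (stand-in) precomputed image embeddings.
dim = 512  # ViT-B-32 embedding width
image_vectors = np.random.rand(1000, dim).astype("float32")  # placeholder catalog
faiss.normalize_L2(image_vectors)
index = faiss.IndexFlatIP(dim)
index.add(image_vectors)

# Top-10 catalog offsets by cosine similarity for a text query.
scores, ids = index.search(embed_text("watercolor mountain landscape"), 10)
print(ids[0])
```

And a rough idea of how multiple queries might steer a single search, as in item 5: this is a generic weighted-blend illustration, not how BORGES is actually implemented. The `multi_query_search` helper and its weights are hypothetical, and the commented usage reuses `embed_text` and `index` from the sketch above.

```python
import numpy as np

def multi_query_search(index, query_vecs, weights=None, k=10):
    """Blend several unit-normalized query vectors into a single FAISS search.

    Hypothetical illustration: a weighted average of query embeddings is one
    simple way to let a user steer results with several prompts at once.
    """
    if weights is None:
        weights = [1.0] * len(query_vecs)
    blended = np.average(query_vecs, axis=0, weights=weights).astype("float32")
    blended /= np.linalg.norm(blended)  # keep inner-product scores comparable
    scores, ids = index.search(blended.reshape(1, -1), k)
    return list(zip(ids[0].tolist(), scores[0].tolist()))

# Example (reusing embed_text and index from the sketch above):
# hits = multi_query_search(
#     index,
#     np.vstack([embed_text("mountain landscape"),
#                embed_text("flat vector illustration")]),
#     weights=[0.7, 0.3],
# )
```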
Audience:
- Anyone curious about real-world results we have extracted from AI
- Search practitioners developing hybrid search applications
- PyTorch and transformers enthusiasts interested in applications in vector space
- This talk is not overly technical and does not require a background in ML, search, or AI. The most math required is some multiplication and division; if you've got that, jump in.
No previous knowledge expected
I dance in vector space.