PyData Eindhoven 2025

CompactifAI: Quantum-Inspired AI Model Compression
2025-12-09, Planck-Bohr

Large AI models have become powerful but increasingly impractical, with escalating training costs, bloated memory requirements, and latency bottlenecks that limit real-world deployment. This talk introduces CompactifAI: a quantum-inspired compression framework that uses tensor networks to surgically shrink large models while preserving their accuracy and capabilities.


We will begin with the story of how Multiverse came to be in 2019 with the mission to solve today’s problems through quantum technologies. Along this path, during a project for Bosch, we discovered that quantum-inspired algorithms running entirely on classical hardware could ultra-compress AI models. In 2024, we realized that these same techniques could be applied to Large Language Models. This insight gave birth to CompactifAI. From there, we’ll walk through CompactifAI and its compression pipeline, highlighting how, by leveraging tensor networks, it outperforms naive pruning or quantization approaches in both precision and control.
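
To give a flavor of the underlying idea, here is a toy sketch of replacing a dense weight matrix with low-rank factors via truncated SVD. This is the simplest form of tensor factorization, not CompactifAI's actual pipeline (which uses tensor networks such as matrix product operators); the matrix size and rank are illustrative.

```python
import numpy as np

# Toy illustration: compress one dense weight matrix by rank truncation.
# Real LLM weight matrices have decaying singular-value spectra, so the
# truncation error there is far smaller than for this random matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))  # stand-in for a dense layer

U, s, Vt = np.linalg.svd(W, full_matrices=False)
rank = 64                        # keep the 64 largest singular values
A = U[:, :rank] * s[:rank]       # (512, 64) factor
B = Vt[:rank, :]                 # (64, 512) factor

original_params = W.size
compressed_params = A.size + B.size
print(f"params: {original_params} -> {compressed_params} "
      f"({compressed_params / original_params:.0%})")

# Relative reconstruction error introduced by the truncation
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"relative error: {err:.3f}")
```

Unlike pruning (zeroing individual weights) or quantization (lowering numeric precision), the factorization rank is a single dial that trades parameter count against reconstruction error layer by layer.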

Attendees will see how this enables new deployment scenarios: running powerful LLMs on edge devices, routing queries between local and cloud models, and even removing or restoring specific behaviors (e.g. safety filters or domain knowledge).
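
The routing scenario can be sketched with a simple dispatcher; the threshold, token heuristic, and model names below are hypothetical placeholders, not part of any CompactifAI API.

```python
# Hypothetical local/cloud router: short prompts go to a compressed
# on-device model, longer ones to a full-size cloud model.
def route_query(prompt: str, max_local_tokens: int = 128) -> str:
    # Crude token estimate: whitespace word count (illustrative only).
    est_tokens = len(prompt.split())
    if est_tokens <= max_local_tokens:
        return "local-compressed-model"
    return "cloud-full-model"

print(route_query("What time is it?"))      # prints "local-compressed-model"
print(route_query("Summarize this: " * 200))  # prints "cloud-full-model"
```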

Additionally, we’ll show how to integrate our CompactifAI-compressed models via API with minimal code changes and provide relevant developer resources for those interested in benefiting from faster, cheaper, more efficient models.
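
As a hedged sketch of what "minimal code changes" looks like: the snippet below assumes an OpenAI-compatible chat-completions endpoint, a common pattern for drop-in model swaps. The base URL and model name are placeholders; consult the CompactifAI developer resources for the real values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # placeholder, not a real endpoint
MODEL = "compressed-model-example"       # placeholder model name

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completions request; switching to a compressed
    model is just a one-line change to MODEL."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer YOUR_API_KEY"},
    )

req = build_chat_request("Hello!")
print(req.full_url)  # the request is built but not sent here
```
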
This talk is aimed at ML engineers, researchers, and technical leads working with LLMs, vision models, or constrained deployment targets who are ready to think beyond just “bigger is better.”


Prior Knowledge Expected: Advanced - Deep Understanding (I am proficient in topic)

Solution Architect @ Multiverse Computing