09-26, 14:55–15:25 (Europe/Amsterdam), Apollo
Operationalizing ML isn’t just about models — it’s about moving and engineering data. At Hopsworks, we built a composable AI pipeline builder (Brewer) based on two principles: Tasks and Data Sources. This lets users define workflows that automatically analyse, clean, create and update feature groups, without glue code or brittle scheduling logic.
In this talk, we’ll show how Brewer drives the automation of feature engineering, enabling reproducible, declarative pipelines that respond to changes in upstream data. We’ll explore how this fits into broader ML workflows, from ingestion to feature materialization, and how it integrates with warehouses, streams, and file-based systems.
We’ll also unpack the real challenges: triggering logic, metadata management, integration with orchestration engines, and maintaining transparency when LLMs are involved. Yes — there will be diagrams and code.
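As a taste of the composition model described above, here is a minimal sketch of a declarative pipeline built from Tasks and Data Sources. All names here (`DataSource`, `Task`, `Pipeline`, `materialize`) are illustrative assumptions for this abstract, not Brewer's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch only — class and method names are not Brewer's real API.

@dataclass
class DataSource:
    """Declares where rows come from (warehouse, stream, files...)."""
    name: str
    fetch: Callable[[], list[dict]]

@dataclass
class Task:
    """A pure transformation step: rows in, rows out."""
    name: str
    run: Callable[[list[dict]], list[dict]]

@dataclass
class Pipeline:
    """Composes a source with an ordered list of tasks — no glue code."""
    source: DataSource
    tasks: list[Task] = field(default_factory=list)

    def materialize(self) -> list[dict]:
        rows = self.source.fetch()
        for task in self.tasks:
            rows = task.run(rows)
        return rows

# Example: clean raw rows, then derive a feature for a feature group.
orders = DataSource("orders", fetch=lambda: [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},  # dirty row to be filtered out
])
drop_nulls = Task(
    "drop_nulls",
    run=lambda rows: [r for r in rows if r["amount"] is not None],
)
add_feature = Task(
    "amount_doubled",
    run=lambda rows: [{**r, "amount_x2": r["amount"] * 2} for r in rows],
)

pipeline = Pipeline(source=orders, tasks=[drop_nulls, add_feature])
print(pipeline.materialize())
```

The point of the sketch is the shape, not the logic: because each step is a declared unit with a name, the runtime can attach triggering logic and metadata to steps rather than to ad-hoc scripts.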
Topics covered include:
- Tasks and Connectors in feature pipelines
- Automating table updates
- Integration with existing data stacks
- Architecture + orchestration lessons
- Scaling reproducibility with metadata
Key takeaways:
- Feature engineering deserves real automation
- LLMs in ML workflows need transparency
- Metadata is a first-class citizen
- Clean abstractions beat MCP