2025-09-25 – Orbit
"How quickly will you be able to get this model into production?" is a common question in analytical projects. Often, this is the first time anyone considers the complexities of deploying models within enterprise systems.
This talk introduces an approach to enhance the success rate of complex AI/ML integration projects while reducing time-to-market. Using examples from global banks J.P. Morgan and ING, we will demonstrate team organisation and engineering patterns to achieve this.
This talk is ideal for data scientists, engineers, and product managers interested in adopting an efficient Model Development Lifecycle (MDLC).
Part 1: Breaking Down Barriers in Your AI/ML Projects (10 mins)
- Context Setting: Discuss common reasons why AI/ML projects fail to progress beyond the pilot phase, supported by relevant research and statistics.
- Engineering Integration Challenges: Highlight late consideration of engineering integration and inference mode as a key reason for failure.
- Waterfall Model Issues: Explain the "waterfall" hand-off between data science and engineering, emphasising the communication overhead it creates.
- Conway's Law: Use Conway's law to illustrate how team structures impact project success rates and time-to-market.
- Ideal Team Structure: Propose an integrated, collaborative, iterative team structure. Show an animation illustrating how concurrent phases can reduce time-to-market by 50%.
- Audience Poll: Gauge the feasibility of the ideal team structure (Yes/No) and discuss reasons for responses.
Part 2: Implementing the Back-to-Front Model Deployment Pattern (20 mins)
- Efficient Deployment Architecture: Introduce the back-to-front model deployment pattern as the main takeaway.
- Real-World Examples: Provide detailed examples from J.P. Morgan (credit risk alerting system) and ING (Customer Due Diligence automation).
- Key Concepts: Explain starting from model inference, defining data contracts, considering failure modes, orchestrating model training, and wiring up ETL/data pipelines (a minimal code sketch follows this list).
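To illustrate "starting from model inference" and "defining data contracts", the following minimal Python sketch deploys an "empty" model behind a fixed request/response contract. This is not code from J.P. Morgan or ING; the field names and the use of FastAPI/Pydantic are assumptions made purely for illustration.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ScoringRequest(BaseModel):
    # Request-side data contract (illustrative fields only)
    customer_id: str
    exposure: float
    days_past_due: int


class ScoringResponse(BaseModel):
    # Response-side data contract
    customer_id: str
    score: float
    model_version: str


def predict(request: ScoringRequest) -> float:
    # "Empty" placeholder model: a constant baseline until the trained
    # model is back-filled behind the same contract
    return 0.0


@app.post("/score", response_model=ScoringResponse)
def score(request: ScoringRequest) -> ScoringResponse:
    # The endpoint, contracts, and plumbing are production-ready from the
    # start; only predict() changes as the model matures
    return ScoringResponse(
        customer_id=request.customer_id,
        score=predict(request),
        model_version="0.0-placeholder",
    )

With this in place, downstream consumers, monitoring, and failure handling can be built and tested against the contract before any model training starts.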
[Suitable for 45 min extended talk] Detailed/hands-on walk-through: Present the open-source project Inference Server (https://github.com/jpmorganchase/inference-server), which supports the back-to-front deployment pattern by deploying an "empty" model and progressively "back-filling" the actual model logic.
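The back-filling step in the walk-through could look roughly like the sketch below, which continues the example above in plain Python rather than showing the actual inference-server API; MODEL_PATH and the joblib artefact are assumptions made for illustration.

import os
from pathlib import Path

import joblib

# Location of the published model artefact; an assumption for this sketch
MODEL_PATH = Path(os.environ.get("MODEL_PATH", "model.joblib"))
_model = None


def predict(request: ScoringRequest) -> float:
    # Back-filled model logic behind the unchanged data contract; falls
    # back to the placeholder baseline if no artefact has been published
    # yet (one of the failure modes considered up front)
    global _model
    if _model is None and MODEL_PATH.exists():
        _model = joblib.load(MODEL_PATH)
    if _model is None:
        return 0.0  # same behaviour as the original "empty" model
    features = [[request.exposure, request.days_past_due]]
    return float(_model.predict(features)[0])

Because the deployed contract never changes, swapping the placeholder for the trained model is a local change, which is what makes the back-to-front pattern iterative rather than a big-bang integration.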