When a Model Upgrade Breaks Production
A Gemini 2.5 Pro upgrade caused a regression in our evidence extraction pipeline. Context adherence dropped. Structured outputs degraded. The benchmarks said it was better. Our production data said otherwise.
Practical observations, lessons learned, and technical notes from the field — captured as they happen.
Discover how a major UK bank's screening platform leveraged AI governance in financial services to reduce commercial customer onboarding from 10 days to under 12 hours, with vendor-agnostic architecture and zero unplanned incidents.
Learn to build governed AI delivery pipelines with an AI engineering methodology. Bugni Labs' proven approach takes financial services projects from concept to production in 4 months, with zero incidents and 3-5x velocity.