Enterprise AI Adoption: Why Finance Lacks Data Strategy
Enterprise AI in financial services is being sold as a model problem. Almost every time we look closely, it is a data problem.
The CIOs we work with are evaluating models, vendors, agent frameworks, and copilots. They are not, with equal seriousness, evaluating where their data actually lives, what shape it is in, and whether the AI they are buying can do anything useful with it.
The result is a long catalogue of expensive pilots that never reach production, and a smaller catalogue of pilots that reach production and quietly underperform.
We have started to think of data strategy as the part of the AI conversation banks systematically refuse to have.
The pilots that fail look the same
When a financial services AI pilot fails, the post-mortem almost always blames the model, the vendor, or the regulator. Occasionally the team. Almost never the data.
In the engagements we look at, the failure pattern is uniform. The data the model needs is not in one place. The data that is in one place is stale. The data that is fresh is in a format the model cannot ingest. And the audit trail required to defend the model's outputs to a regulator does not exist.
None of this is a model problem. It is the architecture that sits underneath the model, and most banks have not invested in that architecture for fifteen years.
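The four failure conditions above can be checked before a pilot starts. What follows is a minimal sketch in Python, not a method taken from any engagement: the `SourceSnapshot` type, its fields, and the `readiness_issues` function are hypothetical names introduced for illustration, and the dataset names, formats, and thresholds are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SourceSnapshot:
    """Hypothetical description of one dataset a model depends on."""
    name: str
    system: str                # system of record that holds the data
    last_refreshed: datetime   # when the data was last updated
    data_format: str           # e.g. "parquet", "csv", "pdf_scan"
    has_audit_trail: bool      # is change and access history recorded?


def readiness_issues(sources, max_age, ingestible_formats, now):
    """Return a list of readable problems; an empty list means no issue found."""
    issues = []
    systems = sorted({s.system for s in sources})
    if len(systems) > 1:
        issues.append(f"data is spread across {len(systems)} systems: {systems}")
    for s in sources:
        if now - s.last_refreshed > max_age:
            issues.append(f"{s.name}: stale")
        if s.data_format not in ingestible_formats:
            issues.append(f"{s.name}: format {s.data_format!r} is not ingestible")
        if not s.has_audit_trail:
            issues.append(f"{s.name}: no audit trail")
    return issues


# Illustrative inputs only; none of these datasets come from the article.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
sources = [
    SourceSnapshot("kyc_profiles", "onboarding", now - timedelta(hours=1), "parquet", True),
    SourceSnapshot("loan_book", "lending", now - timedelta(days=30), "pdf_scan", False),
]
for issue in readiness_issues(sources, timedelta(days=1), {"parquet", "csv"}, now):
    print(issue)
```

Every check here inspects the data estate and none inspects the model, which is the reason a post-mortem that only examines the model tends to miss all four.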
The semantic gap is the real bottleneck
The conversation banks are not having is about what the model needs the data to mean.
A credit-decisioning model needs to know what "customer" means consistently across deposits, lending, and onboarding. A fraud-detection model needs to know what counts as a related transaction across counterparties that have been onboarded under different rules over twenty years.
An agentic onboarding system needs to know which fields the regulator considers material to the KYC decision.
In every case, the model is asking for a semantic layer, not a data lake. Semantics is the work of deciding what each field means in business terms, and committing to that meaning across systems.
Most banks have a data lake. Very few have a semantic layer.
This is not a tooling gap. It is a discipline gap.
The teams that solve it sit down and write the definitions. The teams that do not solve it buy another tool.
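To make "write the definitions" concrete, here is a minimal sketch of what one entry in a semantic layer could look like. It assumes Python, and every name in it is hypothetical: the `FieldDefinition` type, the `to_canonical` helper, and the system and column names (`acct_holder_no`, `borrower_ref`, `applicant_id`) are illustrative, not drawn from any real bank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """One agreed business meaning, plus where it lives in each system."""
    name: str
    meaning: str
    source_fields: dict   # system name -> physical column holding the concept


# Hypothetical definition: the systems and column names are illustrative.
CUSTOMER_ID = FieldDefinition(
    name="customer_id",
    meaning="The legal entity or natural person the bank holds a relationship with",
    source_fields={
        "deposits": "acct_holder_no",
        "lending": "borrower_ref",
        "onboarding": "applicant_id",
    },
)


def to_canonical(system, record, definitions):
    """Rename a source record's columns to their canonical names.

    Raises KeyError when a system has no agreed mapping for a definition,
    so a missing definition fails loudly instead of being guessed.
    """
    canonical = {}
    for definition in definitions:
        column = definition.source_fields[system]
        canonical[definition.name] = record[column]
    return canonical


print(to_canonical("lending", {"borrower_ref": "C-1001"}, [CUSTOMER_ID]))
```

The code is trivial on purpose. The hard part is the `meaning` string and the agreement behind it, which is the discipline gap rather than the tooling gap.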
Governance is downstream of the same problem
The compliance arguments banks raise about AI are real. The EU AI Act becomes enforceable for high-risk systems in August 2026.
DORA and NIS 2 are already operative. None of these regimes can be satisfied with a model card and a confidence score.
What they require is a traceable lineage from input data to model decision to human reviewer, evidenced in a form a regulator can audit. That lineage is impossible to construct on top of a broken data foundation, because the data has no canonical identity to trace.
This is why we have stopped treating governance as a separate workstream from data strategy. They are the same workstream, viewed from different sides of the same regulator's letter.
A bank that fixes its data foundation makes governance tractable. A bank that does not will write thousand-page policy documents describing oversight it cannot actually perform.
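The lineage described above, from input data to model decision to human reviewer, can be pictured as a single record per decision. This is a minimal sketch assuming Python and its standard library only. The `LineageRecord` type, its fields, and the `is_auditable` rule are hypothetical illustrations, not a format any regulator prescribes.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def fingerprint(payload: dict) -> str:
    """Stable SHA-256 hash of the input data the model actually saw."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical trace of one decision: input -> model -> reviewer."""
    entity_id: str           # canonical identity the decision is about
    input_hash: str          # fingerprint of the inputs
    source_systems: tuple    # where those inputs came from
    model_version: str
    decision: str
    reviewer: Optional[str] = None   # human who reviewed the decision, if any

    def is_auditable(self) -> bool:
        # Without a canonical entity, known sources, and a named reviewer,
        # there is nothing for an auditor to trace.
        return bool(self.entity_id and self.source_systems and self.reviewer)


def record_decision(entity_id, inputs, source_systems, model_version,
                    decision, reviewer=None):
    """Build a lineage record for one model decision."""
    return LineageRecord(
        entity_id=entity_id,
        input_hash=fingerprint(inputs),
        source_systems=tuple(source_systems),
        model_version=model_version,
        decision=decision,
        reviewer=reviewer,
    )


rec = record_decision("C-1001", {"country": "GB", "turnover": 250000},
                      ["onboarding", "lending"], "screening-v3", "approve")
print(rec.is_auditable())
```

The `entity_id` field is where the dependency on the data foundation shows up. If the same customer carries different identifiers in different systems, the record cannot be filled in honestly.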
The work that actually moves the number
The banks we have watched succeed in production AI did three things in order.
They picked a single business outcome with enough specificity that a model could be evaluated against it. Not "improve customer experience". Something like "reduce commercial onboarding time below twelve hours without breaking screening accuracy". The outcome forces the data conversation.
They built the data layer for that outcome before they bought a model. Domain-aligned events, real-time streams, canonical entity definitions, lineage capture.
The platform looks unglamorous next to the demo. It is what makes the demo viable.
Only then did they buy or build the model. In one engagement with a major UK bank, that ordering produced a commercial customer screening platform that took onboarding from ten days to under twelve hours, with audit trails the regulator could read directly.
The model was not the breakthrough. The data was.
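A specific outcome, the first of the three steps above, is one that can be written as a pass or fail check. The sketch below assumes Python and makes two choices the article does not specify: it measures onboarding time by the median case, and it reads "without breaking screening accuracy" as accuracy staying at or above a baseline supplied by the caller. The `OnboardingCase` type and the `meets_outcome` function are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class OnboardingCase:
    """Hypothetical record of one completed commercial onboarding."""
    hours_to_onboard: float
    screening_correct: bool   # did screening match the reviewed outcome?


def meets_outcome(cases, baseline_accuracy, max_hours=12.0):
    """True when median onboarding time is under max_hours and screening
    accuracy has not dropped below the baseline."""
    if not cases:
        raise ValueError("no cases to evaluate")
    typical_hours = median(c.hours_to_onboard for c in cases)
    accuracy = sum(c.screening_correct for c in cases) / len(cases)
    return typical_hours < max_hours and accuracy >= baseline_accuracy


# Illustrative data only.
cases = [
    OnboardingCase(10.0, True),
    OnboardingCase(11.0, True),
    OnboardingCase(30.0, True),
]
print(meets_outcome(cases, baseline_accuracy=0.95))
```

Writing the check immediately raises the data questions: where `hours_to_onboard` is measured from, and who decides whether a screening result was correct.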
Why this keeps not happening
If the answer is so simple, the question is why most banks do not do it.
Two reasons, and they reinforce each other.
The first is that data work does not photograph well. A CIO can show a board a model demo.
The CIO cannot show a board the entity-resolution rules that made the demo work. The incentives reward the part the board can see.
The second is that the vendor market sells in the same direction. Every AI vendor has a slide that says "works with your existing data".
None of them have a slide that says "your existing data has to be reshaped before this works". The first slide gets the meeting. The second slide tells the truth.
The CIOs who close the gap usually do so after a public failure has made the truth unavoidable. The ones who close it before that point are unusual.
They tend to be the ones whose engineering teams have been allowed to publish honest internal post-mortems.
The compass calibration is upstream of all of this. Data strategy is not the boring part of enterprise AI adoption. It is the only part that decides which firms still have an AI strategy in three years.
Bugni Labs