01 The measurement gap
When AI teams report on programme performance, they typically use adoption metrics: number of active users, number of queries processed, utilisation rate against licence allocation, net promoter scores from user surveys. These metrics tell you whether the tool is being used. They do not tell you whether the business is better off for having deployed it.
A CFO who receives a report showing that 67% of Copilot licences are being used has no basis for determining whether those licences represent good value. A board that is told AI has saved the legal team "significant time" cannot assess whether that time saving justifies the investment or what the business impact has been.
The measurement gap exists because AI ROI measurement is genuinely difficult. The benefits are diffuse, attributing them requires a baseline comparison that is hard to establish after deployment, and the indirect benefits (better decisions, lower error rates, faster customer response) are harder to quantify than the direct costs. These difficulties do not make measurement impossible. They mean it has to be designed deliberately, before the tool is rolled out.
02 Category one: productivity measurement
The most direct AI benefit in most deployments is productivity: the same work gets done faster, or more work gets done with the same headcount. Measuring this requires establishing a pre-deployment baseline.
For Microsoft Copilot deployments, this means measuring, before rollout, the time taken to complete representative tasks: drafting a board report, producing a client proposal, summarising a set of meeting notes, responding to a query from a junior colleague. After deployment, measure the same tasks with the same cohort. The difference is your productivity benefit, which can be converted to headcount-equivalent cost savings using average fully-loaded staff costs.
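The before-and-after comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `productivity_improvement`, the task names, and every timing below are hypothetical, invented purely to show the shape of the calculation.

```python
from statistics import mean


def productivity_improvement(baseline_minutes, post_minutes):
    """Return the fractional time saved per task.

    Both arguments map a task name to a list of timings (in minutes)
    recorded for the same cohort, before and after deployment.
    """
    savings = {}
    for task, before in baseline_minutes.items():
        after = post_minutes[task]
        savings[task] = (mean(before) - mean(after)) / mean(before)
    return savings


# Hypothetical timings, for illustration only.
baseline = {
    "board report": [180, 200, 220],
    "client proposal": [120, 150, 135],
}
post = {
    "board report": [150, 140, 160],
    "client proposal": [100, 110, 105],
}

per_task = productivity_improvement(baseline, post)
for task, saving in per_task.items():
    print(f"{task}: {saving:.0%} time saved")
```

With real data, the cohort and task definitions should be fixed before rollout so that the post-deployment timings are measured on a like-for-like basis.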
Microsoft's own Copilot measurement framework provides a starting template for this approach. Organisations that have implemented it are consistently finding productivity improvements in the 20-35% range for knowledge work tasks. If that improvement held across the whole working week, a team of 50 with a £60,000 average fully-loaded cost (a £3 million staff cost base) would gain £600,000 to £1,050,000 in annual value, against a Copilot licence cost of roughly £60,000. That is an ROI calculation a CFO can understand.
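The arithmetic behind those figures can be written out as a short Python sketch. The function names are hypothetical, and the calculation assumes the improvement applies to all of the team's working time, which is a simplification that a real business case should test.

```python
def annual_productivity_value(team_size, fully_loaded_cost, improvement):
    """Annual value of a productivity gain, assuming it applies to all working time."""
    return team_size * fully_loaded_cost * improvement


def return_multiple(annual_value, annual_cost):
    """Annual value expressed as a multiple of annual cost."""
    return annual_value / annual_cost


# Inputs taken from the example in the text.
team_size = 50
fully_loaded_cost = 60_000   # GBP per person per year
licence_cost = 60_000        # GBP per year, as quoted in the text

low = annual_productivity_value(team_size, fully_loaded_cost, 0.20)
high = annual_productivity_value(team_size, fully_loaded_cost, 0.35)

print(f"Annual value: £{low:,.0f} to £{high:,.0f}")
print(
    f"Return multiple: {return_multiple(low, licence_cost):.1f}x "
    f"to {return_multiple(high, licence_cost):.1f}x"
)
```

On these inputs the value range is £600,000 to £1,050,000, or 10 to 17.5 times the licence cost. If only part of the working week consists of tasks where the gain applies, the `improvement` argument should be scaled down accordingly.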
03 Category two: quality and error rate measurement
In many AI deployments, the business case is not primarily about speed but about accuracy. An AI that reduces the error rate in a financial reconciliation process, catches compliance issues in contract review, or identifies anomalies in expense reporting creates value through risk reduction rather than productivity.
Measuring this requires establishing an error rate baseline before deployment and tracking the same metric after. In financial services, error rates in process outputs are often already measured as part of operational risk management, which makes this category of AI ROI relatively tractable. In other sectors, establishing the baseline may require deliberate measurement design.
The monetary value of quality improvement comes from the cost of errors: the time to investigate and correct them, any financial penalties or write-offs, customer service costs arising from errors, and reputational cost where significant. Even a partial estimate of these costs, set against the AI investment, typically produces a compelling ROI calculation.
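As a rough sketch of how the cost-of-errors calculation might be structured, the Python below multiplies the errors avoided by a per-error cost built from the components listed above. The class and field names are hypothetical, all input values are invented for illustration, and reputational cost is deliberately left out because it is rarely available as a per-error figure.

```python
from dataclasses import dataclass


@dataclass
class ErrorCostInputs:
    annual_volume: int           # items processed per year
    baseline_error_rate: float   # measured before deployment
    post_error_rate: float       # measured after deployment
    correction_cost: float       # GBP of staff time to investigate and fix one error
    penalty_cost: float          # GBP average penalty or write-off per error
    service_cost: float          # GBP average customer service cost per error


def annual_error_savings(inputs):
    """Annual cost avoided through the reduction in error rate."""
    errors_avoided = inputs.annual_volume * (
        inputs.baseline_error_rate - inputs.post_error_rate
    )
    cost_per_error = (
        inputs.correction_cost + inputs.penalty_cost + inputs.service_cost
    )
    return errors_avoided * cost_per_error


# Hypothetical inputs, for illustration only.
inputs = ErrorCostInputs(
    annual_volume=100_000,
    baseline_error_rate=0.02,
    post_error_rate=0.012,
    correction_cost=40.0,
    penalty_cost=15.0,
    service_cost=10.0,
)
savings = annual_error_savings(inputs)
print(f"Estimated annual saving: £{savings:,.0f}")
```

Keeping each cost component as a separate field makes it easier for a finance team to challenge or replace individual inputs without reworking the whole estimate.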
04 Category three: revenue and growth measurement
The hardest AI ROI to measure, but often the most significant, is the revenue impact. AI that improves sales team productivity, personalises customer experience, accelerates product development, or enables new service offerings creates revenue that would not otherwise have been captured.
The challenge is attribution: how much of the revenue growth that occurred after AI deployment would have occurred anyway? Answering that requires a controlled comparison: A/B testing (an AI-enabled sales team versus a control group), cohort comparison (customers served by the AI-enhanced process versus those who were not), or time-series analysis with careful controls.
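For the A/B option, one common way to check whether a difference in win rates is more than noise is a two-proportion z-test. The sketch below uses only the Python standard library. The function name and the deal counts are hypothetical, and the test assumes deals are independent and were allocated to the two groups without bias, which a real sales experiment would need to confirm.

```python
from math import erfc, sqrt


def win_rate_uplift(wins_ai, deals_ai, wins_ctrl, deals_ctrl):
    """Compare win rates between an AI-enabled group and a control group.

    Uses a pooled two-proportion z-test with a two-sided p-value.
    """
    p_ai = wins_ai / deals_ai
    p_ctrl = wins_ctrl / deals_ctrl
    pooled = (wins_ai + wins_ctrl) / (deals_ai + deals_ctrl)
    std_error = sqrt(pooled * (1 - pooled) * (1 / deals_ai + 1 / deals_ctrl))
    z = (p_ai - p_ctrl) / std_error
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail probability of a normal
    return {
        "absolute_uplift": p_ai - p_ctrl,
        "relative_uplift": (p_ai - p_ctrl) / p_ctrl,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical counts: 124 wins from 400 deals (AI) vs 100 from 400 (control).
result = win_rate_uplift(124, 400, 100, 400)
print(f"Relative uplift: {result['relative_uplift']:.0%}")
print(f"z = {result['z']:.2f}, p = {result['p_value']:.3f}")
```

In this invented example the relative uplift is 24%, yet the p-value is about 0.06, just short of the conventional 5% threshold. That illustrates why sample size matters: an apparently large uplift on a few hundred deals may not yet be distinguishable from chance.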
For organisations that have done this work, the results are often striking. Salesforce research on AI-assisted selling found improvements of 26% in pipeline generation and 24% in deal win rates in controlled deployments. For a business with £10 million in annual sales, a 24% improvement in win rates represents a material revenue impact that dwarfs the AI investment cost.
05 Building a board-ready AI ROI framework
An AI ROI framework that will satisfy a finance committee and audit team has three characteristics.
It measures the baseline before deployment, not after, because post-deployment baselines are subject to attribution bias and cannot be independently verified. It separates correlation from causation by using control groups or before-and-after comparisons with explicit assumptions about confounding factors. And it expresses benefits in financial terms using verifiable inputs (staff cost rates, error cost estimates, revenue data from existing systems) rather than survey-derived satisfaction scores.
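One way to make these three characteristics operational is to record each claimed benefit in a structured form and check it before it reaches the board pack. The Python below is a hypothetical sketch: the `BenefitClaim` fields, the accepted method names, and the wording of the checks are assumptions made for illustration, not an established standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical labels for the comparison methods named in the text.
ACCEPTED_METHODS = {"control group", "before-and-after"}


@dataclass
class BenefitClaim:
    name: str
    baseline_measured_on: date
    deployed_on: date
    comparison_method: str
    confounders_documented: bool
    annual_value_gbp: Optional[float]  # None if only survey evidence exists


def board_ready_issues(claim):
    """Return the reasons a claim falls short of the three characteristics."""
    issues = []
    if claim.baseline_measured_on >= claim.deployed_on:
        issues.append("baseline was not measured before deployment")
    if (
        claim.comparison_method not in ACCEPTED_METHODS
        or not claim.confounders_documented
    ):
        issues.append("no controlled comparison with documented confounders")
    if claim.annual_value_gbp is None:
        issues.append("benefit is not expressed in financial terms")
    return issues


# Hypothetical claim, for illustration only.
claim = BenefitClaim(
    name="Contract review error reduction",
    baseline_measured_on=date(2024, 1, 15),
    deployed_on=date(2024, 3, 1),
    comparison_method="before-and-after",
    confounders_documented=True,
    annual_value_gbp=52_000.0,
)
print(board_ready_issues(claim) or "No issues found")
```

A claim that returns an empty list still needs human review. The check only confirms that the evidence has the right shape, not that the underlying numbers are sound.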
Building this framework takes more upfront work than deploying the AI tool and reporting on adoption. But it produces the evidence base that allows AI investment to be governed, scaled, and justified with the same rigour that applies to any other material business investment.
Key Takeaways
- 1. Adoption metrics (utilisation rates, user satisfaction) do not constitute ROI evidence and will not satisfy a CFO or audit committee.
- 2. Productivity ROI requires a pre-deployment baseline, a post-deployment comparison, and conversion to financial value using staff cost rates.
- 3. Quality and error rate improvement creates value through risk reduction that can be quantified against the cost of errors.
- 4. Revenue attribution from AI investment requires controlled comparison, not simply noting that revenue grew after deployment.
- 5. A board-ready AI ROI framework establishes baselines before deployment, separates causation from correlation, and expresses all benefits in financial terms.
References & Further Reading
- [1] Microsoft Copilot ROI Study: Forrester Total Economic Impact. Microsoft / Forrester
- [2] The Economic Potential of Generative AI. McKinsey & Company
- [3] Salesforce State of Sales Report 2024. Salesforce