Every Copilot conversation we have with an SMB eventually arrives at the same question, usually from the finance director (FD): 'How do we know it's worth it?' The wrong answer is to wave around a Microsoft-published statistic about hours saved per user. The right answer is to measure your own business, in a way that survives a sceptical finance review.
Here's the framework we use.
Don't start with ROI - start with the claim
Before you measure anything, write down what you actually expect Copilot to do for your business. Not 'improve productivity'. Something specific, like: 'reduce time spent on routine email and meeting summarisation by at least 25%, freeing up roughly half a day per user per week for higher-value work.' That sentence tells you what to measure and what to compare against.
If you can't write the sentence, that's the real problem - and no amount of dashboards will fix it.
Measure four things, not one
Hours saved is the most common metric and the easiest to get wrong. On its own it tells you almost nothing. We measure four things in combination, and the picture only makes sense when you look at all four together.
- Usage: how many of your licensed users are actually using Copilot, on how many days, in which apps. Pulled from the Microsoft 365 admin centre. This is your leading indicator - low usage explains away every other metric.
- Time saved: self-reported hours per user per week, on the specific tasks you targeted. Survey the same people every month with the same questions. The numbers are imperfect, but consistent imperfection is fine for tracking a trend.
- Output quality and volume: did the marketing team ship more campaigns, did the sales team get more proposals out, did finance close the month faster. These are the real business metrics and they're often already being tracked for other reasons.
- Perceived value: 'if we took Copilot away tomorrow, how disruptive would that be?' on a 1-5 scale. This is the single most predictive question we ask. People are honest about it, and a score below 3 by month three is a serious warning sign.
Convert hours saved into something the FD recognises
FDs don't get excited about hours; they get excited about money, capacity, and risk. Translate your hours saved into one of those.
Money: hours saved per user per week, multiplied by fully-loaded cost per hour for that role, multiplied by 46 working weeks, minus the licence cost. For a typical knowledge worker at £30 fully-loaded hourly cost, saving 3 hours a week is worth roughly £4,100 a year per user before the licence. The licence is around £288 a year, so the net is about £3,850 per user, a return of roughly 14 times the licence cost.
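The money translation is simple enough to put in a few lines of Python. This is a minimal sketch using the figures from the paragraph above; the function and parameter names are our own illustration, not part of any Microsoft tooling, and the 46-week year is the same assumption used in the text.

```python
def annual_value_per_user(hours_saved_per_week: float,
                          hourly_cost: float,
                          licence_cost_per_year: float,
                          working_weeks: int = 46) -> dict:
    """Gross and net annual value of saved time for one user, in pounds."""
    # Gross value: hours saved x fully-loaded hourly cost x working weeks.
    gross = hours_saved_per_week * hourly_cost * working_weeks
    # Net value: gross minus the annual licence cost.
    net = gross - licence_cost_per_year
    return {
        "gross": gross,
        "net": net,
        "multiple_of_licence": gross / licence_cost_per_year,
    }


# Worked example from the text: 3 hours a week, £30 an hour, £288 licence.
result = annual_value_per_user(hours_saved_per_week=3,
                               hourly_cost=30,
                               licence_cost_per_year=288)
print(result["gross"])  # 4140
print(result["net"])    # 3852
```

Swap in your own hourly cost per role rather than a single blended figure if your team mixes junior and senior staff; the arithmetic is the same, only the inputs change.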
Capacity: 3 hours per user per week across a 30-person team is 90 hours a week, or roughly 2.3 FTE-equivalents of reclaimed time. That's the language a growing business actually thinks in - 'we got the equivalent of two more heads without hiring them'.
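The capacity figure depends on how long you assume a full-time week is, which the paragraph above does not state. The sketch below makes that explicit; the 37.5 and 40 hour weeks are our assumptions for illustration, and the function name is hypothetical.

```python
def fte_equivalents(hours_saved_per_user_per_week: float,
                    team_size: int,
                    hours_per_fte_week: float) -> float:
    """Reclaimed time across a team, expressed as full-time equivalents."""
    total_hours_per_week = hours_saved_per_user_per_week * team_size
    return total_hours_per_week / hours_per_fte_week


# 3 hours per user across 30 people, under two assumed contract lengths.
for hours_per_fte_week in (37.5, 40.0):
    fte = fte_equivalents(3, 30, hours_per_fte_week)
    print(hours_per_fte_week, round(fte, 2))
```

Both assumptions land between 2.25 and 2.4 FTE, which is where the 'roughly 2.3' above comes from. Use your own contracted hours when you present this to the FD.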
Risk: harder to quantify but real. Faster proposal turnaround means more deals worked. Faster month-end means earlier visibility on commercial issues. Better-written customer comms means fewer complaints. Name the risks you reduced, even without a number.
Don't double-count
The honest version of this maths comes with caveats. Self-reported hours saved are systematically optimistic; discount them by 30 to 40%. Saved time only converts to value if people actually fill it with useful work; some of it gets absorbed into longer breaks and tidier inboxes, which is fine but isn't the headline number. Some Copilot wins displace existing tools you can now stop paying for - count that saving once, in the licence-cost line, not twice.
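To see what the caveats do to the headline number, here is a rough sketch that applies the 30 to 40% discount to self-reported hours and counts any displaced tool saving once, against the licence line. The function names and the displaced-tool parameter are hypothetical, and the £30 hourly cost, £288 licence and 46 weeks are carried over from the earlier example.

```python
def discounted_hours(reported_hours: float, discount: float) -> float:
    """Scale self-reported hours down by a discount between 0 and 1."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    return reported_hours * (1.0 - discount)


def net_annual_value(reported_hours: float,
                     discount: float,
                     hourly_cost: float,
                     licence_cost: float,
                     displaced_tool_cost: float = 0.0,
                     working_weeks: int = 46) -> float:
    """Net annual value per user after discounting self-reported hours."""
    gross = discounted_hours(reported_hours, discount) * hourly_cost * working_weeks
    # A displaced tool reduces the effective licence cost. It is counted
    # here once and nowhere else, so it cannot be double-counted.
    effective_licence = licence_cost - displaced_tool_cost
    return gross - effective_licence


# 3 reported hours a week, discounted by 30% and by 40%.
for discount in (0.30, 0.40):
    print(discount, round(net_annual_value(3, discount, 30, 288)))
```

With these inputs the net value per user falls to about £2,610 at a 30% discount and about £2,196 at 40%. That is still comfortably positive, which is the point: the case survives the haircut.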
Compare against the right counterfactual
ROI calculations love to compare 'with Copilot' to a fictional alternative where everyone just does it the slow way. That's not the real comparison. The real comparison is 'with Copilot' versus 'team finds workarounds with the consumer ChatGPT they're already using on their phones'. The shadow-IT counterfactual usually delivers about 30% of the benefit - so the marginal value of a managed Copilot rollout is the gap between the two, not the gap to zero.
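As a rough illustration of the counterfactual adjustment, the sketch below treats the shadow-IT alternative as delivering a fixed fraction of the managed rollout's value and reports the gap. Applying the 30% figure directly to the gross annual value is our simplifying assumption, and the function name is hypothetical.

```python
def marginal_value(managed_value: float,
                   shadow_it_fraction: float = 0.30) -> float:
    """Value of a managed rollout over and above the shadow-IT alternative."""
    # What the team would have achieved anyway with unmanaged consumer tools.
    shadow_value = managed_value * shadow_it_fraction
    # The claimable benefit is the gap between the two, not the gap to zero.
    return managed_value - shadow_value


# Using the £4,140 gross annual value per user from the earlier example.
print(round(marginal_value(4140)))  # 2898
```

If you think your team's workarounds are better or worse than 30%, change the fraction and show the FD the range rather than a single number.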
This matters because it stops you over-claiming and it stops you under-claiming. Both make you look bad in front of the FD.
Run the measurement on a cadence
Once a quarter is enough. Pull the usage data, run the survey, capture the perceived-value score, update the time-to-money translation, and write a one-page summary. The summary should fit on a single side of A4 and answer three questions: are people using it, is it changing how work gets done, and is it still worth the money. If your quarterly summary takes more than a page, you're measuring too much.
When to stop a Copilot rollout
Measuring isn't only about justifying renewal - it's also about being honest when something isn't working. Stop or scale back if any of these is true: usage is below 40% of licensed users after three months and not improving; perceived value sits at 2 or below; or you can't name three specific use cases per team without prompting. None of those mean Copilot is bad. They mean Copilot is wrong for this team, this moment, or this set of use cases - and the licence money would be better spent elsewhere.
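The stop criteria can be written down as a simple checklist so the quarterly review applies them the same way each time. This is a sketch only: the function and field names are hypothetical, and treating any single triggered criterion as a reason to review is our reading of the thresholds above.

```python
from typing import List


def stop_signals(usage_rate: float,
                 usage_improving: bool,
                 perceived_value: float,
                 use_cases_named: int) -> List[str]:
    """Return the stop-or-scale-back criteria a team currently triggers.

    Intended to be run per team once the rollout is at least three months old.
    """
    signals = []
    # Fewer than 40% of licensed users active, with no upward trend.
    if usage_rate < 0.40 and not usage_improving:
        signals.append("usage below 40% and not improving")
    # Average answer to the 'take it away tomorrow' question, on a 1-5 scale.
    if perceived_value <= 2:
        signals.append("perceived value at 2 or below")
    # The team cannot name three specific use cases without prompting.
    if use_cases_named < 3:
        signals.append("fewer than three named use cases")
    return signals


# A hypothetical team: 35% usage and flat, perceived value 2.5, two use cases.
print(stop_signals(usage_rate=0.35, usage_improving=False,
                   perceived_value=2.5, use_cases_named=2))
```

An empty list means none of the criteria are triggered. A non-empty list is a prompt for a conversation with that team, not an automatic cancellation.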
The honest summary
Copilot ROI in an SMB is almost always positive when measured properly, and almost always overstated when measured lazily. Write the claim first. Measure four things, not one. Translate hours into money, capacity or risk. Discount self-reported numbers. Compare to the realistic alternative, not the fictional one. Do it quarterly, fit it on a page, and be willing to call it off if the numbers don't hold up. That's the version of ROI measurement that survives contact with a real FD - and the version that earns the trust to do bigger AI projects next.