AI marketing campaigns are clearly meant to drive FOMO, and they paint a picture of ever-increasing adoption. But how do you measure returns from that adoption? As usual, garbage in, garbage out, and Thomson Reuters’s annual summary of the development and implementation of AI in law firms seems drawn straight from the wastepaper basket.

The 82 Percent

Thomson Reuters publishes an annual Future of Professionals report. Fielded between February and March 2025, the 2025 edition went to 2,275 global professionals across legal, tax, accounting, risk, and compliance, with about 60% of respondents from legal. Steve Hasker, the company’s chief executive, wrote the foreword. His framing was direct: Thomson Reuters was “no longer predicting AI’s impact — we’re quantifying it.” Eight months later, in October and November 2025, the same institute fielded a second report, AI in Professional Services, on the same population and on many of the same questions. That report, published in early 2026, discloses in its executive summary that 18% of organisations collect AI ROI metrics, 42% do not, and 40% do not know whether they do. The latter two figures sum to 82%, which is the share of the market telling its own vendor-research publisher that the business case for legal AI is running on something other than evidence.

That this number sits inside a report published by the company whose commercial story depends on it should be at least concerning to discerning observers of the legal AI market. Thomson Reuters crossed one million CoCounsel users in February, and the share of its contracts tagged GenAI-enabled moved from 15% at the end of 2024 to 28% at the end of 2025. Outlooks are rising on the assumption that adoption implies integration, and that integration implies business value. But what the 2026 report discloses, on the third page of its executive summary, is that most of the industry has no way to tell you whether that is true.

The drift in the AI story matters because a commercial narrative priced on adoption requires integration and measurement behind it, and the 2026 report reveals that, without that backing, neither its own numbers nor the prognostications from 2025 mean much. A seat count tells you that a license was purchased. Whether the workflow actually moved is a separate question, and whether anyone is measuring the return on that movement is a third question the 2026 report says most of the industry cannot answer.

The 2025 Report: What Did It Quantify?

The 2025 report said that 53% of respondents stated that their organisation was “already experiencing at least one benefit” from AI. On top of that belief number the report built a forecasting claim: legal professionals expected AI to free up nearly 240 hours per year within the next year, which Thomson Reuters calculated as an average of $19,000 per professional and a $32 billion combined impact across the US legal and tax-and-accounting sectors. Making sense of those findings was the job of what the report called the AI Success Pyramid — a four-layer model in which strategy, leadership, operations, and individual engagement stacked into outcomes. Organisations with a visible AI strategy, the report said, were twice as likely to experience revenue growth and 3.5 times more likely to realize critical benefits than those without one.

The 2025 report’s accuracy language is where the framing already started to wobble, and in hindsight it should have raised serious questions about the pace and level of adoption. 91% of professionals believed AI should be held to a higher standard of accuracy than humans, the report said, and 41% wanted 100% accuracy before using AI without human review. Those two numbers describe a profession revealing that the bar it will accept for unsupervised AI use is unmet by any currently shipping system. The 2025 report did not name the contradiction with its own benefit-quantification framing. It treated the gap as a challenge the profession would work through.

What the 2025 report did not yet surface was how any of these numbers would be measured at the organisational level. Benefit was inferred from respondent belief. Revenue impact was estimated from time saved. And strategy correlation was self-reported by respondents about their own organisations. ROI as a distinct instrument, with a methodology behind it, had yet to become the report’s focus. The 2026 report is where Thomson Reuters tried to close that gap.

The 2026 Measurement Problem

Between October and November of 2025, TR sent a survey to 1,514 professionals across legal, tax and accounting, corporate risk and fraud, and government, with respondents screened for AI familiarity. The resulting 2026 AI in Professional Services report, published in early 2026 and now in its fourth year, is the same series in a later moment.

Organisational GenAI use moved from 22% in 2025 to 40% in 2026. Share of respondents saying GenAI is currently central to their organisation’s workflow moved from 13% to 16%. Among organisations that have deployed GenAI, more than 80% of current users report weekly or more frequent use. This is the adoption story so far.

Among the 18% that measure, the report notes that the metrics are mostly internal: cost savings, employee usage, employee satisfaction. Few organisations measure client satisfaction, projected revenue generation, or new business won from AI implementations. Thomson Reuters’ own phrasing, in the same section, is that the operational impact of AI is largely divorced from business impact.

A careful reading of the 82% number matters. It is the sum of the respondents who say the organization does not measure and the respondents who say they do not know whether it does. Those are not the same thing. An organization can have ROI metrics its respondent is not aware of; the report’s framing is generous to the possibility. But the combined figure is what TR chose to lead with, and even if every “don’t know” turns out to be a yes, the 2025 story about benefits being “no longer predicted but quantified” is not really playing out in the 2026 report.

Tool-use data carries a pattern the February 2026 piece on privilege already argued for. Among individuals, publicly available GenAI tools like ChatGPT are at 57% use, enterprise business GenAI tools like Microsoft Copilot at 44%, and paid industry-specific legal AI products at 31%. For law firms the split is 55% public, 38% enterprise, 35% legal-specific. For corporate legal teams the enterprise number jumps to 62%, higher than either the public-tool number at 60% or the legal-specific number at 36%. The February piece argued that public tools were outpacing legal-specific products in observed user behaviour; the 2026 report keeps the pattern, widens it for corporate legal, and adds an enterprise-Microsoft layer the prior piece did not frame. Legal-specific tool use grew 14 percentage points year-on-year, and another 42% of respondents plan to adopt or are considering it. This is consistent with Bad Faith’s recent take: https://badfaith.beehiiv.com/p/lawyers-and-ai-check-your-privilege

Threat-perception metrics all moved in the same direction. Unauthorised practice of law as a major threat rose from 36% to 50% year-on-year. Job loss as a major threat rose from 15% to 24%. Fewer billable hours as a major threat moved from 10% to 14%. After an additional year of using AI tools, it seems that more lawyers are waking up to the threat.

Several of the 2025 report’s hero statistics are simply gone from the 2026 edition. The 240-hours-per-professional time-savings figure that anchored the 2025 forecasting claim has vanished. Alongside it, the $19,000-per-professional and $32-billion-combined projections are also absent. The AI Success Pyramid has been replaced by narrative sections on strategy and readiness, without its four-layer ladder-to-outcomes architecture. The 91%-want-higher-than-human-accuracy and 41%-require-100% findings are not repeated as a paired measurement. None of these absences is flagged as a correction or reframing this year; the numbers simply do not reappear.

The Missing Middle

Between a bought AI license and a changed business model, there are five things that have to happen, and the 2026 report shows one of them running ahead of the rest by a large margin. The easiest one is adoption: a license is purchased, the tool is opened weekly, and that alone can be called “organizational use.”

Beyond that point, the stages harden. A firm that has moved past adoption into integration has rebuilt a workflow around the tool rather than letting the tool sit adjacent to existing practice. Measurement follows when someone attaches numbers to the business case, like tracking cost savings, revenue impact, or time reallocation against what the tool is actually doing for the organisation. Governance comes later still: it is where client instructions and firm policy align so the firm can apply the tool consistently across matters from different clients. And the last stage is business-model change, where billing, staffing, matter scoping, and client delivery reshape themselves around what the tool is actually doing.

The 2026 evidence says the profession has cleared adoption and left the other four unresolved. Adoption accelerated; the rest did not.

Adoption is the easy stage, and the report clearly sets that out. Workflow-centrality is the hardest number to argue with. But has there been any meaningful progress? After two years of dense vendor selling, hundreds of millions of dollars of product investment, and so much attention, the share of legal work that GenAI sits in the middle of moved only three percentage points. 87% of respondents said they expect AI to be central within five years. 16% said it is now. Thomson Reuters frames this gap closing as merely a matter of time, with nothing but vibes in support.

Inside the 82%, the don’t-knows matter in a specific way. 42% of respondents say their organization does not measure ROI, which is the absence of a practice. Another 40% lack even a view on whether it does, which is the absence of a position from which to speak about the practice. Those compound. An organisation whose respondent does not know whether it measures has almost certainly decided, through whatever combination of signal and omission, that the measurement is not a thing worth telling its staff about. Either the organisation does measure and has not told its AI-adjacent employees what the metrics are, or it does not measure and its employees have inferred that from the absence of the question. Either way, the business case the 2025 report framed as being quantified has not been communicated to the people whose work the measurements would be about.

Several other vendor-research studies have landed in the same place. MIT’s NANDA initiative published a finding in mid-2025 that 95% of enterprise GenAI pilots produce no measurable P&L impact; the failure, NANDA argued, was not about model quality but about organisational integration. McKinsey’s March 2026 global AI survey reported that 86% of enterprises have increased their AI budgets while only 29% say they can reliably measure the return. Gartner placed enterprise AI in the trough of disillusionment for 2026, and a vendor-survey round-up in Legal IT Insider in January 2026 quoted iManage’s VP of AI services calling the year “the reality check” on 2025’s agentic-AI marketing. Those are different populations from the TR legal review, but the structural pattern is the same across all four: each stage in the chain from adoption through measurement costs more effort than the one before.

Elsewhere in the 2026 report sits a quieter contradiction. Non-adopters are behind, according to the report. AI for innovation’s sake no longer works, according to the same page. And most organisations are not measuring ROI, per the executive summary a few pages earlier. Those three positions form a triangle: adopt now; but not for its own sake; but also we do not know if it is paying off. Each corner of the triangle is defensible on its own. Read together, they describe a publication that knows the 2025 story no longer matches the 2026 data and has yet to write the replacement. The corner that says non-adopters are behind is the 2025 framing carried forward. Sitting next to it, the corner that says innovation-for-its-own-sake no longer works is TR’s post-2024 cooling statement, a refusal to sell on capability alone. Third corner: the report concedes measurement absence, and with it TR refuses to argue from benefits the reports cannot currently ground. Writing a clean recommendation for 2027 is not something the 2026 report can do, because any recommendation would have to choose which of those three corners it is speaking from, and each choice costs something. A vendor-research publication that can say all three simultaneously has, in effect, conceded the framework its prior year used to decide what “doing this well” meant.

Business-model change is the most cautious line to hold. The 2026 report measures the inputs: threat perception rose across UPL, jobs, less work, and billing. Outputs are not directly measured. Billing-model shifts, matter-scoping changes, firm unbundling: those are the things a profession’s business model would show under genuine pressure. The 2026 report forecasts them as “what’s next” and surfaces the conditions that could produce them. It does not yet show them. The middle’s fifth stage is latent.

Also carried in the 2026 report is a rhetorical move on accuracy. In 2025 the report quantified the profession’s demand for a higher-than-human accuracy standard (91% agreeing, 41% requiring 100% before unsupervised use). A year later the same instrument does not ask the same question in the same form. What it does do is argue, in its commentary, that the goal should not be 100% accurate output but a better mix of human and technology verification. TR has moved from measuring the profession’s accuracy expectation to arguing against the accuracy expectation. I imagine its sales reps are parroting similar lines on calls.

Accuracy reframing is also, not incidentally, the marketing infrastructure for CoCounsel. CoCounsel’s product pitch is grounded accuracy: citations anchored to Westlaw content, verification workflows that catch hallucinations, outputs that a lawyer can defend in court without the risk of the sanctions cases that accumulated through 2024 and 2025. A profession that requires 100% accuracy before unsupervised AI use is a profession for whom grounded accuracy is not a premium feature but a necessary one. A profession for whom “a better mix of human and technology verification” is the accepted standard is a profession for whom grounded accuracy is one product claim among several. Moving the rhetorical goalposts from perfection to the-right-mix changes the market that CoCounsel is selling into. And the 2026 report is the commercial infrastructure of that move.

Why the CoCounsel Story Depends on the Second Number

Thomson Reuters’ commercial AI story depends on the second stage of the middle, not the first. That is why the 2026 report matters here.

As our TR deepdive argued, Thomson Reuters’ AI bet is workflow-integration-dependent, not adoption-dependent. What makes Westlaw and Practical Law commercially valuable (the editorial moat) becomes an AI moat only if the firms using CoCounsel are rebuilding research, drafting, review, and client delivery around trusted TR content. A lawyer asking ChatGPT to write up a meeting does not validate the thesis. A firm rebuilding its research workflow around an assistant that cites Westlaw headnotes at source-of-truth quality does. Usage of the first kind is what the 2026 report shows in very large numbers. Usage of the second kind is what it shows at 16% of the legal market by the publisher’s own definition of workflow-central. https://badfaith.beehiiv.com/p/deepdive-thomson-reuters

Sharper than our previous privilege piece anticipated is the competitive picture the 2026 report surfaces. For corporate legal teams, the individual-use rate of enterprise business GenAI tools is 62%. For the same population, paid industry-specific tools are at 36%. That means the default AI interface inside corporate legal is Microsoft, not Harvey or CoCounsel or Lexis+. For law firms the picture is different. Enterprise sits at 38% and legal-specific at 35%, but the corporate-legal number is the one that matters for where the market is moving. Jones Walker’s analysis reports in-house GenAI use climbed from 23% to 52% year-on-year and 64% of in-house teams expect reduced reliance on outside counsel. This means that the customer segment leading the adoption of the technology Thomson Reuters has bet on is the segment that is most likely to route its AI use through Microsoft.

Sharpening against workflow-centrality is the ACV story specifically. As the TR deepdive argued, GenAI-Enabled ACV is reclassification-vulnerable: it records products that have GenAI features, rather than incremental revenue earned from those features being used at workflow depth. The 2026 data gives the gap between ACV and integration a pair of numbers to contrast against each other. GenAI-Enabled ACV rose from 15% at the end of the third quarter of 2024 to 28% at the end of the fourth quarter of 2025: thirteen percentage points in about fifteen months. Workflow centrality, in the same publisher’s instrument, rose from 13% in the 2025 report to 16% in the 2026 report: three percentage points in about twenty months. ACV is a pricing-and-classification story; workflow centrality is the integration story. The pricing story is moving more than four times faster than the integration story on raw points, and in a shorter window.

The investor case that reads all of this as a buy deserves some pushback. Gian Estrada’s April 2026 piece at Tikr argues that Thomson Reuters’ AI bet is paying off: the 15-to-28% ACV trajectory has not been matched by RELX or Wolters Kluwer, CoCounsel’s one million users and roughly one-third contribution to the Legal Professionals segment show commercial traction, and the 2026 EBITDA margin projection at 40.1% plus a $134.69 mean analyst price target across 17 covering analysts suggest the stock is undervalued at current levels. Estrada’s framing of the thesis is direct: the winners in legal AI will be the firms with trusted content domain expertise. On the trusted-content half of that statement, Estrada is right and so is Thomson Reuters. Westlaw and Practical Law are not things an LLM wrapper can replicate from a crawl.

What the investor case assumes is that seat count equals integration, that ACV classification equals revenue quality, and that the commercial metrics will continue to move ahead of the operational ones indefinitely. But integration and measurement are not moving as fast as ACV; that is the publisher’s own disclosure in the 2026 report. Consider the reportorial coverage of the CoCounsel milestone. Bob Ambrogi at LawSites quoted TR’s own framing of the million users as “not for pilots, not for experiments, as core infrastructure for how they work.” Which is the framing the 2026 report itself quietly contradicts. Share of legal workflow that sits at the centre of AI use: 16%.

“Core infrastructure” and 16% are not describing the same world.

The operative question for an investor holding Thomson Reuters is how many quarters the divergence between the commercial metrics and the workflow metrics can run before one of them has to bend toward the other.

A habit layer is where a profession reaches first. It is the tool the lawyer opens when she wants to check a clause, draft a paragraph, or summarise a thread. Public tools like ChatGPT own the habit layer for individual use, and Microsoft Copilot owns it for enterprise use in corporate-legal environments. A workflow layer is where institutional knowledge, matter-tracking, and billable time live. What Westlaw and Practical Law have historically been, and what CoCounsel is trying to extend those into, is the workflow layer. Thomson Reuters competes with ChatGPT and Harvey in the habit layer, where it is already losing on usage counts. It competes with Microsoft in the workflow layer, where the 62%-corporate-legal-enterprise-use number is the structural threat the 2025 Success Pyramid did not have to account for.

The Side-Channel Problem

Our February piece on lawyer privilege predicted that adoption would be shaped by structural ambiguity more than by model capability. It argued that privilege questions, discovery exposure, confidentiality constraints, and the uneven diffusion of general-purpose tools would shape AI adoption as much as any technical benchmark. TR’s 2026 data vindicates that position so far.

The individual-versus-firm adoption gap makes this concrete. An independent survey fielded in early 2026 by the 8am Report found that legal-professional individual GenAI adoption moved from 31% to 69% year-on-year, while only 46% of law firms had implemented general-purpose AI tools at the firm level and only 34% had adopted legal-specific products. More than half of the firms surveyed provide no AI training, and 43% have no formal policy. That is the shadow-adoption pattern our February piece argued for, made explicit by a non-TR data source. Individuals are 23 percentage points ahead of firms on adoption, which means whatever the firm’s AI policy is, the practitioners inside it have already moved past it.

Threat-perception shifts are the second piece. Unauthorised practice of law moved 14 points as a major threat. Jobs moved 9. Less need for professional services moved 8. Fewer billable hours moved 4. Direction is uniform. The profession spent the year using the tool and noticed what it does. The February piece framed structural ambiguity as the binding constraint on adoption; the 2026 data keeps that framing and extends it. Ambiguity has not narrowed. It has hardened across every dimension the report measures.

Client-firm governance is the third piece. 40% of firms receive conflicting AI guidance from clients. Two-thirds of corporate respondents want their outside firms to use AI, while fewer than 20% require it through guidelines or RFPs. Jones Walker’s analysis adds that 60% of in-house teams do not know whether their outside counsel uses GenAI on their matters at all. The binding constraint on firm-wide adoption is no longer the firm’s own strategy. It is the matter-by-matter instruction the firm has to reconcile as it comes in. A firm that has solved strategy, leadership, operations, and individual engagement still has to field 40 different client instructions at the intake gate.

The economic shape of that friction has a specific form. A firm that wants to use AI efficiency to lower the cost of a matter has to first know whether the client wants efficiency or wants time. Some clients are paying hourly and want AI to reduce the bill. Other clients are paying a fixed fee and want AI to improve quality within the scope. A third set of clients is paying hourly but has told the firm, sometimes in writing and sometimes by proxy, that the client does not want AI anywhere near their matter because of confidentiality, privilege, discovery, or sanctions concerns. The 2026 report’s 40% conflicting-direction number is the number of firms that have at least one of the first set and at least one of the third set in their book of business, often on matters that look similar from the outside. Workflow-level integration, in a firm subject to that governance environment, is a policy problem before it is a technology problem. And the 2025 Success Pyramid did not describe a policy problem.

The implication for the profession runs in a different direction from “lawyers should use the tools less.” Conditions under which the tools are being used have drifted from the conditions the 2025 report described. Adoption happens mostly outside firm channels. Measurement is absent. Governance is client-by-client. The profession has spent a year moving through the first stage of the middle and running up against three stages TR’s framing doesn’t consider.

Firm-side risk exposure from this is real and concrete. A profession in which 69% of practitioners use general-purpose AI tools individually while 43% of firms have no written AI policy has, functionally, staff running AI workflows their own partners could not consistently describe. If a sanctions case lands on one of those workflows, the firm’s defence (what the tool was, how it was used, what policies governed the use, which client approved which scope) has to be reconstructed after the fact from practitioner-level habit rather than from documented firm practice. Sanctions and privilege were flagged by the February piece as the binding risks. What the 2026 data tells us is that those risks are being carried by practitioners whose employers do not yet know which of them are carrying which risk. That is not a technology problem the 2025 Pyramid’s Individual-Engagement layer was ever built to describe.

The 2027 Report

The 2026 Thomson Reuters report tells the story of a drift without naming it as one. Every number the 2025 framing could not carry is registered here (the three-point workflow-centrality move, the 82%-don’t-measure disclosure, the hardened threat perception, the client-governance friction) without yet narrating the drift as such.

The next 2027 report will tell us whether the drift is the story. Vendor-research cycles have a pattern. Year t is when the publisher identifies the new wave and sells the upside. In year t+1, when the upside is hard to demonstrate, the publisher replaces the wave with the next wave. By year t+2, the publisher is claiming the new wave has been the story all along. Visibly, the 2026 report is in year t+1 of that cycle. Agentic AI is the forward narrative. GenAI is where the measurement disclosure sits. If the 2027 report retires GenAI as the hero and treats agentic as the story the 2025 report was pointing toward the whole time, the pattern is confirmed and the framework the 2026 report drifted away from becomes the thing the 2027 report pretends never happened. If instead the 2027 report returns to the GenAI measurement problem and starts reporting on what 18% of organisations found when they measured, the drift was a correction TR is committing to. Those are different futures, and the report itself is the test that tells us which one we are in.

Everyone interested in legal AI should be watching the 2027 report for the same reason. The investor case depends on which report lands. The profession’s governance problem turns on it too. And Thomson Reuters’ own editorial authority, which is what makes its reports worth reading in the first place, is what’s ultimately being tested.

 
