Two studies looked at enterprise AI this year and reached opposite conclusions. One said almost no one is making money. The other said most companies already are. Both are right. The fight over which number is true skips the one finding that can actually help your business.
The short version. The headline numbers range from 5% success to 75% success because each study asked a different question. Strip that away and every report agrees on why projects fail: the work around the AI never got redesigned, and nobody set rules for what the AI could do on its own. The single best move this week is to figure out how you would stop your most-used AI tool in five minutes if it went wrong.
Why do AI studies give such different results?
The studies disagree because they measured different things, not because the technology changed. MIT's NANDA group reviewed public announcements, talked to 52 companies, and surveyed 153 leaders, then reported that 95% had seen no return yet. Wharton asked about 800 leaders whether they were seeing value, and three in four said yes. One looked at whether specific tools moved the bottom line. The other asked leaders for a gut read. Both answers can be true at the same time.
The middle of the range tells the real story. McKinsey found only 39% of companies saw a real change to their bottom line, and for most of those the change was under 5%. S&P Global found 42% of companies walked away from most of their AI work last year, up from 17% the year before, with companies scrapping almost half their pilots before they ever went live.
What actually makes AI projects fail?
AI projects fail in the gap between the demo and the rollout, not in the technology. The tools are better than ever, and projects still die when companies try to move them from a polished demo into real daily work. McKinsey found the one habit behind the rare wins: rebuild the work before you add the AI. Bolt AI onto a broken process and you get a faster broken process.
Your customers feel the same gap. Qualtrics asked more than 20,000 people and found nearly 1 in 5 got nothing out of AI customer service, about four times the failure rate of AI everywhere else. Over the holidays, Liveops found 85% of shoppers said AI made service faster, yet more than half still had to go find a human, and only 17% wanted companies using more AI next year. People aren't rejecting AI. They're rejecting bad AI.
Why does redesigning the workflow only solve half the problem?
Redesigning the work is half the job because an AI agent is not a normal tool. A spreadsheet waits for you. An AI agent acts on its own. It reads, decides, talks to other systems, and changes things without asking. If you rebuild the work but never decide what that agent is allowed to do, you've handed a fast, skilled new worker the keys and walked off.
It'll do exactly what you told it, and often do it well. The trouble comes when it also does the parts you didn't mean. So plan for both at once. Rebuild the work, and set clear limits on the AI doing it. That second step is the one most companies skip.
What does this look like when it goes wrong?
It looks like a small problem that multiplies fast. I run four AI agents in a home lab I built myself, and I wrote a book on securing this technology, so I went in confident. The agents still got me. One night the lead agent made 47 copies of itself while I slept, each one starting more work, none of them stopping. I woke up to a $300 bill and spent the morning figuring out what all 47 had done.
Four agents did that. Now picture the same thing across 400 agents you can't see, with real customers and real money on the line. That failure, like the others in my lab, wasn't the AI being dumb. The work it did was smart. The failures were mine. I'd given an agent a job without enough limits and handed it the wrong access. Those aren't technology problems. They're management problems.
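A limit of that kind can be sketched in a few lines. This is an illustrative Python sketch, not the setup from the lab described above: `SpawnGuard`, `SpawnLimitExceeded`, `max_agents`, and `max_depth` are hypothetical names, and it assumes every new agent has to ask the guard for permission before it starts.

```python
class SpawnLimitExceeded(RuntimeError):
    """Raised when starting another agent would break a limit."""


class SpawnGuard:
    """Hypothetical guard: counts running agents and enforces hard caps."""

    def __init__(self, max_agents: int, max_depth: int) -> None:
        self.max_agents = max_agents  # most agents allowed to run at once
        self.max_depth = max_depth    # how many generations of copies are allowed
        self.active = 0               # agents currently running

    def request_spawn(self, parent_depth: int) -> int:
        """Approve one new agent and return its depth, or raise if over a limit."""
        child_depth = parent_depth + 1
        if child_depth > self.max_depth:
            raise SpawnLimitExceeded(f"depth {child_depth} exceeds {self.max_depth}")
        if self.active >= self.max_agents:
            raise SpawnLimitExceeded(f"already running {self.active} agents")
        self.active += 1
        return child_depth

    def release(self) -> None:
        """Call when an agent finishes so its slot frees up."""
        self.active = max(0, self.active - 1)


# A lead agent tries to start 47 copies; the guard stops it at 4.
guard = SpawnGuard(max_agents=4, max_depth=2)
started = 0
for _ in range(47):
    try:
        guard.request_spawn(parent_depth=0)
        started += 1
    except SpawnLimitExceeded:
        break

print(started)  # 4
```

The design choice here is that the cap is enforced outside the agent, so an agent that keeps trying to start more work simply gets refused.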
What can business leaders do this week?
Start with one question you can answer in five minutes: how would you stop your most-used AI tool if it went wrong right now? If you don't have an answer, that's your first project. A real stop button beats a spending limit every time, because the damage from an agent isn't only money. It's the customer records it touched and the systems it changed before anyone noticed.
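For a technical team, the simplest version of a stop button is a flag the agent must check before every action. The following is a minimal Python sketch under stated assumptions: `KillSwitch` and `run_agent` are hypothetical names, not any product's API, and it assumes each action is small enough that checking between actions is fast enough.

```python
import threading
from typing import Callable, List, Optional


class KillSwitch:
    """Hypothetical stop button that any person or monitor can press."""

    def __init__(self) -> None:
        self._stopped = threading.Event()  # thread-safe on/off flag
        self.reason: Optional[str] = None

    def stop(self, reason: str) -> None:
        """Record why the agent was stopped, then raise the flag."""
        self.reason = reason
        self._stopped.set()

    def is_stopped(self) -> bool:
        return self._stopped.is_set()


def run_agent(actions: List[Callable[[], str]], switch: KillSwitch) -> List[str]:
    """Run actions in order, checking the switch before each one."""
    completed = []
    for action in actions:
        if switch.is_stopped():
            break  # nothing after the stop gets executed
        completed.append(action())
    return completed


# The second step notices a problem and presses the button,
# so the third step never runs.
switch = KillSwitch()
steps = [
    lambda: "read ticket",
    lambda: switch.stop("unexpected change detected") or "flagged problem",
    lambda: "update customer record",
]
result = run_agent(steps, switch)
print(result)  # ['read ticket', 'flagged problem']
```

A flag like this only stops work that the loop itself controls. Access the agent already holds in other systems would still need to be revoked separately.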
From there, treat every AI agent like a new employee. Ask who it is, what it's allowed to do, what information it can see, where it can go, and what happens if it goes rogue. Those five questions are the core of the Agentic Trust Framework, a governance standard I built that the Cloud Security Alliance published in February 2026. You don't need the framework to start. You need to ask the questions before the agent touches real work.
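One way to make those five questions hard to skip is to write the answers down in a record and refuse to deploy while any answer is blank. The sketch below is an illustration in Python, not the Agentic Trust Framework's own format: `AgentProfile`, its field names, and `unanswered` are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AgentProfile:
    """Hypothetical record with one field per question."""

    identity: str = ""                                           # who is it?
    allowed_actions: List[str] = field(default_factory=list)    # what may it do?
    visible_data: List[str] = field(default_factory=list)       # what can it see?
    reachable_systems: List[str] = field(default_factory=list)  # where can it go?
    stop_procedure: str = ""                                     # what if it goes rogue?


def unanswered(profile: AgentProfile) -> List[str]:
    """Return the names of the questions that still have no answer."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# An agent with only two of the five questions answered.
draft = AgentProfile(identity="support-bot", allowed_actions=["draft_reply"])
print(unanswered(draft))
# ['visible_data', 'reachable_systems', 'stop_procedure']
```

An empty list from `unanswered` means every question has at least some answer on file. It does not say whether the answers are good ones.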
Frequently asked questions
Is AI actually worth the investment for most companies?
It can be, but not automatically. Many companies see no return in the first year, while others are already profitable. The difference isn't the technology. It's whether the company redesigned the work and set limits on the AI before rolling it out.
What's the real difference between the MIT and Wharton findings?
They asked different questions. MIT measured whether specific custom tools moved the bottom line in a short, roughly six-month window, and found 95% hadn't yet. Wharton asked about 800 leaders whether they felt AI was paying off, and three in four said yes. One is a hard financial read, the other a confidence read, so both can be accurate.
What does it mean to govern an AI agent?
It means deciding, in advance, what the agent is allowed to do and putting limits in place to enforce that. It covers what data it can see, which systems it can reach, how it hands work to other agents, and how you stop it. Think of it as managing a tireless employee who follows instructions exactly, even when the instruction was wrong.
How do I know if my AI is safe to use?
A quick test: can you stop it in five minutes, and do you know exactly what it can access? If either answer is fuzzy, the AI isn't governed yet. That doesn't mean you rip it out. It means you set the limits and the stop process before you expand how you use it.
Where should a leader start?
Pick your most-used AI tool, write down how you'd shut it off in an emergency, and list what it can currently access. That single exercise surfaces most of the risk. From there, run the five governance questions on any agent before it handles real customers or real money.
The bottom line for leaders
The 5% versus 75% gap in AI studies comes from different questions, not different technology.
Every major study agrees projects die between the demo and the rollout, so buying better technology alone won't move your numbers.
Redesigning the work is necessary but not enough. An AI agent acts on its own, so you also have to set limits on what it can do.
Most AI failures are management failures: too much access, too little oversight, no fast way to stop the agent.
Your one move this week: figure out how you'd stop your most-used AI tool in five minutes. If you can't, that's the first project.
I go deeper into the rest of my lab failures, the exact fixes, and the five questions I run on any agent in my weekly newsletter, Trusted Agents, at trustedagent.substack.com. Questions about any of this? Email me at josh@massivescale.ai.
