An AI agent I built lied to me last week. Confidently.
It told me a job was finished. It wasn't. I only caught it because I'd just spent two weeks building the system meant to catch exactly that. That same system is why our sharpest agent can read sales leads and research companies, but it can't write a single email, open our private files, spend real money, or act on its own. Every agent we run starts as an intern. It can look, but it can't touch. On purpose.
TL;DR: Govern a new AI agent the way you'd onboard a new hire. Give it an identity it can't fake, log every move, and grant access in stages it has to earn. Start it as an intern that can read but can't act. Promote it only when you trust it, and keep the power to pull it back.
What does it mean to treat an AI agent like an intern?
Treating an AI agent like an intern means it can read and research on day one, but it can't act until it earns the right. When you hire a person, they get a badge that opens a few doors, a manager who watches the work, and a probation period. An AI agent should get the same deal.
Most companies do the opposite. They hand a brand-new agent full access on day one, with no probation and no fast way to shut it off. That's an onboarding failure, not a technology problem. MIT found that 95% of enterprise AI pilots deliver no measurable return, and the ones that worked had real governance behind them.
We built our agent to climb a ladder, the same way a person earns trust at work. Four rungs:
Intern: reads and researches, nothing more.
Junior: can start to act, but only with a human signing off.
Senior: works on its own across most of the job.
Principal: runs with full autonomy, including the thing that scares people most, creating other agents.
Right now, ours is an intern. You don't grant trust up front. The agent earns it over time, and you can take it back. An intern you can hire but can't fire isn't worth much. Neither is an agent you can't pull back. If you can't stop it, it's a liability with a login.
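A minimal Python sketch of that ladder, for illustration only. The capability names, the Agent class, and the promote and demote methods are assumptions made up for this example, not the actual system described here.

```python
from enum import IntEnum


class Level(IntEnum):
    INTERN = 1
    JUNIOR = 2
    SENIOR = 3
    PRINCIPAL = 4


# Hypothetical capability names; the article does not list exact permissions.
MIN_LEVEL = {
    "read": Level.INTERN,
    "research": Level.INTERN,
    "act_with_approval": Level.JUNIOR,
    "act_alone": Level.SENIOR,
    "create_agent": Level.PRINCIPAL,
}


class Agent:
    def __init__(self, name: str, level: Level = Level.INTERN):
        # Every agent starts on the lowest rung unless told otherwise.
        self.name = name
        self.level = level

    def can(self, capability: str) -> bool:
        # Unknown capabilities are denied by default.
        required = MIN_LEVEL.get(capability)
        return required is not None and self.level >= required

    def promote(self) -> None:
        # One rung at a time; a human decides when to call this.
        if self.level < Level.PRINCIPAL:
            self.level = Level(self.level + 1)

    def demote(self, to: Level = Level.INTERN) -> None:
        # Trust can be taken back; this never raises the level.
        if to < self.level:
            self.level = to
```

The point of the sketch is that the level lives in the platform, not in the prompt. A fresh Agent("scout") passes can("read") and fails can("act_with_approval") until someone calls promote(), and demote() drops it straight back to intern.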
Why do most AI agents start with too much access?
Most AI agents start over-privileged because companies treat them like software you configure once, not like a hire you onboard. The numbers show the cost. 86% of AI agents get deployed with no security sign-off at all (Gravitee, February 2026). Once one's running, 91% of companies say they can't stop it before it acts.
Here's the part most leaders miss. An attacker usually has to work sideways through your network to reach the systems that count. Your agent skips that part. It's already inside the systems it can change, because you handed it those privileges when you set it up. So the blast radius isn't theoretical. It's whatever you granted on day one.
We wanted to show what "in control" actually looks like, so we took every open-source tool that our framework, the Agentic Trust Framework, recommends and wired them into one running system. Not a slide. A real multi-agent setup on a Mac Studio in my house that governs itself. We pointed it at a real sales process and watched what happened. (I wrote more about this in "Assume Your AI Agents Will Be Breached.")
What are the five controls that keep an AI agent in check?
Five controls keep an AI agent in check: an identity it can't fake, a full record of its actions, a filter on the data it reads and sends, walls around where it can go, and a kill switch. The key design choice is that all five sit outside the model. The agent can ask for anything. A separate layer decides what it actually gets.
That distinction counts. The first thing a lot of clients show me is a page of rules they wrote for the agent, then they ask why it still went off the rails. Telling an agent what not to do is like onboarding an employee with a rulebook and hoping. It follows instructions right up until it doesn't. A clever prompt or a poisoned document, and your rules go out the window.
Here's what each control does, in words a CEO would use:
An ID badge it can't fake. Every agent carries an identity the system checks before it does anything. No badge, no action.
A record of everything. Every move gets logged. Right now the system is just learning what normal looks like. It watches. It doesn't judge yet.
A bouncer on the data. Before the agent reads anything, the system strips out private details like emails and phone numbers. Before it sends anything out, it checks again.
Walls around where it can go. The agent can only reach the data and the parts of the network its level allows. It can't even see the paths it isn't cleared for.
A kill switch. One command shuts the agent down. We tested it, and I'll give the honest version below.
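To make "outside the model" concrete, here is a rough Python sketch of a gate that checks the badge, checks the allow-list, and logs the request either way. Everything named here is an assumption for illustration: the HMAC-based badge, the action names, and the in-memory log are stand-ins, not the tools the system actually uses.

```python
import hashlib
import hmac
import time

# Assumption: demo secret only. A real system would keep this in a secrets manager.
SECRET = b"demo-only-secret"

# Hypothetical action names per level.
ALLOWED = {"intern": {"read_lead", "web_research"}}

# Every request lands here, allowed or not.
AUDIT_LOG: list[dict] = []


def issue_badge(agent_id: str) -> str:
    # The badge is a keyed hash, so the agent can't mint one without the secret.
    return hmac.new(SECRET, agent_id.encode(), hashlib.sha256).hexdigest()


def gate(agent_id: str, badge: str, level: str, action: str) -> bool:
    """Decide, outside the model, whether a requested action may run."""
    expected = issue_badge(agent_id)
    badge_ok = hmac.compare_digest(badge.encode(), expected.encode())
    allowed = badge_ok and action in ALLOWED.get(level, set())
    AUDIT_LOG.append({
        "ts": time.time(),
        "agent": agent_id,
        "action": action,
        "badge_ok": badge_ok,
        "allowed": allowed,
    })
    return allowed
```

The agent can ask gate() for anything. A refused request still leaves an entry in AUDIT_LOG, which is the "receipt" idea in miniature.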
We gave the intern a real job: a CISO at a company we'll call FortMesh sends an inquiry after hearing me speak. The full job runs thirteen steps, from first contact to a draft proposal. (FortMesh is made up, modeled on real situations, never contacted.) The intern could read the lead, scrub the private details, and research the company on the public web. It couldn't open our private playbook, write a response email, reach premium research sources, create a client file, write a proposal, or spend past its budget. The system said no twelve different ways, and each refusal left a receipt in the logs.
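The scrubbing step can be sketched in a few lines of Python. This is a deliberately naive stand-in that uses two regular expressions; the patterns, the placeholder tokens, and the sample address are assumptions for illustration, and real PII detection needs far more than this.

```python
import re

# Naive patterns for illustration; they will miss plenty of real-world formats.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text: str) -> str:
    """Replace email addresses and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


# Made-up inbound lead text; the same function would run again on outbound text.
lead = "Reach me at jane@fortmesh.example or 555-123-4567."
print(scrub(lead))  # Reach me at [EMAIL] or [PHONE].
```

Running the same scrub() on the way in and on the way out mirrors the "checks again" step: the agent never sees the raw details, and nothing it writes can carry them back out.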
On the kill switch, here's the honest version. We shut an agent down with one command, well under our one-second target. At intern level, that fast kill is a simple switch held in memory. The fuller version, the one that also rips away the agent's network identity, is wired in and gets exercised more as we climb the levels. I'm not going to sell you a network-wide kill we haven't fully earned yet. But the plain version is what counts: the number one fear with agents is that one goes wrong and you can't stop it. 91% of companies say they can't. We can, and we timed it.
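For a sense of how small the intern-level version can be, here is a hedged Python sketch of a switch held in memory. The KillSwitch class and run_agent loop are hypothetical names for this example. Like the plain version described above, it only stops the agent between steps and does nothing to revoke a network identity.

```python
import threading


class KillSwitch:
    """An in-memory flag the platform can flip from any thread."""

    def __init__(self) -> None:
        self._stopped = threading.Event()

    def kill(self) -> None:
        self._stopped.set()

    def is_killed(self) -> bool:
        return self._stopped.is_set()


def run_agent(steps, switch: KillSwitch) -> list:
    """Run steps in order, checking the switch before each one."""
    done = []
    for step in steps:
        if switch.is_killed():
            break  # stop before the next action, not after it
        done.append(step())
    return done
```

The check sits in the loop that runs the agent, not in the agent's instructions, so the agent has no say in whether it stops.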
Why doesn't Zero Trust fully cover AI agents?
Zero Trust doesn't fully cover AI agents because it was built for people and the devices they carry. John Kindervag created Zero Trust at Forrester in 2010. The rule is simple: never trust, always verify. It keeps checking the connection: who's connecting, and from what device. What it doesn't check is the meaning of what's moving through that trusted connection.
That gap is where agents get hurt. A poisoned instruction rides inside a fully verified channel, and nothing blinks. The connection is clean. The payload isn't. (It's the same mechanism behind the agent-targeted phishing I covered in "There's a New Kind of Phishing.")
So Zero Trust is necessary here, but it isn't enough on its own. It needs an extra push. Check what the agent is about to do, not only that it connected. Make it earn its reach in stages with an identity it can't fake, instead of handing over everything at once. That ladder is Zero Trust for agents. The platform, not the prompt, decides what the agent is allowed to do.
What is agent debt, and why does it cost more later?
Agent debt is the hidden cost of standing up an AI agent today and never maintaining it. The model underneath gets updated and its behavior shifts. The tools it calls change. The data it reads drifts. The job you wrote it for last quarter isn't the job this quarter. None of this announces itself, so the gap between what it's doing and what you need quietly widens.
The industry calls this the technical debt of AI agents. A simpler term is agent debt. You take the easy path today, it works, and the bill shows up later with interest. With agents the interest compounds faster, because the thing keeps acting on its own while the debt builds.
This is the conversation that surprises clients most. They budget for the launch. They don't budget for the year after, and the year after is where agent debt lives. That's why "set it and forget it" is the most expensive phrase in AI right now. An agent works more like a hire than a tool. New hires get reviews. Their roles change. Same with an agent, except it moves at machine speed and won't tell you when it's drifting.
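One way to make that drift announce itself is to record what the agent depended on at launch and compare it on a schedule. The Python sketch below is an assumption-heavy illustration: the snapshot fields and their values are invented, and a real review would look at behavior, not just version strings.

```python
def drifted(baseline: dict, current: dict) -> list[str]:
    """Return the names of fields that differ between two snapshots."""
    keys = sorted(set(baseline) | set(current))
    return [k for k in keys if baseline.get(k) != current.get(k)]


# Hypothetical snapshot taken at launch.
baseline = {
    "model": "model-v1",
    "tools": ["web_search"],
    "job": "qualify inbound leads",
}

# Hypothetical snapshot taken a quarter later.
current = {
    "model": "model-v2",
    "tools": ["web_search"],
    "job": "qualify inbound leads",
}

print(drifted(baseline, current))  # ['model']
```

A non-empty list is the trigger for a review, the same way a role change triggers one for a hire.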
