Who Watches the AI Agents?

Gartner predicted in June 2025 that a category of AI called guardian agents will capture 10 to 15 percent of the agentic AI market by 2030 (Gartner, 2025, https://www.gartner.com/en/newsroom/press-releases/2025-06-11-gartner-predicts-that-guardian-agents-will-capture-10-15-percent-of-the-agentic-ai-market-by-2030). The prediction deserves more attention than it has received, not because the market share number is interesting in itself, but because the existence of this category reveals something uncomfortable. We have started deploying AI agents to make decisions inside organizations, and we do not yet have reliable mechanisms to catch when those agents make the wrong ones.

The context here matters. Gartner also predicted that by the end of 2026, 40 percent of enterprise applications will feature task-specific AI agents, up from less than 5 percent in 2025 (Gartner, 2025, https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025). That is a massive expansion of autonomous decision-making inside enterprise environments. Agents are approving expense reports, routing support tickets, generating contract language, flagging security anomalies, and making procurement recommendations. Yet only 17 percent of organizations have actually deployed agents so far, while more than 60 percent plan to do so within two years. The deployment wave is just beginning.

When agents make decisions, something has to catch the bad ones. This is the meta-governance problem. It is not enough to say "humans will review the outputs." If the agent is processing thousands of decisions per hour, humans cannot review them all. If the agent is embedded in a workflow where its output feeds directly into the next automated step, there may be no review point at all before the decision has consequences. This is why the guardian agent category exists. Guardian agents are AI systems designed specifically to monitor other AI systems for compliance, safety, and performance. They watch what the other agents do, flag anomalies, check outputs against policy constraints, and in some designs can halt or override agent actions before they execute.

The idea has a certain recursive quality that I find both elegant and slightly unsettling. You deploy an AI agent to do work. You deploy another AI to watch the first one. If the watcher also makes mistakes, do you need a third AI to watch the watcher? This is not a purely theoretical concern. The EU AI Act, which is now shaping how European enterprises and their global counterparts think about AI governance, requires meaningful human oversight for high-risk AI systems. A guardian agent might satisfy the letter of that requirement while raising a genuine question about whether it satisfies the spirit. If the oversight mechanism is itself an AI, how do we know it is reliable? Who audits the auditor?

IBM's 2024 Cost of a Data Breach Report found that organizations using AI and automation in security saved an average of $2.2 million per breach compared to organizations that did not (IBM, 2024, https://newsroom.ibm.com/2024-07-30-ibm-report-escalating-data-breach-disruption-pushes-costs-to-new-highs). That finding is often cited as evidence that AI in security pays for itself. It probably does. But the finding is about detection and response speed, not about the accuracy of AI judgment. AI can find anomalies faster than humans. It cannot reliably distinguish between a genuine threat and a legitimate but unusual pattern without some kind of governance layer. Guardian agents are, in part, a response to that limitation. They add a layer of cross-checking that a single agent cannot provide for itself.

The IS governance research question that I find most interesting here is not technical. It is organizational. When agents monitor agents, where does accountability sit? The standard accountability model assumes a human who made a decision and is responsible for its consequences. In a system where an AI agent made the decision and a guardian agent either approved it or failed to flag it, the accountability is distributed across systems in a way that does not map cleanly onto existing organizational responsibility structures. If the expense approval agent approves a fraudulent claim and the guardian agent does not catch it, who is responsible? The vendor who built the approval agent? The vendor who built the guardian? The IT team that deployed them? The manager whose approval threshold the system used? The policy owner who defined the constraints? All of them and none of them.

I wrote in an earlier post about how AI governance tends to get copied from IT governance templates, and the guardian agent problem sharpens that critique. IT governance frameworks have accountability structures that assume human decision-makers at identifiable points in a process. Agentic systems remove those points. Guardian agents are an attempt to reinsert governance checkpoints. But they may deepen the accountability gap if organizations treat the guardian as equivalent to human oversight without examining whether it provides the judgment that oversight requires.

Gartner has also predicted that more than 40 percent of agentic AI projects will be canceled by the end of 2027 because of escalating costs, unclear business value, or inadequate risk controls, and that tells part of the story. Organizations are deploying agents, discovering that the risk control problem is harder than expected, and pulling back. Guardian agents are one proposed solution to the risk control problem, which is why Gartner is predicting meaningful market share for the category. But a guardian agent is still software. It has its own training data, its own blind spots, its own failure modes. The organization that deploys a guardian agent and considers the oversight problem solved has probably made a governance error, not solved a governance problem.

What would adequate governance of agentic systems look like? My read is that it requires at least three things that guardian agents alone cannot provide. First, clear specification of what the agent is and is not authorized to decide. Not a general policy document, but precise operational boundaries that can be tested and audited. Second, logging at a level of detail that allows post-hoc reconstruction of any consequential decision. If something goes wrong, the organization needs to be able to determine what the agent decided, why, and what the guardian agent did or did not do in response. Third, genuine human review of the decisions where errors have the most serious consequences, at a frequency and depth that actually provides oversight rather than just satisfying the appearance of oversight.
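To make those three requirements a little more concrete, here is a minimal sketch in Python of how they might fit together for the expense approval example. It is an illustration under my own assumptions, not a description of any vendor's product: the names (Boundary, Decision, guardian_check, audit_record), the categories, and the dollar thresholds are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Every name and threshold here is hypothetical, chosen only for illustration.

@dataclass(frozen=True)
class Boundary:
    """What the agent is authorized to decide, stated as testable limits."""
    max_amount: float              # the agent may never approve above this
    allowed_categories: frozenset  # expense categories the agent may decide on
    human_review_above: float      # approvals above this go to a person

@dataclass
class Decision:
    claim_id: str
    category: str
    amount: float
    agent_verdict: str  # "approve" or "reject"
    agent_reason: str

def guardian_check(decision, boundary):
    """Return (outcome, violations); outcome is 'halt', 'human_review', or 'allow'."""
    violations = []
    if decision.category not in boundary.allowed_categories:
        violations.append("category_not_authorized")
    approving = decision.agent_verdict == "approve"
    if approving and decision.amount > boundary.max_amount:
        violations.append("amount_exceeds_authority")
    if violations:
        return "halt", violations
    if approving and decision.amount > boundary.human_review_above:
        return "human_review", []
    return "allow", []

def audit_record(decision, boundary, outcome, violations):
    """Serialize what the agent decided, why, and what the guardian did about it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "decision": asdict(decision),
        "boundary": {
            "max_amount": boundary.max_amount,
            "allowed_categories": sorted(boundary.allowed_categories),
            "human_review_above": boundary.human_review_above,
        },
        "guardian_outcome": outcome,
        "violations": violations,
    }
    return json.dumps(record, sort_keys=True)

if __name__ == "__main__":
    boundary = Boundary(5000.0, frozenset({"travel", "meals"}), 1000.0)
    claim = Decision("claim-001", "travel", 2500.0, "approve", "matches itinerary")
    outcome, violations = guardian_check(claim, boundary)
    print(audit_record(claim, boundary, outcome, violations))
```

The point of the sketch is that the boundary is data, so it can be versioned, tested, and audited, and that every log record carries the boundary in force at the time alongside the agent's stated reason and the guardian's response. Nothing in it tells you who is accountable when the guardian returns "allow" on a claim that turns out to be fraudulent. That part remains an organizational question.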

Guardian agents are a real and probably necessary development. The agentic AI market is growing fast enough that manual oversight of every agent decision is not going to be feasible. I understand why the category is emerging and why Gartner thinks it will be significant. My concern is that "guardian agent" becomes another version of the AI ethics board problem: an artifact that signals oversight is happening, while the actual accountability questions remain unresolved in the organizational structures underneath.