
OWASP LLM06 Excessive Agency: AI Agent Risk Explained

Threats & Attacks · June 30, 2026 · 10 min read

OWASP LLM06 Excessive Agency occurs when an AI agent is granted too many permissions, can perform actions beyond its intended scope, or lacks guardrails on tool use. A misconfigured agent might delete data, modify infrastructure, exfiltrate secrets, or execute unauthorized transactions because the LLM chose to, not because the business logic required it. The risk compounds in production, where agents control money, customer data, or critical systems, and traditional access controls assume human decision-making.

What OWASP LLM06 Really Means

OWASP's LLM Top 10 categorizes excessive agency as the risk that an AI system can perform actions with insufficient oversight or scope constraints. In practice, this takes three forms:

Tool overprovisioning. Your agent has access to thirty APIs when it needs five. It can call delete operations, write to databases, or trigger infrastructure changes because the developer granted broad permissions upfront and assumed the LLM would use them responsibly.

Scope creep without boundaries. The agent was designed to answer customer questions, but it also has access to the payment processor, email system, and customer database. An attacker who jailbreaks the prompt or a benign logic error can cause it to refund orders, email customer lists, or lock accounts.

Absent or ineffective constraints. You set up an approval workflow, but it requires the agent to summarize its own reasoning, which it can rationalize away. Or you built a "dry-run" mode that the LLM ignores because the prompt never explicitly forbids it from executing. Traditional try-catch blocks and type systems don't stop an LLM from deciding to take an action you didn't foresee.

The OWASP LLM Top 10 lists excessive agency as its own risk category because LLM behavior is not deterministic. Unlike a compiled program that executes exactly what you coded, an LLM can reinterpret instructions, chain tool calls in novel ways, and justify actions that sound rational to a language model but violate your business policy.

Why Traditional Access Controls Fail for AI Agents

Security teams are familiar with the principle of least privilege. You run your application with the minimum permissions it needs to function, you scope database credentials to specific tables, and you restrict API tokens to read-only operations when writes are unnecessary. This works for code: a function either has the capability to delete data or it doesn't.

AI agents break that model because the LLM makes the decision to use a capability in real time, based on the user prompt. You cannot predict every scenario the agent will encounter or which tools it will need for each scenario. This creates a dilemma: grant the agent enough permissions to be useful, and you increase the blast radius if the LLM misuses them. Restrict permissions, and the agent becomes brittle and unhelpful.

A human employee also operates under least privilege. They have a login, they have specific permissions, and their manager (ideally) reviews unusual actions. But a human can be held accountable for mistakes and can explain their reasoning. An LLM can do neither reliably. Ask it why it executed a refund and it generates a plausible-sounding justification after the fact, not an account of its decision that helps you improve the system.

Even well-designed permission systems fail at the LLM layer. You might grant the agent an API token that can only read customer data, but if the LLM is prompted with "do whatever it takes to answer this question" and someone asks it to export the customer database to an external URL, the LLM can chain read operations and pass data to a tool that submits HTTP requests. The permission boundary held, but the data moved.

OWASP LLM06 is not a gap in your identity and access management. It is a gap in the policy layer between the LLM's decisions and the systems it controls.

The Cost of Excessive Agency in Production

Excessive agency carries real financial and compliance consequences.

Money. Autonomous agents with broad refund or transfer permissions can execute high-value transactions based on interpretations of user input that don't match business intent. A refund API call chain, a fund transfer, or a cloud resource over-allocation can result in significant unplanned expenditure if no policy layer validates the request before execution.

Data. An agent with broad database access can output customer PII to external systems, including third-party LLMs you did not authorize, during routine processing requests. Environment variables containing API keys or credentials can be exfiltrated if the agent is prompted to inspect or debug system configuration.

Compliance. In a HIPAA environment, an agent with database access to patient records can process requests without proper authentication or audit trails, which amounts to unauthorized access that you cannot reconstruct afterward. In a PCI DSS environment, an agent with access to payment data can log decrypted card numbers to debug endpoints if you ask it to troubleshoot payment flows. In a GDPR context, an agent can transfer personal data across borders without the authorization checks your compliance framework requires.

Availability. An agent granted infrastructure permissions can delete database tables, disable services, or apply patches to production systems if a malformed user request is misinterpreted as an administrative command. The damage happens in milliseconds, before a human can intervene.

These scenarios are realistic production failure modes, not edge cases. Any agent deployed with insufficient policy controls and broad tool permissions is exposed to all four categories of risk.

How Least Privilege Actually Works for AI Agents

Least privilege for AI agents requires thinking differently than traditional access control.

Segment tools by business outcome. Do not hand the agent every tool you have and hope it uses them wisely. Group tools by the specific business flow they enable. A customer support agent might have tools for reading tickets, posting replies, and looking up account status. It does not need access to the refund API, password reset, or payment processor.
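As a minimal sketch of that segmentation, the mapping below groups tools by persona and gives unknown personas nothing. The persona and tool names are hypothetical, chosen to mirror the support example above, and a real deployment would load this mapping from configuration.

```python
from __future__ import annotations

# Hypothetical tool names, grouped by the business flow they enable.
TOOLSETS: dict[str, frozenset[str]] = {
    "customer_support": frozenset(
        {"read_ticket", "post_reply", "lookup_account_status"}
    ),
    "billing": frozenset({"lookup_account_status", "issue_refund"}),
}


def tools_for(persona: str) -> frozenset[str]:
    """Return the tools a persona may use; unknown personas get an empty set."""
    return TOOLSETS.get(persona, frozenset())


def can_use(persona: str, tool: str) -> bool:
    """True only if the tool was explicitly assigned to this persona."""
    return tool in tools_for(persona)
```

Only the tools returned by tools_for would be registered with the agent, so the support agent never sees the refund tool at all.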

Enforce constraints at runtime. Least privilege only works if you can enforce it every time the agent runs. You need a policy layer that runs before every tool call and decides whether to allow it. This cannot be a code comment or a prompt instruction. It has to be deterministic and authoritative.

For example: "The agent can call the refund API, but only if the refund amount is less than fifty dollars and the customer has a valid return authorization." That constraint lives outside the LLM's reasoning. Every time the agent tries to refund, the system checks the policy before executing. If the agent tries to refund one thousand dollars, the system blocks it, every time, regardless of what the LLM wrote in its reasoning.
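One way to express that refund rule in code is sketched below. The RefundRequest fields, the check_refund function, and the way a return authorization is represented are illustrative assumptions, not part of any particular product.

```python
from dataclasses import dataclass
from decimal import Decimal

# Assumed limit, taken from the example rule: refunds must be under fifty dollars.
REFUND_LIMIT = Decimal("50.00")


@dataclass(frozen=True)
class RefundRequest:
    customer_id: str
    amount: Decimal
    has_return_authorization: bool  # assumed to be looked up by the caller, not by the LLM


def check_refund(req: RefundRequest) -> tuple[bool, str]:
    """Deterministic check that runs before the refund API is called."""
    if req.amount <= 0:
        return False, "amount must be positive"
    if req.amount >= REFUND_LIMIT:
        return False, "amount is not below the refund limit"
    if not req.has_return_authorization:
        return False, "no valid return authorization"
    return True, "ok"
```

The function never sees the model's reasoning, only the parameters of the call, so a persuasive justification from the LLM cannot change the outcome.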

Enforce authorization for every action. Even if the agent has access to a tool, it does not mean every invocation should succeed. A traditional API checks whether the user is authenticated and authorized to call it. For AI agents, you need the same check, but you also need to validate that the specific parameters (which customer, which amount, which resource) are authorized for this user and this context. An agent might have permission to read customer data, but not this customer's data.
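A parameter-level check of this kind might look like the sketch below. The CallContext type and the CUSTOMER_ORG lookup are hypothetical stand-ins for your session data and your customer records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallContext:
    user_id: str
    org_id: str


# Hypothetical ownership lookup; in practice this would query your own system of record.
CUSTOMER_ORG = {"cust-1": "org-a", "cust-2": "org-b"}


def authorize_read_customer(ctx: CallContext, customer_id: str) -> bool:
    """Allow the read only if the requested customer belongs to the caller's organization."""
    owner = CUSTOMER_ORG.get(customer_id)
    return owner is not None and owner == ctx.org_id
```

The agent holds the read capability in both cases; what changes is whether this specific invocation is authorized for this user.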

Monitor and log every decision. You need to know what the agent tried to do, what you allowed, and what you blocked. This serves two purposes: you can audit the system for policy violations, and you can refine your policies based on real agent behavior. When you discover a pattern of the agent requesting unauthorized actions, you can tighten the policy.
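A decision log can be as simple as one structured record per attempted tool call. The sketch below uses Python's standard logging and json modules; the field names and the record_decision helper are assumptions for illustration.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Any

audit_log = logging.getLogger("agent.audit")


def record_decision(
    agent: str, tool: str, args: dict[str, Any], decision: str, reason: str
) -> dict[str, Any]:
    """Build and emit one audit record for an attempted tool call."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        # Assumption: args hold no secrets. Redact sensitive fields before logging.
        "args": args,
        "decision": decision,
        "reason": reason,
    }
    # default=str keeps the call from failing on values json cannot serialize natively.
    audit_log.info(json.dumps(entry, sort_keys=True, default=str))
    return entry
```

Recording blocked attempts alongside allowed ones is what lets you spot the patterns described above.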

Runtime Policy Enforcement for AI Agents

Runtime policy enforcement is the mechanism that makes least privilege enforceable for AI agents. It is a policy engine that runs between the LLM and your tools, evaluating every tool call against your defined policies and returning allow, log, constrain, or block before the tool executes.

In practice, this means you can enforce policies like:

This agent can call the database query tool, but only with SELECT statements and only against tables on the allowlist.
This agent can call the email API, but only if the recipient is on the approved list.
This agent can access the payment system, but only if the user has authenticated with MFA and the transaction is under the per-transaction limit.
This agent can invoke AWS APIs, but only describe and read operations; no creates, updates, or deletes.

These policies run at runtime, before every tool call. The LLM cannot opt out or rationalize its way around them. The policy engine is the system of record for what the agent is allowed to do.
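The sketch below shows one possible shape for such an engine, with the four outcomes named above. Everything in it is an assumption made for illustration: the tool names, the allowlists, the reading of "log" as "allow and record", and the use of a structured query tool instead of raw SQL so that no SQL parsing is needed.

```python
from enum import Enum
from typing import Any, Callable


class Decision(Enum):
    ALLOW = "allow"
    LOG = "log"  # assumed meaning: allow, but record for review
    CONSTRAIN = "constrain"  # allow with rewritten arguments
    BLOCK = "block"


Args = dict[str, Any]
Policy = Callable[[Args], tuple[Decision, Args]]

# Hypothetical policy data.
APPROVED_RECIPIENTS = {"support@example.com"}
READ_ONLY_PREFIXES = ("Describe", "Get", "List")
ALLOWED_TABLES = {"tickets"}
MAX_ROWS = 100


def email_policy(args: Args) -> tuple[Decision, Args]:
    to = args.get("to")
    if isinstance(to, str) and to in APPROVED_RECIPIENTS:
        return Decision.ALLOW, args
    return Decision.BLOCK, args


def cloud_policy(args: Args) -> tuple[Decision, Args]:
    # Read-style actions are allowed and logged; everything else is blocked.
    if str(args.get("action", "")).startswith(READ_ONLY_PREFIXES):
        return Decision.LOG, args
    return Decision.BLOCK, args


def query_policy(args: Args) -> tuple[Decision, Args]:
    if args.get("table") not in ALLOWED_TABLES:
        return Decision.BLOCK, args
    limit = args.get("limit")
    if limit is None:
        return Decision.CONSTRAIN, {**args, "limit": MAX_ROWS}
    if not isinstance(limit, int) or limit < 1:
        return Decision.BLOCK, args
    if limit > MAX_ROWS:
        return Decision.CONSTRAIN, {**args, "limit": MAX_ROWS}
    return Decision.ALLOW, args


POLICIES: dict[str, Policy] = {
    "send_email": email_policy,
    "cloud_api": cloud_policy,
    "query_rows": query_policy,
}


def evaluate(tool: str, args: Args) -> tuple[Decision, Args]:
    """Return a decision and the arguments to execute with. Unknown tools are blocked."""
    policy = POLICIES.get(tool)
    if policy is None:
        return Decision.BLOCK, args
    return policy(args)
```

The caller executes the tool only when the decision is not BLOCK, and always with the returned arguments, so a CONSTRAIN result takes effect even if the model asked for more.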

Addressing OWASP LLM06 in Enterprise Deployments

Effective OWASP LLM06 mitigation requires both architectural and operational decisions.

Inventory your tools and their permissions. Document what each tool can do, what data it can access, and what resources it can modify. Be specific about CRUD operations: if a tool can read, does it also need to create? To update? To delete? Most agents need only a subset.

Map agent personas to tool sets. What tools does a customer support agent need versus an internal IT agent or a data analytics agent? Create separate tool inventories for each. When you provision a new agent, start with nothing and add only the tools it needs.

Implement a deny-by-default policy. Do not grant permissions and hope the agent uses them safely. Grant nothing, then explicitly allow specific operations. "Allow the support agent to call read_ticket and post_reply if the user is authenticated and the ticket belongs to their organization."
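The quoted rule could be written as an explicit allow list plus its conditions, with everything else denied. This is a sketch under assumptions: the Request fields and the "support_agent" identifier are hypothetical, and the organization values are assumed to come from your own session and ticket data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    agent: str
    tool: str
    user_authenticated: bool
    user_org: str
    ticket_org: str


# The only (agent, tool) pairs that may ever succeed. Anything absent is denied.
ALLOW_RULES = {
    ("support_agent", "read_ticket"),
    ("support_agent", "post_reply"),
}


def is_allowed(req: Request) -> bool:
    """Deny unless the pair is listed AND the rule's conditions hold."""
    if (req.agent, req.tool) not in ALLOW_RULES:
        return False
    return req.user_authenticated and req.user_org == req.ticket_org
```

Adding a new capability then requires an explicit change to ALLOW_RULES, which gives you a reviewable record of every permission the agent has.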

Audit before allowing sensitive operations. Some operations are sensitive enough that they should never be automatic. If an agent wants to execute a delete, send an email to an address outside your organization, or transfer money, require an audit trail and human review before it executes.
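A review gate for those cases might be sketched as follows. The tool names, the internal domain, and the in-memory queue are placeholders; a real system would persist pending requests and notify a reviewer.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical tool names and organization domain.
SENSITIVE_TOOLS = {"delete_record", "transfer_funds"}
INTERNAL_DOMAIN = "example.com"


def needs_review(tool: str, args: dict[str, Any]) -> bool:
    """True for deletes, money movement, and email leaving the organization."""
    if tool in SENSITIVE_TOOLS:
        return True
    if tool == "send_email":
        recipient = str(args.get("to", "")).lower()
        return not recipient.endswith("@" + INTERNAL_DOMAIN)
    return False


@dataclass
class ReviewQueue:
    pending: list[tuple[str, dict[str, Any]]] = field(default_factory=list)

    def submit(self, tool: str, args: dict[str, Any]) -> str:
        """Hold sensitive calls for a human; let everything else proceed."""
        if needs_review(tool, args):
            self.pending.append((tool, args))
            return "pending_review"
        return "execute"
```

The held call runs only after a reviewer approves it, and the pending list doubles as the audit trail of what the agent asked to do.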

Test with adversarial prompts. Use MITRE ATLAS and similar adversarial threat frameworks to design tests for how the agent behaves under jailbreak attempts, prompt injections, and requests that violate your policies. Does the policy engine stop it? Does the agent appear to comply but send the data somewhere else? Does it rationalize the violation in its output?
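Part of this testing can be automated at the policy layer: replay the tool calls a compromised agent might attempt and check that none of them pass. The sketch below covers only that layer, not the prompts themselves, and its policy function, tool names, and call list are illustrative assumptions.

```python
from typing import Any, Callable

ToolCall = tuple[str, dict[str, Any]]

# Stand-in for your real policy engine: deny by default.
ALLOWED_TOOLS = {"read_ticket", "post_reply"}


def policy_allows(tool: str, args: dict[str, Any]) -> bool:
    return tool in ALLOWED_TOOLS


# Hypothetical calls a jailbroken or injected agent might attempt.
ADVERSARIAL_CALLS: list[ToolCall] = [
    ("issue_refund", {"amount": "1000.00"}),
    ("http_post", {"url": "https://attacker.example/upload"}),
    ("drop_table", {"table": "customers"}),
]


def find_policy_gaps(
    policy: Callable[[str, dict[str, Any]], bool], calls: list[ToolCall]
) -> list[ToolCall]:
    """Return every adversarial call the policy would have allowed."""
    return [call for call in calls if policy(call[0], call[1])]
```

An empty result means the policy held for this list. Any entry returned is a gap to close before the agent ships.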

Frequently Asked Questions

What is OWASP LLM06 excessive agency?

OWASP LLM06 excessive agency is the risk that an AI agent performs actions beyond its intended scope because it has too many permissions or insufficient policy enforcement. The agent might delete data, modify systems, exfiltrate secrets, or execute unauthorized transactions because the LLM decided to, not because the business required it. Traditional access controls assume human decision-making and cannot predict what an LLM will do with a capability at runtime.

How do you limit AI agent permissions?

Limit permissions by segmenting tools into specific business outcomes, implementing a deny-by-default policy, and enforcing constraints at runtime before every tool call. Document what each tool does and what data it accesses. Assign only the tools each agent persona needs. Use a policy engine to validate every tool invocation against your rules, blocking or logging violations. Test with adversarial prompts to identify policy gaps.

What is least privilege for AI agents?

Least privilege for AI agents means the agent has access to only the tools and data it needs to fulfill its specific business purpose. Unlike traditional least privilege, which is often static, AI agent least privilege must be enforced at runtime because the LLM decides which tools to use in response to user input. This requires a policy layer that runs before every tool call and validates whether the specific invocation is authorized.

Why is excessive agency dangerous in production AI?

Excessive agency in production AI introduces financial, compliance, and operational risk. An agent with too many permissions can authorize refunds, transfer money, export customer data, or delete infrastructure because the LLM chose to, causing direct financial loss or data breach. It can violate compliance regulations like HIPAA or PCI DSS by processing sensitive data without proper authorization. It can disable services or corrupt data. The damage happens in milliseconds, before a human can intervene.

How do you enforce OWASP LLM06 mitigation?

Enforce OWASP LLM06 mitigation by inventorying your tools and their permissions, mapping agent personas to tool sets, adopting a deny-by-default policy, enforcing that policy at runtime, and requiring review of sensitive operations before they execute. Use a policy engine to validate every tool call against your rules and log every decision for audit and compliance. Test with adversarial prompts to ensure policies hold under attack.

What is the difference between permission-based and policy-based access control for AI agents?

Permission-based access control grants capabilities to users or services; policy-based access control adds rules about when and how those capabilities can be used. Traditional access management is permission-based: you have or do not have the capability. For AI agents, permission-based alone is insufficient because the LLM decides when to use a capability at runtime. Policy-based control adds the constraint layer: "You have the capability to refund, but only under these conditions." Both are necessary.
