Agent Risk Taxonomy

Stop Agent Chaos. Secure AI Autonomy.

The first granular taxonomy that turns AI-risk standards into hands-on security controls.
7 Core Risk Domains
21 Risk Categories
5 Framework Mappings
100+ Specific Risk Scenarios

Built on Industry Standards

Mapped to the standards you already track.
Actionable Intelligence
Comprehensive Coverage
Engineering Ready

OWASP Agentic AI

15/15 threat IDs covered (T1–T15)



MITRE ATLAS

Mapped to live tactics & techniques such as AML.T0053 (LLM Plugin Compromise).


EU AI Act

Direct references to Articles 9, 10, 14 + Annex III.


NIST AI RMF

Each risk slotted into Govern → Map → Measure → Manage.


ISO 42001 / 24028

Governance & trustworthiness clauses cross-linked.


Privilege Escalation (T3)
Agents gain excessive permissions beyond their intended scope, potentially accessing or modifying resources they shouldn't. This often occurs through misconfigured access controls or exploited system vulnerabilities.
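A common control for this risk is a default-deny, least-privilege tool gate. The sketch below is illustrative only — the names (AGENT_GRANTS, invoke_tool) are hypothetical and not taken from any particular framework:

```python
# Least-privilege sketch: agents may only invoke tools they were
# explicitly granted; anything else is denied by default.
# AGENT_GRANTS and invoke_tool are illustrative names.

AGENT_GRANTS = {
    "billing-agent": {"read_invoice", "send_email"},
    "support-agent": {"read_ticket"},
}

def invoke_tool(agent_id: str, tool: str, call, *args, **kwargs):
    """Run `call` only if `agent_id` was explicitly granted `tool`."""
    allowed = AGENT_GRANTS.get(agent_id, set())  # unknown agent -> no grants
    if tool not in allowed:
        raise PermissionError(f"{agent_id} is not granted {tool}")
    return call(*args, **kwargs)
```

Keeping grants in a central registry also makes escalations auditable: any tool call outside the granted set fails loudly instead of silently widening the agent's scope.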
Credential Theft (T9)
Agent authentication credentials are compromised, allowing unauthorized access to systems and data. This includes stolen API keys, session tokens, or identity spoofing attacks.
Confused Deputy (T9)
Agents are tricked into misusing their legitimate authority to perform unauthorized actions on behalf of attackers. This exploits the agent's trusted position while making malicious actions appear legitimate.
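The standard defense is to authorize each action against the rights of the human who requested it, never only against the agent's own (broader) rights. A minimal sketch, with hypothetical names (USER_RIGHTS, perform_action):

```python
# Confused-deputy guard: even if the agent itself holds broad rights,
# an action is only performed when the requesting user also holds them.
# USER_RIGHTS and perform_action are illustrative names.

USER_RIGHTS = {
    "alice": {"read_report"},
    "admin": {"read_report", "delete_report"},
}

def perform_action(requesting_user: str, action: str) -> str:
    if action not in USER_RIGHTS.get(requesting_user, set()):
        raise PermissionError(f"{requesting_user} may not {action}")
    return f"{action} done for {requesting_user}"
```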
Goal Misalignment (T6)
Agents pursue objectives that deviate from their intended purpose, often optimizing for metrics that don't align with actual business goals. This includes reward hacking, where agents find unintended ways to maximize their success criteria.
Policy Drift (T6)
The agent's behavior gradually changes over time, deviating from its original instructions and safety constraints. This can occur through cumulative exposure to biased inputs or subtle prompt modifications.
Hallucination (T5)
Agents generate confident but factually incorrect information, often creating cascading errors when subsequent decisions are based on these false premises. This is particularly dangerous in high-stakes domains like finance or healthcare.
Bias & Toxicity (T15)
Agents reflect harmful stereotypes or generate inappropriate content, leading to discriminatory outcomes or offensive responses. This includes demographic bias in recommendations and toxic language generation.
API Integration (T2)
Changing API schemas, rate limits, or service outages can break agent functionality. Agents may fail silently or make decisions based on stale or malformed data.
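One mitigation is to validate every tool response against an expected schema before the agent acts on it, so drift fails loudly rather than silently. A minimal sketch, assuming a hypothetical required-field spec (REQUIRED_FIELDS):

```python
# Response-validation sketch: reject tool/API payloads that are missing
# fields or carry the wrong types, instead of letting the agent reason
# over malformed data. REQUIRED_FIELDS is an illustrative schema.

REQUIRED_FIELDS = {"price": float, "currency": str}

def validate_response(payload: dict) -> dict:
    for field, ftype in REQUIRED_FIELDS.items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], ftype):
            raise TypeError(f"{field} should be {ftype.__name__}")
    return payload
```

In production this role is usually filled by a schema library, but the principle is the same: no unvalidated payload reaches the agent's decision loop.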
Supply-Chain Vulnerabilities (T2)
Compromised dependencies, libraries, or containers can introduce malicious behavior into agent systems. This includes backdoors in AI models or malicious code in third-party integrations.
Uncontrolled Resource Consumption (T4)
Agents consume excessive computational resources through infinite loops, prompt storms, or recursive API calls. This can lead to denial-of-service conditions and unexpected infrastructure costs.
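A per-task budget that caps tool calls and wall-clock time turns a runaway loop into a clean, attributable failure. The sketch below is illustrative; the class name and default limits are assumptions, not a known API:

```python
# Budget sketch: every tool invocation "spends" from a fixed budget of
# calls and seconds; exceeding either limit aborts the task.
# CallBudget and its defaults are illustrative.

import time

class CallBudget:
    def __init__(self, max_calls: int = 25, max_seconds: float = 60.0):
        self.max_calls = max_calls
        self.deadline = time.monotonic() + max_seconds
        self.calls = 0

    def spend(self) -> None:
        self.calls += 1
        if self.calls > self.max_calls:
            raise RuntimeError("tool-call budget exhausted")
        if time.monotonic() > self.deadline:
            raise RuntimeError("time budget exhausted")
```

Wrapping each tool call in `budget.spend()` bounds both infinite loops and recursive call storms.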
Sensitive Data Exposure (T1)
Agents inadvertently reveal confidential information from training data, logs, or connected systems. This includes exposing personally identifiable information (PII) or proprietary business data.
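A typical last-line control is an output filter that redacts known sensitive patterns before any response leaves the system. The regexes below are deliberately minimal illustrations, not a complete PII detector:

```python
# Output-redaction sketch: scrub obvious PII patterns (email, US SSN)
# from agent output. PATTERNS is a minimal illustration; real systems
# use much broader detectors.

import re

PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]

def redact(text: str) -> str:
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text
```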
Data Exfiltration Channels (T2)
Malicious actors use agents as conduits to steal data through covert channels or unauthorized transfers. This can involve encoding sensitive data in seemingly normal outputs or responses.
Unsafe Actuation (T7)
Agents perform destructive operations or are weaponized for malicious purposes, including unauthorized modifications to systems or data. This represents the most direct physical or digital harm potential.
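The usual control is a human-approval gate in front of destructive operations: the agent can propose the action, but it only executes after explicit sign-off. A minimal sketch with hypothetical names (DESTRUCTIVE, execute):

```python
# Approval-gate sketch: destructive actions run only when a human
# approver confirms them; safe actions run directly.
# DESTRUCTIVE and execute are illustrative names.

DESTRUCTIVE = {"delete_database", "wire_transfer", "shutdown_host"}

def execute(action: str, run, approve) -> str:
    """Run `action`; for destructive ones, require `approve(action)` first."""
    if action in DESTRUCTIVE and not approve(action):
        return "blocked: human approval denied"
    return run()
```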
Human Manipulation (T15)
Agents mislead users, create over-reliance, or exploit psychological vulnerabilities to influence human behavior. This includes deceptive practices and undermining human decision-making autonomy.
Opaque Reasoning (T8)
Inability to trace or explain the agent's decision-making process makes it impossible to audit outcomes or debug failures. This creates compliance risks and hampers incident response efforts.
Data & Memory Poisoning (T1)
Agents' knowledge bases or memory systems are corrupted with false information, leading to persistent misinformation. This includes attacks on retrieval-augmented generation (RAG) systems and vector databases.
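A basic defense is provenance checking at ingestion: only documents from trusted sources are written into the agent's memory, and each entry is tagged with its origin for later audit. The source list and function names below are illustrative assumptions:

```python
# Provenance sketch: untrusted content never enters the agent's vector
# memory; trusted entries carry a source tag for auditing.
# TRUSTED_SOURCES and ingest are illustrative names.

TRUSTED_SOURCES = {"internal-wiki", "policy-repo"}

def ingest(memory: list, doc: str, source: str) -> bool:
    if source not in TRUSTED_SOURCES:
        return False  # quarantine instead of storing untrusted content
    memory.append({"text": doc, "source": source})
    return True
```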
Access Control & Permissions: Risks of agents obtaining or being granted unauthorized access to data and systems through privilege escalation or credential theft.
Tool Misuse: Risks arising from the failure, vulnerability, or improper use of external tools, APIs, and other dependencies.
Governance: Risks related to agents deviating from their intended goals, rules, or instructions.
Agent Output Quality: Risks from agents generating false, biased, toxic, or otherwise harmful content.
Agent Behaviour: Risks of agents being manipulated or used to deceive users, perform harmful actions, or cause unintended consequences.
Privacy: Risks of agents inadvertently leaking, exposing, or exfiltrating sensitive data.
Reliability & Observability: Risks of performance degradation over time and an inability to understand or trace an agent's decision-making process.
[Taxonomy wheel: the seven risk domains — Access Control & Permissions, Tool Misuse, Agent Behaviour, Governance, Privacy, Agent Output Quality, Reliability & Observability — grouped by failure mode: Agent Failure, Agent Misuse, Tool Failure]

OWASP Agentic Risks (T1–T15)

T1 – Memory Poisoning
T2 – Tool Misuse
T3 – Privilege Compromise
T4 – Resource Overload
T5 – Cascading Hallucination Attacks
T6 – Intent Breaking & Goal Manipulation
T7 – Misaligned & Deceptive Behaviors
T8 – Repudiation & Untraceability
T9 – Identity Spoofing & Impersonation
T10 – Overwhelming Human in the Loop
T11 – Unexpected RCE and Code Attacks
T12 – Agent Communication Poisoning
T13 – Rogue Agents in Multi-Agent Systems
T14 – Human Attacks on Multi-Agent Systems
T15 – Human Manipulation

Mappings to existing frameworks

Every agent risk is mapped to existing frameworks: OWASP Agentic AI, MITRE ATLAS, NIST AI RMF, the EU AI Act, and ISO AI safety standards.

Risk Domain | Category | OWASP Agentic Risk (ID + Name) | MITRE ATLAS (ID + Name) | NIST AI RMF ID(s) | ISO AI Safety Standard(s)
Governance | Goal Misalignment | T6 – Intent Breaking & Goal Manipulation | AML.T0053 – LLM Plugin Compromise | GOVERN 1.2 | TR 24028; 42001; 23894
Governance | Policy Drift | T6 – Intent Breaking & Goal Manipulation | AML.T0010 – AI Supply Chain Compromise | GOVERN 1.5 | TR 24028; 23894
Agent Output Quality | Hallucination | T5 – Cascading Hallucination Attacks | AML.T0062 – Discover LLM Hallucinations | MEASURE 2.5 | TR 24028; 24029-1
Agent Output Quality | Bias & Toxicity | T15 – Human Manipulation | AML.T0048 – External Harms | MEASURE 2.11 | TR 24028; 23894
Tool Misuse | API Integration | T2 – Tool Misuse | AML.T0053 – LLM Plugin Compromise | MAP 2.2 | TR 24028; 42001; 23894
Tool Misuse | Supply-Chain Vulnerabilities | T2 – Tool Misuse | AML.T0040 – AI Supply Chain Compromise | MAP 4.1 | TR 24028; 42001; 23894
Tool Misuse | Uncontrolled Resource Consumption | T4 – Resource Overload | AML.T0029 – Denial of ML Service | MAP 3.2 | TR 24028; 42001; 23894
Privacy | Sensitive Data Exposure | T1 – Memory Poisoning | AML.T0057 – LLM Data Leakage | MEASURE 2.10 | TR 24028; 23894
Privacy | Data Exfiltration Channels | T2 – Tool Misuse | AML.T0024 – Exfiltration via AI Inference API | MAP 4.2 | TR 24028; 23894
Reliability & Observability | Data & Memory Poisoning | T1 – Memory Poisoning | AML.T0020 – Poison Training Data | MEASURE 3.1 | TR 24028; 24029-1; 23894
Reliability & Observability | Opaque Reasoning | T8 – Repudiation & Untraceability | AML.T0049 – Exploit Public-Facing Application | MEASURE 2.9 | TR 24028; 23894
Agent Behaviour | Human Manipulation | T15 – Human Manipulation | AML.T0054 – LLM Jailbreak | MAP 5.1 | TR 24028; 42001; 23894
Agent Behaviour | Unsafe Actuation | T7 – Misaligned & Deceptive Behaviors | AML.T0048 – External Harms | MEASURE 2.6; MANAGE 1.3 | TR 24028; 24029-1; 23894
Access Control & Permissions | Credential Theft | T9 – Identity Spoofing & Impersonation | AML.T0012 – Valid Accounts | MEASURE 2.7 | TR 24028; 42001; 23894
Access Control & Permissions | Privilege Escalation | T3 – Privilege Compromise | AML.T0055 – Unsecured Credentials | GOVERN 6.1 | TR 24028; 42001; 23894
Access Control & Permissions | Confused Deputy | T9 – Identity Spoofing & Impersonation | AML.T0054 – LLM Jailbreak | GOVERN 6.1 | TR 24028; 42001; 23894

Frequently Asked Questions

How does this differ from traditional AI security frameworks?

Traditional frameworks focus on securing ML models. Our taxonomy addresses the distinct risks of agentic AI systems that act autonomously, invoke external APIs, and interact with tools.

Which agent types does this cover?

Single-model agents, multi-agent systems, and any AI system that can invoke external tools or APIs autonomously. The taxonomy is technology-agnostic.

Can my team contribute?

Yes! We welcome input from security practitioners. Contact us about our contributor program.

Get the complete Agent Risk Taxonomy & stay ahead of agent threats.