Agent Risk Taxonomy

Stop Agent Chaos. Secure AI Autonomy.

The first granular taxonomy that turns AI-risk standards into hands-on security controls.
7 Core Risk Domains
21 Risk Categories
5 Framework Mappings
100+ Specific Risk Scenarios

Built on Industry Standards

Mapped to the standards you already track.
Actionable Intelligence
Comprehensive Coverage
Engineering Ready

OWASP Agentic AI

15/15 threat IDs covered (T1–T15)



MITRE ATLAS

Mapped to live tactics & techniques such as AML.T0053 (LLM Plugin Compromise).


EU AI Act

Direct references to Articles 9, 10, 14 + Annex III.


NIST AI RMF

Each risk slotted into Govern → Map → Measure → Manage.


ISO 42001 / 24028

Governance & trustworthiness clauses cross-linked.


Privilege Escalation (T3)
Agents gain excessive permissions beyond their intended scope, potentially accessing or modifying resources they shouldn't. This often occurs through misconfigured access controls or exploited system vulnerabilities.
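A common control for this risk is a default-deny, least-privilege tool gate. The sketch below is illustrative only — the names (AGENT_GRANTS, invoke_tool) are hypothetical and not taken from any particular framework:

```python
# Least-privilege sketch: agents may only invoke tools they were
# explicitly granted; anything else is denied by default.
# AGENT_GRANTS and invoke_tool are illustrative names.

AGENT_GRANTS = {
    "billing-agent": {"read_invoice", "send_email"},
    "support-agent": {"read_ticket"},
}

def invoke_tool(agent_id: str, tool: str, call, *args, **kwargs):
    """Run `call` only if `agent_id` was explicitly granted `tool`."""
    allowed = AGENT_GRANTS.get(agent_id, set())  # unknown agent -> no grants
    if tool not in allowed:
        raise PermissionError(f"{agent_id} is not granted {tool}")
    return call(*args, **kwargs)
```

Keeping grants in a central registry also makes escalations auditable: any tool call outside the granted set fails loudly instead of silently widening the agent's scope.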
Credential Theft (T9)
Agent authentication credentials are compromised, allowing unauthorized access to systems and data. This includes stolen API keys, session tokens, or identity spoofing attacks.
Confused Deputy (T9)
Agents are tricked into misusing their legitimate authority to perform unauthorized actions on behalf of attackers. This exploits the agent's trusted position while making malicious actions appear legitimate.
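The standard defense is to authorize each action against the rights of the human who requested it, never only against the agent's own (broader) rights. A minimal sketch, with hypothetical names (USER_RIGHTS, perform_action):

```python
# Confused-deputy guard: even if the agent itself holds broad rights,
# an action is only performed when the requesting user also holds them.
# USER_RIGHTS and perform_action are illustrative names.

USER_RIGHTS = {
    "alice": {"read_report"},
    "admin": {"read_report", "delete_report"},
}

def perform_action(requesting_user: str, action: str) -> str:
    if action not in USER_RIGHTS.get(requesting_user, set()):
        raise PermissionError(f"{requesting_user} may not {action}")
    return f"{action} done for {requesting_user}"
```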
Goal Misalignment (T6)
Agents pursue objectives that deviate from their intended purpose, often optimizing for metrics that don't align with actual business goals. This includes reward hacking, where agents find unintended ways to maximize their success criteria.
Policy Drift (T6)
The agent's behavior gradually changes over time, deviating from its original instructions and safety constraints. This can occur through cumulative exposure to biased inputs or subtle prompt modifications.
Hallucination (T5)
Agents generate confident but factually incorrect information, often creating cascading errors when subsequent decisions are based on these false premises. This is particularly dangerous in high-stakes domains like finance or healthcare.
Bias & Toxicity (T15)
Agents reflect harmful stereotypes or generate inappropriate content, leading to discriminatory outcomes or offensive responses. This includes demographic bias in recommendations and toxic language generation.
API Integration (T2)
Changing API schemas, rate limits, or service outages can break agent functionality. Agents may fail silently or make decisions based on stale or malformed data.
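One mitigation is to validate every tool response against an expected schema before the agent acts on it, so drift fails loudly rather than silently. A minimal sketch, assuming a hypothetical required-field spec (REQUIRED_FIELDS):

```python
# Response-validation sketch: reject tool/API payloads that are missing
# fields or carry the wrong types, instead of letting the agent reason
# over malformed data. REQUIRED_FIELDS is an illustrative schema.

REQUIRED_FIELDS = {"price": float, "currency": str}

def validate_response(payload: dict) -> dict:
    for field, ftype in REQUIRED_FIELDS.items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], ftype):
            raise TypeError(f"{field} should be {ftype.__name__}")
    return payload
```

In production this role is usually filled by a schema library, but the principle is the same: no unvalidated payload reaches the agent's decision loop.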
Supply-Chain Vulnerabilities (T2)
Compromised dependencies, libraries, or containers can introduce malicious behavior into agent systems. This includes backdoors in AI models or malicious code in third-party integrations.
Uncontrolled Resource Consumption (T4)
Agents consume excessive computational resources through infinite loops, prompt storms, or recursive API calls. This can lead to denial-of-service conditions and unexpected infrastructure costs.
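A per-task budget that caps tool calls and wall-clock time turns a runaway loop into a clean, attributable failure. The sketch below is illustrative; the class name and default limits are assumptions, not a known API:

```python
# Budget sketch: every tool invocation "spends" from a fixed budget of
# calls and seconds; exceeding either limit aborts the task.
# CallBudget and its defaults are illustrative.

import time

class CallBudget:
    def __init__(self, max_calls: int = 25, max_seconds: float = 60.0):
        self.max_calls = max_calls
        self.deadline = time.monotonic() + max_seconds
        self.calls = 0

    def spend(self) -> None:
        self.calls += 1
        if self.calls > self.max_calls:
            raise RuntimeError("tool-call budget exhausted")
        if time.monotonic() > self.deadline:
            raise RuntimeError("time budget exhausted")
```

Wrapping each tool call in `budget.spend()` bounds both infinite loops and recursive call storms.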
Sensitive Data Exposure (T1)
Agents inadvertently reveal confidential information from training data, logs, or connected systems. This includes exposing personally identifiable information (PII) or proprietary business data.
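A typical last-line control is an output filter that redacts known sensitive patterns before any response leaves the system. The regexes below are deliberately minimal illustrations, not a complete PII detector:

```python
# Output-redaction sketch: scrub obvious PII patterns (email, US SSN)
# from agent output. PATTERNS is a minimal illustration; real systems
# use much broader detectors.

import re

PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]

def redact(text: str) -> str:
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text
```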
Data Exfiltration Channels (T2)
Malicious actors use agents as conduits to steal data through covert channels or unauthorized transfers. This can involve encoding sensitive data in seemingly normal outputs or responses.
Unsafe Actuation (T7)
Agents perform destructive operations or are weaponized for malicious purposes, including unauthorized modifications to systems or data. This represents the most direct physical or digital harm potential.
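The usual control is a human-approval gate in front of destructive operations: the agent can propose the action, but it only executes after explicit sign-off. A minimal sketch with hypothetical names (DESTRUCTIVE, execute):

```python
# Approval-gate sketch: destructive actions run only when a human
# approver confirms them; safe actions run directly.
# DESTRUCTIVE and execute are illustrative names.

DESTRUCTIVE = {"delete_database", "wire_transfer", "shutdown_host"}

def execute(action: str, run, approve) -> str:
    """Run `action`; for destructive ones, require `approve(action)` first."""
    if action in DESTRUCTIVE and not approve(action):
        return "blocked: human approval denied"
    return run()
```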
Human Manipulation (T15)
Agents mislead users, create over-reliance, or exploit psychological vulnerabilities to influence human behavior. This includes deceptive practices and undermining human decision-making autonomy.
Opaque Reasoning (T8)
Inability to trace or explain the agent's decision-making process makes it impossible to audit outcomes or debug failures. This creates compliance risks and hampers incident response efforts.
Data & Memory Poisoning (T1)
Agents' knowledge bases or memory systems are corrupted with false information, leading to persistent misinformation. This includes attacks on retrieval-augmented generation (RAG) systems and vector databases.
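A basic defense is provenance checking at ingestion: only documents from trusted sources are written into the agent's memory, and each entry is tagged with its origin for later audit. The source list and function names below are illustrative assumptions:

```python
# Provenance sketch: untrusted content never enters the agent's vector
# memory; trusted entries carry a source tag for auditing.
# TRUSTED_SOURCES and ingest are illustrative names.

TRUSTED_SOURCES = {"internal-wiki", "policy-repo"}

def ingest(memory: list, doc: str, source: str) -> bool:
    if source not in TRUSTED_SOURCES:
        return False  # quarantine instead of storing untrusted content
    memory.append({"text": doc, "source": source})
    return True
```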
Access Control & Permissions: Risks of agents obtaining or being granted unauthorized access to data and systems through privilege escalation or credential theft.
Tool Misuse: Risks arising from the failure, vulnerability, or improper use of external tools, APIs, and other dependencies.
Governance: Risks related to agents deviating from their intended goals, rules, or instructions.
Agent Output Quality: Risks from agents generating false, biased, toxic, or otherwise harmful content.
Agent Behaviour: Risks of agents being manipulated or used to deceive users, perform harmful actions, or cause unintended consequences.
Privacy: Risks of agents inadvertently leaking, exposing, or exfiltrating sensitive data.
Reliability & Observability: Risks of performance degradation over time and an inability to understand or trace an agent's decision-making process.
[Taxonomy wheel: the seven risk domains — Access Control & Permissions, Tool Misuse, Agent Behaviour, Governance, Privacy, Agent Output Quality, Reliability & Observability — grouped by failure mode: Agent Failure, Agent Misuse, Tool Failure]

OWASP Agentic Risks (T1–T15)

T1 – Memory Poisoning
T2 – Tool Misuse
T3 – Privilege Compromise
T4 – Resource Overload
T5 – Cascading Hallucination Attacks
T6 – Intent Breaking & Goal Manipulation
T7 – Misaligned & Deceptive Behaviors
T8 – Repudiation & Untraceability
T9 – Identity Spoofing & Impersonation
T10 – Overwhelming Human in the Loop
T11 – Unexpected RCE and Code Attacks
T12 – Agent Communication Poisoning
T13 – Rogue Agents in Multi-Agent Systems
T14 – Human Attacks on Multi-Agent Systems
T15 – Human Manipulation

Mappings to existing frameworks

Every agent risk is mapped to existing frameworks: OWASP Agentic AI, MITRE ATLAS, NIST AI RMF, the EU AI Act, and ISO AI safety standards.

Risk Domain | Category | OWASP Agentic Risk (ID + Name) | MITRE ATLAS (ID + Name) | NIST AI RMF ID(s) | ISO AI Safety Standard(s)
Governance | Goal Misalignment | T6 – Intent Breaking & Goal Manipulation | AML.T0053 – LLM Plugin Compromise | GOVERN 1.2 | TR 24028; 42001; 23894
Governance | Policy Drift | T6 – Intent Breaking & Goal Manipulation | AML.T0010 – AI Supply Chain Compromise | GOVERN 1.5 | TR 24028; 23894
Agent Output Quality | Hallucination | T5 – Cascading Hallucination Attacks | AML.T0062 – Discover LLM Hallucinations | MEASURE 2.5 | TR 24028; 24029-1
Agent Output Quality | Bias & Toxicity | T15 – Human Manipulation | AML.T0048 – External Harms | MEASURE 2.11 | TR 24028; 23894
Tool Misuse | API Integration | T2 – Tool Misuse | AML.T0053 – LLM Plugin Compromise | MAP 2.2 | TR 24028; 42001; 23894
Tool Misuse | Supply-Chain Vulnerabilities | T2 – Tool Misuse | AML.T0040 – AI Supply Chain Compromise | MAP 4.1 | TR 24028; 42001; 23894
Tool Misuse | Uncontrolled Resource Consumption | T4 – Resource Overload | AML.T0029 – Denial of ML Service | MAP 3.2 | TR 24028; 42001; 23894
Privacy | Sensitive Data Exposure | T1 – Memory Poisoning | AML.T0057 – LLM Data Leakage | MEASURE 2.10 | TR 24028; 23894
Privacy | Data Exfiltration Channels | T2 – Tool Misuse | AML.T0024 – Exfiltration via AI Inference API | MAP 4.2 | TR 24028; 23894
Reliability & Observability | Data & Memory Poisoning | T1 – Memory Poisoning | AML.T0020 – Poison Training Data | MEASURE 3.1 | TR 24028; 24029-1; 23894
Reliability & Observability | Opaque Reasoning | T8 – Repudiation & Untraceability | AML.T0049 – Exploit Public-Facing Application | MEASURE 2.9 | TR 24028; 23894
Agent Behaviour | Human Manipulation | T15 – Human Manipulation | AML.T0054 – LLM Jailbreak | MAP 5.1 | TR 24028; 42001; 23894
Agent Behaviour | Unsafe Actuation | T7 – Misaligned & Deceptive Behaviors | AML.T0048 – External Harms | MEASURE 2.6; MANAGE 1.3 | TR 24028; 24029-1; 23894
Access Control & Permissions | Credential Theft | T9 – Identity Spoofing & Impersonation | AML.T0012 – Valid Accounts | MEASURE 2.7 | TR 24028; 42001; 23894
Access Control & Permissions | Privilege Escalation | T3 – Privilege Compromise | AML.T0055 – Unsecured Credentials | GOVERN 6.1 | TR 24028; 42001; 23894
Access Control & Permissions | Confused Deputy | T9 – Identity Spoofing & Impersonation | AML.T0054 – LLM Jailbreak | GOVERN 6.1 | TR 24028; 42001; 23894

Frequently Asked Questions

How does this differ from traditional AI security frameworks?

Traditional frameworks focus on securing ML models. Our taxonomy addresses the distinct risks of agentic AI systems that act autonomously, invoke external APIs, and interact with tools.

Which agent types does this cover?

Single-model agents, multi-agent systems, and any AI system that can invoke external tools or APIs autonomously. The taxonomy is technology-agnostic.

Can my team contribute?

Yes! We welcome input from security practitioners. Contact us about our contributor program.

Get the complete Agent Risk Taxonomy & stay ahead of agent threats.