AI Threats

Explore the key threats to agentic AI systems, mapped to components and mitigations.

Memory Poisoning

Attackers inject malicious data into the agent’s memory to manipulate future decisions, affecting any memory type from in-agent session to cross-agent cross-user memory.

Manipulation of tools, APIs, or environment access to perform unintended actions or access unauthorized resources, including exploitation of access to external systems.

Privilege Compromise

Breaking information system boundaries through context collapse, causing unauthorized data access/leakage, or exploiting tool privileges to gain unauthorized access to systems.

Cascading Hallucination

Foundation models generate incorrect information that propagates through the system, affecting reasoning quality and being stored in memory across sessions or agents.

Prompt Injection

Attackers manipulate model behavior by injecting malicious prompts, causing the model to ignore instructions, leak data, or perform unintended actions.

Sensitive or confidential information is exposed through model outputs, logs, or unintended responses.

Model Theft & Extraction

Attackers extract model parameters, intellectual property, or proprietary data through repeated queries or side channels.

Supply Chain Compromise

Malicious or vulnerable dependencies, pre-trained models, or third-party services introduce risk into the AI system.

Alignment Failure

The model’s objectives, values, or behaviors diverge from intended or ethical outcomes, leading to harmful or unintended actions.

Social Engineering & Manipulation

Attackers exploit model outputs or agent workflows to manipulate users, operators, or downstream systems (e.g., phishing, fraud, misinformation).