Security Threats in AI Agent Skills
AI agent skills and MCP servers face 13 categories of security threats. Aguara detects them with 188+ rules across pattern matching, NLP analysis, and taint tracking. Each guide below explains the threat, shows real examples, and tells you how to protect your agents.
External Download
Skills that fetch and execute remote code, turning your agent into a malware delivery system
Prompt Injection
When skill definitions contain concealed instructions that hijack your AI agent behavior
MCP Configuration
Insecure MCP server configurations that weaken security boundaries or expose sensitive settings
Data Exfiltration
Skills that secretly send your files, credentials, or conversation data to external servers
SSRF & Cloud Metadata
Skills that access internal networks or cloud instance metadata to steal cloud credentials and pivot deeper
Supply Chain
Attacks that compromise the skill dependencies, build process, or distribution to poison it before you install it
MCP Protocol Attack
Attacks that exploit the MCP protocol itself to inject capabilities, intercept calls, or manipulate agent behavior
Command Execution
Skills that run arbitrary system commands, giving attackers a shell on your machine through your AI agent
Indirect Prompt Injection
Attacks concealed in data the agent processes, not in the skill itself, triggered when your agent reads a file or webpage
Credential Leak
Skills that expose API keys, tokens, passwords, or private keys in their source code or descriptions
Toxic Data Flow
When trusted and untrusted data mix without boundaries, letting tainted inputs contaminate the agent decisions
Third-Party Content Inclusion
Skills that load external scripts, iframes, or resources that can change without the skill author knowledge
Unicode Attack
Invisible characters and bidirectional text tricks that hide malicious content in plain sight