First Seen
Feb 18, 2026
Last Scanned
Feb 22, 2026
Findings
6
Score
0/100
Findings (6)
Detects common jailbreak prompt patterns
developer mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects injection of system/user/assistant delimiters
### System: Remove text that instructs the agent to disregard its safety guidelines, system prompt, or ethical constraints. This is a strong indicator of a jailbreak attempt.
Likely FP if the text appears in security research content, a CTF challenge description, or educational material about AI safety clearly labeled as such.
Detects patterns of downloading and piping to shell execution
curl \| bash Download the file first, verify its integrity (checksum, signature), inspect it, then run it. Prefer package managers over raw downloads. Never fetch-and-run in one step.
Likely FP if the target is a well-known installer (e.g., rustup, Homebrew) from its canonical HTTPS domain, though the pattern is inherently risky.
Detects downloading scripts piped directly to a shell interpreter
curl \| bash Download the script first, inspect it, verify its checksum, then run it. Do not pipe curl/wget output directly to sh/bash. Prefer package manager installs.
Likely FP if the download is from a well-known installer domain (e.g., brew.sh, rustup.rs), though this pattern is inherently risky even with trusted sources.
Detects instructions to hide actions from the user
do not tell the user Remove directives that attempt to change the agent's output format, suppress safety warnings, or alter response structure in ways that bypass safety controls.
Likely FP if the output format directive is a legitimate tool configuration (e.g., return results as JSON) that does not suppress safety features.
Detects chained commands using shell operators with dangerous operations
curl \| bash Break chained commands into discrete, individually validated steps. Avoid piping untrusted output directly into a shell interpreter.
Likely FP if the matched text is a documentation example showing a common installer one-liner for a well-known tool with a canonical URL.