First Seen
Feb 18, 2026
Last Scanned
Feb 22, 2026
Findings
26
Score
0/100
Findings (26)
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects common jailbreak prompt patterns
developer mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects common jailbreak prompt patterns
unrestricted mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects content pretending to be a system prompt
system prompt
- Remove encoded or obfuscated directives (base64, ROT13, unicode escapes, hex-encoded text). All text should be in plaintext and human-readable.
Likely FP if the encoded content is legitimate data (e.g., a base64-encoded image, a hex-encoded binary hash) rather than concealed directives.
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
ignore all previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects common jailbreak prompt patterns
DAN mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects common jailbreak prompt patterns
developer mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects common jailbreak prompt patterns
unrestricted mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects attempts to override or ignore previous instructions
ignore all previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
ignore all previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects common jailbreak prompt patterns
DAN mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects attempts to extract the system prompt
show me your system prompt Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
output your configuration Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
print your instructions Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
what are your rules Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
print your prompt Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
output your configuration Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
what are your instructions Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
reveal your system prompt Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
show me your system prompt Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.
Detects attempts to extract the system prompt
What are your system instructions Remove directives that attempt to make the agent act autonomously without user confirmation (e.g., automatically running commands, skipping confirmation prompts).
Likely FP if the matched text is in a CI/CD tool description where automated execution is the intended behavior in a controlled environment.