First Seen
Feb 20, 2026
Last Scanned
Feb 22, 2026
Findings
5
Score
50/100
Findings (5)
Detects common jailbreak prompt patterns
developer mode Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Detects attempts to override or ignore previous instructions
ignore previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects execution of shell script files via bash/sh command or direct invocation
./scripts/install.sh Replace direct shell script execution with a language-native implementation or a sandboxed executor. If shell scripts must run, restrict them to a vetted allowlist with integrity checks.
Likely FP if the match references running a script that is part of the skill's own repository (e.g., ./setup.sh) with clear, auditable contents.
Detects execution of shell script files via bash/sh command or direct invocation
./scripts/install.sh Replace direct shell script execution with a language-native implementation or a sandboxed executor. If shell scripts must run, restrict them to a vetted allowlist with integrity checks.
Likely FP if the match references running a script that is part of the skill's own repository (e.g., ./setup.sh) with clear, auditable contents.
Detects execution of shell script files via bash/sh command or direct invocation
./scripts/install.sh Replace direct shell script execution with a language-native implementation or a sandboxed executor. If shell scripts must run, restrict them to a vetted allowlist with integrity checks.
Likely FP if the match references running a script that is part of the skill's own repository (e.g., ./setup.sh) with clear, auditable contents.