First Seen
Feb 18, 2026
Last Scanned
Feb 22, 2026
Findings
3
Score
45/100
Findings (3)
Detects attempts to override or ignore previous instructions
Ignore your previous instructions Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects autonomous sub-agent or cron-based execution without human oversight
Cron jobs, health check + autonomous Remove directives that force the agent to call specific tools or APIs not required for the skill's stated functionality. Tool calls should be determined by user intent, not embedded directives.
Likely FP if the skill legitimately needs to call other tools as part of its workflow (e.g., a deployment skill that calls git and cloud CLI tools).
Detects common jailbreak prompt patterns
Bypass safety Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.