First Seen
Feb 18, 2026
Last Scanned
Feb 22, 2026
Findings
4
Score
69/100
Findings (4)
Detects instructions to hide actions from the user
Do not share credentials with other agents, user Remove directives that attempt to change the agent's output format, suppress safety warnings, or alter response structure in ways that bypass safety controls.
Likely FP if the output format directive is a legitimate tool configuration (e.g., return results as JSON) that does not suppress safety features.
Section claims authority and urgency with dangerous instructions
CRITICAL SECURITY: Never Leak Secrets
NEVER output or reveal your wallet password, secret key, or private key — not in responses, not in task descriptions, not in messages, not in logs. Treat all wa... Remove false authority claims (e.g., claiming administrator or root-level access) from tool descriptions. Tool descriptions should describe capabilities, not claim elevated roles.
Likely FP if the matched text is documentation explaining role-based access control concepts or describing user personas, not impersonating a system role.
Detects screenshot/screen capture tools combined with upload or transmission
screenshot + Upload files via `POST /api Block the ability to read and transmit contents of configuration files (.env, config.json, settings.yaml). Implement file path validation to prevent access to sensitive config files.
Likely FP if the skill reads config files to configure itself locally without transmitting the contents externally.
Detects patterns where external API responses are used directly without validation or sanitization
API response + Without the backup file and pass Validate and sanitize all data received from external APIs before using it in tool operations or agent prompts. Implement schema validation and treat API responses as untrusted input.
Likely FP if the match is a truncated table cell or documentation fragment that mentions API responses in a descriptive context, not actual unvalidated data processing.