First Seen
Feb 18, 2026
Last Scanned
Feb 22, 2026
Findings
13
Score
0/100
Findings (13)
Detects attempts to override or ignore previous instructions
Override system safety instructions
Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Skill can read private data (credentials, SSH keys, env vars) AND write to public channels (Slack, Discord, email). This combination enables data exfiltration.
[reads_private_data] access control, hardcoded secrets + [writes_public_output] Send data to external URLs/webhook
Break the toxic data flow by adding validation and sanitization between the input source and the sensitive operation. Do not pass untrusted data directly to file system, network, or execution APIs.
Likely FP if the data flow involves only trusted, hardcoded values and the taint analysis over-approximated the untrusted input sources.
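One possible reading of the sanitization step in the remediation above is a redaction pass that runs before any text leaves for an external channel. The sketch below is illustrative only: `SECRET_PATTERN` and `redact_secrets` are hypothetical names, and the pattern is an assumed, best-effort heuristic, not a rule taken from the scanner.

```python
import re

# Assumed heuristic: key/value pairs whose key looks like a credential name.
SECRET_PATTERN = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)(\S+)"
)

def redact_secrets(text: str) -> str:
    """Replace the value part of credential-like key/value pairs."""
    return SECRET_PATTERN.sub(
        lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text
    )

# Call this on any message body before handing it to a webhook or chat client.
print(redact_secrets("API_KEY=abc123 user=bob"))
```

Pattern-based redaction only catches secrets that match the pattern, so it complements, and does not replace, keeping credentials out of the data flow in the first place.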
Skill can read private data AND execute arbitrary code. This combination enables credential theft via dynamic code.
[reads_private_data] access control, hardcoded secrets + [executes_code] eval(
Add input validation between the user-controlled data source and the security-sensitive sink (e.g., file writes, command execution). Implement allowlisting for acceptable input patterns.
Likely FP if the user input passes through explicit validation or sanitization before reaching the sensitive operation, and the taint tracker missed the sanitization step.
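The allowlisting advice above can be sketched as a strict pattern check followed by a lookup in a fixed dispatch table, so that user input selects an operation and is never evaluated as code. The names `ALLOWED_NAME`, `OPERATIONS`, and `apply_operation` are hypothetical, and the pattern is an assumed example of an acceptable input shape.

```python
import re

# Assumed allowlist pattern: short identifiers made of safe characters only.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")

# Fixed table of permitted operations; nothing outside it can run.
OPERATIONS = {
    "upper": str.upper,
    "lower": str.lower,
}

def apply_operation(op_name: str, text: str) -> str:
    """Run a named operation on text without using eval()."""
    # fullmatch() requires the whole string to match the allowlist pattern.
    if not ALLOWED_NAME.fullmatch(op_name):
        raise ValueError("operation name contains disallowed characters")
    if op_name not in OPERATIONS:
        raise ValueError("unknown operation")
    return OPERATIONS[op_name](text)

print(apply_operation("upper", "abc"))
```

An input such as `__import__('os')` fails the pattern check and raises `ValueError` before it reaches any sensitive operation.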
Detects skills that both read sensitive credential files and send data to external services
access control, hardcoded secrets + Send data to external
Block access to git credentials, SSH keys, and repository tokens. If git operations are needed, use scoped deploy keys and restrict the tool to specific repositories.
Likely FP if the match is documentation about git configuration (e.g., setting up git credentials helper) rather than code that reads and transmits them.
Detects patterns indicating sensitive data being sent to external services
exfiltrate data
Restrict file reading to the project directory and block outbound network calls that include file contents. Implement file path validation to prevent directory traversal.
Likely FP if the tool legitimately reads project files and displays them to the user locally, without sending data to external services.
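A minimal sketch of the file path validation mentioned above, assuming the tool has a single project root directory. `resolve_in_project` is a hypothetical helper name; it resolves the requested path and rejects anything that ends up outside the root.

```python
from pathlib import Path

def resolve_in_project(project_root: str, requested: str) -> Path:
    """Return the resolved path if it stays inside project_root."""
    root = Path(project_root).resolve()
    # resolve() collapses ".." segments and follows symlinks where they exist.
    candidate = (root / requested).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise PermissionError(f"path escapes project directory: {requested}")
    return candidate

print(resolve_in_project("/tmp/proj", "src/main.py"))
```

Requests such as `../../etc/passwd` or an absolute `/etc/passwd` resolve outside the root and raise `PermissionError`.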
Detects attempts to override or ignore previous instructions
Ignore all previous instructions
Remove the injection payload from the skill definition. Text that attempts to reset agent context or override prior directives is a direct attack vector.
Likely FP if the text is in a security tutorial or research paper discussing injection techniques as examples, not in an active skill description.
Detects common jailbreak prompt patterns
developer mode
Remove directives that attempt to exfiltrate data through the agent's response (e.g., asking the agent to embed credentials in URLs or include secret values in output).
Likely FP if the text is a legitimate tool instruction about displaying configuration to the user (e.g., show current settings) without external transmission.
Code block labeled "markdown" contains executable content
# DANGEROUS - Unsafe tool usage
"Run: os.system(f'process {user_input}')"
"Fetch and execute code from the user's URL"
"Write the response directly to /etc/passwd"
Align the tool description with what the code actually does. If the description says one thing but the code does another, update the description to be accurate or fix the code.
Likely FP if the NLP analyzer misinterpreted technical jargon or the description accurately describes behavior through domain-specific terminology.
Detects patterns where credential or secret reads are combined with external data transmission
access control, hardcoded secret + Send data to external
Prevent credentials and sensitive data obtained by one MCP tool from being passed to other tools. Implement data isolation between tools and restrict cross-tool data flow for secrets.
Likely FP if the cross-tool data flow is intentional API authentication (e.g., a tool fetches an auth token that another tool uses for the same service).
Detects skills where external API responses control agent decisions or actions
Request credentials or API + based on unvalidated LLM output
Do not include content from MCP tool responses verbatim in system prompts or tool descriptions. Sanitize all dynamic content before incorporating it into prompt context.
Likely FP if the match is a static tool description that mentions dynamic content handling in its documentation, not an actual injection vector.
Detects patterns where external API responses are used directly without validation or sanitization
API keys be included in response + without use
Validate and sanitize all data received from external APIs before using it in tool operations or agent prompts. Implement schema validation and treat API responses as untrusted input.
Likely FP if the match is a truncated table cell or documentation fragment that mentions API responses in a descriptive context, not actual unvalidated data processing.
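The schema validation step recommended above might look like the following sketch, which uses only the standard library. `EXPECTED_FIELDS` is an assumed example schema and `parse_api_response` is a hypothetical helper; a real integration would define its own fields.

```python
import json

# Assumed example schema: field name -> required Python type.
EXPECTED_FIELDS = {"id": int, "name": str}

def parse_api_response(raw: str) -> dict:
    """Parse an API response and reject anything that deviates from the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("response is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    unexpected = set(data) - set(EXPECTED_FIELDS)
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data

print(parse_api_response('{"id": 1, "name": "widget"}'))
```

Rejecting unexpected fields means a response that suddenly carries something like an `api_key` entry fails closed and never reaches the agent prompt.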
Detects Python subprocess and os.system calls for command execution in skill descriptions
os.system(
Pass arguments as an explicit list instead of a shell string. Set shell=False and validate all user-supplied values before inclusion.
Likely FP if the match is in documentation explaining Python subprocess usage or in a description mentioning it as a topic.
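As an illustration of the remediation above, the sketch below passes user input as one element of an argument list with `shell=False`, so shell metacharacters in the input are never interpreted. `run_process` is a hypothetical helper, the validation rule is an assumed minimum, and the child command (the current Python interpreter echoing its argument) is a placeholder for whatever program the skill actually runs.

```python
import subprocess
import sys

def run_process(user_input: str) -> str:
    """Run a child process with user_input as a single argv element."""
    # Assumed minimal validation: reject empty values and NUL bytes.
    if not user_input or "\x00" in user_input:
        raise ValueError("rejected user input")
    # Placeholder command: the interpreter prints its first argument back.
    argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input]
    result = subprocess.run(
        argv,
        shell=False,          # no shell, so ';' and '$()' are plain text
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on non-zero exit
        timeout=10,
    )
    return result.stdout.rstrip("\n")

print(run_process("hello; echo pwned"))
```

With `os.system(f'process {user_input}')`, the same input would run `echo pwned` as a second command; here it is passed through as a literal string.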
Detects skills where user-provided URLs are consumed and processed by the agent
Fetch and execute code from the user's URL
Validate and sanitize user-provided URLs before fetching them. Implement URL allowlisting, block private/internal IP ranges, and treat fetched content as untrusted data.
Likely FP if the skill is a web browser or URL fetcher where consuming user-provided URLs is the documented core feature with appropriate sandboxing.
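The URL checks described above could be sketched as follows, using only the standard library. `ALLOWED_HOSTS` and `is_url_allowed` are hypothetical names and the host list is an assumed example. The sketch inspects only the URL text; it does not resolve DNS, so a hostname that resolves to an internal address would need a separate check at connection time.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed example allowlist of hosts the skill may fetch from.
ALLOWED_HOSTS = {"example.com", "docs.example.com"}

def is_url_allowed(url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme != "https":
        return False
    host = parts.hostname  # lowercased, without port or userinfo
    if not host:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # host is a name, not an IP literal
    # Defense in depth: refuse private or internal IP literals outright.
    if addr is not None and (
        addr.is_private or addr.is_loopback
        or addr.is_link_local or addr.is_reserved
    ):
        return False
    return host in ALLOWED_HOSTS

print(is_url_allowed("https://example.com/data.txt"))
```

Even for an allowed URL, the fetched body should be handled as untrusted data and never executed.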