First Seen
Feb 18, 2026
Last Scanned
Feb 22, 2026
Findings
2
Score
85/100
Findings (2)
Detects instructions to hide actions from the user
never tell the user Remove directives that attempt to change the agent's output format, suppress safety warnings, or alter response structure in ways that bypass safety controls.
Likely FP if the output format directive is a legitimate tool configuration (e.g., return results as JSON) that does not suppress safety features.
Detects skills where user-provided URLs are consumed and processed by the agent
Fetch`) for fetching user-specified URL Validate and sanitize user-provided URLs before fetching them. Implement URL allowlisting, block private/internal IP ranges, and treat fetched content as untrusted data.
Likely FP if the skill is a web browser or URL fetcher where consuming user-provided URLs is the documented core feature with appropriate sandboxing.