CRITICAL

Instruction override combined with dangerous operations NLP_OVERRIDE_DANGEROUS

Instruction override combined with dangerous operations

Use variable overrides ONLY for auth (api_key_id, private_key)

FIX

Remove language that attempts to override safety guidelines or bypass system-level controls. This pattern is a strong indicator of a jailbreak attempt.

FP?

Likely FP if the matched text is in a security research document or educational content clearly marked as an example rather than an active payload.

CRITICAL

Instruction override combined with dangerous operations NLP_OVERRIDE_DANGEROUS

prompt-injection L772

Instruction override combined with dangerous operations

Use variable overrides ONLY for auth (api_key_id, private_key)

FIX

Remove language that attempts to override safety guidelines or bypass system-level controls. This pattern is a strong indicator of a jailbreak attempt.

FP?

Likely FP if the matched text is in a security research document or educational content clearly marked as an example rather than an active payload.

HIGH

Fetch URL and use as instructions INDIRECT_001

indirect-injection L134

Detects fetching external URLs and using the content as agent instructions or rules

fetch sig-verified oracle quote instruction

FIX

Sanitize or validate all external inputs (file contents, API responses, user messages) before including them in prompts or tool calls. Implement input/output boundaries between trusted and untrusted data.

FP?

Likely FP if the matched text is the skill's own instruction set describing how to handle user input, not an actual injection payload.

HIGH

Fetch URL and use as instructions INDIRECT_001

indirect-injection L135

Detects fetching external URLs and using the content as agent instructions or rules

fetch managed update instructions

FIX

Sanitize or validate all external inputs (file contents, API responses, user messages) before including them in prompts or tool calls. Implement input/output boundaries between trusted and untrusted data.

FP?

Likely FP if the matched text is the skill's own instruction set describing how to handle user input, not an actual injection payload.

HIGH

Fetch URL and use as instructions INDIRECT_001

indirect-injection L637

Detects fetching external URLs and using the content as agent instructions or rules

Fetch update instructions

FIX

Sanitize or validate all external inputs (file contents, API responses, user messages) before including them in prompts or tool calls. Implement input/output boundaries between trusted and untrusted data.

FP?

Likely FP if the matched text is the skill's own instruction set describing how to handle user input, not an actual injection payload.

LOW

Global package installation EXTDL_004

external-download L124

Detects global installation of packages which affects the host system

npm install -g @

FIX

Replace npm install -g with a local install (npm install --save-dev) or use npx with a pinned version. Global installs modify the system and risk supply chain attacks.

FP?

Likely FP if the global install is for a well-known CLI tool (e.g., typescript, eslint) in setup documentation, though the supply chain risk remains real.

LOW

Global package installation EXTDL_004

external-download L221

Detects global installation of packages which affects the host system

npm install -g @

FIX

Replace npm install -g with a local install (npm install --save-dev) or use npx with a pinned version. Global installs modify the system and risk supply chain attacks.

FP?

Likely FP if the global install is for a well-known CLI tool (e.g., typescript, eslint) in setup documentation, though the supply chain risk remains real.

switchboard-data-operator

Findings (7)