Research Methodology

Aguara Watch is an open research initiative that continuously scans public AI agent skill registries for security vulnerabilities. This page explains how we collect, analyze, and publish our data.

The scanning pipeline

1. Crawl

Every 12 hours, crawlers discover and download skills from 7 public registries. Content hashing ensures only changed files are reprocessed.
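
As a rough illustration of the content-hashing idea (a sketch, not Aguara's actual crawler code), the snippet below keeps a map of previously seen digests and returns only the files whose content changed. The function names, the choice of SHA-256, and the in-memory `seen` store are assumptions made for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's content (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def files_to_reprocess(files: dict[str, bytes], seen: dict[str, str]) -> list[str]:
    """Return the paths whose content differs from the stored digest.

    `seen` maps path -> last known digest and is updated in place.
    """
    changed = []
    for path, data in files.items():
        digest = content_hash(data)
        if seen.get(path) != digest:
            changed.append(path)
            seen[path] = digest
    return changed
```

On a first crawl every file is new, so everything is returned. On later crawls only files with a different digest come back, which is what keeps repeated 12-hour runs cheap.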

2. Scan

The Aguara binary runs 188+ detection rules across pattern matching, NLP semantic analysis, and taint tracking. Each finding includes severity, matched text, and surrounding code context.
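
To make the shape of a finding concrete, here is a minimal pattern-matching sketch. It is not one of Aguara's real rules: the rule ID, the regular expression, the HIGH severity, and the `Finding` fields are hypothetical, chosen only to show a finding that carries severity, matched text, and surrounding context.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str
    matched_text: str
    context: str  # the matching line plus its neighbours


# Hypothetical rule for illustration: a download piped straight into a shell.
RULE_ID = "EXAMPLE_CURL_PIPE_SHELL"
PATTERN = re.compile(r"curl\s+[^|\n]*\|\s*(?:sh|bash)\b")


def scan_text(text: str, context_lines: int = 1) -> list[Finding]:
    """Apply the example rule line by line and collect findings with context."""
    lines = text.splitlines()
    findings = []
    for i, line in enumerate(lines):
        match = PATTERN.search(line)
        if match:
            start = max(0, i - context_lines)
            end = min(len(lines), i + context_lines + 1)
            findings.append(
                Finding(RULE_ID, "HIGH", match.group(0), "\n".join(lines[start:end]))
            )
    return findings
```

Because the function depends only on its input text, scanning the same content twice yields identical findings.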

3. Audit

An automated auditor classifies each finding as a true positive, a false positive (FP), or needs-review, using heuristic rules that assign a confidence score. Findings classified as false positives with confidence above 80% are excluded from grades.
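
The exclusion rule can be sketched as a simple filter. The 80% threshold comes from the description above; the verdict labels, the field names, and the reading of "above" as strictly greater than are assumptions for the example.

```python
from dataclasses import dataclass

FP_CONFIDENCE_THRESHOLD = 0.80  # from the text: false positives above 80% confidence


@dataclass(frozen=True)
class AuditedFinding:
    severity: str
    verdict: str       # assumed labels: "true_positive", "false_positive", "needs_review"
    confidence: float  # 0.0 to 1.0


def findings_for_grading(findings: list[AuditedFinding]) -> list[AuditedFinding]:
    """Drop only high-confidence false positives; keep everything else."""
    return [
        f
        for f in findings
        if not (f.verdict == "false_positive" and f.confidence > FP_CONFIDENCE_THRESHOLD)
    ]
```

Low-confidence false positives and needs-review findings still count toward the grade in this sketch, which matches the preference for flagging over silently suppressing.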

4. Score

Each skill starts at 100 points. A CRITICAL finding subtracts 25 points, a HIGH finding 15, and a MEDIUM finding 8. LOW and INFO findings are tracked but do not affect the score. Grades map to score ranges: A (90-100), B (75-89), C (50-74), D (25-49), F (0-24).
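
The scoring arithmetic can be written out as a short sketch. The penalties and grade ranges are taken from the description above; applying the penalty once per finding and clamping the score at 0 are assumptions (the F range of 0-24 suggests a floor, but the text does not state one).

```python
PENALTIES = {"CRITICAL": 25, "HIGH": 15, "MEDIUM": 8}  # LOW and INFO cost nothing


def score(severities: list[str]) -> int:
    """Start at 100 and subtract the penalty for each finding's severity."""
    total = 100 - sum(PENALTIES.get(s, 0) for s in severities)
    return max(0, total)  # assumption: scores never go below 0


def grade(points: int) -> str:
    """Map a score to a letter grade using the published ranges."""
    if points >= 90:
        return "A"
    if points >= 75:
        return "B"
    if points >= 50:
        return "C"
    if points >= 25:
        return "D"
    return "F"


# One CRITICAL, one HIGH, one LOW: 100 - 25 - 15 = 60, which is a C.
example = score(["CRITICAL", "HIGH", "LOW"])
print(example, grade(example))
```

Under these assumptions a single MEDIUM finding leaves a skill at 92 (still an A), while a single HIGH finding drops it to 85 (a B).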

5. Publish

Results are exported as a static JSON API and built into individual skill report pages. All data is public and downloadable as CSV datasets.
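
For readers who want to work with the published data, the sketch below shows one way the same records could be serialized as both JSON and CSV using only the Python standard library. The field names (`skill`, `score`, `grade`) and the sample record are hypothetical and do not describe the actual export schema.

```python
import csv
import io
import json


def to_json(results: list[dict]) -> str:
    """Serialize results as stable, human-readable JSON."""
    return json.dumps(results, indent=2, sort_keys=True)


def to_csv(results: list[dict], fieldnames: list[str]) -> str:
    """Serialize the same results as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


# Hypothetical record, for illustration only.
results = [{"skill": "example-skill", "score": 85, "grade": "B"}]
print(to_csv(results, ["skill", "score", "grade"]))
```

Sorting keys in the JSON output keeps successive exports easy to diff when nothing has changed.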

Research principles

Deterministic analysis

No LLM in the detection loop. Every finding is reproducible. Run the same scan twice, get the same results. The NLP analyzer uses embedding similarity, not generative models.
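
To illustrate why embedding similarity is reproducible, here is a minimal sketch of a similarity check between two vectors. Cosine similarity is a common choice and is assumed here; the text does not say which metric or embedding model the NLP analyzer uses, and the toy vectors below are not real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings of a known-bad phrase and a candidate text.
known_bad = [1.0, 0.0]
candidate = [1.0, 1.0]
print(cosine_similarity(known_bad, candidate))
```

The function is pure arithmetic on fixed inputs, so the same pair of vectors always produces the same score. Nothing is sampled or generated.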

Open data, open methodology

All findings, scores, and datasets are public. The scanner is open source (Apache-2.0). Anyone can verify our results by running Aguara on the same files.

False positive transparency

We show FP hints on every finding. The audit system classifies findings with confidence scores. We would rather flag something and explain why it might be a false positive than silently suppress it.

Severity reflects impact, not probability

A CRITICAL finding means the impact would be severe if exploited, regardless of how likely exploitation is. Context (is this documentation or real code?) is handled by the audit layer, not the severity rating.

Known limitations

Aguara is a static analysis tool. It reads skill definitions and configuration files as text. It does not execute code, observe runtime behavior, or test actual API endpoints. Some threats (like a server that behaves differently after installation) are outside its detection scope.

The scanner produces false positives. Documentation that describes security threats (like a tutorial about prompt injection) will trigger the same rules that detect actual attacks. The audit system mitigates this, but manual review is still needed for borderline cases.

Content quality varies by registry. Registries that expose full source code (Skills.sh, ClawHub) produce more accurate scans than registries where only metadata or HTML descriptions are available (mcp.so). A clean score on a registry with limited content does not guarantee the skill is safe.

Advisory board

We are building an advisory board of security researchers, AI safety practitioners, and registry operators to review detection rules, validate findings, and guide the research direction.

The board will help with false positive review, rule calibration, and identifying emerging threat patterns in the AI agent ecosystem. Members get early access to findings data and contribute to the open-source scanner.

Interested in joining? Contact gustavo@oktsec.com.

Open source