
How Jiffy scans AI artifacts: a technical overview

The detection pipeline end to end -- signatures, heuristics, sandboxed execution, cross-ecosystem dedupe, and scoring. What runs where, and why.

This post is the short technical tour. Deeper docs are in the repo.

The pipeline

ingest --> inventory --> static scan --> sandbox scan --> dedupe --> scoring --> catalog

1. Ingest

Sources we pull from:

  • GitHub orgs — connected via an OAuth app; we walk the org's repos and detect artifacts by path pattern (skill directories, rule files, mcp.json, agent config shapes).
  • Anthropic Skills marketplace — public listings, pulled on a schedule.
  • Hugging Face — spaces and repos tagged as Claude skills or MCP servers.
  • Registry APIs — npm, PyPI, and cargo for packages that declare MCP server or skill metadata.
  • Direct uploads — a customer pushes an artifact file directly via our API.

Every ingest is tagged with provenance: the source, the timestamp, the commit SHA or release tag when applicable, and the claimed publisher.
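
A provenance record of that shape might look like the sketch below. The field names and source values are illustrative assumptions, not Jiffy's published schema:

```typescript
// Illustrative provenance record attached to every ingest.
// Field names are assumptions, not Jiffy's actual schema.
interface Provenance {
  source: 'github' | 'marketplace' | 'huggingface' | 'registry' | 'upload';
  fetchedAt: string;         // ISO 8601 ingest timestamp
  ref?: string;              // commit SHA or release tag, when applicable
  claimedPublisher: string;  // unverified until reputation checks run
}

function tagIngest(
  source: Provenance['source'],
  claimedPublisher: string,
  ref?: string,
): Provenance {
  return { source, fetchedAt: new Date().toISOString(), ref, claimedPublisher };
}
```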

2. Inventory

Before scanning we normalize the artifact into a common structure:

interface Artifact {
  kind: 'skill' | 'mcp-server' | 'rule-file' | 'agent-repo' | 'config';
  id: string;            // cross-ecosystem stable ID
  sources: Source[];     // registries / repos where it has been observed
  files: ArtifactFile[]; // content-addressed file list
  manifest: ArtifactManifest;
  declaredCapabilities: DeclaredCapabilities;
}

The declaredCapabilities field records what the artifact says it does. Later pipeline stages compare observedCapabilities against it; any divergence is flagged.
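
At its simplest, that divergence check is a set difference over capability strings. The `net:`/`fs:` naming below is an assumed convention for the sketch:

```typescript
// Toy divergence check: any observed capability not declared is flagged.
// Capability strings like 'net:api.example.com' are an assumed convention.
type Capability = string;

function capabilityDivergence(
  declared: Capability[],
  observed: Capability[],
): Capability[] {
  const allow = new Set(declared);
  return observed.filter((c) => !allow.has(c));
}
```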

3. Static scan

Three parallel pass types:

Regex signatures — the high-precision layer. Credential patterns (AWS, Stripe, GitHub, Slack, OpenAI, Anthropic tokens, generic high-entropy strings in env contexts), known-bad URLs, known-bad publisher handles.

AST patterns — Python, JavaScript, TypeScript, shell. We build the AST and match against patterns like "env var read followed by network POST to non-allowlisted host". This catches credential exfiltration that regex alone misses.

Prompt-content heuristics — for the Markdown parts of the artifact (SKILL.md, rule files, prompt bundles). A separate model (not the agent under protection) classifies prompt content against the Jiffy taxonomy: override, smuggling, sub-agent spawn, prompt-injection boilerplate, instructional network-destination reference.

Each static finding includes file pointer, line range, signature ID, confidence weight, and a short human-readable explanation.
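
A finding record of that shape, plus a toy version of the high-entropy check mentioned under regex signatures, could be sketched like this (the field names are illustrative, and real scanners pair entropy with context to keep precision high):

```typescript
// Sketch of a static finding record; field names are illustrative.
interface StaticFinding {
  file: string;
  lines: [number, number];
  signatureId: string;
  confidence: number;   // weight derived from historical precision
  explanation: string;
}

// Shannon entropy in bits per character. A uniform string scores 0;
// a string of all-distinct characters scores log2(length).
function entropyBitsPerChar(s: string): number {
  const counts = new Map<string, number>();
  for (const ch of s) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let h = 0;
  for (const n of counts.values()) {
    const p = n / s.length;
    h -= p * Math.log2(p);
  }
  return h;
}
```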

4. Sandbox scan

Artifacts that include executable code get a sandboxed execution pass. We use E2B microVMs with:

  • Per-invocation teardown.
  • A network egress policy that defaults to deny, with an allowlist derived from declaredCapabilities.
  • Filesystem and process telemetry via a lightweight collector inside the sandbox.

We exercise the artifact's initialization path plus any exported tools with synthetic inputs. Every network destination and every filesystem write is logged. Divergence from the declared capabilities becomes a finding.
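
The default-deny egress policy can be sketched as a derivation from declared capabilities. The `net:<host>` capability format is the same assumed convention as above, not Jiffy's real format:

```typescript
// Sketch: derive a default-deny egress allowlist from declared capabilities.
// The 'net:<host>' format is an assumption for illustration.
function egressAllowlist(declared: string[]): Set<string> {
  return new Set(
    declared
      .filter((c) => c.startsWith('net:'))
      .map((c) => c.slice('net:'.length)),
  );
}

// Any host not on the list is denied, logged, and becomes a finding.
function egressAllowed(allow: Set<string>, host: string): boolean {
  return allow.has(host);
}
```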

5. Cross-ecosystem dedupe

The same malicious artifact shows up in multiple places. A rule file on GitHub, an MCP server on npm under a different publisher name, a skill on Hugging Face. We dedupe against:

  • Content hash — identical file bytes.
  • Near-match hash — minor diffs (whitespace, rename, boilerplate change). We use a locality-sensitive hash that tolerates these.
  • Behavioral signature — same observed capabilities, same network destinations, same AST shape even under renaming.

A deduped entry in the catalog records every registry where the artifact appears, under one stable ID. If the same artifact surfaces somewhere new next month, it joins the existing record.
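
The near-match layer can be illustrated with a whitespace-normalizing hash. A real system would use a locality-sensitive scheme such as simhash or MinHash; this stand-in only shows the normalization idea:

```typescript
// FNV-1a over whitespace-normalized text: a toy stand-in for a real
// locality-sensitive hash. It collapses trivial reformatting into one
// bucket but, unlike an LSH, tolerates nothing beyond whitespace diffs.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

function nearMatchKey(text: string): number {
  return fnv1a(text.replace(/\s+/g, ' ').trim());
}
```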

6. Confidence scoring

A weighted score combining:

  • Static signature strength (weighted by historical precision).
  • Runtime findings (heavier weight).
  • Capability divergence (declared vs observed).
  • Publisher reputation (a signed publisher with history outranks a first-time anonymous one).
  • Catalog cross-references.

The score maps to one of four tiers:

  • Trusted — signed publisher, no findings, observed behavior matches declaration.
  • Caution — minor findings, unsigned publisher, or capability declaration issues.
  • Risky — static or runtime findings that match known-bad patterns at moderate precision.
  • Malicious — matches a confirmed entry in the catalog or produces high-confidence runtime evidence of exfil / smuggling / spawn.

Every score is a structured document you can audit. Policy layers can override — a customer's policy engine can downgrade Caution to Trusted for specific first-party publishers.
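
Under those rules, the score-to-tier mapping can be sketched as below. The weights and thresholds are invented for illustration, not Jiffy's calibrated values, and the policy-layer override would sit after this function:

```typescript
type Tier = 'Trusted' | 'Caution' | 'Risky' | 'Malicious';

interface ScoreInputs {
  staticWeight: number;          // sum of signature weights
  runtimeWeight: number;         // runtime findings, weighted heavier
  capabilityDivergence: boolean; // declared vs observed mismatch
  knownMaliciousMatch: boolean;  // confirmed catalog hit
  publisherTrusted: boolean;     // signed publisher with history
}

// Invented weights and thresholds, purely to illustrate the shape.
function scoreTier(i: ScoreInputs): Tier {
  if (i.knownMaliciousMatch) return 'Malicious';
  const score =
    i.staticWeight + 2 * i.runtimeWeight + (i.capabilityDivergence ? 1 : 0);
  if (score >= 3) return 'Risky';
  if (score > 0 || !i.publisherTrusted) return 'Caution';
  return 'Trusted';
}
```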

The public catalog

Every Malicious entry (and, with publisher opt-in, Risky entries) flows into the public catalog at intel.jiffylabs.app. Each catalog entry includes:

  • Stable ID.
  • Observed registries and publishers.
  • Content hashes and near-match siblings.
  • Signature matches and runtime findings.
  • Confidence score and tier.
  • First-seen and last-seen timestamps.
  • A JSON export via the public API.

The catalog is licensed CC BY 4.0. Downstream security tools are welcome to ingest it directly.
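
Mapped onto the bullets above, a catalog entry's JSON export might deserialize into a shape like this. Field names are assumptions for illustration, not the documented API schema:

```typescript
// Assumed shape of a public catalog entry; not the documented API schema.
interface CatalogEntry {
  id: string;                  // stable cross-ecosystem ID
  tier: 'Risky' | 'Malicious';
  confidence: number;
  registries: { registry: string; publisher: string }[];
  contentHashes: string[];
  nearMatchSiblings: string[]; // IDs of near-duplicate artifacts
  findings: { signatureId: string; kind: 'static' | 'runtime' }[];
  firstSeen: string;           // ISO 8601
  lastSeen: string;
}
```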

What you get when you connect Jiffy

  • Inventory across skills, MCP servers, rule files, agent repos for your connected orgs.
  • Scored findings for each.
  • Policy-configurable block / allow / quarantine actions.
  • Continuous re-scan on artifact updates.
  • Webhook and API access for integration with your SIEM, SOAR, MDM, or Zero Trust layer.

Frequently asked questions

What does Jiffy actually scan?
Five classes of artifact: Anthropic Skills, MCP servers, IDE rule files (.cursorrules, agents.md, .windsurfrules, .clinerules, .github/copilot-instructions.md), agent config repos (LangChain, CrewAI, LlamaIndex), and AI-adjacent package manifests. Connect a GitHub org or point us at a registry and we discover the artifacts automatically.
Is this static analysis or runtime?
Both. Static catches most of what we publish -- regex signatures, AST patterns, prompt-content heuristics. The tail needs runtime: we execute anything that runs code at load inside an E2B sandbox with full network and filesystem telemetry. Runtime findings cross-check against the static ones.
Why E2B specifically?
E2B gives us isolated microVMs with per-invocation teardown, a configurable network egress policy, and low invocation overhead. That fits the shape of the skill and MCP-server workload: lots of short-lived sandbox invocations. We pin sandbox versions and publish the manifest with each scan.
How do you score confidence?
A scored tier (Trusted / Caution / Risky / Malicious) produced from a weighted combination of static signature strength, runtime findings, registry provenance, and publisher reputation. Every score attaches to a findings manifest -- you can see the specific inputs that produced the tier and override it at the policy layer if your environment has a different risk tolerance.
Is the signature catalog open?
The threat catalog -- the specific artifacts we've observed -- is fully public at intel.jiffylabs.app under CC BY 4.0. The signature definitions are partially open (we publish the high-confidence patterns) and partially closed (the research-team-only patterns we use for the long tail). We are moving toward fully open over time.
Does this work on private artifacts?
Yes. Connect your GitHub org, your S3 bucket, or your private MCP registry and we scan inside your perimeter. Findings stay within your tenant. Only aggregate statistics and confirmed-malicious entries flow into the public catalog, and only with explicit opt-in for the latter.

