MCP Protocol Security: What the Model Context Protocol Means for Supply Chains

Anthropic's Model Context Protocol standardizes how AI models interact with external tools. The security implications for software supply chains are significant.

Alex
AI Infrastructure Security Lead

The Model Context Protocol (MCP), introduced by Anthropic, is an open standard for connecting AI models to external data sources and tools. Think of it as USB-C for AI integrations: a standardized interface that lets any compliant AI system interact with any compliant tool or data source.

For developers building AI-powered applications, MCP simplifies integration. For security teams, MCP introduces a new protocol layer in the software supply chain that needs careful analysis.

What MCP Does

MCP defines a client-server protocol where:

  • MCP Clients (AI applications, IDE extensions, chatbots) connect to external capabilities
  • MCP Servers provide resources (data), tools (functions the AI can call), and prompts (pre-defined interaction templates)
  • Transport layers handle communication: stdio (local processes) and an HTTP-based transport, which was HTTP with Server-Sent Events in the original specification and is Streamable HTTP in later revisions
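
To make the client-server exchange concrete, here is a minimal sketch of the message a client sends when the model decides to call a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The tool name `query_database` and its arguments are hypothetical, and the helper names are ours rather than part of any SDK.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def encode_for_stdio(message: dict) -> bytes:
    """Serialize for the stdio transport: one JSON message per line."""
    # json.dumps escapes newlines inside strings, so the only raw newline
    # in the output is the delimiter appended here.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


# Hypothetical tool name and arguments, for illustration only.
request = make_tool_call("query_database", {"sql": "SELECT 1"})
wire = encode_for_stdio(request)
print(wire.decode("utf-8"), end="")
```

Everything in `params` is handed to the server's code, which is why the rest of this article treats the server as part of the trust boundary.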

When a developer installs an MCP server, their AI assistant gains new capabilities. An MCP server for a database lets the AI query data. An MCP server for a cloud provider lets the AI manage infrastructure. An MCP server for a code repository lets the AI read and analyze code.

The power is obvious. So is the attack surface.

The Supply Chain Trust Model

Installing an MCP server creates a trust chain that looks remarkably similar to installing a software dependency, with some important differences.

MCP servers are executable code. Unlike a data file or configuration, an MCP server is a running process that executes code, makes network connections, and accesses local resources. Installing an MCP server from an untrusted source is as risky as installing any untrusted software.

MCP servers access privileged context. By design, MCP servers sit next to AI models that have access to user conversations, code, documents, and credentials. A malicious MCP server can observe and exfiltrate anything the client sends it, such as tool arguments and requested resources, and it can return content designed to coax the model into sending more.

MCP servers can influence AI behavior. Through the resources and tools they provide, MCP servers shape the AI's responses and actions. A compromised MCP server that returns manipulated data influences the AI's reasoning without the user seeing the manipulation.

The discovery and distribution model is immature. There isn't yet a centralized, vetted registry for MCP servers. Servers are distributed through GitHub repositories, npm packages, and ad-hoc sharing. The verification and trust infrastructure is minimal.

Security Risks

Server Authenticity

How do you verify that an MCP server is what it claims to be? If someone shares an "MCP server for Jira" in a forum, you have limited ability to verify:

  • That it actually connects to Jira and not to a proxy that intercepts your data
  • That it doesn't include additional functionality beyond what's advertised
  • That it hasn't been modified since the original author published it
  • That the original author is trustworthy

This is the package registry problem all over again, but without the registry's (imperfect) trust signals like download counts, maintenance history, and community reviews.
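
One of the four checks above, whether the server has been modified since you reviewed it, can be partly automated by comparing the artifact against a digest recorded at review time. The sketch below assumes you have the server bundle as bytes and a pinned SHA-256 value stored somewhere you trust; the function names are illustrative. It says nothing about whether the original author is trustworthy.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(artifact: bytes, pinned_hex: str) -> bool:
    """True only if the artifact hashes to the digest recorded at review time."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sha256_hex(artifact), pinned_hex.lower())


# Stand-in bytes for a reviewed server bundle (hypothetical content).
reviewed_bundle = b"example server bundle, as reviewed"
pinned = sha256_hex(reviewed_bundle)  # recorded once, after review

tampered_bundle = reviewed_bundle + b" plus one extra line"
print(matches_pinned_digest(reviewed_bundle, pinned))
print(matches_pinned_digest(tampered_bundle, pinned))
```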

Permission Scope

MCP servers can declare broad capabilities. A server providing file system access might request read/write to arbitrary paths. A server providing API access might request credentials for multiple services. The permission model is still evolving, and current implementations often operate with more access than necessary.

The principle of least privilege is difficult to enforce when the capabilities a server needs depend on the natural language requests a user might make. A database MCP server might need broad query access because the AI might issue any query on the user's behalf.
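
Even when query scope is hard to narrow, file system scope usually is not. As a rough sketch, a wrapper in front of a file-access tool could refuse any path that resolves outside a set of approved roots. The function name and the idea of an `allowed_roots` list are assumptions for illustration, not part of the MCP specification.

```python
from pathlib import Path


def is_path_allowed(requested: str, allowed_roots: list[str]) -> bool:
    """True if the requested path resolves to an allowed root or something under it."""
    # resolve() collapses ".." segments and follows symlinks, so a request
    # like "<root>/../../etc/passwd" is judged by where it really points.
    resolved = Path(requested).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        if resolved == root_path or root_path in resolved.parents:
            return True
    return False


# Hypothetical project root, for illustration only.
roots = ["/srv/project"]
for candidate in ["/srv/project/src/main.py", "/srv/project/../../etc/passwd"]:
    verdict = "allow" if is_path_allowed(candidate, roots) else "deny"
    print(f"{verdict}: {candidate}")
```

Comparing resolved paths rather than string prefixes matters here: a prefix check would accept `/srv/project-secrets` as if it were inside `/srv/project`.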

Tool Invocation Risks

When an AI model calls an MCP tool, it's executing a function defined by the MCP server. The model selects which tool to call and what parameters to pass based on its understanding of the user's intent. This creates several risks:

Prompt injection through MCP. If an MCP server returns data containing prompt injection payloads, the AI model processes that data and may follow the injected instructions. This is the indirect prompt injection problem, made worse by the tool interface: results are fed straight back into the model's context, and the model can act on them immediately by calling further tools.

Parameter manipulation. A compromised MCP server could ignore the parameters the AI sends and perform different actions. The AI thinks it's reading a file; the server is actually exfiltrating data to an external endpoint.

Chained tool abuse. In environments with multiple MCP servers, a compromised server could influence the AI to invoke tools on other servers in unintended ways. The AI becomes an unwitting proxy for cross-server attacks.
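
One way a client could blunt chained tool abuse is a simple taint rule: once output from a less-trusted server has entered the model's context, calls to sensitive tools on other servers require explicit user confirmation. This is a sketch of that idea under our own assumptions; the class, server names, and tool names are hypothetical, and real clients may gate tool calls quite differently.

```python
from dataclasses import dataclass


@dataclass
class CallPolicy:
    """Tracks whether untrusted output has entered the context (illustrative only)."""

    sensitive_tools: set[tuple[str, str]]  # (server, tool) pairs that can do damage
    untrusted_servers: set[str]            # servers whose output is treated as tainted
    tainted: bool = False

    def record_result(self, server: str) -> None:
        """Call this whenever a tool result is added to the model's context."""
        if server in self.untrusted_servers:
            self.tainted = True

    def requires_confirmation(self, server: str, tool: str) -> bool:
        """Sensitive calls need a human in the loop once the context is tainted."""
        return self.tainted and (server, tool) in self.sensitive_tools


policy = CallPolicy(
    sensitive_tools={("cloud", "delete_instance")},
    untrusted_servers={"web_fetch"},
)
print(policy.requires_confirmation("cloud", "delete_instance"))  # before any fetch
policy.record_result("web_fetch")
print(policy.requires_confirmation("cloud", "delete_instance"))  # after tainted output
```

The rule is deliberately coarse. It does not try to detect injected instructions, only to make sure a person sees the risky call before it runs.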

Transport Security

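The stdio transport (local process communication) inherits the security of the local operating system: the server typically runs as a child process with the same privileges as the user who launched the client. The HTTP-based transport (HTTP with Server-Sent Events in the original specification) introduces network security considerations: TLS configuration, authentication, and protection against man-in-the-middle attacks.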

Remote MCP servers (hosted by third parties) add another dimension. Data flows to an external service, is processed by someone else's code, and results return to your AI model. Every concern about cloud service supply chains applies.
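
Some of these transport concerns can be caught with a configuration check before a client ever connects. The sketch below flags plaintext connections to non-local hosts, credentials embedded in the URL, and hosts that are not on an approved list. The function name, the allowlist, and the example hostnames are assumptions for illustration.

```python
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def check_remote_endpoint(url: str, allowed_hosts: set[str]) -> list[str]:
    """Return a list of problems with a remote MCP endpoint URL (empty means none found)."""
    problems = []
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    is_loopback = host in LOOPBACK_HOSTS
    if parts.scheme != "https" and not is_loopback:
        problems.append("plaintext transport to a non-loopback host")
    if parts.username or parts.password:
        problems.append("credentials embedded in the URL")
    if not is_loopback and host not in allowed_hosts:
        problems.append(f"host {host!r} is not on the allowlist")
    return problems


# Hypothetical endpoints, for illustration only.
approved = {"mcp.example.com"}
for endpoint in ["https://mcp.example.com/mcp", "http://mcp.example.net/mcp"]:
    print(endpoint, check_remote_endpoint(endpoint, approved))
```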

Mitigation Strategies

Vet MCP servers like dependencies. Before installing an MCP server, review its source code, check its author's reputation, and verify its behavior. Apply the same rigor you would (or should) apply to adding a new library to your project.

Restrict file system and network access. Run MCP servers in sandboxed environments that limit their access to the file system, network, and other system resources. Container-based isolation or operating system-level sandboxing can constrain what a compromised server can access.
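
As one possible shape for container-based isolation, the sketch below builds a `docker run` command for a stdio MCP server with networking disabled, a read-only root file system, all Linux capabilities dropped, and only explicitly listed directories mounted read-only. The image name and mount paths are hypothetical. The function only constructs the argument list, so it can be reviewed or logged before anything is launched.

```python
import shlex


def sandboxed_command(
    image: str,
    read_only_mounts: dict[str, str],
    allow_network: bool = False,
) -> list[str]:
    """Build a docker invocation that limits what a stdio MCP server can reach."""
    cmd = [
        "docker", "run",
        "--rm",                                   # discard the container on exit
        "-i",                                     # keep stdin open for the stdio transport
        "--read-only",                            # root file system is not writable
        "--cap-drop", "ALL",                      # drop all Linux capabilities
        "--security-opt", "no-new-privileges",    # block privilege escalation via setuid
    ]
    if not allow_network:
        cmd += ["--network", "none"]
    for host_path, container_path in read_only_mounts.items():
        cmd += ["--volume", f"{host_path}:{container_path}:ro"]
    cmd.append(image)
    return cmd


# Hypothetical image and paths, for illustration only.
command = sandboxed_command("example/mcp-files:1.4.2", {"/srv/project": "/workspace"})
print(shlex.join(command))
```

A server that genuinely needs network access, such as one fronting a SaaS API, would set `allow_network=True` and rely on egress filtering instead.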

Monitor MCP server behavior. Log tool invocations, data returned, and network connections. Establish baselines and alert on anomalies. If an MCP server that normally returns small JSON responses suddenly starts returning large data blobs, investigate.
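
A baseline does not need to be elaborate to catch the "small JSON suddenly becomes a large blob" case. The sketch below keeps a history of response sizes per server and flags a new response that is far above the historical mean, measured in standard deviations. The threshold and minimum sample count are arbitrary starting points, not recommendations.

```python
from statistics import mean, pstdev


def is_size_anomalous(
    history: list[int],
    new_size: int,
    threshold: float = 3.0,
    min_samples: int = 20,
) -> bool:
    """Flag a response whose size is far above what this server usually returns."""
    if len(history) < min_samples:
        return False  # not enough data to call anything unusual yet
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return new_size > mu  # every past response was identical in size
    return (new_size - mu) / sigma > threshold


# Synthetic sizes in bytes, for illustration only.
past_sizes = [480, 510, 495, 505, 500] * 4
for size in (520, 50_000):
    print(size, is_size_anomalous(past_sizes, size))
```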

Pin server versions. Like any dependency, MCP servers should be pinned to specific, verified versions. Automatic updates that pull new versions without review could introduce compromised code.
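
Pinning can be checked mechanically. The sketch below assumes a client configuration with an `mcpServers` map whose entries have `command` and `args` fields, a shape used by several MCP clients, and flags any server launched through `npx` without an exact `name@x.y.z` version. The package names are hypothetical, and the check only looks at the first non-flag argument.

```python
import re

# Matches "name@1.2.3" or "@scope/name@1.2.3"; rejects ranges, tags, and bare names.
EXACT_VERSION = re.compile(r"^(@[^/@\s]+/)?[^/@\s]+@\d+\.\d+\.\d+$")


def unpinned_servers(config: dict) -> list[str]:
    """Names of npx-launched servers that are not pinned to an exact version."""
    flagged = []
    for name, entry in config.get("mcpServers", {}).items():
        if entry.get("command") != "npx":
            continue  # other launchers need their own checks
        # Skip npx flags such as "-y"; the first remaining argument is the package.
        packages = [arg for arg in entry.get("args", []) if not arg.startswith("-")]
        if not packages or not EXACT_VERSION.match(packages[0]):
            flagged.append(name)
    return flagged


# Hypothetical configuration, for illustration only.
config = {
    "mcpServers": {
        "jira": {"command": "npx", "args": ["-y", "@example/mcp-jira@2.3.1"]},
        "db": {"command": "npx", "args": ["-y", "example-mcp-db"]},
        "files": {"command": "npx", "args": ["-y", "example-mcp-files@latest", "/srv/project"]},
    }
}
print(unpinned_servers(config))
```

A version pin narrows the window for surprise updates, but it is strongest when paired with a lockfile or digest check, since a version number alone does not prove the contents are unchanged.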

Validate tool outputs. Before the AI model acts on data from an MCP server, apply validation checks. Range checking, format verification, and sanity testing of returned data reduce the impact of compromised servers.
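
As an illustration of format and range checks, the sketch below validates a hypothetical database tool result before it is passed on to the model. The payload shape (`rows`, each with an `amount` and an optional `note`) and the limits are invented for this example; a real check would follow the schema of the actual tool.

```python
def validate_query_result(payload, max_rows: int = 1000, max_note_len: int = 2000) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    if not isinstance(payload, dict) or not isinstance(payload.get("rows"), list):
        return ["payload must be an object with a 'rows' list"]
    problems = []
    rows = payload["rows"]
    if len(rows) > max_rows:
        problems.append(f"too many rows: {len(rows)} > {max_rows}")
    for i, row in enumerate(rows[:max_rows]):
        if not isinstance(row, dict):
            problems.append(f"row {i}: not an object")
            continue
        amount = row.get("amount")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            problems.append(f"row {i}: amount is not a number")
        elif not 0 <= amount <= 1_000_000:
            problems.append(f"row {i}: amount {amount} is out of range")
        note = row.get("note", "")
        # Long free-text fields are a common carrier for injected instructions.
        if not isinstance(note, str) or len(note) > max_note_len:
            problems.append(f"row {i}: note is not text or is too long")
    return problems


# Synthetic payload, for illustration only.
result = {"rows": [{"amount": 12.5, "note": "ok"}, {"amount": -1}]}
print(validate_query_result(result))
```

Checks like these cannot prove that returned data is honest, but they bound how far a compromised server can push values or text before something notices.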

How Safeguard Helps

Safeguard extends supply chain governance to the MCP ecosystem. Our platform can track MCP server dependencies alongside traditional software dependencies, maintaining SBOMs that include the MCP servers integrated into your AI workflows.

Policy gates can enforce standards for MCP server adoption: approved server registries, required code reviews, mandatory version pinning, and vulnerability checking of MCP server dependencies. As the MCP ecosystem grows and becomes a standard part of AI-powered development, Safeguard ensures that this new supply chain dimension receives the same security governance as your existing software components.
