AI Security

Securing AI Agents: MCP Protocol Risks and Mitigations

The Model Context Protocol is transforming how AI agents interact with tools, but it introduces new attack surfaces. Here is what security teams need to understand.

The Model Context Protocol has gone from an interesting idea to a de facto standard in about six months. Every major AI assistant now supports MCP, and the ecosystem of MCP servers is growing fast. This is mostly good news -- standardization reduces complexity and enables interoperability.

But MCP also introduces a new category of attack surface that most security teams have not thought about yet. We have spent the last few months analyzing MCP deployments and building our own MCP server at Safeguard. Here is what we have learned about the risks.

Understanding the MCP Threat Model

An MCP server is fundamentally a bridge between an AI agent and some external system -- a database, an API, a file system, a cloud service. The AI agent sends structured requests to the MCP server (JSON-RPC messages whose arguments may contain free-form natural language), and the server translates them into actions on the underlying system and returns results.

This architecture creates a trust chain: User -> AI Agent -> MCP Client -> MCP Server -> Backend System. Every link in that chain is a potential attack surface.

Risk 1: Tool Description Injection

MCP servers describe their capabilities to AI agents through tool descriptions -- structured metadata that tells the agent what each tool does, what parameters it accepts, and what it returns. The AI agent uses these descriptions to decide when and how to use each tool.

The problem: tool descriptions are essentially prompts. A malicious or compromised MCP server can craft tool descriptions that manipulate the AI agent's behavior. For example:

A tool description that instructs the agent to always use that tool first, overriding user intent
Descriptions that include instructions to exfiltrate conversation context to the server
Parameter descriptions that trick the agent into sending sensitive data as tool arguments

This is prompt injection through the tool layer, and most MCP clients have limited defenses against it.

Mitigation: Review tool descriptions from third-party MCP servers before deployment. Use MCP clients that display tool descriptions to users. Consider implementing an allowlist of approved MCP servers for your organization.
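
As a rough illustration of the review step, here is a minimal sketch of an automated pre-screen. It assumes tool definitions arrive as dictionaries with the `description` and `inputSchema` fields that MCP tool listings carry. The server allowlist and the phrase patterns are hypothetical examples, not a vetted detection set, and a scan like this can only support a human review, not replace it.

```python
import re

# Hypothetical allowlist of server names your organization has reviewed.
APPROVED_SERVERS = {"internal-docs", "safeguard"}

# Illustrative phrases that tend to address the model rather than document
# the tool. This list is an assumption for the sketch, not a standard.
SUSPICIOUS_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"always (use|call) this tool",
    r"do not (tell|inform|mention)",
    r"(send|include|forward) (the )?(conversation|chat|history|context)",
]


def review_tool(server_name: str, tool: dict) -> list:
    """Return a list of findings for one tool definition; empty means no flags."""
    findings = []
    if server_name not in APPROVED_SERVERS:
        findings.append(f"server '{server_name}' is not on the allowlist")
    # Collect the top-level description plus every parameter description.
    texts = [tool.get("description", "")]
    properties = tool.get("inputSchema", {}).get("properties", {})
    texts += [prop.get("description", "") for prop in properties.values()]
    for text in texts:
        for pattern in SUSPICIOUS_PATTERNS:
            if re.search(pattern, text, flags=re.IGNORECASE):
                findings.append(f"matched /{pattern}/ in: {text[:60]!r}")
    return findings
```

Pattern matching of this kind is easy to evade with paraphrasing, so it is best treated as a tripwire that routes a server to manual review.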

Risk 2: Excessive Permission Scope

Many MCP servers request broad permissions by design. A database MCP server might need read access to query tables, but does it also get write access? A file system MCP server might be scoped to a project directory, but what prevents path traversal?

The MCP specification does not mandate a granular permission model. Each MCP server implements its own authorization logic, and the quality varies wildly. We reviewed 40 popular open-source MCP servers and found that:

68% had no configurable permission scoping
45% could be used to access resources outside their intended scope
22% had no authentication mechanism at all

Mitigation: Run MCP servers with the principle of least privilege. Use network segmentation to limit what backend systems an MCP server can reach. Audit MCP server source code before deploying in production environments.
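
For the path traversal question in particular, one common guard is to resolve every requested path and confirm it still sits under the configured root. The sketch below assumes a file system server that receives a root directory and a client-supplied relative path; the function name `resolve_in_scope` is ours, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_in_scope(root: str, requested: str) -> Path:
    """Resolve `requested` under `root`, or raise PermissionError if it escapes."""
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check,
    # so "a/../../etc" and symlinked escapes are compared by their real target.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):
        raise PermissionError(f"{requested!r} is outside the allowed directory")
    return candidate
```

An absolute `requested` path replaces the root when joined, so it is rejected by the same check unless it already points inside the root.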

Risk 3: Data Exfiltration Through Context

When an AI agent calls an MCP tool, it fills in the tool's arguments from the conversation, so those arguments can carry fragments of your conversations, code, and data. This is partly by design -- context helps the MCP server return better results. But it also means that every MCP server you connect is a potential recipient of that material.

Consider this scenario: you are discussing a security vulnerability in a private codebase with your AI assistant. You then ask a question that triggers an MCP tool call to a third-party service. The tool call may include context about the vulnerability you were just discussing.

Mitigation: Be aware of what context flows to which MCP servers. Use local MCP servers for sensitive operations. Consider MCP client configurations that limit context sharing.

Risk 4: Server-Side Request Forgery via AI

This is a subtle one. If an MCP server makes HTTP requests based on AI agent instructions (fetching URLs, calling APIs, sending webhooks), an attacker who can influence the AI agent's input can potentially use the MCP server as an SSRF proxy.

The attack path: attacker provides crafted input to a user -> user shares it with AI assistant -> AI assistant calls MCP tool with attacker-controlled parameters -> MCP server makes request to internal network.

This is not theoretical. We demonstrated this attack path in a controlled environment against three different MCP servers.

Mitigation: MCP servers should validate and sanitize all parameters, especially URLs and identifiers. Implement egress filtering for MCP server processes. Never run MCP servers on internal networks without proper network controls.
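
A minimal sketch of the URL validation step, assuming the server only needs to reach public HTTP(S) endpoints. The function names are ours. The check resolves the hostname and rejects any address that is not globally routable, which covers loopback, private ranges, and link-local addresses such as cloud metadata endpoints.

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_public_address(ip_text: str) -> bool:
    """True only for addresses that are globally routable."""
    return ipaddress.ip_address(ip_text).is_global


def check_outbound_url(url: str) -> str:
    """Return the URL unchanged if it looks safe to fetch, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parsed.scheme!r}")
    host = parsed.hostname
    if not host:
        raise ValueError("URL has no host")
    try:
        infos = socket.getaddrinfo(host, parsed.port or 443, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve host: {host}") from exc
    # Reject the URL if any address the host maps to is non-public.
    for info in infos:
        ip_text = info[4][0]
        if not is_public_address(ip_text):
            raise ValueError(f"{host} resolves to non-public address {ip_text}")
    return url
```

This check alone does not stop DNS rebinding, because the HTTP client resolves the name again when it connects. Pinning the connection to the validated address, or enforcing the same policy with egress filtering, closes that gap.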

Risk 5: Insecure Credential Storage

MCP servers need credentials to access backend systems. In many deployments, these credentials are stored in plain text in MCP client configuration files. Claude Desktop stores configuration in a JSON file in the user's home directory. VS Code stores it in workspace settings. These files are often readable by any process running as the same user.

Mitigation: Use environment variables or secret management systems for MCP credentials. Never commit MCP configuration files with credentials to version control. Rotate MCP server credentials on a regular schedule.
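
One way to catch plain-text credentials before they spread is to scan client configuration for them. The sketch below assumes a config shaped like the `mcpServers` layout used by Claude Desktop, with an optional `env` map per server. The key-name hints and the treatment of `${...}` values as references are our assumptions, so adjust both to the clients you actually run.

```python
import re

# Key names that suggest a secret; an illustrative heuristic, not a complete list.
SECRET_KEY_HINT = re.compile(r"(token|secret|password|api[_-]?key)", re.IGNORECASE)


def find_inline_secrets(config: dict) -> list:
    """Return 'server.env.KEY' entries whose values look like literal secrets."""
    findings = []
    for server_name, server in config.get("mcpServers", {}).items():
        for key, value in server.get("env", {}).items():
            # Assumption: a "${...}" value is a reference resolved elsewhere,
            # not the secret itself.
            is_reference = isinstance(value, str) and value.startswith("${")
            if SECRET_KEY_HINT.search(key) and value and not is_reference:
                findings.append(f"{server_name}.env.{key}")
    return findings
```

Running a check like this in a pre-commit hook is one way to keep such files out of version control.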

Building Secure MCP Servers

Based on our experience building the Safeguard MCP Server, here are principles we follow:

Input validation at every boundary. Do not trust that the AI agent will send well-formed requests. Validate every parameter against expected types and ranges. Reject requests that do not match the expected schema.
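
To make this concrete, here is a minimal sketch of strict argument parsing for a hypothetical `list_findings` tool. The tool name, the parameters, and the limits are invented for illustration and do not describe the Safeguard MCP Server's actual interface.

```python
import re
from dataclasses import dataclass

ALLOWED_KEYS = {"project_id", "limit"}
PROJECT_ID_PATTERN = re.compile(r"[a-z0-9-]{1,64}")


@dataclass(frozen=True)
class ListFindingsArgs:
    project_id: str
    limit: int


def parse_list_findings_args(raw: dict) -> ListFindingsArgs:
    """Validate raw tool arguments; raise ValueError on anything unexpected."""
    unexpected = set(raw) - ALLOWED_KEYS
    if unexpected:
        raise ValueError(f"unexpected parameters: {sorted(unexpected)}")
    project_id = raw.get("project_id")
    if not isinstance(project_id, str) or not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError("project_id must be 1-64 lowercase letters, digits, or hyphens")
    limit = raw.get("limit", 20)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 100:
        raise ValueError("limit must be an integer between 1 and 100")
    return ListFindingsArgs(project_id=project_id, limit=limit)
```

Rejecting unknown keys outright, rather than ignoring them, makes it harder for an agent to pass extra data through a tool call unnoticed.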

Scoped authentication. Our MCP server uses the same role-based access control as the Safeguard web interface. If a user does not have permission to view a project in the UI, they cannot query it through MCP.
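
The idea can be sketched as a single permission check that every entry point shares. The role data, user names, and handler below are hypothetical stand-ins; a real deployment would query the same authorization store that backs the web interface.

```python
# Hypothetical role data for illustration only.
PROJECT_VIEWERS = {
    "payments-api": {"alice", "bob"},
    "mobile-app": {"alice"},
}


def can_view_project(user: str, project: str) -> bool:
    """Single source of truth for 'may this user see this project?'."""
    return user in PROJECT_VIEWERS.get(project, set())


def handle_get_project(user: str, project: str) -> dict:
    """MCP tool handler that defers to the shared permission check."""
    if not can_view_project(user, project):
        # One message for both "forbidden" and "missing" avoids leaking
        # which project names exist.
        raise PermissionError("project not found or access denied")
    return {"project": project, "status": "ok"}
```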

Minimal context consumption. We designed our tools to require only the specific parameters they need, not broad conversation context. This limits data leakage.

Audit logging. Every tool call through our MCP server is logged with the user identity, parameters, and results. This is essential for incident investigation and compliance.
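
A minimal sketch of a structured audit record, using only the standard library. The field names and the `mcp.audit` logger name are our own choices for illustration. Where the log goes (file, syslog, a log pipeline) depends on how the logger is configured in your deployment.

```python
import json
import logging
import time

audit_log = logging.getLogger("mcp.audit")


def audit_record(user: str, tool: str, params: dict, result_summary: str) -> str:
    """Emit one JSON line per tool call and return it for convenience."""
    record = {
        "ts": time.time(),
        "user": user,
        "tool": tool,
        "params": params,
        "result": result_summary,
    }
    # default=str keeps logging from failing on values JSON cannot encode.
    line = json.dumps(record, sort_keys=True, default=str)
    audit_log.info(line)
    return line
```

Because parameters and results can themselves be sensitive, the audit log needs the same access controls as the data it describes.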

Rate limiting and anomaly detection. Unusual patterns of MCP tool usage -- high frequency, unusual parameter combinations, access to resources the user has never queried before -- trigger alerts.
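
The rate limiting half can be sketched as a per-user sliding window. The class below is a generic illustration under assumed limits, not a description of how Safeguard implements it. The optional `now` argument exists so the behavior can be exercised without waiting on a real clock.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_calls` per user within any `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = defaultdict(deque)  # user -> timestamps of recent calls

    def allow(self, user: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        calls = self._calls[user]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

A denied call is a natural point to raise an alert, since a burst of refusals for one user is itself an unusual usage pattern.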

The Bigger Picture

MCP is going to be the standard interface between AI agents and the tools they use. That makes MCP security a foundational concern, not a nice-to-have. The risks we outlined here are not reasons to avoid MCP -- they are reasons to deploy it thoughtfully, with the same rigor you would apply to any other interface that bridges trust boundaries.

How Safeguard.sh Helps

The Safeguard MCP Server is built with security as a first-class concern. It uses scoped authentication, input validation, audit logging, and minimal context consumption. It connects your AI assistant to your supply chain security data without introducing new risks. And because Safeguard tracks your entire software inventory, you can use it to audit what MCP servers your organization is deploying and ensure they meet your security standards.
