In the past twelve months, AI agents have gone from research demos to production infrastructure. Coding assistants write and commit code. DevOps agents provision cloud resources. Security copilots triage alerts and draft incident response plans. Customer service agents access databases, process refunds, and modify account settings.
Every one of these agents is a supply chain participant. They consume dependencies, call APIs, and produce artifacts that downstream systems trust. And almost none of them are subject to the supply chain security controls we apply to human developers.
This is a problem.
What Makes Agents Different
A traditional software supply chain has well-understood trust boundaries. A developer writes code, a build system compiles it, a pipeline tests and deploys it. At each stage, we can apply controls: code review, dependency scanning, SBOM generation, policy gates.
AI agents collapse these stages. A sufficiently capable agent can write code, select dependencies, configure build systems, and trigger deployments — all in a single autonomous session. The speed is impressive. The auditability is terrifying.
Consider a concrete example: a coding agent tasked with "add PDF export to the reporting module." The agent might:
- Search for PDF generation libraries
- Select one based on popularity and documentation quality
- Install it as a dependency
- Write integration code
- Generate tests
- Commit the changes
At no point does a human review the dependency selection. The agent does not check whether the library has known vulnerabilities, whether its maintainer is trustworthy, or whether it pulls in transitive dependencies with problematic licenses. The agent optimizes for functionality, not security.
The MCP Attack Surface
The Model Context Protocol (MCP) has become the dominant standard for connecting AI agents to external tools and data sources. MCP servers expose capabilities — file system access, database queries, API calls — that agents can invoke programmatically.
This architecture introduces supply chain risks at multiple levels:
MCP server provenance. When an organization deploys an MCP server, they are trusting that server's code to mediate between an AI agent and sensitive systems. Many MCP servers are community-built, with limited security review. A compromised or malicious MCP server can exfiltrate data, modify responses, or inject malicious instructions.
Tool poisoning. MCP servers advertise their capabilities through tool descriptions. A malicious server can craft descriptions that manipulate agent behavior — effectively prompt-injecting through the tool layer. This is a new class of supply chain attack that did not exist before the agent era.
Credential exposure. Agents need credentials to interact with external systems. MCP servers often store or proxy these credentials. The security of the credential management layer varies wildly across implementations.
Transitive trust. When Agent A calls MCP Server B, which queries API C, which returns data from System D — who is responsible for the integrity of the final result? The trust chain is long and often opaque.
Dependency Hallucination Attacks
We touched on this in our supply chain trends piece, but it deserves deeper treatment. LLMs hallucinate package names. This is not a bug — it is an inherent property of probabilistic text generation.
Researchers at Vulcan Cyber documented over 100 package names that popular LLMs consistently hallucinate. When these phantom packages are registered on npm or PyPI by an attacker, any agent that follows the LLM's suggestion will install malicious code.
The attack is elegant in its simplicity:
- Identify package names that LLMs frequently hallucinate
- Register those names on public registries
- Wait for AI agents to install them
Defenses against this attack are still immature. Package registries could implement reservation systems for hallucination-prone names. Organizations could maintain allowlists of approved packages. Agents could be required to verify package existence and security posture before installation. But today, most environments have none of these controls.
The Audit Trail Problem
When a human developer makes a decision, there is usually a trail: a PR description explaining why a dependency was chosen, a commit message describing the change, maybe a Slack thread where alternatives were discussed.
AI agents produce minimal audit trails. The agent picked Library X over Library Y, but the reasoning is embedded in the model's weights, not in a document. This makes post-incident forensics extremely difficult.
If an agent introduces a vulnerable dependency that later gets exploited, the investigation needs to answer: Why was this dependency selected? Were alternatives considered? What information did the agent have at the time of the decision? With current tooling, these questions are largely unanswerable.
Practical Defenses
None of this means organizations should stop using AI agents. The productivity gains are real and significant. But the security controls need to catch up. Here is what we recommend:
Dependency allowlists. Do not let agents install arbitrary packages. Maintain a curated list of approved dependencies and require human approval for anything outside the list.
SBOM-aware agent workflows. Generate SBOMs at every stage of agent-driven development. If an agent adds a dependency, the SBOM should be updated and evaluated against policy gates before the change is accepted.
MCP server governance. Treat MCP servers like any other third-party software. Review their source code, validate their provenance, and apply the principle of least privilege to their capabilities.
Mandatory human review gates. For any agent action that modifies the dependency tree, require a human review step. This is not about slowing down — it is about maintaining the trust boundaries that supply chain security depends on.
Agent activity logging. Implement comprehensive logging of agent decisions, including which tools were invoked, which dependencies were considered, and which data sources were consulted. This provides the audit trail that agents do not naturally produce.
The Bigger Picture
AI agents are the newest participants in the software supply chain, and they do not play by the same rules as human developers. They are faster, less cautious, and harder to audit. The security frameworks we have built over the past decade — SBOMs, SLSA, Sigstore, policy gates — are still relevant, but they need to be extended to cover agent-driven workflows.
The organizations that figure out how to get the productivity benefits of AI agents without the security blind spots will have a significant competitive advantage. The ones that deploy agents without controls will eventually learn the lesson the hard way.
How Safeguard.sh Helps
Safeguard's policy gates work at the dependency level, regardless of whether a human or an AI agent introduced the dependency. Our platform can enforce allowlists, block packages with known vulnerabilities, and require SBOM validation before any change reaches production. For organizations deploying MCP servers, Safeguard provides provenance verification and capability auditing. The goal is to bring agent-driven development under the same supply chain security umbrella as human-driven development — because the risks are the same, even if the speed is different.