AI Security

RAG Pipeline Supply Chain Attacks: Vector DBs and More

RAG pipelines have at least eight supply chain surfaces, and most teams are only watching one. Here is how the attacks actually look in production.

Shadab Khan
Security Engineer
7 min read

Retrieval-augmented generation has become the default pattern for grounding LLMs in enterprise data, which means it has also become a high-value target. The hard part of securing RAG is that the attack surface spans eight distinct components, and teams typically staff and monitor only one of them. This post walks through the full attack surface and shows how attacks against each component look when they land.

Where are the actual supply chain seams in a RAG pipeline?

Document ingestion, parsers, chunkers, embedding models, vector databases, retrieval logic, rerankers, and the model itself. Each is an external dependency with its own maintainers, release cadence, and transitive tree. A typical Python RAG stack brings in unstructured, langchain or llama-index, an embedding library like sentence-transformers, a vector DB client (pgvector, Weaviate, Qdrant, Pinecone), and a reranker. Every one of those is a supply chain surface.

When Ultralytics was compromised on PyPI in late 2024, it was exactly the kind of package many RAG teams had sitting in their dependency tree because of CV-adjacent preprocessing. A similar compromise in unstructured or sentence-transformers would hit most production RAG systems. The answer to "where could we get compromised" should not start with "the vector DB." It should start with the full dependency graph, with the vector DB as one node in it.

How does embedding model poisoning work?

An attacker publishes an embedding model (or a malicious update to a popular one) whose embeddings cluster attacker-selected documents near legitimate queries. When the retrieval step runs top-k similarity, attacker content surfaces alongside or instead of the intended documents. The LLM then treats that content as authoritative source material and repeats whatever instructions or misinformation it contains.
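To make the mechanics concrete, here is a toy sketch of the top-k retrieval step (numpy only; the names are illustrative, not from any particular framework). The point is that nothing in the similarity math distinguishes a vector that is close because the document is relevant from one that is close because a poisoned model was trained to put it there:

```python
import numpy as np

def top_k(query: np.ndarray, doc_vectors: np.ndarray, k: int = 3) -> np.ndarray:
    """Indices of the k documents most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    d = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    return np.argsort(d @ q)[::-1][:k]

# A poisoned embedding model never needs to touch this code. It only has
# to emit a vector for the attacker's document that lands closer to common
# query vectors than the legitimate documents do, and top_k does the rest.
```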

Hugging Face saw recurring variants of this in 2025. Some were research demonstrations; some were real. The detection problem is hard because the model weights are a mostly opaque blob of floats. You cannot grep them. What you can do is fingerprint the model (hash the weights, check the signature if present), track which version of which model is in production, and verify provenance against an approved list. If a team is using sentence-transformers/all-MiniLM-L6-v2, the hash should match what Hugging Face serves, and any downgrade to an older version should trigger an alert.
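A minimal sketch of that fingerprinting check, assuming locally stored weights and a hand-maintained allowlist (the hash below is a placeholder, not a real digest):

```python
import hashlib
from pathlib import Path

# Hypothetical allowlist: model name -> expected SHA-256 of its weights file.
# In practice this lives in config management, not in source code.
APPROVED_MODELS = {
    "sentence-transformers/all-MiniLM-L6-v2": "<expected sha256 hex digest>",
}

def fingerprint_weights(weights_path: Path) -> str:
    """Stream-hash the weights file so large models never load into memory."""
    digest = hashlib.sha256()
    with weights_path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()

def verify_model(name: str, weights_path: Path) -> None:
    expected = APPROVED_MODELS.get(name)
    if expected is None:
        raise RuntimeError(f"{name} is not on the approved model list")
    actual = fingerprint_weights(weights_path)
    if actual != expected:
        # A mismatch covers both tampered weights and silent downgrades.
        raise RuntimeError(f"weight hash mismatch for {name}: {actual}")
```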

Running your own embedding inference on pinned, verified weights removes most of this attack. Calling a hosted embedding API is convenient, but you trust the provider's supply chain end-to-end, which is a choice to make explicitly rather than by default.

What can happen at the vector database layer?

Two classes of problems. First, the vector DB is software with its own dependencies, and those dependencies can be compromised like anyone else's: the Qdrant Rust crates, the Weaviate Go modules, and the pgvector Postgres extension all have their own trees. Second, the vector DB stores data that downstream consumers treat as trustworthy, which makes it a juicy persistence point for an attacker.

If an attacker can insert rows into your vector store, they can plant content that the retrieval step will surface to the model. That is not a vulnerability in the vector DB; it is an authorization problem in whatever service writes to the DB. Most RAG systems have a single "indexer" service that writes to the vector store, and the auth story on that indexer is often weaker than on user-facing services. Compromising the indexer (or tricking it through a poisoned source document) is how you plant payloads that the model will read later.

Vector stores should be treated as tier-one datastores for review purposes. Write paths need the same auditing as your primary database, not the hand-wave that "it's just embeddings."
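One way to enforce that is to funnel every write through a single audited function. A sketch, where `client.upsert` stands in for whichever vector DB client is in use (Qdrant, Weaviate, a pgvector wrapper), so its signature is an assumption:

```python
import json
import logging
import time
from dataclasses import asdict, dataclass

audit_log = logging.getLogger("vector_store.writes")

@dataclass
class Chunk:
    doc_id: str
    chunk_id: str
    source: str          # where the source document came from
    content_sha256: str  # digest of the raw chunk text

def audited_upsert(client, collection: str, chunk: Chunk, vector, principal: str):
    """Append an audit record before every vector store write.

    The invariant worth enforcing is architectural, not cryptographic:
    no write path in the indexer bypasses this function.
    """
    audit_log.info(json.dumps({
        "ts": time.time(),
        "principal": principal,  # which service identity performed the write
        "collection": collection,
        **asdict(chunk),
    }))
    client.upsert(collection, chunk.chunk_id, vector, payload=asdict(chunk))
```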

How do chunk injection attacks work in practice?

The attacker causes a document to be indexed such that, when retrieved as context, it steers the model toward a desired action. This often rides on the same channel as indirect prompt injection, but it is worth separating because the injection surface is specifically the chunked, embedded form of the document rather than the raw document. A chunker that splits on paragraph boundaries will produce different payloads than one that splits on sentence boundaries, and the attack can be tailored to whichever chunking strategy the system uses.
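A toy comparison of two splitting strategies shows why a payload gets tailored to the chunker. Both functions are simplified stand-ins for real splitters:

```python
import re

def chunk_paragraphs(text: str) -> list[str]:
    """Split on blank lines: one paragraph becomes one chunk."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def chunk_sentences(text: str, max_sents: int = 3) -> list[str]:
    """Split into fixed-size sentence windows."""
    sents = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(sents[i:i + max_sents]) for i in range(0, len(sents), max_sents)]

# Under paragraph splitting, a planted paragraph reaches the model as one
# self-contained chunk; under sentence splitting it is diluted across chunk
# boundaries. That difference is why a tailored payload targets whichever
# strategy the pipeline actually uses.
```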

A real-world pattern: an attacker contributes a documentation PR to an open source project. The PR is innocuous to a human reviewer but contains a paragraph that, when isolated by a chunker and retrieved in the right context, instructs the model to include a specific dependency in any code it generates. If the project's RAG-based assistant retrieves that chunk during a coding session, the generated code carries the attacker's dependency. This pattern has been seen in research and at least once in the wild.

The mitigation is content review that specifically looks for instruction-shaped paragraphs, plus deduping and comparison against known-good sources. The chunker itself can also help by limiting chunk autonomy: each chunk retrieved should carry enough surrounding context that isolated instructions are less potent.
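As a starting point, the ingestion-boundary check can be a few regexes over each chunk. This is a deliberately crude sketch; a production filter would be a trained classifier, and the patterns here are illustrative, not a vetted ruleset:

```python
import re

# Crude signatures of "instruction-shaped" text aimed at a model rather
# than a human reader.
INSTRUCTION_PATTERNS = [
    re.compile(r"\bignore (all|any|previous|prior) (instructions|context)\b", re.I),
    re.compile(r"\b(you are|act as|pretend to be) (an?|the) \w+", re.I),
    re.compile(r"\balways (include|add|import|install)\b", re.I),
    re.compile(r"\bdo not (mention|reveal|tell)\b", re.I),
]

def flag_instruction_shaped(chunk_text: str) -> list[str]:
    """Return the patterns a chunk matches; flagged chunks go to human review."""
    return [p.pattern for p in INSTRUCTION_PATTERNS if p.search(chunk_text)]
```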

What about the retrieval logic itself?

Retrieval ranking is code, and in many stacks it is custom code that touches both the query and the returned documents. That code is a compact attack surface. A subtle bug in the scoring function, or a subtly malicious commit to a reranker library, can bias retrieval toward attacker content without any embedding poisoning. Most teams do not treat reranker updates with the same scrutiny as model updates, which is an error.

Pin your reranker, diff its behavior on a stable eval set before upgrading, and include it in your SBOM. Cohere's rerankers, BGE, and various community ones all publish regularly; the update cadence alone should make you cautious about auto-pulling new versions into production.
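The behavioral diff can be as simple as measuring how much of the old top-k survives under the new reranker on a fixed eval set. The reranker callables below are stand-ins for whatever API is in use; each is assumed to take a query and candidate documents and return doc ids in ranked order:

```python
def overlap_at_k(old_ranking: list[str], new_ranking: list[str], k: int = 5) -> float:
    """Fraction of the old top-k that survives in the new top-k."""
    return len(set(old_ranking[:k]) & set(new_ranking[:k])) / k

def diff_rerankers(eval_set, rerank_old, rerank_new, threshold: float = 0.8):
    """Gate a reranker upgrade on ranking stability over a fixed eval set."""
    regressions = []
    for query, candidates in eval_set:
        overlap = overlap_at_k(rerank_old(query, candidates),
                               rerank_new(query, candidates))
        if overlap < threshold:
            regressions.append((query, overlap))
    return regressions  # a nonempty result blocks the version bump
```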

How do we detect this in production?

Continuous monitoring of retrieval quality on known-good queries, alerts on unusual retrieval patterns (sudden spikes in which documents get retrieved, or embeddings drifting outside expected clusters), and audit logging of every write to the vector store. The operational signal that usually surfaces a RAG compromise first is not a security alert. It is users noticing the assistant is saying odd things. Shortening the distance between that signal and your incident response is high leverage.
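For the embedding-drift piece, a centroid-distance check over recently indexed vectors is a cheap first signal. A sketch, with an illustrative threshold that would need tuning on historical data:

```python
import numpy as np

def drift_score(baseline: np.ndarray, recent: np.ndarray) -> float:
    """Cosine distance between the centroid of a baseline embedding sample
    and the centroid of recently indexed embeddings. 0 means no drift."""
    b, r = baseline.mean(axis=0), recent.mean(axis=0)
    cos = float(np.dot(b, r) / (np.linalg.norm(b) * np.linalg.norm(r)))
    return 1.0 - cos

# The useful part is alerting on a sudden jump, which usually means a new
# embedding model version or bulk-indexed content that does not belong.
ALERT_THRESHOLD = 0.05  # illustrative value, not a recommendation
```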

SBOMs for RAG systems should include the embedding model version and hash, the vector DB version, the reranker model, and the full Python or Node dependency tree of the orchestration layer. Most teams stop at the orchestration layer and leave the model components out, which makes incident response harder than it needs to be.
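As an illustration, an SBOM fragment that carries the model alongside the packages might look like the following. The dict loosely mirrors CycloneDX's component shape (which added a machine-learning-model component type in spec 1.5); the versions, revision, and digest are placeholders:

```python
import json

sbom = {
    "components": [
        {"type": "library", "name": "sentence-transformers", "version": "<pinned>"},
        {"type": "library", "name": "qdrant-client", "version": "<pinned>"},
        {
            "type": "machine-learning-model",  # CycloneDX 1.5+ component type
            "name": "sentence-transformers/all-MiniLM-L6-v2",
            "version": "<pinned model revision>",
            "hashes": [{"alg": "SHA-256", "content": "<weights digest>"}],
        },
    ]
}
print(json.dumps(sbom, indent=2))
```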

What does a realistic hardening checklist look like?

Seven items, in rough order of return on investment:

1. Pin every dependency in the orchestration stack to an exact version and verify hashes on install.
2. Run your own embedding inference on locally stored, signature-verified weights.
3. Treat the vector store as a tier-one datastore with authenticated writes and full audit logs.
4. Put a content classifier at the ingestion boundary that flags instruction-shaped paragraphs and known injection patterns.
5. Diff reranker behavior on a stable evaluation set before any version bump.
6. Include model and content components in the SBOM, not just the Python packages.
7. Monitor retrieval quality and embedding-space drift on a schedule, with alerting thresholds tied to an on-call rotation.

Every one of these is boring; every one of these is the kind of boring that prevents a quiet compromise from persisting for months.

How Safeguard.sh Helps

Safeguard.sh's reachability analysis cuts 60 to 80 percent of the noise from SCA scans across the full RAG stack: orchestration libraries, vector DB clients, embedding dependencies, and rerankers. Griffin AI surfaces anomalies in embedding model provenance and flags content-level injection patterns in indexed documents alongside code-level CVEs. SBOMs include model and content components alongside traditional packages, with dependency resolution up to 100 levels deep to catch transitive compromises in libraries like unstructured or sentence-transformers. Container self-healing rebuilds RAG service images when downstream dependencies ship fixes, so pipelines do not stay exposed between release windows.
