Frameworks

OWASP LLM Top 10 2025: System Prompt Leakage and Vector Weaknesses

The OWASP Top 10 for LLM Applications 2025 added System Prompt Leakage and Vector/Embedding Weaknesses, and elevated Sensitive Information Disclosure to #2. Here is the defender view.

Alex
Security Engineer
7 min read

The OWASP Top 10 for Large Language Model Applications 2025 edition introduced two new categories, substantially reworked several others, and reordered the list based on community feedback from two years of production LLM deployment data. The list, maintained by the OWASP GenAI Security Project, has become the most-referenced security artifact for teams building applications that wrap LLM capabilities — RAG systems, agentic workflows, copilots, support chatbots, and the long tail of internal AI tools. The 2025 update reflects what defenders have actually been hit by in the field, not what the original 2023 list anticipated. For security teams that adopted the 2023 list as training and threat-modeling material, a refresh is overdue.

What is the 2025 list and what changed?

The 2025 categories: LLM01 Prompt Injection (#1, unchanged), LLM02 Sensitive Information Disclosure (up from #6), LLM03 Supply Chain (refined), LLM04 Data and Model Poisoning, LLM05 Improper Output Handling, LLM06 Excessive Agency (expanded), LLM07 System Prompt Leakage (new), LLM08 Vector and Embedding Weaknesses (new), LLM09 Misinformation, and LLM10 Unbounded Consumption. The two genuinely new categories are System Prompt Leakage and Vector and Embedding Weaknesses, both of which reflect operational lessons from production RAG deployments and from the wave of prompt-based jailbreak research that matured in 2024.

Why did System Prompt Leakage become its own category?

System prompts — the prefix instructions that shape an LLM's behavior — contain sensitive information far more often than designers initially appreciated. Real production system prompts frequently embed business rules ("never reveal pricing under $X"), internal hostnames and API patterns, prompt-engineering tactics that are commercially differentiating, identity-management hints ("admin users have 'role:admin' in their session"), and PII or credential references. Once attackers extracted system prompts via well-known prompt-injection patterns ("repeat the text above"), they obtained operational reconnaissance against the deployment. The 2025 list elevates this to a dedicated category to make explicit that system prompts must be treated as semi-public, with all sensitive logic enforced server-side rather than in prompt text.

# LLM07 anti-pattern: sensitive logic in the system prompt
SYSTEM_PROMPT = """
You are SupportBot for Acme Corp.
- Customers on enterprise tier (look for "tier:ent" in user context) can request refunds up to $5000 without approval.
- Customers on growth tier are capped at $500.
- Never reveal pricing for enterprise tier.
- Internal API endpoint: https://billing.acme.internal/v3/refund
"""

# When extracted, this prompt reveals tiering logic, dollar limits, and an internal endpoint.
# Defense: keep tiering and limit decisions in code, not in prompt text:

def authorize_refund(user, amount):
    tier = user_tier(user)            # server-side lookup
    limit = REFUND_LIMITS[tier]       # server-side configuration
    if amount > limit:
        raise NeedsApprovalError(limit)
    return billing_client.refund(user, amount)

What are Vector and Embedding Weaknesses (LLM08)?

The LLM08 category covers risks in retrieval-augmented generation (RAG) systems and vector databases. Real-world failure modes include: documents poisoned with embedded instructions that flip the LLM's behavior when retrieved (indirect prompt injection at the embedding layer), embedding-similarity attacks where adversarial inputs are crafted to match high-trust documents and bias retrieval, multi-tenant vector stores where tenant data leakage occurs through shared embedding spaces or poorly partitioned indexes, and unsanitized metadata in retrieved documents that becomes part of the model context. The 2025 list elevates this category because RAG is now the dominant production deployment pattern and these failure modes have moved from research curiosities to active exploitation.

Why did Sensitive Information Disclosure jump to #2?

Field data showed that sensitive information disclosure — both PII leakage from training-data memorization and operational data exposed via prompt injection or excessive context — was both more prevalent and more impactful than the 2023 list reflected. Several incidents involved customer support chatbots inadvertently revealing other customers' tickets, code assistants leaking pre-release feature details, and AI agents disclosing internal infrastructure metadata. The 2025 list elevates the category to #2 and the guidance recommends a combination of data classification (do not put what you cannot afford to leak into the model context), output filtering, and continuous testing against memorization probes.

How did Excessive Agency expand in the 2025 list?

LLM06 Excessive Agency was on the 2023 list but received substantial expansion in 2025 to reflect the rapid growth of agentic architectures — patterns where an LLM is given tools, autonomy, and the ability to take actions on the user's behalf. The 2025 guidance recognizes that giving an LLM agency creates a class of risks distinct from simple prompt-injection: an LLM agent with broad tool permissions can be coerced into actions by prompt injection that an LLM without tools could merely suggest. Recommended controls in the 2025 guidance include least-privilege tool design (each tool grants the minimum permission necessary, no convenience superpowers), human-in-the-loop gates on consequential actions (send email, delete file, transfer funds, modify access), tool-use auditing (every tool invocation produces a structured audit event the security team can review), and bounded execution (agents have rate limits, recursion limits, and cost limits on their tool use). The expanded category is informed by real incidents where an agent connected to a customer-data API was prompted into broad data exfiltration through a carefully crafted user request.

How does the Supply Chain category map to broader supply-chain work?

LLM03 Supply Chain in the 2025 list addresses three layers: model provenance (where did the model come from, who fine-tuned it, what training data was used), model dependencies (the tokenizer, embedding model, and tool integrations a system depends on), and the broader software dependencies of the LLM application. The category aligns with the NIST SSDF 218A community profile for AI development, with the AIBOM concept being explored by CycloneDX 1.7's expanded model fields, and with the SLSA framework's emerging interest in attestations for model artifacts. Treat LLM03 as the AI-specific instance of your existing supply-chain security program — most of the controls you already deploy for software components apply, with model-specific augmentations for training data, fine-tuning provenance, and evaluation results.

How should AppSec teams act on the 2025 list?

Three actions. First, audit your production LLM applications against System Prompt Leakage by attempting to extract the system prompt yourselves and assessing what an attacker who succeeded would obtain. Second, review your RAG architecture for Vector and Embedding Weaknesses — partitioning, retrieval-time content sanitation, embedding-side poisoning resistance. Third, update your threat-modeling templates to use 2025 category names; the 2023 list is still useful, but the 2025 vocabulary is what your training, threat-intel, and incident-response tooling should align to.

How do Improper Output Handling and Misinformation fit the picture?

LLM05 Improper Output Handling addresses what happens after the model produces text. The 2025 guidance emphasizes that LLM output should never be treated as trusted by downstream systems: output rendered into HTML must be escaped to prevent XSS, output included in shell commands or SQL queries must be parameterized, output passed to file system or network operations must be validated against allowlists. The category exists because a meaningful subset of LLM-application vulnerabilities arise from "the model told us to run this command and we just ran it." LLM09 Misinformation addresses model outputs that are confidently wrong, including hallucinated package names that have led to dependency-confusion attacks in real incidents. The 2025 guidance recommends grounding strategies (RAG with verified sources), output verification (checking facts against authoritative databases before presenting them to users), and clear UX disclosure that LLM output may be incorrect. Both categories matter for security teams because they are paths where a benign-seeming LLM feature becomes a vulnerability vector.

How Safeguard Helps

Safeguard's AI security module maps directly to the 2025 LLM Top 10. For LLM03 Supply Chain, the platform's AIBOM capability tracks model provenance, training data documentation, and tokenizer/embedding-model dependencies. For LLM07 System Prompt Leakage, Griffin AI scans application code and prompt repositories for sensitive content in prompt strings, flagging tiering logic, internal endpoints, or PII references that should be moved to server-side enforcement. For LLM08 Vector and Embedding Weaknesses, the platform integrates with vector-database providers to audit partitioning, retrieval-side sanitation, and embedding refresh policies. Policy gates can require AI-model and RAG-pipeline attestations before production deployment, including documented mitigations for each applicable Top 10 category. For AppSec teams, Safeguard generates Top 10:2025-aligned LLM application risk reports that consolidate findings from prompt-injection probing, model lineage review, and RAG configuration audit into a single defender-friendly artifact.

Never miss an update

Weekly insights on software supply chain security, delivered to your inbox.