AI Security
LLM Output Filtering as a Security Control
Output filters are the last line before the user and the tool call. We cover when they work, when they fail, and how to measure them honestly in production.
Feb 5, 2026 · 5 min read