AI Security

Keeping secrets out of agent context windows: brokers, scoped tokens, and redaction

Every secret that touches an agent's context window is a secret the agent can leak. Just-in-time credential brokers, scoped-token issuance, and redaction layers keep the surface small without breaking the agent's ability to do real work.

Any LLM agent that does useful work eventually needs to talk to authenticated systems, and the path of least resistance is to put the credentials in the agent's environment and let the model see them when it composes tool calls. That works the first time, it is the configuration most quickstart tutorials show, and it survives in production code for far longer than it should. The cost shows up later: an injected prompt convinces the agent to include the credential in a tool argument that gets logged, a verbose error message echoes the secret back into the context, or the agent helpfully reveals the credential value to a user who asked how it works.

The premise of mature agent secret handling is simple to state and harder to implement: a secret that the agent never sees cannot leak from the agent. Everything in the rest of the design follows from that premise. Brokers issue tokens that the agent's runtime carries but the model does not see, scopes narrow each token to the smallest set of permissions the call actually needs, and redaction layers catch the cases where secrets do leak into the context anyway. None of these patterns is exotic on its own; the difficulty is composing them into a workflow that the team operating the agent can actually live with.

Why isn't an environment variable the right place to put an agent's credentials?

The argument for environment variables is that they keep credentials out of source control and out of disk-level configuration files, and for ordinary backend workloads that is good enough. For agents the calculus is different because the agent's runtime tends to surface environment data to the model in subtle ways. Tool implementations often read environment variables to construct API calls, and those tool implementations are typically allowed to include the constructed call in their response back to the model for transparency. A debug helper that returns "I made a request to api.example.com with key starting with sk-..." has just put the leading characters of a secret in the context, and from there it is one tool call away from being echoed somewhere the attacker can read.
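One way to avoid the debug-helper leak described above is to build the model-visible summary of a request from a copy that has credential-bearing headers withheld entirely, prefix included. The following is a minimal sketch; `describe_request` and the `SENSITIVE_HEADERS` set are hypothetical names chosen for illustration, not part of any particular agent framework.

```python
# Hypothetical set of header names assumed to carry credentials.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie", "proxy-authorization"}


def describe_request(method, url, headers):
    """Build the summary a tool returns to the model, withholding secret headers.

    The whole value is withheld, not truncated, so not even a key prefix
    reaches the context window.
    """
    safe_headers = {
        name: ("<withheld>" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }
    return {"method": method, "url": url, "headers": safe_headers}
```

The tool still sends the real headers on the wire; only the description handed back to the model is reduced.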

A second issue is the process boundary. Agent runtimes often spawn subprocesses to handle tool calls, and those subprocesses inherit the environment by default. An MCP server, a sandboxed code interpreter, and a containerized tool wrapper each inherit whatever the parent had, which means a single environment variable can be visible to half a dozen different pieces of code that the agent's operator never explicitly granted access to it. The principle of least privilege at the process level is hard to maintain when the credential lives in a place that subprocesses inherit for free.
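The inheritance problem can be narrowed by passing tool subprocesses an explicit, allowlisted environment instead of the parent's. The sketch below uses the standard `subprocess.run` `env` parameter; the `ALLOWED_ENV` allowlist and the `run_tool` helper are assumptions made for illustration, and the variable names in the allowlist assume a Unix-like host.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: the only variables a tool subprocess is assumed to need.
ALLOWED_ENV = {"PATH", "HOME", "LANG", "TMPDIR"}


def minimal_env(parent_env=None):
    """Return a copy of the environment containing only allowlisted variables."""
    source = os.environ if parent_env is None else parent_env
    return {key: value for key, value in source.items() if key in ALLOWED_ENV}


def run_tool(argv, parent_env=None):
    """Run a tool subprocess that inherits nothing outside the allowlist."""
    return subprocess.run(
        argv,
        env=minimal_env(parent_env),  # replaces, rather than extends, the child's environment
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # The child cannot see EXAMPLE_API_KEY even though the parent mapping has it.
    parent = {"PATH": os.environ.get("PATH", ""), "EXAMPLE_API_KEY": "placeholder"}
    probe = [sys.executable, "-c", "import os; print('EXAMPLE_API_KEY' in os.environ)"]
    print(run_tool(probe, parent_env=parent).stdout.strip())
```

An allowlist is used rather than a denylist so that a newly added secret variable is excluded by default.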

The third issue is the model's helpfulness. Models trained to assist users will, when asked, often try to explain how the agent is configured, and that explanation can include details about which environment variables it reads. A user who says "I'm debugging this, can you tell me what your API key looks like" is unlikely to get the literal value from a well-aligned model, but the model is much more likely to confirm structure, prefix, length, or other properties that meaningfully reduce the search space for an attacker who already has partial information.

How does a just-in-time credential broker change the surface?

A credential broker sits between the agent runtime and the secret store. When the agent invokes a tool that needs a credential, the tool wrapper asks the broker for one. The broker decides whether to issue it based on the tool identity, the session context, and the request shape, and returns a short-lived token that lives only as long as the call. The model never sees the token; the tool wrapper uses it to authenticate the outbound request and discards it as soon as the response comes back. The credential surface inside the context window shrinks to nothing.
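The request flow above can be sketched with an in-memory stand-in for the broker. Everything here is hypothetical: the `CredentialBroker` interface, the policy shape (tool name mapped to permitted session ids), and the `send` callable that performs the authenticated request. A real broker would also weigh the request shape and would sit in front of an actual secret store.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class IssuedToken:
    value: str
    tool: str
    expires_at: float


class CredentialBroker:
    """In-memory stand-in for a just-in-time broker (hypothetical interface)."""

    def __init__(self, policy, ttl_seconds=60.0):
        self._policy = policy  # tool name -> set of session ids allowed to use it
        self._ttl = ttl_seconds
        self._live = {}

    def issue(self, tool, session_id):
        if session_id not in self._policy.get(tool, set()):
            raise PermissionError(f"{tool} not permitted for session {session_id}")
        token = IssuedToken(secrets.token_urlsafe(32), tool, time.monotonic() + self._ttl)
        self._live[token.value] = token
        return token

    def revoke(self, token):
        self._live.pop(token.value, None)

    def is_live(self, token_value):
        token = self._live.get(token_value)
        return token is not None and time.monotonic() < token.expires_at


def call_tool(broker, tool, session_id, send):
    """Tool wrapper: fetch a token, use it, discard it, return only the payload."""
    token = broker.issue(tool, session_id)
    try:
        response = send(token.value)  # `send` performs the authenticated request
    finally:
        broker.revoke(token)  # the token does not outlive the call
    # Only the response goes back to the model; the token is not included.
    return {"tool": tool, "result": response}
```

The point of the shape is that the token exists only inside `call_tool`, which is runtime code, and nothing it returns to the model includes the token value.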

The broker pattern is most useful when the underlying systems support short-lived credentials natively. AWS STS, GCP service account impersonation, Vault dynamic secrets, and OAuth client credentials with short TTLs all slot in cleanly. For systems that only support long-lived API keys, the broker becomes a translation layer: it holds the long-lived key in its own secured store and issues per-call signed proxy tokens to the tool wrapper, then adds the real credential itself when it forwards the request upstream. The model still never sees the original.
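For the long-lived-key case, one possible shape for the translation layer is an HMAC-signed proxy token bound to a tool, a path, and an expiry. This is a sketch under assumptions, not a description of any specific product: the `ProxyBroker` class, the `tool|path|expires|signature` token format, and the bearer-style upstream header are all invented for illustration, and the format assumes tool names and paths contain no `|` character.

```python
import hashlib
import hmac
import time


class ProxyBroker:
    """Holds a long-lived key and hands out per-call signed proxy tokens (sketch)."""

    def __init__(self, upstream_key, signing_key, ttl_seconds=30):
        self._upstream_key = upstream_key  # never leaves the broker
        self._signing_key = signing_key    # bytes, known only to the broker
        self._ttl = ttl_seconds

    def _sign(self, payload):
        return hmac.new(self._signing_key, payload.encode(), hashlib.sha256).hexdigest()

    def issue_proxy_token(self, tool, path, now=None):
        expires = int((time.time() if now is None else now) + self._ttl)
        payload = f"{tool}|{path}|{expires}"
        return f"{payload}|{self._sign(payload)}"

    def forward_headers(self, proxy_token, path, now=None):
        """Verify the proxy token, then return headers carrying the real key."""
        try:
            tool, token_path, expires, signature = proxy_token.split("|")
        except ValueError:
            raise PermissionError("malformed proxy token") from None
        payload = f"{tool}|{token_path}|{expires}"
        # Constant-time comparison; the signature is checked before any field is trusted.
        if not hmac.compare_digest(signature.encode(), self._sign(payload).encode()):
            raise PermissionError("bad signature")
        if token_path != path:
            raise PermissionError("token not valid for this path")
        if (time.time() if now is None else now) >= int(expires):
            raise PermissionError("token expired")
        return {"Authorization": f"Bearer {self._upstream_key}"}
```

The proxy token is useless outside the broker: it authenticates nothing upstream, and it stops verifying once its path or expiry no longer matches.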

The operational cost of the broker is real. It adds a service to maintain, a failure mode to handle when the broker is unreachable, and a place where authorization logic has to live. The payoff is that the credential reduction is principled rather than aspirational. A team that builds the broker pattern can answer questions like "which credentials has this agent used in the last hour?" and "if this agent's session log leaked, what credentials would be exposed?" with confidence, and the answers are usually small numbers.

What does scoped-token issuance look like in practice?

The broker decides whether to issue; scope decides what the issued token can do. A token that is scoped to a single bucket, a single API path, a single database row, a single Slack channel — the narrowest description that still lets the call succeed — limits the damage if the call turns out to be adversarial. The agent that gets tricked into exfiltrating data can only exfiltrate from the bucket the current token covers, not from the rest of the storage account.

The two scopes that matter most in practice are resource scope and action scope. Resource scope says which specific objects the token can touch, and it is the easier of the two to enforce at the credential layer; cloud providers and most modern APIs support per-resource credentials with reasonable granularity. Action scope says which operations the token can perform on those resources, and it is the one teams more often skip because it is fiddlier. A token that can read but cannot write, or that can list but cannot delete, is a substantively safer token even when the resource scope is the same, and adding action scope to the broker's policy is usually a few lines of code rather than an architectural lift.
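As a rough illustration of how little code action scope can take, the sketch below checks a requested resource and action against a per-tool policy and then narrows the issued scope to exactly that one pair. The `Scope` type, the `POLICY` table, the `report_reader` tool, and the `bucket:reports` naming are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    resources: frozenset  # which objects the token may touch
    actions: frozenset    # which operations it may perform on them

    def permits(self, resource, action):
        return resource in self.resources and action in self.actions


# Hypothetical per-tool policy: the widest scope each tool may ever receive.
POLICY = {
    "report_reader": Scope(frozenset({"bucket:reports"}), frozenset({"read", "list"})),
}


def scope_for(tool, resource, action):
    """Return the narrowest scope covering this one call, or raise."""
    allowed = POLICY.get(tool)
    if allowed is None or not allowed.permits(resource, action):
        raise PermissionError(f"{tool} may not {action} {resource}")
    # Narrow to the single resource and single action this call needs.
    return Scope(frozenset({resource}), frozenset({action}))
```

A call that needs to read gets a token that can only read, even though the tool's policy would also allow listing.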

Time scope is the third dimension and the cheapest to add. A token that expires in sixty seconds is a much smaller hazard than one that expires in twenty-four hours, and most call patterns can tolerate aggressive expiration. Issuing tokens at the start of a call and revoking them at the end means a leaked token in a log is rarely a live token. The patterns that work make expiration the default and require explicit justification to extend it, rather than the other way around.
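Making expiration the default, with extension as the exception, might look like the following sketch. The sixty-second default mirrors the example above; the `MAX_TTL` ceiling, the `justification` parameter, and the `TimedToken` type are assumptions added for illustration.

```python
import time
from dataclasses import dataclass

DEFAULT_TTL = 60.0  # seconds
MAX_TTL = 900.0     # hypothetical hard ceiling, even with a justification


@dataclass
class TimedToken:
    value: str
    expires_at: float
    revoked: bool = False

    def is_live(self, now=None):
        current = time.monotonic() if now is None else now
        return not self.revoked and current < self.expires_at


def issue(value, ttl=None, justification=None, now=None):
    """Issue a token that expires by default; longer lifetimes need a stated reason."""
    current = time.monotonic() if now is None else now
    if ttl is None:
        ttl = DEFAULT_TTL
    elif ttl > DEFAULT_TTL and not justification:
        raise ValueError("extending the TTL requires a justification")
    return TimedToken(value, current + min(ttl, MAX_TTL))
```

Setting `revoked = True` when the call ends covers the revoke-at-end half of the pattern, so a token copied out of a log is dead both by time and by state.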

What is the role of redaction when secrets do reach the context?

Even with a broker and scoped tokens, some classes of secret end up in the context window — a customer's API key in a support ticket the agent is reading, a connection string in a config file the agent is editing, a personal access token mentioned in a chat message the agent ingests. Treating the context window as a place where secrets occasionally appear, and building a redaction layer to handle them, is more realistic than pretending the broker is perfect.

The redaction layer runs over content as it enters the context. It identifies high-confidence secret patterns — AWS access keys, GitHub tokens, Stripe keys, JWTs, common database connection string shapes — and replaces them with placeholders before the model sees them. The placeholders are stable within a session, so the model can still reason about "the credential the user mentioned" without ever holding the actual value, and the runtime maintains a sealed mapping that can resubstitute the real value if a tool call genuinely needs it. The mapping is held outside the model's reach, in the same trust zone as the broker.
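A minimal sketch of such a redactor, with session-stable placeholders and a mapping kept on the runtime side, is shown below. The regular expressions are approximate shapes for three of the token families named above and are assumptions rather than authoritative formats; the `Redactor` class and the `[[KIND_N]]` placeholder syntax are likewise hypothetical.

```python
import re

# Approximate shapes only; a real deployment would tune and extend these.
SECRET_PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "GITHUB_TOKEN": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "STRIPE_KEY": re.compile(r"\bsk_(?:live|test)_[0-9A-Za-z]{24,}\b"),
}


class Redactor:
    """Replaces secrets with placeholders that stay stable for one session."""

    def __init__(self):
        self._by_secret = {}       # secret -> placeholder
        self._by_placeholder = {}  # placeholder -> secret; kept outside the model's reach

    def _placeholder_for(self, kind, secret):
        if secret not in self._by_secret:
            placeholder = f"[[{kind}_{len(self._by_secret) + 1}]]"
            self._by_secret[secret] = placeholder
            self._by_placeholder[placeholder] = secret
        return self._by_secret[secret]

    def redact(self, text):
        """Run over content on its way into the context window."""
        for kind, pattern in SECRET_PATTERNS.items():
            text = pattern.sub(
                lambda match, k=kind: self._placeholder_for(k, match.group(0)), text
            )
        return text

    def resubstitute(self, text):
        """Restore real values for a tool call; runs in the runtime, not the model."""
        for placeholder, secret in self._by_placeholder.items():
            text = text.replace(placeholder, secret)
        return text
```

Because the same secret always maps to the same placeholder within a `Redactor` instance, the model can refer back to a credential across turns without holding its value.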

Redaction works best when it is paired with egress checks. Patterns missed by the redactor on the way in can sometimes be caught on the way out, where a different set of heuristics applies. A request body that contains what looks like a Stripe key on its way to an unfamiliar destination is worth blocking even if the redactor on the input path missed it. The two layers compose to a much higher catch rate than either one alone, and the combination is what gives the team confidence that the secret surface is genuinely contained.
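The egress half can be sketched as a check that runs on every outbound request body. The destination allowlist `KNOWN_HOSTS` and the key pattern are assumptions for illustration; the pattern is an approximate shape for a Stripe secret key, not an authoritative format.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist of destinations where this kind of key is expected.
KNOWN_HOSTS = {"api.stripe.com"}
# Approximate shape of a Stripe secret key.
STRIPE_KEY = re.compile(r"\bsk_(?:live|test)_[0-9A-Za-z]{24,}\b")


def egress_allowed(url, body):
    """Block a request whose body carries a key-shaped string to an unfamiliar host."""
    host = urlsplit(url).hostname or ""
    if STRIPE_KEY.search(body) and host not in KNOWN_HOSTS:
        return False
    return True
```

The check is deliberately independent of the input-side redactor, so a pattern missed on the way in still has a second chance of being caught on the way out.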

How Safeguard Helps

Safeguard treats agent secret handling as a primary control surface and gives teams the brokers, scopes, and redaction layers to keep credentials out of the context window. Griffin AI integrates with credential stores and short-lived-token providers so tool wrappers can fetch narrow, time-bound credentials per call rather than carrying long-lived keys. MCP server security policies and agent guardrails enforce that no model-visible content includes raw secret material, and the platform's input and egress redaction layers catch the patterns that slip through. Runtime egress monitoring captures every outbound call with full credential metadata so audits can answer which secrets were used, when, and on whose authority. To map your agent's secret handling against these patterns, talk to our team.
