Restricting what each MCP server or AI tool can actually do — even if the model asks for more.
Capability scoping is the principle — and the set of mechanisms — that enforces what an AI tool is allowed to do, independent of what a model asks it to do. The distinction matters: the model is the requester, not the arbiter. A scope sits between the two and says yes or no based on policy, not on persuasion.
The shorthand: the model asks, the scope decides. If your safety story is "we told the model in the system prompt not to do that," you do not have capability scoping — you have a suggestion. Capability scoping is what happens when the suggestion fails and the tool still refuses.
Enforcement lives at the tool layer, not the model layer. A few techniques carry most of the load:

- Scoped service identities. Each agent holds a service identity scoped to exactly the subset of capabilities its bundle needs (repo:read, repo:write, secret:read). A call for a capability outside the identity's grant returns a hard no from the API, not a polite refusal from the model.
- Policy evaluation before execution. Rules constrain each call by its context (for example, "... main branches, and never outside business hours"). The engine decides before the side effect happens.

Why put enforcement there? Models are non-deterministic, their context windows are untrusted, and their system prompts lose to sufficiently motivated user input. These are not failures of individual models; they are properties of the technology. A control strategy that depends on the model refusing is not a strategy.
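The scoped-identity check can be sketched in a few lines. This is an illustrative sketch, not a real API: `ServiceIdentity`, `CapabilityError`, and `invoke_tool` are hypothetical names; the point is that the check runs at the tool layer, before any side effect, regardless of what the model asked for.

```python
from dataclasses import dataclass


class CapabilityError(PermissionError):
    """Raised when a call requests a capability outside the identity's grant."""


@dataclass(frozen=True)
class ServiceIdentity:
    name: str
    grants: frozenset  # e.g. {"repo:read", "repo:write", "secret:read"}


def invoke_tool(identity: ServiceIdentity, capability: str, action):
    # Deny-by-default: the grant is checked before the side effect runs.
    # Outside the grant means a hard error from the API, not a refusal
    # phrased by the model.
    if capability not in identity.grants:
        raise CapabilityError(f"{identity.name} lacks {capability}")
    return action()


agent = ServiceIdentity("ci-agent", frozenset({"repo:read", "repo:write"}))
invoke_tool(agent, "repo:read", lambda: "ok")          # within the grant: runs
# invoke_tool(agent, "secret:read", lambda: "leak")    # raises CapabilityError
```

However persuasive the prompt, nothing in this path consults the model: the grant set is the only input to the decision.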
Capability scoping is the inverse: it assumes the model will eventually get confused, tricked, or jailbroken, and it builds the controls around the tool layer instead. That is where they can be tested deterministically, audited, and upgraded without depending on which model shipped this quarter.
- Deterministic enforcement: whether the model hallucinates, the context gets injected, or the prompt gets jailbroken, the scope holds. Controls that depend on model compliance do not.
- Testability: a scope can be unit-tested. Given this agent and this request, the answer is deterministically allow or deny. System-prompt "rules" cannot be tested that way.
- Bounded blast radius: the worst thing a compromised agent can do is the worst action in its scope, not the worst action in your entire stack.
- Model portability: moving from Claude to GPT to an open-source model does not change your security posture, because the enforcement layer lives below the model. You re-evaluate quality, not controls.
- Auditability: "Here is the policy, here is the evaluation log, here is the denied call" maps directly onto audit expectations. "Here is the system prompt we wrote" does not.
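The testability claim is worth making concrete. Because a scope decision is a pure function of the identity and the request, it can be asserted like any other unit of code. The `decide` function below is a hypothetical stand-in for a real policy engine.

```python
def decide(grants: set, capability: str) -> str:
    """Pure function of (grant set, requested capability): no model in the loop."""
    return "allow" if capability in grants else "deny"


def test_scope_is_deterministic():
    grants = {"repo:read", "repo:write"}
    assert decide(grants, "repo:read") == "allow"
    assert decide(grants, "secret:read") == "deny"
    # Re-evaluating never flips the answer, unlike sampling a model.
    assert all(decide(grants, "secret:read") == "deny" for _ in range(100))


test_scope_is_deterministic()
```

A system-prompt "rule" has no equivalent test: the same prompt and the same request can yield different behavior across samples and across model versions.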
Capability scoping is the enforcement core of MCP server security. Every tool call on the Safeguard control plane is policy-evaluated before it runs, and the platform security layer adds tenant-level and data-envelope checks on top.
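Layered pre-execution checks of this general shape can be sketched generically. This is not Safeguard's actual API; `run_checked`, the check functions, and the call dictionary are all illustrative. The idea shown is only that each layer sees the call before it runs and any layer can veto it.

```python
def run_checked(call: dict, checks: list) -> dict:
    """Evaluate every check against the call before the tool executes."""
    for check in checks:
        ok, reason = check(call)
        if not ok:
            # Denied before the side effect; the tool body never runs.
            return {"status": "denied", "reason": reason}
    return {"status": "ok", "result": call["fn"]()}


def capability_check(call):
    return (call["cap"] in call["grants"], "capability not granted")


def tenant_check(call):
    return (call["tenant"] == call["resource_tenant"], "cross-tenant access")


call = {"cap": "repo:read", "grants": {"repo:read"},
        "tenant": "acme", "resource_tenant": "acme",
        "fn": lambda: "file contents"}
run_checked(call, [capability_check, tenant_check])
# -> {"status": "ok", "result": "file contents"}
```

Stacking checks this way keeps each layer independently testable: a tenant-isolation rule can be verified without touching the capability policy, and vice versa.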
See how Safeguard enforces capability scoping at the tool layer — where policy is testable, deterministic, and model-independent.