AI Security

Tool-Call Hijacking: Griffin AI vs Mythos

A hijacked tool call is more consequential than a hijacked response. The defence requires the tool layer to police the model, not the other way around.

Tool-call hijacking is the attack variant in which prompt injection or model manipulation induces the model to invoke tools the operator did not intend. Where ordinary prompt injection produces a bad response, tool-call hijacking produces bad actions in the world: files deleted, emails sent, money moved. The defence requires the tool layer to enforce scope rather than trusting the model to request only authorised actions. For this attack, architectural choices matter more than choices made at the model layer.

What hijacking attacks look like

Three patterns:

Scope-expansion attempts. The model is induced to call a tool that exists but is not authorised for the current session.
Argument manipulation. The model calls an authorised tool, but with arguments crafted to widen the tool's effective access.
Cascading tool invocation. The model is induced to make a chain of tool calls that together produce unauthorised behaviour even if each individual call is authorised.

Each needs a different defence.

Where model-level defences fail

"Ask the model nicely to only call authorised tools" is the weakest defence. It depends on the model's judgment, which is exactly what the attack exploits. Mythos-class tools that rely primarily on model-level enforcement are measurably vulnerable as a result.

How Griffin AI handles it

Three architectural layers:

Authorisation at the tool layer. Each tool has scope, and the scope is enforced by the tool handler, not by the model. When the model emits a call to a tool outside the session's scope, the tool layer refuses regardless of how the model justified the call.
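As a minimal sketch of this idea, and not Griffin AI's actual implementation, the dispatcher below checks every call against the session's grant before any handler runs. The names `Session`, `ToolDispatcher`, and `ScopeError` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


class ScopeError(PermissionError):
    """Raised when a tool call falls outside the session's granted scope."""


@dataclass(frozen=True)
class Session:
    session_id: str
    allowed_tools: FrozenSet[str]  # granted by the operator, never by the model


class ToolDispatcher:
    """Hypothetical tool layer that owns the authorisation decision."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        self._handlers[name] = handler

    def dispatch(self, session: Session, name: str, arguments: Dict[str, Any]) -> Any:
        # Only the session's grant is consulted. Whatever justification the
        # model produced for the call never reaches this check.
        if name not in session.allowed_tools:
            raise ScopeError(f"tool {name!r} is not in scope for session {session.session_id}")
        if name not in self._handlers:
            raise KeyError(f"unknown tool {name!r}")
        return self._handlers[name](**arguments)
```

The design point is that `dispatch` takes no free-text input from the model, so there is nothing for an injected prompt to argue with.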

Argument validation against a capability manifest. Each tool's capability manifest declares which arguments are acceptable. Argument values outside the declared range are rejected at the tool layer.
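One way to picture a capability manifest is as a per-argument table of constraints that the tool layer checks before the handler runs. The manifest format, the `TRANSFER_MANIFEST` example, and `validate_arguments` below are assumptions made for illustration; the article does not specify the real manifest schema.

```python
from typing import Any, Dict


class ArgumentError(ValueError):
    """Raised when an argument violates the tool's declared manifest."""


# Hypothetical manifest for a hypothetical money-transfer tool.
TRANSFER_MANIFEST: Dict[str, Dict[str, Any]] = {
    "amount": {"type": float, "min": 0.01, "max": 500.0},
    "currency": {"type": str, "enum": {"EUR", "USD"}},
}


def validate_arguments(manifest: Dict[str, Dict[str, Any]], arguments: Dict[str, Any]) -> None:
    """Raise ArgumentError unless every argument matches the manifest."""
    unknown = set(arguments) - set(manifest)
    if unknown:
        raise ArgumentError(f"undeclared arguments: {sorted(unknown)}")
    for name, rule in manifest.items():
        if name not in arguments:
            raise ArgumentError(f"missing argument: {name}")
        value = arguments[name]
        # bool is a subclass of int in Python; this sketch declares no boolean
        # arguments, so booleans are rejected outright.
        if isinstance(value, bool) or not isinstance(value, rule["type"]):
            raise ArgumentError(f"{name}: expected {rule['type'].__name__}")
        if "min" in rule and value < rule["min"]:
            raise ArgumentError(f"{name}: below declared minimum")
        if "max" in rule and value > rule["max"]:
            raise ArgumentError(f"{name}: above declared maximum")
        if "enum" in rule and value not in rule["enum"]:
            raise ArgumentError(f"{name}: not one of the declared values")
```

Rejecting undeclared arguments as well as out-of-range ones keeps the manifest an allowlist rather than a blocklist.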

Cascading-invocation rate limits. Anomalous patterns of cascading tool calls (e.g., read followed immediately by exfiltrative write) trigger rate limits and optional out-of-band confirmation.
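The cascading-call layer can be sketched as a sliding window over each session's recent calls. Everything here is an assumption for illustration: the tool names, the thresholds, and the three-way decision ("allow", "confirm", "throttle") are not taken from Griffin AI, and a production detector would likely use richer signals than tool names alone.

```python
from collections import deque
from typing import Deque, Dict, Tuple

READ_TOOLS = {"read_file", "list_files"}    # hypothetical tool names
EGRESS_TOOLS = {"send_email", "http_post"}  # hypothetical tool names


class CascadeMonitor:
    """Tracks recent calls per session and flags suspicious sequences."""

    def __init__(self, window_seconds: float = 5.0, max_calls: int = 10) -> None:
        self.window_seconds = window_seconds
        self.max_calls = max_calls
        self._history: Dict[str, Deque[Tuple[float, str]]] = {}

    def check(self, session_id: str, tool: str, now: float) -> str:
        """Return "allow", "confirm", or "throttle" for the proposed call."""
        history = self._history.setdefault(session_id, deque())
        # Drop calls that have aged out of the window.
        while history and now - history[0][0] > self.window_seconds:
            history.popleft()

        # Too many calls inside the window: refuse without recording the call.
        if len(history) >= self.max_calls:
            return "throttle"

        decision = "allow"
        # A read followed closely by an egress call is the pattern that
        # warrants out-of-band confirmation.
        if tool in EGRESS_TOOLS and any(t in READ_TOOLS for _, t in history):
            decision = "confirm"
        history.append((now, tool))
        return decision
```

Passing `now` explicitly, rather than reading the clock inside `check`, keeps the monitor deterministic and easy to test.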

A concrete example

An MCP server exposes a list_files(directory) tool scoped to a specific directory. An attacker's prompt injection induces the model to call list_files(directory="../../etc/").

With model-level defences: the model may or may not refuse, depending on its training.

With Griffin AI's tool-layer enforcement: the tool handler rejects the call because the directory argument escapes the declared scope. The attacker's attempt is logged; the audit trail shows the model attempted an out-of-scope call.
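A handler-side containment check for this example might look like the sketch below. The `root` parameter and the `ScopeEscape` exception are hypothetical additions, and the logging and audit trail described above are left out for brevity.

```python
import os
from pathlib import Path
from typing import List


class ScopeEscape(PermissionError):
    """Raised when the requested directory resolves outside the scoped root."""


def list_files(directory: str, root: Path) -> List[str]:
    """List entries of `directory`, which must resolve inside `root`."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison is made on the real target, not on the raw string.
    target = (root / directory).resolve()
    if target != root and root not in target.parents:
        raise ScopeEscape(f"{directory!r} escapes the declared scope")
    return sorted(os.listdir(target))
```

With `root` set to the scoped directory, `list_files("../../etc/", root)` raises `ScopeEscape` before the filesystem is read. An absolute argument such as `"/etc"` is also rejected, because joining an absolute path onto `root` discards `root` and the result then fails the containment test.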

What to evaluate

Three concrete checks:

Configure an MCP server with narrow scope. Induce the model to attempt a scope-escape. Verify the scope holds.
Attempt argument manipulation (path traversal, SQL injection in tool arguments). Verify the arguments are validated at the tool layer.
Attempt cascading tool invocation. Verify rate limits trigger.
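The first two checks can be scripted as a small harness that replays adversarial calls directly against the tool layer, bypassing the model entirely. This is a sketch under stated assumptions: the tool names and payloads are invented, `dispatch` stands for whatever entry point the system under test exposes, and a refusal is assumed to surface as `PermissionError` or `ValueError`. The cascading-call check is omitted because it depends on timing.

```python
from typing import Any, Callable, Dict, List, Tuple

# (label, tool name, arguments); all entries are hypothetical examples.
AdversarialCall = Tuple[str, str, Dict[str, Any]]

ADVERSARIAL_CALLS: List[AdversarialCall] = [
    ("scope escape", "delete_file", {"path": "notes.txt"}),
    ("path traversal", "list_files", {"directory": "../../etc/"}),
    ("sql injection", "lookup_user", {"user_id": "1 OR 1=1"}),
]


def run_checks(
    dispatch: Callable[[str, Dict[str, Any]], Any],
    calls: List[AdversarialCall],
) -> Dict[str, bool]:
    """Map each label to True if the tool layer refused the call."""
    results: Dict[str, bool] = {}
    for label, tool, arguments in calls:
        try:
            dispatch(tool, arguments)
        except (PermissionError, ValueError):
            results[label] = True   # refused at the tool layer
        else:
            results[label] = False  # the call went through: a finding
    return results
```

Any label that maps to `False` marks a call the tool layer executed when it should have refused.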

How Safeguard Helps

Safeguard's tool-call security is built on tool-layer enforcement of scope, argument validation, and cascading-call rate limits. The defence does not depend on the model cooperating. For organisations whose AI agents have access to real production tools, this architectural choice is the difference between a safe deployment and a latent incident.

