Remediation is the most satisfying part of security work and the most dangerous. A good remediation closes a real risk in a single PR. A bad one breaks production, ships a subtle regression, or introduces a new dependency that's worse than the one it replaced. The gap between those outcomes often comes down to how carefully the fix was grounded in the specifics of your codebase.
Griffin uses Claude Sonnet as its primary remediation engine. Sonnet is fast, thorough, and materially cheaper than Opus, which matters when you're producing dozens of remediation plans a day. This post compares what remediation looks like with Sonnet called directly versus Sonnet running inside Griffin's remediation pipeline: the same model, the same weights, doing the same kind of reasoning, in two very different harnesses.
What Makes Remediation Hard
On paper, remediation looks simple. A scanner flags that you're on version 1.2.3 of a library with a CVE. The advisory says upgrade to 1.2.4. Run the bump, run the tests, ship the PR.
In practice, almost every remediation has a wrinkle. Version 1.2.4 requires a peer dependency you don't have pinned. The CVE is in a function you don't call, but the scanner can't prove reachability. The fix is in a major version bump that changes the API. The library is abandoned and you need to switch to a fork. The patched version introduces a new bug in a different feature you do use.
Any of those wrinkles turns a one-line bump into a small engineering project. A good remediation tool has to recognize the wrinkle, explain it clearly, and propose a path that respects the constraints. That's where the reasoning quality of the model you're using starts to matter.
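One of those wrinkles, the fix that only ships in a major version bump, is at least mechanically detectable before any model gets involved. A minimal sketch, assuming plain semver strings; the helper name is hypothetical, not Griffin's API:

```python
def is_major_bump(current: str, target: str) -> bool:
    """Return True when moving to target crosses a major version boundary."""
    cur_major = int(current.split(".")[0])
    tgt_major = int(target.split(".")[0])
    return tgt_major > cur_major

# A patch release stays inside the same major version.
assert not is_major_bump("1.2.3", "1.2.4")
# A fix that only ships in 2.0.0 turns the bump into a migration project.
assert is_major_bump("1.2.3", "2.0.0")
```

A check like this is how a tool recognizes the wrinkle at all; explaining it and planning around it is where the model comes in.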
Sonnet On Its Own
Raw Claude Sonnet is a strong remediation reasoner. Hand it a CVE, a package manifest, and a snippet of the consuming code, and it will produce a respectable upgrade plan. It'll flag peer dependency issues it can see in the manifest, warn about major version changes, and generate plausible test coverage suggestions.
Sonnet's weak spot in remediation isn't reasoning — it's grounding. The model can only reason over what you paste into the prompt, which means a remediation session with raw Sonnet requires you to gather all the relevant context yourself: the full dependency tree, the changelog of the target version, the list of call sites, the test configuration, the CI setup. That's an hour of prep per finding, which makes the approach impractical for any team dealing with volume.
The other soft spot is verification. Sonnet will cheerfully produce a remediation plan that references a version number that doesn't exist, or a migration step that isn't in the changelog, or a peer dependency bump that doesn't resolve. Without a check loop, you won't catch those errors until you try to apply the plan and the package manager complains.
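That check loop can be deterministic and trivially cheap. A sketch of the kind of existence check a raw-Sonnet session lacks, with hypothetical names and a hard-coded list standing in for a real package registry:

```python
def find_invalid_versions(plan_versions, published_versions):
    """Return every version the plan references that was never published."""
    published = set(published_versions)
    return [v for v in plan_versions if v not in published]

# A hallucinated 1.2.5 is caught before anyone runs the package manager.
registry = ["1.2.3", "1.2.4", "2.0.0"]
assert find_invalid_versions(["1.2.4", "1.2.5"], registry) == ["1.2.5"]
```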
Sonnet Inside Griffin's Remediation Pipeline
Griffin's remediation pipeline isn't a clever prompt. It's a small sequence of steps, each of which uses Sonnet for reasoning and deterministic tools for verification.
The first step gathers context. Griffin pulls the current pinned version, the full resolved dependency graph, the advisory record, the upstream changelog, the diff between the current and target versions, and the call sites from static analysis. That context is the prompt. The quality of the remediation that follows is almost entirely a function of whether this context assembly got it right.
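One way to picture the output of that assembly step is as a single structured record that becomes the prompt. This is a hypothetical shape for illustration, not Griffin's actual schema; all field values below are invented placeholders:

```python
from dataclasses import dataclass, field

@dataclass
class RemediationContext:
    # Every field is filled by deterministic tooling, not by the model.
    package: str
    pinned_version: str
    advisory_id: str
    dependency_graph: dict            # resolved graph from the lockfile
    changelog: str                    # upstream changelog for the target range
    version_diff: str                 # diff between current and target versions
    call_sites: list = field(default_factory=list)  # from static analysis

    def to_prompt(self) -> str:
        """Serialize the assembled context into the prompt the model sees."""
        return (
            f"Package: {self.package}=={self.pinned_version}\n"
            f"Advisory: {self.advisory_id}\n"
            f"Call sites: {len(self.call_sites)}\n"
            f"Changelog:\n{self.changelog}"
        )

ctx = RemediationContext(
    package="example-lib", pinned_version="1.2.3",
    advisory_id="CVE-2024-0000",      # placeholder identifier
    dependency_graph={}, changelog="1.2.4: fix injection bug",
    version_diff="", call_sites=["app/client.py:42"],
)
assert "example-lib==1.2.3" in ctx.to_prompt()
```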
The second step generates candidate fixes. Sonnet produces two or three options — a minimum bump, a full upgrade, and an alternative like a pinned workaround or a fork switch. Each option includes the concrete package coordinates, the expected behavioral change, and the required migration steps.
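The shape of those options can be pinned down as data rather than prose. A sketch of what the three candidates might carry, with illustrative field names and a hypothetical package:

```python
candidates = [
    {
        "strategy": "minimum-bump",
        "coordinates": "example-lib@1.2.4",   # hypothetical package
        "behavioral_change": "patch release, no API changes expected",
        "migration_steps": [],
    },
    {
        "strategy": "full-upgrade",
        "coordinates": "example-lib@2.0.0",
        "behavioral_change": "client constructor signature changed",
        "migration_steps": ["update constructor calls at 3 call sites"],
    },
    {
        "strategy": "workaround",
        "coordinates": "example-lib@1.2.3",   # stay pinned
        "behavioral_change": "none; vulnerable function is never called",
        "migration_steps": ["add lint rule banning the vulnerable import"],
    },
]

# Every candidate carries concrete coordinates the next step can verify.
assert all("@" in c["coordinates"] for c in candidates)
```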
The third step verifies. Griffin runs each candidate fix through a resolver to confirm the package coordinates are valid and installable. It checks the target version against the CVE database to confirm the proposed version isn't vulnerable to something else. It applies the migration steps to a mock lockfile to confirm resolution succeeds. When verification fails, Griffin sends the error back to Sonnet with a retry prompt, and Sonnet revises.
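The verify-then-revise pattern can be sketched as a small loop: deterministic checks gate the candidate, and failures flow back to the model as concrete errors. Everything here is a toy stand-in; the `revise` callable below substitutes for the actual Sonnet retry prompt:

```python
def verify_and_revise(candidate, verifiers, revise, max_attempts=3):
    """Run deterministic checks; feed failures back to the model to revise."""
    for _ in range(max_attempts):
        errors = [err for check in verifiers
                  if (err := check(candidate)) is not None]
        if not errors:
            return candidate                   # all checks passed
        candidate = revise(candidate, errors)  # model retries with the errors
    raise RuntimeError("candidate failed verification after retries")

# Toy verifier: flag a version the registry has never published.
published = {"1.2.4", "2.0.0"}
def version_exists(version):
    return None if version in published else f"{version} not in registry"

# Toy reviser standing in for the Sonnet retry prompt.
def revise(version, errors):
    return "1.2.4"

# A hallucinated 9.9.9 is rejected and revised before reaching a human.
assert verify_and_revise("9.9.9", [version_exists], revise) == "1.2.4"
```

The key property is that the model never self-certifies: a candidate only leaves the loop once every deterministic check passes.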
The fourth step packages the remediation as a pull request. Griffin drafts the commit, updates the lockfile, updates any changelog entries, and generates a PR description that explains the change in terms a reviewer will actually read.
Sonnet is present at every step. But Sonnet's mistakes are caught by the deterministic parts of the pipeline before they reach you.
The Throughput Difference
The reason Sonnet matters for remediation — rather than Opus — is cost and speed. Remediation is high-volume work. A mid-sized team might generate hundreds of candidate fixes a week. Running all of those through Opus would be both slow and expensive. Sonnet hits the sweet spot where the reasoning is good enough for most cases and the throughput is high enough to keep up with the firehose.
Raw Sonnet with a skilled operator can process maybe ten remediations an hour, limited by how long it takes to gather context. Griffin's Sonnet pipeline comfortably handles thousands per hour, limited mostly by API concurrency. That's not because the model is faster — it's the same model — but because the bottleneck has moved from human-in-the-loop context gathering to automated context injection.
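The bottleneck shift is easy to put numbers on. The weekly volume below is a hypothetical for illustration; the rates come from the estimates above:

```python
findings_per_week = 300   # hypothetical backlog for a mid-sized team
manual_rate = 10          # remediations/hour: raw Sonnet plus human prep
pipeline_rate = 2000      # remediations/hour: bounded by API concurrency

manual_hours = findings_per_week / manual_rate              # 30.0 hours
pipeline_minutes = findings_per_week / pipeline_rate * 60   # 9.0 minutes

assert manual_hours == 30.0
assert pipeline_minutes == 9.0
```

Nearly a week of human effort versus minutes of machine time, for the same model doing the same reasoning.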
When the workload is small enough that a human can sit with each finding, Sonnet alone is fine. When the workload exceeds what a human can personally guide through the context-gathering step, Griffin's pipeline stops being a luxury and starts being the only thing that keeps the queue from overflowing.
Where Raw Sonnet Still Wins
There are three remediation patterns where reaching for Sonnet directly makes more sense than running through Griffin.
Custom remediations that don't fit Griffin's templates are the obvious case. Griffin's pipeline is tuned for common remediation shapes — package version bumps, pinning, fork switches, configuration changes. If your fix is something unusual, like rewriting a usage pattern to avoid a vulnerable function entirely, Griffin's pipeline may generate a generic version bump when what you wanted was refactoring advice. Use raw Sonnet for those cases.
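A classic instance of that kind of rewrite, offered as a hypothetical illustration rather than anything Griffin emits: replacing deserialization of untrusted input via `pickle` with `json`, which removes the dangerous code path without touching a version number at all.

```python
import json

# Before (vulnerable pattern): pickle.loads(untrusted_bytes) can execute
# arbitrary code during deserialization.
# After: parse the same payload as JSON, which only constructs plain data.
payload = b'{"user": "alice", "role": "viewer"}'
record = json.loads(payload)
assert record["role"] == "viewer"
```

No scanner template produces this fix; it comes out of a conversation about what the code is actually doing.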
Exploratory migrations that aren't yet remediation are the second case. When you're evaluating "should we move off this library at all," you want a wide-open conversation with Sonnet, not a pipeline that insists on producing a pinned-version PR at the end.
Sensitive remediations where you specifically want to avoid automation are the third case. For some classes of dependency — anything crypto-related, anything auth-related, anything in the trust-boundary layer — most teams prefer a fully human-driven review, with Sonnet used as a pair programmer rather than a pipeline.
The Pattern That Works
Use Griffin for the long tail of routine remediations — the version bumps, the transitive fixes, the mechanical upgrades. That's where the pipeline's verification loop catches the hallucinated package coordinates and the impossible migration steps that raw Sonnet would happily produce.
Use Sonnet directly for the remediations where you want the model as a conversation partner — the refactors, the migration decisions, the edge cases. Pay the context-gathering cost yourself because the case is worth the extra attention.
The thing that doesn't work is trying to run everything through raw Sonnet and hoping you'll catch the errors. You won't. Sonnet is good enough to generate plausible nonsense reliably, and in remediation the cost of plausible nonsense is a broken deploy. Griffin's value isn't smarter reasoning — it's the verification layer that prevents the model from shipping its own mistakes into your pipeline.