A dependency upgrade is a common remediation. A CVE lands against a library, the advisory says to upgrade to a patched version, and the tool opens a PR bumping the version. The question is which version.
The answer is not automatically the version the advisory names as fixed. The answer is the smallest version bump that fixes the CVE, resolves against the project's other constraints, is compatible with the language and runtime the project uses, and does not break peer dependencies. That is a real decision with real tradeoffs, and it is where the difference between Griffin AI and Mythos-class pure-LLM tools becomes most visible.
What a good upgrade pick requires
A useful upgrade recommendation has to consider four axes. Patch coverage means the picked version actually includes the fix, not just a later release that happens to postdate the advisory. Constraint compatibility means the picked version resolves against the constraints expressed in peer dependencies. Runtime compatibility means the picked version supports the language and runtime the project targets. Ecosystem compatibility means the picked version works with the rest of the project's dependency tree without triggering conflicting resolutions.
Miss any one of these and the upgrade either fails to install, installs and fails to build, or installs and builds and breaks at runtime.
Griffin AI's upgrade selection
Griffin grounds upgrade selection in the project. Before recommending a version, it reads the lockfile, the manifest, the peer dependency constraints of sibling packages, and the runtime requirements of the project.
From the advisory, it identifies the patched range. From the lockfile, it identifies the currently pinned version. From the peer constraints, it identifies the upper bound the project can accept without forcing other upgrades. From the runtime requirements, it filters out versions that require a newer language target than the project supports.
The intersection of those constraints is usually a small set, often a single version. That is the version Griffin picks. The PR includes the reasoning and the constraint chain, so a reviewer can verify the pick without re-running the resolver.
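The intersection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Griffin's actual code: the `Release` record, the `pick_upgrade` function, the half-open patched ranges, and the sample versions are all hypothetical, and versions are treated as plain `MAJOR.MINOR.PATCH` triples with no pre-release tags.

```python
from dataclasses import dataclass
from typing import Optional

Version = tuple[int, int, int]


def parse(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a comparable tuple (no pre-release tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


@dataclass(frozen=True)
class Release:
    version: str
    min_runtime: str  # oldest runtime this release supports (hypothetical metadata)


def pick_upgrade(
    releases: list[Release],
    current: str,
    patched_ranges: list[tuple[str, str]],  # half-open [low, high) ranges from the advisory
    peer_upper_bound: str,                  # exclusive bound derived from peer constraints
    project_runtime: str,
) -> Optional[Release]:
    """Return the smallest release that passes every filter, or None."""

    def is_patched(v: Version) -> bool:
        return any(parse(low) <= v < parse(high) for low, high in patched_ranges)

    candidates = [
        r for r in releases
        if is_patched(parse(r.version))                     # patch coverage
        and parse(r.version) > parse(current)               # must move forward from the pin
        and parse(r.version) < parse(peer_upper_bound)      # peer constraints
        and parse(r.min_runtime) <= parse(project_runtime)  # runtime compatibility
    ]
    return min(candidates, key=lambda r: parse(r.version), default=None)


# Made-up release history for an imaginary package.
releases = [
    Release("2.3.0", "3.8.0"),
    Release("2.4.1", "3.8.0"),
    Release("2.5.0", "3.10.0"),
    Release("3.0.2", "3.10.0"),
]
pick = pick_upgrade(
    releases,
    current="2.3.0",
    patched_ranges=[("2.4.1", "3.0.0"), ("3.0.2", "4.0.0")],
    peer_upper_bound="3.0.0",
    project_runtime="3.9.0",
)
print(pick.version if pick else "no compatible version")  # prints 2.4.1
```

In this sample, 2.5.0 is dropped by the runtime filter and 3.0.2 by the peer bound, which leaves 2.4.1 as the only candidate. Modelling the patched range as a list of intervals rather than a single lower bound is a deliberate choice: it lets the sketch represent fixes that were backported to more than one release line.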
How Mythos-class tools choose versions
Pure-LLM remediation tools in the Mythos class typically pick the version the advisory names as the fixed version, or the latest stable release as of the model's training cutoff. Neither is reliably the right choice.
The advisory-named version is usually the minimum patched version, which may or may not be compatible with the project's other constraints. Picking it without checking can produce an unresolvable dependency graph.
The latest stable version is often a major version bump ahead of where the project currently sits. A major bump carries breaking changes, and those changes propagate through the project. The PR may compile because types are forgiving or generic, but runtime behavior can differ in subtle ways.
Neither approach consults the lockfile, peer constraints, or runtime requirements. The model does not have them.
The specific failure modes
Three patterns recur in pure-LLM upgrade PRs.
The first is the unresolvable resolution. The tool bumps package A to a version that requires package B at a range the project does not satisfy. The resolver fails, the install fails, and the PR cannot be merged without additional work.
The second is the major-version leap. The tool bumps the package across a major version boundary without flagging it. The diff is one line. The project compiles. A feature that depended on removed behavior breaks quietly and surfaces in production.
The third is the runtime mismatch. The tool bumps a package to a version that requires a newer language target than the project supports. The build fails with a runtime compatibility error. The reviewer now has to choose whether to upgrade the runtime as well or roll back the dependency change.
Griffin catches all three before the PR opens because the constraint chain is checked at recommendation time.
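As a rough illustration of what checking at recommendation time could look like, the sketch below flags each of the three patterns for a proposed bump. The `Candidate` record, the `preflight` function, the label strings, and the sample packages are hypothetical names chosen for this example; real resolvers handle far richer range syntax than the half-open ranges assumed here.

```python
from dataclasses import dataclass, field


def parse(text: str) -> tuple[int, ...]:
    """Parse a dotted numeric version into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Candidate:
    version: str
    # dependency name -> half-open [low, high) range this release requires
    requires: dict[str, tuple[str, str]] = field(default_factory=dict)
    min_runtime: str = "0.0.0"


def preflight(current: str, candidate: Candidate,
              installed: dict[str, str], project_runtime: str) -> list[str]:
    """Return the failure patterns a proposed bump would trigger."""
    problems = []
    # Pattern 1: the candidate needs a dependency range the project does not satisfy.
    for dep, (low, high) in candidate.requires.items():
        have = installed.get(dep)
        if have is None or not (parse(low) <= parse(have) < parse(high)):
            problems.append("unresolvable")
            break
    # Pattern 2: the bump crosses a major version boundary.
    if parse(candidate.version)[0] > parse(current)[0]:
        problems.append("major-leap")
    # Pattern 3: the candidate needs a newer runtime than the project targets.
    if parse(candidate.min_runtime) > parse(project_runtime):
        problems.append("runtime-mismatch")
    return problems


installed = {"pkg-b": "1.9.0"}  # made-up lockfile contents
risky = Candidate("3.0.0", {"pkg-b": ("2.0.0", "3.0.0")}, min_runtime="18.0.0")
safe = Candidate("2.8.4", {"pkg-b": ("1.5.0", "3.0.0")}, min_runtime="14.0.0")

print(preflight("2.8.1", risky, installed, "16.0.0"))
print(preflight("2.8.1", safe, installed, "16.0.0"))
```

A tool built along these lines would refuse to open the first PR, or at least annotate it, and would let the second through.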
The peer dependency problem
Peer dependencies are the hardest part of upgrade selection and the part pure-LLM tools handle least well. A peer constraint says that package A expects the project to provide package B within a specific range. When the tool bumps B, every sibling that declares B as a peer must still accept the new version.
Griffin evaluates peer constraints as part of the grounded pipeline. The recommended version has to satisfy all active peer constraints. If no patched version satisfies them, Griffin either proposes a multi-package upgrade or reports the conflict for human decision. Either outcome is better than silently producing a PR the resolver will reject.
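The two outcomes described above, a version that every sibling accepts or a reported conflict, might be sketched as follows. The function name `resolve_with_peers`, the result dictionary shape, and the sibling names are assumptions made for illustration; peer ranges are simplified to half-open `[low, high)` intervals.

```python
def parse(text: str) -> tuple[int, ...]:
    """Parse a dotted numeric version into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def in_range(version: str, bounds: tuple[str, str]) -> bool:
    low, high = bounds
    return parse(low) <= parse(version) < parse(high)


def resolve_with_peers(patched_versions: list[str],
                       peer_ranges: dict[str, tuple[str, str]]) -> dict:
    """peer_ranges maps each sibling to the range it accepts for the bumped package."""
    if not patched_versions:
        raise ValueError("no patched versions to choose from")
    # Walk candidates from smallest to largest and take the first one
    # that every sibling's peer range accepts.
    for version in sorted(patched_versions, key=parse):
        if all(in_range(version, bounds) for bounds in peer_ranges.values()):
            return {"status": "ok", "version": version}
    # Nothing fits: name the siblings that reject the smallest patched
    # version so a human (or a multi-package plan) can take over.
    smallest = min(patched_versions, key=parse)
    blockers = sorted(name for name, bounds in peer_ranges.items()
                      if not in_range(smallest, bounds))
    return {"status": "conflict", "version": smallest, "blockers": blockers}


patched = ["18.2.0", "18.3.1", "19.0.0"]  # made-up patched releases

ok = resolve_with_peers(patched, {
    "ui-kit": ("17.0.0", "19.0.0"),
    "router": ("18.3.0", "20.0.0"),
})
conflict = resolve_with_peers(patched, {
    "legacy-widget": ("16.0.0", "18.0.0"),
})
print(ok)
print(conflict)
```

In the first call, 18.2.0 is rejected by `router` and 18.3.1 is accepted by both siblings. In the second, no patched version fits `legacy-widget`, so the sketch reports the blocker instead of guessing.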
Pure-LLM tools without peer constraint awareness open PRs that fail at the install step. The reviewer sees the failure, closes the PR, and has to do the peer analysis themselves. At that point the tool has not saved anyone any time.
Transitive upgrade cascades
Some CVEs live in transitive dependencies: the direct dependency declared by the project does not contain the vulnerable code, but one of its own dependencies does. Fixing the transitive dependency requires either upgrading the direct dependency to a version that pulls in a patched transitive, or using an override mechanism specific to the ecosystem, such as `overrides` in npm, `resolutions` in Yarn, or `dependencyManagement` in Maven.
Griffin handles both paths. When a direct upgrade exists that pulls in the patched transitive, Griffin prefers it because it avoids the override mechanism. When no such direct upgrade exists, Griffin proposes an override with the scope limited to the affected transitive.
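The preference order described above can be written as a small decision function. This is a sketch under simplifying assumptions: each release of the direct dependency is taken to pull in exactly one version of the transitive, whereas real manifests usually declare ranges. The names `plan_transitive_fix`, `direct_releases`, and the sample packages are hypothetical.

```python
def parse(text: str) -> tuple[int, ...]:
    """Parse a dotted numeric version into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def plan_transitive_fix(direct_releases: dict[str, str], current_direct: str,
                        transitive: str, patched_from: str) -> dict:
    """direct_releases maps a direct-dependency version to the transitive version it pulls in."""
    fixing = [
        version for version, pulled in direct_releases.items()
        if parse(version) > parse(current_direct)
        and parse(pulled) >= parse(patched_from)
    ]
    if fixing:
        # Prefer the smallest direct upgrade that brings in a patched transitive.
        return {"action": "upgrade-direct", "version": min(fixing, key=parse)}
    # Otherwise fall back to an override scoped to the affected transitive only.
    return {"action": "override", "package": transitive, "version": patched_from}


# Made-up data: versions of a direct dependency and the transitive each one pulls in.
with_fix = {"4.1.0": "1.2.0", "4.2.0": "1.2.5", "4.3.0": "1.3.0"}
without_fix = {"4.1.0": "1.2.0", "4.2.0": "1.2.0"}

print(plan_transitive_fix(with_fix, "4.1.0", "tiny-parser", "1.2.5"))
print(plan_transitive_fix(without_fix, "4.1.0", "tiny-parser", "1.2.5"))
```

The chosen direct upgrade would still need to pass the peer and runtime checks discussed earlier; this sketch only covers the choice between the two paths.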
Pure-LLM tools rarely distinguish between these paths. They often propose overrides when direct upgrades would have worked, or propose direct upgrades when overrides were the safer scoped change.
The explanation a reviewer needs
An upgrade PR's explanation should include the current version, the recommended version, the CVE being addressed, the constraint chain that produced the recommendation, and any peers or transitives affected.
Griffin's explanation has all of those. A reviewer can verify the pick against the lockfile and the peer constraints without re-running the resolver. If they want to override the recommendation, they have the information to make an informed choice.
A Mythos-class explanation typically has the current version, the recommended version, and the CVE. The constraint chain is missing because the tool did not compute it. The reviewer has to do that work themselves.
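One way to make the difference concrete is to treat the explanation as a record whose fields match the list above. The `UpgradeExplanation` class, its field names, the package names, and the placeholder CVE identifier are all hypothetical; this is a sketch of a possible shape, not the format any particular tool emits.

```python
from dataclasses import dataclass, field


@dataclass
class UpgradeExplanation:
    package: str
    current_version: str
    recommended_version: str
    cve: str
    constraint_chain: list[str] = field(default_factory=list)
    affected: list[str] = field(default_factory=list)  # peers or transitives touched

    def is_reviewable(self) -> bool:
        """A reviewer can only verify the pick if the constraint chain is present."""
        return bool(self.constraint_chain)

    def render(self) -> str:
        lines = [
            f"Upgrade {self.package} {self.current_version} -> {self.recommended_version}",
            f"Addresses: {self.cve}",
            "Constraint chain:",
        ]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.constraint_chain, 1)]
        lines.append("Affected peers/transitives: " + (", ".join(self.affected) or "none"))
        return "\n".join(lines)


grounded = UpgradeExplanation(
    package="example-lib",
    current_version="2.3.0",
    recommended_version="2.4.1",
    cve="CVE-XXXX-NNNNN",  # placeholder, not a real identifier
    constraint_chain=[
        "advisory: patched from 2.4.1",
        "peer constraint from ui-kit: below 3.0.0",
        "runtime: 2.5.0 and later need a newer runtime than the project targets",
    ],
    affected=["ui-kit"],
)
bare = UpgradeExplanation("example-lib", "2.3.0", "3.0.2", "CVE-XXXX-NNNNN")

print(grounded.render())
print(bare.is_reviewable())  # prints False
```

The second record carries the three fields a reviewer always gets, but with an empty chain there is nothing to check the pick against.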
Evaluation across ecosystems
The gap shows up differently by ecosystem. In npm, where peer dependencies are common and major version bumps carry semantic weight, pure-LLM tools produce unresolvable PRs at a high rate. In Python, where version constraints are looser but environment markers matter, pure-LLM tools often pick versions that ignore the markers. In Java, where transitive version conflicts are common and override mechanisms vary by build system, pure-LLM tools often propose changes that the build system cannot express.
Griffin's handling varies by ecosystem too, but the consistent property is that the recommendation is grounded in the ecosystem's actual rules. The reviewer gets a version the ecosystem will accept.
What to measure
When evaluating a remediation tool's upgrade picks, measure the resolver-failure rate, the runtime-failure rate, and the peer-conflict rate. Run the tool across a sample of real advisories on a real project. Count how many PRs install, how many compile, and how many run.
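The counts above can be tallied with a small script. The sketch below assumes one plausible set of definitions, which is an interpretation rather than a standard: a resolver failure is a PR that does not install, a runtime failure is a PR that installs and compiles but does not run, and a peer conflict is recorded separately per PR. The `PrOutcome` record, the `summarize` function, and the sample data are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PrOutcome:
    installed: bool              # resolver and install step succeeded
    compiled: bool = False       # build succeeded
    ran: bool = False            # tests or smoke run passed
    peer_conflict: bool = False  # resolver reported a peer dependency conflict


def summarize(outcomes: list[PrOutcome]) -> dict[str, float]:
    """Compute per-PR failure rates over a sample of upgrade PRs."""
    if not outcomes:
        raise ValueError("need at least one PR outcome")
    total = len(outcomes)
    return {
        "resolver_failure_rate": sum(not o.installed for o in outcomes) / total,
        "build_failure_rate": sum(o.installed and not o.compiled for o in outcomes) / total,
        "runtime_failure_rate": sum(o.installed and o.compiled and not o.ran for o in outcomes) / total,
        "peer_conflict_rate": sum(o.peer_conflict for o in outcomes) / total,
    }


# Made-up sample of four PR outcomes, for illustration only.
sample = [
    PrOutcome(installed=False, peer_conflict=True),
    PrOutcome(installed=True, compiled=True, ran=False),
    PrOutcome(installed=True, compiled=True, ran=True),
    PrOutcome(installed=True, compiled=True, ran=True),
]
report = summarize(sample)
for name, rate in report.items():
    print(f"{name}: {rate:.2f}")
```

Keeping the stages separate matters because a PR that fails to install never reaches the build or run stage, so lumping the stages together would hide where a tool's picks actually go wrong.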
Griffin's resolver-failure rate stays low because each recommendation is checked against the same lockfile and peer constraints the resolver will see. Pure-LLM tools show resolver failures frequently, especially on projects with active peer constraints.
The tool that picks versions with the full constraint chain in mind is the tool that produces dependency upgrade PRs worth merging. The one that picks versions from the advisory or the training data is the one whose PRs sit open until a human rebuilds the decision.