SLO-Driven Vulnerability Management Program

Service-level objectives turn vulnerability management from heroics into a measurable program. Here is how to define SLOs that survive contact with reality.

Nayan Dey
Senior Security Engineer

Why heroics do not scale

Most vulnerability management programs run on heroics. A new critical advisory drops, the security team mobilises, engineering pivots, and the issue gets fixed within hours. The team congratulates itself, the incident retrospective is positive, and the program continues operating exactly as before. Two months later, a similar advisory drops, and the same heroics happen again. The pattern is mistaken for a healthy response when it is actually the symptom of a program that has no defined commitment to performance.

A program that depends on heroics has no service-level objective. There is no number the team is committed to hitting, no number the business can plan around, and no number that distinguishes good performance from luck. When the heroics fail, as they eventually do, the program has no defensible answer for why a particular finding sat for 30 days when an earlier one was fixed in three.

SLOs change this dynamic. An SLO is a published target for how the program performs against defined categories of work. It tells the business what to expect, gives the security team a metric to optimise, and creates the structure that lets the program scale beyond what individual heroics can sustain. By 2026, the leading vulnerability programs are run as SLO-driven services, with the same discipline that SRE teams apply to availability.

What a useful SLO looks like

A useful SLO has three properties. It is specific, measurable, and tied to a business-relevant category of work. Total findings closed per month is not an SLO, because it depends on inflow, which the team does not control. Mean time to remediate is closer, but it averages across heterogeneous categories that should be treated differently.

The SLOs that hold up are tier-based: for each tier of finding, there is a specific time window for remediation, mitigation, or formal acceptance. A reasonable starting structure looks like this:

- Tier one: reachable, exploitable, on internet-facing assets. Must reach a documented closure state within 72 hours.
- Tier two: reachable, but lower exploitability or internal exposure. Must close within 14 days.
- Tier three: not currently reachable. Must close within 90 days through routine refresh.
- Tier four: remediation impractical. Must have a formal acceptance or mitigation record within 30 days of identification.

The numbers will vary by organisation, but the structure should not. Each tier has a definition that depends on real risk inputs, a time window the team commits to, and an outcome criterion that is auditable. If you cannot tell whether a finding is in tier one or tier two, the definitions are not specific enough.
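The tier structure above can be sketched as a small classification function. This is a minimal illustration, not a real product schema; the tier windows are the starting values from the text, and the boolean risk inputs are simplifications of what real reachability and exploitability signals look like.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class Tier:
    name: str
    window: timedelta   # committed closure window
    description: str

# Starting windows from the text; tune these per organisation.
TIERS = {
    1: Tier("tier-1", timedelta(hours=72), "reachable, exploitable, internet-facing"),
    2: Tier("tier-2", timedelta(days=14), "reachable, lower exploitability or internal"),
    3: Tier("tier-3", timedelta(days=90), "not currently reachable"),
    4: Tier("tier-4", timedelta(days=30), "remediation impractical; formal acceptance"),
}

def classify(reachable: bool, exploitable: bool, internet_facing: bool,
             remediation_practical: bool = True) -> int:
    """Assign a tier from the risk inputs named in the tier definitions."""
    if not remediation_practical:
        return 4
    if not reachable:
        return 3
    if exploitable and internet_facing:
        return 1
    return 2
```

The point of writing the definitions down this explicitly is the auditability criterion above: for any combination of inputs, the tier assignment is deterministic, so two reviewers can never classify the same finding differently.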

Reachability as the foundation of tiering

The tiering structure described above depends on reachability data. Without it, reachable and non-reachable findings look the same, which forces the SLO either to over-commit on findings that do not warrant urgency or to under-commit on findings that do. Both failure modes are common in programs that try to set SLOs without proper reachability instrumentation.

Reachability has to be the foundation, not an afterthought. The SLO needs to be defined in terms of reachable findings as the primary risk category, with non-reachable findings on a separate, longer schedule. This is not a way of demoting unreachable findings; it is a way of making the active risk register reflect actual risk, so that the SLO commitment is meaningful.

Modern reachability tooling makes this practical. The reachability state of every finding is computed and refreshed automatically, and the SLO clock starts when a finding is classified into its tier. If reachability changes later, for example because new code adds a path to a previously unreachable function, the finding gets re-tiered and the clock adjusts. The system, not the engineer, manages the bookkeeping.
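The re-tiering bookkeeping can be sketched as follows. This is a simplified model under the assumptions in the text: the SLO clock starts when a finding enters its tier, and a reachability change restarts the clock in the new tier. The field names and windows are illustrative, not a real system's schema.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone

# Illustrative tier windows from the text.
WINDOWS = {1: timedelta(hours=72), 2: timedelta(days=14),
           3: timedelta(days=90), 4: timedelta(days=30)}

@dataclass(frozen=True)
class Finding:
    finding_id: str
    tier: int
    classified_at: datetime  # when the finding entered its current tier

    @property
    def deadline(self) -> datetime:
        """SLO deadline implied by the current tier and clock start."""
        return self.classified_at + WINDOWS[self.tier]

def retier(finding: Finding, new_tier: int, now: datetime) -> Finding:
    """On a reachability change, move the finding to its new tier and
    restart the SLO clock from the moment of reclassification."""
    if new_tier == finding.tier:
        return finding  # no change; the existing clock keeps running
    return replace(finding, tier=new_tier, classified_at=now)
```

Keeping the finding immutable and deriving the deadline from the clock start is what lets the system, rather than the engineer, own the bookkeeping: there is no stored deadline field to forget to update.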

Auto-PR throughput as the operational lever

An SLO is only as good as the team's ability to hit it. For tier one findings with a 72-hour window, the team needs a remediation path that completes within that window even when the security engineer who identified the finding is asleep. The only way to make this work consistently is automation.

Automated PR generation, gated on policy and supported by reachability and exploitability data, is the operational mechanism that lets the SLO hold. When a tier one finding is identified, the system opens a PR within minutes, runs CI, and routes the review to the appropriate engineer. The window from identification to PR-ready collapses from hours or days to under an hour, which leaves the rest of the SLO budget for human review and merge.
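The shape of that workflow can be sketched as an orchestration function. Everything here is an assumption for illustration: the `vcs`, `remediation`, `ci`, and `router` interfaces are hypothetical stand-ins for whatever version control, patch generation, CI, and review-routing systems a given program uses, not a real API.

```python
def handle_tier_one(finding, remediation, vcs, ci, router):
    """Hypothetical auto-PR path for a tier one finding: open a fix
    branch, apply the generated patch, open the PR, trigger CI, and
    route the review. All four collaborator interfaces are assumed."""
    branch = vcs.create_branch(f"fix/{finding.finding_id}")
    vcs.apply_patch(branch, remediation.generate_patch(finding))
    pr = vcs.open_pull_request(branch, title=f"Auto-remediate {finding.finding_id}")
    ci.trigger(pr)
    router.assign_reviewer(pr, finding)
    return pr
```

The design point is that every step before human review is machine-driven, so the elapsed time from identification to PR-ready does not depend on anyone being awake; the SLO budget is spent on review and merge, not on handoffs.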

The throughput numbers matter. Programs that have not invested in auto-PR typically struggle to hit even tier two SLOs, because the manual handoff between security and engineering consumes more time than the SLO allows. Programs with mature auto-PR workflows hit tier one consistently, which is the difference between an SLO that is published and one that is met.

What gets measured

The metrics that support an SLO-driven program are different from the metrics that support a heroics-driven one. The headline metric is SLO attainment rate per tier, expressed as the percentage of findings in each tier that closed within the committed window. A healthy program runs in the high 90s for tiers one and two, with tiers three and four trending similarly.

The supporting metrics are inflow rate per tier, mean time to PR-ready per tier, and mean time to merge per tier. These decompositions tell the team where the SLO is being challenged, so interventions can be targeted. A program that is missing tier one because PRs are not being merged fast enough has a different problem from one missing because identification-to-PR is too slow. The supporting metrics distinguish them.

The other metric worth tracking is error budget consumption. Each SLO has an implicit error budget, the share of findings that can miss the window without breaking the commitment. When error budget is running low for a given tier, the team knows to slow other initiatives and focus on remediation. When it is healthy, the team has bandwidth for improvement work. This pattern, borrowed from SRE practice, gives the program a way to balance new commitments against operational hygiene.

Where Griffin AI fits the program

Griffin AI plays two roles in an SLO-driven program. The first is at the tiering step. For each new finding, Griffin assembles the evidence needed to assign the correct tier: reachability, exploitability, exposure, asset criticality. The tiering decision is consistent across findings, which matters because inconsistent tiering will quietly destroy any SLO commitment.

The second role is at the formal acceptance and mitigation step, which determines whether tier four findings hit their 30-day window. Griffin generates the acceptance or mitigation proposal, gathers the supporting evidence, and routes for approval. The mechanical work that previously made tier four unsustainable becomes a workflow the team can run at scale.

Across both roles, Griffin produces evidence that is auditable. The tiering reasoning, the acceptance justification, and the mitigation specification are all captured automatically and persist with the finding. When an auditor or executive asks how a particular finding ended up in tier two rather than tier one, the answer is in the record, with the supporting data attached.

How to introduce SLOs without breaking morale

Introducing SLOs to a team that has been running on heroics needs to be done carefully. The first version of the SLO should reflect current performance, not aspirational performance. If the team is currently closing tier one findings in seven days, the first SLO commitment should be seven days, not 72 hours. The point is to establish the discipline of measurement and commitment, not to break the team in the first quarter.

Once the baseline SLO is being met consistently, the team can tighten the targets. Each tightening cycle should be supported by a corresponding investment in tooling or process. The SLO should never be tightened by exhortation. It is tightened by automation that makes the new target achievable.

Within two to three quarters, a team that started with loose SLOs and invested in reachability scoring, auto-PR generation, and Griffin AI triage will typically have tightened to the structure described earlier in this post, with SLO attainment in the high 90s. That is a program that has graduated from heroics to engineering, which is the transition every mature security organisation eventually needs to make.

The conversation that changes

The biggest cultural effect of moving to SLO-driven vulnerability management is the conversation it enables with the rest of the business. The security team can offer commitments rather than hopes. The engineering team has predictable interrupt patterns rather than ad hoc emergencies. The board sees a metric that means something rather than a backlog number that fluctuates with scanner output.

That is the program that survives 2026 and beyond, and it is the program every security leader should be working toward.
