
Vulnerability Remediation SLAs: Best Practices for Real Teams

Setting vulnerability remediation deadlines is easy. Actually meeting them is hard. This guide covers practical SLA frameworks that balance security urgency with engineering reality.

Alex
Compliance Engineering Lead

Every security program has vulnerability remediation SLAs on paper. Critical vulnerabilities must be fixed within 7 days. High severity within 30 days. Medium within 90 days. Low within 180 days. It looks good in the security policy document. It looks good in audit presentations. And in most organizations, it is largely fiction.

The gap between stated SLAs and actual remediation times is one of the most persistent problems in vulnerability management. A 2023 study found that the average time to remediate a critical vulnerability in enterprise environments was 60 days, more than eight times the typical 7-day SLA. The problem is not that teams do not care about security. The problem is that most SLA frameworks are designed by security teams in isolation, without considering the realities of software development workflows.

Why SLAs Fail

One-Size-Fits-All Severity Mapping

Most SLA frameworks map directly from CVSS severity to remediation timeline. CVSS 9.0+ gets 7 days. CVSS 7.0-8.9 gets 30 days. This ignores context entirely.

A CVSS 9.8 vulnerability in a library function that your application never calls is not actually critical to your environment. A CVSS 6.5 vulnerability in your authentication middleware that is directly exposed to the internet might be the most urgent thing you need to fix. Severity without context is noise, and developers learn to ignore noise.

No Allowance for Fix Availability or Effort

An SLA of 7 days assumes that a fix is available and that applying it takes less than 7 days of engineering effort. Both assumptions frequently fail.

When a zero-day is disclosed with no patch available, the 7-day SLA starts ticking while the developer has nothing to do except wait. When the fix requires a major version upgrade that breaks APIs across 15 services, the engineering effort exceeds what any team can deliver in a week.

SLAs need to account for fix availability and remediation complexity, not just severity.

Volume Overwhelms Prioritization

A typical enterprise application with 200 dependencies might have 50 to 100 known vulnerabilities at any given time. Across a portfolio of 30 applications, that is 1,500 to 3,000 open vulnerabilities. When everything is urgent, nothing is prioritized. Teams become desensitized and start ignoring vulnerability reports entirely.

Building Practical SLAs

Risk-Based Severity Rather Than CVSS Alone

Effective SLAs incorporate context beyond CVSS scores:

Exploitability. Is there a known exploit in the wild? Is it in CISA's Known Exploited Vulnerabilities catalog? A vulnerability with active exploitation is fundamentally different from a theoretical vulnerability discovered by a researcher.

Exposure. Is the vulnerable component in an internet-facing service, an internal tool, or a development dependency that never runs in production? The same CVE at the same CVSS score warrants different urgency based on exposure.

Reachability. Does your code actually invoke the vulnerable function? Static analysis tools like Snyk's reachability analysis and Endor Labs' function-level analysis can help determine whether a vulnerability is reachable in your specific codebase. An unreachable vulnerability is still a risk (because code paths change), but it is not an imminent threat.

Data sensitivity. A vulnerability in a service that handles payment data or personal health information warrants faster remediation than the same vulnerability in a public marketing website.

Tiered SLA Framework

Rather than mapping CVSS to timelines, map risk tiers to timelines:

Tier 1 -- Actively Exploited / Internet-Facing / Sensitive Data: Fix within 48 hours. These are vulnerabilities with known exploits targeting components that are directly exposed and handle sensitive data. This tier should contain very few vulnerabilities at any time. If it routinely holds more than a handful, your perimeter is too broad.

Tier 2 -- High Risk / Exploitable: Fix within 14 days. High CVSS, exploit available or likely, but either not internet-facing or not handling sensitive data. This is where most "critical" vulnerabilities actually land after context assessment.

Tier 3 -- Elevated Risk: Fix within 45 days. Medium-to-high CVSS, no known exploit, reachable code paths. Standard remediation through normal sprint planning.

Tier 4 -- Low Risk / Unreachable: Fix within 90 days. Low CVSS, no exploit, unreachable code paths, or development-only dependencies. Batch these into quarterly dependency update cycles.

Tier 5 -- Accepted Risk: Documented exception with business justification. Some vulnerabilities cannot be fixed because the dependency has no patched version, the upgrade would break critical functionality, or the risk is genuinely negligible. These need formal acceptance with a review date.
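A minimal sketch of how the tier definitions above might be encoded. The `Finding` fields, the CVSS cutoffs (7.0 for "high" and 4.0 for "medium"), the order of the checks, and the `assign_tier` function itself are illustrative assumptions, not a standard. Tier 5 is left out because it is a documented human decision rather than something to compute.

```python
from dataclasses import dataclass
from datetime import timedelta

# Remediation windows per tier, taken from the framework above.
SLA_BY_TIER = {
    1: timedelta(hours=48),
    2: timedelta(days=14),
    3: timedelta(days=45),
    4: timedelta(days=90),
}


@dataclass
class Finding:
    """Hypothetical record for one detected vulnerability."""
    cvss: float
    actively_exploited: bool   # e.g. listed in CISA's KEV catalog
    exploit_available: bool    # an exploit exists or is considered likely
    internet_facing: bool
    sensitive_data: bool
    reachable: bool            # the vulnerable function is invoked by your code
    dev_only: bool = False     # dependency never runs in production


def assign_tier(f: Finding) -> int:
    """Map a finding to a risk tier (1 = most urgent). Cutoffs are assumptions."""
    # Tier 1 is checked first so a wrong reachability result cannot hide it.
    if f.actively_exploited and f.internet_facing and f.sensitive_data:
        return 1
    if f.dev_only or not f.reachable:
        return 4
    if f.cvss >= 7.0 and (f.actively_exploited or f.exploit_available):
        return 2
    if f.cvss >= 4.0:
        return 3
    return 4
```

Under these assumptions, the CVSS 9.8 finding in a function the application never calls lands in Tier 4, while an actively exploited CVSS 6.5 finding in internet-facing authentication middleware that handles sensitive data lands in Tier 1.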

SLA Clock Management

Define precisely when the SLA clock starts and what events pause it:

Clock starts: When the vulnerability is detected in your environment and assigned a risk tier. Not when the CVE is published -- you cannot fix what you do not know about.

Clock pauses: When no fix is available from the upstream vendor/maintainer. The SLA should track "time with fix available" rather than "time since detection." When a fix becomes available, the clock resumes.

Clock resets: Never. Once the clock starts, it does not reset. Reclassifying a vulnerability to a less urgent tier extends the deadline but does not restart the timer.
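The clock rules above can be sketched as a small calculation: time counted against the SLA is time since detection minus any period in which no fix existed. The function names and the `no_fix_windows` representation (a list of start/end pairs) are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def sla_elapsed(detected_at, now, no_fix_windows):
    """Time counted against the SLA: time since detection minus paused time.

    no_fix_windows is a list of (start, end) datetimes during which no
    upstream fix existed; end is None while a fix is still unavailable.
    Windows are assumed not to overlap each other.
    """
    elapsed = now - detected_at
    for start, end in no_fix_windows:
        # Clip each pause to the [detected_at, now] interval.
        pause_start = max(start, detected_at)
        pause_end = now if end is None else min(end, now)
        if pause_end > pause_start:
            elapsed -= pause_end - pause_start
    return elapsed


def is_overdue(detected_at, now, no_fix_windows, sla):
    """True if the counted time exceeds the tier's SLA window."""
    return sla_elapsed(detected_at, now, no_fix_windows) > sla
```

For example, a finding detected on March 1 whose fix shipped on March 10 has 10 days counted against it on March 20, so it is still inside a 14-day window. Because `detected_at` never changes, moving the finding to a different tier only changes the `sla` value passed in, which matches the "never resets" rule.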

Exception Process

An SLA framework without an exception process produces two outcomes: either teams ignore the SLAs, or they waste engineering time on low-value fixes to meet arbitrary deadlines. Neither is productive.

Design an exception process that is lightweight enough to actually be used:

Self-service for low-risk exceptions. Team leads should be able to approve Tier 4 exceptions with a documented justification. No committee required.

Manager approval for elevated exceptions. Tier 2 and Tier 3 exceptions require engineering manager and security team approval. The justification should include a compensating control (e.g., WAF rule, network segmentation, enhanced monitoring).

Executive approval for critical exceptions. Tier 1 exceptions require CISO or VP of Engineering approval. These should be rare -- if you are granting many Tier 1 exceptions, your tier definitions are wrong.

All exceptions expire. Every exception has a review date. When it expires, the vulnerability is reassessed. If the conditions have not changed, the exception can be renewed. But it must be a conscious decision, not indefinite deferral.
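One way to keep the exception rules above enforceable is to validate each request against them in code. This is a sketch only: the `ExceptionRequest` type, the role labels, and the function names are hypothetical and do not come from any particular ticketing or GRC system.

```python
from dataclasses import dataclass
from datetime import date

# Approver roles per tier, following the process above. Role names are
# illustrative labels, not identifiers from any real system.
APPROVERS_BY_TIER = {
    1: {"ciso_or_vp_engineering"},
    2: {"engineering_manager", "security_team"},
    3: {"engineering_manager", "security_team"},
    4: {"team_lead"},
}


@dataclass
class ExceptionRequest:
    tier: int
    justification: str
    review_date: date               # every exception must expire
    compensating_control: str = ""  # expected for Tier 2 and Tier 3
    approvals: frozenset = frozenset()


def validation_errors(req: ExceptionRequest) -> list:
    """Return the reasons a request cannot be granted yet (empty if valid)."""
    errors = []
    if not req.justification.strip():
        errors.append("missing justification")
    if req.tier in (2, 3) and not req.compensating_control.strip():
        errors.append("missing compensating control")
    missing = APPROVERS_BY_TIER[req.tier] - set(req.approvals)
    if missing:
        errors.append("missing approval: " + ", ".join(sorted(missing)))
    return errors


def is_active(req: ExceptionRequest, today: date) -> bool:
    """An exception only holds while it is valid and before its review date."""
    return not validation_errors(req) and today < req.review_date
```

Once `is_active` returns False, the finding goes back through reassessment, and renewing the exception means filing a new request rather than editing the old review date.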

Metrics That Actually Matter

Stop measuring the number of open vulnerabilities. That metric incentivizes teams to suppress findings rather than fix them. Measure these instead:

SLA compliance rate by tier. What percentage of Tier 1 vulnerabilities are remediated within SLA? Tier 2? This tells you whether your SLAs are achievable and whether teams are meeting them.

Mean time to remediate (MTTR) by tier. The average time from detection to fix, broken down by tier. Track trends over time. A falling MTTR suggests your remediation processes are improving.

Overdue vulnerability count by age. Not just "how many are overdue" but "how overdue are they." Ten vulnerabilities overdue by 2 days is very different from ten vulnerabilities overdue by 200 days.

Exception rate. What percentage of vulnerabilities receive exceptions? A high rate indicates SLAs that do not match engineering capacity. A zero rate might indicate that teams are not using the exception process and are instead ignoring SLAs silently.

Recurrence rate. How often do remediated vulnerabilities reappear? If a patched library gets downgraded in a subsequent release, your fix was not durable. This often indicates a lock file or dependency management problem.
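A rough sketch of how the first three metrics above could be computed from closed and overdue findings. The input shapes (tuples of tier, time to fix, and SLA window) and the overdue bucket edges of 7, 30, and 90 days are assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import timedelta
from statistics import mean


def compliance_rate_by_tier(remediated):
    """remediated: iterable of (tier, time_to_fix, sla) for closed findings.

    Returns {tier: fraction fixed within SLA}.
    """
    within, total = defaultdict(int), defaultdict(int)
    for tier, time_to_fix, sla in remediated:
        total[tier] += 1
        within[tier] += time_to_fix <= sla  # True counts as 1
    return {tier: within[tier] / total[tier] for tier in total}


def mttr_days_by_tier(remediated):
    """Mean time to remediate per tier, in days."""
    days = defaultdict(list)
    for tier, time_to_fix, _sla in remediated:
        days[tier].append(time_to_fix / timedelta(days=1))
    return {tier: mean(values) for tier, values in days.items()}


def overdue_age_buckets(days_overdue, edges=(7, 30, 90)):
    """Count overdue findings by how late they are; bucket edges are assumptions."""
    labels = [f"<={e}d" for e in edges] + [f">{edges[-1]}d"]
    counts = dict.fromkeys(labels, 0)
    for d in days_overdue:
        for edge, label in zip(edges, labels):
            if d <= edge:
                counts[label] += 1
                break
        else:
            # Later than every edge: falls into the final open-ended bucket.
            counts[labels[-1]] += 1
    return counts
```

Bucketing makes the point in the text visible: ten findings overdue by 2 days all fall in the first bucket, while ten overdue by 200 days all fall in the last one, even though the raw overdue count is identical.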

Integrating SLAs with Development Workflows

SLAs fail when they exist in a security silo. They succeed when they are integrated into how developers actually work.

Jira/Linear integration. When a vulnerability is detected, automatically create a ticket in the owning team's backlog with the SLA deadline as the due date. Do not create a separate "security ticket" -- it will be deprioritized.

Sprint planning visibility. Surface upcoming SLA deadlines in sprint planning tools so that product managers and engineering leads can allocate capacity.

PR-level feedback. When a developer introduces a new dependency with known vulnerabilities, flag it in the PR review. It is cheaper to address vulnerabilities before merge than after deployment.

Deployment gates. For Tier 1 vulnerabilities, block deployment until remediated. For lower tiers, allow deployment but track the SLA. Blocking all deployments for all vulnerabilities is counterproductive -- it incentivizes teams to avoid scanning.
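The ticket due date and the deployment gate described above reduce to two small decisions. The sketch below deliberately makes no Jira, Linear, or CI system calls; `ticket_due_date`, `gate_decision`, and the returned labels are hypothetical names, and the values they produce would be passed to whatever integration a team already uses.

```python
from datetime import datetime, timedelta

# Remediation windows per tier, from the tiered framework in this article.
SLA_BY_TIER = {
    1: timedelta(hours=48),
    2: timedelta(days=14),
    3: timedelta(days=45),
    4: timedelta(days=90),
}


def ticket_due_date(clock_started_at: datetime, tier: int) -> datetime:
    """Due date to set on the owning team's ticket when it is created."""
    return clock_started_at + SLA_BY_TIER[tier]


def gate_decision(open_tiers) -> str:
    """Decide what a deployment gate does given the tiers of open findings.

    Only Tier 1 blocks; lower tiers deploy but stay tracked against their SLA.
    """
    if 1 in open_tiers:
        return "block"
    if open_tiers:
        return "allow_and_track"
    return "allow"
```

This version of `ticket_due_date` does not account for clock pauses, so a due date would need to be pushed out when a finding spends time with no fix available.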

How Safeguard.sh Helps

Safeguard.sh turns vulnerability SLAs from a policy document into an operational reality. The platform assigns risk-based severity using exploit intelligence, reachability analysis, and asset context -- not just CVSS scores. SLA timelines are tracked automatically from detection through remediation, with configurable policies per team, application, or risk tier. Dashboards show SLA compliance rates, overdue vulnerabilities, and MTTR trends in real time, giving security teams the visibility they need to identify bottlenecks and giving engineering teams clear, prioritized remediation targets integrated into their existing workflows.
