CVSS was designed to compare vulnerabilities in the abstract — across vendors, products, and threat models. It does this competently. What CVSS was never designed to do is tell a specific organisation what to fix first in a specific codebase. Programs that prioritise by CVSS alone get a queue ordered by hypothetical severity, which means high-severity-but-unreachable findings outrank medium-severity-but-actively-exploited ones. The triage outcomes that follow are predictable: engineering time spent on hypothetical risk while real risk ages in the queue. Mature programs use a composite signal that gives reachability the weight it deserves.
What each signal contributes
Four signals, each measuring something different:
- CVSS measures hypothetical severity. Useful for inter-vulnerability comparison but ignores your specific deployment.
- EPSS estimates the probability that a vulnerability will be exploited in the wild within the next 30 days. An empirical signal that adjusts for what attackers are actually doing.
- KEV is CISA's Known Exploited Vulnerabilities catalogue: a binary list of vulnerabilities under confirmed active exploitation. It warrants the highest weight of any signal.
- Reachability measures whether the vulnerability is exposed in your specific code path.
A mature triage queue weights all four, plus the criticality of the service hosting the finding. None is sufficient alone.
The composite that works
Programs that have stabilised on composite scoring tend to converge on a pattern:
- KEV listing: large fixed bonus (e.g., +30 points).
- Reachability: large bonus if reachable (+20 points), large penalty if unreachable (-15 points).
- CVSS base: scaled into the score (0-10 points).
- EPSS: scaled (0-10 points).
- Service criticality: scaled (0-10 points based on the asset hosting the finding).
Findings are then ranked by total composite score. The composite produces queues that engineering teams trust because the prioritisation matches their lived experience: KEV-listed reachable vulnerabilities go to the top regardless of CVSS; unreachable critical-CVSS findings drop in priority.
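The weighting pattern above can be sketched in a few lines. This is an illustrative implementation using the example point values from the list; the field and function names are hypothetical, not any vendor's actual API.

```python
from dataclasses import dataclass

# Illustrative weights from the pattern described above (assumptions,
# not a normative specification).
KEV_BONUS = 30
REACHABLE_BONUS = 20
UNREACHABLE_PENALTY = -15

@dataclass
class Finding:
    name: str
    cvss: float          # CVSS base score, 0.0-10.0
    epss: float          # EPSS exploitation probability, 0.0-1.0
    on_kev: bool         # listed in CISA's KEV catalogue
    reachable: bool      # vulnerable code sits on an executed path
    criticality: float   # criticality of the hosting service, 0.0-1.0

def composite_score(f: Finding) -> float:
    score = 0.0
    if f.on_kev:
        score += KEV_BONUS                     # large fixed bonus
    score += REACHABLE_BONUS if f.reachable else UNREACHABLE_PENALTY
    score += f.cvss                            # CVSS scaled into 0-10 points
    score += f.epss * 10                       # EPSS scaled into 0-10 points
    score += f.criticality * 10                # criticality scaled into 0-10 points
    return score

findings = [
    Finding("unreachable 9.8 in transitive dep", 9.8, 0.02, False, False, 0.5),
    Finding("reachable 6.5 in auth flow", 6.5, 0.40, True, True, 0.9),
]
queue = sorted(findings, key=composite_score, reverse=True)
```

With these weights the reachable, KEV-listed 6.5 scores 69.5 and tops the queue, while the unreachable 9.8 nets out to 0.0: exactly the inversion of a pure-CVSS ordering.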
Where pure-CVSS prioritisation fails
Three specific failure modes:
Hypothetical critical, real noise. A CVSS-9.8 in a transitive dependency you never call gets the same queue position as a CVSS-9.8 in a request handler your service exposes. The first is hypothetical risk; the second is a likely incident.
Reachable medium, real risk. A CVSS-6.5 in a function actively used in your authentication flow ranks below CVSS-9.0 findings that your code never reaches. Composite scoring corrects this.
KEV-listed underweighted. A CVE on the KEV list is being exploited right now. CVSS treats it the same as any other vulnerability with the same score. Composite scoring weights KEV explicitly.
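All three failure modes can be made concrete with a toy queue sorted both ways, reusing the illustrative weights from earlier (the finding labels are hypothetical):

```python
# Each tuple: (label, cvss, on_kev, reachable)
findings = [
    ("9.8 in uncalled transitive dep", 9.8, False, False),
    ("6.5 in active auth flow",        6.5, False, True),
    ("7.2 on KEV list, reachable",     7.2, True,  True),
]

def composite(f):
    """Simplified composite: KEV bonus + reachability adjustment + CVSS."""
    _, cvss, on_kev, reachable = f
    return (30 if on_kev else 0) + (20 if reachable else -15) + cvss

by_cvss = sorted(findings, key=lambda f: f[1], reverse=True)
by_composite = sorted(findings, key=composite, reverse=True)

# Pure CVSS puts the unreachable 9.8 first; the composite puts the
# KEV-listed reachable 7.2 first and the unreachable 9.8 last.
```

The pure-CVSS queue leads with the finding least likely to matter; the composite queue leads with the one being exploited right now.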
What composite scoring changes operationally
Three measurable shifts in customer programs that adopt it:
- Mean time to fix critical-and-reachable drops by 40-60% in the first quarter.
- Backlog age distribution flattens — old hypothetical findings are correctly deprioritised, recent reachable ones are fixed faster.
- Engineering trust in security findings rises measurably. Findings that are surfaced as urgent actually feel urgent when reviewed.
The third is the soft effect that compounds.
What auditors prefer
Auditors increasingly understand that CVSS alone is insufficient for prioritisation. A SOC 2 review or EU CRA self-assessment that shows composite scoring with documented methodology is more defensible than one that shows pure CVSS triage. The trend is toward formalising composite scoring as a documented control.
How Safeguard helps
Safeguard's findings are scored using a composite of CVSS, EPSS, KEV, reachability, and service criticality by default. The composite is configurable per-tenant; the methodology is documented for auditors. Griffin AI generates remediation PRs in composite-priority order. For programs whose triage queues no longer reflect actual risk, this is the architectural change that brings the prioritisation back into alignment with reality.