The Open Source Vulnerabilities (OSV) project, initiated by Google, provides a distributed, open-source vulnerability database and a standardized schema for describing vulnerabilities in open-source software. If you have been frustrated by the inaccuracies of NVD-based vulnerability matching for your open-source dependencies, OSV is the answer the industry has been building toward.
The Problem OSV Solves
NVD was designed for a world where software was identified by vendor and product name. As we covered in our discussion of CPE naming, this maps poorly to open-source packages. The result is high false positive rates, missing coverage for smaller projects, and a lag between vulnerability disclosure and NVD publication.
OSV takes a different approach. Instead of trying to map open-source packages into a commercial product naming scheme, OSV uses the identifiers that package ecosystems already use: the package name on the registry and the version numbers as published.
This means vulnerability matching goes from "try to construct a CPE and hope it matches NVD's CPE assignment" to "look up the exact package name and version in the OSV database." The accuracy improvement is dramatic.
The OSV Schema
The OSV schema defines a JSON format for vulnerability records. A minimal record looks like this:
{
  "id": "GHSA-xxxx-xxxx-xxxx",
  "summary": "Remote code execution in example-package",
  "details": "A detailed description of the vulnerability...",
  "aliases": ["CVE-2023-12345"],
  "modified": "2023-08-01T00:00:00Z",
  "published": "2023-07-15T00:00:00Z",
  "affected": [
    {
      "package": {
        "ecosystem": "npm",
        "name": "example-package"
      },
      "ranges": [
        {
          "type": "SEMVER",
          "events": [
            {"introduced": "1.0.0"},
            {"fixed": "1.5.3"}
          ]
        }
      ]
    }
  ],
  "severity": [
    {
      "type": "CVSS_V3",
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
    }
  ]
}
Several design decisions make this schema superior to NVD for open-source vulnerability matching.
Ecosystem-Native Package Identification
The affected.package field uses the ecosystem name and the package name as it appears in the registry. No CPE construction, no vendor guessing. If you want to check whether lodash@4.17.19 on npm is affected, you look for records where the ecosystem is "npm" and the name is "lodash."
Supported ecosystems include npm, PyPI, Maven, Go, crates.io, NuGet, Packagist, RubyGems, Pub, Hex, and many more. The ecosystem identifier tells the vulnerability matching tool exactly how to interpret the version string.
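As a rough illustration of what "ecosystem-native" matching means in practice, the sketch below filters a list of already-downloaded OSV records by exact ecosystem and package name. The function name `records_for_package` and the sample records are hypothetical, and the comparison is a plain string match; a real tool would likely also apply each registry's name normalization rules.

```python
def records_for_package(records, ecosystem, name):
    """Return the OSV records whose `affected` list names this exact package."""
    matches = []
    for record in records:
        for affected in record.get("affected", []):
            pkg = affected.get("package", {})
            # Exact string comparison: no CPE construction, no vendor guessing.
            if pkg.get("ecosystem") == ecosystem and pkg.get("name") == name:
                matches.append(record)
                break  # one matching entry is enough to keep the record
    return matches


# Hypothetical records, trimmed to the fields used above.
sample_records = [
    {"id": "EXAMPLE-1", "affected": [{"package": {"ecosystem": "npm", "name": "lodash"}}]},
    {"id": "EXAMPLE-2", "affected": [{"package": {"ecosystem": "PyPI", "name": "lodash"}}]},
]
npm_hits = [r["id"] for r in records_for_package(sample_records, "npm", "lodash")]
```

Note that the same package name in a different ecosystem is treated as a different package, which is the point of carrying the ecosystem alongside the name.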
Semantic Version Ranges
The ranges field describes affected versions using events -- "introduced" and "fixed" markers. This is more expressive and more accurate than NVD's version range specifications.
A range with {"introduced": "2.0.0"} and {"fixed": "2.5.1"} means every version from 2.0.0 (inclusive) up to, but not including, 2.5.1 is affected, and 2.5.1 is the first fixed version. This maps directly to how maintainers think about and communicate vulnerability scope.
The schema supports three range types: SEMVER for versions that follow Semantic Versioning, ECOSYSTEM for versions ordered by an ecosystem's own rules, and GIT for commit ranges. Git commit ranges are useful for projects that do not use standard versioning or for matching against specific commits in a repository.
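The event model can be evaluated with a short loop. The following is a minimal sketch, not the reference implementation: it assumes plain MAJOR.MINOR.PATCH versions (pre-release and build tags are rejected), and the helper names `parse_version` and `is_affected` are made up for illustration. It walks the events in version order and toggles an "affected" flag, treating `fixed` as an exclusive upper bound.

```python
import re


def parse_version(text):
    """Parse MAJOR.MINOR.PATCH into a tuple of ints. "0" means "from the start"."""
    if text == "0":
        return (0, 0, 0)
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unsupported version format: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version, events):
    """Evaluate SEMVER range events against a version."""
    target = parse_version(version)
    bounds = []
    for event in events:
        for kind in ("introduced", "fixed", "last_affected"):
            if kind in event:
                bounds.append((parse_version(event[kind]), kind))
    bounds.sort(key=lambda item: item[0])  # process events in version order

    affected = False
    for bound, kind in bounds:
        if kind == "introduced" and target >= bound:
            affected = True
        elif kind == "fixed" and target >= bound:
            affected = False  # the fixed version itself is not affected
        elif kind == "last_affected" and target > bound:
            affected = False  # last_affected is an inclusive upper bound
    return affected


events = [{"introduced": "2.0.0"}, {"fixed": "2.5.1"}]
```

With these events, `is_affected("2.5.0", events)` is true and `is_affected("2.5.1", events)` is false.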
Multiple Affected Packages
A single OSV record can describe a vulnerability that affects multiple packages. This is common when a vulnerability exists in a library that is distributed under different names in different ecosystems, or when a vulnerability in a core library affects multiple dependent packages.
Aliases
The aliases field links different identifiers for the same vulnerability. A single vulnerability might have a CVE ID, a GitHub Security Advisory (GHSA) ID, a Go vulnerability database ID, and a RustSec advisory ID. The aliases field connects all of these, enabling cross-database correlation.
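One way to use aliases for cross-database correlation is to group records that share any identifier. The sketch below is an assumption about how a client might do this, not something the OSV project prescribes; `group_by_alias` and the sample IDs are hypothetical.

```python
def group_by_alias(records):
    """Group records that share an ID or alias; returns a list of ID sets."""
    groups = []  # each set holds identifiers believed to name one vulnerability
    for record in records:
        ids = {record["id"], *record.get("aliases", [])}
        # Merge every existing group that overlaps with this record's IDs.
        for group in [g for g in groups if g & ids]:
            ids |= group
            groups.remove(group)
        groups.append(ids)
    return groups


# Hypothetical records from three different source databases.
sample = [
    {"id": "GHSA-aaaa-bbbb-cccc", "aliases": ["CVE-0000-0001"]},
    {"id": "GO-0000-0001", "aliases": ["CVE-0000-0001"]},
    {"id": "RUSTSEC-0000-0001"},
]
groups = group_by_alias(sample)
```

Here the GHSA and Go records collapse into one group because both list the same CVE, while the RustSec record stays on its own.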
The OSV.dev Service
OSV.dev is the public instance of the OSV database, aggregating vulnerability data from multiple sources:
- GitHub Advisory Database: The largest source, covering npm, PyPI, Maven, Go, RubyGems, NuGet, and more
- Go Vulnerability Database: Curated by the Go security team
- RustSec Advisory Database: For the Rust/crates.io ecosystem
- Python Advisory Database: PyPI-specific advisories
- OSS-Fuzz: Vulnerabilities discovered by Google's continuous fuzzing service
- Linux kernel: Kernel-specific vulnerability tracking
The aggregation means you get a single API endpoint that covers most major open-source ecosystems. The API is free, open, and does not require authentication for basic queries.
API Usage
The OSV API provides two primary endpoints:
Query by package: Given a package ecosystem, name, and version, return all known vulnerabilities.
POST https://api.osv.dev/v1/query
{
  "package": {
    "ecosystem": "npm",
    "name": "lodash"
  },
  "version": "4.17.19"
}
Query by vulnerability ID: Given a vulnerability ID (CVE, GHSA, etc.), return the full vulnerability record.
GET https://api.osv.dev/v1/vulns/GHSA-xxxx-xxxx-xxxx
The response includes all affected packages, version ranges, severity, and references. This is enough to perform accurate vulnerability matching against an SBOM with minimal client-side logic.
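For reference, the two calls above can be made from Python using only the standard library. This is a sketch under a few assumptions: the helper names are invented, error handling and retries are omitted, and the query response is assumed to carry its matches under a `vulns` key (absent when nothing matches). Large result sets may be paginated, which this sketch does not handle.

```python
import json
import urllib.request

OSV_API = "https://api.osv.dev/v1"


def build_query(ecosystem, name, version):
    """Build the JSON body for a package query."""
    return {"package": {"ecosystem": ecosystem, "name": name}, "version": version}


def query_package(ecosystem, name, version, timeout=10):
    """POST a package query and return the IDs of matching vulnerabilities."""
    body = json.dumps(build_query(ecosystem, name, version)).encode("utf-8")
    request = urllib.request.Request(
        f"{OSV_API}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [vuln["id"] for vuln in payload.get("vulns", [])]


def get_vulnerability(vuln_id, timeout=10):
    """GET the full record for a single vulnerability ID."""
    with urllib.request.urlopen(f"{OSV_API}/vulns/{vuln_id}", timeout=timeout) as response:
        return json.load(response)
```

Calling `query_package("npm", "lodash", "4.17.19")` would send the same request body shown above.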
OSV vs. NVD: A Practical Comparison
Accuracy
OSV's ecosystem-native identification produces dramatically fewer false positives than NVD's CPE-based matching. In practice, teams switching from NVD-only scanning to OSV-based scanning typically see a 30-50% reduction in false positive findings.
Coverage
For popular open-source ecosystems (npm, PyPI, Maven, Go), OSV coverage is generally equal to or better than NVD. For smaller ecosystems and commercial software, NVD still has broader coverage.
Timeliness
GitHub Security Advisories, which are a primary source for OSV, are often published before the corresponding CVE entry appears in NVD. The lag between advisory availability and NVD publication is a known issue, and OSV largely sidesteps it.
Data Quality
OSV data is curated by ecosystem-specific security teams who understand the package naming and versioning conventions. NVD data is curated by NIST analysts who must cover the entire technology landscape. The specialization of OSV curators produces higher quality data for open-source packages.
Severity Information
NVD provides CVSS scores for almost all CVEs. OSV records may or may not include severity information, depending on the upstream source. This is a gap -- for severity-based prioritization, you may still need to supplement OSV data with NVD CVSS scores.
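When an OSV record does carry severity, the `score` field holds a CVSS vector string rather than a number, so a numeric score has to be derived. The sketch below computes the CVSS v3.1 base score from such a vector using the weights and rounding rule from the CVSS v3.1 specification. It handles only base metrics and only the `CVSS:3.1` prefix; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether scope is Unchanged or Changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a CVSS:3.1 vector string."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported CVSS version: {prefix!r}")
    m = dict(item.split(":", 1) for item in metrics)
    scope = m["S"]
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]] * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


def record_base_score(record):
    """Return the base score from a record's CVSS_V3 entry, or None if absent."""
    for entry in record.get("severity", []):
        if entry.get("type") == "CVSS_V3" and entry.get("score", "").startswith("CVSS:3.1/"):
            return base_score(entry["score"])
    return None
```

A `None` result from `record_base_score` marks the records where severity would need to come from another source such as NVD.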
Integrating OSV into Your Workflow
SBOM Correlation
The most valuable integration is correlating your SBOMs against the OSV database. For each component in your SBOM:
- Extract the ecosystem (from the purl type) and package name
- Query OSV with the ecosystem, name, and version
- Cross-reference with NVD for components not covered by OSV
- Merge results, deduplicating using the aliases field
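The first two steps can be sketched as a function that turns a package URL (purl) into an OSV query body. The type-to-ecosystem table below covers only a handful of ecosystems and the function name `purl_to_query` is hypothetical. Qualifiers and subpaths are dropped, and no ecosystem-specific version normalization is attempted, so treat this as a starting point rather than a complete purl parser.

```python
from urllib.parse import unquote

# Partial mapping from purl type to OSV ecosystem name.
PURL_TYPE_TO_OSV = {
    "npm": "npm",
    "pypi": "PyPI",
    "maven": "Maven",
    "golang": "Go",
    "cargo": "crates.io",
    "nuget": "NuGet",
    "gem": "RubyGems",
    "composer": "Packagist",
    "pub": "Pub",
    "hex": "Hex",
}


def purl_to_query(purl):
    """Convert pkg:type/namespace/name@version into an OSV query, or None."""
    if not purl.startswith("pkg:"):
        return None
    # Drop the optional #subpath and ?qualifiers parts.
    rest = purl[len("pkg:"):].split("#", 1)[0].split("?", 1)[0]
    purl_type, _, remainder = rest.partition("/")
    ecosystem = PURL_TYPE_TO_OSV.get(purl_type.lower())
    segments = [s for s in remainder.split("/") if s]
    if ecosystem is None or not segments:
        return None  # unknown ecosystem: a candidate for the NVD fallback step
    raw_name, sep, raw_version = segments[-1].rpartition("@")
    if not sep or not raw_name or not raw_version:
        return None  # no version, so no precise match is possible
    parts = [unquote(s) for s in segments[:-1]] + [unquote(raw_name)]
    # OSV names Maven packages as "group:artifact"; other ecosystems use "/".
    joiner = ":" if ecosystem == "Maven" else "/"
    return {
        "package": {"ecosystem": ecosystem, "name": joiner.join(parts)},
        "version": unquote(raw_version),
    }
```

For example, `purl_to_query("pkg:npm/lodash@4.17.19")` yields the same request body as the lodash query shown in the API section.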
CLI Tools
osv-scanner: Google's official CLI tool for scanning project dependencies against OSV. It supports lock files for most major ecosystems and produces output compatible with common CI/CD formats.
grype: Anchore's vulnerability scanner uses OSV among its data sources and integrates well with Syft for SBOM-based scanning.
trivy: Aqua Security's scanner also uses OSV data alongside other sources.
Continuous Monitoring
OSV data changes as new advisories are published and existing ones are updated. Set up continuous monitoring that re-scans your SBOMs against the latest OSV data on a regular schedule (daily at minimum).
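A scheduled re-scan is most useful when it reports only what changed since the previous run. The following sketch assumes each scan is stored as a mapping from a component identifier to the set of vulnerability IDs found for it; that storage shape and the name `new_findings` are assumptions made for illustration.

```python
def new_findings(previous, current):
    """Return vulnerability IDs present in the current scan but not the previous one."""
    added = {}
    for component, ids in current.items():
        fresh = set(ids) - set(previous.get(component, ()))
        if fresh:
            added[component] = fresh
    return added


# Hypothetical results from two consecutive daily scans.
yesterday = {"pkg:npm/lodash@4.17.19": {"EXAMPLE-1"}}
today = {
    "pkg:npm/lodash@4.17.19": {"EXAMPLE-1", "EXAMPLE-2"},
    "pkg:pypi/requests@2.25.0": set(),
}
delta = new_findings(yesterday, today)
```

Only the newly published advisory shows up in `delta`, which keeps daily alerts focused on findings that need attention.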
Contributing to OSV
OSV is an open ecosystem. If you discover a vulnerability in an open-source package, you can contribute the advisory:
- Via GitHub: Create a security advisory on the package's GitHub repository. It will be automatically ingested by OSV.
- Via ecosystem databases: Submit to the appropriate ecosystem advisory database (RustSec, Go vulnerability database, etc.)
- Directly: Submit OSV-format JSON records to the appropriate source database
Contributing to OSV improves the data quality for everyone and ensures that vulnerabilities you discover are tracked in a machine-readable format.
How Safeguard.sh Helps
Safeguard.sh integrates OSV data alongside NVD, GitHub Advisory Database, and other vulnerability sources in its correlation engine. When you import an SBOM, Safeguard.sh queries all available sources and merges the results, giving you the broadest and most accurate vulnerability coverage available.
The deduplication logic uses OSV's aliases field to ensure that the same vulnerability is not reported multiple times under different identifiers. And because Safeguard.sh stores SBOMs centrally, re-correlation against new OSV data happens automatically -- when a new advisory is published, every stored SBOM is checked, and affected components are flagged without any manual action.