OSV Schema: The Open Source Vulnerability Database Format Explained

OSV provides a standardized format for vulnerability data that is purpose-built for open-source ecosystems. Here is how it works and why it is better than NVD for dependency scanning.

The Open Source Vulnerabilities (OSV) project, initiated by Google, provides a distributed, open-source vulnerability database and a standardized schema for describing vulnerabilities in open-source software. If you have been frustrated by the inaccuracies of NVD-based vulnerability matching for your open-source dependencies, OSV is the answer the industry has been building toward.

The Problem OSV Solves

NVD was designed for a world where software was identified by vendor and product name. As we covered in our discussion of CPE naming, this maps poorly to open-source packages. The result is high false positive rates, missing coverage for smaller projects, and a lag between vulnerability disclosure and NVD publication.

OSV takes a different approach. Instead of trying to map open-source packages into a commercial product naming scheme, OSV uses the identifiers that package ecosystems already use: the package name on the registry and the version numbers as published.

This means vulnerability matching goes from "try to construct a CPE and hope it matches NVD's CPE assignment" to "look up the exact package name and version in the OSV database." The accuracy improvement is dramatic.

The OSV Schema

The OSV schema defines a JSON format for vulnerability records. A minimal record looks like this:

{
  "id": "GHSA-xxxx-xxxx-xxxx",
  "summary": "Remote code execution in example-package",
  "details": "A detailed description of the vulnerability...",
  "aliases": ["CVE-2023-12345"],
  "modified": "2023-08-01T00:00:00Z",
  "published": "2023-07-15T00:00:00Z",
  "affected": [
    {
      "package": {
        "ecosystem": "npm",
        "name": "example-package"
      },
      "ranges": [
        {
          "type": "SEMVER",
          "events": [
            {"introduced": "1.0.0"},
            {"fixed": "1.5.3"}
          ]
        }
      ]
    }
  ],
  "severity": [
    {
      "type": "CVSS_V3",
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
    }
  ]
}

Several design decisions make this schema superior to NVD for open-source vulnerability matching.

Ecosystem-Native Package Identification

The affected.package field uses the ecosystem name and the package name as it appears in the registry. No CPE construction, no vendor guessing. If you want to check whether lodash@4.17.19 on npm is affected, you look for records where the ecosystem is "npm" and the name is "lodash."

Supported ecosystems include npm, PyPI, Maven, Go, crates.io, NuGet, Packagist, RubyGems, Pub, Hex, and many more. The ecosystem identifier tells the vulnerability matching tool exactly how to interpret the version string.

Semantic Version Ranges

The ranges field describes affected versions using events -- "introduced" and "fixed" markers. This is more expressive and more accurate than NVD's version range specifications.

A range with {"introduced": "2.0.0"} and {"fixed": "2.5.1"} means every version from 2.0.0 (inclusive) up to, but not including, 2.5.1 is affected; 2.5.1 is the first fixed version. This maps directly to how maintainers think about and communicate vulnerability scope.
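
To make the event semantics concrete, here is a minimal sketch of how a client might evaluate a version against a list of events. The function names parse_version and is_affected are hypothetical, and the version parser is a deliberate simplification that only handles plain numeric versions; a real scanner would use a full version parser for the ecosystem in question.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a plain numeric version such as '2.5.1' into a comparable tuple.

    Simplification (assumption): pre-release and build tags such as
    '2.5.1-rc.1' are not supported and raise ValueError. The bare '0' that
    OSV uses for 'from the first version' parses to (0, 0, 0).
    """
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def is_affected(version: str, events: list[dict[str, str]]) -> bool:
    """Walk the events in version order and track whether `version` is affected."""
    v = parse_version(version)
    ordered = sorted(events, key=lambda e: parse_version(next(iter(e.values()))))
    affected = False
    for event in ordered:
        if "introduced" in event and v >= parse_version(event["introduced"]):
            affected = True   # at or past the point the bug was introduced
        elif "fixed" in event and v >= parse_version(event["fixed"]):
            affected = False  # at or past the first fixed version
    return affected


events = [{"introduced": "2.0.0"}, {"fixed": "2.5.1"}]
for candidate in ("1.9.9", "2.0.0", "2.5.0", "2.5.1"):
    print(candidate, is_affected(candidate, events))
```

The "fixed" boundary is exclusive: 2.5.0 is reported as affected and 2.5.1 is not.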

The schema supports three range types: SEMVER for packages that follow semantic versioning, ECOSYSTEM for ecosystems with their own version ordering rules, and GIT for commit ranges. Git commit ranges are useful for projects that do not use standard versioning or for matching against specific commits in a repository.

Multiple Affected Packages

A single OSV record can describe a vulnerability that affects multiple packages. This is common when a vulnerability exists in a library that is distributed under different names in different ecosystems, or when a vulnerability in a core library affects multiple dependent packages.
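
Because one record can list several packages, a client has to pick out the entries that apply to the package it is checking. The sketch below assumes a record already parsed from JSON into a Python dict; the helper name affected_entries and the sample package names are hypothetical. It compares ecosystem and name exactly, so any ecosystem-specific name normalization is left out.

```python
def affected_entries(record: dict, ecosystem: str, name: str) -> list[dict]:
    """Return the `affected` entries in an OSV record that match one package."""
    matches = []
    for entry in record.get("affected", []):
        package = entry.get("package", {})
        # Exact comparison only; name normalization is out of scope here.
        if package.get("ecosystem") == ecosystem and package.get("name") == name:
            matches.append(entry)
    return matches


# Illustrative record with two affected packages (names are made up).
record = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "affected": [
        {"package": {"ecosystem": "npm", "name": "example-package"}},
        {"package": {"ecosystem": "PyPI", "name": "example-package-py"}},
    ],
}
print(len(affected_entries(record, "npm", "example-package")))
```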

Aliases

The aliases field links different identifiers for the same vulnerability. A single vulnerability might have a CVE ID, a GitHub Security Advisory (GHSA) ID, a Go vulnerability database ID, and a RustSec advisory ID. The aliases field connects all of these, enabling cross-database correlation.
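
One way to use aliases for cross-database correlation is to treat each record's id and aliases as connected identifiers and group everything that is linked. The following is a sketch of that idea using a small union-find; the function name group_by_alias and the sample IDs are hypothetical, and it is not a description of how any particular tool deduplicates.

```python
def group_by_alias(records: list[dict]) -> list[list[str]]:
    """Group vulnerability identifiers that are linked through `aliases`."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for record in records:
        find(record["id"])
        for alias in record.get("aliases", []):
            union(record["id"], alias)

    groups: dict[str, set[str]] = {}
    for identifier in parent:
        groups.setdefault(find(identifier), set()).add(identifier)
    return sorted(sorted(group) for group in groups.values())


# Made-up identifiers: two advisories share one CVE alias, one stands alone.
records = [
    {"id": "GHSA-aaaa-bbbb-cccc", "aliases": ["CVE-2023-0001"]},
    {"id": "GO-2023-0001", "aliases": ["CVE-2023-0001"]},
    {"id": "RUSTSEC-2023-0002"},
]
print(group_by_alias(records))
```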

The OSV.dev Service

OSV.dev is the public instance of the OSV database, aggregating vulnerability data from multiple sources:

GitHub Advisory Database: The largest source, covering npm, PyPI, Maven, Go, RubyGems, NuGet, and more
Go Vulnerability Database: Curated by the Go security team
RustSec Advisory Database: For the Rust/crates.io ecosystem
Python Advisory Database: PyPI-specific advisories
OSS-Fuzz: Vulnerabilities discovered by Google's continuous fuzzing service
Linux kernel: Kernel-specific vulnerability tracking

The aggregation means you get a single API endpoint that covers most major open-source ecosystems. The API is free, open, and does not require authentication for basic queries.

API Usage

The OSV API provides two primary endpoints:

Query by package: Given a package ecosystem, name, and version, return all known vulnerabilities.

POST https://api.osv.dev/v1/query
{
  "package": {
    "ecosystem": "npm",
    "name": "lodash"
  },
  "version": "4.17.19"
}
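
The same request can be sent from Python with only the standard library. This is a minimal sketch: the helper names build_query and query_osv are hypothetical, and it assumes the response carries matches under a "vulns" key and omits that key when nothing matches. It does not handle paginated responses or retries.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(ecosystem: str, name: str, version: str) -> dict:
    """Build the JSON body for a query-by-package request."""
    return {"package": {"ecosystem": ecosystem, "name": name}, "version": version}


def query_osv(ecosystem: str, name: str, version: str, timeout: float = 10.0) -> list[dict]:
    """POST the query and return the list of matching records (possibly empty)."""
    body = json.dumps(build_query(ecosystem, name, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: no matches comes back as an object without "vulns".
    return payload.get("vulns", [])


if __name__ == "__main__":
    for vuln in query_osv("npm", "lodash", "4.17.19"):
        print(vuln["id"], vuln.get("summary", ""))
```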

Query by vulnerability ID: Given a vulnerability ID (CVE, GHSA, etc.), return the full vulnerability record.

GET https://api.osv.dev/v1/vulns/GHSA-xxxx-xxxx-xxxx

The response includes all affected packages, version ranges, severity, and references. This is enough to perform accurate vulnerability matching against an SBOM with minimal client-side logic.
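
As a rough illustration of the client-side logic involved, the sketch below fetches a record by ID and pulls the first fixed versions out of it. The helper names vuln_url, fetch_vuln, and fixed_versions are hypothetical, and the sample record mirrors the minimal record shown earlier rather than a real advisory.

```python
import json
import urllib.parse
import urllib.request


def vuln_url(vuln_id: str) -> str:
    """Build the lookup URL for a single vulnerability ID."""
    return "https://api.osv.dev/v1/vulns/" + urllib.parse.quote(vuln_id, safe="")


def fetch_vuln(vuln_id: str, timeout: float = 10.0) -> dict:
    """GET the full record for one vulnerability ID."""
    with urllib.request.urlopen(vuln_url(vuln_id), timeout=timeout) as response:
        return json.load(response)


def fixed_versions(record: dict) -> dict[tuple, list[str]]:
    """Map (ecosystem, name) to the 'fixed' versions listed in the record."""
    result: dict[tuple, list[str]] = {}
    for entry in record.get("affected", []):
        package = entry.get("package", {})
        key = (package.get("ecosystem"), package.get("name"))
        for version_range in entry.get("ranges", []):
            for event in version_range.get("events", []):
                if "fixed" in event:
                    result.setdefault(key, []).append(event["fixed"])
    return result


# Placeholder record shaped like the minimal example above.
sample = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "affected": [{
        "package": {"ecosystem": "npm", "name": "example-package"},
        "ranges": [{"type": "SEMVER",
                    "events": [{"introduced": "1.0.0"}, {"fixed": "1.5.3"}]}],
    }],
}
print(fixed_versions(sample))
```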

OSV vs. NVD: A Practical Comparison

Accuracy

OSV's ecosystem-native identification produces dramatically fewer false positives than NVD's CPE-based matching. In practice, teams switching from NVD-only scanning to OSV-based scanning typically see a 30-50% reduction in false positive findings.

Coverage

For popular open-source ecosystems (npm, PyPI, Maven, Go), OSV coverage is generally equal to or better than NVD. For smaller ecosystems and commercial software, NVD still has broader coverage.

Timeliness

GitHub Security Advisories, which are a primary source for OSV, are often published before NVD assigns and publishes a CVE. The lag between an advisory becoming available and NVD publishing its own entry is a known issue, and OSV largely sidesteps it.

Data Quality

OSV data is curated by ecosystem-specific security teams who understand the package naming and versioning conventions. NVD data is curated by NIST analysts who must cover the entire technology landscape. The specialization of OSV curators produces higher quality data for open-source packages.

Severity Information

NVD provides CVSS scores for almost all CVEs. OSV records may or may not include severity information, depending on the upstream source. This is a gap -- for severity-based prioritization, you may still need to supplement OSV data with NVD CVSS scores.
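
When a record does carry severity, the score field holds a CVSS vector string rather than a number, so a client usually needs to parse it and to detect when it is missing. The sketch below shows one way to do both; the helper names are hypothetical, and the parser only handles vectors that start with a "CVSS:<version>" prefix, as in the record shown earlier.

```python
def cvss_vectors(record: dict) -> list[str]:
    """Return the CVSS vector strings from a record's top-level severity list."""
    return [
        item["score"]
        for item in record.get("severity", [])
        if item.get("type", "").startswith("CVSS_") and "score" in item
    ]


def parse_cvss_vector(vector: str) -> tuple[str, dict[str, str]]:
    """Split 'CVSS:3.1/AV:N/...' into its version and a metric dict.

    Assumption: the vector begins with a 'CVSS:<version>' prefix.
    """
    prefix, *metrics = vector.split("/")
    if not prefix.startswith("CVSS:"):
        raise ValueError(f"unsupported vector format: {vector!r}")
    version = prefix.split(":", 1)[1]
    return version, dict(metric.split(":", 1) for metric in metrics)


def needs_severity_supplement(record: dict) -> bool:
    """True when the record has no CVSS vector and another source is needed."""
    return not cvss_vectors(record)


record = {"severity": [{"type": "CVSS_V3",
                        "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"}]}
version, metrics = parse_cvss_vector(cvss_vectors(record)[0])
print(version, metrics["AV"], needs_severity_supplement({"id": "example"}))
```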

Integrating OSV into Your Workflow

SBOM Correlation

The most valuable integration is correlating your SBOMs against the OSV database. For each component in your SBOM:

1. Extract the ecosystem (from the purl type) and package name
2. Query OSV with the ecosystem, name, and version
3. Cross-reference with NVD for components not covered by OSV
4. Merge results, deduplicating using the aliases field
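
The first two steps above can be sketched as a function that turns a purl into an OSV query body. The purl parser here is simplified (it requires a version and ignores qualifiers and subpaths), and the purl-type-to-ecosystem table and the name-joining rules are assumptions to verify against the OSV ecosystem list before relying on them. The NVD cross-reference and merge steps are not shown.

```python
from typing import Optional
from urllib.parse import unquote

# Assumed mapping from purl type to OSV ecosystem name; extend as needed.
PURL_TYPE_TO_OSV = {
    "npm": "npm", "pypi": "PyPI", "maven": "Maven", "golang": "Go",
    "cargo": "crates.io", "nuget": "NuGet", "composer": "Packagist",
    "gem": "RubyGems", "pub": "Pub", "hex": "Hex",
}


def purl_to_osv_query(purl: str) -> Optional[dict]:
    """Convert 'pkg:type/namespace/name@version' into an OSV query body.

    Returns None when the purl type has no known OSV ecosystem.
    """
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    # Drop the subpath and qualifiers, which this sketch ignores.
    remainder = purl[4:].split("#", 1)[0].split("?", 1)[0]
    if "@" not in remainder:
        raise ValueError("a version is required to query OSV")
    path, _, version = remainder.rpartition("@")
    purl_type, _, full_name = path.partition("/")
    ecosystem = PURL_TYPE_TO_OSV.get(purl_type.lower())
    if ecosystem is None:
        return None
    segments = [unquote(segment) for segment in full_name.split("/") if segment]
    # Assumption: Maven names are 'group:artifact'; others join with '/'.
    separator = ":" if purl_type.lower() == "maven" else "/"
    return {
        "package": {"ecosystem": ecosystem, "name": separator.join(segments)},
        "version": unquote(version),
    }


print(purl_to_osv_query("pkg:npm/lodash@4.17.19"))
```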

CLI Tools

osv-scanner: Google's official CLI tool for scanning project dependencies against OSV. It supports lock files for most major ecosystems and produces output compatible with common CI/CD formats.

grype: Anchore's vulnerability scanner uses OSV among its data sources and integrates well with Syft for SBOM-based scanning.

trivy: Aqua Security's scanner also uses OSV data alongside other sources.

Continuous Monitoring

OSV data changes as new advisories are published and existing ones are updated. Set up continuous monitoring that re-scans your SBOMs against the latest OSV data on a regular schedule (daily at minimum).
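
One lightweight way to see what changed between scheduled scans is to compare each record's modified timestamp with the time of the previous scan. This is a sketch only: the helper names and sample IDs are hypothetical, and it assumes you already hold the records locally. Timestamps with an unusual number of fractional digits may need a more tolerant parser on older Python versions.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp such as '2023-08-01T00:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_since(records: list[dict], last_scan: datetime) -> list[str]:
    """Return the IDs of records modified after the previous scan."""
    return [
        record["id"]
        for record in records
        if parse_timestamp(record["modified"]) > last_scan
    ]


# Made-up records; only the second one changed after the last scan.
records = [
    {"id": "GHSA-aaaa-bbbb-cccc", "modified": "2023-07-15T00:00:00Z"},
    {"id": "GHSA-dddd-eeee-ffff", "modified": "2023-08-01T00:00:00Z"},
]
last_scan = datetime(2023, 7, 20, tzinfo=timezone.utc)
print(changed_since(records, last_scan))
```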

Contributing to OSV

OSV is an open ecosystem. If you discover a vulnerability in an open-source package, you can contribute the advisory:

Via GitHub: Create a security advisory on the package's GitHub repository. Once it has been reviewed and published to the GitHub Advisory Database, it is ingested by OSV automatically.
Via ecosystem databases: Submit to the appropriate ecosystem advisory database (RustSec, Go vulnerability database, etc.)
Directly: Submit OSV-format JSON records to the appropriate source database

Contributing to OSV improves the data quality for everyone and ensures that vulnerabilities you discover are tracked in a machine-readable format.

How Safeguard Helps

Safeguard integrates OSV data alongside NVD, GitHub Advisory Database, and other vulnerability sources in its correlation engine. When you import an SBOM, Safeguard queries all available sources and merges the results, giving you the broadest and most accurate vulnerability coverage available.

The deduplication logic uses OSV's aliases field to ensure that the same vulnerability is not reported multiple times under different identifiers. And because Safeguard stores SBOMs centrally, re-correlation against new OSV data happens automatically -- when a new advisory is published, every stored SBOM is checked, and affected components are flagged without any manual action.
