The OpenSSF Scorecard project published a v6 roadmap proposal in mid-2025 that moves the tool from a pure numeric score to a conformance engine aligned with the Open Source Project Security Baseline (OSPS Baseline). The proposal, tracked in pull request #4952 and discussed across multiple TAC meetings, preserves every existing check while adding an additive layer that emits PASS, FAIL, UNKNOWN, NOT_APPLICABLE, and ATTESTED verdicts against a control catalog. For consumers running Scorecard across thousands of repositories, the practical effect is a shift from "your repo scored 7.2" to "your repo conforms to OSPS Baseline level 1, partially to level 2." That is a meaningful difference when an enterprise procurement team is trying to decide whether an open-source dependency is acceptable.
What is the OSPS Baseline and why is Scorecard adopting it?
The OSPS Baseline is a control catalog developed within OpenSSF that codifies minimum security practices an open-source project should meet — branch protection, signed releases, dependency review, vulnerability disclosure, and similar. It is structured as Levels (L1, L2, L3) that scale with project criticality. Scorecard, prior to v6, expressed similar concepts as individual checks that combined into a 0-10 score. The v6 proposal preserves the checks (now reframed as "probes") and adds a control-to-probe mapping that lets users express "did this repo satisfy OSPS Baseline control BR-01 about branch protection?" instead of just "what is the Branch-Protection check score?" The mapping is versioned, so a consumer pinning Baseline v1.0 will get reproducible answers even after Scorecard adds new probes.
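The value of pinning the mapping version can be sketched in a few lines. Everything below is illustrative: the mapping structure, the probe names beyond branchProtection, and the "1.1.0" entries are assumptions for demonstration, not the actual Scorecard schema.

```python
# Hypothetical versioned control-to-probe mapping. Each Baseline version pins
# its own table, so evaluating against "1.0.0" stays reproducible even after
# new probes land in a later mapping version.
BASELINE_MAPPINGS = {
    "1.0.0": {
        "OSPS-BR-01": ["branchProtection"],
    },
    "1.1.0": {
        # A probe added later changes this version's answer, not 1.0.0's.
        "OSPS-BR-01": ["branchProtection", "requiresCodeReview"],
    },
}

def probes_for(control_id: str, baseline_version: str) -> list[str]:
    """Resolve the probes backing a control under a pinned Baseline version."""
    return BASELINE_MAPPINGS[baseline_version][control_id]

print(probes_for("OSPS-BR-01", "1.0.0"))  # ['branchProtection']
print(probes_for("OSPS-BR-01", "1.1.0"))  # ['branchProtection', 'requiresCodeReview']
```

A consumer who pins Baseline v1.0.0 keeps getting the same probe set, which is what makes verdicts comparable over time.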
What does the v6 verdict model look like?
Five verdicts per control. PASS means automated checks succeeded. FAIL means they failed. UNKNOWN means Scorecard cannot evaluate the control with current data. NOT_APPLICABLE means the control does not apply to the repository's type or language. ATTESTED is the new and most interesting verdict: a maintainer may sign an attestation asserting compliance for a control that Scorecard cannot fully automate (for example, "we run a documented incident response drill annually"), and the engine surfaces that attestation alongside automated probes. The combination lets consumers separate "we measured this and it passes" from "the maintainer claims this and we have a signed statement."
```json
{
  "control": "OSPS-BR-01",
  "title": "Branch protection enforced on default branch",
  "verdict": "PASS",
  "evidence": [
    {
      "probe": "branchProtection",
      "result": "satisfied",
      "score": 10
    }
  ],
  "baseline_level": "L1",
  "baseline_version": "1.0.0",
  "scorecard_version": "6.0.0-alpha.3",
  "evaluated_at": "2025-09-19T14:22:01Z"
}
```
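How probe outcomes collapse into one of the five verdicts can be sketched as follows. The precedence rules here are an assumption for illustration; the actual v6 semantics are defined by the proposal, not by this sketch.

```python
# Illustrative aggregation of probe outcomes into the five v6 verdicts.
def control_verdict(applicable: bool, probe_results: list[str],
                    attestation: bool = False) -> str:
    if not applicable:
        return "NOT_APPLICABLE"
    if probe_results and all(r == "satisfied" for r in probe_results):
        return "PASS"
    if any(r == "unsatisfied" for r in probe_results):
        return "FAIL"
    # No automated signal: fall back to a signed maintainer attestation, if any.
    return "ATTESTED" if attestation else "UNKNOWN"

print(control_verdict(True, ["satisfied"]))                 # PASS
print(control_verdict(True, ["satisfied", "unsatisfied"]))  # FAIL
print(control_verdict(True, [], attestation=True))          # ATTESTED
print(control_verdict(False, []))                           # NOT_APPLICABLE
```

The point of the separation is visible in the last two cases: an attestation upgrades an otherwise unmeasurable control from UNKNOWN to ATTESTED, but never to PASS.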
What new infrastructure does v6 introduce?
Four engines stitched together. A probe engine that runs the existing Scorecard checks against a repository. An applicability engine that detects preconditions — for example, suppressing fuzzing probes for a repo with no native code. A conformance engine that joins probe output to the OSPS Baseline control catalog via the versioned mapping file. And a Security Insights ingestion path that pulls maintainer-declared metadata from the security-insights.yml file that the ORBIT ecosystem project promotes. Together they aim to compose with Gemara Layer 4, the OpenSSF data model for conformance assessments, so that Scorecard's output can be consumed by Allstar, GUAC, and downstream policy engines without reformatting.
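A minimal sketch of how three of those engines might compose, assuming invented function names, a fake probe result set, and a hypothetical OSPS-QA-05 fuzzing control; none of this is the real Scorecard implementation.

```python
def probe_engine(repo: dict) -> dict:
    # Stand-in for running real probes against a repository.
    return {"branchProtection": "satisfied", "fuzzed": "unsatisfied"}

def applicability_engine(repo: dict, probe_name: str) -> bool:
    # Suppress fuzzing probes for a repo with no native code, per the roadmap.
    if probe_name == "fuzzed" and not repo.get("has_native_code"):
        return False
    return True

def conformance_engine(repo: dict, mapping: dict) -> dict:
    """Join probe output to the control catalog via the mapping."""
    results = probe_engine(repo)
    verdicts = {}
    for control, probes in mapping.items():
        applicable = [p for p in probes if applicability_engine(repo, p)]
        if not applicable:
            verdicts[control] = "NOT_APPLICABLE"
        elif all(results.get(p) == "satisfied" for p in applicable):
            verdicts[control] = "PASS"
        else:
            verdicts[control] = "FAIL"
    return verdicts

mapping = {"OSPS-BR-01": ["branchProtection"], "OSPS-QA-05": ["fuzzed"]}
repo = {"name": "example/lib", "has_native_code": False}
print(conformance_engine(repo, mapping))
# {'OSPS-BR-01': 'PASS', 'OSPS-QA-05': 'NOT_APPLICABLE'}
```

Note how the applicability engine prevents a pure-Python repo from failing a fuzzing control it could never satisfy: the control becomes NOT_APPLICABLE instead of FAIL.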
How will CI gating actually work?
The roadmap proposes that maintainers commit a policy file specifying minimum verdict requirements per control. A pull request that regresses a gated control from PASS to FAIL fails CI. Critically, gating is configurable: a project may require PASS on OSPS-BR-01 (branch protection) but accept ATTESTED on OSPS-IR-02 (incident response procedure) where automation is impractical. The Scorecard maintainers explicitly designed v6 so that existing users on v5 do not have to migrate; the legacy numeric score remains available and v6 conformance is additive.
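A per-control gate of that shape can be sketched in a few lines. The policy structure below is invented for illustration; the actual v6 policy file schema is still being defined in the proposal.

```python
# Hypothetical policy: which verdicts satisfy each gated control.
POLICY = {
    "OSPS-BR-01": {"require": ["PASS"]},              # automation mandatory
    "OSPS-IR-02": {"require": ["PASS", "ATTESTED"]},  # attestation acceptable
}

def gate(verdicts: dict, policy: dict) -> list:
    """Return the controls that violate the policy; an empty list passes CI."""
    return [c for c, rule in policy.items()
            if verdicts.get(c, "UNKNOWN") not in rule["require"]]

print(gate({"OSPS-BR-01": "PASS", "OSPS-IR-02": "ATTESTED"}, POLICY))
# [] -> gate passes

print(gate({"OSPS-BR-01": "FAIL", "OSPS-IR-02": "ATTESTED"}, POLICY))
# ['OSPS-BR-01'] -> block the pull request
```

Treating a missing verdict as UNKNOWN (and therefore a violation unless the policy allows it) is the conservative default an enterprise gate would likely want.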
How does v6 handle proprietary repos and air-gapped environments?
Scorecard historically targeted public GitHub repositories where rate-limited unauthenticated checks were viable. v6 work explicitly addresses two pain points enterprise consumers raised against earlier versions: running Scorecard against private internal repos, and running it offline against a vendored set of dependencies in air-gapped networks. The roadmap describes a deployment pattern where a self-hosted Scorecard instance authenticates via GitHub App, GitLab access token, or Azure DevOps PAT, evaluates internal repos against the same probes used for public projects, and stores conformance verdicts in a tenant-owned database. For air-gapped consumers, the roadmap describes a "package mode" where probes that depend on external services (OSV lookups, dependency-update API queries) degrade gracefully to UNKNOWN with a documented reason, rather than failing the entire evaluation. This is a substantial usability improvement: enterprises that wanted to require Scorecard-based gates on internal projects had been hand-rolling subsets of the probes, and v6 makes the official tooling viable inside the firewall.
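The graceful-degradation path can be sketched as a wrapper around a network-dependent probe. The function names, exception type, and output shape are assumptions for illustration, not the roadmap's actual "package mode" interface.

```python
class NetworkUnavailable(Exception):
    pass

def osv_lookup(package: str) -> str:
    # In an air-gapped network, the outbound call to osv.dev cannot succeed.
    raise NetworkUnavailable("no route to osv.dev")

def run_probe_offline(package: str) -> dict:
    """Degrade to UNKNOWN with a documented reason instead of aborting."""
    try:
        return {"result": osv_lookup(package)}
    except NetworkUnavailable as exc:
        return {"result": "UNKNOWN",
                "reason": f"external service unreachable: {exc}"}

print(run_probe_offline("left-pad"))
# {'result': 'UNKNOWN', 'reason': 'external service unreachable: no route to osv.dev'}
```

The rest of the evaluation proceeds normally; only the controls backed by this probe carry the UNKNOWN verdict and its reason.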
What does this mean for enterprise consumers?
If you currently fetch Scorecard scores via the public API to gate dependency adoption, v6 will let you upgrade from a heuristic threshold ("require Scorecard >= 7") to a control-level policy ("require PASS on OSPS-BR-01, OSPS-BR-02, OSPS-VM-01"). That is enforceable language that maps cleanly to enterprise security standards. It also exposes the previously hidden subjective component: when a project's Scorecard score dropped from 7.5 to 7.1, was it because branch protection regressed or because a new probe was added? v6's per-control verdicts answer that. Expect procurement and TPRM teams to write policy against Baseline control IDs by the time v6 stabilizes, with the v5 numeric score continuing as a coarse risk signal.
What is the maintainer impact?
Maintainers of open-source projects whose repos are scored by consumers across the ecosystem may worry that v6 raises the bar. The roadmap explicitly addresses this concern in two ways. First, the v6 model is additive: existing checks and scores remain, so a project that scores well today continues to score well. Second, the ATTESTED verdict lets maintainers acknowledge controls they implement through processes Scorecard cannot automatically detect, without those controls being treated as failures. The net effect for thoughtful maintainers is better representation of actual practice rather than artificially lower scores driven by automation limits. Maintainers who do nothing will see the same scores as before; maintainers who actively populate Security Insights metadata and produce attestations will see improved conformance verdicts that better reflect reality. Community feedback during the roadmap's development emphasized this point, and the Scorecard maintainers committed to preserving the additive principle through v6 and beyond.
How does v6 align with the broader OpenSSF stack?
Scorecard v6's conformance output is designed to plug into the rest of the OpenSSF tool family. The Security Insights ingestion path uses the security-insights.yml schema that the ORBIT ecosystem project maintains, letting maintainers declare facts about their project (vulnerability disclosure contacts, governance details, dependency policies) in a single canonical location consumed by multiple tools. The conformance output is consumable by GUAC, the graph-based supply-chain understanding project that aggregates SBOMs and attestations, where Scorecard verdicts can be used to filter or annotate dependency-graph traversals. Allstar, OpenSSF's GitHub repository-policy tool, can consume v6 verdicts to enforce specific OSPS Baseline controls on repositories under an organization's control. The Gemara Layer 4 alignment lets conformance assessments produced by Scorecard interoperate with conformance assessments from other producers, building toward a portable assessment ecosystem. For organizations adopting the full OpenSSF stack, v6 turns Scorecard from an isolated repo-evaluation tool into the conformance backbone the rest of the stack queries. For organizations using Scorecard standalone, the upgrade is still valuable, but the cross-tool synergy is the larger long-term payoff.
How Safeguard Helps
Safeguard already ingests OpenSSF Scorecard scores via the public BigQuery dataset and through on-demand evaluation for repositories not pre-scored. With v6 conformance output, Safeguard will normalize PASS/FAIL/ATTESTED verdicts into the platform's component risk model, weight ATTESTED claims based on signer identity, and flag UNKNOWN verdicts as missing-evidence gaps in TPRM evaluations. Griffin AI will explain why a specific OSPS Baseline control regressed by joining the verdict change to the underlying commit, issue, or workflow modification. Policy gates accept Baseline control IDs as first-class inputs, so a developer pushing a change to a dependency on a repo that just lost OSPS-BR-01 PASS status sees the block immediately, with context about which control failed and what evidence Scorecard relied on. For maintainers, Safeguard generates the in-toto attestations that v6 expects for ATTESTED controls, signed against the project's Sigstore identity.