In September 2023, security researchers at Checkmarx published findings about an intrusion pattern in which attackers used stolen GitHub personal access tokens to push malicious commits to both public and private repositories, with commit messages and author metadata crafted to impersonate the GitHub Dependabot service. The campaign targeted repositories across multiple ecosystems, pushing commits that appeared, at a glance, to be routine dependency updates but which actually modified workflow files or added obfuscated payloads. The incident was not a vulnerability in Dependabot itself; it was a misuse of GitHub's trust model, in which commit authorship is a cosmetic attribute that does not by itself prove origin.
The incident was short, the exposure relatively narrow, and GitHub's response quick. But the pattern it exposed has implications that extend well beyond the specific campaign: developers and automated systems alike will accept commits at face value when the author metadata looks familiar.
What happened
According to Checkmarx's research and the subsequent coverage, the campaign proceeded as follows. Attackers obtained valid GitHub personal access tokens belonging to legitimate users through a variety of mechanisms: credentials harvested from compromised developer machines, phishing, and possibly malicious packages that exfiltrated tokens. With those tokens, the attackers authenticated to GitHub as the compromised users and pushed commits to repositories those users had access to.
The commits were crafted to resemble Dependabot activity. Commit messages followed the "Bump <package> from <version> to <version>" pattern that Dependabot uses. The author name in the commit metadata was set to "dependabot[bot]", and the author email to the address that Dependabot uses. Developers scanning their repositories saw what looked like routine dependency maintenance; if they merged without detailed inspection, they accepted malicious payloads.
The payloads themselves varied. Some commits modified GitHub Actions workflow files to exfiltrate secrets during subsequent CI runs; some added a malicious dependency to package.json or an equivalent manifest; others introduced an obfuscated script that attempted to capture additional credentials. The specific payload was less important than the delivery mechanism: by exploiting trust in "Dependabot commits," the attackers got their code into repositories without triggering the usual skepticism that developers apply to commits from unknown authors.
The trust model the campaign exploited
GitHub commit metadata (author name, author email, committer name, committer email) is not cryptographically verified unless the commit is signed with a GPG, S/MIME, or SSH key that GitHub has validated. Most commits in most repositories are unsigned. A pushed commit can claim to be from any author, and GitHub will display that author's name next to the commit in the web UI. The only evidence that the commit actually originated from the claimed author is that a valid token or SSH key with push access was used, and that evidence is not surfaced in the commit display.
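The point that author metadata is just free-form text inside the commit object can be made concrete. The sketch below assembles a raw git commit object by hand (the `fake_commit_sha` helper is illustrative, not part of any git API) and shows that git's object hash binds to whatever author string you supply; nothing in the object itself proves who pushed it. The Dependabot-style email shown follows GitHub's noreply convention and is used here purely for illustration.

```python
import hashlib

def fake_commit_sha(tree_sha: str, author: str, email: str,
                    timestamp: int, message: str) -> str:
    """Build a raw git commit object with arbitrary author metadata
    and return the SHA-1 id git would assign to it. The author and
    committer lines are plain text; git validates their format, not
    their truth."""
    body = (
        f"tree {tree_sha}\n"
        f"author {author} <{email}> {timestamp} +0000\n"
        f"committer {author} <{email}> {timestamp} +0000\n"
        f"\n{message}\n"
    ).encode()
    # Git object ids hash a "commit <size>\0" header plus the body.
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# Impersonating Dependabot's metadata produces a perfectly valid
# commit id; only a signature could distinguish it from the real bot.
sha = fake_commit_sha(
    tree_sha="4b825dc642cb6eb9a060e54bf8d69288fbee4904",  # git's empty tree
    author="dependabot[bot]",
    email="dependabot[bot]@users.noreply.github.com",  # illustrative address
    timestamp=1695600000,
    message="Bump example-pkg from 1.0.0 to 1.0.1",
)
print(sha)
```

Pushing such an object only requires a credential with write access, which is exactly what the stolen PATs provided.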
For commits from a human contributor, this weakness matters less, because the contributor's identity does not by itself confer special trust. For commits that appear to be from an automated service like Dependabot, the weakness is more consequential. Developers have internalized the heuristic "if it looks like Dependabot, it's Dependabot." Automated merge bots, review policies that auto-approve Dependabot updates, and security tooling that treats Dependabot PRs as low-risk all encode the same heuristic. The impersonation campaign exploited this encoded trust directly.
GitHub's response included clarifying documentation, encouraging signed commits, and making verified commit badges more prominent in some views. The structural issue, that the author metadata is not by itself a proof of origin, remains. Signed commits mitigate it, but adoption of signed commits outside a minority of repositories is slow.
Why personal access tokens were the weak link
The campaign depended on token theft. GitHub personal access tokens, in their classic form, are long-lived, broadly scoped, and frequently stored in environments that are less protected than the authentication systems that issued them. A typical engineer might have a PAT stored in their shell profile, in a credential helper, in an IDE configuration, and in a CI environment variable. Any of those storage locations could leak the token to a malicious package, an infostealer, or a misconfigured log.
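A first step toward knowing where tokens leak is being able to find them. A minimal sketch, assuming GitHub's documented token prefixes (`ghp_` for classic PATs, `github_pat_` for fine-grained ones; the length bounds here are approximations):

```python
import re

# Classic PATs start with ghp_ followed by 36 alphanumerics;
# fine-grained tokens start with github_pat_. Lengths approximate.
TOKEN_RE = re.compile(
    r"\b(ghp_[A-Za-z0-9]{36}|github_pat_[A-Za-z0-9_]{22,255})\b"
)

def find_tokens(text: str) -> list[str]:
    """Return candidate GitHub tokens found in a blob of text,
    e.g. a shell profile, an IDE config file, or a CI log."""
    return TOKEN_RE.findall(text)

# A shell profile is exactly the kind of place tokens end up.
profile = 'export GITHUB_TOKEN="ghp_' + "A" * 36 + '"\nalias gs="git status"\n'
print(find_tokens(profile))
```

The same scan run over dotfiles, CI logs, and editor configs is a rough proxy for how many copies of a credential an infostealer could reach.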
GitHub has introduced fine-grained personal access tokens, which are scoped to specific repositories and organizations and have expiration dates. GitHub Apps and workload identity federation offer stronger alternatives for automation scenarios. Adoption of these alternatives has been uneven; many repositories, especially in smaller organizations, still authorize a mix of classic PATs alongside newer mechanisms.
The operational lesson is not that PATs are uniquely dangerous, but that any long-lived credential with broad scope represents a disproportionate fraction of the attack surface. An organization that can answer "how many classic PATs exist in our organization, which users hold them, and what scopes do they have" is in a position to reason about exposure. An organization that cannot answer those questions is relying on the absence of evidence of compromise.
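The inventory questions above can be encoded as simple checks. The record shape and the `risk_flags` helper below are hypothetical; real input would come from an organization-level credential audit rather than hand-written dictionaries:

```python
from datetime import date

# Hypothetical inventory records; in practice these would be exported
# from an organization's credential audit, not written by hand.
tokens = [
    {"user": "alice", "kind": "classic",
     "scopes": ["repo", "workflow"], "expires": None},
    {"user": "bob", "kind": "fine-grained",
     "scopes": ["contents:read"], "expires": date(2024, 6, 1)},
]

def risk_flags(tok: dict) -> list[str]:
    """Flag the properties that made the stolen tokens so useful:
    no expiry, classic (org-wide) form, and broad write scopes."""
    flags = []
    if tok["expires"] is None:
        flags.append("no-expiry")
    if tok["kind"] == "classic":
        flags.append("classic")
    if "repo" in tok["scopes"] or "workflow" in tok["scopes"]:
        flags.append("broad-scope")
    return flags

for tok in tokens:
    print(tok["user"], risk_flags(tok))
```

An organization that can produce this table for every authorized credential can answer the exposure question; one that cannot is relying on the absence of evidence.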
Lessons for automated dependency update workflows
Treat "looks like Dependabot" as a heuristic, not a control. Auto-merge policies, review exemptions, and security tool triage that trust Dependabot-attributed commits should instead rely on verifiable signals: commit signatures from a known key, workflow runs authenticated with Dependabot's actual identity, or pull requests that originated from Dependabot-managed branches with GitHub-verifiable provenance.
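One way to encode that rule is to gate on the signature verification data GitHub attaches to each commit, never on the display name. A sketch, assuming a payload shaped like the `commit.verification` object in GitHub's commits API response:

```python
def is_trustworthy_update(commit: dict) -> bool:
    """Decide whether an automated-update commit earns a review
    exemption. Only a verified signature counts; a matching author
    display name never does."""
    verification = commit.get("commit", {}).get("verification", {})
    return verification.get("verified", False) is True

# A spoofed commit carries Dependabot's name but no valid signature.
spoofed = {
    "commit": {
        "author": {"name": "dependabot[bot]"},
        "verification": {"verified": False, "reason": "unsigned"},
    }
}
print(is_trustworthy_update(spoofed))  # the display name alone fails the gate
```

The same predicate belongs wherever the heuristic currently lives: auto-merge bots, triage rules, and CI policies.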
Protect the workflow files themselves. A common payload in the 2023 campaign modified .github/workflows/ files to exfiltrate secrets or add malicious steps. Branch protection rules can require code owner review for changes to workflow files and disallow auto-merge for that path, regardless of the commit author. This is a low-cost defense that most repositories do not yet enforce.
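The path-based guard can be expressed as a small predicate. The glob list below is an example, and real enforcement belongs server-side in branch protection rules or a CODEOWNERS entry rather than in client code:

```python
from fnmatch import fnmatch

# Paths that should always force human review, regardless of who the
# commit claims to be from. Mirrors what a CODEOWNERS rule covering
# .github/workflows/ would enforce server-side.
PROTECTED_GLOBS = [".github/workflows/*", ".github/actions/**"]

def requires_owner_review(changed_files: list[str]) -> bool:
    """True if any changed path falls under a protected glob."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in PROTECTED_GLOBS
    )

print(requires_owner_review(["package.json"]))              # → False
print(requires_owner_review([".github/workflows/ci.yml"]))  # → True
```

A dependency bump that also touches a workflow file is precisely the case this check is meant to catch.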
Reduce long-lived PAT usage. Migrating automation from classic PATs to fine-grained tokens or GitHub Apps does not prevent token theft, but it narrows the blast radius and adds expiration. An organization that has eliminated long-lived PATs for production automation loses much less to any given token compromise.
Monitor for unusual commit patterns from automation identities. Dependabot's real activity has a recognizable shape: PRs rather than direct pushes, specific branch naming conventions, and consistent committer behavior over time. Commits that claim Dependabot authorship but deviate from those patterns (direct pushes to protected branches, changes to non-manifest files, unusual scheduling) are candidates for anomaly detection. Several third-party tools and GitHub's own secret scanning and push protection features support this kind of rule set.
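Those behavioral expectations translate into a simple rule set. The commit record shape below (`branch`, `message`, `files`, `direct_push`) is a hypothetical ingestion format, and the heuristics are illustrative rather than exhaustive:

```python
import re

# Real Dependabot commits follow a "Bump X from A to B" message shape
# and touch dependency manifests and lockfiles.
BUMP_RE = re.compile(r"^Bump \S+ from \S+ to \S+")
MANIFESTS = {"package.json", "package-lock.json", "requirements.txt",
             "go.mod", "go.sum", "Gemfile.lock", "yarn.lock"}

def dependabot_anomalies(commit: dict) -> list[str]:
    """Flag ways a Dependabot-attributed commit deviates from the
    bot's real behavior. Field names are assumptions about what an
    ingestion layer provides, not a GitHub API shape."""
    findings = []
    if commit["direct_push"]:
        findings.append("direct push (real Dependabot opens PRs)")
    if not commit["branch"].startswith("dependabot/"):
        findings.append("branch outside dependabot/* convention")
    if not BUMP_RE.match(commit["message"]):
        findings.append("message does not match bump pattern")
    if any(f not in MANIFESTS and not f.endswith(".lock")
           for f in commit["files"]):
        findings.append("touches non-manifest files")
    return findings

suspicious = {"direct_push": True, "branch": "main",
              "message": "Bump lodash from 4.17.20 to 4.17.21",
              "files": [".github/workflows/ci.yml"]}
for finding in dependabot_anomalies(suspicious):
    print(finding)
```

A commit matching the message pattern but pushed directly to main and touching a workflow file, as above, trips three of the four rules at once.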
What the incident did not prove
It is worth being precise about scope. The 2023 campaign did not compromise GitHub's infrastructure, did not breach Dependabot itself, and did not exploit a vulnerability in the GitHub product. It exploited a combination of weak token hygiene on the part of users and an underdeveloped trust model around commit authorship. Framing it as a "GitHub breach" misreads the evidence. Framing it as a reminder that commit metadata is not an identity claim is correct.
The lasting value of the incident lies in how it trained a large number of engineering teams to notice that their trust heuristics were brittle. The equivalent heuristic in other ecosystems, "if it looks like a release from the upstream maintainer, it's authentic," has been exploited repeatedly in package ecosystems. The structural fix (verifiable provenance linked to build-time attestations) is the same in every case.
How Safeguard Helps
Safeguard helps organizations treat commit provenance and token inventory as first-class security surfaces. The platform inventories GitHub personal access tokens and integration credentials across connected organizations, flags long-lived or broadly scoped tokens, and highlights repositories that lack branch protection on workflow files. Automated dependency updates are reviewed against their actual provenance rather than their displayed author, with verified signatures and attestations promoted to the prioritization decision. When impersonation campaigns like the 2023 Dependabot incident become public, teams can query which projects would be susceptible based on their current configuration. The effect is that auto-merge policies, review exemptions, and security tool triage rest on verifiable signals rather than commit metadata heuristics.