Incident Analysis

Hugging Face Token Exposure 2024 Analysis

Researchers found more than 1,600 valid Hugging Face API tokens in public code and models. An analysis of the 2024 exposures and what they mean for the ML supply chain.

Shadab Khan
Security Engineer
8 min read

The 2024 Hugging Face token exposure disclosures were a predictable outcome of two trends colliding: ML engineers treating notebooks as throwaway artifacts, and the growth of the Hugging Face Hub as the default distribution mechanism for models, datasets, and Spaces. Lasso Security's February 2024 disclosure, which found more than 1,600 valid tokens across GitHub and Hugging Face itself, was followed through the year by write-ups from JFrog, HiddenLayer, and Protect AI that expanded the scope. By the end of 2024, the community had accepted that the ML supply chain has its own distinct credential-leakage problem on top of the generic one every developer ecosystem faces.

This post analyzes the 2024 incidents, the privilege design of Hugging Face tokens, and what ML teams should do differently now.

What did Lasso Security find in February 2024?

Lasso Security found more than 1,600 exposed Hugging Face API tokens across public repositories, with a meaningful percentage carrying write access to organization-owned models and datasets. Their scanning covered GitHub, GitLab, and Hugging Face itself, including Spaces (interactive apps hosted on the platform). The write-access finding was the headline because it meant an attacker could modify models belonging to major organizations - Meta, Google, Microsoft, VMware, and others were named in the disclosure - and the modifications would flow to every downstream user who pulled the model.

Lasso's researchers confirmed token validity by calling the /api/whoami-v2 endpoint, which returns the authenticated user's scopes without side effects. They did not actually modify any models. The disclosure was handled responsibly with Hugging Face, which rotated the affected tokens and expanded its own scanning.
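
As a rough illustration of that validity check, here is a minimal sketch that calls whoami-v2 directly with requests. The endpoint and Bearer header match the public API; the exact fields in the response beyond the account name are assumptions to confirm against the live service.

```python
import os
import requests

HF_WHOAMI_URL = "https://huggingface.co/api/whoami-v2"

def check_token(token: str) -> dict | None:
    """Return identity info if the token is valid, None otherwise.

    whoami-v2 is read-only: it reports who the token belongs to
    without modifying anything on the Hub.
    """
    resp = requests.get(
        HF_WHOAMI_URL,
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    return resp.json() if resp.status_code == 200 else None

if __name__ == "__main__":
    info = check_token(os.environ["HF_TOKEN"])
    print("valid token for:", info.get("name") if info else "invalid token")
```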

A subtle but important finding: the leaks were not concentrated in abandoned or low-quality repositories. Tokens appeared in active projects, teaching material, notebook demos, and CI configurations across respected labs and companies. The common factor was that the token was treated as a convenience value rather than a privileged secret, often set once on a developer laptop and then inadvertently committed alongside a .env file or a Jupyter notebook cell output.

How do Hugging Face tokens work and what can they actually do?

Hugging Face tokens work as personal access tokens scoped to the user or organization context that issued them, and the default "write" scope grants full control over any model or dataset the identity owns or has collaborator rights on. Until the fine-grained rollout in mid-2024, the token model was simple: read or write, with organization-level tokens inheriting the organization's full permissions.

What a write token can actually do in practice (a blast-radius sketch follows this list):

  • Push commits to any model repository owned by the identity, including replacing model weight files.
  • Create new model or dataset repositories under the identity.
  • Modify model cards and README files to point users at alternative download instructions.
  • Trigger builds for Spaces owned by the identity, which on the platform means executing attacker code in the hosted environment.
  • Read private models and datasets that the identity has access to.
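
To see why the write scope matters, it helps to gauge blast radius: everything owned by the identity and its organizations is in reach. Below is a minimal sketch using huggingface_hub; the orgs and name fields in the whoami payload are assumptions to verify against a real account.

```python
# Minimal sketch: enumerate the repositories a leaked token could reach,
# i.e. everything owned by the identity and its organizations.
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])
me = api.whoami()
# "name" and "orgs" are the expected payload fields; verify against
# what whoami actually returns for your account.
owners = [me["name"]] + [org["name"] for org in me.get("orgs", [])]

for owner in owners:
    for model in api.list_models(author=owner):
        print("model:", model.id)
    for dataset in api.list_datasets(author=owner):
        print("dataset:", dataset.id)
```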

The model-weight replacement path is the most dangerous abuse. A model pulled from a known repository is, for most ML teams, trusted by default. An attacker with write access could replace pytorch_model.bin with a file containing a malicious serialized payload and every downstream user loading that model via torch.load would execute the payload. Hugging Face has since promoted safetensors specifically to close this class of issue, but adoption is incomplete and many legacy projects still use pickle-based formats.
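
A minimal sketch of the two load paths makes the difference concrete. File names are illustrative; weights_only is a PyTorch mitigation (available since 1.13) rather than a complete fix, because the checkpoint format underneath is still pickle.

```python
import torch
from safetensors.torch import load_file

# Risky: pickle-based checkpoints can run arbitrary code during
# unpickling. weights_only=True restricts the unpickler to tensor data
# and rejects most payloads, but the container is still pickle.
state_dict = torch.load("pytorch_model.bin", weights_only=True)

# Safer: safetensors is a tensor-only container with no code-execution
# path, so loading an untrusted file cannot run a payload.
state_dict = load_file("model.safetensors")
```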

Hugging Face introduced fine-grained tokens in mid-2024 after the Lasso disclosure, allowing token scopes to be narrowed to specific repositories and operations. Adoption has grown, but legacy tokens with full scope remain common.

What attacker capabilities were demonstrated post-disclosure?

Attacker capabilities demonstrated publicly after the disclosures included model-replacement proofs of concept, Spaces-hosted reverse shells, and data exfiltration from private datasets. Researchers at JFrog showed that a malicious pickle payload inside a model file executes immediately on AutoModel.from_pretrained, and HiddenLayer extended the work to show that Hugging Face's transformers loader did not meaningfully sandbox the load process.
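
This is not the researchers' actual payload, but the underlying mechanism is standard pickle behavior and easy to demonstrate harmlessly: any object in a pickle stream can define __reduce__, and the callable it returns runs during deserialization, before the caller ever sees a state_dict.

```python
import pickle

class Payload:
    def __reduce__(self):
        # A real attack would invoke os.system or a network call here;
        # print keeps the demonstration harmless.
        return (print, ("code executed during pickle.loads",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # prints the message: loading is executing
```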

The Spaces attack path is distinct and worth highlighting. A Space is a hosted application, typically a Gradio or Streamlit demo, that runs attacker-controlled code in the platform's containers. With a token that grants Space write access, an attacker can push code that beacons to command-and-control, scrapes visitor data, or uses the Space's own API credentials to pivot further. Hugging Face restricts egress and applies container isolation, but the Space owner's trust relationship with visitors is what makes it useful to an attacker.

For private dataset exposure, the attacker simply pulls the dataset via the leaked token. Of the abuse paths above, it is the one to assume first when a token leaks, because it is low-friction and does not require any code modification.

Why are ML repositories such a soft target for token leakage?

ML repositories are a soft target for token leakage because the workflow norms of ML development push credentials into places that traditional software development has spent a decade learning to treat as hostile territory. Specifically:

  • Jupyter notebooks persist cell outputs to disk, and a cell that prints a token for debugging embeds the token in the .ipynb file. Committing the notebook commits the token.
  • Training runs are often started from notebook cells or ad hoc shell invocations where tokens are pasted directly, rather than from scripts with environment-sourced credentials.
  • CI pipelines for ML projects frequently pull models from private repositories during build, requiring a token in the environment, and the CI configuration files that reference those tokens sometimes leak the literal value rather than a secret reference.
  • The research-oriented culture rewards sharing reproducibility artifacts, and a researcher copying a working training environment to a public repository can include tokens without a second look.

None of this is unique to ML, but the density of these patterns is higher in ML repositories than in typical software projects. The cleanup requires changing workflow tooling, not just changing secret management.
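
One piece of that tooling is scanning the notebooks themselves, outputs included, for token material. Below is a minimal sketch; the hf_ prefix and length in the regex reflect the current token format and may need adjusting for other secret types.

```python
# Minimal sketch: flag Hugging Face tokens that leaked into committed
# notebooks through cell source or persisted outputs.
import json
import re
import sys
from pathlib import Path

TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{30,}")

def scan_notebook(path: Path) -> list[str]:
    nb = json.loads(path.read_text(encoding="utf-8"))
    hits = []
    for cell in nb.get("cells", []):
        # Check both the cell source and its persisted outputs.
        text = "".join(cell.get("source", []))
        for out in cell.get("outputs", []):
            text += json.dumps(out)
        hits.extend(TOKEN_RE.findall(text))
    return hits

if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for nb_path in root.rglob("*.ipynb"):
        for token in scan_notebook(nb_path):
            print(f"{nb_path}: possible Hugging Face token {token[:10]}...")
```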

What did Hugging Face do in response?

Hugging Face responded with token rotation, expanded secret scanning, fine-grained token scopes, and a push toward safer model formats. The rotation was the immediate response to the disclosures and covered confirmed-exposed tokens. The scanning work integrated with GitHub's secret-scanning partner program and added Hugging Face's own scanning of files pushed to the Hub.

The fine-grained token rollout, released mid-2024, allows users to issue tokens with specific scopes - for example, read-only access to a single dataset - so that a leaked token has a bounded blast radius. Adoption requires explicit user action, which means the long tail of legacy tokens with broad scopes persists.

The safetensors push is the most durable structural change. Models published in safetensors format cannot execute arbitrary code on load because the format is a restricted tensor-only container. Hugging Face has made safetensors the default recommendation, updated documentation, and converted many hub-hosted models. The work is incremental, and pickle-based models remain in active use.
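
For teams ingesting third-party checkpoints, the conversion can happen at the trust boundary rather than waiting on upstream publishers. A minimal sketch is below; file names are illustrative, and non-tensor or shared-storage entries may need handling this sketch glosses over.

```python
# Minimal sketch: convert a pickle-based checkpoint to safetensors
# before it enters an internal model registry.
import torch
from safetensors.torch import save_file

state_dict = torch.load("pytorch_model.bin", weights_only=True, map_location="cpu")

# safetensors stores raw tensors only; drop non-tensor entries and make
# tensors contiguous (shared-storage tensors may need cloning first).
tensors = {k: v.contiguous() for k, v in state_dict.items() if isinstance(v, torch.Tensor)}
save_file(tensors, "model.safetensors")
```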

What should ML teams do differently in 2026?

ML teams in 2026 should treat Hugging Face tokens as production secrets, enforce fine-grained token scopes, load models only in safetensors where possible, and scan notebooks and training scripts the same way they scan application code. Practical steps:

  1. Rotate every token issued before mid-2024 and reissue as fine-grained tokens with the minimum scopes required.
  2. Use per-environment tokens. Developer-laptop tokens should not have the same scope as CI or production tokens (a CI scope-check sketch follows this list).
  3. Enforce nbstripout or an equivalent notebook sanitizer as a pre-commit hook to strip cell outputs before they reach version control.
  4. Require safetensors for any model that crosses a trust boundary. If a model ships only as pickle, either convert it to safetensors before ingesting or load it in a sandboxed environment.
  5. Monitor Hugging Face organization audit logs for unexpected pushes or token use from unfamiliar IP ranges. The platform exposes these events.
  6. Apply secret-scanning to ML artifacts including notebooks, training configs, and dataset card files. Generic code scanners often miss ML-specific file types unless configured.
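
As a concrete illustration of step 2, a CI job can refuse to run with anything stronger than a read token. The sketch below assumes the whoami payload exposes the token role under auth.accessToken.role; verify that shape against what your account actually returns before relying on it.

```python
# Minimal sketch: fail a CI pipeline if the Hugging Face token in the
# environment carries more than read scope.
import os
import sys
from huggingface_hub import HfApi

def token_role(token: str) -> str:
    info = HfApi().whoami(token=token)
    # Assumed payload shape: {"auth": {"accessToken": {"role": "read"}}}
    return info.get("auth", {}).get("accessToken", {}).get("role", "unknown")

if __name__ == "__main__":
    role = token_role(os.environ["HF_TOKEN"])
    if role != "read":
        sys.exit(f"CI token has role '{role}'; issue a read-only or fine-grained token instead")
    print("token scope check passed")
```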

The operational reality is that ML supply chain security is where traditional software supply chain security was around 2019. The bugs are known, the fixes are known, adoption is lagging, and most of the work is about enforcing existing controls consistently rather than inventing new ones.

How Safeguard.sh Helps

Safeguard.sh reachability analysis correlates leaked Hugging Face tokens with the models and datasets that are actually loaded in your production pipelines, filtering out unused inventory so CVE-plus-secret noise drops by 60 to 80 percent and remediation focuses on live exposure. Griffin AI autonomous remediation rotates exposed tokens, regenerates notebook cell outputs, and rebuilds model-ingest pipelines with safetensors-preferred configurations without a manual cleanup marathon. Eagle malware classification scans model artifacts for malicious serialized payloads. SBOM generation with 100-level dependency depth extends across Python, transformers, and model-weight layers so inherited risk is visible. Container self-healing restores ML-inference pods to a known-clean state, and TPRM extends the same visibility to model and dataset vendors in your supply chain.
