In February 2025, ReversingLabs researchers disclosed two malicious machine learning models on Hugging Face that abused a previously unappreciated quirk in PyTorch serialization to evade the platform's Picklescan detection. The technique was dubbed nullifAI. It was the most operationally interesting Hugging Face pickle finding since broader pickle-deserialization concerns first surfaced in 2022: rather than relying on novel gadget chains, it exploited the parser disagreement between Picklescan (the defender's tool) and torch.load() (the consumer's tool). Where the two disagree, an attacker has a free move. Hugging Face removed the malicious models within 24 hours and patched Picklescan, but the underlying lesson — that parser disagreement is a class of vulnerability for ML supply chains — generalizes.
What the attackers did
PyTorch's default model serialization format is a ZIP archive containing pickled tensors and metadata. Picklescan walks the ZIP, finds the embedded pickle stream, and disassembles it looking for dangerous opcodes (REDUCE with a posix.system callable, GLOBAL with subprocess.Popen, and similar). If Picklescan finds nothing dangerous, the model is allowed onto the Hub. The nullifAI authors changed the outer container: instead of ZIP, they compressed the PyTorch archive with 7-Zip. Picklescan's archive walker did not recognize the 7z magic bytes and skipped the file silently. Crucially, torch.load() is more permissive — depending on the version it falls back to attempting alternative decompressors or simply reading the inner pickle stream directly when handed certain unusual archive headers. Where the defender saw nothing, the consumer saw a working model with a malicious pickle inside.
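The seam is easy to demonstrate. A scanner that dispatches on container magic bytes has to fail closed on anything it cannot parse, or skipped files become free passes. The sketch below is illustrative, not Picklescan's actual code; the two signatures are the real ZIP and 7z magic bytes:

# Sketch: fail-closed container sniffing (illustrative, not Picklescan's code)
import zipfile
from pathlib import Path

ZIP_MAGIC = b"PK\x03\x04"             # standard PyTorch ZIP archive
SEVENZ_MAGIC = b"7z\xbc\xaf\x27\x1c"  # 7-Zip container

def ingress_verdict(path: str) -> str:
    header = Path(path).read_bytes()[:6]
    if header.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        return "scan inner pickle"  # the normal, scannable case
    if header.startswith(SEVENZ_MAGIC):
        return "block: 7z container, cannot scan"
    # Anything unrecognized must be blocked, not skipped; silently
    # skipping is exactly the gap nullifAI walked through.
    return "block: unknown container"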
What the payload did
The pickle stream embedded at the start of the inner archive contained a typical platform-aware reverse shell. On load, the payload would detect the host operating system, open a connect-back socket to a hard-coded IP, and execute commands received over the channel. The implementation was straightforward; there was no novel evasion of host EDR, and the novelty was entirely in getting the pickle onto the Hub past Picklescan. ReversingLabs characterized the finding as a proof of concept rather than evidence of an active campaign, but that does not diminish its value as a worked attack on the parser-disagreement seam.
Why Picklescan is structurally a partial solution
Picklescan, like any opcode-level pickle disassembler, has to predict which callables are dangerous and which are not. The default deny-list includes the obvious suspects (os.system, subprocess.Popen, eval, exec), but pickle can invoke thousands of legitimate callable targets and the boundary is fuzzy. The pickle-deserialization findings of 2022, the long line of pickle-as-malware research before them, and the JFrog "silent backdoor" disclosure in 2024 have repeatedly demonstrated that allow-list approaches to pickle are essentially unworkable, because legitimate ML workloads import callables from torch, numpy, transformers, and dozens of other libraries. Picklescan does the best it can within deny-list constraints. The nullifAI bypass is a reminder that the deny-list itself can be sidestepped if the scanner never sees the bytes.
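For concreteness, opcode-level scanning looks roughly like the sketch below. This is a stripped-down illustration, not Picklescan's implementation, and the deny-list shown is a small sample:

# Sketch: deny-list scan over pickle opcodes (illustrative)
import io
import pickletools

DENYLIST = {
    ("os", "system"),
    ("posix", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"),
    ("builtins", "exec"),
}

def scan_pickle(data: bytes) -> list[str]:
    """Flag GLOBAL opcodes that resolve to deny-listed callables.

    Limits: STACK_GLOBAL builds its target from strings on the pickle
    VM stack, so a faithful scanner must also track preceding string
    pushes; and none of this runs if the scanner never reaches the bytes.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name == "GLOBAL":
            module, _, name = arg.partition(" ")
            if (module, name) in DENYLIST:
                findings.append(f"{module}.{name} at offset {pos}")
    return findings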
The structural fix: safetensors and signed models
Two converging mitigations exist. First, the safetensors format, authored by Hugging Face, does not use pickle; it is a memory-mappable binary container for tensors with a JSON header, and there is no execution path on load. Hugging Face has been actively migrating its hosted model catalog to safetensors, and as of 2025 the majority of high-traffic models are available in safetensors form. The right policy for enterprise consumers is to require safetensors and refuse pickle-format models at ingress. Second, the OpenSSF Model Signing project shipped version 1.0 in April 2025 in collaboration with NVIDIA and HiddenLayer. The model-signing library lets publishers sign model artifacts using Sigstore, and consumers verify signatures before load. A signed safetensors model carries provenance from publisher to verifier; a malicious pickle may still be loadable by torch.load(), but it cannot carry a valid signature from the expected publisher, so an enforcement gate refuses it.
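A minimal version of that enforcement gate might look like the sketch below. The safe_load helper and its policy choices are illustrative, and the verification call follows the shape of the model-signing 1.x API; confirm against the library's current documentation before depending on it.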
# Minimum enterprise ingress check before torch.load
import sys
from pathlib import Path

import model_signing  # OpenSSF model-signing library (pip install model-signing)

def safe_load(model_path: str, expected_signer: str) -> None:
    """Refuse pickle-format weights and unsigned artifacts at ingress."""
    p = Path(model_path)

    # Format gate: only non-executable tensor containers pass by default.
    if p.suffix not in (".safetensors", ".gguf"):
        sys.exit(f"refused: {p.suffix} format requires policy waiver")

    # Signature gate: a Sigstore signature must ship alongside the artifact.
    sig = p.with_name(p.name + ".sig")
    if not sig.exists():
        sys.exit("refused: no Sigstore signature present")

    # Provenance gate: verify against the expected publisher identity.
    # Raises on mismatch. Call shape follows the model-signing 1.x API.
    model_signing.verifying.Config().use_sigstore_verifier(
        identity=expected_signer,
        oidc_issuer="https://accounts.google.com",
    ).verify(str(p), str(sig))
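Called as safe_load("weights/model.safetensors", "releases@your-org.example") with a placeholder Sigstore identity, the gate refuses the artifact before torch.load ever touches a byte of pickle; only models that pass the format, signature-presence, and provenance checks proceed to loading.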
What Hugging Face changed after the disclosure
ReversingLabs reported nullifAI to Hugging Face's security team in late January 2025, and the malicious models were removed within 24 hours. Hugging Face shipped a Picklescan update that recognizes 7z-packed PyTorch archives and either scans the inner pickle or flags the model as unscannable. Hugging Face also continued to push the safetensors migration and expanded the visibility of the warnings shown to consumers when only pickle weights are available. The platform's broader posture, treating malicious models on a public hub as a fix-forward problem rather than a moderate-up-front problem, reflects a fundamental architectural constraint: Hugging Face is a hub, not a registry with curated trust. Enterprises consuming models from the hub must add their own gate.
Treating Hugging Face as an untrusted source
The correct mental model for consuming Hugging Face artifacts is the one mature engineering teams already use for npm or PyPI: an untrusted external source, mirrored internally with verification at ingress. Build a private model registry. Allow-list specific organizations on the Hub (openai/, meta-llama/, mistralai/, google/, and so on). For artifacts from less-trusted publishers, require either a signed model artifact or a hash that matches one verified by your security review. Pin versions: refuse to consume mutable revisions such as a branch head (a pinned pull is sketched below). Run downloaded models through a malware sandbox before promoting them to the registry; even safetensors models warrant code review when the repository ships custom modeling code loaded via trust_remote_code=True.
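As a concrete slice of that ingress flow, a pinned, pattern-restricted download keeps mutable branches and pickle weights out of the internal mirror entirely. The repo id and commit hash below are placeholders, and the allow-pattern list is a policy choice, not a huggingface_hub default:

# Sketch: pinned, format-restricted pull into a registry staging area
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-3.1-8B",  # allow-listed publisher (placeholder)
    revision="0123456789abcdef0123456789abcdef01234567",  # pinned commit SHA, never a branch
    allow_patterns=["*.safetensors", "*.json"],  # no pickle weights, no remote code
)
# Promote local_dir to the private registry only after hash verification
# and sandbox analysis pass.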
How Safeguard helps
Safeguard ingests Hugging Face model artifacts into an AIBOM keyed by repo, revision, and artifact hash, and flags models that present only pickle weights, lack a Sigstore signature, or require trust_remote_code=True. Policy gates block product builds that pull from publishers outside your allow-list. Griffin AI cross-references model metadata against ReversingLabs, JFrog, and Protect AI advisories — when a new malicious-model disclosure lands, every downstream product that consumed an affected artifact is surfaced within minutes. TPRM workflows monitor publishers continuously against historical breach signals (the 1,600+ leaked Hugging Face tokens disclosed in 2023 still surface today as a vendor risk factor), so model-supply-chain decisions remain auditable through your SOC 2 and AI Act evidence packages.