By November 2023, Hugging Face had become the GitHub of AI—hosting over 400,000 models, 80,000 datasets, and 100,000 demo applications. Organizations building AI-powered features routinely download pre-trained models from the Hugging Face Hub, fine-tune them, and deploy them to production. But unlike npm or PyPI, where the security community has spent years building vulnerability scanning and malware detection, the AI model supply chain remains largely unguarded.
The AI Model as an Attack Vector
AI models aren't just data—they're executable code. This is the fundamental security insight that the AI community has been slow to internalize.
Pickle files are arbitrary code execution. The most common serialization format for Python ML models is pickle (.pkl, .pt, .pth files). Python's pickle module can execute arbitrary code during deserialization. Loading a malicious pickle file is functionally equivalent to running exec() on attacker-controlled code.
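To make the mechanism concrete, here is a minimal, harmless sketch of how a pickled object runs code the moment it is loaded (the class name and the echo payload are purely illustrative):

```python
import os
import pickle

class MaliciousArtifact:
    # pickle calls __reduce__ when serializing; the callable and arguments
    # it returns are invoked when the bytes are later deserialized.
    def __reduce__(self):
        # A real payload would be a reverse shell or credential stealer;
        # this harmless command just demonstrates the execution primitive.
        return (os.system, ("echo 'arbitrary code ran during model load'",))

payload = pickle.dumps(MaliciousArtifact())

# The victim only "loads a model" -- but this line executes the command above.
pickle.loads(payload)
```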
This isn't a theoretical risk. In 2023, researchers demonstrated multiple attack scenarios:
- Models that execute reverse shells when loaded
- Models that exfiltrate environment variables (including cloud credentials) during deserialization
- Models that install persistent backdoors while appearing to function normally
- Models that inject malicious code into the training pipeline
SafeTensors adoption. Hugging Face has been promoting SafeTensors, a safe serialization format that only stores tensor data without executable code. Adoption has grown significantly in 2023, but pickle-based model files remain common, especially for older models and certain frameworks.
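For contrast, a minimal sketch using the safetensors library shows why the format is safer: only raw tensor bytes and metadata are written, so loading offers no code path for an attacker to hijack.

```python
import torch
from safetensors.torch import load_file, save_file

# Only tensors are stored -- no Python objects, no __reduce__ hooks.
weights = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
save_file(weights, "model.safetensors")

# Loading parses tensor metadata and raw bytes; nothing executes.
restored = load_file("model.safetensors")
```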
Real-World Incidents
Several incidents in 2023 highlighted AI supply chain risks:
Malicious models on Hugging Face. JFrog's security research team discovered over 100 models on Hugging Face containing malicious code. The models used pickle deserialization to execute arbitrary payloads, including reverse shells and credential stealers. The models were uploaded to repositories with names mimicking popular legitimate models.
Compromised datasets. Researchers demonstrated that poisoned training data could be uploaded to Hugging Face datasets, potentially corrupting models fine-tuned on that data. Unlike code, where malicious intent can sometimes be detected through static analysis, poisoned training data is extremely difficult to identify.
Model backdoors. Academic researchers published multiple papers in 2023 demonstrating techniques for embedding backdoors in neural networks that are activated by specific trigger inputs. A backdoored model performs normally on standard inputs but produces attacker-controlled outputs when triggered.
The Supply Chain Gap
Traditional software supply chain security tools don't cover AI models:
SBOM tools don't inventory models. Standard SBOM generators track code dependencies but not model files. An organization might have a complete SBOM for its Python application while having no inventory of the pre-trained models it loads.
Vulnerability scanners don't scan models. Tools like Trivy, Grype, and Snyk scan for CVEs in software packages. They don't analyze model files for malicious payloads or backdoors.
Package registries have controls; model hubs don't. npm, PyPI, and Maven Central have implemented malware scanning, 2FA requirements, and package signing. Hugging Face has taken steps toward security (malware scanning, SafeTensors adoption), but the controls are less mature.
Provenance is weak. For software packages, tools like SLSA and Sigstore provide provenance attestations—proof of where and how an artifact was built. For AI models, provenance is typically a README claiming the model was trained on a certain dataset with certain hyperparameters. Verifying these claims is extremely difficult.
Threat Model
Organizations using pre-trained models face several threat categories:
Model Poisoning
An attacker contributes poisoned training data or publishes a pre-trained model with embedded backdoors. The model performs well on benchmark tests but produces incorrect or malicious outputs for specific trigger inputs.
Impact: Incorrect decisions based on manipulated model outputs. In a security context, a poisoned malware classifier could be trained to always classify a specific malware family as benign.
Model Theft via Trojan
An attacker publishes a model that exfiltrates its own weights or the data it processes to an external server. Organizations that download and deploy the model unknowingly create an exfiltration channel.
Impact: Loss of proprietary data processed by the model. In a RAG (Retrieval-Augmented Generation) application, this could include the entire knowledge base.
Credential Theft
As demonstrated by the JFrog research, malicious pickle files can steal credentials from the environment where the model is loaded. Developer machines and CI/CD environments are particularly rich targets.
Impact: Stolen cloud credentials, API keys, and SSH keys leading to broader infrastructure compromise.
Dependency Chain Attacks
AI models often depend on specific versions of frameworks (PyTorch, TensorFlow, transformers) and libraries. A model's requirements.txt or config.json might specify dependencies that include vulnerable or malicious packages.
Impact: Software supply chain compromise through the AI model's dependency requirements.
Mitigations
Use SafeTensors. Prefer models in SafeTensors format over pickle. If a model is only available in pickle format, evaluate whether you can convert it or find an alternative.
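If conversion is an option, one possible path looks like the sketch below. It assumes the checkpoint is a flat state dict of tensors and a recent PyTorch release that supports weights_only; the filenames are placeholders.

```python
import torch
from safetensors.torch import save_file

# weights_only=True refuses to unpickle arbitrary Python objects, which
# reduces (but does not eliminate) the risk of loading an untrusted file.
state_dict = torch.load("legacy_model.pt", map_location="cpu", weights_only=True)

# Assumes a flat dict of tensors; persist it in a format that cannot carry code.
save_file(state_dict, "legacy_model.safetensors")
```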
Sandbox model loading. Load untrusted models in sandboxed environments (containers, VMs) with no network access and limited filesystem access. Monitor for unexpected system calls during model loading.
Scan model files. Tools like Hugging Face's built-in malware scanner, JFrog's ML model scanner, and open-source tools like ModelScan can detect known malicious patterns in model files.
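Alongside dedicated scanners, a rough do-it-yourself heuristic is to walk a raw pickle's opcode stream and flag imports that have no business in a checkpoint. The sketch below is a naive illustration, not a substitute for real tooling: PyTorch .pt files are zip archives whose embedded pickle would need extracting first, and newer pickle protocols use STACK_GLOBAL rather than GLOBAL.

```python
import pickletools

# Module imports that rarely belong in a legitimate checkpoint.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "builtins"}

def flag_suspicious_imports(path: str) -> list[str]:
    """Return GLOBAL opcode arguments that reference suspicious modules."""
    with open(path, "rb") as f:
        data = f.read()
    findings = []
    for opcode, arg, _pos in pickletools.genops(data):
        # GLOBAL args look like "os system"; protocol 4+ pickles use
        # STACK_GLOBAL instead, which a real scanner must also handle.
        if opcode.name == "GLOBAL" and arg:
            module = str(arg).split()[0].split(".")[0]
            if module in SUSPICIOUS_MODULES:
                findings.append(str(arg))
    return findings

# Example: print(flag_suspicious_imports("downloaded_model.pkl"))
```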
Verify model provenance. Prefer models from known organizations with established reputations. Check model cards, training documentation, and community reviews.
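As a starting point, repository metadata can be pulled programmatically with the huggingface_hub client before a model is approved for use. The repo id below is a placeholder, and these fields are signals, not proof of provenance.

```python
from huggingface_hub import model_info

# Placeholder repo id -- substitute the model you are evaluating.
info = model_info("some-org/some-model")

print("author:", info.author)
print("latest commit:", info.sha)
print("downloads:", info.downloads)
print("files:", [s.rfilename for s in info.siblings])
```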
Inventory your models. Maintain a registry of all pre-trained models used in your organization, including their sources, versions, and deployment locations.
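A minimal sketch of what that can look like in practice, assuming models live on disk alongside the application: walk the tree and record each artifact's path, size, and SHA-256 for whatever inventory system you already use. The extension list is approximate.

```python
import hashlib
import json
from pathlib import Path

MODEL_EXTENSIONS = {".pt", ".pth", ".pkl", ".bin", ".onnx", ".safetensors"}

def sha256_of(path: Path) -> str:
    # Hash in chunks so multi-gigabyte model files don't exhaust memory.
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def inventory_models(root: str) -> list[dict]:
    """Collect path, size, and SHA-256 digest for every model file under root."""
    records = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in MODEL_EXTENSIONS:
            records.append({
                "path": str(path),
                "size_bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return records

print(json.dumps(inventory_models("."), indent=2))
```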
Pin model versions. Don't use "latest" for model downloads. Pin to specific revisions and verify checksums.
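With the Hugging Face client, that means passing an explicit commit revision rather than relying on the default branch, and comparing the download's digest against a value recorded when the model was vetted. The repo id, filename, revision, and expected digest below are placeholders.

```python
import hashlib
from huggingface_hub import hf_hub_download

# Placeholders: pin the exact commit you reviewed, not a moving branch.
REPO_ID = "some-org/some-model"
FILENAME = "model.safetensors"
REVISION = "0123456789abcdef0123456789abcdef01234567"  # full commit SHA
EXPECTED_SHA256 = "<digest recorded when the model was vetted>"

path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, revision=REVISION)

h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

if h.hexdigest() != EXPECTED_SHA256:
    raise RuntimeError(f"checksum mismatch for {FILENAME} -- refusing to load")
```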
How Safeguard.sh Helps
Safeguard.sh extends supply chain security to AI/ML pipelines by inventorying pre-trained models alongside traditional software dependencies. Our platform scans model files for malicious payloads, tracks model provenance, monitors for known vulnerabilities in ML framework dependencies, and provides a unified view of your software and AI supply chain risk. As AI models become as critical as code dependencies, Safeguard.sh ensures they receive the same security scrutiny.