Open Source AI Model Security: The Emerging Threat Landscape

As open source AI models proliferate, their security implications extend far beyond traditional software vulnerabilities. Model poisoning, supply chain tampering, and unsafe deserialization create new attack surfaces.

James
Threat Intelligence Lead
6 min read

The explosion of open source AI models in 2024, led by the Llama, Mistral, and Stable Diffusion families, has created a software supply chain problem that the industry is barely beginning to address. AI models are not just data files. They are executable artifacts that can contain arbitrary code, and the ecosystems distributing them lack many of the security controls that traditional package registries have developed over decades.

Models as Attack Surface

Traditional software security focuses on code. AI model security requires thinking about code, data, and behavior simultaneously.

Serialization exploits. The most immediate risk is that AI model files can contain executable code. Python's pickle format, widely used for serializing machine learning models, can execute arbitrary code during deserialization. Loading a pickled model is functionally equivalent to running an untrusted script.
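
The risk is easy to demonstrate. The sketch below (with a deliberately harmless payload) uses pickle's __reduce__ hook, which lets an object specify an arbitrary callable to run at deserialization time:

```python
import os
import pickle

# __reduce__ tells pickle to call an attacker-chosen callable (here,
# os.system) when the object is deserialized.
class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ("echo 'arbitrary code ran during model load'",))

blob = pickle.dumps(MaliciousPayload())

# Anyone who "loads the model" runs the attacker's command.
pickle.loads(blob)
```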

This is not theoretical. Researchers have demonstrated:

  • Reverse shells embedded in pickle-serialized models
  • Credential-harvesting code triggered on model load
  • Cryptominers activated when importing a pre-trained model

The PyTorch ecosystem has been particularly affected because .pt and .pth files use pickle serialization by default. SafeTensors, a format that stores raw tensor data and cannot execute code during loading, has emerged as a safer alternative, but adoption is not universal.
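
For comparison, here is a minimal sketch of saving and loading weights with the safetensors library (assuming torch and safetensors are installed):

```python
# Requires: pip install torch safetensors
import torch
from safetensors.torch import save_file, load_file

weights = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}

# SafeTensors files contain a JSON header plus raw tensor bytes -- there
# is no code path to execute during loading.
save_file(weights, "model.safetensors")
restored = load_file("model.safetensors")

# If a .pt file is unavoidable, recent PyTorch versions can restrict
# deserialization to plain tensor data:
# state = torch.load("model.pt", weights_only=True)
```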

Model poisoning. An attacker who can modify a model's weights can alter its behavior without changing any visible code. Poisoned models may perform normally on standard benchmarks while containing backdoors that activate on specific trigger inputs.

For example, a poisoned image classification model might correctly classify 99.9% of images but misclassify specific inputs chosen by the attacker. A poisoned code generation model might produce subtly vulnerable code when given certain prompts.

Training data poisoning. Open source models trained on web-scraped data are vulnerable to data poisoning attacks where an attacker injects malicious examples into training datasets. This can influence model behavior at scale without direct access to the model weights.

The Hugging Face Ecosystem

Hugging Face Hub has become the primary distribution platform for open source AI models, hosting over 500,000 models. It occupies a role analogous to npm for JavaScript or PyPI for Python, but with significantly less mature security infrastructure.

Current security measures:

  • Malware scanning of uploaded models
  • Secret scanning to detect leaked credentials in model repositories
  • SafeTensors format support and promotion
  • Model cards for documentation (though completing them is optional)

Remaining gaps:

  • No mandatory code signing or provenance attestation
  • Limited behavioral analysis of model outputs
  • No standardized vulnerability reporting for model-specific issues
  • Only informal dependency tracking for model-to-model relationships (fine-tuned models depending on base models)

The analogy to early npm is instructive. npm faced similar growing pains as its ecosystem scaled, and the security measures it developed over years are now being applied to AI model distribution platforms on compressed timescales.

Supply Chain Attack Vectors

Several attack vectors are specific to the AI model supply chain:

Model Swapping

An attacker compromises a model repository and replaces a legitimate model with a modified version. Users downloading the model receive the tampered version. Without integrity verification (signatures, checksums verified against a trusted source), this swap is undetectable.

Fine-Tuning Pipeline Compromise

Organizations fine-tune open source base models on their own data. If the fine-tuning pipeline is compromised, the attacker can inject behaviors into the resulting model. This is analogous to build system compromise in traditional software supply chains.

Adapter and Plugin Attacks

The modular nature of modern AI frameworks (LoRA adapters, plugins, tool-use configurations) creates additional attack surfaces. A malicious LoRA adapter loaded on top of a legitimate base model can alter behavior without modifying the base model itself.
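
To make the mechanics concrete, here is roughly what adapter loading looks like with the peft library; the repository names are hypothetical placeholders:

```python
# Requires: pip install transformers peft
# Repository names below are hypothetical placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("example-org/base-model")

# The adapter is downloaded and applied as an independent artifact; a
# malicious adapter can change the combined model's behavior even though
# the base weights are untouched.
model = PeftModel.from_pretrained(base, "third-party/lora-adapter")
```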

Dependency Chain Attacks

AI models often depend on specific framework versions, tokenizer configurations, and preprocessing code. Compromising any of these dependencies can affect model behavior even if the model weights are unchanged.

SBOM for AI: An Emerging Concept

The software supply chain security community is beginning to apply SBOM concepts to AI models. An "AI Bill of Materials" or "Model Card+" would document:

  • Model provenance: Who trained the model, on what infrastructure, from what base model
  • Training data sources: Datasets used, their licenses, and known biases
  • Dependencies: Framework versions, tokenizer versions, preprocessing code
  • Evaluation results: Performance on standard benchmarks and safety evaluations
  • Known limitations: Documented failure modes and unsafe behaviors

CycloneDX has extended its SBOM format to support machine learning model components. SPDX is developing similar capabilities. These are early-stage efforts, but they represent the beginning of structured transparency for AI model supply chains.
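
As a rough illustration of the concept rather than the authoritative schema, an AI-BOM entry for a fine-tuned model might look something like the following, sketched as a Python dict serialized to JSON (consult the CycloneDX specification for the exact field names):

```python
import json

# Illustrative AI-BOM fragment in the spirit of CycloneDX's
# machine-learning-model component type. Fields beyond "type", "name",
# and "version" are simplified for readability.
ai_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {
            "type": "machine-learning-model",
            "name": "example-org/finetuned-llm",
            "version": "2024.1",
            "description": "Fine-tuned from example-org/base-model",
            "properties": [
                {"name": "base-model", "value": "example-org/base-model"},
                {"name": "training-data", "value": "internal-corpus-v3"},
                {"name": "framework", "value": "torch==2.2.0"},
            ],
        }
    ],
}

print(json.dumps(ai_bom, indent=2))
```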

Practical Defenses

Organizations using open source AI models should implement:

Use the SafeTensors format. Reject pickle-serialized models. SafeTensors stores the same weight data without the risk of arbitrary code execution, and most popular models are already available in it.

Verify model integrity. When downloading models, verify checksums against the source repository. Use tools that detect tampering in model files.
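
A minimal sketch of the idea in Python; the expected digest is a placeholder you would pin from a trusted source at vetting time:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream the file through SHA-256 to avoid loading it into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# EXPECTED is pinned in your own config, sourced from trusted metadata.
EXPECTED = "..."  # hex digest recorded at vetting time
if sha256_of("model.safetensors") != EXPECTED:
    raise RuntimeError("model file does not match pinned checksum")
```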

Isolate model loading. Load untrusted models in sandboxed environments. Never deserialize models in environments with access to production credentials or sensitive data.

Scan model files. Use security tools that specifically analyze ML model files for embedded malicious code. Traditional antivirus is insufficient for detecting pickle exploits.
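
Dedicated scanners exist for this purpose; as a minimal illustration of what they look for, Python's standard pickletools module can list the opcodes that import and invoke code during unpickling:

```python
import pickletools

# Opcodes that import callables or invoke them during unpickling.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ"}

def flag_pickle(path: str) -> list[str]:
    """Return any suspicious opcodes found in a raw pickle stream."""
    # Note: modern PyTorch .pt files are zip archives; the pickle to
    # inspect lives inside them as data.pkl.
    with open(path, "rb") as f:
        data = f.read()
    return [op.name for op, _, _ in pickletools.genops(data)
            if op.name in SUSPICIOUS]

# Any hit warrants review before the file is ever deserialized.
print(flag_pickle("suspect.pkl"))
```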

Pin model versions. Reference specific model commits or checksums rather than mutable tags. This prevents model swapping attacks from affecting your deployments.
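
With the huggingface_hub client, for example, a download can be pinned to an immutable commit; the repository name and commit hash below are placeholders:

```python
# Requires: pip install huggingface_hub
from huggingface_hub import hf_hub_download

# Pin to an immutable commit hash (placeholder shown), not a branch or
# tag like "main" that can silently move to tampered content.
path = hf_hub_download(
    repo_id="example-org/base-model",
    filename="model.safetensors",
    revision="0123456789abcdef0123456789abcdef01234567",
)
```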

Evaluate model behavior. Run standardized safety evaluations on models before deploying them. Compare outputs against expected baselines to detect behavioral anomalies that might indicate poisoning.
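
A skeleton of this check, assuming a predict function wired to your own inference stack and a baseline file captured from a vetted copy of the model:

```python
import json

def predict(model, prompt: str) -> str:
    """Placeholder -- wire this to your own inference stack."""
    raise NotImplementedError

def matches_baseline(model, baseline_path: str, threshold: float = 0.99) -> bool:
    # baseline maps probe prompts to expected outputs, recorded from a
    # vetted copy of the model.
    with open(baseline_path) as f:
        baseline = json.load(f)
    hits = sum(predict(model, p) == expected for p, expected in baseline.items())
    return hits / len(baseline) >= threshold

# A drop below threshold on fixed probes is a signal to investigate, not
# proof of poisoning -- but backdoors that fire on trigger inputs can
# surface exactly this way.
```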

Track model provenance. Document which models you use, where you obtained them, and what modifications you have made. This provenance chain enables rapid response when a model is found to be compromised.

The Regulatory Dimension

The EU AI Act introduces requirements for AI system providers that intersect with model security:

  • High-risk AI systems must maintain technical documentation including training data descriptions
  • Providers must implement risk management systems that address security threats
  • Transparency requirements may necessitate model composition disclosure

These requirements will drive demand for AI-specific supply chain transparency tools and practices.

How Safeguard.sh Helps

Safeguard.sh is extending its supply chain visibility capabilities to address the AI model security landscape. By tracking AI model dependencies alongside traditional software dependencies, Safeguard.sh provides a unified view of your complete supply chain, including the models your applications depend on.

SBOM generation that captures AI model components, framework versions, and associated dependencies provides the inventory foundation for AI supply chain security. Continuous monitoring ensures that when vulnerabilities are discovered in ML frameworks or model-related dependencies, affected applications are identified immediately.

As AI model security standards mature, Safeguard.sh will incorporate model-specific risk indicators into its analysis, providing the same rigorous supply chain oversight for AI artifacts that organizations already rely on for traditional software components.
