AI Security

Vulnerability Scanning for AI Models: A New Frontier

AI models ship with dependencies, use vulnerable libraries, and introduce novel attack surfaces. Traditional scanning is not enough.

Shadab Khan
Security Engineer
6 min read

When we talk about vulnerability scanning, we typically mean analyzing software for known CVEs in its dependency tree. That model works well for applications built from packages with version numbers and published advisories.

AI models break this model. A deployed machine learning model is a software artifact, but it does not look like a traditional application. It has dependencies — frameworks, libraries, custom operators — but also weights, configurations, and training data that introduce entirely different classes of vulnerability.

The security industry is starting to grapple with this reality. Here is where things stand.

The AI Model Attack Surface

An AI model in production has several distinct attack surfaces:

Framework and Library Vulnerabilities

AI models depend on frameworks like PyTorch, TensorFlow, ONNX Runtime, and their associated libraries (NumPy, SciPy, Hugging Face Transformers). These frameworks have CVEs like any other software. A vulnerability in PyTorch's deserialization code (like CVE-2025-32434) can allow remote code execution when loading a malicious model file.

This layer is the most similar to traditional vulnerability scanning. The dependencies are declared in requirements.txt or pyproject.toml, they have version numbers, and they have published advisories. Standard SBOM generation and scanning tools handle this well.

Model Serialization Risks

AI models are typically serialized (saved to disk) using formats like pickle, PyTorch's .pt, TensorFlow's SavedModel, or ONNX. Several of these formats — pickle in particular — are inherently unsafe. Loading a pickle file executes arbitrary Python code. A malicious model file is functionally equivalent to a malicious executable.

The industry has been warning about pickle deserialization for years, and safer alternatives exist (safetensors, ONNX with strict loading). But legacy models serialized with pickle are everywhere, and many inference pipelines load them without sandboxing.

This is not a traditional CVE — it is a design-level weakness in the model distribution chain. Scanning for it requires understanding the model format, not just the dependency list.

Supply Chain for Pre-Trained Models

The explosion of pre-trained models available through Hugging Face, Model Zoo, and similar repositories mirrors the open-source package ecosystem — including its supply chain risks.

A pre-trained model from an untrusted source can contain:

  • Backdoored weights that produce targeted misclassifications
  • Embedded malicious code (in pickle-serialized models)
  • Data poisoning from compromised training data
  • Trojan triggers that activate on specific inputs

Model provenance — knowing where a model came from, who trained it, and what data was used — is in its infancy. There is no equivalent of Sigstore for model signing, no standardized model bill of materials, and limited tooling for verifying model integrity.

Inference Pipeline Vulnerabilities

The code that wraps a model for inference — API servers, preprocessing pipelines, postprocessing logic — is traditional software with traditional vulnerabilities. But it also introduces AI-specific risks:

  • Prompt injection in LLM applications, where user input manipulates the model's behavior
  • Adversarial inputs that cause misclassification or denial of service
  • Side-channel leakage where inference timing or output probabilities reveal information about the training data

These vulnerabilities do not have CVE numbers. They are not in any vulnerability database. Scanning for them requires AI-specific testing methodologies.

Current Approaches

Dependency Scanning (Mature)

Standard dependency scanning works for the framework and library layer. Generate an SBOM for the model's Python environment, scan it against NVD and OSV, and you have visibility into known CVEs.

This is table stakes and should be part of every AI deployment pipeline. The tooling is the same as for any Python application.

Model File Analysis (Emerging)

Tools are emerging to analyze model files for embedded risks:

  • Picklescan detects dangerous operations in pickle-serialized models
  • ModelScan (from Protect AI) scans multiple model formats for code injection, unsafe deserialization, and embedded threats
  • Safetensors is not a scanner but an alternative serialization format that is safe by design — no code execution on load

These tools address the serialization risk layer, which is critical for organizations that consume pre-trained models.

Model Provenance (Early)

Model provenance tracking is in the early stages:

  • Model cards provide metadata about training data, intended use, and evaluation results, but they are not cryptographically signed or machine-verifiable
  • Hugging Face model signing is in beta, using Sigstore to sign model files
  • SLSA for ML is a community proposal to extend SLSA provenance attestations to model training pipelines

This is the equivalent of where code signing was in 2019 — the concepts are defined, early tooling exists, but mainstream adoption is years away.

AI-Specific Security Testing (Nascent)

Testing for adversarial robustness, prompt injection, and data leakage requires specialized tools:

  • Garak tests LLMs for prompt injection, jailbreaks, and other attacks
  • Counterfit (from Microsoft) tests ML models for adversarial robustness
  • AI red teaming frameworks are emerging but not yet standardized

This category is the least mature. There are no widely accepted benchmarks, no standard vulnerability taxonomy, and limited automation.

Building an AI Model Security Program

For organizations deploying AI models in production, here is a practical framework:

  1. Treat models as software artifacts. Generate SBOMs for model environments, scan dependencies, and enforce policy gates just like any other application.

  2. Validate model files before loading. Never load an untrusted model without scanning it for embedded threats. Prefer safe serialization formats (safetensors, ONNX with strict loading).

  3. Track model provenance. Know where your models come from. Prefer models from trusted sources with signed artifacts. Document the training data, training process, and evaluation results.

  4. Test for AI-specific vulnerabilities. Incorporate adversarial testing, prompt injection testing, and data leakage testing into your model evaluation pipeline.

  5. Monitor inference pipelines. Apply runtime security monitoring to model serving infrastructure. Watch for anomalous inputs, unexpected outputs, and performance degradation that could indicate an attack.

How Safeguard.sh Helps

Safeguard extends supply chain security to AI model deployments. Our platform generates SBOMs for model environments, scans model files for embedded threats using integrated ModelScan capabilities, and enforces policy gates that can block deployment of models with unsafe serialization formats or unverified provenance. For organizations that treat AI models as first-class software artifacts — which every organization should — Safeguard provides the same supply chain security controls for models that it provides for traditional applications.

Never miss an update

Weekly insights on software supply chain security, delivered to your inbox.