ChatGPT and AI Security Implications for Software Supply Chains

The explosion of AI tools like ChatGPT is reshaping how developers write code — and introducing new supply chain risks that most teams aren't thinking about.

Nayan Dey
Engineering Lead
6 min read

By January 2023, ChatGPT had crossed 100 million users, and a huge chunk of them were software developers. The tool was being used to generate code, debug issues, write configurations, and even draft security policies. This rapid adoption introduced a new category of supply chain risk that the security industry was slow to recognize.

The New Attack Surface

When a developer asks ChatGPT to generate a Python script for handling API authentication, the LLM draws on patterns from its training data. That training data includes millions of code samples from public repositories — including code with known vulnerabilities, deprecated libraries, and insecure patterns.

This isn't a hypothetical concern. Research from Stanford published in late 2022 showed that developers using AI code assistants produced code with more security vulnerabilities than those who didn't, and — critically — they were more confident that their code was secure.

Package Hallucination

One of the most supply-chain-relevant risks is what researchers call "package hallucination." When asked to write code that requires external dependencies, ChatGPT sometimes recommends packages that don't exist. An attacker who registers those nonexistent package names on npm or PyPI creates a direct supply chain attack vector.

For example, if ChatGPT consistently recommends a package called python-jwt-utils (which doesn't exist), an attacker can create a malicious package with that name. Every developer who copies ChatGPT's suggestion and runs pip install python-jwt-utils will install the attacker's code.
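
That failure mode is cheap to check for. Here's a minimal sketch, assuming Python 3 with the requests library, that asks PyPI's public JSON API whether a recommended name is even registered before anything gets installed:

    # Sketch: check whether an AI-recommended package is actually registered
    # on PyPI before running `pip install`. Assumes the `requests` library.
    import requests

    def exists_on_pypi(name: str) -> bool:
        # PyPI's JSON API returns 404 for names that were never registered.
        resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
        return resp.status_code == 200

    for name in ("requests", "python-jwt-utils"):
        verdict = "registered" if exists_on_pypi(name) else "NOT registered"
        print(f"{name}: {verdict}")

The catch is that once an attacker squats the hallucinated name, this check passes, which is exactly why existence alone isn't a trust signal.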

Security researcher Bar Lanyado demonstrated this attack vector in 2023, finding that ChatGPT repeatedly recommended specific nonexistent packages. The consistency of these hallucinations makes them particularly dangerous: it's not random noise, it's a predictable pattern an attacker can exploit.

Outdated and Vulnerable Code Patterns

ChatGPT's training data has a knowledge cutoff, so it can recommend libraries with known CVEs or deprecated APIs; any vulnerability disclosed after that cutoff is simply invisible to it. A developer asking for help with XML parsing might get code using a library vulnerable to XXE injection. Someone asking for cryptographic code might receive implementations using weak algorithms or insecure modes.

The model doesn't check CVE databases. It generates what looks syntactically correct based on training data patterns.
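
To make the XML example concrete, here's an illustrative contrast, assuming Python with lxml installed (the payload is the classic demonstration, not code from a real incident):

    # Sketch: the XXE-prone pattern an assistant might emit, plus a hardened
    # version. Assumes Python with lxml installed; the payload is illustrative.
    from lxml import etree

    UNTRUSTED = b"""<?xml version="1.0"?>
    <!DOCTYPE data [<!ENTITY leak SYSTEM "file:///etc/passwd">]>
    <data>&leak;</data>"""

    # Risky: lxml's default parser resolves entities, so &leak; can expand
    # into the contents of a local file and ship it along with the document.
    vulnerable = etree.fromstring(UNTRUSTED)

    # Safer: disable entity resolution and DTD loading explicitly.
    hardened = etree.XMLParser(resolve_entities=False, no_network=True, load_dtd=False)
    safe = etree.fromstring(UNTRUSTED, parser=hardened)

Wrappers like defusedxml exist precisely because the safe configuration isn't the default.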

The Dependency Explosion Problem

AI code generation tends to pull in more dependencies than handwritten code. When a developer writes code from scratch, they make conscious decisions about each dependency. When an AI generates code, it includes whatever packages fit the pattern, often adding unnecessary dependencies that increase the attack surface.

This matters because every additional dependency is:

  • Another package that needs vulnerability monitoring
  • Another maintainer you're trusting
  • Another potential point of compromise in your supply chain

Real-World Implications in Early 2023

By January 2023, several concerning trends were already emerging:

Corporate code leakage: Developers were pasting proprietary code into ChatGPT for debugging assistance. Samsung would later famously ban ChatGPT usage after engineers inadvertently leaked semiconductor source code. This creates a supply chain risk because confidential code patterns, API endpoints, and architectural details can become part of the training data for future model updates.

Automated vulnerability introduction: Security teams at multiple companies reported finding identical insecure code patterns across different projects — traced back to ChatGPT-generated code that multiple developers had independently adopted.

Configuration generation risks: Developers were using ChatGPT to generate Dockerfiles, Kubernetes manifests, CI/CD configurations, and cloud IAM policies. These generated configurations frequently contained overly permissive settings, hardcoded credentials in examples, and insecure defaults.
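
A lightweight countermeasure is to lint generated configuration before it ships. Here's a minimal sketch, assuming PyYAML, that walks a Kubernetes manifest and flags a few settings that commonly turn up overly permissive (the checklist is a small illustrative sample, not a policy engine):

    # Sketch: flag overly permissive settings in a generated Kubernetes
    # manifest. Assumes PyYAML; the checklist is illustrative, not exhaustive.
    import sys
    import yaml

    RISKY = {"privileged": True, "hostNetwork": True, "allowPrivilegeEscalation": True}

    def walk(node, path=""):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in RISKY and value == RISKY[key]:
                    print(f"risky setting at {path}/{key}: {value}")
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for i, item in enumerate(node):
                walk(item, f"{path}[{i}]")

    with open(sys.argv[1]) as f:
        for doc in yaml.safe_load_all(f):
            walk(doc)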

What Makes This Different from Traditional Supply Chain Risks

Traditional supply chain attacks involve compromising a specific package, build system, or distribution channel. AI-generated code risks are more diffuse. There's no single point of compromise — instead, there's a systemic weakening of code quality and security practices across the entire industry.

The scale is also unprecedented. A vulnerable Stack Overflow answer spreads only as far as the developers who happen to find it; an AI model that consistently generates the same vulnerable pattern serves it, on demand, to thousands of developers at once, multiplying the impact across the entire ecosystem.

The Trust Problem

Developers instinctively trust AI-generated code more than they should. When ChatGPT produces a clean, well-commented code block, there's a psychological tendency to treat it as authoritative. Code review processes often give AI-generated code less scrutiny because it "looks professional."

This creates a situation where vulnerable code bypasses the human review safeguards that traditionally catch security issues.

Practical Defenses

1. Treat AI-Generated Code Like Untrusted Input

Every code block generated by an AI should go through the same security review as code from an unknown external contributor. This means static analysis, dependency checking, and security-focused human review.
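
As one concrete shape for that pipeline, the sketch below runs Bandit, an open-source Python static analyzer, over a snippet before it enters the codebase; the wrapper itself is our illustration, not a prescribed toolchain:

    # Sketch: run Bandit over an AI-generated snippet read from stdin.
    # Assumes `bandit` is installed and on PATH.
    import subprocess
    import sys
    import tempfile

    snippet = sys.stdin.read()  # e.g. pasted straight from a chat window
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write(snippet)

    result = subprocess.run(["bandit", "-q", tmp.name], capture_output=True, text=True)
    print(result.stdout or "bandit: no findings")

Something like python vet_snippet.py < snippet.py (file name hypothetical) makes the check a one-liner; dependency checking and human review still apply on top.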

2. Verify Every Dependency

Before adding any package recommended by an AI tool, verify that (a scripted sketch of the key checks follows the list):

  • The package actually exists on the intended registry
  • The package is actively maintained
  • The package doesn't have known vulnerabilities
  • The package name isn't a typosquat of a popular legitimate package
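
The vulnerability check is straightforward to automate, and the existence check was sketched earlier. Here's a minimal version, assuming the requests library and OSV.dev's public vulnerability API (the helper name is ours):

    # Sketch: query OSV.dev for known vulnerabilities affecting a specific
    # package version. Assumes `requests`; the helper name is ours.
    import requests

    def known_vulns(name: str, version: str) -> list[str]:
        resp = requests.post(
            "https://api.osv.dev/v1/query",
            json={"version": version, "package": {"name": name, "ecosystem": "PyPI"}},
            timeout=10,
        )
        resp.raise_for_status()
        return [v["id"] for v in resp.json().get("vulns", [])]

    # An old Jinja2 release with published advisories, as an illustration:
    print(known_vulns("jinja2", "2.4.1"))

Maintenance status and typosquat similarity are harder to score mechanically, which is where registry metadata and human judgment still matter.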

3. Establish AI Usage Policies

Organizations need clear policies on:

  • What code can be shared with AI tools
  • Whether AI-generated code requires additional review
  • How AI-generated dependencies are vetted
  • Documentation requirements for AI-assisted code

4. Pin Dependencies Aggressively

AI-generated code often uses loose version specifiers. Every dependency should be pinned to exact versions, and updates should go through a controlled process.
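
In pip terms, the difference looks like this (version numbers are illustrative):

    # Loose specifiers, typical of AI-generated snippets -- different
    # installs can resolve to different versions:
    #   requests>=2.0
    #   pyjwt
    #
    # Pinned to exact versions, e.g. generated with pip-tools' pip-compile:
    requests==2.28.2
    pyjwt==2.6.0

Hash pinning (pip-compile --generate-hashes) goes a step further, verifying package contents rather than just version numbers.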

5. Continuous Vulnerability Scanning

With AI potentially introducing vulnerabilities faster than manual review can catch them, automated continuous scanning becomes essential rather than optional.

How Safeguard.sh Helps

Safeguard.sh is built for exactly this kind of emerging supply chain risk:

  • Dependency Verification: Safeguard.sh validates every dependency in your project against known vulnerability databases, catching insecure packages that AI tools might recommend — including packages with known CVEs that AI models don't know about.
  • SBOM Generation and Analysis: By maintaining comprehensive SBOMs, Safeguard.sh gives you visibility into every component in your software, whether it was added manually or by AI suggestion.
  • Package Existence Validation: Safeguard.sh can identify dependencies that reference nonexistent or suspicious packages, catching the "package hallucination" attack vector before malicious packages are installed.
  • Continuous Monitoring: As new CVEs are discovered in packages that AI tools previously recommended, Safeguard.sh alerts you immediately, ensuring your supply chain stays clean over time.

AI tools aren't going away — they're going to become more deeply integrated into development workflows. The organizations that treat AI-generated code as a supply chain input to be verified, rather than a trusted source, will be the ones that avoid the next generation of supply chain attacks.
