Software Supply Chain Security

Compression Library Vulnerabilities: From zlib to the xz Backdoor

Compression libraries are everywhere and trusted implicitly. The xz backdoor proved that trust can be weaponized. Here is the full picture.

Compression is one of the most fundamental operations in computing. Network protocols, file formats, package managers, and archive tools all depend on compression libraries. zlib alone is linked, directly or transitively, by a large share of the software on a typical Linux system. When these libraries have vulnerabilities, the blast radius is enormous.

The xz backdoor (CVE-2024-3094) was a watershed moment. It demonstrated that compression libraries are not just vulnerable to accidental bugs -- they are attractive targets for deliberate supply chain attacks. The combination of ubiquitous deployment, implicit trust, and deep system integration makes compression libraries uniquely dangerous.

The zlib Story

zlib is the most widely deployed compression library in existence. Written by Jean-loup Gailly and Mark Adler, it implements the DEFLATE compression algorithm used in gzip, PNG, HTTP content encoding, and countless other contexts.

Despite its maturity, zlib has had significant vulnerabilities over the years:

CVE-2018-25032 was a memory corruption vulnerability in the deflate function that could be triggered by feeding specially crafted input to the compressor. The bug sat in the code for roughly 17 years before it was recognized as a security issue. Because zlib is used for HTTP content encoding, it affected web servers, proxies, and any other application that compresses attacker-influenced data before transmission.

CVE-2022-37434 was a heap-based buffer overflow (or over-read) in inflate, triggered by a crafted gzip header with an oversized extra field. Only applications that call inflateGetHeader are affected, but this class of bug is particularly dangerous because decompressing untrusted data is such a common operation: every time your browser receives a gzip-compressed HTTP response, it hands attacker-reachable bytes to a decompressor.
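As a first triage step, a reported zlib version can be compared against the upstream releases that the public CVE records name as fixed (1.2.12 for CVE-2018-25032, 1.2.13 for CVE-2022-37434). The sketch below assumes you already have a version string; the function names are hypothetical. Distributions often backport fixes without changing the upstream version number, so a match here means "investigate", not "vulnerable".

```python
import re

# First upstream zlib release containing each fix, per the public CVE records.
FIXED_IN = {
    "CVE-2018-25032": (1, 2, 12),
    "CVE-2022-37434": (1, 2, 13),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Return the leading dotted-numeric part of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def possibly_affected(zlib_version: str) -> list[str]:
    """List CVEs whose fixed release is newer than the given version."""
    version = parse_version(zlib_version)
    return sorted(cve for cve, fixed in FIXED_IN.items() if version < fixed)
```

For example, possibly_affected("1.2.11") reports both CVEs, while a 1.3.x version reports none.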

The zlib codebase is around 30,000 lines of C. It has been audited, fuzzed, and reviewed more than almost any other open-source library. And it still has bugs. That tells you something about the inherent difficulty of writing correct C code.

zlib-ng: The Modern Fork

zlib-ng is a modernized fork of zlib that incorporates performance optimizations and security improvements. It uses safer coding patterns, enables compiler hardening flags by default, and takes advantage of modern CPU instructions for faster compression.

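For new projects, zlib-ng is generally preferable to original zlib. However, switching existing projects requires testing: the API is compatible, but the compressed bytes zlib-ng produces may differ from zlib's for the same input. Anything that compares, hashes, or caches compressed output can break even though the decompressed data is identical.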
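One way to make a test suite robust to that difference is to assert on round-trip behavior rather than on exact compressed bytes. The sketch below uses Python's standard zlib module, with two compression levels standing in for two different library implementations; that substitution is an assumption made for illustration, not a claim about how zlib-ng behaves at any particular level.

```python
import zlib


def roundtrip_ok(payload: bytes, level: int = 6) -> bool:
    """True if compressing then decompressing returns the original bytes."""
    return zlib.decompress(zlib.compress(payload, level)) == payload


sample = b"supply chain " * 1000

# Two valid encodings of the same input. Different levels stand in for
# two different library implementations in this sketch.
fast = zlib.compress(sample, 1)
best = zlib.compress(sample, 9)

# The compressed bytes differ, so tests must not pin them...
bytes_differ = fast != best
# ...but both streams decode to the same data, which is what matters.
data_matches = zlib.decompress(fast) == zlib.decompress(best) == sample
```

Tests written against roundtrip_ok keep passing when the underlying library changes; tests that pin a checksum of the compressed stream do not.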

The xz Backdoor

In March 2024, Andres Freund discovered a backdoor in xz Utils versions 5.6.0 and 5.6.1. The backdoor was inserted by a contributor using the name Jia Tan, who had spent more than two years building trust with the project maintainer before introducing the malicious code.

The backdoor targeted the SSH authentication process. On distributions that patch sshd to link against libsystemd, which in turn depends on liblzma, the backdoor hooked the RSA signature verification routine (OpenSSL's RSA_public_decrypt), letting an attacker who held a specific private key get past authentication without valid credentials.
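A quick check for the two affected releases can be scripted around the output of `xz --version`, which prints a line of the form `liblzma 5.x.y`. This is a minimal sketch with hypothetical function names; it only looks at the upstream version string and says nothing about how a distribution may have repackaged or reverted the library.

```python
import re
import subprocess
from typing import Optional

# Releases named above as containing the backdoor.
BACKDOORED = {"5.6.0", "5.6.1"}


def parse_xz_version(output: str) -> Optional[str]:
    """Pull the liblzma version out of `xz --version` output."""
    match = re.search(r"^liblzma\s+(\d+\.\d+\.\d+)", output, re.MULTILINE)
    return match.group(1) if match else None


def is_backdoored_release(version: str) -> bool:
    """True if the version is one of the known-bad releases."""
    return version in BACKDOORED


def installed_liblzma_version() -> Optional[str]:
    """Ask the local xz binary for its liblzma version, if xz is installed."""
    try:
        result = subprocess.run(
            ["xz", "--version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_xz_version(result.stdout)
```

Calling installed_liblzma_version() and passing a non-None result to is_backdoored_release gives a yes/no answer per host.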

What Made the xz Attack Unique

Social engineering over years. Jia Tan began contributing to xz in 2021, making legitimate improvements to build trust. Other sockpuppet accounts pressured the original maintainer (who was dealing with mental health issues) to add Jia Tan as a co-maintainer. This level of patience and social manipulation is often associated with well-resourced, possibly state-backed operations, although no attribution has been publicly confirmed.

Hidden in plain sight. The malicious code was not in the xz C source itself but in test fixture files (binary blobs that are not human-readable) and in build system scripts that decoded those blobs and injected the payload into the build. Code review of the C source would not have revealed the backdoor.

Targeting through indirect dependency. The attack did not target xz directly but rather exploited the fact that sshd linked against liblzma through systemd. Most people would never think of xz as part of the SSH authentication chain.

Caught by accident. Freund noticed the backdoor because SSH logins were taking about 500 ms longer than expected on his Debian sid (unstable) machine. This was a performance anomaly, not a security audit finding. If the backdoor had been slightly better optimized, it might have gone undetected for months or years.
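The indirect linkage described above is easy to miss but straightforward to check, because `ldd` lists transitive shared-library dependencies as well as direct ones. The sketch below parses `ldd` output to see whether a binary ends up loading liblzma; the helper names are hypothetical, and `ldd` should only be run on binaries you already trust, since it may execute code from the target.

```python
import subprocess


def shared_libs(ldd_output: str) -> set[str]:
    """Return the library names listed in `ldd` output."""
    names = set()
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields:
            # The first field is the soname, or a full path for the loader.
            names.add(fields[0].rsplit("/", 1)[-1])
    return names


def links_liblzma(ldd_output: str) -> bool:
    """True if liblzma appears anywhere in the resolved dependency list."""
    return any(name.startswith("liblzma.so") for name in shared_libs(ldd_output))


def ldd(path: str) -> str:
    """Run ldd on a binary. Only use this on binaries you already trust."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On an affected configuration, links_liblzma(ldd("/usr/sbin/sshd")) would be expected to return True; the path is an example and varies by system.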

Other Compression Library Vulnerabilities

Brotli

Google's Brotli compression library has had vulnerabilities, including CVE-2020-8927, an integer overflow in the decompression code that could lead to a buffer overflow. Brotli is used in HTTP content encoding and is supported by all major browsers.

LZ4

LZ4 is a fast compression algorithm used in databases, filesystems, and real-time applications. CVE-2021-3520 was an integer overflow in the LZ4 decompression code that could lead to an out-of-bounds write when processing crafted compressed data.

Snappy

Google's Snappy library (used in LevelDB, MongoDB, and other systems) has had fewer vulnerabilities than some alternatives, but its C++ implementation is not immune to memory safety bugs.

Implications for Your Supply Chain

Compression library vulnerabilities are different from typical application-level bugs:

You cannot avoid them. Nearly every application uses compression, either directly or through transitive dependencies. In practice, you cannot simply choose not to depend on compression libraries.

The blast radius is maximal. A vulnerability in zlib affects essentially everything. The patch deployment effort for a zlib CVE is enormous because it touches every system.

Detection is difficult. Your software composition analysis (SCA) tools may not surface compression library dependencies because they are often system libraries rather than explicitly declared dependencies. A Python application that uses gzip from the standard library depends on the system zlib, but this will not appear in requirements.txt.
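In the Python case, the interpreter itself can report which zlib it is using, even though no manifest mentions it. The standard zlib module exposes both the version it was compiled against and the version loaded at runtime; the wrapper function below is just an illustrative helper.

```python
import zlib


def zlib_versions() -> dict[str, str]:
    """Report the zlib version Python was built with and the one in use now."""
    return {
        "compiled_against": zlib.ZLIB_VERSION,
        "loaded_at_runtime": zlib.ZLIB_RUNTIME_VERSION,
    }


print(zlib_versions())
```

The two values can differ when the system library has been updated since the interpreter was built, which is why the runtime value is the one to record in an inventory.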

Defense Strategies

Know your compression dependencies. Generate SBOMs that include system libraries, not just package-level dependencies. You need to know which version of zlib, xz, brotli, and other compression libraries each of your applications uses.
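As a rough starting point before a full SBOM tool is in place, the standard library's ctypes.util.find_library can show which compression libraries the dynamic linker resolves on a host. This is a sketch only: the list of short names is an illustrative assumption, the result format is platform-dependent, and a None result means the library was not found by this lookup, not that it is absent from every container or statically linked binary.

```python
from ctypes.util import find_library
from typing import Optional

# Short linker names (as in -lz, -llzma). The selection is illustrative.
COMPRESSION_LIBS = ("z", "lzma", "brotlidec", "lz4", "snappy")


def locate_compression_libs(
    names: tuple[str, ...] = COMPRESSION_LIBS,
) -> dict[str, Optional[str]]:
    """Map each short library name to what the platform resolves it to."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, resolved in locate_compression_libs().items():
        print(f"{name}: {resolved or 'not found'}")
```

The resolved names (for example a versioned soname on Linux) can then be fed into whatever inventory or SBOM process you already run.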

Monitor for backdoors, not just bugs. The xz incident proved that compression libraries are targets for deliberate compromise. Monitor not just for CVEs but for suspicious changes in maintainership, build processes, and binary artifacts.
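One concrete form of artifact monitoring is to compare the files in a release artifact against a reference copy you trust, such as a checkout of the tagged source, and review anything added or changed. The sketch below builds SHA-256 manifests of two directory trees and diffs them; the function names are hypothetical, and some differences (generated build files, for instance) are expected, so the output is a review list rather than a verdict.

```python
import hashlib
from pathlib import Path


def manifest(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    base = Path(root)
    result = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(base).as_posix()] = digest
    return result


def diff_manifests(
    trusted: dict[str, str], candidate: dict[str, str]
) -> dict[str, list[str]]:
    """Report files added, removed, or changed relative to the trusted tree."""
    shared = trusted.keys() & candidate.keys()
    return {
        "added": sorted(candidate.keys() - trusted.keys()),
        "removed": sorted(trusted.keys() - candidate.keys()),
        "changed": sorted(n for n in shared if trusted[n] != candidate[n]),
    }
```

Binary files and build scripts that show up under "added" or "changed" deserve the closest look, since those are the places the xz payload was hidden.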

Update quickly. When compression library vulnerabilities are disclosed, patch immediately. The ubiquity of these libraries means exploits will be developed quickly.

Build from source when possible. Using system-provided compression libraries means trusting your OS vendor to patch promptly. Building from source gives you direct control over versions and allows you to apply patches without waiting for distribution updates.

How Safeguard Helps

Safeguard provides deep visibility into compression library dependencies across your entire infrastructure. Our platform tracks zlib, xz, brotli, lz4, and other compression libraries at the system level, not just at the package manifest level. When a vulnerability like CVE-2024-3094 is discovered, Safeguard tells you exactly which systems and applications are affected, enabling rapid response across your organization.
