The first time I watched AFL find a real crash, I was running it against a JSON parser that had been in production for five years. Within about forty minutes, the fuzzer produced an input that triggered a heap overflow in the number-parsing path. The input was twelve bytes long. Nobody had ever written a unit test with those twelve bytes. Nobody needed to. The fuzzer had explored the space until it found them.
That story has been repeated thousands of times since Michal Zalewski released AFL in 2013. Coverage-guided fuzzing transformed vulnerability research by turning bug discovery into a background process that runs while you sleep. For software supply chain work, it is one of the highest-leverage activities a security team can invest in, because every bug found in a popular open-source library is a bug fixed for everyone who depends on it.
This post covers how fuzzing works today, which tools are worth your time, and how to fit fuzzing into a supply-chain security program without burning a quarter's engineering budget on it.
Coverage-Guided Fuzzing in One Paragraph
A coverage-guided fuzzer generates inputs, feeds them to a target program, and monitors which code paths the program executes. Inputs that reach new code paths are kept and mutated. Inputs that do not are discarded. Over time, the fuzzer builds up a corpus of inputs that covers the program's behavior broadly, and along the way it finds inputs that crash the program or trigger sanitizer errors.
This feedback loop is the central insight. Random fuzzing is mostly useless against structured inputs because the probability of producing a valid JSON document by chance is vanishingly small. Coverage-guided fuzzing bootstraps from simple inputs and learns structure by watching which mutations reach new code.
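The loop is small enough to sketch in a few dozen lines. What follows is a toy illustration, not how AFL or libFuzzer are implemented: the target `toy_parser`, the helper names, and the use of Python's `sys.settrace` for line coverage are all assumptions made for the example. Real fuzzers use compiled-in edge instrumentation and far smarter mutation and scheduling.

```python
import random
import sys


def toy_parser(data: bytes) -> None:
    """Hypothetical target: raises only when the input starts with b"FUZ"."""
    if len(data) >= 1 and data[0] == ord("F"):
        if len(data) >= 2 and data[1] == ord("U"):
            if len(data) >= 3 and data[2] == ord("Z"):
                raise ValueError("planted bug reached")


def run_with_coverage(target, data):
    """Run target(data); return (set of executed lines, whether it raised)."""
    covered = set()

    def tracer(frame, event, arg):
        if event == "line":
            covered.add((frame.f_code.co_name, frame.f_lineno))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        target(data)
        crashed = False
    except Exception:
        crashed = True
    finally:
        sys.settrace(previous)
    return covered, crashed


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level edit: overwrite, insert, or delete."""
    buf = bytearray(data)
    op = rng.choice(("overwrite", "insert", "delete"))
    if op == "insert" or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif op == "overwrite":
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(target, seeds, max_iters=200_000, seed=0):
    """Return the first crashing input found, or None if the budget runs out."""
    rng = random.Random(seed)
    corpus = list(seeds)
    seen = set()
    for entry in corpus:
        seen |= run_with_coverage(target, entry)[0]
    for _ in range(max_iters):
        child = mutate(rng.choice(corpus), rng)
        covered, crashed = run_with_coverage(target, child)
        if crashed:
            return child
        if not covered <= seen:
            # The child executed at least one new line: keep it for mutation.
            corpus.append(child)
            seen |= covered
    return None
```

Calling `fuzz(toy_parser, [b"AAAA"])` searches for an input that begins with `FUZ`. A blind random guess at three specific bytes succeeds about once in 256^3, roughly 16.7 million, attempts. The coverage signal splits that into three separate one-byte searches, because each correct byte executes a new line and the input that got there is kept as a starting point for the next mutation.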
The Tool Landscape
libFuzzer is the in-process fuzzer distributed with LLVM. It is the simplest to adopt because you write a single function, link it against libFuzzer, and run the resulting binary. libFuzzer is the default choice for C and C++ libraries with a small, well-defined input surface.
AFL++ is the community-maintained successor to Michal Zalewski's original AFL. It runs the target in a separate process from the fuzzer, so a target crash does not take the fuzzer down with it, and it ships with an extensive set of mutators, instrumentation modes, and input schedulers. Marc Heuse and the AFL++ team published a 2020 WOOT paper, "AFL++: Combining Incremental Steps of Fuzzing Research," that documents the design decisions.
Honggfuzz, from Robert Swiecki at Google, is the third major C and C++ fuzzer. It is less widely used but has some clever features, notably hardware-based feedback drawn from CPU performance counters and Intel processor tracing.
Jazzer brings libFuzzer-style coverage-guided fuzzing to Java and Kotlin. It uses bytecode instrumentation and has found dozens of vulnerabilities in widely used libraries like Apache Commons Compress and the Bouncy Castle crypto provider.
Atheris does the same thing for Python, and cargo-fuzz is the standard for Rust. go-fuzz, from Dmitry Vyukov, pioneered coverage-guided fuzzing in Go, and the Go toolchain has included native fuzzing, via go test -fuzz, since version 1.18.
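The "write a single function" model carries over from libFuzzer to the other languages listed here. Below is a minimal sketch of what such a harness could look like for Atheris, assuming the target is Python's built-in `json` module and that Atheris is installed. The choice of target and the decision to treat `ValueError` as expected are assumptions for illustration.

```python
import json
import sys


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point: feed one fuzzer-generated input to the code under test."""
    try:
        text = data.decode("utf-8")
        json.loads(text)
    except ValueError:
        # Rejecting malformed input is expected behaviour, not a finding.
        # UnicodeDecodeError and json.JSONDecodeError both subclass ValueError.
        return


def main() -> None:
    # Assumption: Atheris is installed (pip install atheris). The import lives
    # here so the harness function above stays importable without it.
    import atheris

    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling `main()` from a script hands control to the fuzzing engine, which then calls `TestOneInput` repeatedly with generated inputs. Any exception the harness does not catch propagates to the fuzzer and is reported as a finding, so the `except` clause is where you decide which failures count as normal rejection.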
OSS-Fuzz and Its Track Record
OSS-Fuzz, run by Google's security team, is the single most important fuzzing infrastructure in the open-source world. Projects integrate their fuzz targets, Google provides the compute, and vulnerabilities are automatically reported to maintainers with a ninety-day disclosure window.
As of the most recent public numbers, OSS-Fuzz has found over ten thousand vulnerabilities and bugs across hundreds of open-source projects. The list of beneficiaries reads like a who's-who of critical infrastructure: OpenSSL, SQLite, FFmpeg, ImageMagick, systemd, curl, Python's standard library, and on and on.
For a supply-chain security program, OSS-Fuzz is a gift. If your application depends on a library that OSS-Fuzz covers, that library is under constant adversarial pressure and you get the benefit for free.
Structure-Aware Fuzzing
Plain byte-level mutation hits a wall with deeply structured formats. If your target parses protobuf, XML, or ASN.1, random mutations usually produce invalid inputs that the parser rejects immediately. The fuzzer never gets deep into the parsing logic.
Structure-aware fuzzing solves this by mutating the abstract syntax tree instead of the serialized bytes. libprotobuf-mutator, from Kostya Serebryany's group at Google, mutates protobuf messages directly. Nautilus, from Cornelius Aschermann and collaborators at Ruhr University Bochum, uses a context-free grammar to generate inputs and has found bugs in PHP, Lua, and ChakraCore that byte-level fuzzers missed.
For anyone fuzzing a parser, structure-aware techniques are table stakes. The upfront cost of writing a grammar or protobuf schema pays for itself within a week of runtime.
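To make the idea concrete, here is a small sketch of grammar-based generation and tree-level mutation. It is not how Nautilus or libprotobuf-mutator are implemented. The grammar (a tiny JSON subset), the tuple-based tree representation, and every function name are hypothetical choices for the example.

```python
import json
import random

# Hypothetical grammar for a small JSON subset. Picking the first alternative
# of every rule always terminates, which is what the depth limit relies on.
GRAMMAR = {
    "value":    [["number"], ["string"], ["array"], ["object"]],
    "number":   [["0"], ["-1"], ["3.5"], ["1e9"]],
    "string":   [['""'], ['"a"'], ['"\\u0041"']],
    "array":    [["[", "]"], ["[", "elements", "]"]],
    "elements": [["value"], ["value", ",", "elements"]],
    "object":   [["{", "}"], ["{", "members", "}"]],
    "members":  [["pair"], ["pair", ",", "members"]],
    "pair":     [["string", ":", "value"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand symbol into a derivation tree: (symbol, children) or a terminal str."""
    if symbol not in GRAMMAR:
        return symbol
    options = GRAMMAR[symbol]
    rule = options[0] if depth >= max_depth else rng.choice(options)
    return (symbol, [generate(s, rng, depth + 1, max_depth) for s in rule])


def serialize(node) -> str:
    """Flatten a derivation tree back into the text the target will parse."""
    if isinstance(node, str):
        return node
    return "".join(serialize(child) for child in node[1])


def subtrees(node, path=()):
    """Yield (path, node) for every non-terminal node in the tree."""
    if isinstance(node, str):
        return
    yield path, node
    for i, child in enumerate(node[1]):
        yield from subtrees(child, path + (i,))


def replace(node, path, new):
    """Return a copy of the tree with the node at path swapped for new."""
    if not path:
        return new
    symbol, children = node
    i = path[0]
    return (symbol, children[:i] + [replace(children[i], path[1:], new)] + children[i + 1:])


def mutate(tree, rng):
    """Regenerate one randomly chosen subtree from the same grammar symbol."""
    path, node = rng.choice(list(subtrees(tree)))
    return replace(tree, path, generate(node[0], rng, depth=len(path)))


def accepted_by_parser(text: str) -> bool:
    """Stand-in for the target: does the input get past syntax checking?"""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False
```

Because `mutate` swaps a subtree for another derivation of the same symbol, every mutated input is still syntactically valid, so the fuzzer's time goes into the code behind the syntax check rather than into the parser's error path. A production setup would combine this with coverage feedback and would also deliberately break the grammar some of the time.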
Sanitizers Are Half the Story
A fuzzer only finds bugs that manifest as observable behavior. Memory-safety bugs usually cause crashes eventually, but the crash may be far from the actual bug. Sanitizers close this gap.
AddressSanitizer, from Kostya Serebryany and collaborators at Google, detects heap overflows, use-after-free, and a handful of other memory-safety issues at a typical slowdown of about 2x. UndefinedBehaviorSanitizer catches signed overflow, null dereferences, and alignment violations. MemorySanitizer detects reads of uninitialized memory. ThreadSanitizer finds data races.
Any serious fuzzing setup runs with AddressSanitizer at a minimum, often combined with UBSan. Without a sanitizer, an out-of-bounds read or a small overflow often corrupts nothing the program notices, so the fuzzer sees a clean exit and discards the very input that triggered the bug.
Where Fuzzing Fits in a Supply Chain Program
The mistake most teams make is treating fuzzing as a quarterly project instead of a continuous activity. A weekend of fuzzing a library produces a handful of findings and then stops. A month of continuous fuzzing, with a steadily growing corpus, produces orders of magnitude more coverage and finds bugs that only surface after hours of execution.
The practical answer for most teams is to contribute fuzz targets to OSS-Fuzz for the libraries you depend on. If your company's product ships with a particular XML parser, protobuf implementation, or image codec, writing a fuzz target for that library is high-leverage work. OSS-Fuzz then runs it forever, for free, and you get CVE credit and a more secure dependency.
For libraries that cannot be integrated into OSS-Fuzz, such as closed-source code, you can run ClusterFuzzLite, a lightweight CI-integrated fuzzer from the same team. It is typically configured to fuzz for a short, fixed time budget per pull request, on the order of minutes, and catch regressions before they merge. A short per-change run is not a replacement for continuous fuzzing, but it is useful as a gate.
Reading Fuzz Reports
A fuzz report typically gives you a crashing input, a stack trace, and a sanitizer diagnosis. The skill is in triaging which crashes are security-relevant.
A null-pointer dereference in a parser is usually a denial-of-service bug. A heap overflow in a parser should be treated as a potential remote code execution primitive until analysis shows otherwise. A use-after-free in an event loop might be either, depending on exploitability.
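Project Zero's 2022 post, "The More You Know, The More You Know You Don't Know," a review of the 0-days detected in the wild in 2021, found that most of them followed previously known bug patterns, with memory corruption the dominant class. Most fuzzer-reported bugs fall into the same clear categories, and a few hours of analysis is usually enough to place a crash in one of them. For a supply-chain program, even denial-of-service bugs matter, because they can be used against production systems.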
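The rough rules in this section can be written down as a first-pass classifier. The sketch below is illustrative only. The bug-type strings are the names AddressSanitizer uses in its reports, but the function, the severity labels, and the null-page threshold are assumptions, and the output is a starting point for manual analysis rather than an exploitability verdict.

```python
from typing import Optional

# Bug-type names as AddressSanitizer reports them. Mapping them to a severity
# bucket is an assumed policy for this example.
MEMORY_CORRUPTION = {
    "heap-buffer-overflow",
    "stack-buffer-overflow",
    "global-buffer-overflow",
    "heap-use-after-free",
}

# Assumption: a fault address inside the first page is a null dereference.
NULL_PAGE_LIMIT = 0x1000


def triage(bug_type: str, fault_address: Optional[int] = None) -> str:
    """Return a first-pass severity bucket for one sanitizer finding."""
    if bug_type in MEMORY_CORRUPTION:
        # Treat as potentially exploitable until analysis shows otherwise.
        return "high"
    if bug_type == "SEGV":
        if fault_address is not None and fault_address < NULL_PAGE_LIMIT:
            # Probable null dereference: usually denial of service.
            return "medium"
        # A fault at an arbitrary address may mean a corrupted pointer.
        return "high"
    return "needs-review"
```

For example, `triage("SEGV", 0x0)` lands in the medium bucket and `triage("heap-buffer-overflow")` in the high bucket. Anything the table does not recognise is routed to a human instead of being guessed at.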
A Realistic Expectation
Fuzzing is not a silver bullet. It finds memory-safety bugs, parser bugs, and edge cases in well-defined input-processing code. It rarely finds logic bugs, authorization issues, or cryptographic misuse. Those require different techniques.
But for the bugs that fuzzing does find, it is the most cost-effective technique in vulnerability research. A single well-written fuzz target, running continuously for a year, can find bugs that would take a team of reviewers months of manual work. For open-source dependencies, the economics are overwhelming.
How Safeguard Helps
Safeguard tracks which of your open-source dependencies are covered by OSS-Fuzz and surfaces new fuzzing-discovered CVEs against them within hours of disclosure. For libraries outside OSS-Fuzz coverage, our research team runs targeted fuzzing campaigns against packages that appear in a significant portion of our customers' dependency graphs, and we publish the findings responsibly through coordinated disclosure. If a zero-day is found in a library you use, Safeguard tells you whether the vulnerable code is reachable from your application before the official advisory lands.