Pure-Ruby gems are straightforward to reason about. You can read them, you can grep them, you can diff them across versions. Native extensions are a different story. A gem like nokogiri, pg, sqlite3, or eventmachine ships C code that compiles against system libraries at install time, or in the case of precompiled gems, ships binary artifacts that were produced on a maintainer's machine or in a CI runner. The supply-chain surface for native extensions is meaningfully larger than for pure-Ruby gems, and the auditing story is meaningfully weaker. This post walks through what that looks like in practice and what you can do about it.
The core issue is that a native extension pulls in two supply chains that a pure-Ruby gem does not: the compiler toolchain that builds it, and the system libraries it links against. A CVE in the libxml2 that ships with your operating system becomes your CVE via nokogiri, even though the Ruby Advisory Database may never list it. A compromised maintainer who publishes a precompiled native gem can hide malicious code in the binary that would have been visible had it shipped as source.
How do native extensions actually get built?
A gem with a native extension includes an extconf.rb file that uses the mkmf standard library to probe for system libraries and generate a Makefile. When you run bundle install, RubyGems (invoked by Bundler) runs extconf.rb, then runs make against the generated Makefile, which produces a shared library that Ruby loads at runtime (.so on Linux and on Windows builds of Ruby, .bundle on macOS). This is the source-gem path, and it runs entirely on your machine during install.
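To find out which installed gems take this source-build path, you can ask RubyGems which gemspecs declare an extension build script. The following is a minimal sketch; the helper name native_extension_gems is made up for illustration and is not part of RubyGems.

```ruby
require "rubygems"
require "rbconfig"

# Hypothetical helper: names and versions of gems whose gemspec declares
# at least one extension build script (for example ext/foo/extconf.rb).
def native_extension_gems(specs = Gem::Specification.to_a)
  specs.reject { |spec| spec.extensions.empty? }
       .map { |spec| "#{spec.name} #{spec.version}" }
       .sort
end

if $PROGRAM_NAME == __FILE__
  # DLEXT is the suffix this Ruby uses for compiled extensions.
  puts "Extension suffix on this Ruby: .#{RbConfig::CONFIG['DLEXT']}"
  puts native_extension_gems
end
```

Precompiled platform gems typically ship with an empty extensions list, so this only reports gems that were, or would be, compiled locally.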
The precompiled-gem path is different. The maintainer builds the shared libraries for a set of platforms in advance, typically in a GitHub Actions workflow using rake-compiler-dock, and publishes platform-specific gems to RubyGems.org. A user installing on x86_64-linux downloads the precompiled .so directly without running a local compile. This is dramatically faster and avoids the compiler-dependency problem, but it shifts trust from your machine to the maintainer's build environment.
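Whether you get a precompiled artifact depends on how RubyGems matches your local platform against the platforms a gem was published for. A rough sketch of that matching, using Gem::Platform from RubyGems; the helper precompiled_matches and the list of published platforms are illustrative assumptions, not data from any real gem:

```ruby
require "rubygems"

# Hypothetical helper: given the platform strings a gem version was
# published for, return those RubyGems treats as compatible with `local`.
# The "ruby" platform is the source gem, so it is excluded here.
def precompiled_matches(published, local = Gem::Platform.local)
  published.reject { |name| name == "ruby" }
           .select { |name| Gem::Platform.new(name) === local }
end

# Illustrative platform list for an imaginary gem release.
published = %w[ruby x86_64-linux aarch64-linux arm64-darwin]
p precompiled_matches(published, Gem::Platform.new("x86_64-linux"))
p precompiled_matches(published) # matches for the machine running this
```

An empty result for your own platform means the install falls back to the source gem and a local compile.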
Most popular gems that could ship native extensions do publish precompiled variants. Nokogiri 1.16.0, released in December 2023, ships precompiled for 11 platforms including the major Linux architectures, macOS on both x86_64 and arm64, and Windows. The source-gem fallback is still there but increasingly rare as the precompiled coverage has improved. This is good for installation speed and bad for auditability.
What can go wrong in a source-gem build?
The source-gem path runs mkmf probes that execute arbitrary compiler commands on your machine. A maliciously crafted extconf.rb could in principle do almost anything an attacker wants: fetch additional code from a remote server, probe the filesystem, write files outside the gem directory. In practice, gems with this kind of extconf.rb would be caught quickly because the code is in the published gem and visible to anyone looking. But "visible to anyone looking" is a weak defense when most consumers do not look.
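If nobody is looking, a script can at least do a first pass. The sketch below flags lines in an extconf.rb that mention network clients, shell execution, or eval. The pattern list is an assumption chosen for illustration; it is a heuristic that a determined attacker can evade, so treat hits as prompts for manual review and not as a verdict.

```ruby
# Heuristic patterns only; absence of a match proves nothing.
SUSPICIOUS_PATTERNS = {
  "network access"  => /\b(?:Net::HTTP|URI\.open|open-uri|curl|wget)\b/,
  "shell execution" => /(?:\bsystem\s*\(|\bexec\s*\(|`[^`]+`|%x[\{\(\[])/,
  "dynamic eval"    => /\b(?:eval|instance_eval|class_eval)\b/,
}.freeze

# Returns [line_number, label, stripped_line] for every pattern hit,
# skipping lines that are entirely comments.
def suspicious_lines(source)
  findings = []
  source.each_line.with_index(1) do |line, number|
    next if line.lstrip.start_with?("#")
    SUSPICIOUS_PATTERNS.each do |label, pattern|
      findings << [number, label, line.strip] if line.match?(pattern)
    end
  end
  findings
end

if $PROGRAM_NAME == __FILE__ && ARGV.first
  suspicious_lines(File.read(ARGV.first)).each do |number, label, text|
    puts "#{ARGV.first}:#{number}: #{label}: #{text}"
  end
end
```

Run against a single file, it prints one line per hit. Legitimate extconf.rb files do sometimes shell out, so expect false positives.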
A more realistic risk is that the build process pulls in system headers and libraries that have their own vulnerabilities. A gem that links against an old OpenSSL may inherit CVEs in that OpenSSL even if the gem code is pristine. This is where SBOM-based scanning becomes essential, because you need to know what your installed binary is actually composed of, not just what the gem's Ruby code looks like.
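Short of a full SBOM, several extensions will report the version of the C library they loaded. A minimal sketch, assuming the openssl and psych extensions are available in your Ruby and treating nokogiri as optional; the helper name linked_library_versions is made up for this example:

```ruby
require "openssl"
require "psych"

# Collect the versions of C libraries this Ruby process actually loaded.
def linked_library_versions
  versions = {
    "openssl" => OpenSSL::OPENSSL_LIBRARY_VERSION,
    "libyaml" => Psych::LIBYAML_VERSION,
  }
  begin
    # Nokogiri is optional; only report it when the gem is installed.
    require "nokogiri"
    versions["libxml2"] = Nokogiri::VERSION_INFO.dig("libxml", "loaded")
    versions["libxslt"] = Nokogiri::VERSION_INFO.dig("libxslt", "loaded")
  rescue LoadError
    # nokogiri is not installed in this environment; skip it.
  end
  versions
end

linked_library_versions.each { |name, version| puts "#{name}: #{version}" }
```

These are the versions loaded at runtime, which can differ from the versions the extension was compiled against when the library is linked dynamically.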
The build-time network-access risk is often under-appreciated. Some extconf.rb implementations fetch source archives for upstream projects such as libffi, libiconv, or libyaml and compile them inline. This introduces a dependency on a third-party download that is outside RubyGems.org's control. The psych gem historically bundled the libyaml source; since psych 5.0 it no longer does, and it builds against the libyaml installed on the system. A gem that still fetches at build time is pulling from a URL the attacker does not control today but might tomorrow.
What can go wrong in a precompiled-gem build?
The precompiled-gem path moves the compile off your machine and onto the maintainer's. Whatever security properties their build environment has, you inherit. If the maintainer's GitHub Actions runner is compromised, the binaries you install are compromised. If the maintainer's local laptop is compromised, same story. You have no visibility into the build environment beyond what the maintainer chooses to publish.
The SLSA provenance movement is trying to fix this by requiring build platforms to produce signed attestations that describe the build environment, inputs, and outputs. GitHub Actions added SLSA Level 3 provenance support for Ruby gem builds in late 2023, and some security-conscious gems (ruby-lsp, sorbet-runtime, and a handful of others) publish attestations alongside their precompiled gems. RubyGems.org does not yet surface these attestations prominently, so consuming them requires custom tooling, but the data is available for gems that opt in.
The more common risk with precompiled gems is version-skew between the source code in the gem and the binary that actually ships. A reader who audits the source of gem-foo 2.3.1 has no guarantee that the .so inside the gem-foo-2.3.1-x86_64-linux.gem artifact was built from exactly that source. Reproducible builds would close this gap, but very few Ruby native extensions have reproducible-build pipelines today.
What system library CVEs should you worry about?
The system libraries that ship inside or link against popular Ruby native gems have their own CVE streams that the Ruby Advisory Database does not comprehensively track. Some of the important ones to watch:
- libxml2 and libxslt, linked by nokogiri. CVE-2024-25062, a use-after-free in the XML reader interface triggered when DTD validation and XInclude expansion are both enabled, was disclosed in February 2024. It affected versions before 2.11.7 and 2.12.x before 2.12.5, and was present in nokogiri's vendored libxml2 until 1.16.2 shipped with the patched library.
- OpenSSL, linked by openssl (standard library) and indirectly by many gems. The 3.0, 3.1, and 3.2 series have had a steady stream of CVEs; CVE-2024-0727, disclosed January 2024, affected PKCS12 parsing in versions before 3.0.13, 3.1.5, and 3.2.1.
- libyaml, linked by psych. Since psych 5.0 the gem no longer bundles the libyaml source and builds against the system copy, so its patch level is set by your OS packages or base image and not by the gem version.
- libsqlite3, linked by the sqlite3 gem (formerly published as sqlite3-ruby). The 3.x series has ongoing fuzzing coverage and a moderate CVE cadence; make sure your gem's bundled version is current.
The gems that vendor these libraries have been improving their practice of bumping vendored versions promptly after upstream CVEs, but the practice is inconsistent across the ecosystem. Nokogiri is generally very responsive, with patches typically shipping within 7 to 14 days of an upstream libxml2 CVE. Smaller gems with less-active maintainers can lag by months or not patch at all.
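Checking a library version against an advisory is mostly a per-series comparison. The sketch below encodes the fixed releases from the OpenSSL entry above (3.0.13, 3.1.5, 3.2.1) and uses Gem::Version for the ordering. The table and the helper name affected? are illustrative; a real scanner would load this data from a CVE feed.

```ruby
require "rubygems"

# First fixed release per series, taken from the OpenSSL entry above.
FIXED_IN = {
  "3.0" => "3.0.13",
  "3.1" => "3.1.5",
  "3.2" => "3.2.1",
}.freeze

# Returns true or false when the version's series is in the table,
# and nil when the table says nothing about that series.
def affected?(version_string, fixed_in = FIXED_IN)
  version = Gem::Version.new(version_string)
  series = version.segments.first(2).join(".")
  fixed = fixed_in[series]
  return nil unless fixed

  version < Gem::Version.new(fixed)
end

%w[3.0.12 3.0.13 3.2.0 3.3.0].each do |candidate|
  puts "#{candidate}: #{affected?(candidate).inspect}"
end
```

Returning nil for an unlisted series keeps "not covered by this table" distinct from "known to be fixed", which matters when a new release series appears before your data is updated.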
How do you monitor this?
The baseline is SBOM-based scanning that includes the native-library layer. A plain gem-version scan tells you that your nokogiri is 1.16.1; an SBOM-based scan tells you that nokogiri 1.16.1 embeds libxml2 2.12.4 and libxslt 1.1.39, and cross-references those against CVE feeds for the upstream projects. Not every tool that generates CycloneDX SBOMs for Ruby applications captures this metadata, so confirm that the output of generators such as cyclonedx-ruby or bundler-cyclonedx lists vendored libraries as components and not only gems.
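As a sketch of what "includes the native-library layer" can look like, the snippet below builds a CycloneDX-style document in which vendored libraries are nested components of the gem that embeds them. The helper sbom_with_vendored, the gem name gem-foo, and the versions passed in are all placeholders for illustration; this is not the output format of any particular generator.

```ruby
require "json"

# Hypothetical helper: one gem component with its vendored C libraries
# recorded as nested components, so a scanner can match them to CVE feeds.
def sbom_with_vendored(gem_name:, gem_version:, vendored:)
  nested = vendored.map do |lib, version|
    { "type" => "library", "name" => lib, "version" => version,
      "purl" => "pkg:generic/#{lib}@#{version}" }
  end
  {
    "bomFormat" => "CycloneDX",
    "specVersion" => "1.5",
    "version" => 1,
    "components" => [
      { "type" => "library", "name" => gem_name, "version" => gem_version,
        "purl" => "pkg:gem/#{gem_name}@#{gem_version}",
        "components" => nested },
    ],
  }
end

# Placeholder values, not a statement about any real release.
bom = sbom_with_vendored(gem_name: "gem-foo", gem_version: "2.3.1",
                         vendored: { "libxml2" => "2.12.4" })
puts JSON.pretty_generate(bom)
```

If your SBOM for a gem with vendored libraries has no nested or sibling entries like these, the scanner consuming it cannot see the native layer.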
For precompiled gems, you want to verify that the binary you have matches the binary the maintainer published. This does not tell you whether the maintainer's build was trustworthy; it detects tampering between RubyGems.org and your install step. The checksums that Bundler 2.5 can record in Gemfile.lock cover this for gem artifacts; for a deeper check, you can hash the .so files inside the installed gem and compare against known-good hashes from a previous clean install.
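The hashing step can be sketched in a few lines. The helper names extension_digests and changed_extensions are made up for this example, and where you store the baseline is left open.

```ruby
require "digest"

# Map each compiled extension under `root` (relative path) to its SHA-256.
def extension_digests(root)
  Dir.glob("**/*.{so,bundle}", base: root).sort.to_h do |relative|
    [relative, Digest::SHA256.file(File.join(root, relative)).hexdigest]
  end
end

# Paths that were added, removed, or modified relative to the baseline.
def changed_extensions(baseline, current)
  (baseline.keys | current.keys)
    .reject { |path| baseline[path] == current[path] }
    .sort
end

if $PROGRAM_NAME == __FILE__ && ARGV.first
  extension_digests(ARGV.first).each { |path, sha| puts "#{sha}  #{path}" }
end
```

Pass the installed gem's directory as the root, for example the full_gem_path of its Gem::Specification. Locally compiled extensions will differ between machines and toolchains, so this comparison is only meaningful for precompiled gems or for rebuilds in an identical environment.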
For source-gem builds, the most useful hardening is running the build in a sandbox that limits network access and filesystem write paths. Some organizations do this by running bundle install inside a container with restricted egress. This catches extconf.rb implementations that phone home or fetch unexpected resources, and it makes the build more reproducible as a side effect.
What should maintainers of native-extension gems do?
The most impactful step is to adopt trusted publishing through GitHub Actions, which eliminates the long-lived API token and ties publishes to a specific workflow. The second step is to publish SLSA provenance attestations alongside your precompiled artifacts, which gives consumers a paper trail. The third step, harder but valuable, is to aim for reproducible builds so that a third party can verify your binary matches your source.
Vendoring system libraries comes with ongoing work. Pick libraries that have active upstream security response, keep your vendored version current, and document your upgrade cadence publicly so consumers know what to expect. A gem README that says "we bump vendored libxml2 within 14 days of a CVSS 7.0+ upstream CVE" is far more useful than silence on the topic.
How Safeguard Helps
Safeguard scans your Ruby applications at the native-library layer, not just the gem layer, surfacing CVEs in vendored system libraries like libxml2, libyaml, and OpenSSL that pure-gem scanners miss. We track which of your native-extension gems ship precompiled binaries with SLSA provenance attestations, and we alert when a gem's precompiled variant diverges from what its source would produce. For gems that build from source in your CI, we catch unusual network access or filesystem writes during the extconf probe, giving you early warning of tampered build logic. Native extensions are the hardest part of the Ruby supply chain to audit, and we make them legible.