Open Source Security

Python Cython Extensions and the Supply Chain

Cython-built Python extensions ship as platform-specific binaries with a build toolchain behind them. That introduces supply chain surface most teams have not mapped.

Nayan Dey
Senior Security Engineer
7 min read

Cython is one of those tools most Python developers use transitively without thinking about. NumPy, Pandas, SciPy, scikit-learn, lxml, PyYAML (when built with C speedups), Pydantic (historically), and dozens of other high-performance packages ship Cython-generated C code that gets compiled into platform-specific binary extensions. If you pip install numpy, you are installing a wheel that was built from Cython sources against a specific C compiler, against specific platform libraries, on a specific CI runner.

That is a supply chain story, and it is not the same story as pure-Python packages. This is a look at what Cython-based dependencies mean for your security posture.

Two Artifacts, Two Trust Boundaries

When a Cython-using project publishes to PyPI, there are typically two things uploaded:

A source distribution (.tar.gz) containing the .pyx Cython source files, sometimes the generated .c files, a setup.py or pyproject.toml, and everything needed to build from scratch. If a user installs from source, Cython runs, then a C compiler runs, and the result is a .so or .pyd binary that becomes part of the installed package.

One or more wheel distributions (.whl) per supported Python version and platform. Linux x86_64 with CPython 3.11 is one wheel; macOS ARM64 with 3.11 is another; Windows x86_64 with 3.12 is another. Each wheel is a pre-compiled binary.
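That per-interpreter, per-platform matrix is encoded directly in wheel filenames as compatibility tags. A minimal illustrative parser (for real code, use parse_wheel_filename from the packaging library, which handles build tags and edge cases properly):

```python
# Wheel filenames follow: name-version[-build]-python_tag-abi_tag-platform_tag.whl
# Minimal parser for illustration only; `packaging.utils.parse_wheel_filename`
# is the proper tool.
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its compatibility tags."""
    stem = filename.removesuffix(".whl")
    parts = stem.split("-")
    name, version = parts[0], parts[1]
    # The last three fields are always the compatibility tags, whether or
    # not an optional build tag is present.
    python_tag, abi_tag, platform_tag = parts[-3], parts[-2], parts[-1]
    return {"name": name, "version": version,
            "python": python_tag, "abi": abi_tag, "platform": platform_tag}

for whl in [
    "numpy-2.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "numpy-2.1.0-cp311-cp311-macosx_11_0_arm64.whl",
    "numpy-2.1.0-cp312-cp312-win_amd64.whl",
]:
    tags = parse_wheel_filename(whl)
    print(tags["python"], tags["abi"], tags["platform"])
```

Each of those three filenames is a separate pre-compiled binary artifact, and each one is a separate thing you are trusting.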

The trust story is different for each. For a source install, you are running the project's build pipeline on your machine, including Cython and a C compiler. For a wheel, you are trusting that the wheel was built honestly from the declared source and not tampered with.

Most users install wheels, most of the time. pip install numpy on a supported platform gets a pre-built wheel and never invokes Cython or GCC on your machine.

cibuildwheel Is the Industry Standard

Most Cython-heavy projects build their wheels using cibuildwheel, a tool that wraps up the pattern of "build all the wheels for all the platforms in CI." It integrates with GitHub Actions, runs platform-specific containers (manylinux for Linux, cross-compilation or matrix runners for macOS and Windows), and produces the matrix of wheels a project needs.

cibuildwheel is reasonably security-conscious. It pins the manylinux image to specific SHAs, it uses the official Python builds inside the container, and it supports attestation workflows. A project that uses cibuildwheel with the default settings inherits those protections.

Projects that hand-roll their wheel builds — uploading wheels manually from a maintainer's laptop, using bespoke CI scripts — have a weaker provenance story. When evaluating a Cython-based dependency for security-critical use, check how the wheels are built. The CI configuration in the repository usually tells you.

Does Your Wheel Match the Source?

This is the hard question. When you pip install a wheel, you are trusting it matches the published source distribution. There is no automatic verification. The wheel might have been built from a different git commit, might include additional code that is not in the source tarball, might link against a differently-configured C library.
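One thing you can verify mechanically is the wheel's internal RECORD manifest, which hashes every file in the archive. It is worth being precise about what this does and does not prove, so here is a sketch using only the standard library:

```python
import base64
import hashlib
import zipfile

def verify_wheel_record(wheel_path: str) -> list[str]:
    """Return paths whose contents do not match the hashes in RECORD.

    This detects corruption or tampering that happened *after* the wheel
    was built. It proves nothing about the build itself: RECORD is
    written by the builder, so a malicious builder simply records
    matching hashes for its malicious files.
    """
    mismatches = []
    with zipfile.ZipFile(wheel_path) as zf:
        record_name = next(n for n in zf.namelist()
                           if n.endswith(".dist-info/RECORD"))
        for line in zf.read(record_name).decode().splitlines():
            if not line:
                continue
            # RECORD lines are CSV: path,algo=urlsafe-b64-digest,size
            # (paths containing commas are quoted; this sketch skips that case)
            path, digest, _size = line.rsplit(",", 2)
            if not digest:  # RECORD's own entry carries no hash
                continue
            algo, _, expected = digest.partition("=")
            actual = hashlib.new(algo, zf.read(path)).digest()
            encoded = base64.urlsafe_b64encode(actual).rstrip(b"=").decode()
            if encoded != expected:
                mismatches.append(path)
    return mismatches
```

RECORD verification answers "was this archive modified after build?" It cannot answer "was this archive built honestly from the declared source?" That gap is what attestations and reproducible builds try to close.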

PEP 740 attestations help — they tell you a wheel was built by a specific CI workflow from a specific git commit. But the attestation does not prove that the Cython-to-C translation was honest, that the C compiler was honest, or that the linking pulled in only what was declared.

The practical mitigation for high-security deployments is reproducible builds. For a small number of critical dependencies, rebuild the wheel yourself in a controlled environment and compare output against the published wheel (or better, bit-for-bit against a well-known reproducible build output). Reproducible Python wheels are hard (timestamps embedded in metadata, compiler optimization non-determinism, etc.), but it is doable and some projects are actively working on it.

For most of your dependency tree, you accept the attestation and move on. The goal is to have a tier of critical deps where you have rebuilt and compared.
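When you do rebuild, a naive bit-for-bit comparison of the two wheel files usually fails on zip metadata alone (timestamps, entry order), so a useful first pass compares member contents and isolates which files actually differ. A stdlib-only sketch:

```python
import hashlib
import zipfile

def wheel_content_digests(wheel_path: str) -> dict[str, str]:
    """Map each member path in a wheel to a sha256 of its contents.

    Hashing contents rather than the archive ignores zip metadata
    (timestamps, entry order) that differs between builds even when the
    compiled output is identical.
    """
    with zipfile.ZipFile(wheel_path) as zf:
        return {info.filename: hashlib.sha256(zf.read(info)).hexdigest()
                for info in zf.infolist() if not info.is_dir()}

def diff_wheels(published: str, rebuilt: str) -> dict[str, str]:
    """Report members that differ between a published and a rebuilt wheel."""
    a = wheel_content_digests(published)
    b = wheel_content_digests(rebuilt)
    diff = {}
    for path in sorted(set(a) | set(b)):
        if a.get(path) != b.get(path):
            diff[path] = f"{a.get(path, 'missing')} != {b.get(path, 'missing')}"
    return diff
```

A non-empty diff on a .so member still needs investigation: compiler non-determinism and genuine tampering look identical at this level, which is exactly why fully reproducible builds are the end goal.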

What About Bundled Libraries?

Many Cython-using projects also bundle C libraries inside the wheel. NumPy's Linux wheels bundle a specific version of OpenBLAS. Pillow bundles libjpeg, libpng, libtiff, and more. lxml bundles libxml2 and libxslt on some platforms.

These bundled libraries are C dependencies that get shipped with the Python package, and they have their own CVEs. When libjpeg publishes a security fix, a Pillow wheel built before that fix still ships the vulnerable copy, and pip install pillow==X.Y will not pick up the libjpeg fix until Pillow publishes a new wheel built against the patched library.

This is why a CVE in a C library can cascade across the Python ecosystem with long tails. The libjpeg vulnerabilities of the mid-2010s lingered in bundled-libjpeg Python wheels for months after the C library was patched, because each downstream Python package had to cut a new release.

For SBOM purposes: the Python wheel is one component, but the C libraries inside it are separate components with their own versions and advisories. A thorough SBOM should unpack wheels and enumerate their bundled binaries. Most SBOM tools do not do this by default. Check yours.
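A minimal version of that unpacking step: scan the wheel's archive for shared libraries and separate the package's own extension modules from vendored C libraries. On Linux, auditwheel places vendored libraries under a <pkg>.libs/ directory; on macOS, delocate uses .dylibs/. The helper below is a sketch of that triage:

```python
import zipfile

SHARED_LIB_SUFFIXES = (".so", ".dylib", ".dll", ".pyd")

def bundled_binaries(wheel_path: str) -> dict[str, list[str]]:
    """Split a wheel's binary members into the package's own extension
    modules and vendored C libraries that carry their own advisories."""
    with zipfile.ZipFile(wheel_path) as zf:
        binaries = [n for n in zf.namelist()
                    if n.endswith(SHARED_LIB_SUFFIXES) or ".so." in n]
    # auditwheel (Linux) and delocate (macOS) place vendored libraries
    # in these well-known directories inside the wheel.
    vendored = [n for n in binaries if ".libs/" in n or ".dylibs/" in n]
    extensions = [n for n in binaries if n not in vendored]
    return {"vendored": vendored, "extensions": extensions}
```

Everything in the vendored list should appear in your SBOM as its own component; mapping a vendored filename like libopenblas64_p-r0-abc123.so back to an exact upstream version is the harder part that dedicated SBOM tooling has to solve.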

Historical Incidents

NumPy wheel issues, various years: NumPy's wheel build matrix is complex and has occasionally shipped wheels with subtle mismatches between architectures. These were not malicious incidents, but they illustrate how hard the build pipeline is to get right.

The ctx incident of 2022: not Cython-specific, but a reminder that PyPI account takeover plus a package with a C extension means malicious code gets compiled into your binary at install time and then runs with full process privileges. The ctx attacker could have included Cython-compiled payloads if they had wanted to; the attack they shipped was simpler.

Ultralytics PyPI incident, late 2024: compromised releases of the ultralytics package (a YOLO implementation) shipped a cryptocurrency miner. The package has C extensions and uses cibuildwheel. The incident showed that even well-maintained Cython-using projects can be compromised through the build pipeline itself: the attacker injected code via the project's GitHub Actions workflows rather than the source repository.

Should You Prefer Pure-Python Alternatives?

Sometimes. A pure-Python implementation has simpler supply chain properties: no compilation, no platform-specific wheels, no bundled C libraries, and a cleaner attestation story. For low-traffic code paths, the performance gap is often acceptable.

But for workloads where Cython is in the path for legitimate performance reasons (numerical computing, parsing, cryptography), the performance is worth the supply chain complexity. The mitigation is to treat Cython-using packages as higher-attention dependencies: pin tighter, audit releases more carefully, require attestations.

Building Your Own Cython Extensions

If your team maintains Cython extensions internally, the security checklist is specific.

Use cibuildwheel, not hand-rolled builds. Inherit the patterns the ecosystem has worked out.

Pin your Cython version in [build-system] requires. Cython's code generation can change between versions and you want reproducibility.
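For example, a hypothetical pyproject.toml fragment pinning the build-time toolchain (the version numbers are illustrative, not recommendations):

```toml
[build-system]
# Pin Cython and the build backend exactly: Cython's generated C, and
# therefore the compiled binary, can change between Cython releases.
requires = ["setuptools==75.3.0", "Cython==3.0.11"]
build-backend = "setuptools.build_meta"
```

Exact pins here trade convenience for reproducibility: you must bump them deliberately, but every build of a given release uses the same code generator.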

Bundle libraries with care. If you link against OpenSSL, libxml2, or other sensitive C libraries, those versions become part of your package's SBOM. Update them on a schedule.

Sign your wheels with Sigstore, via pypi-attestations or the equivalent. Provenance for binary wheels is more valuable than provenance for pure-Python packages, because there is more a malicious builder could do.

How Safeguard Helps

Safeguard treats Cython-using wheels as composite SBOM entries, enumerating bundled C libraries like OpenBLAS, libxml2, and libjpeg so that CVEs in those libraries surface against the Python packages that ship them. Reachability analysis evaluates whether your application code actually calls into the Cython extension paths that are affected by an advisory. Griffin AI can draft upgrade PRs that account for bundled-library CVEs, pointing you at the next Python package release that ships the fixed C library rather than suggesting upgrades that leave you on the vulnerable binary. Policy gates can block production deploys when a critical Cython-using dependency is missing an attestation or ships a bundled library below a CVE-driven minimum version.

Never miss an update

Weekly insights on software supply chain security, delivered to your inbox.