Semgrep and GitHub CodeQL are two of the most discussed static analysis engines in 2024, and teams frequently ask which one to standardize on. The honest answer is that they are optimized for different things: Semgrep for fast, readable, developer-owned rules that run in pre-commit and PR checks; CodeQL for deep, whole-program data-flow analysis that runs nightly and catches what pattern matchers miss. This post is for AppSec leads deciding which of the two to fund, or whether to run both. We use Semgrep 1.78 (the July 2024 release) and CodeQL CLI 2.18.x with the standard security-and-quality suite. We compare rule authoring ergonomics, language coverage, taint analysis depth, scan time, IDE integration, and pricing, and note specific limitations for each.
How do you actually write rules in each?
Semgrep rules are YAML with a pattern DSL; CodeQL queries are written in a purpose-built logic language. A Semgrep rule for detecting a hardcoded secret is 10-15 lines of YAML with a pattern: block that looks like the source itself - any developer who knows the target language can write one in under ten minutes. CodeQL queries use QL, a Datalog-derived language, and a reusable SSA-based class library; a comparable query is 30-60 lines and assumes familiarity with predicates and the data-flow library (DataFlow::ConfigSig modules in current releases, which supersede the older DataFlow::Configuration class). The payoff is that CodeQL can express conditions Semgrep cannot - for example, "a tainted value that flows through three method calls, one of which is defined in a sibling project." Semgrep's taint mode closes some of this gap, but the open-source engine reasons one function at a time; cross-function and cross-file tracking requires the Pro engine, and even then the analysis stays within a single repo rather than covering the whole program.
Which covers more languages well?
CodeQL goes deeper on fewer languages; Semgrep covers more with lighter analysis. CodeQL (2.18) supports C/C++, C#, Go, Java/Kotlin, JavaScript/TypeScript, Python, Ruby, and Swift, each analyzed through a semantic model of the code (syntax tree, control flow, and data flow) rather than text patterns. Semgrep supports 30+ languages with varying maturity - JavaScript/TypeScript, Python, Java, Go, Ruby, PHP, C, and Scala are production-grade, while Kotlin, Swift, and Elixir are marked "beta" or "experimental" in the 1.78 release notes. For a polyglot monorepo with Terraform, Dockerfile, and shell scripts alongside application code, Semgrep's IaC and config coverage is meaningfully wider. For a Java backend with deep framework usage, CodeQL's built-in models for Spring, Struts, and JAX-RS will find issues Semgrep misses.
How good is their taint analysis?
CodeQL's is stronger by design; Semgrep's is good enough for most of the OWASP Top 10. CodeQL's taint tracking is flow-, field-, and context-sensitive and ships with pre-built source/sink catalogs for the major web frameworks. It can track a tainted HTTP parameter through ORM calls into a raw SQL string across files, and through library code for which it has models. Semgrep's taint mode (GA in 2023) supports interprocedural analysis within a single repo when run with the Pro engine and has pre-built "Pro rules" for common frameworks, but it does not model reflection, dynamic dispatch, or cross-package flow as thoroughly. For detecting the long tail of XSS and SSRF in a mature Spring codebase, CodeQL found roughly 20-30% more true positives in our sandbox tests.
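To make the difference between per-function and cross-function taint tracking concrete, here is a toy sketch in Python. It is not how either engine is implemented. The tuple-based program representation, the function names `handler` and `build_query`, and the `source`/`sink` markers are all hypothetical, and the helper only follows calls one level deep.

```python
from typing import Dict, List, Set, Tuple

Stmt = Tuple[str, ...]
Program = Dict[str, List[Stmt]]

# Hypothetical program: user input passes through a helper before a SQL sink.
PROGRAM: Program = {
    "handler": [
        ("source", "user_id"),                        # user_id = <HTTP parameter>
        ("call", "query", "build_query", "user_id"),  # query = build_query(user_id)
        ("sink", "query"),                            # db.execute(query)
    ],
    "build_query": [
        ("assign", "q", "param"),                     # q = "SELECT ..." + param
        ("return", "q"),
    ],
}


def returns_taint(program: Program, func: str, param_tainted: bool) -> bool:
    """Report whether `func` returns a tainted value, given its parameter's taint."""
    tainted: Set[str] = {"param"} if param_tainted else set()
    for stmt in program[func]:
        if stmt[0] == "assign" and stmt[2] in tainted:
            tainted.add(stmt[1])
        elif stmt[0] == "return":
            return stmt[1] in tainted
    return False


def find_sink_hits(program: Program, func: str, follow_calls: bool) -> List[str]:
    """Return the variables that reach a sink while tainted inside `func`."""
    tainted: Set[str] = set()
    hits: List[str] = []
    for stmt in program[func]:
        kind = stmt[0]
        if kind == "source":
            tainted.add(stmt[1])
        elif kind == "assign" and stmt[2] in tainted:
            tainted.add(stmt[1])
        elif kind == "call":
            _, target, callee, arg = stmt
            # Per-function mode treats every call result as clean.
            if follow_calls and returns_taint(program, callee, arg in tainted):
                tainted.add(target)
        elif kind == "sink" and stmt[1] in tainted:
            hits.append(stmt[1])
    return hits


per_function = find_sink_hits(PROGRAM, "handler", follow_calls=False)
cross_function = find_sink_hits(PROGRAM, "handler", follow_calls=True)
print(per_function)    # []
print(cross_function)  # ['query']
```

The per-function pass misses the finding because it cannot see that `build_query` passes its argument through to its return value. Real engines add much more on top of this idea, such as sanitizers, field sensitivity, and call-context tracking.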
How do scan times compare on a real repo?
Semgrep is faster; CodeQL is slower because it first builds a database. On a 400k-LOC TypeScript monorepo, Semgrep 1.78 with the default p/typescript and p/security-audit packs completes in about 90 seconds on a 4-vCPU GitHub Actions runner. CodeQL on the same repo first requires a database build (roughly 6-8 minutes for TS, longer for compiled languages) and then a query run (another 4-5 minutes with the security-and-quality suite), for roughly 10-13 minutes end to end. CodeQL's performance penalty is by design - it builds a relational database of the code - but it means CodeQL is typically run nightly or on PR merge, while Semgrep is run on every commit. If your team gates on SAST at PR, Semgrep is the more practical default.
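The one-step versus two-step workflow can be sketched as the command lines a CI job would run. The Python below only builds and prints the commands; it does not execute them. The output paths (`semgrep.sarif`, `codeql.sarif`), the database directory `codeql-db`, and the suite file name are assumptions for illustration, and the flags should be checked against the versions you have installed.

```python
import shlex
from typing import List


def semgrep_command(configs: List[str], output: str) -> List[str]:
    """Single step: scan the working tree and write SARIF."""
    cmd = ["semgrep", "scan"]
    for config in configs:
        cmd += ["--config", config]
    cmd += ["--sarif", "--output", output]
    return cmd


def codeql_commands(db_dir: str, language: str, suite: str, output: str) -> List[List[str]]:
    """Two steps: build the database, then run a query suite against it."""
    create = ["codeql", "database", "create", db_dir,
              f"--language={language}", "--source-root=."]
    analyze = ["codeql", "database", "analyze", db_dir, suite,
               "--format=sarif-latest", f"--output={output}"]
    return [create, analyze]


semgrep_steps = [semgrep_command(["p/typescript", "p/security-audit"], "semgrep.sarif")]
codeql_steps = codeql_commands(
    "codeql-db",                            # hypothetical database directory
    "javascript",                           # CodeQL's extractor for JS and TS
    "javascript-security-and-quality.qls",  # assumed suite file name
    "codeql.sarif",
)

for step in semgrep_steps + codeql_steps:
    print(shlex.join(step))
```

Keeping the CodeQL database build as its own step makes it easier to cache or schedule separately, which is one reason teams move it to a nightly job.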
What do IDE and CI integrations look like?
Both have VS Code extensions; the CI stories diverge on where results land. Semgrep has VS Code and JetBrains extensions that run rules locally and annotate findings inline; it also has a hosted platform (Semgrep AppSec Platform) that collects PR findings and manages triage. CodeQL has a VS Code extension aimed primarily at query authors, not end users - findings surface in GitHub's code scanning UI (which requires GitHub Advanced Security on private repos). In CI, Semgrep emits SARIF and ships a GitHub Action, GitLab template, and generic Docker image; CodeQL has first-party GitHub Actions support but is awkward outside GitHub - running the codeql CLI on Jenkins or CircleCI is supported but not first-class.
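Because both tools can write SARIF, their results can be compared outside either vendor's UI. The sketch below groups findings from two SARIF documents by file and line to show where the tools overlap. The field names follow SARIF 2.1.0, but the two inline documents, the tool names, and the rule IDs are fabricated for illustration. Exact line matching is a crude heuristic, since the two tools may anchor the same issue on different lines.

```python
from typing import Dict, Iterator, List, Tuple

Finding = Tuple[str, str, str, int]  # (tool, rule_id, file, line)
Location = Tuple[str, int]           # (file, line)


def iter_findings(sarif: dict) -> Iterator[Finding]:
    """Yield one tuple per reported location in a SARIF document."""
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "unknown")
            for loc in result.get("locations", []):
                phys = loc.get("physicalLocation", {})
                uri = phys.get("artifactLocation", {}).get("uri", "")
                line = phys.get("region", {}).get("startLine", 0)
                yield (tool, rule_id, uri, line)


def group_by_location(*sarif_docs: dict) -> Dict[Location, List[Finding]]:
    """Group findings from any number of SARIF documents by (file, line)."""
    grouped: Dict[Location, List[Finding]] = {}
    for doc in sarif_docs:
        for finding in iter_findings(doc):
            grouped.setdefault((finding[2], finding[3]), []).append(finding)
    return grouped


def sample_doc(tool: str, rule_id: str, uri: str, line: int) -> dict:
    """Build a minimal, made-up SARIF document with a single result."""
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool}},
            "results": [{
                "ruleId": rule_id,
                "locations": [{"physicalLocation": {
                    "artifactLocation": {"uri": uri},
                    "region": {"startLine": line},
                }}],
            }],
        }],
    }


semgrep_doc = sample_doc("Semgrep", "example.sql-injection", "src/db.ts", 42)
codeql_doc = sample_doc("CodeQL", "example/sql-injection", "src/db.ts", 42)

grouped = group_by_location(semgrep_doc, codeql_doc)
# Locations reported by more than one tool are candidates for deduplication.
overlap = {loc: fs for loc, fs in grouped.items() if len({f[0] for f in fs}) > 1}
print(overlap)
```

In practice you would load the documents with `json.load` from the files each scanner wrote, and probably match on a window of lines or on the SARIF fingerprint fields where they are present.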
What does each cost in 2024?
Semgrep has a generous OSS core; CodeQL is free only for public repos. Semgrep CE (the engine) is LGPL-licensed and free; Semgrep's hosted platform with team triage starts at $40 per developer per month. The CodeQL CLI is free for use on public and academic repos and for developing queries, but using CodeQL on private repos requires GitHub Advanced Security, which is licensed per active committer (typically $49/committer/month as of 2024). For a 300-engineer organization with private repos, both land in a similar price band at list price; for open source projects, CodeQL is effectively free via GitHub-hosted runners.
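The 300-engineer comparison works out as follows. This is a back-of-envelope sketch using the list prices quoted above. It assumes every engineer counts as a billed seat or active committer and ignores volume discounts, both of which will differ in a real contract.

```python
SEMGREP_PER_DEV_USD = 40      # hosted platform, per developer per month
GHAS_PER_COMMITTER_USD = 49   # GitHub Advanced Security, per active committer per month


def monthly_cost(seats: int, price_per_seat: int) -> int:
    """List-price monthly cost, assuming every seat is billed."""
    return seats * price_per_seat


seats = 300
semgrep_monthly = monthly_cost(seats, SEMGREP_PER_DEV_USD)
ghas_monthly = monthly_cost(seats, GHAS_PER_COMMITTER_USD)

print(f"Semgrep platform: ${semgrep_monthly:,}/month, ${semgrep_monthly * 12:,}/year")
print(f"GHAS (CodeQL):    ${ghas_monthly:,}/month, ${ghas_monthly * 12:,}/year")
print(f"Difference:       ${ghas_monthly - semgrep_monthly:,}/month")
```

At these assumptions the totals are $12,000 and $14,700 per month. Because GHAS bills active committers rather than all engineers, the real gap depends on how many of the 300 actually commit to the scanned repos.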
Who wins for what workload?
- Developer-owned rules in PR checks - Semgrep.
- Deep taint analysis on Java, C#, Go - CodeQL.
- IaC, Docker, shell alongside app code - Semgrep.
- Open source projects on GitHub - CodeQL, free with code scanning.
- Fast feedback in pre-commit - Semgrep (CodeQL is too slow).
- Large-scale variant analysis across many repos - CodeQL, via multi-repo queries.
How Safeguard Helps
Safeguard unifies SAST output with SCA, SBOM, and runtime signals. Teams running Semgrep for fast PR checks and CodeQL for nightly deep scans commonly import both into Safeguard, which deduplicates findings, applies Reachability Analysis to show which SAST hits actually sit on live code paths, and enforces policy gates before a release is tagged. Griffin AI produces a single remediation queue across both tools, suppressing duplicates and flagging high-confidence, high-exploitability issues first. The result is that Semgrep and CodeQL each do what they are best at, and Safeguard provides the enterprise reporting and governance layer on top.