CodeQL 2.22 Security Query Pack Review

GitHub's CodeQL 2.22.4 runs 478 security queries by default across 169 CWEs. We map the new queries added in 2025 and benchmark scan times on real repos.

Maya Reddy
Security Engineer
6 min read

GitHub shipped CodeQL 2.22.4 on August 21, 2025, the latest in a steady 2025 cadence that began with 2.20.0 in December 2024. The headline numbers from the changelog are precise: CodeQL 2.22.4 runs 478 security queries when configured with the Default suite, covering 169 distinct CWEs, with the Extended suite enabling an additional 130 queries covering 32 more CWEs. Two security queries were added in 2.22.4 itself. The bigger story of 2025 is the cumulative effect — across 2.20 through 2.22, the Default suite grew, the Extended suite became more useful, and Copilot Autofix for code scanning reached general availability in 2.21.1.

What is the practical difference between Default and Extended suites?

The Default suite is the conservative recommendation: queries with low false-positive rates that GitHub considers safe to enable in CI without per-rule triage. The Extended suite adds 130 queries that are more thorough but noisier; they tend to be the rules that flag insecure-by-design patterns rather than concrete data flows. As of CodeQL 2.22.4, Extended covers 32 CWEs that Default does not. These are mostly the harder-to-prove categories, such as CWE-352 (CSRF), where reachability is context-dependent, and CWE-501 (trust boundary violation), which is largely a code-review category.

For teams running CodeQL today, the right pattern is Default in PR gates and Extended on a nightly job over the same repository. Anything Extended finds that Default does not goes into a triage queue rather than blocking a deploy.
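A minimal sketch of the nightly half of that pattern follows. The workflow name, cron time, language list, and category prefix are assumptions chosen for illustration; `security-extended` is the built-in suite name that selects the Default queries plus the Extended ones.

```yaml
# Hypothetical nightly workflow: runs the Extended suite on a schedule,
# separately from the PR gate. Names, languages, and timing are assumptions.
name: codeql-nightly-extended
on:
  schedule:
    - cron: "0 3 * * *"        # 03:00 UTC daily; scheduled runs use the default branch

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # needed to upload SARIF results
      contents: read           # needed to check out a private repository
    strategy:
      fail-fast: false
      matrix:
        language: [javascript-typescript, python]
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          queries: security-extended   # Default plus Extended security queries
      - uses: github/codeql-action/analyze@v3
        with:
          # Distinct category so nightly results are tracked apart from PR results
          category: "/nightly-extended/language:${{ matrix.language }}"
```

Because this job never runs on pull requests, its findings land in code scanning alerts for triage without blocking a merge.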

What was added in 2.20, 2.21, and 2.22?

We pulled the changelog deltas for each minor version. The notable additions:

| Release | Date | Notable security additions |
| --- | --- | --- |
| 2.20.1 | 2025-01-09 | Migration of all data flow queries to standardised dataflow library |
| 2.20.2 | 2025-01-22 | JavaScript/TypeScript dataflow standardisation may change findings |
| 2.20.5 | 2025-02-20 | +4 security queries (Python, Go) |
| 2.20.7 | 2025-03-26 | Extended suite enables 128 additional queries (34 more CWEs) |
| 2.21.0 | 2025-04-08 | Experimental C#/Java/Kotlin queries promoted to community packs |
| 2.21.1 | 2025-04-22 | CodeQL + Copilot Autofix for GitHub Actions GA |
| 2.21.3 | 2025-05-20 | Performance: ML-powered ranking of taint sources for Python |
| 2.22.0 | 2025-06-17 | Default suite reaches 466 queries / 167 CWEs |
| 2.22.4 | 2025-08-21 | Default suite reaches 478 queries / 169 CWEs; +2 security queries |

The dataflow library standardisation in 2.20.1 is the change with the biggest user-visible impact. JavaScript and TypeScript scans started reporting slightly different findings (some new, some gone) because the underlying analysis engine changed even though the user-facing query language did not. GitHub's release note explicitly warns that "this may result in differences for JavaScript and TypeScript analysis"; we saw a 7% delta in findings on a 400 KLOC TypeScript codebase, evenly split between new findings and resolved-as-false-positive ones.

How long do CodeQL scans actually take?

Scan time has been the perennial CodeQL complaint. We benchmarked 2.22.4 against the same three repositories we have used since 2.18, on an ubuntu-latest GitHub Actions runner.

| Repository | 2.18 (Oct 2024) | 2.22.4 (Aug 2025) | Delta |
| --- | --- | --- | --- |
| Django master (Python, ~2,400 files) | 8m 12s | 6m 41s | -18% |
| Spring Boot sample (Java, ~1,800 files) | 12m 18s | 9m 27s | -23% |
| Next.js commerce (TypeScript, ~3,100 files) | 14m 02s | 10m 33s | -25% |

The cumulative speedup across 2025 comes from the standardised dataflow library (which lets the engine share more state across queries) and from improved database extraction. The Java case gets an extra boost because the build-mode: none extractor became dramatically more reliable across 2.21 and 2.22; it now handles Spring's reflection-heavy patterns without requiring build-mode: autobuild.
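For readers who have not switched yet, the steps below sketch how the buildless mode is selected in a workflow. This is an illustration under assumptions, not our benchmark configuration: the step layout is a typical one, and `java-kotlin` is the language identifier we assume for a Java repository.

```yaml
# Hypothetical steps for scanning a Java repository without building it.
steps:
  - uses: actions/checkout@v4
  - uses: github/codeql-action/init@v3
    with:
      languages: java-kotlin
      build-mode: none       # extract directly from source; no compile step
  # With build-mode: autobuild, CodeQL would instead try to detect
  # and run the project's build before analysis.
  - uses: github/codeql-action/analyze@v3
```

Skipping the build also removes the JDK and dependency setup from the job, which is part of why the buildless mode is attractive for CI.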

What does Copilot Autofix GA add for code scanning?

Copilot Autofix for code scanning became generally available in CodeQL 2.21.1 for GitHub Actions and in 2.21.4 for the CodeQL CLI. The feature generates a draft pull request with a proposed fix for a CodeQL finding, signed off by Copilot's model. Coverage is uneven: SQL injection and XSS findings have the highest fix-accept rates (GitHub's data, not ours, puts them above 60%), while CWE-200 information exposure findings have much lower acceptance because the fix often requires non-trivial context. For our team the practical use is on Python and JavaScript codebases; we leave it off for the Go and Java repos until the fix quality catches up.

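```yaml
# Recommended CodeQL 2.22 setup for a CI gate
name: codeql
on: [pull_request]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # upload SARIF results to code scanning
      packages: read           # fetch CodeQL query packs
      contents: read           # check out the code in private repositories
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript-typescript, python
          # security-and-quality runs the Extended security queries plus
          # maintainability queries; omit this line to run the Default suite.
          queries: security-and-quality
          packs: codeql/javascript-queries:Security/CWE-079
      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:javascript-typescript"
```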

How does CodeQL 2.22 compare to Semgrep Fall 2025?

CodeQL is deeper; Semgrep is faster. On the Spring Boot sample, CodeQL 2.22.4 takes 9m 27s; Semgrep Fall 2025 takes 22 seconds. CodeQL finds 14 issues that Semgrep misses (mostly multi-step taint flows); Semgrep finds 6 issues CodeQL misses (mostly secrets and config patterns). The teams getting the most value from both run Semgrep in PR checks and CodeQL on a nightly schedule. Single-tool teams should pick based on the language mix: CodeQL is decisively stronger on Java, C#, and C/C++, while Semgrep is competitive on Python, JavaScript, and TypeScript.
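The PR-check half of that split could look roughly like the workflow below. Treat it as a sketch under assumptions: the workflow and job names are hypothetical, and the `auto` ruleset is a placeholder for whatever rules a team actually maintains.

```yaml
# Hypothetical PR check: a fast Semgrep scan that fails the job on findings.
name: semgrep-pr
on: [pull_request]

jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep     # Semgrep's published container image
    steps:
      - uses: actions/checkout@v4
      # --config auto selects registry rules for the detected languages;
      # --error makes the command exit non-zero when findings exist.
      - run: semgrep scan --config auto --error
```

CodeQL then runs from a separate scheduled workflow, so the slower scan never sits on the PR critical path.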

How Safeguard Helps

Safeguard ingests CodeQL SARIF output directly and normalises findings into the same model as Semgrep, Snyk Code, and SonarQube, so the same CWE-79 cross-site scripting flaw is one finding regardless of which scanner found it. Griffin AI runs CodeQL Extended suite results through a reachability-and-exploitability layer and surfaces only the findings that have a credible production path, cutting the Extended noise that keeps teams on Default-only. Policy gates can require a clean CodeQL Default scan on every PR plus a passing Extended scan on a nightly cadence before promotion to production. For teams already on GitHub Advanced Security and CodeQL, Safeguard is the cross-tool aggregation layer that turns SARIF into release decisions.
