The roadmap is refreshed quarterly. Every item is categorised by stage: Shipping, Building, Designing, or Researching. The further out an item sits, the more honest we are about how committed we actually are. Dates signal intent, not contract; the stage tells you how firm each one is.
Shipping: Code merged, in active rollout; feature-flagged for the current quarter.
Building: Implementation underway with a committed quarter. Scope locked.
Designing: Design doc, API surface, and acceptance criteria in review. Quarter is intentional, not contractual.
Researching: Open research. Feasibility, eval methodology, and dataset shape still being established.
Items closer to today are mostly Shipping or Building. The further out the quarter, the more items sit in Designing and Researching — and the looser the date.
Each quarter card is the same shape — headline, primary stage, blurb, and four to six items. The composition of stages drifts from Shipping at the front to Researching at the back, which is what an honest roadmap should look like.
Closer to where developers already work.
The hot path gets faster, the policy DSL gets a major version, and the MCP profiles stop being community-maintained. Most of this quarter is in the rollout window already.
Three distilled heads at the same 1B budget, each specialised by tokeniser bias and sink coverage. JVM (Kotlin / Java / Scala), Python (incl. typed-stub awareness), and Go (incl. cgo boundaries). Inline F1 lifts measurably without exceeding the 100 ms p95 inline ceiling.
An FP8-quantised Griffin S checkpoint, validated against the bf16 baseline on the standard internal eval suite. Meaningfully drops cost-per-1k-tokens on shared cloud; recommended default for PR-time reasoning where the L head's depth isn't required.
Rego-flavoured v2 syntax with first-class CWE, EPSS, KEV, reachability, and SLA primitives. One policy compiles to enforcement in CI, IDE, the desktop app, and the admission controller. Backwards-compatible v1 shim retained through end-2027.
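The v2 DSL itself is Rego-flavoured and not shown here; as a rough illustration of what the CWE, EPSS, KEV, reachability, and SLA primitives compose into, here is a hypothetical enforcement predicate in Python. Every name and threshold below is invented for the sketch, not the shipped policy surface.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cwe: str            # e.g. "CWE-79"
    epss: float         # exploit-prediction score, 0..1
    kev: bool           # listed in the CISA KEV catalogue
    reachable: bool     # a taint path actually reaches the sink
    age_days: int       # days since first detection (SLA clock)

def block_merge(f: Finding) -> bool:
    """Hypothetical policy: block the merge when a finding is
    actionable (reachable, or known-exploited) and also urgent
    (high EPSS, or past a 30-day SLA window)."""
    actionable = f.reachable or f.kev
    urgent = f.epss >= 0.5 or f.age_days > 30
    return actionable and urgent
```

The same predicate, compiled once, is what would run in CI, the IDE, the desktop app, and the admission controller.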
First-party MCP profiles for the three common agent surfaces, with tested capability scoping defaults, Lino-screened egress, and reference policies. Replaces the community-maintained adapters.
When the auto-router escalates a finding from Griffin S to L, the console shows the two traces side-by-side — what S concluded, what L added or refuted. Reviewers can audit the escalation decision, not just the final verdict.
Auto-fix grows up; the marketplace opens.
Auto-fix learns to coordinate across services in the same campaign, the marketplace gets an SDK for partner-built scanners, and we add a second compute path for the L head.
A TPU-served Griffin L variant in addition to the GPU baseline, for sovereign and dedicated deployments where TPU availability is the binding constraint. Latency parity with GPU L; same weight provenance and attestation guarantees.
When one CVE touches many services, auto-fix coordinates the fan-out: prioritises by reachability, opens linked PRs, tracks the campaign as a single object with one regulator-export row. Replaces 47 disconnected tickets with one tracked rollout.
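A minimal sketch of the fan-out logic described above, assuming an invented schema (field names, PR naming, and the campaign shape are all illustrative): affected services are ordered by reachability, linked PRs are derived per service, and the whole rollout collapses into one tracked object with a single export row.

```python
def plan_campaign(cve: str, services: list[dict]) -> dict:
    """Order affected services by reachability (most reachable
    first) and wrap the fan-out in a single campaign object."""
    ordered = sorted(services, key=lambda s: s["reachability"], reverse=True)
    return {
        "cve": cve,
        "rollout_order": [s["name"] for s in ordered],
        # One linked PR per service; branch naming is hypothetical.
        "prs": {s["name"]: f"fix/{cve}-{s['name']}" for s in ordered},
        # One regulator-export row for the whole campaign.
        "export_rows": 1,
    }
```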
A typed SDK and certification path for partner-built scanners. Cross-scanner dedup, evidence binding, and policy integration come for free; partners just supply detection logic. The 11 first-party scanners stay first-party.
Watch a running container, compute an SBOM delta on every layer change, alert on drift from the signed build-time SBOM. Closes the gap between what was signed and what is actually running.
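The drift check itself reduces to a diff between two package inventories. A sketch, treating each SBOM as a package-to-version map (the real SBOM format and alerting surface are not shown here):

```python
def sbom_drift(signed: dict[str, str], running: dict[str, str]) -> dict:
    """Diff the signed build-time SBOM against the SBOM observed
    in the running container. Both inputs map package -> version;
    the report shape is illustrative."""
    added = sorted(set(running) - set(signed))
    removed = sorted(set(signed) - set(running))
    changed = sorted(p for p in set(signed) & set(running)
                     if signed[p] != running[p])
    return {"added": added, "removed": removed, "changed": changed,
            "drift": bool(added or removed or changed)}
```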
Current four-stage trace format is verbose. v3 collapses to a compressed-but-replayable trace that still passes attestation review — saves storage on multi-tenant audit-log volumes.
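The "compressed but replayable" property can be sketched in a few lines: serialise the trace canonically, compress it, and keep a digest so the decompressed replay can still be checked byte-for-byte. This is only an illustration of the idea, not the v3 format.

```python
import json, zlib, hashlib

def compress_trace(trace: list[dict]) -> tuple[bytes, str]:
    """Canonical serialisation + compression, with a digest kept
    alongside so the compressed form remains attestable."""
    raw = json.dumps(trace, sort_keys=True, separators=(",", ":")).encode()
    return zlib.compress(raw, level=9), hashlib.sha256(raw).hexdigest()

def replay_trace(blob: bytes, digest: str) -> list[dict]:
    raw = zlib.decompress(blob)
    if hashlib.sha256(raw).hexdigest() != digest:
        raise ValueError("attestation mismatch")
    return json.loads(raw)
```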
Disclosure becomes a first-class workflow.
Q1 is where the disclosure side of the platform catches up with the detection side: coordinated disclosure as a built-in workflow, an Eagle v3 head with cross-language signals, and a browser sandbox for prospects who don't want to install anything.
Built-in disclosure object: draft from a Griffin Zero candidate, maintainer-managed mailbox, embargo timer, fix-availability gating, and CVE filing once the upstream patch ships. Replaces a manual coordination dance with a state machine.
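The state machine framing can be sketched directly. The states and transitions below are hypothetical (the real disclosure object is not public); the point is that every step, including the fix-availability gate before CVE filing, is an explicit transition rather than a coordination dance.

```python
# Hypothetical disclosure lifecycle; names are illustrative only.
TRANSITIONS = {
    "draft": {"reported"},            # drafted from a Griffin Zero candidate
    "reported": {"embargo"},          # maintainer mailbox engaged
    "embargo": {"fix_available"},     # embargo timer running
    "fix_available": {"cve_filed"},   # gated on the upstream patch shipping
    "cve_filed": set(),               # terminal
}

def advance(state: str, to: str) -> str:
    if to not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {to}")
    return to
```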
Eagle v3 incorporates cross-language taint signals — a Python sink fed from a Go service via gRPC stops being two unrelated findings and starts being one ranked path. Same 13B parameter budget, retrained ranking head.
A hosted vulnerable-app sandbox running in the browser, pre-wired to a sandbox tenant. Prospects can walk an end-to-end PR-to-disclosure flow without installing anything on a laptop. Replaces today's video-only demo for one of the three paths.
An iOS / Android SDK that instruments the JS / native runtime to feed runtime taint signals back to the platform. Closes the gap on what the build-time SBOM can't see — runtime-resolved dynamic imports and native bridge calls.
Treat risk like a budget: each team gets a quarterly allowance, exceptions consume it, surplus carries over. Surfaces as a console widget; ties policy decisions to organisational headroom rather than per-finding negotiation.
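The budget arithmetic is simple enough to sketch. Units, carry-over policy, and class shape below are all assumptions for illustration, not the shipped widget's behaviour:

```python
class RiskBudget:
    """Quarterly risk allowance: exceptions consume it, surplus
    carries over into the next quarter."""
    def __init__(self, allowance: int):
        self.allowance = allowance
        self.balance = allowance

    def consume(self, cost: int) -> bool:
        # An exception is only granted while the budget covers it.
        if cost > self.balance:
            return False
        self.balance -= cost
        return True

    def roll_quarter(self) -> None:
        # Unspent balance carries over on top of the new allowance.
        self.balance += self.allowance
```

The console widget would surface `balance` as the team's remaining headroom, turning per-finding negotiation into a question of whether the budget covers the exception.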
The next-generation lineup begins.
Q2 is mostly the research horizon. On-device Griffin Lite, federated learning across opt-in cohorts, and a Zero MoE compression are all in feasibility — committed to a quarter, but the scope will harden over Q1.
An 8B Griffin Lite checkpoint, quantised hard enough to run on an 8GB-RAM developer laptop with no cloud burst. Brings the deep-pass reasoning into the offline path. Hard problem: keep the chain-of-thought trace coherent at that compression ratio.
An opt-in federated training mode where customer-side fine-tuning signals improve the shared Eagle and Griffin heads without customer code ever leaving the tenant. Feasibility-gated on differential-privacy guarantees we can sign on.
Opt-in, per-tenant-controlled aggregation of taint-path archetypes — not code, just abstract sink + sanitiser shapes. Improves Eagle's recall on novel patterns without compromising tenant isolation. Hard problem: consent UX and per-finding revocation.
Zero today routes top-2 of 8 experts (~37B active). The research target is a 4-experts-active variant for sovereign deployments that need the depth on smaller GPU counts. Loss budget against today's Zero is the open question.
Instead of point-in-time audit exports, expose a regulator-credentialed signed feed of the audit log. Regulator pulls when they need to; tenant signs each delta. Designed against DORA Article 28 and NIS2 oversight expectations.
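One way to make a pulled feed tamper-evident is to chain each signed delta to the previous signature, so the regulator can verify order and integrity offline. The HMAC chain below is a sketch of that property under invented names, not the shipped signing design:

```python
import hmac, hashlib, json

def sign_delta(key: bytes, prev_sig: str, delta: dict) -> str:
    """Tenant signs each audit-log delta, chained to the previous
    signature so reordering or tampering is detectable."""
    payload = prev_sig.encode() + json.dumps(delta, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_feed(key: bytes, deltas: list[dict], sigs: list[str]) -> bool:
    prev = ""
    for delta, sig in zip(deltas, sigs):
        if not hmac.compare_digest(sign_delta(key, prev, delta), sig):
            return False
        prev = sig
    return True
```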
Items can move between quarters as new information lands — that is the point of refreshing every quarter. If your team depends on a specific dated commitment, talk to us; specific commitments live in contracts, not on this page.
Roadmaps that only describe what is being shipped are half-honest. The other half is what is being declined. The list below is what we have looked at and chosen not to build.
An item moves through four filters before it sits on a quarter. By the time you see a date, the work has already survived feasibility, scoping, and a design pass.
Support tickets, QBRs, community channels, and design partner sessions all flow into the same backlog. Every signal carries who it came from, what surface it lives on, and how often we've seen it.
Signals get grouped into research items with a hypothesis, a counter-hypothesis, and a feasibility note. Items can sit in the backlog for quarters — that is by design. We do not commit to a quarter we cannot reason about.
Items that survive feasibility move into design: API surface, eval methodology, acceptance criteria, rollback path. This is where most of the honesty is — a lot of items quietly die here.
Designed items get a quarter and a stage. The stage is what you read on this page. We do not commit a date earlier than the design pass; if you see a date, the work has already survived three filters.
Enterprise tiers get a private quarterly roadmap with named owners and committed dates. Book a session and we will walk through it against your supply-chain posture.