API Surface Reviewed: Griffin AI vs Mythos

Most platform comparisons stop at features. The API surface is where automation and integration actually happen — and where vendors quietly diverge.

Nayan Dey
Senior Security Engineer

Feature comparisons are clean. API comparisons are honest. A platform's API surface tells you what is actually possible to automate, what the team has prioritised over the last two years of engineering work, and how much of the user experience is real product versus dashboard glue. The Griffin AI vs Mythos comparison is interesting at the feature level; at the API level, the architectural differences come into focus and explain why some customers move through procurement faster than others.

What does API surface depth actually mean?

Three measurable dimensions:

  • Coverage. What percentage of the platform's behaviour is reachable from the API without a UI fallback?
  • Stability. How long do endpoints last before deprecation? What is the deprecation policy?
  • Composability. Can the outputs of one endpoint be passed cleanly to another, or does every workflow require glue logic?

These three together determine whether your team can build automation on top of the platform or whether you are stuck with the vendor's UI as the only practical interface. Griffin AI's API surface scores high on all three; Mythos-class platforms typically score well on coverage and stability but lower on composability, because the LLM-mediated output shapes are less stable.

Schema stability under model upgrades

One of the most underappreciated dimensions of an AI-for-security API is what happens to the response schema when the underlying model changes. If a finding's description field is generated by an LLM, a model upgrade can subtly shift the verbosity, structure, and vocabulary of the description. Downstream parsers built against the old format break silently.

Griffin AI ships explicit schema-stable fields (CWE classification, taint path, confidence score, fix-PR status) and an opt-in narrative field that is acknowledged to vary across model versions. Customers building automation point at the stable fields; the narrative field is for human consumption.
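What that contract looks like from the consumer side, as a minimal sketch (the field names below are illustrative, not Griffin AI's actual schema): automation binds only to the stable fields, and the narrative is carried through for display without ever being parsed.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class StableFinding:
    """Only the fields the vendor commits to keeping schema-stable."""
    cwe: str
    taint_path: list[str]
    confidence: float
    fix_pr_status: str

def parse_finding(payload: dict[str, Any]) -> StableFinding:
    # Bind automation to stable fields only, and fail loudly if one is
    # missing rather than limping along on a drifting narrative blob.
    return StableFinding(
        cwe=payload["cwe"],
        taint_path=list(payload["taint_path"]),
        confidence=float(payload["confidence"]),
        fix_pr_status=payload["fix_pr_status"],
    )

def narrative_for_humans(payload: dict[str, Any]) -> str:
    # The LLM-generated narrative may vary across model versions;
    # display it, never parse it.
    return payload.get("narrative", "")
```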

Mythos-class platforms more commonly emit a single summary or description blob that is the entire output of the LLM call. Any automation built on top of that field is on a deprecation clock the customer cannot see.

Query expressiveness

A mature API supports composition: filter findings by repo + severity + reachability + age + assignee + status. Each filter is an independent dimension. The query language doesn't know about your specific use case but supports any combination you can express.

Griffin AI's findings API is filterable on every persistent field with consistent operator semantics (eq, in, gt, between, contains). The query language is typed; invalid filters fail at validation time, not runtime. Pagination is cursor-based with stable cursors that survive concurrent writes.
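In client code, that composability looks something like the sketch below. Everything here is illustrative: the endpoint, the bracketed operator syntax, and the response shape are assumptions, not Griffin AI's documented API.

```python
import requests

BASE = "https://api.example.invalid/v1"  # hypothetical endpoint

def iter_findings(token: str, repo: str):
    """Page through findings matching a composed, typed filter.
    Each filter is an independent dimension; the cursor is opaque."""
    params = {
        "repo[eq]": repo,
        "severity[in]": "critical,high",
        "reachable[eq]": "true",
        "age_days[gt]": "30",
        "status[eq]": "open",
    }
    cursor = None
    while True:
        query = dict(params, **({"cursor": cursor} if cursor else {}))
        resp = requests.get(
            f"{BASE}/findings",
            params=query,
            headers={"Authorization": f"Bearer {token}"},
            timeout=30,
        )
        resp.raise_for_status()
        body = resp.json()
        yield from body["items"]
        cursor = body.get("next_cursor")
        if not cursor:
            break  # stable cursors mean no duplicates across pages
```

The point is not the syntax but the shape: no client-side filtering, no re-fetching the world to answer a narrow question.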

Mythos-class platforms more often expose a few prebuilt views (e.g., "critical findings," "recent findings") and require API consumers to filter client-side once they have results. This works for small data volumes and breaks at scale.

Webhook semantics

Webhooks are where API design decisions become operationally visible. The questions:

  • Does the platform deliver webhooks at-least-once or at-most-once?
  • Are webhook payloads versioned?
  • Is there a retry policy? With what backoff?
  • Can webhooks be paused during incident response without losing events?
  • Is there a replay mechanism for events delivered during downstream downtime?

Griffin AI delivers at-least-once with versioned payloads, exponential backoff, pause-and-replay support, and a sequence number that allows downstream systems to detect gaps. Mythos-class platforms vary widely; many deliver at-most-once without replay, which means downstream incident-response systems can miss events during their own outage windows.
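At-least-once delivery plus sequence numbers puts two obligations on the consumer: idempotent processing and gap detection. A minimal sketch, assuming illustrative event_id and sequence fields in the payload:

```python
processed_ids: set[str] = set()
last_seq: int | None = None

def handle_event(event: dict) -> None:
    """Consume an at-least-once webhook stream safely."""
    global last_seq
    event_id, seq = event["event_id"], event["sequence"]

    if event_id in processed_ids:
        return  # duplicate redelivery is expected; drop it

    if last_seq is not None and seq > last_seq + 1:
        # A gap means deliveries were missed during our own downtime;
        # ask the platform to replay the missing range.
        request_replay(first=last_seq + 1, last=seq - 1)

    process(event)  # downstream logic: ticketing, paging, enrichment
    processed_ids.add(event_id)
    last_seq = seq if last_seq is None else max(last_seq, seq)

def process(event: dict) -> None: ...  # stub for this sketch
def request_replay(first: int, last: int) -> None: ...  # stub for this sketch
```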

Authentication and credential rotation

API tokens for AI-for-security platforms grant broad access: they can read findings (sometimes including sensitive code), trigger scans, and modify policies. The credential rotation story matters.

Griffin AI supports short-lived tokens via OIDC federation with major identity providers, scoped per-purpose tokens for narrow integration scenarios, and a documented rotation cadence. Audit logs include token-level attribution.
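The short-lived-token pattern is worth seeing concretely. The sketch below assumes a hypothetical RFC 8693-style token-exchange endpoint; the URL and scope names are illustrative, not Griffin AI's actual auth API.

```python
import time
import requests

TOKEN_URL = "https://auth.example.invalid/oauth/token"  # hypothetical

_cache = {"token": None, "expires_at": 0.0}

def api_token(oidc_id_token: str) -> str:
    """Exchange the workload's OIDC identity token for a short-lived,
    narrowly scoped API token; refresh shortly before expiry."""
    if _cache["token"] and time.time() < _cache["expires_at"] - 60:
        return _cache["token"]
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": oidc_id_token,
            "scope": "findings:read",  # per-purpose scope, not broad access
        },
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    _cache["token"] = body["access_token"]
    _cache["expires_at"] = time.time() + float(body["expires_in"])
    return _cache["token"]
```

Nothing long-lived sits in a secrets vault; rotation stops being an event and becomes the steady state.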

Mythos-class platforms typically rely on long-lived API keys. Rotation is manual. Audit attribution is at the user level, not the token level. None of this is fatal, but it is a different operational burden.

Code samples in the docs

A real signal of API maturity: are there working code samples in the docs that exercise actual workflows, not just toy curl examples? Are the SDKs maintained? Do the samples include error handling, pagination, and rate-limit handling — or do they assume happy-path?
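As a reference point, here is roughly what a non-happy-path sample has to cover. The endpoint is hypothetical, but the 429/Retry-After handling is the generic pattern:

```python
import time
import requests

def get_with_retries(url: str, headers: dict, max_attempts: int = 5) -> dict:
    """GET with rate-limit and transient-error handling."""
    for attempt in range(max_attempts):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code == 429:
            # Honour the server's hint when present; otherwise back off.
            time.sleep(float(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        if resp.status_code >= 500:
            time.sleep(2 ** attempt)  # transient server error
            continue
        resp.raise_for_status()  # any other 4xx is a real bug
        return resp.json()
    raise RuntimeError(f"gave up after {max_attempts} attempts: {url}")
```

If a vendor's samples skip all of this, every integrator rediscovers it in production.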

Griffin AI ships SDKs for Python, TypeScript, and Go, with maintained CI that runs sample code against a staging environment on every release. Samples cover end-to-end workflows: ingest a finding, enrich with reachability, file a Jira ticket, comment on the PR, mark resolved.

Mythos-class platforms range from "we have a Python SDK" to "documented endpoints, build your own client." The lower end of the range is fine for evaluations and frustrating for production.

What an API audit should ask

Five concrete checks during evaluation:

  1. Show me the full OpenAPI spec.
  2. Walk me through how a parser built against today's response would handle a model upgrade.
  3. Demonstrate webhook replay during a 30-minute simulated downstream outage (a verification sketch follows this list).
  4. Rotate an API token without downtime and show me the audit trail.
  5. Build a small automation in front of me using the SDK.
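Check 3 is the easiest to make objective. A verification sketch, assuming events carry the sequence number discussed earlier (the field name is illustrative):

```python
def verify_replay(events: list[dict], first_seq: int, last_seq: int) -> list[int]:
    """After a simulated outage, confirm the replayed stream covers the
    window with no gaps; returns any missing sequence numbers."""
    delivered = {e["sequence"] for e in events}
    return [s for s in range(first_seq, last_seq + 1) if s not in delivered]

# Toy data standing in for a captured replay window:
replayed = [{"sequence": s} for s in (101, 102, 104, 105)]
print(verify_replay(replayed, first_seq=101, last_seq=105))  # -> [103]
```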

The answers separate platforms that have invested in API as a product surface from platforms that treat API as a documentation deliverable.

How Safeguard Helps

Safeguard's API is treated as a first-class product surface, not a documentation export. Schema stability is contractual: stable fields survive model upgrades, narrative fields are explicitly marked as varying. Webhook delivery is at-least-once with replay. SDKs are maintained for Python, TypeScript, and Go. For teams whose security automation depends on API-layer reliability, the engine-plus-LLM architecture's structured outputs make the API a foundation you can build on rather than a moving target you have to defend against.
