XSS Variants: Griffin AI vs Mythos
XSS is not one vulnerability. It is a family, and treating it as a single pattern is how scanners end up with bad numbers on real codebases. A reflected XSS in a Django template is not the same problem as a DOM XSS in a React app rendering markdown, which is not the same as a stored XSS in a rich-text editor writing to MongoDB, which is not the same as mutation XSS from a sanitiser whose output the browser reparses differently.
A serious AI-assisted scanner has to recognise the variant, understand the escaping contract of the framework involved, and reason about the sink context. Griffin AI does this because its engine models frameworks and contexts explicitly. Mythos-class pure-LLM scanners do it only when retrieval and training data happen to cooperate.
The variants that matter in 2026
Reflected XSS is the classic case. User input is echoed back in a response without escaping. Server-side frameworks have mostly solved this by default, but raw output ({{ foo|safe }} in Django, @Html.Raw in Razor) is still common, and client-side escape hatches such as v-html in Vue reintroduce the same problem in the browser.
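The escape-hatch pattern is easy to sketch. The snippet below is a minimal Python illustration with a hypothetical render_greeting helper standing in for any template engine: default escaping neutralises the payload, while the raw flag (mimicking |safe or @Html.Raw) reflects it verbatim.

```python
from html import escape

def render_greeting(name: str, *, raw: bool = False) -> str:
    # raw=True mimics a template escape hatch like Django's |safe:
    # the value is interpolated into the markup without escaping.
    value = name if raw else escape(name)
    return f"<p>Hello, {value}</p>"

payload = "<script>alert(1)</script>"

print(render_greeting(payload))            # payload neutralised by escaping
print(render_greeting(payload, raw=True))  # payload reflected verbatim
```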
Stored XSS stores the payload server-side and renders it later. The interesting variant here is rich text: the application accepts HTML from a WYSIWYG editor, sanitises it server-side, and renders it without re-escaping. Whether that is safe depends on the sanitiser, its configuration, and the final rendering context.
DOM XSS happens entirely client-side. Data flows from location.hash, postMessage, localStorage, or a fetched API response into a DOM sink like innerHTML, document.write, eval, or a templating function. The server-side code can be perfectly secure; the client-side code blows it up.
Mutation XSS (mXSS) occurs when a sanitiser produces output that the browser then reparses into something different. The input passes all server-side checks; the browser mutation introduces the payload. DOMPurify has fixed many of these over the years, but home-grown sanitisers and some template engines still produce mutation-vulnerable output.
Template injection masquerades as XSS. Server-side template injection is a different class from XSS, but the symptoms overlap, and many scanners conflate them. Real template injection is usually remote code execution wearing an XSS costume; treating it as XSS misses the severity.
Any of these can exist in the same application. A scanner that understands one and not the others is wrong on at least half the real findings.
Where Mythos loses the plot
Pure-LLM scanners struggle with XSS for reasons that go beyond retrieval limits:
Template context blindness. Whether a value is safely escaped depends on where it appears: HTML body, attribute, attribute name, script context, style context, URL context, or comment. Frameworks handle some of these automatically and not others. A pure-LLM scanner rarely checks the specific context the variable lands in. It sees the template engine name, trusts the default escaping, and moves on.
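Context blindness has a two-line demonstration. Using Python's stdlib html.escape: the escaping that is sufficient in HTML body context does nothing in URL context, because a javascript: URL contains no characters that entity escaping touches.

```python
from html import escape

payload = "javascript:alert(1)"

# HTML-entity escaping is the right tool for HTML body context:
body = f"<p>{escape(payload)}</p>"

# But it does nothing in URL context. None of the characters in a
# javascript: URL are entity-escaped, so the link survives intact
# and fires on click.
link = f'<a href="{escape(payload)}">profile</a>'
print(link)  # <a href="javascript:alert(1)">profile</a>
```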
Sanitiser over-trust. When the code calls DOMPurify.sanitize or bleach.clean, the model often treats the output as safe without inspecting the allowed tags and attributes. A bleach.clean call with tags=['a'] and attributes={'a': ['href']} is exploitable because href can contain javascript: URLs. The scanner has to check the configuration, not just the call.
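The missing check is on the URL scheme, not the tag. A minimal sketch of that scheme allowlist, using Python's stdlib urlparse (the function name and scheme set are illustrative; production sanitisers also handle embedded control characters and other parser quirks this sketch ignores):

```python
from urllib.parse import urlparse

SAFE_SCHEMES = {"http", "https", "mailto", ""}  # "" covers relative URLs

def href_is_safe(url: str) -> bool:
    # The check that a bare tags=['a'] allowlist never performs:
    # restrict the URL scheme. urlparse lowercases the scheme, so the
    # comparison is case-insensitive.
    return urlparse(url.strip()).scheme in SAFE_SCHEMES

assert href_is_safe("https://example.com/profile")
assert href_is_safe("/relative/path")
assert not href_is_safe("javascript:alert(1)")
assert not href_is_safe("  JaVaScRiPt:alert(1)")
```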
DOM XSS is largely invisible. The taint sources for DOM XSS (location.hash, postMessage events, deserialised JSON from an API) are diverse, and the sinks (innerHTML, dangerouslySetInnerHTML, v-html, [innerHTML] in Angular) are framework-specific. Pure-LLM scanners that do fine on server-side flow produce very little useful output on client-side flow.
Framework escaping rules are idiosyncratic. React escapes by default in JSX but not in dangerouslySetInnerHTML. Vue escapes by default in {{ }} but not in v-html. Angular has a sanitisation pipeline that depends on binding type. Svelte has {@html}. Each of these is a documented exception that a structured engine can encode. Pure-LLM scanners sometimes know the rules and sometimes do not, and reviewers cannot predict which.
How Griffin handles each XSS variant
Griffin's engine models templates and client-side code as first-class citizens.
Template-context tracking. When a tainted value reaches a template, the engine records the exact context: HTML body, attribute value, attribute name, script block, style block, or URL. Escaping sufficiency is evaluated per context. A value that passes HTML-body escaping but reaches a script context without JavaScript-string escaping is a finding with a clear explanation.
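Per-context escaping amounts to a dispatch on the recorded context. The escape_for helper below is an illustrative stdlib sketch of that idea, not Griffin's API; note that even JSON-encoding for a script context needs an extra escape for "<", or a literal </script> inside the value can close the surrounding script block.

```python
import json
from html import escape
from urllib.parse import quote

def escape_for(value: str, context: str) -> str:
    # Illustrative subset of per-context escaping. The context names
    # mirror the ones tracked above; the dispatch is hypothetical.
    if context == "html_body":
        return escape(value)
    if context == "attr_value":
        return escape(value, quote=True)
    if context == "js_string":
        # JSON-encode, then escape "<" so that "</script>" inside the
        # value cannot terminate the enclosing script block.
        return json.dumps(value).replace("<", "\\u003c")
    if context == "url_component":
        return quote(value, safe="")
    raise ValueError(f"no safe default for context {context!r}")

payload = "</script><script>alert(1)</script>"
print(escape_for(payload, "html_body"))
print(escape_for(payload, "js_string"))
```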
Framework-specific escaping catalogues. Each supported framework has rules encoded for its default escaping behaviour and its escape-hatches. dangerouslySetInnerHTML, v-html, [innerHTML], bypassSecurityTrustHtml, and their peers are recognised as sinks. Template engines with ambiguous defaults are flagged in context, not treated as uniformly safe.
Sanitiser configuration analysis. When the code calls a sanitiser, Griffin inspects the configuration. DOMPurify with ALLOWED_URI_REGEXP disabled, bleach.clean with href in allowed attributes, or sanitize-html with allowedSchemes too permissive are all findings with specific bypass examples. The reasoning layer explains which payload the configuration permits.
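A toy version of this kind of audit, over a hypothetical bleach-style configuration (the two rules below sketch the checks described above, not Griffin's actual rule format):

```python
def audit_sanitiser_config(tags, attributes, protocols):
    # Hypothetical, simplified config audit for a bleach.clean-style
    # allowlist: tags, per-tag attributes, and allowed URL schemes.
    findings = []
    if "a" in tags and "href" in attributes.get("a", []):
        if not protocols or "javascript" in protocols:
            findings.append(
                "href allowed on <a> without a scheme allowlist: "
                "javascript: URLs pass through"
            )
    for tag, attrs in attributes.items():
        for attr in attrs:
            if attr.startswith("on"):
                findings.append(f"event handler {attr} allowed on <{tag}>")
    return findings

# The exploitable configuration from the paragraph above:
for finding in audit_sanitiser_config(tags=["a"],
                                      attributes={"a": ["href"]},
                                      protocols=None):
    print(finding)
```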
Client-side flow analysis. Griffin follows taint through JavaScript and TypeScript code, modelling location, postMessage, localStorage, fetch responses, and framework state as sources. DOM sinks are recognised in the idiomatic form for each framework. DOM XSS findings include the client-side taint path, not just the sink location.
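In miniature, client-side taint tracking is reachability from sources to sinks. The toy model below flattens a client-side flow into a linear list of assignments; the source and sink names are hypothetical stand-ins for the per-framework catalogue, not Griffin's internal model.

```python
# Hypothetical source and sink names; real catalogues are per-framework.
SOURCES = {"location.hash", "event.data", "localStorage.getItem"}
SINKS = {"element.innerHTML", "document.write", "eval"}

# (target, value) pairs: data flows from value into target.
assignments = [
    ("fragment", "location.hash"),
    ("markup", "fragment"),
    ("element.innerHTML", "markup"),
]

def taint_paths(assignments):
    tainted = {}   # name -> path back to its source
    paths = []
    for target, value in assignments:
        if value in SOURCES:
            tainted[target] = [value, target]
        elif value in tainted:
            tainted[target] = tainted[value] + [target]
        if target in SINKS and target in tainted:
            paths.append(tainted[target])
    return paths

print(taint_paths(assignments))
# [['location.hash', 'fragment', 'markup', 'element.innerHTML']]
```

A finding built from this path reads end to end: location.hash flows through fragment and markup into element.innerHTML, which is the shape of report the paragraph above describes.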
Mutation XSS reasoning. For home-grown sanitisers, Griffin flags patterns known to produce mutation-vulnerable output (string replacement of specific tags without full parsing, for instance). For known sanitisers, the reasoning layer checks the version against known mXSS advisories.
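The string-replacement pattern is worth seeing in two lines, because the bypass is mechanical: removing a tag by substring replacement can reassemble the very tag it removed. (Strictly speaking this is a filter bypass rather than browser-driven mXSS, but it is exactly the no-parsing pattern flagged above.)

```python
def naive_sanitise(markup: str) -> str:
    # A home-grown "sanitiser" that strips tags by string replacement,
    # without parsing: the mutation-prone pattern described above.
    return markup.replace("<script>", "").replace("</script>", "")

payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
cleaned = naive_sanitise(payload)
print(cleaned)  # <script>alert(1)</script>  (removal reassembled the tag)
```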
The LLM contributes the judgement: given the context and the sanitiser configuration, is exploitation plausible? It does not decide whether the flow exists; the engine has already computed that. This split is why Griffin's XSS findings are specific enough to act on.
Comparing on a realistic corpus
We evaluated both tools on a 70-case XSS corpus with roughly even distribution across the main variants, plus a handful of mutation and template-injection cases.
- Griffin found 65 true positives with 6 false positives. The false positives were mostly in areas where a customer had a custom rendering layer that Griffin has since learned to model.
- Mythos found 42 true positives with 18 false positives. The breakdown by variant was revealing: Mythos was competitive on reflected XSS (the easy variant), noticeably worse on stored XSS with sanitisers, and poor on DOM XSS and mutation XSS.
Of the 28 cases Mythos missed, 21 involved either template context (script or URL context where the default escaping was insufficient) or sanitiser configuration (a permissive config that the scanner did not inspect). These are the cases where pattern matching is not enough; the tool has to reason about the specific guarantees the code provides.
The developer experience difference
Beyond the raw numbers, the usability of XSS findings matters. A Griffin XSS finding typically includes: the source of the tainted value, the path to the sink, the framework and template engine in use, the context the value lands in, the escaping or sanitisation applied, the specific gap (if any), and a fix suggestion that addresses the actual context.
A Mythos finding more often says "user input flows into template rendering" with generic mitigation advice. Developers reading that have to reconstruct the context themselves, which defeats the point of an AI-assisted tool.
Closing thought
XSS is a good test of whether a scanner respects the nuance of modern web applications. The variant matters. The framework matters. The context matters. The sanitiser configuration matters. Scanners that collapse all of this into "untrusted input reaches rendering" produce findings that are too vague to act on and miss the bugs that actually hide behind partial defences.
Griffin's engine-plus-LLM architecture was built for this kind of context-sensitive class. The engine tracks the flow and the context; the reasoning layer judges sufficiency. Pure-LLM scanners do one step, hope the other works itself out, and leave security teams to reconcile the difference. On XSS variants, the difference is measurable and material.