May 4, 2026
SCAN COMPLETE — RESULTS

Firefox Scan Results: 72 Confirmed Vulnerabilities, 4 Exploit Chains Mythos Seems to Have Missed

TL;DR: Our multi-agent scan of Mozilla Firefox confirmed 72 real vulnerabilities (39 High, 33 Medium), including 4 multi-step exploit chains that no single model, regardless of size, can find by analyzing code one chunk at a time. We used a 9B model deliberately. The goal was never to out-count Mythos; it was to find the class of vulnerability that single-pass analysis structurally cannot reach. We PoC-tested our findings and downgraded those that didn't hold up: all 7 initial Criticals were reclassified as High after Firefox's process isolation held under testing, leaving 0 Critical. The 33 Medium findings were unchanged.

Update: We are now repeating this scan using the same architecture but with almost 40B active parameters — over 4x the original 9B — to measure what difference model scale makes when the system stays the same. Results will be published here when complete.

[Figure: "Insight and Correlation vs Brute-Force Enumeration". A small figure draws 4 exploit chain constellations; a massive figure circles 271 isolated dots.]

The Numbers

  • 72 AI-Confirmed Threats
  • 39 High
  • 33 Medium
  • 4 Multi-Step Exploit Chains
  • 36,317 Raw Findings Processed
  • 99.8% False Positives Eliminated
  • 36 of 137 AI Agents

The Headline: Cross-Module Attack Chains That No Single-Pass Tool Can Find

Our highest-impact findings are four multi-step exploit chains — sequences where individually-benign code in separate modules becomes exploitable when combined. The most notable: an unsanitized HTML injection in one DevTools module that can feed into a privileged evaluation path in another, creating an escalation path from XSS to elevated execution.

We are not disclosing specific files, line numbers, or exploitation details. Those are being reported to Mozilla under responsible disclosure.

We tested this chain with a live PoC against Firefox 150.0.1. The unsanitized code path is confirmed present — Mozilla explicitly suppresses their own linter warning on it. However, Firefox's process isolation model prevents the full escalation to chrome-privileged RCE in current builds. We rate it High rather than Critical: the unsanitized injection is real and dangerous, but the privilege boundary holds. We downgraded findings that didn't fully prove out under PoC testing rather than inflate severity for headlines.

What we can say: these chains are invisible to any tool that analyzes code one chunk at a time. Each component involved looks completely benign in isolation. The HTML injection is "DevTools doing DOM manipulation." The eval is "the console doing its job." Only when you trace the data flow between them does the exploit potential emerge.

Mythos did not find these chains. Not because Anthropic's model is less capable — but because single-pass analysis is structurally incapable of finding them. You cannot see a multi-module exploit chain by looking at one module at a time, no matter how large your model is.


Severity Comparison: ShipIt vs. Mythos

Metric | ShipIt Scan | Mythos Audit
High | 39 | -
Medium / Moderate | 33 | All 42 CVEs rated Moderate
Multi-Step Exploit Chains | 4 | 0
Total Confirmed | 72 | 271

The Mythos severity breakdown by individual finding is not publicly available. The CVE advisory rates all 42 roll-up CVEs as Moderate. Source: mfsa2026-30.

Mythos found more total issues (271 vs. 72). If we wanted more findings, we could have used a larger model. That was never the point. Mythos's 271 findings produced zero multi-module exploit chains. All 42 CVEs were rated Moderate by Mozilla.

Our 72 findings include 39 High and 33 Medium, with 4 multi-step exploit chains that connect vulnerabilities across module boundaries. A 100-trillion parameter model analyzing code one chunk at a time would still produce zero exploit chains — because the limitation is architectural, not computational. We built for depth, not volume.


What We Found (Categories Only)

Under responsible disclosure, we cannot share specific code locations or exploitation details. Here is the breakdown by vulnerability class:

Vulnerability Class | Count | Highest Severity
Cross-Site Scripting (XSS) in privileged context | 3 | High
Code Injection / Unsafe eval() | 4 | High
Weak / Predictable Randomness | 5 | High
Shell Command Injection | 2 | High
Server-Side Request Forgery (SSRF) | 3 | High
Denial of Service (unbounded alloc / ReDoS) | 10 | High
Path Traversal / File Access | 2 | High
Input Validation Bypass | 7 | High
Type Confusion / Logic Errors | 3 | High
Race Conditions | 6 | High
Resource Leaks / Null Deref | 8 | High
XML External Entity (XXE) | 1 | High
IPC / Serialization without limits | 4 | Medium
WebGL context issues | 8 | Medium
Other (rate limiting, PII exposure, etc.) | 6 | Medium
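As a quick consistency check, the per-class counts in the table above can be tallied against the headline totals. The sketch below only re-adds the published numbers; the variable names are ours, and it assumes the severity column gives the highest severity seen in each class, so a class marked High may still contain Medium findings.

```python
# Counts copied from the vulnerability class table: (class, count, highest severity).
CLASS_COUNTS = [
    ("Cross-Site Scripting (XSS) in privileged context", 3, "High"),
    ("Code Injection / Unsafe eval()", 4, "High"),
    ("Weak / Predictable Randomness", 5, "High"),
    ("Shell Command Injection", 2, "High"),
    ("Server-Side Request Forgery (SSRF)", 3, "High"),
    ("Denial of Service (unbounded alloc / ReDoS)", 10, "High"),
    ("Path Traversal / File Access", 2, "High"),
    ("Input Validation Bypass", 7, "High"),
    ("Type Confusion / Logic Errors", 3, "High"),
    ("Race Conditions", 6, "High"),
    ("Resource Leaks / Null Deref", 8, "High"),
    ("XML External Entity (XXE)", 1, "High"),
    ("IPC / Serialization without limits", 4, "Medium"),
    ("WebGL context issues", 8, "Medium"),
    ("Other (rate limiting, PII exposure, etc.)", 6, "Medium"),
]

# Total across all classes; should match the 72 confirmed findings.
total = sum(count for _, count, _ in CLASS_COUNTS)

# Findings in classes whose highest severity is High (an upper bound on High findings).
in_high_classes = sum(count for _, count, sev in CLASS_COUNTS if sev == "High")

# Findings in classes whose highest severity is Medium (all of these are Medium).
in_medium_classes = sum(count for _, count, sev in CLASS_COUNTS if sev == "Medium")

print(total, in_high_classes, in_medium_classes)
```

The rows sum to 72. Under the stated assumption, 54 findings sit in classes that peak at High and 18 in classes that peak at Medium, which is compatible with the reported split of 39 High and 33 Medium if 15 of the Medium findings fall in High-peaking classes.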

Attack Chains: The Differentiator

Four confirmed multi-step exploit chains were identified — sequences where individually-benign code becomes exploitable when combined across module boundaries:

  1. XSS → Privileged Evaluation Context (High) — Unsanitized HTML injection in one component feeds into a privileged evaluation path in another. Firefox's process isolation prevents full chrome-context escalation in current builds, but the unsanitized code path is confirmed present and dangerous.
  2. SSRF → Local File Exfiltration (High) — Network request forgery in a media tool chains into a path traversal in the filesystem layer, enabling a remote attacker to read local files.
  3. Multi-Vector Denial of Service (High) — A single crafted IPC message triggers unbounded memory allocation across multiple deserialization points, crashing the browser.
  4. Weak Randomness → Access Bypass → File Access (High) — Predictable random seed enables timing-based bypass of storage access controls, chaining into path traversal for sensitive file access.

These chains are what differentiate multi-agent scanning from single-pass analysis. Each link in the chain looks low-severity or benign in isolation. The exploit potential only emerges when you trace data flow between modules — something a single-pass tool structurally cannot do.
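The difference between per-module and cross-module analysis can be illustrated with a toy data-flow graph. This is a minimal sketch, not our scanner: every node name, module name, and edge below is invented for illustration, and the search is a plain breadth-first search rather than any technique specific to our system.

```python
from collections import deque

# Hypothetical data-flow edges ("value at A can reach B"). All names are invented.
EDGES = {
    "untrusted_page_content": ["panel.render_html"],
    "panel.render_html": ["panel.dom_node"],            # module "panel": HTML written unsanitized
    "panel.dom_node": ["console.eval_input"],           # hand-off across the module boundary
    "console.eval_input": ["console.privileged_eval"],  # module "console": evaluation path
}

# Hypothetical module membership.
MODULES = {
    "panel": {"untrusted_page_content", "panel.render_html", "panel.dom_node"},
    "console": {"console.eval_input", "console.privileged_eval"},
}

SOURCES = {"untrusted_page_content"}   # attacker-controlled input
SINKS = {"console.privileged_eval"}    # dangerous operation


def find_path(edges, source, sink, allowed=None):
    """Breadth-first search from source to sink.

    If `allowed` is given, only nodes in that set may be visited, which
    models an analysis that sees one module at a time.
    """
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for nxt in edges.get(node, []):
            if nxt in seen or (allowed is not None and nxt not in allowed):
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


def per_module_findings():
    """Look for a source-to-sink path that stays inside a single module."""
    hits = []
    for name, nodes in MODULES.items():
        for src in SOURCES & nodes:
            for snk in SINKS & nodes:
                path = find_path(EDGES, src, snk, allowed=nodes)
                if path:
                    hits.append((name, path))
    return hits


isolated = per_module_findings()
chain = find_path(EDGES, "untrusted_page_content", "console.privileged_eval")
print(isolated)
print(" -> ".join(chain))
```

In this toy graph the per-module search reports nothing, because no single module contains both a source and a sink. The path only appears once the search is allowed to cross the boundary between the two modules.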


Why a 9B Model Was Deliberate

Mythos is rumoured to have 10 trillion parameters. Our scan ran on a 9B model (Qwen 3.5 9B) on a single RTX 3090. This was not a budget constraint; it was a design decision.

The vulnerability classes that matter most — multi-step exploit chains, privilege escalation across trust boundaries, cross-module data flow exploits — cannot be found by making a model larger. They can only be found by an architecture that traces interactions between code in different files, different modules, different execution contexts. No single model, no matter how large, can see a multi-module exploit chain if it only looks at one module at a time.

We used a 9B model to prove exactly that point. If we wanted raw finding volume, we could have pointed a larger model at the same codebase. But 271 isolated findings rated Moderate already exist — Mythos produced those. What didn't exist was the cross-module analysis. That required a different architecture, not a bigger model.

We also PoC-tested our findings and downgraded those that didn't fully prove out — because honest severity ratings matter more than impressive headlines.

Responsible Disclosure: All 72 confirmed findings are being reported to Mozilla via Bugzilla. No specific code locations, exploitation details, or proof-of-concept code will be published until Mozilla has patched and disclosed on their own timeline. This article contains only aggregate statistics and vulnerability class descriptions.


What Happened to the Other 36,245 Findings?

They were false positives — and that is by design.

A 99.8% false positive rate sounds terrible until you realize what it means: 36 of 137 specialized agents, each hunting a different attack surface, each incentivized to over-report rather than miss. The raw output is deliberately noisy. The signal is extracted by a multi-phase verification pipeline that eliminates false positives through adversarial challenge, cross-validation, and interaction analysis.
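The figures in this section follow directly from the two published totals. The short calculation below uses only those totals; the variable names are ours.

```python
raw_findings = 36_317   # raw findings processed
confirmed = 72          # findings that survived verification

# Findings discarded as false positives.
eliminated = raw_findings - confirmed

# Share of the raw output that was eliminated.
eliminated_rate = eliminated / raw_findings

print(eliminated)                 # 36245
print(f"{eliminated_rate:.1%}")   # 99.8%
```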

The 72 that survived are not "the ones that slipped through." They are the ones that multiple independent verification passes, including PoC testing, could not disprove.
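The "could not disprove" rule can be sketched as a filter in which a finding survives only if no verification pass rejects it. This is an illustration under loud assumptions, not our pipeline: the `Finding` fields, the pass functions, and the thresholds are all hypothetical, and only the pass names echo the phases mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    ident: str
    reachable: bool        # hypothetical: reachable from attacker-controlled input
    corroborated_by: int   # hypothetical: independent agents that reported it
    poc_succeeded: bool    # hypothetical: a proof of concept demonstrated the issue


# A pass returns True when it DISPROVES the finding.
Check = Callable[[Finding], bool]


def adversarial_challenge(f: Finding) -> bool:
    # Assumed criterion: an unreachable code path is treated as disproved.
    return not f.reachable


def cross_validation(f: Finding) -> bool:
    # Assumed criterion: fewer than two independent reports is treated as disproved.
    return f.corroborated_by < 2


def poc_test(f: Finding) -> bool:
    # Assumed criterion: a failed proof of concept is treated as disproved.
    return not f.poc_succeeded


PASSES: List[Check] = [adversarial_challenge, cross_validation, poc_test]


def survivors(findings: List[Finding], passes: List[Check] = PASSES) -> List[Finding]:
    """Keep only the findings that no pass could disprove."""
    return [f for f in findings if not any(check(f) for check in passes)]


raw = [
    Finding("F-1", reachable=False, corroborated_by=3, poc_succeeded=False),
    Finding("F-2", reachable=True, corroborated_by=2, poc_succeeded=True),
    Finding("F-3", reachable=True, corroborated_by=1, poc_succeeded=True),
]
kept = [f.ident for f in survivors(raw)]
print(kept)
```

With over-reporting agents upstream, a strict "any pass can reject" rule like this one is what would make a very high elimination rate expected rather than alarming.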


Background: The Mythos Audit

In early 2026, Anthropic's Mythos — widely rumoured as a 10-trillion parameter model built for deep code auditing — scanned the Mozilla Firefox repository and flagged 271 security issues. Mozilla confirmed the numbers in a blog post by Bobby Holley (April 21, 2026): Firefox 150 includes "fixes for 271 vulnerabilities identified during this initial evaluation" of Mythos Preview. Those 271 findings were bundled into 42 CVEs through Mozilla's roll-up advisory process (mfsa2026-30). All 42 CVEs were rated Moderate. Zero Critical. Zero High.

A Mozilla employee confirmed on Reddit that the three primary CVEs are roll-up advisories, each covering multiple Bugzilla entries.


Independent Research: "The Jagged Frontier"

On April 7, 2026, Stanislav Fort — Chief Scientist at AISLE, former Google DeepMind researcher and former Anthropic alignment team member — published "AI Cybersecurity After Mythos: The Jagged Frontier", directly challenging the premise that frontier-scale models are required for serious security work.

Fort tested eight models — including one with 3.6B active parameters — against the same vulnerabilities Mythos found. All eight detected the flagship FreeBSD NFS exploit. A 5.1B open model recovered the full analysis chain of a 27-year-old OpenBSD bug. His conclusion: "The moat is the system, not the model."

Our results are the unsupervised, full-repo version of his thesis. No pre-scoped code. No contextual hints. A 9B model scanned the entire Firefox repository, not because we couldn't afford a bigger one, but because the model isn't the bottleneck. The architecture is. Our system found 4 multi-step exploit chains that Mythos's rumoured 10-trillion-parameter single-pass analysis did not, and structurally could not, produce.


Next Steps

  • Responsible disclosure to Mozilla: all findings submitted via Bugzilla
  • Wait for Mozilla's response: findings remain under embargo until patched
  • Full technical writeup: methodology, architecture details, and lessons learned will be published after the disclosure window closes
  • Live scan report: will be available after Bugzilla clears all findings

This article will be updated as Mozilla processes the disclosure. Specific vulnerability details will not be published until patches are available.

Want to reach out for some reason, whatever that might be? My name is Apollo, and I am @ SAIQL.ai