June 2026 OpenRouter Rankings: Company Standings and Model Top 10
OpenRouter is one of the most credible sources for real-world AI model adoption. It aggregates token volume from millions of developers worldwide—no vendor marketing, only production votes. The leaderboard shows which models teams actually trust in live code, not which lab wins a static benchmark.
By company (weekly token volume, June 2026):
| Rank | Company | Origin | Weekly Tokens | Share |
|---|---|---|---|---|
| 1 | DeepSeek | China | 5.13T | 17.6% |
| 2 | Anthropic | US | 4.34T | 14.8% |
| 3 | Google | US | 3.66T | 12.5% |
| 4 | OpenAI | US | 2.46T | 8.4% |
| 5 | Xiaomi | China | 2.42T | 8.3% |
| 6 | MiniMax | China | 2.37T | 8.1% |
| 7 | Tencent | China | 2.36T | 8.1% |
| 8 | Alibaba Qwen | China | 1.26T | 4.3% |
Chinese vendors in the top tier alone account for roughly 46% of tracked volume. Include Moonshot and adjacent Chinese routes, and developer traffic from Chinese models crosses 60%.
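The 46% figure can be checked against the company table. A minimal sketch, with the shares transcribed from the rows above (the variable names are illustrative):

```python
# Weekly share (%) of the Chinese vendors listed in the company table above.
chinese_top_tier = {
    "DeepSeek": 17.6,
    "Xiaomi": 8.3,
    "MiniMax": 8.1,
    "Tencent": 8.1,
    "Alibaba Qwen": 4.3,
}

# Sum the listed shares to get the combined top-tier figure.
chinese_share = sum(chinese_top_tier.values())
print(f"Chinese top-tier share: {chinese_share:.1f}%")
```

The sum comes to about 46.4%, in line with the rounded figure. The 60% claim depends on Moonshot and other routes that are not broken out in the table, so it is not covered by this check.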
By model (daily token volume, Top 10):
| Rank | Model | Vendor | Daily Tokens |
|---|---|---|---|
| 1 | DeepSeek V4 Flash | DeepSeek | 619B |
| 2 | Hy3 Preview | Tencent | 451B |
| 3 | MiniMax M3 | MiniMax | 447B |
| 4 | MiMo-V2.5 | Xiaomi | 327B |
| 5 | DeepSeek V4 Pro | DeepSeek | 300B |
| 6 | Claude Opus 4.7 | Anthropic | 263B |
| 7 | Claude Opus 4.8 | Anthropic | ~200B |
| 8 | Claude Sonnet 4.6 | Anthropic | 178B |
| 9 | Gemini 3 Flash Preview | Google | 156B |
| 10 | Kimi K2.6 | Moonshot AI | ~150B |
Still shopping by MMLU: Lab scores and production wallets often diverge—month-end bills tell a different story than leaderboard headlines.
Ignoring June structural shifts: Fable 5 delisting, dual IPO rumors, and Chinese share crossing 60% all change routing logic at once.
Conflating volume with quality: DeepSeek leading traffic does not mean it surpasses Opus 4.8 on the hardest tasks.
Single-model religion: Hard-coding one provider becomes technical debt when Q3 releases land in a six-week window.
API online, host offline: A closed laptop kills agent pipelines regardless of how accurate the rankings are.
US Share Fell from 70% to 30% in One Year: An Economics Story
A Bloomberg-cited chart makes the shift unmistakable:
| Period | US model share (Google + OpenAI + Anthropic) |
|---|---|
| June 2025 | ~70% |
| June 2026 | ~30% |
Where did the missing 40 points go? Chinese models absorbed them. This is not a domestic-preference effect—OpenRouter users span the US, Europe, India, and beyond. They route to DeepSeek, Xiaomi, and MiniMax because those models are cheap, fast, and good enough for daily work.
"Coding with Claude runs about ten dollars an hour. With DeepSeek, under fifty cents." — San Diego developer, verbatim
This is not primarily a quality narrative. It is an economics narrative. A Dallas developer described their stack: roughly $500/month on Claude and ChatGPT for hard problems, and about $200/month on MiniMax, Kimi, and MiMo for the other 90% of coding and speech recognition. Route by complexity, optimize by cost—that is the 2026 default.
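The Dallas example can be worked through as rough arithmetic. The sketch below assumes that "the other 90%" means 90% of the workload, which the quote does not state precisely; the variable names are illustrative.

```python
# Monthly spend reported by the Dallas developer (USD).
frontier_spend = 500.0  # Claude and ChatGPT, reserved for hard problems
budget_spend = 200.0    # MiniMax, Kimi, and MiMo for everything else

# Assumption: "the other 90%" is read as 90% of the total workload.
budget_workload = 0.90
frontier_workload = 1.0 - budget_workload

total_spend = frontier_spend + budget_spend
budget_spend_share = budget_spend / total_spend

# Spend per percentage point of workload on each tier.
frontier_unit_cost = frontier_spend / (frontier_workload * 100)
budget_unit_cost = budget_spend / (budget_workload * 100)
cost_ratio = frontier_unit_cost / budget_unit_cost

print(f"Low-cost tier share of spend: {budget_spend_share:.1%}")
print(f"Frontier cost per workload point: ${frontier_unit_cost:.2f}")
print(f"Low-cost cost per workload point: ${budget_unit_cost:.2f}")
print(f"Frontier is about {cost_ratio:.1f}x more expensive per unit of work")
```

Under that reading, the low-cost tier carries 90% of the work for under a third of the bill, which is the core of the route-by-complexity argument.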
Volume Leader Is Not Quality Leader: Opus 4.8, Fable 5, and Three Chinese Drivers
Quality ceiling: Claude Opus 4.8 still ranks first overall on the Artificial Analysis Intelligence Index (through late May 2026):
| Model | Intelligence Index | SWE-bench Pro | Notes |
|---|---|---|---|
| Claude Opus 4.8 | 61.4 (#1) | 69.2% | Long context and agents |
| GPT-5.5 | 59–60 | 63.1% | Ecosystem and tool calls |
| Gemini 3.1 Pro | 57 | — | Hardest reasoning tasks |
| Qwen 3.7 Max | 57 | — | China closed flagship |
| Claude Sonnet 4.6 | — | 80.8% (SWE-bench Verified) | Writing and instruction following |
One engineer ran 20 tasks head-to-head: Claude Opus 4.8 won 16, GPT-5.5 won 5, Gemini 3.1 Pro won 4. On long-context work, Opus was effectively dominant.
Claude Fable 5 once scored a perfect 100/100 on quality ratings and roughly 95% on SWE-bench Verified. US export controls forced a global delisting in mid-June 2026; its status is still unclear. Fable 5 proves US frontier models can still lead on raw capability—accessibility is now the constraint.
Volume champions: Chinese models win daily tasks on value. Three structural reasons explain the shift:
Price: MiniMax M3 API input pricing is $0.60/M tokens—about one-eighth of Claude Opus 4.8 at $5.00/M.
Good enough: For daily coding assist, completion, translation, and summarization, Chinese models deliver 80–90% of frontier quality.
Open weights: DeepSeek V4 and MiniMax M3 ship open weights so teams can self-host and remove data-privacy friction.
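The price gap is easy to quantify. A minimal sketch using the two input prices quoted above; the monthly token volume is a hypothetical workload chosen only for illustration.

```python
# Input prices in USD per million tokens, as quoted in the text.
MINIMAX_M3_PRICE = 0.60
CLAUDE_OPUS_48_PRICE = 5.00


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the input-token cost in USD for a given token count."""
    return tokens / 1_000_000 * price_per_million


# 0.60 / 5.00 = 0.12, slightly under one-eighth (0.125).
price_ratio = MINIMAX_M3_PRICE / CLAUDE_OPUS_48_PRICE

# Hypothetical workload: one billion input tokens per month.
monthly_tokens = 1_000_000_000
minimax_bill = input_cost(monthly_tokens, MINIMAX_M3_PRICE)
opus_bill = input_cost(monthly_tokens, CLAUDE_OPUS_48_PRICE)

print(f"Price ratio: {price_ratio:.2f}")
print(f"MiniMax M3: ${minimax_bill:,.0f}/month")
print(f"Claude Opus 4.8: ${opus_bill:,.0f}/month")
```

Output pricing is not quoted in the article, so this covers input tokens only.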
| Use case | Recommended model | Rationale |
|---|---|---|
| Complex code / agents | Claude Opus 4.8 | Top intelligence index; best long context |
| Daily coding assist | DeepSeek V4 Flash / MiMo-V2.5 | Extreme value; fast |
| Ultra-low-cost API | MiniMax M3 | $0.60/M; open weights; self-hostable |
| Long context | Kimi K2.6 (1M context) | Very long window; fair price |
| Google ecosystem | Gemini 3.5 Flash | Native Google Workspace integration |
| Real-time web search | Grok 4.3 | Live X/Twitter content access |
| Self-hosted deployment | GLM 5.2 / Kimi K2.6 | Top-tier open weights |
| Image generation | ChatGPT Images 2.0 | Strongest text rendering |
| Daily chat experience | GPT-5.5 | 52.5% fewer hallucinations vs GPT-5.3; mature ecosystem |
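One way to keep the recommendations above out of application code is a plain lookup table. This is a sketch only: the model names come from the table, while the snake_case keys and the `recommend` helper are hypothetical and do not correspond to any provider's API.

```python
# Use case -> recommended models, transcribed from the table above.
# The keys are hypothetical labels chosen for this example.
RECOMMENDED = {
    "complex_code": ["Claude Opus 4.8"],
    "daily_coding": ["DeepSeek V4 Flash", "MiMo-V2.5"],
    "ultra_low_cost": ["MiniMax M3"],
    "long_context": ["Kimi K2.6"],
    "google_ecosystem": ["Gemini 3.5 Flash"],
    "web_search": ["Grok 4.3"],
    "self_hosted": ["GLM 5.2", "Kimi K2.6"],
    "image_generation": ["ChatGPT Images 2.0"],
    "daily_chat": ["GPT-5.5"],
}


def recommend(use_case: str) -> list[str]:
    """Return the recommended models for a use case, in table order."""
    try:
        return RECOMMENDED[use_case]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED))
        raise ValueError(f"unknown use case {use_case!r}; expected one of: {known}")


print(recommend("daily_coding"))
```

Keeping the mapping in one place means a model swap is a one-line data change rather than a code change.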
Six Steps to a Model-Agnostic AI Coding Workflow
Route by task complexity: Send the hardest 5% to Claude Opus 4.8 or GPT-5.5; run the other 95% on DeepSeek V4 Flash, MiMo-V2.5, or MiniMax M3.
Centralize on OpenRouter: Track openrouter.ai/rankings weekly instead of hard-coding a single model ID.
Set billing circuit breakers: Define daily caps from price per million tokens times call volume; default agent batches to low-cost routes and escalate to Opus only for hard refactors.
Watch the Q3 release window: GPT-6, Claude Opus 5, Gemini 4, and DeepSeek V5 may land within a six-week span in Aug–Sep—leave switch slots in your routing matrix.
Evaluate enterprise compliance separately: Chinese model share will keep rising among individual developers, but Fortune 500 procurement faces data-security and US congressional scrutiny—compliance is the ceiling.
Provision 24/7 agent hosts: Move Cursor, Claude Code, and OpenClaw off laptops onto dedicated cloud Macs, with launchd for persistence and Keychain for multi-provider API keys. Compare the pricing page and help center to pick a rental term.
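The routing and circuit-breaker steps above can be sketched together. This is an illustration under stated assumptions, not a production router: the model ID strings are placeholders rather than real OpenRouter slugs, the daily cap is hypothetical, and only the input prices quoted in this article are used.

```python
from dataclasses import dataclass

# Input prices in USD per million tokens, as quoted in this article.
# The keys are placeholder IDs, not actual OpenRouter model slugs.
PRICE_PER_M = {"claude-opus-4.8": 5.00, "minimax-m3": 0.60}
FRONTIER = "claude-opus-4.8"
LOW_COST = "minimax-m3"


def cost(model: str, tokens: int) -> float:
    """Estimated input cost in USD for a call."""
    return tokens / 1_000_000 * PRICE_PER_M[model]


@dataclass
class DailyBudget:
    cap_usd: float          # hypothetical daily spending cap
    spent_usd: float = 0.0  # running total for the current day

    def can_afford(self, model: str, tokens: int) -> bool:
        return self.spent_usd + cost(model, tokens) <= self.cap_usd

    def charge(self, model: str, tokens: int) -> None:
        self.spent_usd += cost(model, tokens)


def route(is_hard: bool, tokens: int, budget: DailyBudget) -> str:
    """Escalate hard tasks to the frontier model when the cap allows,
    fall back to the low-cost route otherwise, and stop at the cap."""
    preferred = FRONTIER if is_hard else LOW_COST
    for model in (preferred, LOW_COST):
        if budget.can_afford(model, tokens):
            budget.charge(model, tokens)
            return model
    raise RuntimeError("daily cap reached; circuit breaker open")


budget = DailyBudget(cap_usd=6.0)
print(route(True, 1_000_000, budget))  # fits under the cap, escalates
print(route(True, 1_000_000, budget))  # frontier would exceed the cap, falls back
```

How a task gets classified as hard is left to the caller here; in practice that decision would likely come from task metadata or a cheap triage step.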
2026 is widely labeled the year agents move from experiment to production. Anthropic's 2026 AI Agent Status Report shows nearly 44% of Claude API calls come from math and computer-science tasks. In H2, the winner is whoever runs multi-step agents most reliably—not whoever wins a single benchmark headline.
H2 Forecast: Q3 Model Wave and Five Macro Trends
Confirmed or high-probability Q3 2026 releases:
| Model | Vendor | Expected timing | Key angle |
|---|---|---|---|
| GPT-6 | OpenAI | Aug–Sep 2026 | Longer context (rumored 1.5M tokens); stronger agents |
| Claude Opus 5 | Anthropic | Around Sep 2026 | Successor to Opus 4.8; long-horizon agent upgrade |
| Gemini 4 | Google | Q3 2026 | Multimodal push: video understanding, audio input |
| DeepSeek V5 | DeepSeek | Q3 2026 | Open weights; ~1T parameters; closed-frontier parity target |
| Grok 4.3+ | xAI | Q3 2026 | 1M context; enhanced real-time web |
| GLM 5.2 | Z.ai | Released | Top open-weight option today; strong coding |
Five macro predictions:
Competition shifts to "best for this scenario": Five labs may ship inside a 90-day window—no single "best model." Closed frontier handles the hardest 5%; Chinese open weights absorb the other 95% of daily volume.
Chinese share keeps climbing; enterprise compliance caps it: Individual developers may push past 70% of OpenRouter traffic; Fortune 500 procurement likely stays under 30%.
Agents are the real battlefield: The axis moves from benchmark scores to "can this reliably run a 50-step agent workflow?"
IPO pressure reshapes pricing: OpenAI and Anthropic both signaled IPO interest in June 2026—public-market pressure may accelerate a price war with Chinese models.
Local models break through: By 2027, consumer GPUs with 32GB of memory may run local models that score above 80% on SWE-bench coding tasks.
Note: Data from OpenRouter live traffic, Artificial Analysis, and SWE-bench Pro; article date 2026-07-01. For the latest leaderboard visit openrouter.ai/rankings.
The underlying story is margin compression across the model layer. DeepSeek proved in early 2025 that frontier quality does not require frontier compute spend. Xiaomi, Tencent, MiniMax, and Moonshot drove base pricing toward the floor. US labs are splitting strategies: OpenAI bets on ecosystem (plugins, enterprise integration, DALL-E, Codex Mobile); Anthropic defends the quality high ground; Google pushes speed and multimodal (Gemini Flash remains a strong closed-source value pick). The middle tier—"almost as good but still expensive"—is disappearing fast.
API routing alone cannot replace an agent host: laptops sleep when lids close, export controls can delist flagship models overnight, and self-hosted open weights need 96GB+ of unified memory. Each path carries hidden cost. For production teams that need 24/7 multi-model agent pipelines with flexible OpenRouter switching, a KVMNODE dedicated cloud Mac Mini is usually the steadier option, with native Apple Silicon toolchains and daily, weekly, or monthly terms. See the pricing page; order via the order page.