WHAT CHANGED IN JULY 2026
● Claude Fable 5 is back (July 1): export controls were lifted June 30 and access has been restored globally. It retakes the coding crown at 80.3% SWE-bench Pro. $10/$50 per 1M tokens (input/output). Available on Max, Team, and Enterprise plans.
● Claude Sonnet 5 launched (June 30) — new default for Free and Pro. Beats GPT-5.5 on 6/6 comparable benchmarks. $2/$10 intro through Aug 31.
● GPT-5.6 Sol/Terra/Luna — restricted government preview only. General availability expected mid-July. Not in rankings until publicly accessible.
● Sora 2 consumer app retired April 26 — Veo 3.1 now leads video generation. Sora 2 API available to developers until September 24, 2026.
● Gemini 3.5 Pro — targeted July general availability. Not yet widely available at time of writing.
● Open-weight models — DeepSeek V4 (80.6% SWE-bench Verified), MiniMax M3 (80.5%), Kimi K2.7 (80.2%) now match Gemini 3.1 Pro at a fraction of the price.
Best AI for Coding — July 2026
| Model | SWE-bench Pro | SWE-bench Verified | Price (input/output per 1M tokens) | Verdict |
|---|---|---|---|---|
| Claude Fable 5 ✓ #1 | 80.3% | 95.0% | $10/$50 | Highest ceiling, back July 1 |
| Claude Opus 4.8 — Best value flagship | 69.2% | 88.6% | $5/$25 | Best everyday pick for most teams |
| Claude Sonnet 5 — Best value per dollar | 63.2% | ~80%+ | $2/$10 intro | Beats GPT-5.5 at lower cost |
| GPT-5.5 — Best SWE-bench Verified | 58.6% | 88.7% | $5/$30 | Best ecosystem, Codex included |
| Grok 4.3 — Best API price | ~63-75%* | ~72-75%* | $1.25/$2.50 | Cheapest major-lab API |
| DeepSeek V4 — Best open-weight | ~65% | 80.6% | $0.14/$0.28 | Best open-source alternative |
* Grok 4 scores vary significantly by testing conditions: vendor-reported and independent Scale SEAL results differ substantially. Treat these figures as directional only.
July verdict for coding: Claude Fable 5 reclaims #1 at 80.3% SWE-bench Pro. For most teams, Claude Opus 4.8 at $5/$25 is the practical default. Claude Sonnet 5 is the best value: it beats GPT-5.5 on every comparable benchmark at less than half the price ($2/$10 intro vs $5/$30). Fable 5 is worth the premium only if you need the highest coding ceiling for long-horizon agentic runs.
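To see how the listed prices translate into a bill, here is a minimal sketch that applies the per-1M-token prices from the table to one workload. The workload size (200M input and 40M output tokens per month) is a hypothetical figure chosen only for illustration, and the function and variable names are not from any vendor SDK. Real invoices may differ because of caching, batch discounts, or the end of the Sonnet 5 intro price.

```python
# Prices in USD per 1M tokens (input, output), copied from the coding table.
PRICES = {
    "Claude Fable 5": (10.00, 50.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Sonnet 5 (intro)": (2.00, 10.00),
    "GPT-5.5": (5.00, 30.00),
    "Grok 4.3": (1.25, 2.50),
    "DeepSeek V4": (0.14, 0.28),
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one workload at the listed per-1M-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical monthly workload (an assumption, not a figure from the article).
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 40_000_000

costs = {m: workload_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}

# Print the models from cheapest to most expensive for this workload.
for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<26} ${cost:>10,.2f}")
```

Under this assumed mix, Sonnet 5 at the intro price comes to $800 against $2,200 for GPT-5.5. Shifting the input/output ratio moves the gap, but Sonnet 5's cost stays between one third and two fifths of GPT-5.5's at the listed prices.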
Best AI for Research and Reasoning — July 2026
#1 Gemini 3.1 Pro — Deep Research and Multimodal Reasoning
GPQA Diamond leader (94.3%). 1M token context. Native multimodal (text, image, video, audio). Google AI Studio and Vertex AI. $2/$12 per million tokens. Best for: academic research, scientific literature review, long-document synthesis requiring multimodal input. Gemini 3.5 Pro targets July GA and could reset this ranking when it ships.
#2 Claude Opus 4.8 — Hardest General Reasoning
Tops Humanity's Last Exam with tools (57.9%) and without (49.8%). Leads GDPval-AA (1,890 Elo — economically valuable knowledge work). $5/$25. Best for: complex multi-step reasoning chains, professional research, long-context analysis where the answer cannot be looked up.
#3 GPT-5.5 Pro — Hard Math and Abstract Reasoning
Leads FrontierMath Tier 4 at 39.6% (roughly 1.7 times Claude Opus 4.8's 22.9%). ARC-AGI-2: 85.0%. Best for: graduate-level math, physics problem-solving, competitive mathematics research where step-by-step working matters most.
Best AI for Writing and Long Documents — July 2026
#1 Claude Sonnet 5 / Opus 4.8 — Best prose quality and longest output
1M token context (Sonnet 5) and 200K (Opus 4.8). Produces the most natural long-form prose of any frontier model. 128K max output tokens means entire documents, reports, and codebases in a single pass. Sonnet 5 at $2/$10 intro is the best-value option. Opus 4.8 for the highest quality ceiling.
#2 GPT-5.5 with Canvas — Best editing environment
Canvas is the best AI writing editor available in any subscription tier. Real-time collaborative editing, inline suggestions, direct Google Docs export. ChatGPT Plus at $20/month. Best for teams that write, review, and edit collaboratively inside ChatGPT.
#3 Grok 4 — Best for trend-driven and real-time content
Live X firehose access makes Grok 4 uniquely valuable for social content, trend-aware writing, and anything where "what people are saying right now" is more important than polished prose. Arena Elo ~1,493 reflects users preferring Grok's more direct, opinionated voice.
Best AI for Video Generation — July 2026
#1 Veo 3.1 — Best overall video AI
Available in Gemini app, Google AI Studio, and Vertex AI. Native audio generation, 1080p output, strongest physics consistency of any model currently available. Sora 2's consumer app was officially retired April 26, 2026, so Veo 3.1 now leads uncontested. Best for: cinematic video, nature and physics-heavy content, long clips with accurate motion.
#2 Kling 3.5 — Best cost-conscious pick
Speed and cost advantage over Veo 3.1. Best for: high-volume social video, content production at scale, quick iterations. Slightly lower quality ceiling but meaningfully faster and cheaper.
#3 Runway Gen-4 — Best for filmmakers
Precise camera control tools that Veo 3.1 and Kling lack. Motion Brush, Reference Pack, and Act-One make it the choice for directors and ad teams who need creative control over camera movement and scene composition rather than just output quality.
Best AI for Images — July 2026
#1 ChatGPT Images 2.0 (GPT-5.5) — Best for text-in-image and infographics
Best model for rendering precise multilingual text inside images and infographic-style layouts. Included in ChatGPT Plus and Pro. Best for: marketing materials, social cards, infographics, any image that needs readable text.
#2 Grok Aurora — Best for stylised and abstract content
SuperGrok: 50 video and unlimited image renders per day. Heavy: 500 video renders per day. Strongest for abstract, stylised, and illustration-style images. At 500 renders per day on Heavy ($300/month), a fully used quota works out to about $0.02 per render, the cheapest per-render option for high-volume social content.
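The per-render figure above comes from dividing the monthly price by the number of renders the quota allows. The sketch below reproduces that arithmetic; it assumes a 30-day month by default and that the daily quota is used in full, and the function name is illustrative only.

```python
def cost_per_render(monthly_price_usd, renders_per_day, days_in_month=30):
    """USD per render when `renders_per_day` renders are made every day."""
    return monthly_price_usd / (renders_per_day * days_in_month)


# Heavy plan figures from the text: $300/month, 500 renders per day.
heavy = cost_per_render(300, 500)
print(f"Heavy, full quota, 30-day month: ${heavy:.4f} per render")

# Using only half the quota doubles the effective cost per render.
half_used = cost_per_render(300, 250)
print(f"Heavy, half the quota used:      ${half_used:.4f} per render")
```

The result is exactly $0.02 in a 30-day month and slightly less in a 31-day month, so the number holds only for teams that actually exhaust the daily limit.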
Quick Decision Guide — July 2026
Best for coding (highest ceiling) → Claude Fable 5 ($10/$50/M) — restored July 1, 80.3% SWE-bench Pro
Best for coding (value) → Claude Sonnet 5 ($2/$10 intro) — beats GPT-5.5 on 6/6 benchmarks
Best for research → Gemini 3.1 Pro ($2/$12) — GPQA Diamond leader, 1M context
Best for real-time social data → Grok 4 (SuperGrok $30/mo) — only model with live X firehose
Best for video → Veo 3.1 (Gemini app) — uncontested since Sora 2 consumer app retired April 26
Best for high-volume API → Grok 4 ($1.25/$2.50/M) — cheapest major-lab model on Amazon Bedrock
Best open-source → DeepSeek V4 ($0.14/$0.28/M) — 80.6% SWE-bench Verified, MIT license
Best overall subscription → Claude Pro $20/month — Sonnet 5 default, Fable 5 on Max $100/month
Sources: BenchLM.ai July 4, 2026 · TheAIRankings.com · BuildFastWithAI July 2026 rankings · Anthropic Sonnet 5 System Card