TODAY'S TOP STORIES — JULY 5, 2026
- Anthropic Enters Drug Discovery — Claude Science Launches — At the June 30 AI for Science event, Anthropic announced an internal drug discovery program for neglected diseases alongside Claude Science: a research workbench with 60+ preconfigured tools, scientific database connectors, and HPC access. Available in beta for Pro, Max, Team, Enterprise on macOS and Linux. Eric Kauderer-Abrams, Anthropic's life sciences head: "To build the right models and tools to accelerate the industry, we need to live it alongside all of you"
- GPT-5.6 Sol/Terra/Luna — Full Preview Details Confirmed — Sol: new SOTA on Terminal-Bench 2.1 (96.7%), "max reasoning effort" and "ultra mode" for complex agentic work, stronger scientific reasoning and long-horizon planning. Terra: GPT-5.5-competitive performance at 2x lower cost — the most important pricing development. Luna: fastest and cheapest of the three. Government-restricted preview; general availability still expected mid-July
- Grok 5 Update: Still Training, Not Coming Q3 — Colossus 2 expanded to 1.5 GW for Grok 5 training. Polymarket closed June 30 Q3 release contracts at 3% probability. No release timeline from xAI. Grok 4.3 on Amazon Bedrock remains the current xAI production model at $1.25/$2.50/M
- OpenAI GeneBench-Pro — New research-level benchmark for AI agents in computational biology. Tests how agents navigate ambiguity and make consequential judgments under scientific uncertainty. Open-sources representative questions. A direct response to Anthropic's VirBench — the biology benchmark arms race is now official
1. Anthropic Enters Drug Discovery — What Claude Science Actually Is
Anthropic will start a drug discovery program focused on "neglected" diseases, those that traditional biopharmaceutical companies wouldn't consider attractive targets. The announcement came at the June 30 AI for Science event in San Francisco alongside the launch of Claude Science, an AI "workbench" designed to help scientists discover new drugs. It integrates more than 60 preconfigured tools and connectors into "a single research environment" and provides access to local, remote, and high-performance computing resources.
Claude Science integrates scientific databases, computing resources, and domain-specific tools for genomics, proteomics, and drug discovery. The platform is available in beta for Pro, Max, Team, and Enterprise users on macOS and Linux. The 60+ preconfigured tools are the key specification: this is not a general Claude interface wrapped in a science-themed UI. It is a purpose-built research environment where the tools themselves — database connectors, sequence analysis utilities, genomics pipelines — are first-class citizens alongside the AI model.
Novartis CEO Vas Narasimhan says new AI models could cut drug development time from twelve years to seven or eight. Better safety predictions could also double the success rate from 8% to 16%. Anthropic's internal drug discovery program is not a commitment to become a pharmaceutical company. Anthropic leaders framed it as a way to build credibility with the biopharma companies they hope to win as Claude Science customers. The strategy: use Claude Science internally to develop drug candidates for neglected diseases, generate real scientific results, and demonstrate to pharma customers that the tools work on real problems rather than benchmarks.
Claude Science — What It Includes
- 60+ preconfigured tools: Scientific database connectors, genomics and proteomics pipelines, sequence analysis utilities
- Computing access: Local compute, remote servers, and high-performance computing resources from within a single interface
- Research environment: Not a chatbot, but a workbench where tools, data, and AI operate together in one environment
- Availability: Beta on Pro, Max, Team, and Enterprise plans. macOS and Linux. Not yet available on Windows.
- Internal use: Anthropic's own drug discovery team is using it for neglected disease research, covering rare genetic disorders and tropical diseases
The VirBench research previewed last week is the technical foundation. Google DeepMind's AlphaFold protein structure prediction tool remains one of the most prominent examples of AI in biology, and its co-developer John Jumper recently left for Anthropic. The implication is clear: Jumper's AlphaFold expertise, Claude Science's 60+ tool integrations, and the internal drug discovery program are all part of the same strategic arc, with Anthropic positioning itself as the definitive AI partner for life sciences over the next five years.
2. GPT-5.6 Sol/Terra/Luna — The Preview Details That Matter Most
OpenAI is previewing GPT-5.6 Sol, its newest flagship model for developers and enterprises, alongside Terra and Luna. Sol is built for frontier reasoning and long-horizon agentic work; Terra is a balanced everyday model with GPT-5.5-competitive performance at 2x lower cost; and Luna is the fastest, most affordable member of the family.
Sol advances coding, scientific reasoning, long-horizon planning, and agentic workflows, while improving reliability and efficiency across demanding real-world tasks. Sol establishes new high-water marks across some of OpenAI's most challenging evaluations: on Terminal-Bench 2.1, which tests command-line workflows, Sol sets a new state of the art. The 96.7% Terminal-Bench 2.1 score, confirmed via the government review that triggered the export control discussion, is what places Sol above the informal government cybersecurity threshold and explains why it remains in restricted preview.
| Model | Positioning | Key benchmark | What changes |
| --- | --- | --- | --- |
| GPT-5.6 Sol | Frontier reasoning + long-horizon agents | 96.7% Terminal-Bench 2.1 (new SOTA) | "Max reasoning effort" and "ultra mode": new capability tiers for hardest tasks |
| GPT-5.6 Terra | Balanced everyday model | GPT-5.5-competitive | 2x lower cost than GPT-5.5; most important pricing development of the preview |
| GPT-5.6 Luna | Speed and affordability | Fastest of the three | Most affordable GPT-5.6 tier; likely the default for most ChatGPT users at GA |
The Terra pricing development is the most significant for competitive dynamics. If Terra delivers GPT-5.5-competitive performance at 2x lower cost, implying roughly $2.50/$15 per million tokens (input/output), it would directly challenge Claude Sonnet 5's introductory pricing of $2/$10. Two highly capable models at similar price points amounts to a genuine price war. For enterprise customers evaluating API costs, Terra vs Sonnet 5 will be the comparison that matters most when GPT-5.6 reaches general availability. Altman said "a couple of weeks" on June 26, which puts the end of that window in the coming week.
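To make the Terra vs Sonnet 5 comparison concrete, here is a minimal sketch that prices one workload at the per-million-token rates quoted in this article. The workload size (800M input and 200M output tokens per month) is a hypothetical assumption, Terra's $2.50/$15 is this article's own estimate rather than a published price, and the calculation ignores any difference in how each model tokenizes the same text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per million tokens, input and output."""
    input_per_m: float
    output_per_m: float


def monthly_cost(price: Price, input_m: float, output_m: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    return price.input_per_m * input_m + price.output_per_m * output_m


# Rates as quoted in the article. Terra's rate is an estimate, not a list price.
PRICES = {
    "GPT-5.6 Terra (implied)": Price(2.50, 15.00),
    "Claude Sonnet 5 (intro)": Price(2.00, 10.00),
    "Claude Sonnet 5 (from Sept 1)": Price(3.00, 15.00),
}

# Hypothetical workload (assumption): millions of tokens per month.
INPUT_M, OUTPUT_M = 800, 200

costs = {name: monthly_cost(p, INPUT_M, OUTPUT_M) for name, p in PRICES.items()}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}/month")
```

Under these assumptions, the gap between the implied Terra rate and Sonnet 5's intro rate is driven mostly by the output price, so output-heavy workloads would see the largest difference.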
GPT-5.6 remains in government-restricted preview — available only to approved government and enterprise partners. General availability expected mid-July. Rankings and comparisons will update when publicly accessible.
3. Grok 5 — Still Training, Not Coming Q3
Grok 5 is still in training on Colossus 2, xAI's expanded supercomputer cluster now running at 1.5 GW — a significant scale-up from the original Colossus 1 build. Polymarket closed June 30 contracts for Grok 5 at approximately 3% probability of release, reflecting accurate community calibration of xAI's timeline. No release date has been provided by xAI. Given the scale of the Colossus 2 build and standard training timelines for models at this compute level, Q4 2026 is the earliest realistic window.
For developers and enterprises, the practical implication is clear: Grok 4.3 on Amazon Bedrock at $1.25/$2.50/M remains the current xAI production model. The Colossus 2 investment at 1.5 GW signals that Grok 5 will be a significant capability jump — xAI would not build at this scale for an incremental release. When it arrives, it is expected to compete directly with Claude Fable 5 and GPT-5.6 Sol at the frontier performance level. Until then, the Grok 4.3/SuperGrok combination is the product.
4. OpenAI GeneBench-Pro — The Biology Benchmark Arms Race
OpenAI introduced GeneBench-Pro, a research-level benchmark for judging AI agents in computational biology. It expands GeneBench with harder, more realistic synthetic tasks, open-sources representative questions, and reports strong model results on scientific reasoning under uncertainty. The benchmark tests how AI agents navigate ambiguity and make consequential judgments in computational biology — specifically the kind of tasks where scientific data rarely arrive with instructions, where researchers must decide whether a pattern reflects biology or noise.
The timing is not coincidental. Anthropic published VirBench in June 2026 — showing deterministic tools pushing AI biology accuracy from 16.9% to 92.8% — and launched Claude Science on June 30. OpenAI publishing GeneBench-Pro within days is the clearest signal yet that biology has become the next AI benchmark frontier after coding. The pattern mirrors what happened with SWE-bench in 2024-2025: one company establishes a benchmark, the other responds with a harder version, benchmarks proliferate, and eventually third-party standardised testing resolves the dispute. Expect the biology benchmark space to follow the same arc over the next 12 months.
The Week Ahead — What to Watch
GPT-5.6 general availability: Altman said "a couple of weeks" from June 26. That deadline arrives this week. Government review of GPT-5.6 has now run longer than Fable 5's review (20 days). A general release announcement this week would reset the competitive pricing picture with Terra's 2x-cheaper-than-GPT-5.5 positioning.
White House voluntary AI standards announcement: FT reported on July 2 that the announcement was possible "as soon as next week." That window is this week. The framework will define cybersecurity benchmark thresholds, review timelines, and access rules for frontier models — the most significant AI regulatory development since the June 12 Fable 5 suspension.
Gemini 3.5 Pro general availability: Google targeted July GA. No announcement yet. If it ships this week, it becomes the first new unrestricted frontier model since GPT-5.6 went into restricted preview.
Claude Sonnet 5 introductory pricing window: Intro pricing of $2/$10/M runs through August 31, 2026. Developers on high-volume workloads have 8 weeks to benchmark their actual costs before the September 1 step-up to $3/$15. The tokenizer change (1.0-1.35x more tokens for the same text) raises the effective input price: at 1.3x it is $2.60/M during the intro window and $3.90/M after the step-up, and at the 1.35x upper bound it is $2.70/M and $4.05/M. That is still cheaper than GPT-5.5 at $5, but worth measuring before September.
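The tokenizer arithmetic can be checked with a short sketch. It assumes the effective price is simply the list input price multiplied by the token inflation factor, which holds only if the same text produces that many more tokens. The function and variable names are illustrative, and the 1.3x row is included because it reproduces the $2.60 and $3.90 figures.

```python
def effective_price(list_price_per_m: float, token_multiplier: float) -> float:
    """Price per million old-tokenizer tokens when the same text now
    costs `token_multiplier` times as many tokens."""
    return round(list_price_per_m * token_multiplier, 2)


INTRO_INPUT = 2.00      # USD per million input tokens through August 31, 2026
STANDARD_INPUT = 3.00   # USD per million input tokens from September 1
MULTIPLIERS = (1.0, 1.3, 1.35)  # 1.0 and 1.35 are the quoted bounds

# Map each multiplier to (intro, standard) effective input prices.
table = {
    m: (effective_price(INTRO_INPUT, m), effective_price(STANDARD_INPUT, m))
    for m in MULTIPLIERS
}

for m, (intro, standard) in table.items():
    print(f"{m:.2f}x tokens: intro ${intro:.2f}/M, standard ${standard:.2f}/M")
```

Measuring the multiplier on a sample of real prompts, rather than assuming the upper bound, gives a tighter estimate for any particular workload.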