Today's Biggest AI Updates
- Claude subscriptions stop covering third-party tools: major impact on AI agents and automation tools
- Grok adds Speed and Quality modes: more control over the trade-off between output speed and fidelity
- Microsoft launches 3 new AI models: strong push beyond its OpenAI partnership
- ElevenLabs enters image and video: full creative platform expansion
- Google TurboQuant cuts KV cache memory at least 6x: major infrastructure shift
What this means: AI tools are rapidly shifting from simple apps to full platforms. Costs, performance, and competition are all changing faster than expected.
April 2026 opened with a flood of AI announcements, any one of which would have dominated a slow news week. Instead they arrived within days of each other, making this one of the most consequential early-month stretches in AI product history. Here is every update that matters, organized by tool.
Google DeepMind — Gemma 4
Released April 2, 2026 under Apache 2.0 — the most permissive open license in the Gemma family's history. Four models: Effective 2B and 4B for phones and edge devices, a 26B Mixture of Experts, and a 31B Dense that currently ranks third globally among all open models on Arena AI with an Elo score of 1452. All four support text, images, and video. The edge models also handle native audio input. Context windows reach 256,000 tokens on the larger models. The 31B scores 89.2% on AIME 2026 and 80.0% on LiveCodeBench v6. Day-one support confirmed across Hugging Face, Ollama, vLLM, llama.cpp, MLX, LM Studio, NVIDIA NIM, and Android Studio. Available now via ollama run gemma4:27b.
Microsoft — MAI-Transcribe-1, MAI-Voice-1, MAI-Image-2
Three in-house AI models launched April 2, 2026 through Microsoft Foundry and the new MAI Playground. MAI-Transcribe-1 claims the lowest average FLEURS word error rate (WER) across 25 languages at 3.8%, undercutting OpenAI Whisper-large-v3 on all 25 languages and Google Gemini 3.1 Flash on 22 of 25. MAI-Voice-1 generates audio at 60x real-time and supports custom voice creation from a few seconds of sample audio, priced at $22 per million characters, which puts it in direct competition with ElevenLabs. MAI-Image-2 debuted in the Arena.ai top three with 2x faster generation than its predecessor, priced at $5 per million input tokens and $33 for image output. WPP is among the first enterprise partners. Built by a team of just 10 people, the audio model reflects Microsoft AI CEO Mustafa Suleyman's philosophy of small, empowered engineering teams. These are Microsoft's first independently built frontier AI production models since it began its OpenAI partnership.
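To put the MAI-Voice-1 price in concrete terms, the arithmetic can be sketched as below. This is a rough estimate only: it assumes the $22 per million characters rate is applied as a flat per-character charge with no minimums, tiers, or rounding rules, none of which the announcement details. The function name and the 90,000-character example are hypothetical.

```python
def tts_cost_usd(num_characters: int, usd_per_million_chars: float = 22.0) -> float:
    """Estimate text-to-speech cost at a flat per-character rate.

    Assumption: billing is strictly proportional to character count.
    """
    if num_characters < 0:
        raise ValueError("num_characters must be non-negative")
    return num_characters / 1_000_000 * usd_per_million_chars


# A hypothetical 90,000-character audiobook chapter.
chapter_cost = tts_cost_usd(90_000)
print(f"${chapter_cost:.2f}")
```

At the quoted rate, a script of that length would come to roughly two dollars.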
Anthropic — Claude computer use on Windows
Computer use in Claude Cowork and Claude Code Desktop expanded to Windows this week. Previously macOS-only since its March 23 launch, the feature now lets Windows users enable Claude to open apps, navigate Chrome, fill spreadsheets, run dev tools, and complete multi-step desktop tasks autonomously. The update covers all supported Windows hardware, with no restriction comparable to an Apple Silicon requirement. To enable: Settings → General → Desktop app → turn on Browser use → turn on Computer use. Works with Dispatch for phone-to-desktop task handoff. Available to Pro ($20/month) and Max ($100–$200/month) subscribers only.
Anthropic — Claude subscriptions end for third-party tools
Effective April 4, 2026 at 12 p.m. PT, Claude Pro and Max subscriptions no longer cover usage through third-party agents like OpenClaw, OpenCode, and similar harnesses. Announced by Claude Code executive Boris Cherny, the change formally ends the practice of routing subscription OAuth tokens through external tools to access Claude's models at flat-rate pricing. Users can continue using third-party tools with their Claude account through pay-as-you-go usage bundles, currently offered with a one-time discounted credit, or by switching to a standard API key. Refunds are being offered. The move follows months of technical enforcement beginning in January 2026 and formalizes Anthropic's updated Terms of Service, which explicitly prohibit OAuth token use outside Claude.ai and Claude Code.
Cohere — Transcribe (open-source ASR)
Launched March 26, 2026. Cohere's first voice model is a 2-billion-parameter open-source automatic speech recognition model that immediately took the top spot on the Hugging Face Open ASR Leaderboard with an average word error rate of 5.42% — beating OpenAI Whisper Large v3 (7.44%), ElevenLabs Scribe v2 (5.83%), and Qwen3-ASR-1.7B (5.76%). Supports 14 languages including English, French, Chinese, Arabic, and Japanese. Licensed under Apache 2.0. In human evaluations, Transcribe was preferred over Whisper Large v3 in 64% of English pairwise comparisons. Available free on Hugging Face and via Cohere's API, with production deployment through Model Vault. Integration with Cohere's North enterprise agent platform is planned for later in 2026.
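For readers unfamiliar with the metric behind these rankings, word error rate is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that definition. It only lowercases and splits on whitespace, whereas public leaderboards apply their own text normalization, so it should not be expected to reproduce the published figures. The function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A score of 5.42% therefore means roughly one word-level error for every 18 reference words, averaged over the benchmark's test sets.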
Google Research — TurboQuant
Google Research introduced TurboQuant, a KV cache compression algorithm that reduces AI inference memory requirements by at least sixfold with no accuracy loss on benchmarks. The method targets the KV cache, the store of attention keys and values for previously processed tokens and one of the primary bottlenecks in running large language models with long context windows. The impact on memory-chip stocks was immediate: SK Hynix dropped over 6%, Samsung fell 5%, and Micron slid more than 2% on the announcement, as investors repriced the long-term demand outlook for AI memory chips. For AI users and developers, the implication is lower infrastructure costs and potentially faster adoption of long-context models across a wider range of hardware.
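To see why the KV cache matters at long context lengths, its uncompressed size can be estimated with the standard back-of-the-envelope formula below. This sketch does not implement TurboQuant. It computes a baseline for a hypothetical model shape (32 layers, 8 key-value heads, head dimension 128, 16-bit values, all chosen for illustration) and then simply divides by six, assuming the reported reduction applies uniformly to the whole cache.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2,
                   batch_size: int = 1) -> int:
    """Uncompressed KV cache size in bytes for a decoder-only transformer."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)


# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(num_layers=32, num_kv_heads=8,
                          head_dim=128, seq_len=131_072)
compressed = baseline / 6  # assumes the sixfold figure applies uniformly

GIB = 2**30
print(f"baseline:   {baseline / GIB:.1f} GiB")
print(f"compressed: {compressed / GIB:.1f} GiB")
```

For this made-up configuration, a single 131,072-token sequence needs 16 GiB of cache at 16-bit precision, and a sixfold reduction would bring that under 3 GiB.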
Salesforce — Slackbot upgraded to autonomous work assistant
Salesforce announced a major update to Slackbot this week, transforming it from a simple notification tool into an autonomous work assistant with 30 new AI features. The upgraded Slackbot can take multi-step actions, coordinate tasks across channels, and integrate with Salesforce CRM data to surface context directly inside Slack conversations. The update positions Slack as a full-featured AI work layer rather than a messaging platform with bolt-on AI — a direct response to Microsoft's Copilot Cowork expansion and the broader push toward agentic workplace tools.
Microsoft Copilot — Critique and Model Council
Microsoft rolled out two new multi-model orchestration features to Copilot this week. Critique has one model generate a response while a second reviews it for accuracy before delivery — a quality layer designed to reduce hallucinations in high-stakes business outputs. Model Council enables side-by-side comparisons of outputs from multiple models simultaneously, giving users and teams the ability to pick the best response rather than accepting a single answer. Microsoft is also expanding access to Copilot Cowork, its agentic tool for task automation, as part of the same update cycle.
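The two patterns described above can be sketched generically as follows. This is not Microsoft's implementation or API: the models are plain Python callables, the stub generator and reviewer are invented for the demo, and the revise-after-feedback loop is an assumption about how a review step might feed back into the answer.

```python
from typing import Callable, Dict

Model = Callable[[str], str]  # hypothetical: any function from prompt to text


def critique(prompt: str, generator: Model, reviewer: Model,
             max_rounds: int = 2) -> str:
    """One model drafts, a second reviews; revise until approved or out of rounds."""
    draft = generator(prompt)
    for _ in range(max_rounds):
        verdict = reviewer(
            f"Question: {prompt}\nDraft answer: {draft}\n"
            "Reply APPROVE or list problems."
        )
        if verdict.strip().upper().startswith("APPROVE"):
            return draft
        draft = generator(
            f"{prompt}\n\nRevise the draft to address the feedback.\n"
            f"Draft: {draft}\nFeedback: {verdict}"
        )
    return draft


def council(prompt: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Collect one answer per model so a user can compare them side by side."""
    return {name: model(prompt) for name, model in models.items()}


# Stub models standing in for real ones, for demonstration only.
def stub_generator(prompt: str) -> str:
    return "Paris" if "Revise" in prompt else "Lyon"


def stub_reviewer(prompt: str) -> str:
    if "Draft answer: Paris" in prompt:
        return "APPROVE"
    return "The capital of France is Paris, not Lyon."


question = "What is the capital of France?"
print(critique(question, stub_generator, stub_reviewer))
print(council(question, {"a": stub_generator, "b": lambda p: "Paris"}))
```

The design trade-off is latency and cost: each review round adds at least one extra model call, which is why a cap such as `max_rounds` is useful.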
xAI Grok — Speed and quality modes, Imagine 2.0 incoming
Grok app version 1.3.54 shipped this week with sharper image detail, smoother video motion, and cinematic quality improvements to Grok Imagine. On April 3, Elon Musk confirmed the first Imagine 2.0 preview: users can now choose between Speed mode (fast output for iteration) and Quality mode (higher fidelity per generation). A Professional mode is confirmed for later in April ahead of the full Imagine 2.0 rollout, which will bring major improvements to face consistency and audio synchronization. Musk also said the models update approximately twice weekly, and claimed Grok is more accurate on real-world topics such as California zoning laws, where he said other models have produced incorrect results. Update the Grok app to version 1.3.54 or later to access the new modes.
ElevenLabs — Image and Video (Beta)
ElevenLabs launched Image and Video in beta, expanding from its audio-first roots into a unified creative platform. Users can now generate images and videos using models including Veo, Sora, Kling, WAN, and Seedance — then immediately bring them to life with ElevenLabs voices, music, and sound effects in a single workflow. The platform includes lipsync for generated videos using ElevenLabs voices, a composition timeline for multi-clip storytelling, and direct export to ElevenLabs Studio for final production polish. The addition makes ElevenLabs a direct competitor to standalone video generation tools, positioning it as an end-to-end creative production platform rather than a voice-only tool.
Anthropic MCP — 97 million installs
Anthropic's Model Context Protocol crossed 97 million installs in March 2026, cementing its transition from an experimental standard to foundational infrastructure for AI agents. Every major AI provider now ships MCP-compatible tooling, and the protocol has become the default mechanism by which agents connect to external tools, APIs, and data sources across the industry. The milestone was highlighted at NVIDIA GTC 2026, where Jensen Huang's keynote emphasized that AI has moved from experimental infrastructure to a core operating layer for global industry.
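At the wire level, MCP messages are JSON-RPC 2.0 objects, and an agent discovers and invokes tools with the `tools/list` and `tools/call` methods. The sketch below only builds those two request payloads. It omits the transport and the initialization handshake that a real session starts with, and the `get_weather` tool and its `city` argument are hypothetical.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_request_ids = count(1)  # JSON-RPC requests carry a unique id per session


def mcp_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind MCP clients send."""
    message: Dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask a server which tools it exposes, then call one of them.
list_tools = mcp_request("tools/list")
call_tool = mcp_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Berlin"}},  # hypothetical tool
)
print(list_tools)
print(call_tool)
```

Because every server answers the same two methods, an agent can work with a tool it has never seen by reading the schema returned from `tools/list`.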
The pattern across all of it
Every update this week points in the same direction: AI tools are moving from single-feature products toward platforms, from consumer experiments toward enterprise infrastructure, and from human-in-the-loop toward autonomous execution. Gemma 4 brings frontier-level reasoning to a phone. Claude computer use brings autonomous desktop control to Windows. Cohere Transcribe brings enterprise-grade speech recognition to anyone with a GPU. TurboQuant makes long-context AI cheaper to run at scale. The velocity is not slowing down. If anything, April 2026 is confirming that the pace of meaningful AI product releases has permanently reset to a level that would have seemed impossible two years ago.