QUICK ANSWER
Cloudflare and xAI announced on June 4, 2026 that all Grok models are available through Cloudflare AI Gateway. You route requests through your existing AI Gateway endpoint, and usage is billed through Cloudflare. Cloudflare describes the integration as needing no extra API keys, but requests still carry an xAI API token in the Authorization header. Available models: Grok 4.3 (text/image, 1M context, $1.25/$2.50 per million input/output tokens), Grok Build 0.1 (coding agent, 256K context, $1.00/$2.00), Grok Imagine (image generation/editing), Grok Aurora (video), and Grok Voice (speech-to-speech). Elon Musk confirmed the partnership on X the same day.
What Cloudflare AI Gateway Is and Why This Partnership Matters
Cloudflare AI Gateway is a unified control plane for AI API traffic. Rather than calling each AI provider directly from your application, you route all requests through a single AI Gateway endpoint. The Gateway handles logging, caching, rate limiting, spend controls, fallback routing, and billing in one place - regardless of which models you use. With xAI on that roster, developers can bring Grok into an existing multi-model setup without a separate integration layer, separate authentication, or separate billing relationship with xAI.
The core developer benefit Cloudflare emphasized in the announcement: no additional auth, no additional environment variables, no additional API keys. If you already have Cloudflare AI Gateway configured for other providers - OpenAI, Anthropic, Google Vertex - you add Grok with a model name change, not a new credentials setup. Usage is billed directly through Cloudflare, consolidating AI spend onto one invoice alongside your existing Cloudflare services.
Elon Musk confirmed the partnership in a post on X on June 4, 2026 - an unusual level of direct founder engagement for an infrastructure integration announcement, suggesting that the partnership is viewed as strategically significant rather than a routine model listing. According to partner coverage, the xAI-Cloudflare partnership was first formalized in August 2025; the June 2026 announcement extends it to the xAI API catalog as it stands in mid-2026, including the full lineup of generation models.
Full Model Catalog and Pricing
| Model | Type | Context | Input price | Output price | Cached input |
| --- | --- | --- | --- | --- | --- |
| Grok 4.3 | Text + image input | 1M tokens | $1.25/M | $2.50/M | $0.20/M |
| Grok Build 0.1 | Coding agent, text + image | 256K tokens | $1.00/M | $2.00/M | $0.20/M |
| Grok Imagine | Image generation + editing | - | Per image | - | - |
| Grok Aurora | Video generation | - | Per second | - | - |
| Grok Voice | Speech-to-speech audio | - | Per minute | - | - |
Grok 4.3 is xAI's primary text model - the most versatile entry point for most developer use cases. It supports text and image inputs, function calling, structured outputs, and configurable reasoning effort (none, low, medium, or high). Its 1M-token context window matches Claude Sonnet 4.6 and Gemini 3.5 Flash. At $1.25 input / $2.50 output per million tokens, Grok 4.3 significantly undercuts Claude Sonnet 4.6 ($3.00/$15.00) on both input and output; against Gemini 3.5 Flash ($1.50/$9.00) it is slightly cheaper on input and far cheaper on output.
Grok Build 0.1 is xAI's dedicated software engineering model - the same one underlying the Grok Build coding agent beta. It features always-on reasoning, tool calling, and structured outputs at $1.00/$2.00 per million tokens with 100+ tokens per second throughput. For developers building code generation pipelines who want a cheaper alternative to GPT-5.5 or Claude Opus 4.8, Grok Build 0.1 via AI Gateway is worth evaluating.
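To make the feature list above concrete, here is a minimal sketch of a request body that combines a reasoning-effort setting with one callable tool. It assumes an OpenAI-style chat-completions schema; the field name `reasoning_effort`, the model ID `grok-4.3` (taken from the curl example in this article), and the `get_weather` tool are illustrative assumptions, not confirmed xAI parameter names.

```python
import json

# Reasoning levels as listed in the article for Grok 4.3.
REASONING_LEVELS = ("none", "low", "medium", "high")


def build_tool_call_payload(prompt, reasoning_effort="low", model="grok-4.3"):
    """Return a chat request body with one tool definition.

    Assumptions: OpenAI-style schema, and "reasoning_effort" as the
    field name for the reasoning setting.
    """
    if reasoning_effort not in REASONING_LEVELS:
        raise ValueError(f"reasoning_effort must be one of {REASONING_LEVELS}")
    return {
        "model": model,
        "reasoning_effort": reasoning_effort,  # assumed field name
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_tool_call_payload("Weather in Lisbon?"), indent=2))
```

The function only builds the JSON body, so it can be checked locally before any request is sent through the gateway.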
How to Set Up Grok in Cloudflare AI Gateway
Quick setup - Grok 4.3 via AI Gateway
Base URL format
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/grok
Required: Your AI Gateway Account ID, gateway name, and active xAI API token
curl example
curl https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_NAME/grok \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-4.3","messages":[{"role":"user","content":"Hello"}]}'
OpenAI-compatible schema also supported
base_url = "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_NAME/grok"
# Use your existing OpenAI client - just swap the base URL and model name
Despite Cloudflare's announcement saying "no additional API keys needed," you still need an active xAI API token in the Authorization header - the Cloudflare docs confirm this. What the announcement means in practice is that you do not need a separate Cloudflare-xAI credential or a new Cloudflare configuration beyond adding the xAI provider to your gateway. Your existing xAI API key passes through the gateway just as it would in a direct call. Billing still runs through Cloudflare's consolidated invoice for usage tracked through the gateway.
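The same request can be assembled in Python with only the standard library. This is a sketch under the assumptions already stated in this article: the base URL format and the `grok-4.3` model ID come from the examples above, while the `endpoint_path` parameter is a hypothetical hook in case your gateway expects a suffix such as a chat-completions path after `/grok` (check the Cloudflare docs for the exact route).

```python
import json
import urllib.request

GATEWAY_ROOT = "https://gateway.ai.cloudflare.com/v1"


def build_grok_request(account_id, gateway_name, xai_api_key, prompt,
                       model="grok-4.3", endpoint_path=""):
    """Build (but do not send) a POST request routed through AI Gateway.

    endpoint_path is an assumption: it defaults to "" so the URL matches
    the base URL format shown in this article.
    """
    url = f"{GATEWAY_ROOT}/{account_id}/{gateway_name}/grok{endpoint_path}"
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The xAI token is passed through the gateway unchanged.
            "Authorization": f"Bearer {xai_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(...)` and decode the JSON response. Keeping construction separate from sending makes it easy to inspect the URL and headers first.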
What Cloudflare AI Gateway Adds on Top of a Direct xAI Call
Unified logging
Every Grok request is logged alongside your OpenAI, Anthropic, and other provider calls, giving you a single dashboard for all AI traffic: latency metrics, error rates, and token consumption across models.
Caching
Semantic caching returns stored responses for semantically similar queries rather than making a new API call. The $0.20/M cached input token rate reflects this - for Grok 4.3, cached input tokens cost 84% less than uncached ones ($0.20 vs. $1.25 per million); for Grok Build 0.1 the discount is 80% ($0.20 vs. $1.00).
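The effect of the cached rate on an input bill can be worked out with a few lines of arithmetic. The sketch below uses the Grok 4.3 prices from the catalog table and assumes the $0.20/M rate applies to whatever fraction of input tokens is served from cache; the function names are illustrative.

```python
def cached_discount(price_per_m=1.25, cached_price_per_m=0.20):
    """Fractional saving of the cached rate relative to the normal input rate."""
    return 1 - cached_price_per_m / price_per_m


def input_cost_usd(input_tokens, cached_fraction,
                   price_per_m=1.25, cached_price_per_m=0.20):
    """Blended input cost in USD when part of the input is billed at the cached rate."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    return (uncached * price_per_m + cached * cached_price_per_m) / 1_000_000
```

For example, 10 million input tokens with half of them cached come to 5M x $1.25 + 5M x $0.20 = $7.25, versus $12.50 with no cache hits.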
Rate limiting and spend controls
Set hard spending limits on Grok usage without modifying your application code. AI Gateway enforces the limit at the gateway level - useful for preventing runaway costs on agentic pipelines.
Dynamic routing and fallbacks
Configure fallback chains: if Grok 4.3 returns an error or exceeds latency thresholds, automatically route to another model (GPT-5.4, Claude Sonnet 4.6) without changing application code. This is particularly valuable for production reliability where a single provider outage would otherwise bring down the feature.
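AI Gateway applies this logic at the gateway level, so none of it needs to live in your application. Purely to illustrate what a fallback chain does, here is a small client-side sketch: it tries each provider in order and returns the first successful response. The callables stand in for real provider calls and are hypothetical; this is not Cloudflare's routing configuration.

```python
from typing import Callable, Sequence, Tuple


def call_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, response) of the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

A chain such as `[("grok-4.3", call_grok), ("gpt-5.4", call_gpt)]` (both functions being placeholders) returns the GPT-5.4 response only when the Grok call raises.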
Guardrails
AI Gateway's built-in guardrails flag and block harmful or inappropriate content, protect personal data, and enforce compliance policies in real time - applied uniformly across all providers including Grok, without adding guardrail logic to each provider integration separately.
How Grok Compares to Other Models in AI Gateway
| Model | Provider | Input (per 1M) | Output (per 1M) | Context |
| --- | --- | --- | --- | --- |
| GPT-5.4 | OpenAI | $2.00 | $8.00 | 128K |
| Grok 4.3 | xAI | $1.25 | $2.50 | 1M |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | 1M |
| Grok Build 0.1 | xAI | $1.00 | $2.00 | 256K |
Grok 4.3 is the lowest-priced 1M-context model in this comparison. At $1.25/$2.50, it undercuts Gemini 3.5 Flash on output by 72% and Claude Sonnet 4.6 on output by 83%. The pricing advantage is significant for output-heavy workloads - document generation, code generation, long-form content - where output token costs dominate the total bill.
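The percentages above follow directly from the table, and the same figures can be used to estimate a bill for a given token mix. The sketch below hard-codes the per-million prices from the comparison table; the function names are illustrative.

```python
# (input, output) USD per 1M tokens, copied from the comparison table.
PRICES = {
    "GPT-5.4": (2.00, 8.00),
    "Grok 4.3": (1.25, 2.50),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Grok Build 0.1": (1.00, 2.00),
}


def workload_cost(model, input_tokens, output_tokens):
    """Total USD cost for a workload at the table's token rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def output_savings_pct(cheaper, pricier):
    """Whole-percent saving on output tokens of one model versus another."""
    return round(100 * (1 - PRICES[cheaper][1] / PRICES[pricier][1]))
```

For an output-heavy job of 1M input and 4M output tokens, this gives $11.25 on Grok 4.3, $37.50 on Gemini 3.5 Flash, and $63.00 on Claude Sonnet 4.6.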
Frequently Asked Questions
Do I need an xAI account to use Grok via AI Gateway?
Yes. Despite Cloudflare's framing of "no additional API keys needed," the Cloudflare docs confirm you still need an active xAI API token in the Authorization header. What you do not need is a new Cloudflare-specific credential or a separate billing relationship. Get an xAI API key at console.x.ai, then use it in your AI Gateway requests the same way you would for a direct xAI call.
Does Grok in AI Gateway support function calling and structured outputs?
Yes. Grok 4.3 and Grok Build 0.1 both support function calling and structured outputs natively. Grok 4.3 also supports configurable reasoning effort (none, low, medium, high) which maps to the thinking level system on the Grok consumer app.
Is the AI Gateway free to use?
Cloudflare AI Gateway has a free tier covering up to 100,000 requests per day with basic logging. The Workers Paid plan ($5/month) removes request limits and adds advanced features including caching analytics and extended log retention. You still pay for model usage at the provider's token rates regardless of which Gateway tier you are on.
Can I use Grok Aurora (video) and Grok Voice (audio) via AI Gateway?
Yes - both are included in the partnership announcement. Grok Aurora for video generation and Grok Voice for speech-to-speech audio are accessible via AI Gateway. Pricing for these generation models is per-second (video) and per-minute (audio) respectively - check the Cloudflare AI Gateway docs at developers.cloudflare.com/ai-gateway for current rates as these may differ from the text model pricing.
How does this compare to calling the xAI API directly?
Direct xAI API calls are marginally simpler (no gateway URL substitution) and have no additional latency from the gateway hop. AI Gateway adds logging, caching, rate limiting, spend controls, fallback routing, and guardrails. For production applications where observability and reliability matter, the gateway overhead is worth it. For quick prototyping or single-model applications, direct xAI API calls are fine.