You will notice something strange when optimizing headlines for AI-related content in 2026: even AI models do not agree on what gets clicks.
I ran a real-world CTR experiment using ChatGPT, Claude, Gemini, and Grok — feeding all four the same query, the same impression data, and the same two headline options. The result was a perfect split: two models voted for one headline, two voted for the other. And the reasoning behind each vote reveals something genuinely useful about how intent-matching actually works in SEO.
The experiment setup
To test CTR behavior, I used a consistent set of inputs across all four models:
Same topic: Grok Voice and Video features.
Same impression data from the last 24 hours.
Same two headline options, each rated on a 0 to 10 CTR scale.
The two options were:
Option 1 — Feature-Driven Headline
Heading: "Grok Video Generation & Aurora Voice (April 2026): Who Has Access?"
Subheading: "Here is exactly how to enable Video and Aurora Voice mode on X Premium+ and SuperGrok tiers."
Option 2 — Action + Availability Headline
Heading: "Grok Voice Mode Is Available Now — Here's How to Turn It On"
Subheading: "Aurora voice mode works on iOS, Android, and grok.com desktop for SuperGrok ($30/mo) and X Premium+ subscribers."
Real search impression data (24 hours)
Before sharing the AI responses, here is the actual impression data I fed each model: real queries that generated real search impressions in a 24-hour window.
grok xai voice mode availability 2026 — 36 impressions
does grok ai have video generation feature april 2026 — 31 impressions
grok voice mode availability 2026 — 19 impressions
does grok by xai have video generation feature april 2026 — 15 impressions
grok voice mode availability april 2026 — 13 impressions
grok voice feature availability 2026 — 11 impressions
grok voice mode update april 2026 — 11 impressions
The intent picture here is mixed. Availability intent dominates: users asking "does it have this" and "is it available." But feature curiosity is also significant, particularly around video generation, which accounts for 46 of the 136 total impressions (about 34%) across two query variants.
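The intent split can be checked with a few lines of arithmetic. The sketch below is one assumed way to bucket the seven queries: anything mentioning "video" counts as feature curiosity and everything else as voice availability. The bucket names and the keyword rule are a simplification for illustration, not something the impression report provides.

```python
# Impression counts copied from the 24-hour list above.
IMPRESSIONS = {
    "grok xai voice mode availability 2026": 36,
    "does grok ai have video generation feature april 2026": 31,
    "grok voice mode availability 2026": 19,
    "does grok by xai have video generation feature april 2026": 15,
    "grok voice mode availability april 2026": 13,
    "grok voice feature availability 2026": 11,
    "grok voice mode update april 2026": 11,
}


def bucket(query: str) -> str:
    """Assumed rule: 'video' in the query means feature curiosity."""
    return "video_feature" if "video" in query else "voice_availability"


def intent_totals(impressions: dict[str, int]) -> dict[str, int]:
    """Sum impressions per intent bucket."""
    totals: dict[str, int] = {}
    for query, count in impressions.items():
        key = bucket(query)
        totals[key] = totals.get(key, 0) + count
    return totals


totals = intent_totals(IMPRESSIONS)
grand_total = sum(IMPRESSIONS.values())
for name, count in totals.items():
    print(f"{name}: {count} of {grand_total} ({count / grand_total:.0%})")
```

Under this rule the video bucket holds 46 of 136 impressions and the voice bucket holds the remaining 90. A different bucketing rule (for example, treating the "update" query as its own intent) would shift the split.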
What each AI model said
Grok: Option 1 → 9/10. Option 2 → 5/10. Grok favored the feature-specific headline, rating its specificity and technical detail — "15s clips," "native audio," the April 2026 date stamp — as strong signals for users who want capability confirmation before clicking.
ChatGPT: Option 2 → 9.6/10. Option 1 → 6.2/10. ChatGPT voted hard for the action headline, citing "available now" as an urgency signal and "here's how to turn it on" as a direct match for users with action intent — people who already know they want the feature and are searching for the how-to.
Gemini: Option 1 → 9/10. Option 2 → 7/10. Gemini sided with Option 1, noting that the video generation hook directly matches the second-highest impression query ("does grok ai have video generation feature april 2026 — 31 impressions"), and that Option 2's lack of a video hook misses a significant portion of the search intent pool.
Claude: Option 2 → 8.5/10. Option 1 → 5.5/10. Claude went with Option 2, arguing it matches 72% of impressions (the availability-intent queries), carries both an "available now" urgency signal and an action-intent "how-to" promise, and is easier to process at a glance. It flagged Option 1 for using "Aurora Voice" — a term that appears in zero queries — as a wasted keyword slot.
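Claude's point about "Aurora" and Gemini's point about the video hook are both claims about term overlap between a headline and the query list, which is easy to check. A minimal sketch, assuming plain lowercase substring matching (a much cruder test than whatever a search engine actually does):

```python
# Impression counts copied from the 24-hour list above.
IMPRESSIONS = {
    "grok xai voice mode availability 2026": 36,
    "does grok ai have video generation feature april 2026": 31,
    "grok voice mode availability 2026": 19,
    "does grok by xai have video generation feature april 2026": 15,
    "grok voice mode availability april 2026": 13,
    "grok voice feature availability 2026": 11,
    "grok voice mode update april 2026": 11,
}


def impressions_mentioning(term: str) -> int:
    """Total impressions for queries that contain the term as a substring."""
    needle = term.lower()
    return sum(count for query, count in IMPRESSIONS.items() if needle in query)


for term in ("aurora", "video", "voice", "availability"):
    print(f"{term}: {impressions_mentioning(term)} impressions")
```

On this data "aurora" matches no queries at all, while "video" matches 46 impressions and "voice" matches 90, which is the tension the two camps are arguing over.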
Final result: 2 vs 2
Option 1 wins: Grok and Gemini.
Option 2 wins: ChatGPT and Claude.
A perfect split.
This is not a failure of the experiment. It is the finding.
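For anyone who wants to repeat the tally with more models, here is a small sketch that turns the reported scores into votes and averages. The function names are hypothetical; the numbers are the four score pairs listed above.

```python
# Scores as reported by each model: (Option 1, Option 2), on a 0 to 10 scale.
SCORES = {
    "Grok": (9.0, 5.0),
    "ChatGPT": (6.2, 9.6),
    "Gemini": (9.0, 7.0),
    "Claude": (5.5, 8.5),
}


def tally_votes(scores: dict[str, tuple[float, float]]) -> dict[str, list[str]]:
    """Group models by the option they scored higher; equal scores go to 'Tie'."""
    votes: dict[str, list[str]] = {"Option 1": [], "Option 2": [], "Tie": []}
    for model, (opt1, opt2) in scores.items():
        if opt1 > opt2:
            votes["Option 1"].append(model)
        elif opt2 > opt1:
            votes["Option 2"].append(model)
        else:
            votes["Tie"].append(model)
    return votes


def mean_scores(scores: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Average score per option across all models."""
    n = len(scores)
    mean1 = sum(opt1 for opt1, _ in scores.values()) / n
    mean2 = sum(opt2 for _, opt2 in scores.values()) / n
    return mean1, mean2


votes = tally_votes(SCORES)
mean1, mean2 = mean_scores(SCORES)
print(votes)
print(f"Option 1 mean: {mean1:.2f}, Option 2 mean: {mean2:.2f}")
```

The averages come out at roughly 7.4 for Option 1 and 7.5 for Option 2, so the split shows up in the mean scores as well as in the vote count.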
What the split actually reveals — the SEO insight
The two models that voted for Option 1 (Grok, Gemini) weighted feature specificity and keyword coverage. The two that voted for Option 2 (ChatGPT, Claude) weighted intent alignment and conversion probability. Both are legitimate CTR optimization strategies — they are just optimizing for different user types at different stages of the decision funnel.
Option 1 — Feature Hook works for curiosity-driven clicks. It matches "does it have X feature?" queries. It is strong for discovery traffic — users who are not yet sure they want the product and need to be persuaded by capability signals. The specifics (video generation, Aurora Voice, April 2026) function as credibility markers that tell the reader this article has current, detailed information.
Option 2 — Action + Availability matches high-intent search queries directly. "Available now" creates urgency. "How to turn it on" is a direct action trigger that converts the impression into a click from users who have already decided they want the feature. This headline type performs better for conversion-type clicks — users who are one step away from using the product and just need the setup instructions.
So which one actually wins?
The answer depends on your traffic source.
If your traffic is search-driven and your top queries follow the availability pattern (as they do in this dataset), Option 2 likely wins on CTR. The majority of impressions here are availability-intent queries, with users asking whether Grok has the feature and whether they can use it yet. Option 2 answers that directly in the headline and offers the how-to as the natural next step.
If your traffic is social or discovery-driven — Reddit, X, newsletters, where users are browsing rather than searching — Option 1 can outperform. Feature headlines attract attention from curious readers who are not actively searching but will click when something looks technically interesting and specific.
My real conclusion
CTR is not about the "better headline." It is about matching user intent. Feature headlines attract attention. Action headlines convert clicks. The split result from this experiment is not a bug — it is a clear signal that the two headlines are optimized for two different audiences, and the right choice depends entirely on which audience you are trying to reach.
The deeper lesson is that asking AI models to rate CTR without context about your distribution channel will tend to produce conflicting results, because the models are correctly identifying that different intent signals favor different headline structures. The models are not wrong. The question is incomplete.
What I'm testing next
A hybrid headline that combines the feature hook with the action trigger — something like "Grok Voice Mode and Video Generation Are Live — Here's How to Turn Them On" — to test whether covering both intent types in one headline outperforms either option on its own. I will also be tracking CTR versus bounce rate correlation, because a headline that gets the click but does not match the content delivers worse long-term signals than a lower-CTR headline that converts to engaged readers.
If you are doing your own headline testing, the one lesson this experiment makes clear is this: separate your search traffic from your social traffic before you compare results. The same headline can be a 9/10 for one audience and a 5/10 for another, and both scores are correct.
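The segmentation advice can be sketched as a per-channel CTR calculation. Everything in the rows below is a placeholder chosen to show the mechanics: the channel labels, headline keys, and counts are hypothetical and are not results from this experiment.

```python
from collections import defaultdict

# Hypothetical rows: (channel, headline, impressions, clicks).
# Placeholder numbers only, used to demonstrate the calculation.
ROWS = [
    ("search", "option_1", 1000, 30),
    ("search", "option_2", 1000, 50),
    ("social", "option_1", 1000, 40),
    ("social", "option_2", 1000, 20),
]


def ctr_by(rows, key_fn):
    """Aggregate clicks / impressions for each key returned by key_fn."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for channel, headline, imps, clk in rows:
        key = key_fn(channel, headline)
        impressions[key] += imps
        clicks[key] += clk
    # Skip keys with zero impressions to avoid dividing by zero.
    return {k: clicks[k] / impressions[k] for k in impressions if impressions[k]}


pooled = ctr_by(ROWS, lambda channel, headline: headline)
segmented = ctr_by(ROWS, lambda channel, headline: (channel, headline))

for key, ctr in pooled.items():
    print(f"pooled {key}: {ctr:.1%}")
for key, ctr in segmented.items():
    print(f"{key[0]} {key[1]}: {ctr:.1%}")
```

With these placeholder rows the pooled CTR is identical for both headlines, while the per-channel numbers point in opposite directions. That is the pattern to watch for: a pooled comparison can hide a difference that only appears once search and social are separated.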