Grok 4 First Tests: Beats o3, Can't Count Fingers

Early tests of Grok 4 show mixed results: it outperforms OpenAI o3 in coding and reasoning but fails at basic visual tasks such as counting fingers.

Jul 11, 2025



Users Pay Big Money to Try Grok 4

At yesterday's Grok 4 launch event, a visibly proud Musk declared: "Grok now reaches postdoctoral level in all disciplines, without exception, and can even achieve scientific breakthroughs within this year."

Related: Musk Releases Grok 4! Tops All Benchmarks (Meng Li, July 10, 2025)

The claim immediately drew interest from users around the world. Despite Grok 4's hefty price tag, many of them paid out of pocket to try it.

Grok 4 vs o3 Battle

Blogger @Alex Prompter conducted a series of tests comparing Grok 4 and OpenAI o3.

The first test was a physics simulation: a ball bouncing inside a rotating hexagon. It checks whether a model truly understands causal laws such as gravity and collision, along with spatiotemporal relationships, and it exercises the model's coding ability at the same time.

He gave Grok 4 and o3 the identical prompt and compared what each one generated.

Prompt: Create a HTML, CSS, and javascript where a ball is inside a rotating hexagon. The ball is affected by Earth's gravity and friction from the hexagon walls. The bouncing must appear realistic.
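To give a sense of what this prompt demands, here is a minimal sketch of the physics update such a program needs on every frame. It is written in Python rather than the JavaScript the prompt asks for, and it is not the output of Grok 4 or o3. The names (`Ball`, `step`), the constants, and the friction model (simply damping the sliding speed on contact) are all illustrative assumptions.

```python
import math
from dataclasses import dataclass

GRAVITY = -9.81  # m/s^2 along -y (y points up in this sketch)


@dataclass
class Ball:
    x: float
    y: float
    vx: float
    vy: float
    radius: float = 0.1


def step(ball, theta, dt, hex_radius=1.0, omega=1.0,
         restitution=0.8, friction=0.1):
    """Advance the ball and the hexagon angle by dt; return the new angle."""
    # Semi-implicit Euler: gravity updates velocity, velocity updates position.
    ball.vy += GRAVITY * dt
    ball.x += ball.vx * dt
    ball.y += ball.vy * dt
    theta += omega * dt

    apothem = hex_radius * math.cos(math.pi / 6)  # centre-to-wall distance
    for k in range(6):
        # Outward unit normal of wall k points at that wall's midpoint.
        phi = theta + (k + 0.5) * math.pi / 3
        ux, uy = math.cos(phi), math.sin(phi)
        dist = apothem - (ball.x * ux + ball.y * uy)  # ball centre to wall
        if dist >= ball.radius:
            continue  # no contact with this wall

        # Velocity of the rotating wall at the contact point (omega x r).
        cx, cy = ball.x + ux * dist, ball.y + uy * dist
        wvx, wvy = -omega * cy, omega * cx

        # Ball velocity relative to the wall, split into normal and tangent.
        nx, ny = -ux, -uy  # inward normal
        tx, ty = -ny, nx   # direction along the wall
        rvx, rvy = ball.vx - wvx, ball.vy - wvy
        vn = rvx * nx + rvy * ny
        vt = rvx * tx + rvy * ty
        if vn < 0:  # the ball is moving into the wall
            vn = -restitution * vn       # bounce, losing some energy
            vt = (1.0 - friction) * vt   # crude friction: damp the sliding
            ball.vx = wvx + vn * nx + vt * tx
            ball.vy = wvy + vn * ny + vt * ty

        # Push the ball back inside so it no longer overlaps the wall.
        push = ball.radius - dist
        ball.x += push * nx
        ball.y += push * ny
    return theta


# Usage: simulate two seconds at 240 steps per second.
ball = Ball(x=0.0, y=0.0, vx=1.5, vy=0.0)
theta = 0.0
for _ in range(480):
    theta = step(ball, theta, dt=1 / 240)
```

Resolving the collision relative to the moving wall is the part that makes the task harder than a static box: the wall's own velocity at the contact point has to be subtracted before the bounce and added back afterwards, otherwise a spinning hexagon would never push the ball around. A browser version would call the same kind of update from an animation loop and draw the result on a canvas.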
