Grok 4 First Tests: Beats o3, Can't Count Fingers
Grok 4 first tests reveal mixed results: outperforms OpenAI o3 in coding and reasoning but fails at basic visual tasks like counting fingers.
Users Pay a Premium to Try Grok 4
Yesterday, a visibly proud Musk took the stage at the Grok 4 launch event and declared: "Grok now reaches postdoctoral level in all disciplines, without exception, and can even achieve scientific breakthroughs within this year."
The claim quickly drew attention worldwide. Despite Grok 4's hefty price tag, many users paid out of their own pockets to try it.
Grok 4 vs o3 Battle
Blogger @Alex Prompter conducted a series of tests comparing Grok 4 and OpenAI o3.
The first test was a physics simulation: a ball bouncing inside a rotating hexagon. It checks whether the model can correctly handle physical laws such as gravity and collision, along with spatial and temporal relationships, and it doubles as a test of coding ability.
He gave both models the same prompt and compared what each one generated.
Prompt: Create a HTML, CSS, and javascript where a ball is inside a rotating hexagon. The ball is affected by Earth's gravity and friction from the hexagon walls. The bouncing must appear realistic.
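For readers who want to see what this prompt demands, here is a minimal sketch of the underlying physics. It is not the output of either model, and it is written in Python rather than the HTML/JavaScript the prompt requests. The constants (`RESTITUTION`, `WALL_FRICTION`), the default sizes, the rotation speed `omega`, and the function names are all assumptions made for illustration. Friction is approximated crudely by damping the tangential part of the relative velocity on each bounce.

```python
import math

GRAVITY = -9.81        # m/s^2 along y (y points up)
RESTITUTION = 0.8      # assumed: fraction of normal speed kept after a bounce
WALL_FRICTION = 0.1    # assumed: fraction of tangential speed lost per bounce


def hexagon_vertices(radius, angle):
    """Vertices of a regular hexagon centered at the origin, counterclockwise."""
    return [
        (radius * math.cos(angle + k * math.pi / 3),
         radius * math.sin(angle + k * math.pi / 3))
        for k in range(6)
    ]


def step(pos, vel, angle, dt, radius=1.0, ball_r=0.1, omega=1.0):
    """Advance the ball and the hexagon by one time step of length dt."""
    x, y = pos
    vx, vy = vel

    # Semi-implicit Euler: update velocity from gravity, then position.
    vy += GRAVITY * dt
    x += vx * dt
    y += vy * dt
    angle += omega * dt

    verts = hexagon_vertices(radius, angle)
    for k in range(6):
        ax, ay = verts[k]
        bx, by = verts[(k + 1) % 6]
        ex, ey = bx - ax, by - ay
        length = math.hypot(ex, ey)
        # Left-hand normal of a counterclockwise edge points toward the center.
        nx, ny = -ey / length, ex / length
        dist = (x - ax) * nx + (y - ay) * ny  # signed distance to the wall
        if dist >= ball_r:
            continue  # no contact with this wall

        # Velocity of the rotating wall at the contact point (omega x r).
        cx, cy = x - dist * nx, y - dist * ny
        wx, wy = -omega * cy, omega * cx

        # Work in the wall's frame so the rotation can push the ball.
        rvx, rvy = vx - wx, vy - wy
        vn = rvx * nx + rvy * ny
        if vn < 0:  # ball is moving into the wall
            tx, ty = rvx - vn * nx, rvy - vn * ny
            rvx = -RESTITUTION * vn * nx + (1 - WALL_FRICTION) * tx
            rvy = -RESTITUTION * vn * ny + (1 - WALL_FRICTION) * ty
            vx, vy = wx + rvx, wy + rvy

        # Push the ball back so it no longer overlaps the wall.
        x += (ball_r - dist) * nx
        y += (ball_r - dist) * ny

    return (x, y), (vx, vy), angle


def simulate(steps, dt=1 / 120):
    """Run the sketch from an arbitrary starting state and return the final state."""
    pos, vel, angle = (0.0, 0.0), (1.0, 0.0), 0.0
    for _ in range(steps):
        pos, vel, angle = step(pos, vel, angle, dt)
    return pos, vel, angle
```

A browser version would call something like `step` once per animation frame and draw the hexagon and ball from the returned state. The detail that tends to separate a convincing result from an unconvincing one is the wall velocity term: without it, the hexagon rotates visually but never actually pushes the ball.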