Meta Llama 4 Accused of "Cheating": Aces Arena Tests but Fumbles in Real-World Use

Meta Llama 4 AI controversy: Top benchmark scores but real-world coding struggles. Is it benchmark-optimized?

Apr 07, 2025


Meta’s stumble came out of nowhere.

Last Saturday, Meta unveiled its latest AI model series, Llama 4, announcing three versions at once: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth.

Related: "Meta Opens Llama 4! First MoE Model, 10M Context, Beats DeepSeek" by Meng Li, April 6, 2025

According to Meta's official announcement, the models' rankings in the large model arena (the LMArena leaderboard) are impressive.

Take Llama 4 Maverick, for example—it’s ranked second overall, becoming the fourth model to break the 1400-point barrier.
