OpenAI o3 Medium: The New "Cost-Effective King"? ARC-AGI Results Show Double Score at 1/20 Cost
OpenAI's o3 model doubles ARC-AGI scores at 1/20 cost! New benchmark results reveal its cost-performance dominance. Is this the AI efficiency king?
OpenAI Releases o3 in April: Double the Second-Place Score at Only 1/20 the Cost?!
The latest results for o3 (Medium) on the notoriously difficult ARC-AGI reasoning benchmark came as a genuine surprise.
According to the official ARC Prize announcement, the key conclusions from this round of testing are as follows:
o3 (Medium) scored 57% on ARC-AGI-1 at a cost of $1.50 per task, outperforming all known Chain-of-Thought (CoT) reasoning models.
o4-mini (Medium) scored 42% on ARC-AGI-1, with a cost of $0.23 per task, showing lower accuracy but a significant cost advantage.
On the more difficult ARC-AGI-2, both models scored below 3%.
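To make the cost-versus-accuracy trade-off concrete, here is a minimal Python sketch that works only from the two ARC-AGI-1 figures quoted above. The "points per dollar" metric and the names `Result`, `points_per_dollar`, and `cost_ratio` are illustrative assumptions of ours, not something ARC Prize reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One ARC-AGI-1 result as quoted in the article."""
    model: str
    score_pct: float          # accuracy on ARC-AGI-1, in percent
    cost_per_task_usd: float  # reported cost per task, in US dollars

    def points_per_dollar(self) -> float:
        # Illustrative metric (our assumption): percentage points of
        # accuracy obtained per dollar spent on a single task.
        return self.score_pct / self.cost_per_task_usd


# Figures taken directly from the results listed above.
O3_MEDIUM = Result("o3 (Medium)", 57.0, 1.5)
O4_MINI_MEDIUM = Result("o4-mini (Medium)", 42.0, 0.23)


def cost_ratio(a: Result, b: Result) -> float:
    """How many times more expensive `a` is per task than `b`."""
    return a.cost_per_task_usd / b.cost_per_task_usd


if __name__ == "__main__":
    for r in (O3_MEDIUM, O4_MINI_MEDIUM):
        print(f"{r.model}: {r.points_per_dollar():.1f} points per dollar")
    ratio = cost_ratio(O3_MEDIUM, O4_MINI_MEDIUM)
    print(f"o3 costs about {ratio:.1f}x as much per task as o4-mini")
```

On these numbers, o3 (Medium) costs roughly 6.5 times as much per task as o4-mini (Medium) while scoring 15 percentage points higher. A simple ratio like this ignores that accuracy gains get harder as scores rise, so treat it as a rough comparison rather than a ranking.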