Qwen Breaks Record: 2000+ Tokens/Second Open-Source Model

World's fastest open-source AI model K2 Think hits 2000+ tokens/sec. Built on Qwen 2.5-32B, it excels in math reasoning with 32B parameters. Try it now!

Meng Li

Sep 10, 2025

MBZUAI and G42 Launch K2 Think: A Leading Open-Source System for Advanced AI Reasoning - PR Newswire APAC

The world's fastest open-source large model has arrived, reaching speeds of more than 2,000 tokens per second!

It has just 32 billion parameters (32B), yet its throughput is more than 10 times that of typical GPU deployments.

This is K2 Think, launched through a collaboration between Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in the UAE and startup G42 AI.

Does the name sound familiar?

That's right, it does have a slight naming collision with Moonshot AI's recently released Kimi K2, though the UAE version adds "Think" to the name.

More interesting still, K2 Think really does have a "made in China" flavor behind it: it is built on Qwen 2.5-32B.
