Qwen Breaks Record: 2000+ Tokens/Second Open-Source Model
World's fastest open-source AI model K2 Think hits 2000+ tokens/sec. Built on Qwen 2.5-32B, it excels in math reasoning with 32B parameters. Try it now!
The world's fastest open-source large model has arrived, generating more than 2,000 tokens per second!
With just 32 billion parameters (32B), it delivers more than 10 times the throughput of a typical GPU deployment.
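To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. The 2,000 tokens/sec rate comes from the claim above; the GPU baseline of 200 tokens/sec is only what "more than 10 times" implies as an upper bound, and the 32,000-token response length is a hypothetical example of a long reasoning trace, not a reported figure.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to emit num_tokens at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


K2_THINK_TPS = 2000.0                  # rate quoted in the article
GPU_BASELINE_TPS = K2_THINK_TPS / 10   # assumption: implied by "more than 10 times"
RESPONSE_TOKENS = 32_000               # hypothetical long reasoning response

fast = generation_seconds(RESPONSE_TOKENS, K2_THINK_TPS)
slow = generation_seconds(RESPONSE_TOKENS, GPU_BASELINE_TPS)
print(f"{fast:.0f} s vs {slow:.0f} s")  # prints "16 s vs 160 s"
```

Under these assumptions, a response that would take well over two minutes on the baseline finishes in about 16 seconds, which is why throughput matters so much for reasoning models that produce long outputs.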
This is K2 Think, launched jointly by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in the UAE and the Abu Dhabi-based technology group G42.
Does the name sound familiar?
That's right: it nearly shares a name with Moonshot AI's recently released Kimi K2, though the UAE model adds "Think" to the name.
More interesting still, K2 Think really does have a "made in China" flavor behind it.