AI Disruption

Alibaba’s QwQ-32B: 1/20 the Parameters, DeepSeek R1-Level Performance

Discover Alibaba's QwQ-32B: a 32-billion-parameter reasoning model that delivers DeepSeek R1-level performance through scaled-up reinforcement learning.

Meng Li
Mar 06, 2025
[Image: QwQ-32B-Preview, the reasoning model from the Qwen team — demo available for free on HuggingChat (DigiAlps LTD)]

Today, Alibaba open-sourced its new reasoning model, QwQ-32B. It has 32 billion parameters, yet its performance rivals the full 671-billion-parameter version of DeepSeek-R1.
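The "1/20" in the headline follows directly from the two parameter counts: 671B divided by 32B is about 21, so QwQ-32B uses roughly one-twentieth the parameters of the full DeepSeek-R1. A quick check:

```python
# Parameter counts (in billions) as stated in the article
deepseek_r1_params = 671  # full DeepSeek-R1
qwq_params = 32           # QwQ-32B

ratio = deepseek_r1_params / qwq_params
print(f"DeepSeek-R1 is ~{ratio:.0f}x larger")  # ~21x, i.e. roughly 1/20 the parameters
```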

On X, the Qwen team stated:
"This time, we explored methods to scale RL, and building on our Qwen2.5-32B, we achieved some impressive results. We found that RL training can continuously improve performance, especially on math and coding tasks, and we observed that continued scaling of RL can help medium-sized models achieve performance comparable to that of giant MoE models. Feel free to chat with our new model and give us feedback!"

This post is for paid subscribers

Already a paid subscriber? Sign in
© 2025 Meng Li
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share

Copy link
Facebook
Email
Notes
More