AI Disruption

Alibaba’s QwQ-32B: 1/20 the Parameters, DeepSeek R1-Level Performance

Discover Alibaba's QwQ-32B: a 32-billion-parameter reasoning model that delivers DeepSeek R1-level performance through scaled-up reinforcement learning.

Meng Li
Mar 06, 2025
[Image: QwQ-32B-Preview, the reasoning model from the Qwen team — demo available for free on HuggingChat (DigiAlps LTD)]

Today, Alibaba open-sourced its new reasoning model, QwQ-32B. It has 32 billion parameters, yet its performance rivals the full 671-billion-parameter version of DeepSeek-R1.
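The "1/20" in the headline follows directly from the two parameter counts: 671B divided by 32B is about 21, so QwQ-32B uses roughly one-twentieth the parameters of the full DeepSeek-R1. A quick check:

```python
# Parameter counts (in billions) as stated in the article
deepseek_r1_params = 671  # full DeepSeek-R1
qwq_params = 32           # QwQ-32B

ratio = deepseek_r1_params / qwq_params
print(f"DeepSeek-R1 is ~{ratio:.0f}x larger")  # ~21x, i.e. roughly 1/20 the parameters
```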

On X, the Qwen team stated:
"This time, we explored methods to scale RL, and building on our Qwen2.5-32B, we achieved some impressive results. We found that RL training can continuously improve performance, especially on math and coding tasks, and we observed that continued scaling of RL can help medium-sized models achieve performance comparable to that of giant MoE models. Feel free to chat with our new model and give us feedback!"

This post is for paid subscribers

Already a paid subscriber? Sign in
© 2025 Meng Li
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share

Copy link
Facebook
Email
Notes
More