AI Disruption

Llama Version o1 Released, Based on AlphaGo Zero Paradigm

Explore the open-source O1 project, recreating OpenAI’s model with AlphaGo Zero, Monte Carlo Tree Search, and LLaMA for advanced AI reasoning.

Meng Li
Nov 05, 2024

Reproducing OpenAI's o1 Reasoning Model: Latest Progress in the Open-Source Community

The LLaMA-based O1 project has just been released by the Shanghai AI Lab team.

The project description explicitly mentions Monte Carlo Tree Search (MCTS), self-play reinforcement learning, PPO, and AlphaGo Zero's dual-policy paradigm (a prior policy plus a value estimate).
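To make the "prior policy plus value estimate" idea concrete, here is a minimal sketch of the PUCT selection rule that AlphaGo Zero uses inside MCTS. It is not taken from the LLaMA-O1 code: the `Node` class, the function names, and the `c_puct` value of 1.5 are all hypothetical choices for illustration, and the backup step assumes a single-agent setting (no sign flip between players).

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One edge/state in the search tree (hypothetical structure)."""
    prior: float                 # P(s, a): probability assigned by the policy
    visit_count: int = 0         # N(s, a): how often this child was visited
    value_sum: float = 0.0       # W(s, a): sum of values backed up through it
    children: dict = field(default_factory=dict)  # action -> Node

    def q_value(self) -> float:
        # Mean backed-up value; defined as 0 for unvisited nodes.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    # Q(s, a) + c_puct * P(s, a) * sqrt(sum_b N(s, b)) / (1 + N(s, a))
    total_visits = sum(c.visit_count for c in parent.children.values())
    exploration = (
        c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visit_count)
    )
    return child.q_value() + exploration


def select_child(parent: Node, c_puct: float = 1.5):
    # Return the (action, child) pair with the highest PUCT score.
    return max(
        parent.children.items(),
        key=lambda item: puct_score(parent, item[1], c_puct),
    )


def backup(path, value: float) -> None:
    # Propagate a value estimate up the visited path.
    # Assumption: single-agent reasoning, so the value is not negated per level.
    for node in path:
        node.visit_count += 1
        node.value_sum += value
```

The prior steers the search toward steps the policy considers likely, while the visit-count term shrinks that bonus as a branch is explored, so the value estimates gradually take over. In an LLM setting, an "action" could be a candidate reasoning step, though how the project actually defines actions is not stated in the description.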

Note that they mention AlphaGo Zero, not AlphaGo! That suggests the approach does not depend on human-generated data and instead relies on pure RL over a known set of "rules"!

If true, this could be even bigger than o1!
