AI Disruption

LLaMA-Based o1 Reproduction Released, Built on the AlphaGo Zero Paradigm

Explore an open-source project that aims to recreate OpenAI’s o1 model using the AlphaGo Zero paradigm, Monte Carlo Tree Search, and LLaMA for advanced AI reasoning.

Meng Li
Nov 05, 2024

Reproducing OpenAI's o1 Reasoning Model: Latest Progress in the Open-Source Community

The LLaMA-based O1 project has just been released by the Shanghai AI Lab team.

The project description explicitly mentions Monte Carlo Tree Search (MCTS), self-play reinforcement learning, PPO, and AlphaGo Zero’s dual policy paradigm (a prior policy plus a value estimate).
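To make the "prior policy plus value estimate" idea concrete, here is a minimal sketch of the PUCT selection rule that AlphaGo Zero-style tree search uses to combine the two signals. It is illustrative only and is not taken from the LLaMA-based project: the `Node` class, the function names, and the `c_puct` default are all hypothetical, and the backup step assumes a single-agent setting (such as step-by-step reasoning) with no sign flip between players.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node (hypothetical structure for illustration)."""
    prior: float                 # P(s, a): probability assigned by the policy
    visit_count: int = 0         # N(s, a): how often search passed through here
    value_sum: float = 0.0       # running sum of backed-up value estimates
    children: dict = field(default_factory=dict)

    def q_value(self) -> float:
        # Mean backed-up value; 0.0 for a node that was never visited.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    # Exploitation term (mean value) plus an exploration bonus that is
    # scaled by the policy prior and shrinks as the child gets visited.
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.q_value() + bonus


def select_child(parent: Node, c_puct: float = 1.5):
    # Return the (action, child) pair with the highest PUCT score.
    return max(parent.children.items(),
               key=lambda kv: puct_score(parent, kv[1], c_puct))


def backup(path, value: float) -> None:
    # Propagate a leaf's value estimate to every node on the search path.
    # Assumption: single-agent search, so the value keeps its sign.
    for node in path:
        node.visit_count += 1
        node.value_sum += value
```

In a full system the priors and leaf values would come from the language model (or a model trained alongside it), and the visit counts gathered by search would serve as training targets; that loop is omitted here.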

Note that they cite AlphaGo Zero, not AlphaGo! AlphaGo Zero learned without human game data, purely through self-play RL under a known set of "rules", which suggests this project likewise aims not to depend on human-generated data.

This is even bigger than o1! Big, if true!
