AI Disruption
Paper
Dawn of Practical Quantum Chemistry: ByteQC Toolkit by ByteDance Research
Accelerate quantum chemistry with ByteQC – ByteDance Research’s GPU-powered toolkit delivering gold standard accuracy for large-scale molecular…
Mar 6 • Meng Li
2024 Turing Award Honors Reinforcement Learning Father Richard Sutton and Mentor Andrew Barto
Reinforcement Learning pioneers Andrew Barto and Richard Sutton win the 2024 ACM Turing Award for groundbreaking contributions in AI.
Mar 5 • Meng Li
DeepSeek R1 Technology Successfully Migrates to the Multimodal Domain, Fully Open Sourced
Discover Visual-RFT—an open-source breakthrough that extends DeepSeek-R1’s rule-based reinforcement learning to vision-language models for efficient…
Mar 4 • Meng Li
DeepSeek's GRPO: Complete From-Scratch Implementation
Discover how to implement GRPO from scratch using Qwen2.5-1.5B-Instruct in this comprehensive distributed RL tutorial to boost model performance and…
Mar 2 • Meng Li
Muon Optimizer: 48% Less Compute Than AdamW, Compatible with DeepSeek
Discover how Moonshot AI's improved Muon optimizer cuts computational requirements by 48% relative to AdamW, scales to larger models, and is…
Feb 23 • Meng Li
MoBA Attention by Kimi Yang: DeepSeek NSA Collision & Code Release
Discover the MoBA attention mechanism, an advanced approach combining MoE and FlashAttention for efficient long-sequence processing in large language…
Feb 20 • Meng Li
OpenAI Launches Million-Dollar AI Programming Test: Claude 3.5 Tops the Benchmark!
OpenAI introduces SWE-Lancer, a million-dollar AI benchmark testing real-world software engineering tasks. Claude 3.5 leads the way, setting new…
Feb 19 • Meng Li
DeepSeek's Liang Wenfeng Unveils NSA: A Game-Changing Attention Architecture
DeepSeek's NSA introduces a fast, hardware-aligned sparse attention mechanism for efficient long-context training and inference in large models.
Feb 18 • Meng Li
DeepSeek Launches CODEI/O: Enhancing Large Model Inference with Thought Chains
DeepSeek's CODEI/O dataset enhances model reasoning by transforming code into natural-language chains of thought, improving performance across various…
Feb 17 • Meng Li
MakeAnything Unlocks Multi-task Process Generation with Diffusion Transformer
MakeAnything combines Diffusion Transformer and asymmetric LoRA to unlock cross-domain, high-quality multi-task process generation, achieving…
Feb 16 • Meng Li
Apple Discovers Model Distillation Scaling Law! Stronger Teacher Models Aren't Always Better
Apple researchers uncover a new distillation scaling law, optimizing computational resources for better model performance. Learn how to improve AI…
Feb 14 • Meng Li
ByteDance Proposes Dense Video Multimodal Large Model Sa2VA
Sa2VA, developed by ByteDance and Peking University, integrates SAM-2 and LLaVA to achieve advanced spatiotemporal video and image understanding…
Feb 12 • Meng Li