30 Latest AI Open-Source Projects of the Week (2025.1.6-2025.1.12)
Discover 30 cutting-edge AI open-source projects, including Dolphin 3.0, CAG, and RadioDiff, with advanced models for tasks like reinforcement learning, multimodal interaction, and edge deployment.
I’m sharing some interesting AI open-source models and frameworks from this week (2025.1.6-2025.1.12). There are 30 projects in total.
Project: Dolphin 3.0
Dolphin 3.0 is the next-generation instruction-tuned model in the Dolphin series, designed to be the ultimate general-purpose local model, supporting coding, math, agentic tasks, function calling, and general use.
While it aims to be a general-purpose assistant in the spirit of ChatGPT, Claude, and Gemini, Dolphin gives users full control over system prompts, model versions, and data privacy, without imposing any ethical guidelines or restrictions.
https://huggingface.co/cognitivecomputations/Dolphin3.0-Llama3.2-1B
https://huggingface.co/cognitivecomputations/Dolphin3.0-Llama3.2-3B
https://huggingface.co/cognitivecomputations/Dolphin3.0-Qwen2.5-0.5B
https://huggingface.co/cognitivecomputations/Dolphin3.0-Qwen2.5-1.5B
https://huggingface.co/cognitivecomputations/Dolphin3.0-Llama3.1-8B
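To illustrate that control, here is a minimal sketch of loading the smallest variant listed above with Hugging Face transformers and supplying your own system prompt. The model ID comes from the links above; the system prompt, the example question, and the generation settings are placeholder assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/Dolphin3.0-Llama3.2-1B"  # smallest variant from the links above
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    # the system prompt is entirely user-defined -- placeholder text here
    {"role": "system", "content": "You are a concise coding assistant."},
    {"role": "user", "content": "Write a Python one-liner that reverses a string."},
]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tok.decode(out[0, inputs.shape[1]:], skip_special_tokens=True))
```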
Project: SmallThinker-3B-Preview
SmallThinker-3B-Preview is a new model fine-tuned from Qwen2.5-3B-Instruct.
It is designed for edge deployment on resource-constrained devices and also serves as a fast, efficient draft model for the larger QwQ-32B-Preview. The model performs strongly across a range of benchmarks, particularly in mathematics and science.
https://huggingface.co/PowerInfer/SmallThinker-3B-Preview
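The draft-model role maps naturally onto assisted (speculative) generation in transformers, where the small model proposes tokens and the large model verifies them. Below is a minimal sketch, assuming SmallThinker-3B-Preview and Qwen/QwQ-32B-Preview share a compatible tokenizer; the prompt and generation settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "Qwen/QwQ-32B-Preview"                 # large target model (assumed repo ID)
draft_id = "PowerInfer/SmallThinker-3B-Preview"    # small draft model from the link above

tok = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id, torch_dtype=torch.bfloat16, device_map="auto")
draft = AutoModelForCausalLM.from_pretrained(draft_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is the derivative of x^3 * sin(x)?"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(target.device)

# assistant_model turns on assisted generation: the draft proposes, the target verifies
out = target.generate(prompt, assistant_model=draft, max_new_tokens=256)
print(tok.decode(out[0, prompt.shape[1]:], skip_special_tokens=True))
```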
Project: CAG
Cache-Augmented Generation (CAG) is a new paradigm for enhancing language models: all relevant resources are preloaded into the model's context and the resulting runtime state (the key-value cache) is cached, bypassing the need for real-time retrieval.
CAG leverages the extended context windows of modern large language models to generate responses directly at inference time, without a retrieval step. The approach aims to reduce latency, improve reliability, and simplify system design.
https://github.com/hhhuang/CAG
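Conceptually, the cache is built once and reused for every query. The sketch below shows the general pattern with Hugging Face transformers; the model ID and document file are placeholders, and the cache handling in the actual CAG repository may differ in detail.

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

model_id = "meta-llama/Llama-3.1-8B-Instruct"      # placeholder long-context chat model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# 1) Preload all relevant knowledge once and fill the key-value cache.
knowledge = open("docs.txt").read()                # placeholder knowledge collection
prefix = f"Use the following documents to answer questions.\n{knowledge}\n"
prefix_ids = tok(prefix, return_tensors="pt").to(model.device)
prompt_cache = DynamicCache()
with torch.no_grad():
    model(**prefix_ids, past_key_values=prompt_cache, use_cache=True)  # fills the cache in place

# 2) Answer each query by reusing the cached prefix -- no retrieval step.
def answer(question: str) -> str:
    ids = tok(prefix + question, return_tensors="pt").to(model.device)
    cache = copy.deepcopy(prompt_cache)            # keep the original cache intact for later queries
    out = model.generate(ids.input_ids, past_key_values=cache,
                         max_new_tokens=128, do_sample=False)
    return tok.decode(out[0, ids.input_ids.shape[1]:], skip_special_tokens=True)

print(answer("Question: What is the main topic of the documents?\nAnswer:"))
```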
Project: RadioDiff
RadioDiff is an effective generative diffusion model for sampling-free dynamic radio map construction.
The project formulates radio map construction as a conditional generation problem and introduces a denoising diffusion method to improve construction quality.
It employs an attention-based U-Net with an adaptive fast Fourier transform (FFT) module as the backbone network to better extract features from dynamic environments, and uses a decoupled diffusion model to further improve radio map construction performance.
https://github.com/UNIC-Lab/RadioDiff
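To give a feel for what a frequency-domain module of this kind can look like, here is a generic PyTorch sketch of a learnable FFT filter block. It is only a conceptual illustration of adaptive spectral re-weighting, not RadioDiff's actual module; the class name, shapes, and residual connection are all assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveFFTBlock(nn.Module):
    """Conceptual frequency-domain filter (illustrative, not RadioDiff's code):
    FFT the feature map, re-weight the spectrum with learnable per-channel
    gains, inverse-FFT back, and add a residual connection."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # one learnable complex gain per channel and frequency bin (real/imag parts)
        self.weight = nn.Parameter(torch.randn(channels, height, width // 2 + 1, 2) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, C, H, W)
        spec = torch.fft.rfft2(x, norm="ortho")             # complex spectrum (B, C, H, W//2+1)
        spec = spec * torch.view_as_complex(self.weight)    # adaptive re-weighting
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho") + x

# quick shape check
block = AdaptiveFFTBlock(channels=64, height=32, width=32)
y = block(torch.randn(2, 64, 32, 32))                       # -> (2, 64, 32, 32)
```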
Project: SparseViT
SparseViT is a sparse-coding-based Transformer for image manipulation localization, designed to distinguish semantic from non-semantic features and adaptively extract the non-semantic features that matter most for locating manipulated regions.
The project provides a novel method for precisely identifying manipulated regions in images.
https://github.com/scu-zjz/SparseViT
Project: MiniPerplx
MiniPerplx is a minimalist AI-powered search engine designed to help users find information on the internet.
It builds on the Vercel AI SDK and multiple external APIs to provide a diverse set of search capabilities, including web search, search within a specific URL, weather queries, code execution, map-based location lookup, text translation, video search, academic paper search, product search, and more.
Project: Cosmos
Cosmos is a world model development platform designed for physical AI. It consists of world foundation models, tokenizers, and video processing pipelines, aimed at accelerating physical-AI development for robotics and autonomous-driving labs.
The Cosmos library allows end-users to run Cosmos models, execute inference scripts, and generate videos.