The First Pure Attention-Free Large Model Surpasses Open-Source Giant Llama 3.1

Falcon Mamba 7B: a new open-source model challenging the Transformer, handling unlimited sequence lengths on a single GPU, and now outperforming Llama 3.1.

Meng Li
Aug 13, 2024

The Mamba architecture challenges the Transformer once again.

Is Mamba finally ready to stand on its own? Since its debut in December 2023, Mamba has been a strong competitor to the Transformer.

Since then, models built on the Mamba architecture have continued to emerge. For example, Mistral released Codestral Mamba 7B, the first open-source large model based on Mamba.

Today, the Technology Innovation Institute (TII) in Abu Dhabi released a new open-source Mamba model, Falcon Mamba 7B.

The highlights of Falcon Mamba 7B: it can process sequences of arbitrary length with no growth in memory usage, and it runs on a single 24GB A10 GPU.
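
To see why constant memory matters, here is a back-of-the-envelope comparison (a minimal sketch; the layer counts and dimensions below are illustrative assumptions, not Falcon Mamba's published configuration). A Transformer's KV cache grows linearly with sequence length, while a Mamba-style state space model carries a fixed-size recurrent state no matter how many tokens it has seen:

```python
# Back-of-the-envelope memory comparison (illustrative numbers only;
# not Falcon Mamba's actual configuration).

BYTES_FP16 = 2  # bytes per parameter in half precision

def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128):
    """Transformer KV cache: two tensors (K and V) per layer,
    each of shape [seq_len, n_heads, head_dim]."""
    return 2 * n_layers * seq_len * n_heads * head_dim * BYTES_FP16

def ssm_state_bytes(n_layers=32, d_model=4096, state_dim=16):
    """Mamba-style recurrent state: fixed size per layer,
    independent of how many tokens have been processed."""
    return n_layers * d_model * state_dim * BYTES_FP16

for seq_len in (1_000, 100_000, 1_000_000):
    kv_gb = kv_cache_bytes(seq_len) / 1e9
    ssm_mb = ssm_state_bytes() / 1e6
    print(f"{seq_len:>9} tokens: KV cache ~ {kv_gb:8.2f} GB, "
          f"SSM state ~ {ssm_mb:.0f} MB (constant)")
```

Under these assumed dimensions, the KV cache passes hundreds of gigabytes at a million tokens, while the recurrent state stays at a few megabytes, which is what makes single-GPU inference over very long sequences plausible.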

Falcon Mamba 7B is now available on Hugging Face. It is a causal decoder-only model built on the novel Mamba State Space Language Model (SSLM) architecture and targets a range of text generation tasks.
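
Since the weights are on Hugging Face, a quick generation test might look like the sketch below (assuming a recent transformers release with Falcon Mamba support and the accelerate package for device placement; tiiuae/falcon-mamba-7b is the model id listed on the Hub):

```python
# Minimal text-generation sketch for Falcon Mamba 7B.
# Assumes a recent `transformers` release with Falcon Mamba support,
# `accelerate` installed, and enough GPU memory (TII cites a single
# 24GB A10 as sufficient).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"  # model id as listed on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a 24GB GPU
    device_map="auto",
)

prompt = "State space models differ from attention in that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```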

In terms of performance, Falcon Mamba 7B outperforms other leading models of similar size on several benchmarks, including Meta's Llama 3 8B, Llama 3.1 8B, and Mistral 7B.
