AI Disruption

OpenAI Puts GPT-5-Level Reasoning Into Voice Model

OpenAI’s new voice models cut live translation cost to $0.034 per minute.

Meng Li
May 08, 2026



OpenAI has launched three new real-time voice models that bring GPT-5-level reasoning to voice and deal a heavy blow to the simultaneous interpretation industry:

Simultaneous interpretation that keeps pace with the speaker now costs just $0.034 per minute.
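To put that rate in perspective, here is a rough back-of-the-envelope sketch of what it implies for longer sessions. It assumes flat per-minute billing at the quoted $0.034 with no minimums, rounding rules, or other fees, none of which the article confirms; the `translation_cost` helper is a hypothetical name used only for illustration.

```python
from decimal import Decimal

# Per-minute price quoted in the article (USD).
RATE_PER_MINUTE = Decimal("0.034")


def translation_cost(minutes: int) -> Decimal:
    """Return the USD cost of `minutes` of translated audio.

    Assumption: billing is a flat per-minute rate with no minimum charge.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return RATE_PER_MINUTE * minutes


if __name__ == "__main__":
    for label, minutes in [("1 hour", 60), ("8-hour day", 8 * 60)]:
        print(f"{label}: ${translation_cost(minutes):.2f}")
```

Under those assumptions, an hour of live translation comes to about $2.04 and a full eight-hour day to about $16.32.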

The three models are GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. Together they offer end-to-end speech reasoning, simultaneous translation, and streaming transcription through a single API.

The early demos are striking.

OpenAI employee Jason Liu spoke English into the microphone, and GPT-Realtime-Translate converted it into Japanese in real time.

The entire process is fully streaming — translation begins immediately without waiting for the speaker to finish a complete sentence.

Claire Vo paired ChatPRD with GPT-Realtime-2 and spoke into the microphone: “Help me write a product requirements document.”
