AI Disruption

OpenAI Ushers in the Era of Voice Agents with API Pricing as Low as $0.015 per Minute

OpenAI launches next-gen audio models: Speech-to-text & text-to-speech APIs starting at $0.015/min. Build powerful voice agents with enhanced accuracy & steerability.

Meng Li
Mar 21, 2025



OpenAI has announced a new generation of audio models in its API, covering both speech-to-text and text-to-speech, that make it easier for developers to build powerful voice agents.

The key highlights of the new models:

  • gpt-4o-transcribe (Speech-to-Text): Significantly lower Word Error Rate (WER), outperforming the existing Whisper model on multiple benchmarks.

  • gpt-4o-mini-transcribe (Speech-to-Text): A smaller version of gpt-4o-transcribe that is faster and more efficient.

  • gpt-4o-mini-tts (Text-to-Speech): Supports "steerability" for the first time, so developers can specify not only what to say but also how to say it.
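
To make the "what to say" versus "how to say it" split concrete, here is a minimal Python sketch that builds a text-to-speech request body without sending it. The field names (`input`, `voice`, `instructions`), the default voice `coral`, and both helper functions are assumptions for illustration and are not confirmed by this post. The cost helper simply applies the $0.015 per minute headline rate; the post does not say which model that rate belongs to.

```python
import json

# Headline rate quoted in this post; the model it applies to is not stated here.
HEADLINE_RATE_USD_PER_MIN = 0.015


def build_tts_request(text, instructions, voice="coral"):
    """Build a JSON-serializable body for a text-to-speech request.

    `input` carries what to say; `instructions` carries how to say it.
    Field names and the default voice are assumptions (hypothetical here).
    """
    if not text:
        raise ValueError("text must be non-empty")
    return {
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,
        "instructions": instructions,
    }


def estimate_cost_usd(minutes, rate=HEADLINE_RATE_USD_PER_MIN):
    """Estimate audio cost at a flat per-minute rate, rounded to 4 decimals."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return round(minutes * rate, 4)


if __name__ == "__main__":
    body = build_tts_request(
        "Your order has shipped.",
        "Speak in a calm, friendly customer-support tone.",
    )
    print(json.dumps(body, indent=2))
    print(estimate_cost_usd(60))  # one hour at the headline rate
```

Keeping the spoken text and the delivery instructions in separate fields means the same sentence can be re-rendered in a different tone by changing only `instructions`.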

© 2025 Meng Li