AI Disruption

OpenAI Releases Full-Powered o1 Model API with 60% Cost Reduction

OpenAI launches o1 model API with 60% cost reduction, advanced visual features, WebRTC support for real-time APIs, and preference fine-tuning for custom models.

Meng Li
Dec 18, 2024

[Image: OpenAI o1 API Access, WebRTC, Go, Java and more - Geeky Gadgets]

On Day 9 of OpenAI's 12-day release event, the o1 model API was officially launched, alongside a major upgrade to the real-time API that now supports WebRTC.

Compared with the previous preview version, the o1 model reduces reasoning costs by 60% and adds advanced vision capabilities. GPT-4o audio prices have also been cut by 60%, and the audio price for the GPT-4o mini version has dropped tenfold.
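The WebRTC flow for the real-time API works by minting a short-lived session token on your server, which the browser then uses during the WebRTC SDP exchange. A minimal server-side sketch is below; the endpoint URL, model name, and body fields follow OpenAI's announcement but should be verified against the current documentation before use:

```python
# Hedged sketch: builds (but does not send) the POST request that mints an
# ephemeral client token for the WebRTC Realtime API. The endpoint, model
# name, and "voice" field are assumptions based on OpenAI's announcement.
import json
import urllib.request


def ephemeral_session_request(
    api_key: str, model: str = "gpt-4o-realtime-preview"
) -> urllib.request.Request:
    """Assemble the POST that requests a short-lived Realtime session token.

    The JSON response (not fetched here) would contain a client secret the
    browser can use directly in its WebRTC handshake, so the long-lived API
    key never leaves the server.
    """
    body = json.dumps({"model": model, "voice": "verse"}).encode()
    return urllib.request.Request(
        "https://api.openai.com/v1/realtime/sessions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = ephemeral_session_request("sk-your-key-here")
    print(req.get_method(), req.full_url)
```

To actually mint a token you would pass this request to `urllib.request.urlopen` (or use an HTTP client of your choice) with a real API key.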

Additionally, OpenAI introduced a new preference fine-tuning method. By optimizing directly on pairs of preferred and non-preferred responses, large models can better match users' preferred style.
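Preference fine-tuning consumes pairwise comparison data. The JSONL record shape sketched below (`input` / `preferred_output` / `non_preferred_output`) is my understanding of OpenAI's published preference fine-tuning format at the time of writing; check the current fine-tuning docs before building a dataset around it:

```python
# Hedged sketch: serializes one preference-comparison example as a JSONL
# line. The field names are assumptions based on OpenAI's documented
# preference fine-tuning format and may change.
import json


def preference_record(prompt: str, preferred: str, rejected: str) -> str:
    """Build one training line: the model learns to favor `preferred`
    over `rejected` for the same user prompt."""
    record = {
        "input": {"messages": [{"role": "user", "content": prompt}]},
        "preferred_output": [{"role": "assistant", "content": preferred}],
        "non_preferred_output": [{"role": "assistant", "content": rejected}],
    }
    return json.dumps(record)


if __name__ == "__main__":
    line = preference_record(
        "Summarize the release in one sentence.",
        "OpenAI shipped the o1 API with 60% lower reasoning costs.",
        "OpenAI released something new today. It is an AI model. It is new.",
    )
    print(line)
```

One such line per comparison, appended to a `.jsonl` file, would form the training set uploaded for a preference fine-tuning job.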

This update includes the following key features:

1. OpenAI o1 in the API:

The OpenAI o1 model is now officially available in the API for usage tier 5 developers. As the successor to OpenAI o1-preview, the o1 model is designed to handle complex, multi-step tasks with higher accuracy. The model offers the following key features:

  • Function Calls: Seamlessly connects the o1 model to external data and APIs.

  • Structured Output: Generates responses that reliably adhere to custom JSON schemas.

  • Developer Messages: Allows developers to specify instructions or context for the model, such as defining tone, style, and other behavior guidance.

  • Visual Capabilities: Can understand images, unlocking more applications in fields that require visual input, such as science, manufacturing, or coding.

  • Lower Latency: For a given request, the o1 model uses 60% fewer reasoning tokens than o1-preview.

  • Reasoning Effort Parameter: A new reasoning_effort API parameter that allows developers to control the model's thinking time before answering questions.
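The features above can be combined in a single request: a developer message sets the behavior, a JSON schema constrains the output, and `reasoning_effort` tunes thinking time. A minimal sketch, assuming the `openai` Python SDK's chat-completions interface (the helper function and schema here are illustrative, not part of the API):

```python
# Hedged sketch: assembles chat-completion kwargs for the o1 model, showing
# a developer message, structured output via JSON schema, and the new
# reasoning_effort parameter. Verify parameter names against current docs.
import json


def build_o1_request(question: str, schema: dict, effort: str = "medium") -> dict:
    """Assemble keyword arguments for an o1 chat-completion call."""
    return {
        "model": "o1",
        "reasoning_effort": effort,  # "low" | "medium" | "high"
        "messages": [
            # Developer messages replace system messages for o1-series models.
            {
                "role": "developer",
                "content": "Answer tersely; include units in every result.",
            },
            {"role": "user", "content": question},
        ],
        # Structured output: responses must conform to this JSON schema.
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "answer", "strict": True, "schema": schema},
        },
    }


if __name__ == "__main__":
    demo_schema = {
        "type": "object",
        "properties": {
            "value": {"type": "number"},
            "unit": {"type": "string"},
        },
        "required": ["value", "unit"],
        "additionalProperties": False,
    }
    kwargs = build_o1_request("How many grams are in 2.5 kg?", demo_schema)
    print(json.dumps(kwargs, indent=2))
```

With the SDK installed and an API key configured, these kwargs would be passed as `client.chat.completions.create(**kwargs)` and the reply parsed with `json.loads(resp.choices[0].message.content)`.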
