KARGOLAPP

OpenAI Revolutionizes Interaction with New Voice Models

OpenAI has introduced next-generation voice intelligence features in its API. With this development, developers will be able to build applications that respond to users by voice, convert conversations to text, and translate speech instantly.

New Voice Models and Features

The most significant model the company introduced, GPT-Realtime-2, is a voice model designed to hold natural conversations with users. Compared with its predecessor, GPT-Realtime-1.5, the new model has a much stronger reasoning capacity. According to OpenAI, the improvement is meant to help it handle users' more complex requests.

GPT-Realtime-2 scored 15.2% higher on the Big Bench Audio tests. The model's context window has also grown from 32K to 128K, which lets it work more efficiently during long voice sessions. Another significant advantage is that it can make multiple tool calls simultaneously while keeping the user informed about the process.
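The idea of running several tool calls at once while keeping the user informed can be illustrated with a small sketch. This is not OpenAI's API or implementation: the tool functions (`check_calendar`, `lookup_weather`), the helper `run_tools_in_parallel`, and the `notify` callback are all hypothetical names, and the sketch only shows the general concurrency pattern using Python's standard `asyncio` module.

```python
import asyncio

# Hypothetical tools: stand-ins for whatever functions an application exposes.
async def check_calendar(day: str) -> str:
    await asyncio.sleep(0.02)  # simulate a slow network call
    return f"calendar for {day}: free after 14:00"

async def lookup_weather(city: str) -> str:
    await asyncio.sleep(0.01)  # simulate a slow network call
    return f"weather in {city}: clear"

async def run_tools_in_parallel(calls, notify):
    """Start every tool call at once and report progress as each one runs.

    `calls` is a list of (name, coroutine) pairs; `notify` is any callable
    that accepts a status string (assumed here to feed the spoken updates).
    """
    async def run_one(name, coro):
        notify(f"started {name}")
        result = await coro
        notify(f"finished {name}")
        return name, result

    # gather() schedules all calls concurrently and keeps the input order.
    pairs = await asyncio.gather(*(run_one(n, c) for n, c in calls))
    return dict(pairs)

def demo():
    messages = []
    calls = [
        ("check_calendar", check_calendar("Friday")),
        ("lookup_weather", lookup_weather("Izmir")),
    ]
    results = asyncio.run(run_tools_in_parallel(calls, messages.append))
    return results, messages

if __name__ == "__main__":
    results, messages = demo()
    print(messages)
    print(results)
```

Because both calls are started before either is awaited to completion, the total wait is roughly that of the slowest tool rather than the sum of both, and the status messages give the application something to relay to the user in the meantime.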

OpenAI also introduced a new translation feature called GPT-Realtime-Translate. It understands users' speech in real time and translates it simultaneously, with support for various languages. Users can listen to the translation and also see it as text output.

In addition, the company introduced GPT-Realtime-Whisper, which offers live transcription. This model provides low-latency transcription, which is especially useful for meetings and customer support.

All of these new models are integrated with the Realtime API, and OpenAI emphasizes that the features will benefit many fields, from education to media.

In short, OpenAI's new voice models are not only a technological advance; they also include protective measures against online abuse. The company announced that it has developed specific triggers to block harmful content.
