OpenAI launches GPT-Realtime-2, Translate, and Whisper for live voice apps
OpenAI says three Realtime API audio models—GPT‑Realtime‑2, GPT‑Realtime‑Translate, and GPT‑Realtime‑Whisper—support voice agents that reason, translate, and transcribe in real time.
Passed source freshness, duplicate, QA, and review checks before publishing. Main source freshness limit: 14 days.
- Source count: 1
- Primary sources: 1
- QA status: pass
Plain English
What this means in simple words
Instead of recording audio and sending it later, an app can stream audio to the API continuously and get speech-to-text, translation, and spoken replies while the person is still talking.
What happened
On May 7, 2026, OpenAI introduced three Realtime API audio models: GPT‑Realtime‑2 for voice interactions with stronger reasoning, GPT‑Realtime‑Translate for live speech translation, and GPT‑Realtime‑Whisper for streaming transcription.
Why it matters
Realtime voice apps often fail on long context, tool calls, or multilingual use. These models target lower-latency voice agents that can keep a conversation going while translating or transcribing, which can expand where voice interfaces are practical.
Key points
- GPT‑Realtime‑2 targets live conversations with stronger reasoning and longer context for agent workflows.
- GPT‑Realtime‑Translate supports live speech translation across 70+ input languages into 13 output languages.
- GPT‑Realtime‑Whisper provides low-latency streaming transcription priced per minute.
What to watch
Watch developer adoption in the Realtime API, how translation quality holds up in noisy settings, and whether voice agents reliably handle interruptions and tool calls in production.
Key terms
- Realtime API: An API pattern where audio is streamed continuously so models can respond while a conversation is happening.
- Streaming transcription: Speech-to-text that outputs partial text as someone speaks, reducing perceived latency.
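To make the difference between streaming and record-then-send concrete, here is a minimal Python sketch. It does not call OpenAI's API or any speech model: the function names (`stream_partials`, `batch_transcript`) and the text chunks are hypothetical stand-ins, and each chunk simply represents the words recognized from one short slice of audio.

```python
from typing import Iterable, Iterator


def stream_partials(chunks: Iterable[str]) -> Iterator[str]:
    """Yield a growing partial transcript after each incoming chunk.

    Assumption: each chunk is the text for one short slice of audio.
    A real system would run a speech model here instead.
    """
    words: list[str] = []
    for chunk in chunks:
        words.extend(chunk.split())
        # The caller sees text as soon as this slice is processed.
        yield " ".join(words)


def batch_transcript(chunks: Iterable[str]) -> str:
    """Return only the final text, as a record-then-upload flow would."""
    return " ".join(word for chunk in chunks for word in chunk.split())


if __name__ == "__main__":
    # Simulated audio slices, invented for illustration.
    simulated_chunks = ["book a", "table for", "two tonight"]
    for partial in stream_partials(simulated_chunks):
        print(partial)
```

Both functions end with the same text. The streaming version just lets an app show or act on words before the speaker has finished, which is where the lower perceived latency comes from.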
Sources
Source dates are original publication dates. The posted date above is when The AI Tea published this explanation.
- Advancing voice intelligence with new models in the API · OpenAI · Product release · Original source May 7, 2026 · Source age 1 day · Primary