GPT Realtime 2 just shipped. Our voice agents are already running it.

Leonardo Garcia-Curtis  ·  09/05/2026
TL;DR

GPT Realtime 2 shipped on 8 May 2026 with GPT-5 class reasoning, a 128,000 token context window (up from 32,000), five reasoning levels and async function calling that stops the line freezing while the agent looks something up. Big Bench Audio jumps from 81.4% to 96.6%. Audio MultiChallenge jumps from 34.7% to 48.5%. Function calling on ComplexFuncBench moves from 49.7% to 66.5%. Waboom AI is rolling outbound sales campaigns onto GPT Realtime 2 this week. Your Waboom AI per-minute rate does not change.

6 min read  ·  Operator notes from the call floor  ·  Last updated 9 May 2026

OpenAI shipped GPT Realtime 2 yesterday.

It is the first voice model OpenAI have released with GPT-5 class reasoning. Four times the context window of the model we ran on Tuesday. Smarter under interruption. Calls a tool mid-conversation without freezing the line on you.

If your voice agent has ever stalled on a CRM lookup mid-pitch, that bit is over.

We have been running GPT Realtime 1.5 across the Waboom AI call floor for months. As of this week, your outbound campaigns are moving onto 2.

Here is what actually changes on your cold call.

In this article

  • 1. What did OpenAI ship on 8 May 2026?
  • 2. How much smarter is 2 than 1.5?
  • 3. Why a 128k context window matters on a sales call
  • 4. Async function calling. The line stops freezing.
  • 5. Five reasoning levels. We tune per call.
  • 6. What this does to a real Waboom AI campaign
  • 7. The bottom line for an operator in May 2026
  • 8. Frequently Asked Questions
1

The Release

What did OpenAI ship on 8 May 2026?

Three new voice models in one drop.

GPT Realtime 2 is the flagship for voice agents. GPT Realtime Translate handles live translation across 70 input languages and 13 output languages. GPT Realtime Whisper is the live transcription model.

The flagship is the one that matters for your outbound sales. It is not a 1.5 patch. The benchmarks tell you that.

Figure: Bar chart showing GPT Realtime 2 vs GPT Realtime 1.5 on Big Bench Audio (96.6% vs 81.4%) and Audio MultiChallenge (48.5% vs 34.7%).

2

The Numbers

How much smarter is 2 than 1.5?

96.6% on Big Bench Audio. Up from 81.4%.

A 15.2 point jump on the headline audio reasoning test in one release. The model also scores 48.5% on Audio MultiChallenge (up from 34.7%) and 66.5% on ComplexFuncBench (up from 49.7%).
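If you want to sanity-check those jumps yourself, here is a minimal sketch that recomputes the point gain and the relative gain from the scores quoted above. The function name and the rounding to one decimal place are our own choices for illustration, not anything from OpenAI.

```python
# Benchmark scores quoted in this post, in percent: (GPT Realtime 1.5, GPT Realtime 2).
SCORES = {
    "Big Bench Audio": (81.4, 96.6),
    "Audio MultiChallenge": (34.7, 48.5),
    "ComplexFuncBench": (49.7, 66.5),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 1 dp."""
    points = new - old
    return round(points, 1), round(points / old * 100, 1)


for name, (old, new) in SCORES.items():
    points, relative = gains(old, new)
    print(f"{name}: +{points} points, +{relative}% relative to 1.5")
```

Point gain is the honest headline. Relative gain flatters the benchmarks that started low, so read the two together.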

What that maps to on your call: the agent stays on script when prospects throw curveballs. It answers the question that was actually asked, not the closest one in the prompt. It picks the right tool the first time.

Honest caveat for you. Those benchmarks were run at the top two reasoning settings (high and xhigh). Most production calls run at low, the default, for latency reasons.

Past 800ms a caller wonders if the line dropped. We covered that cliff in our LLM by job type breakdown. Even at low, GPT Realtime 2 carries a meaningful lift on 1.5.

This is also why the question is no longer "which model" but "which reasoning level for which intent". Same model, different brains for different jobs. We covered that meta shift in why voice agents get smarter every night.

3

The Context Window

Why a 128k context window matters on a sales call

GPT Realtime 1.5 had a 32,000 token context. GPT Realtime 2 has 128,000.

Four times more room to think.

If your provider was compressing the prospect record before each call, you have been losing context the model could have used.

Concretely on your Sydney vendor lead campaign: full prospect record, last six emails, last three calls, listing history, neighbouring sales, current motivation tags. All loaded in one context. All available to reason against in real time during a 30 second conversation.

Before, we had to compress. Pick the five most load-bearing fields. Hope the agent did not need the rest. Now we hand the model the whole HubSpot card and let it decide what is relevant.

That is the difference between an SDR who skim-read your brief and one who actually knows your lead.
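A rough way to see the difference is a budget check: does the whole prospect card fit in the window with room left for the live conversation? The sketch below is illustrative only. The 4 characters per token ratio is a crude assumption, not the model's tokenizer, and the field names, card sizes and 8,000 token reserve are all made up for the example.

```python
CONTEXT_1_5 = 32_000   # tokens, as quoted in this post
CONTEXT_2 = 128_000    # tokens, as quoted in this post
CHARS_PER_TOKEN = 4    # ASSUMPTION: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(sections: dict[str, str], window: int, reserve: int = 8_000) -> bool:
    """True if every section fits while leaving `reserve` tokens for the call itself."""
    used = sum(estimate_tokens(text) for text in sections.values())
    return used + reserve <= window


# Hypothetical prospect card. Field names and sizes are invented for illustration.
card = {
    "prospect_record": "x" * 20_000,
    "last_six_emails": "x" * 60_000,
    "last_three_calls": "x" * 90_000,
    "listing_history": "x" * 30_000,
}

print("fits in 32k window: ", fits(card, CONTEXT_1_5))
print("fits in 128k window:", fits(card, CONTEXT_2))
```

On these invented sizes the card overflows the old window and fits the new one with room to spare, which is the compress-or-not decision described above.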

4

Async Function Calling

Async function calling. The line stops freezing.

On 1.5, your agent said "let me check that for you" and the line went quiet. 700ms. 900ms. 1.4 seconds. Pickup rate killer. Hangup rate inflator.

On 2, your agent narrates while the lookup runs in the background.

"Let me pull that up. Yep, looking at your record now. I see you enquired about the Westmere listing on Tuesday."

Real conversational rhythm. No silence cliff.

Figure: Real estate agent walking down an Auckland street on a phone call while an AI voice agent dashboard shows a CRM lookup running in parallel, with a speech bubble that reads "let me check that for you".

This matters most on three jobs we run every day.

A mortgage broker call quoting a live rate from a panel mid-conversation. A real estate call checking a listing CRM while quoting price brackets. A customer support ticket looking up an account while the customer keeps talking.

Same job we have been doing on 1.5. Half the awkwardness on the line.
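The pattern behind that is plain concurrency: start the lookup, keep talking, use the result when it lands. Here is a minimal sketch in Python's asyncio. It does not call the Realtime API at all. `crm_lookup`, `handle_turn` and the `say` callback are hypothetical stand-ins, and the 50ms delay simulates a slow CRM.

```python
import asyncio
from typing import Callable


async def crm_lookup(prospect_id: str) -> dict:
    """Hypothetical stand-in for a slow CRM call."""
    await asyncio.sleep(0.05)  # simulated network delay
    return {"id": prospect_id, "listing": "Westmere", "enquired": "Tuesday"}


async def handle_turn(prospect_id: str, say: Callable[[str], None]) -> dict:
    """Kick off the lookup in the background and narrate while it runs."""
    lookup = asyncio.create_task(crm_lookup(prospect_id))
    say("Let me pull that up.")
    await asyncio.sleep(0)  # yield to the event loop so the lookup starts
    say("Yep, looking at your record now.")
    record = await lookup  # only block once there is nothing left to say
    say(f"I see you enquired about the {record['listing']} listing on {record['enquired']}.")
    return record


transcript: list[str] = []
asyncio.run(handle_turn("prospect-123", transcript.append))
print("\n".join(transcript))
```

The blocking version awaits the lookup before saying anything, which is where the silence comes from. Here the first two lines are spoken before the result is needed.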

5

Reasoning Levels

Five reasoning levels. We tune per call.

GPT Realtime 2 ships with five reasoning levels. Minimal, low, medium, high, xhigh. Low is the default. The benchmarks above were run at high and xhigh, which is the bit most launch coverage glossed over.

Figure: Timeline diagram showing five GPT Realtime 2 reasoning levels (minimal, low, medium, high, xhigh) mapped to Waboom voice agent jobs from after-hours receptionist through to multi-step service tickets.

For a Waboom AI outbound campaign, the right setting is the lowest one that still holds the script. Higher reasoning costs latency. On an 800ms cliff that compounds.

How we map it across the agent fleet today.

After-hours receptionist taking messages and bookings runs at low. Inbound mortgage quotes with live rate lookups run at medium. Vendor objection handling on a cold seller call runs at high. Multi-step service tickets with conditional logic run at xhigh.

You do not pay for reasoning you do not need. The agent runs at the speed of the job.
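In code, that mapping can be as small as a lookup table with a safe default. This is a sketch of the idea, not our production router. The intent labels are hypothetical names, and only the level assigned to each job comes from the mapping above.

```python
# The five levels named in this post, lowest to highest.
LEVELS = ("minimal", "low", "medium", "high", "xhigh")
DEFAULT_LEVEL = "low"

# Intent labels are hypothetical; the level per job follows the mapping above.
LEVEL_BY_INTENT = {
    "after_hours_reception": "low",
    "mortgage_quote_lookup": "medium",
    "vendor_objection_handling": "high",
    "multi_step_service_ticket": "xhigh",
}


def reasoning_level(intent: str) -> str:
    """Pick the level for an intent, falling back to the default for unknown jobs."""
    level = LEVEL_BY_INTENT.get(intent, DEFAULT_LEVEL)
    if level not in LEVELS:
        raise ValueError(f"unknown reasoning level: {level}")
    return level


for intent in ("after_hours_reception", "vendor_objection_handling", "something_new"):
    print(f"{intent}: {reasoning_level(intent)}")
```

Falling back to low for anything unmapped keeps a new job fast by default. You raise the level only once the script proves it needs it.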

6

The Sydney Playbook on 2

What this does to a real Waboom AI campaign

Our Sydney 90-day vendor lead campaign on the old model: 10,713 dials, 3,609 pickups (33.7%), 1,997 real conversations (18.6%), 141 warm transfers (7.1% of conversations). AU$32.74 per warm-transferred seller.

7.1% of conversations turned into warm transfers on 1.5.
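For anyone who wants to check the funnel, the stage rates fall straight out of the counts. The sketch below recomputes them from the figures above. The implied total spend is our own back-calculation and assumes the AU$32.74 figure is total campaign spend divided by warm transfers.

```python
# Sydney 90-day campaign counts, as quoted in this post.
dials, pickups, conversations, transfers = 10_713, 3_609, 1_997, 141
cost_per_transfer_aud = 32.74


def pct(part: int, whole: int) -> float:
    """Share of `whole` as a percentage, rounded to 1 dp."""
    return round(part / whole * 100, 1)


pickup_rate = pct(pickups, dials)              # pickups per dial
conversation_rate = pct(conversations, dials)  # real conversations per dial
transfer_rate = pct(transfers, conversations)  # warm transfers per conversation

# ASSUMPTION: cost per transfer = total spend / transfers, so spend is implied.
implied_spend_aud = round(transfers * cost_per_transfer_aud, 2)

print(f"pickup rate:       {pickup_rate}%")
print(f"conversation rate: {conversation_rate}%")
print(f"transfer rate:     {transfer_rate}%")
print(f"implied spend:     AU${implied_spend_aud:,.2f}")
```

Note the denominators. Pickups and conversations are measured against dials, while transfers are measured against conversations, which is why the last step is the one to watch.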

The squeeze in that funnel was always the conversation to transfer step. That is where the agent has to hold three or four objections in a row. That is a reasoning job. And reasoning is exactly what 2 just unlocked.

We are not promising you specific post-upgrade conversion numbers in week one. We have been on the new model for a day. But the bottleneck 1.5 hit on long objection chains is the one 2 breaks.

Same logic for our Christchurch developer campaign. 49 viewings booked. $7.12 per booked viewing. 14 days of Meta lead handling.

The squeeze was always at multi-step recovery when prospects got vague about timing. New context window. Sharper reasoning. Expect your funnel to widen.

7

The Bottom Line

The bottom line for an operator in May 2026

If your voice agent provider is not on GPT Realtime 2 by end of May, you are calling at a handicap.

The reasoning gap is too big to ignore. At low reasoning, the latency profile is the same. Per-minute economics hold. Async function calling fixes the only place the conversational rhythm broke.

Waboom AI has been the LLM-promiscuous voice agency from day one. Right model for the right job, all the way down. GPT Realtime 2 just became the right model for most outbound sales jobs we run.

For the AUD breakdown of what 10,000 calls a month looks like on this stack, our Australian pricing post has the maths.

Frequently Asked Questions

Is GPT Realtime 2 already running on my Waboom AI voice agent?

Most outbound sales campaigns moved across this week. We migrate campaign by campaign as the script gets re-validated at low reasoning. If you are a current customer and want to confirm where your campaign sits, message your Waboom AI account contact.

Does GPT Realtime 2 cost more than 1.5?

Not for you. Your Waboom AI per-minute rate does not change for the move to 2. We absorb the underlying model shift on our side. You get the smarter agent at the same talk-time price you signed for.

What does this mean for accents on AU and NZ campaigns?

Persona work sits above the model. Australian and Kiwi accents are the default for AU and NZ campaigns through voice ID and persona prompts. The reasoning lift in 2 sharpens script handling, not the accent. Full mechanic in the localised persona post.

What about the new translation model?

GPT Realtime Translate covers 70 input languages and 13 output languages live. We are testing it for AU campaigns where the lead pool includes Mandarin or Cantonese first language sellers. Expect a separate post once we have real data on word error rates and pricing for multilingual campaigns.

How fast can a new campaign go live on GPT Realtime 2?

Live in days, not weeks. A focused single-campaign rollout on the new model lands inside a week. Multi-path orchestration with concurrent campaigns and complex CRM integration sits at two to three weeks.

I run my own voice agent stack. What is the cheapest path to 2?

Swap the model ID from gpt-realtime-1.5 to gpt-realtime-2 in your Realtime API call. Re-tune the reasoning level per intent (start at low).

Audit your function-calling prompts so the agent narrates while async tools run. The rest of your stack carries over. Full background on stack choice in our LLM by job type breakdown.
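As a sketch of those first two steps, here is a small helper that migrates a campaign config. The config shape and the `reasoning_level` key are hypothetical and are not the Realtime API schema, so check the OpenAI model documentation for the real parameter names. Only the two model IDs come from the steps above.

```python
from copy import deepcopy

OLD_MODEL = "gpt-realtime-1.5"
NEW_MODEL = "gpt-realtime-2"


def migrate(config: dict) -> dict:
    """Return a copy with the model ID swapped and a starting reasoning level set."""
    new = deepcopy(config)  # never mutate the live campaign config
    if new.get("model") == OLD_MODEL:
        new["model"] = NEW_MODEL
    # HYPOTHETICAL key name. Start at low, then re-tune per intent.
    new.setdefault("reasoning_level", "low")
    return new


old_config = {"model": OLD_MODEL, "campaign": "example-campaign"}
new_config = migrate(old_config)
print(new_config)
```

Returning a copy means you can diff old against new per campaign and roll back by pointing at the original.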

Want a Waboom AI voice agent on GPT Realtime 2 by next week?

Send us a list and a campaign objective. We will spec it on the new model and quote per-outcome before you commit. Same stack behind the Sydney campaign that produced 141 warm transfers, now on GPT-5 class reasoning.

Waboom AI voice agents  ·  Book a 15-min scoping call

Sources: OpenAI: Advancing voice intelligence with new models in the API (8 May 2026) and OpenAI gpt-realtime-2 model documentation.

Leonardo Garcia-Curtis

Founder & CEO at Waboom AI. Building voice AI agents that convert.

Related Articles

ElevenLabs v3 vs Flash v2.5: when each one wins

The best LLM for voice agents, for sales cold calls and more.

AI Voice Agent Pricing in 2026: An Honest NZ + AU Breakdown

Waboom AI

Empowering New Zealand and Australian businesses with AI voice agents and automation that deliver real, measurable value.

hello@waboom.ai
+64 9 888 0402
Level 8, 139 Quay Street
Auckland CBD, New Zealand

© 2026 Waboom.ai. All rights reserved.
