There are two kinds of cold-call coaching tools.
Tool A records the call and transcribes it after the fact, then a manager (or you, if you're solo) reviews the recording later. You see what you said, you see what you missed, you take notes for next time. Examples: Gong, Chorus, Wingman, ExecVision.
Tool B runs while the call is happening. It listens live, matches what the prospect just said against your script tree, and shows you the next line you should say — in real time, on screen, before you've even finished hearing them speak. Example: AP Sales Coach.
These two tools look superficially similar. They both transcribe. They both deal with cold calls. They both use AI of various kinds. But they solve completely different problems, and confusing them is why most operators end up paying £400/month for Gong and still freezing on the third call of the morning.
What recordings can teach you
Reviewing call recordings has real value. After the fact, you can:
- Identify your verbal tics and weak phrasing.
- Spot moments where you missed an obvious branch.
- Compile a library of objections you keep hearing.
- Train new operators by playing them your best calls.
- Run team-wide analytics on who books and who doesn't.
These are management tools. They are useful for the person managing the team, building the script library, and writing the coaching playbook. They turn this week's calls into next month's improvement.
That's a real and important job. It is just not the same job as making this call land right now.
What recordings can't do
Recordings cannot help you in the seven-second window after the prospect says something off-script.
That window is the entire game in cold calling. Two and a half seconds of silence is the difference between a booked call and a polite goodbye. A recording-based tool, no matter how good its transcription, will give you its feedback next Tuesday. By next Tuesday, the deal is dead.
Worse: the lessons from recordings often don't transfer. You learn what to say to objection X. Two weeks later you're on a call, the prospect throws objection X, and your brain still freezes, because the lesson lives in your head as a memory, and on a high-cognitive-load cold call, memory is exactly the wrong place for a script to live.
What real-time can do
Real-time coaching solves a different problem: the working-memory problem.
Humans cannot reliably hold a 27-node script tree in their head while also actively listening to another human. The two tasks compete for the same brain. The fix isn't to memorise the tree better. The fix is to take the tree out of your head and put it on screen, where it can update faster than you think.
Real-time looks like this:
- The prospect says, "Yeah I'm not really looking right now."
- Coach diarises that sentence. Latency: ~150 ms.
- The matcher routes that text against the outgoing branches of the current script node: obj-not-interested. Latency: ~700 ms.
- The panel updates with the next line you should say. Latency: total ~847 ms from prospect-finishes-speaking.
- You read it. You say it. The call continues.
Three steps from their last word to your next line, in under one second. Faster than the time it takes you to mentally search your own script tree.
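To make the matching step concrete, here is a minimal sketch in Python. It is not how Coach works internally; the post doesn't describe the matcher, so this swaps in plain keyword overlap as a stand-in. The names ScriptNode, tokenize and next_node, the trigger keywords, the obj-send-email branch and the scripted lines are all invented for illustration, and the sketch assumes obj-not-interested is the branch the matcher picks for the example sentence.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScriptNode:
    node_id: str                       # e.g. "obj-not-interested"
    line: str                          # the line shown to the operator
    triggers: frozenset = frozenset()  # hypothetical keywords that select this node
    children: list = field(default_factory=list)  # outgoing branches


def tokenize(text: str) -> set:
    """Lowercase the sentence and split it into simple word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def next_node(current: ScriptNode, sentence: str) -> Optional[ScriptNode]:
    """Return the outgoing branch sharing the most keywords with the sentence.

    Returns None when no branch matches, so the panel can stay where it is.
    """
    words = tokenize(sentence)
    best, best_score = None, 0
    for child in current.children:
        score = len(words & child.triggers)
        if score > best_score:
            best, best_score = child, score
    return best


# A two-branch fragment of a script tree (content invented for illustration).
opener = ScriptNode(
    node_id="opener",
    line="Hi, quick one: have I caught you at a bad time?",
    children=[
        ScriptNode(
            node_id="obj-not-interested",
            line="Totally fair. Can I ask what you're using today?",
            triggers=frozenset({"not", "looking", "interested"}),
        ),
        ScriptNode(
            node_id="obj-send-email",
            line="Happy to. What should the email cover so it's worth opening?",
            triggers=frozenset({"email", "send"}),
        ),
    ],
)

match = next_node(opener, "Yeah I'm not really looking right now.")
print(match.node_id if match else "no match")  # prints: obj-not-interested
```

The point of the sketch is the shape of the lookup: only the outgoing branches of the current node are scored, so the search stays small however large the full tree gets.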
The crossover where one second matters
The number that matters here is the time between the prospect finishing a sentence and you starting the next one. Conversation research puts the typical gap between turns at roughly 200 ms. Pause longer than ~500 ms and the listener starts to think you're hesitating. Pause longer than 2 seconds and they start to think you're not listening, or you don't know what to say.
Without a real-time tool, the median pause after an off-script line is somewhere around 2-3 seconds for an experienced operator and 4-6 seconds for a junior one. With a real-time tool, you can hold that pause to under 1 second consistently, because the next line is already on the screen.
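The pause bands above can be written down as a tiny classifier. This is only a sketch: the function name and band labels are made up for illustration, and the cut-offs are the approximate figures quoted above, not measured values.

```python
# Thresholds taken from the approximate figures quoted above, in milliseconds.
HESITATION_MS = 500
NOT_LISTENING_MS = 2000


def classify_pause(pause_ms: float) -> str:
    """Map a response pause to how it tends to land with the listener."""
    if pause_ms > NOT_LISTENING_MS:
        return "not listening"
    if pause_ms > HESITATION_MS:
        return "hesitating"
    return "natural"


# The ~200 ms conversational figure, plus midpoints of the ranges quoted above.
for label, pause_ms in [
    ("natural gap", 200),
    ("experienced operator, no tool", 2500),
    ("junior operator, no tool", 5000),
]:
    print(f"{label}: {pause_ms} ms -> {classify_pause(pause_ms)}")
```

Both no-tool midpoints land in the "not listening" band, which is the band a sub-second response keeps you out of.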
That sub-second response is what makes a cold call sound natural. Not the script. Not the content. The cadence. Real-time tools make the cadence right.
When you need both
In a properly resourced sales team, you actually need both. Recordings teach the team. Real-time helps the operator on the call right now.
For a solo founder doing outbound, you can skip the recordings. You don't have a team to review your calls. You're already aware of your weak phrasing — you've heard yourself say it. What you actually need is the line in front of you when the prospect deviates.
That's the gap Coach was built to fill. It doesn't replace Gong. It runs during the call, where Gong (by design) doesn't.
The 50 ms claim
You'll see "50 ms" written in our marketing. Worth being clear: that 50 ms is the incremental matcher cost — the time the matcher takes to read the diarised sentence and pick the next branch. The full pipeline (diarisation + matching) is ~847 ms median in our v0.1 production data.
Sub-second. Not subliminal. Just fast enough to feel natural.
Reviewing recordings is fine. Coach is what runs the call.
If you're solo or small-team and you've been told you need "call coaching software", check what you actually need. If your problem is team coverage and post-call review, get Gong. If your problem is "I keep freezing on the third dial", you need real-time.
— Alix