Bridges a single call() turn's audio and timing into the executor
state. Kept private (one-call-per-instance) so subclasses can opt out
by overriding call() and the default flow stays short.
Timing model (review M1): segment start/end are laid on a byte-accurate
AUDIO cursor — each segment occupies [cursor, cursor + chunk.durationSeconds]
and advances the cursor by its PCM byte-duration. A segment's
endTime - startTime therefore equals its true audio length, and
recording.duration (= max endTime) equals the full.wav byte-duration,
regardless of how fast an in-process transport flushed the bytes. The OLD
model timestamped each segment at the wall-clock instant sendAudio /
receiveAudio resolved, which on fast transports collapsed multi-second
turns to ~1 ms and made manifest.duration unreliable as audio-length proof.
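
The cursor model above can be sketched roughly as follows. This is a minimal illustration, not the executor's actual code: the `Chunk` and `Segment` shapes and the `layOutSegments` and `pcmDurationSeconds` helpers are hypothetical names introduced here, and the 16-bit mono default is an assumption about the PCM format.

```typescript
// Hypothetical shapes; the real chunk and segment types are not shown in the source.
interface Chunk {
  role: "user" | "agent";
  durationSeconds: number; // PCM byte-duration of this chunk
}

interface Segment {
  role: "user" | "agent";
  startTime: number;
  endTime: number;
}

// Byte-duration of raw PCM. Assumes 16-bit mono unless told otherwise.
function pcmDurationSeconds(
  byteLength: number,
  sampleRate: number,
  bytesPerSample: number = 2,
  channels: number = 1,
): number {
  return byteLength / (sampleRate * bytesPerSample * channels);
}

// Lay chunks end to end on one audio cursor. Wall-clock time is never
// consulted, so a fast transport cannot shrink a segment.
function layOutSegments(chunks: Chunk[]): { segments: Segment[]; duration: number } {
  let cursor = 0;
  const segments: Segment[] = [];
  for (const chunk of chunks) {
    const startTime = cursor;
    const endTime = cursor + chunk.durationSeconds;
    segments.push({ role: chunk.role, startTime, endTime });
    cursor = endTime; // advance by the chunk's audio length
  }
  // The final cursor position equals the max endTime, i.e. the total audio length.
  return { segments, duration: cursor };
}
```

Under these assumptions, a 2 s user chunk followed by a 3 s agent chunk yields segments [0, 2] and [2, 5] with a duration of 5, no matter how quickly the bytes were flushed.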
Latency is NOT derived from these audio offsets (consecutive turns are
gapless on the cursor, so an offset gap would always be ~0). It is measured
from the wall-clock marks kept by recordUser / markAgentStart, which
capture the genuine response time between the user finishing on the wire
and the agent's first chunk arriving.
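
The separation between audio offsets and latency can be illustrated with a small sketch. Only the names `recordUser` and `markAgentStart` come from the text above; the `LatencyMarks` class, its `latencyMs` method, and the injectable clock are assumptions made for illustration, and the real methods may take arguments or store more state.

```typescript
// Hypothetical tracker for the two wall-clock marks described above.
class LatencyMarks {
  private userDoneAt: number | null = null;
  private agentStartAt: number | null = null;
  private now: () => number;

  // The clock is injectable so the sketch can be exercised deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Call when the user's audio has finished going out on the wire.
  recordUser(): void {
    this.userDoneAt = this.now();
    this.agentStartAt = null; // a new turn resets the agent mark
  }

  // Call on every agent chunk; only the first one in a turn is kept.
  markAgentStart(): void {
    if (this.agentStartAt === null) {
      this.agentStartAt = this.now();
    }
  }

  // Response time in milliseconds, or null if either mark is missing.
  latencyMs(): number | null {
    if (this.userDoneAt === null || this.agentStartAt === null) {
      return null;
    }
    return this.agentStartAt - this.userDoneAt;
  }
}
```

Because the marks live on the wall clock rather than the audio cursor, the measured latency stays meaningful even though consecutive segments are gapless in the recording.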