Optional agent
Set when the adapter has emitted its first agent audio chunk for the
current turn; gates timing-based barge-in. Concrete adapters expose
this so scenario.interrupt can wait for real speech before
firing the interruption. Optional: adapters without server-VAD-style
interrupt sequencing can leave it undefined.
Readonly capabilities
Declaration of what this adapter can and cannot do. Concrete subclasses MUST publish a non-default value; the base instance defaults to "nothing supported" so capability-gated steps fail safely when an adapter forgets to declare.
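As a minimal sketch of the "nothing supported" default and a capability-gated step, the snippet below may help. Only the capability names interruption, dtmf and streamingTranscripts and the error name UnsupportedCapabilityError come from this reference; the interface shape, the class names and the requireCapability helper are assumptions made for illustration.

```typescript
// Hypothetical shape: only these three capability names appear in this reference.
interface VoiceCapabilities {
  interruption: boolean;
  dtmf: boolean;
  streamingTranscripts: boolean;
}

// Base default: nothing supported, so gated steps fail instead of misbehaving.
const NO_CAPABILITIES: VoiceCapabilities = {
  interruption: false,
  dtmf: false,
  streamingTranscripts: false,
};

class UnsupportedCapabilityError extends Error {
  constructor(capability: string) {
    super(`adapter does not support "${capability}"`);
    this.name = "UnsupportedCapabilityError";
  }
}

// Hypothetical base class: inherits the all-false declaration.
class BaseAdapterSketch {
  readonly capabilities: VoiceCapabilities = NO_CAPABILITIES;
}

// Hypothetical concrete adapter: publishes a non-default value.
class TelephonyAdapterSketch extends BaseAdapterSketch {
  readonly capabilities: VoiceCapabilities = { ...NO_CAPABILITIES, dtmf: true };
}

// Hypothetical gate a scenario step could call before using a feature.
function requireCapability(
  adapter: BaseAdapterSketch,
  capability: keyof VoiceCapabilities,
): void {
  if (!adapter.capabilities[capability]) {
    throw new UnsupportedCapabilityError(capability);
  }
}
```

An adapter that forgets to declare anything behaves like BaseAdapterSketch here: every gated step throws rather than proceeding on a capability that was never confirmed.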
Protected Readonly history
Readonly llm
Optional name
Hard cap on a single agent turn's audio. Prevents runaway loops if a transport never signals end-of-stream. 30s = a long sentence.
Tail silence: once the first agent chunk arrives, keep draining receiveAudio until no chunk shows up within this many seconds — that's how we detect the agent finished talking.
Seconds to wait for agent audio after sending user audio.
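The three timing values above (response wait, tail silence, hard cap) interact in a simple way that a small simulation can illustrate. This is a sketch under stated assumptions, not the adapter's drain loop: it works on precomputed arrival times instead of a live receiveAudio stream, it measures the hard cap as wall-clock time since the first chunk, and every name in it (Arrival, DrainOptions, simulateDrain) is hypothetical.

```typescript
// Hypothetical types: the real field names are not shown in this reference.
interface Arrival {
  atSeconds: number; // seconds since the user audio was sent
  label: string;
}

interface DrainOptions {
  responseTimeout: number; // seconds to wait for the first agent chunk
  tailSilence: number;     // max gap between chunks before the turn counts as finished
  maxTurnSeconds: number;  // hard cap on one agent turn (assumed wall-clock here)
}

type StopReason = "response-timeout" | "tail-silence" | "hard-cap";

// Decide which chunks belong to the current agent turn and why draining stopped.
function simulateDrain(
  arrivals: Arrival[],
  opts: DrainOptions,
): { kept: string[]; stoppedBy: StopReason } {
  const kept: string[] = [];
  const first = arrivals[0];
  if (first === undefined || first.atSeconds > opts.responseTimeout) {
    return { kept, stoppedBy: "response-timeout" };
  }
  let lastAt = first.atSeconds;
  for (const a of arrivals) {
    // Gap since the previous chunk exceeded the tail-silence window.
    if (a.atSeconds - lastAt > opts.tailSilence) {
      return { kept, stoppedBy: "tail-silence" };
    }
    // Turn has run past the hard cap; stop even though chunks keep coming.
    if (a.atSeconds - first.atSeconds > opts.maxTurnSeconds) {
      return { kept, stoppedBy: "hard-cap" };
    }
    kept.push(a.label);
    lastAt = a.atSeconds;
  }
  // No more chunks will ever arrive, so the tail-silence window eventually expires.
  return { kept, stoppedBy: "tail-silence" };
}
```

In this model a transport that never signals end-of-stream still terminates: either the gap between chunks grows past the tail-silence window, or the hard cap cuts the turn off.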
Optional streaming
Incremental transcript text emitted while the agent speaks. Populated
by adapters that advertise capabilities.streamingTranscripts. Read
by scenario.interrupt when afterWords: N is set.
Readonly stt
Readonly tts
Protected Readonly tts
Protected turn
Turn-output guard. The default call() drains receiveAudio until
tail-silence; on this adapter that would trigger a second LLM call. Reset by
sendAudio (new user turn → new LLM call allowed), set by the end of
receiveAudio.
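The guard described above can be sketched as a single boolean that sendAudio clears and receiveAudio sets. This is a synchronous illustration under assumed names (GuardedPipelineSketch, turnOutputDone, llmCalls); the real adapter is presumably asynchronous and runs STT, an LLM and TTS where this sketch only counts calls.

```typescript
// Hypothetical, synchronous stand-in for a composable STT -> LLM -> TTS adapter.
class GuardedPipelineSketch {
  private turnOutputDone = false;
  private pendingUserAudio: string | null = null;
  llmCalls = 0; // counts how often the (simulated) LLM would be invoked

  // New user turn: store the audio and allow one new LLM call.
  sendAudio(userAudio: string): void {
    this.pendingUserAudio = userAudio;
    this.turnOutputDone = false;
  }

  // Produce the agent's audio for this turn at most once.
  receiveAudio(): string[] {
    if (this.turnOutputDone || this.pendingUserAudio === null) {
      return []; // guard is set: a repeated drain yields nothing
    }
    this.llmCalls += 1;
    const reply = [`reply-to:${this.pendingUserAudio}`];
    this.turnOutputDone = true; // set at the end of receiveAudio
    return reply;
  }
}
```

A drain loop that keeps calling receiveAudio until tail silence then sees empty results after the first reply instead of triggering another LLM call.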
Readonly voice
Static Readonly DEFAULT_
Default call() body, ported from Python VoiceAgentAdapter.call.
Threads the latest user-message audio through sendAudio, drains the agent response on tail silence, records one user and one agent segment into the executor state, and returns the merged assistant audio message. Subclasses may override for specialised flows but will usually inherit it.
Open the transport and prepare to exchange audio.
Close the transport and release resources.
Send a first-class interrupt signal to the agent under test.
Adapters that advertise capabilities.interruption === true override
this to send the transport-native interrupt (e.g. Twilio clear,
OpenAI Realtime response.cancel). The default raises
UnsupportedCapabilityError; callers (scenario.interrupt())
check capabilities.interruption first and fall back to timing-based
barge-in when it is false.
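The caller-side decision described above might look roughly like the following. It is a sketch with assumed names: InterruptAdapterSketch, NativeInterruptAdapterSketch and requestInterrupt are hypothetical, the "clear" string only stands in for a transport-native message, and interrupt() is synchronous here purely for brevity.

```typescript
class UnsupportedCapabilityError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnsupportedCapabilityError";
  }
}

// Hypothetical base: does not advertise interruption, and the default raises.
class InterruptAdapterSketch {
  readonly capabilities: { interruption: boolean } = { interruption: false };

  interrupt(): void {
    throw new UnsupportedCapabilityError("interrupt() is not supported by this adapter");
  }
}

// Hypothetical adapter with a transport-native interrupt.
class NativeInterruptAdapterSketch extends InterruptAdapterSketch {
  readonly capabilities: { interruption: boolean } = { interruption: true };
  sentSignals: string[] = [];

  interrupt(): void {
    this.sentSignals.push("clear"); // stand-in for the transport-native message
  }
}

// Caller-side logic: check the capability first, never rely on catching the error.
function requestInterrupt(adapter: InterruptAdapterSketch): "native" | "timing-based" {
  if (adapter.capabilities.interruption) {
    adapter.interrupt();
    return "native";
  }
  return "timing-based"; // fall back to barge-in driven by timing
}
```

Checking the flag up front keeps the error path for genuine mistakes, such as calling interrupt() directly on an adapter that never declared the capability.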
Whether the transport is currently open and ready to exchange audio
(Gap #11). The default call flow (defaultVoiceCall)
consults this BEFORE sending audio and raises PendingTransportError
uniformly when it returns false — so a call() issued before the
executor's connect() fails with one clear error across every transport
instead of a transport-specific null-dereference or silent hang.
Base default is true: adapters with no meaningful "not connected" state
(in-process composable, test doubles) never trip the gate. Network
transport leaf adapters override this to report their real socket/session state.
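A minimal sketch of the connection gate follows. PendingTransportError and defaultVoiceCall are named in this reference; the method name isConnected, both adapter classes and the simplified synchronous signatures are assumptions made for illustration.

```typescript
class PendingTransportError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "PendingTransportError";
  }
}

// Hypothetical base adapter: no meaningful "not connected" state, so the gate is open.
class GateAdapterSketch {
  sent: string[] = [];

  isConnected(): boolean {
    return true;
  }

  sendAudio(chunk: string): void {
    this.sent.push(chunk);
  }
}

// Hypothetical network adapter: reports its real (simulated) socket state.
class SocketAdapterSketch extends GateAdapterSketch {
  private open = false;

  connect(): void {
    this.open = true;
  }

  isConnected(): boolean {
    return this.open;
  }
}

// Simplified stand-in for the default call flow: gate first, then send.
function defaultVoiceCallSketch(adapter: GateAdapterSketch, userAudio: string): void {
  if (!adapter.isConnected()) {
    throw new PendingTransportError("call() was issued before connect()");
  }
  adapter.sendAudio(userAudio);
}
```

Because the check lives in the shared call flow, every transport reports the same error for the same mistake, and no audio is sent on a closed transport.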
Transmit DTMF tones to the telephony peer. Adapters that advertise
capabilities.dtmf MUST implement this; the default raises
UnsupportedCapabilityError so an adapter that forgot to ship
sendDtmf while claiming the capability fails loudly instead of
silently routing through a PCM fallback.
Returns a string representation of an object.
Composable voice agent with ElevenLabs-opinionated defaults.
Not to be confused with ElevenLabsAgentAdapter (above), which talks to ElevenLabs' hosted ConvAI endpoint. This class is local: you compose
ElevenLabsSTTProvider + any LLM + ElevenLabs TTS yourself.
Default stack:
openai("gpt-5.4-mini"): text-only chat completion.
elevenlabs/EXAVITQu4vr4xnSDxMaL (Sarah, free-tier premade). Override via the ELEVENLABS_VOICE_ID env var or the voice arg.
Example