@langwatch/scenario

    Class ComposableVoiceAgent

    Locally-executed STT → LLM → TTS voice agent.

    sendAudio transcribes incoming user audio; receiveAudio runs the LLM on the running conversation history and synthesizes the response via TTS.
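
As a rough illustration of the two-step flow described above, here is a minimal sketch with hypothetical stand-ins. `SketchVoiceAgent`, `ChatMessage`, `Pcm`, and the three injected functions are assumptions made for this example, not the library's types. The real STT, LLM, and TTS calls are asynchronous; they are synchronous here to keep the sketch short. Appending the assistant reply to the history is also an assumption.

```typescript
// Hypothetical types for illustration only.
type ChatMessage = { role: "user" | "assistant"; content: string };
type Pcm = number[];

class SketchVoiceAgent {
  history: ChatMessage[] = [];
  lastUserTranscript: string | null = null;
  lastLlmResponse: string | null = null;

  private stt: (audio: Pcm) => string;
  private llm: (history: ChatMessage[]) => string;
  private tts: (text: string) => Pcm;

  constructor(
    stt: (audio: Pcm) => string,
    llm: (history: ChatMessage[]) => string,
    tts: (text: string) => Pcm,
  ) {
    this.stt = stt;
    this.llm = llm;
    this.tts = tts;
  }

  // User turn: transcribe the audio and append it to the running history.
  sendAudio(audio: Pcm): void {
    const text = this.stt(audio);
    this.lastUserTranscript = text;
    this.history.push({ role: "user", content: text });
  }

  // Agent turn: run the LLM on the history, then synthesize the reply.
  receiveAudio(): Pcm {
    const reply = this.llm(this.history);
    this.lastLlmResponse = reply;
    this.history.push({ role: "assistant", content: reply });
    return this.tts(reply);
  }
}
```

The split matters because the LLM call happens on the receive side, which is why repeated `receiveAudio` calls within one turn need a guard.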

    Constructors

    Properties

    agentSpeakingEvent?: AgentSpeakingEvent

    Set when the adapter has emitted its first agent audio chunk for the current turn — gates timing-based barge-in. Concrete adapters expose this so scenario.interrupt can wait for real speech before firing the interruption. Optional: adapters without server-VAD-style interrupt sequencing can leave it undefined.

    capabilities: AdapterCapabilities = ...

    Declaration of what this adapter can and cannot do. Concrete subclasses MUST publish a non-default value; the base instance defaults to "nothing supported" so capability-gated steps fail safely when an adapter forgets to declare.
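
To illustrate why a "nothing supported" default fails safely, here is a small sketch. `SketchCapabilities`, `NOTHING_SUPPORTED`, and `requireCapability` are hypothetical names for this example. Only the two flags mentioned on this page are modelled, and the real `AdapterCapabilities` may contain more fields.

```typescript
// Hypothetical subset of the capability flags mentioned in this page.
interface SketchCapabilities {
  interruption: boolean;
  streamingTranscripts: boolean;
}

// Assumed shape of the base default: every flag off.
const NOTHING_SUPPORTED: SketchCapabilities = {
  interruption: false,
  streamingTranscripts: false,
};

// Local stand-in for the error type named in the docs.
class UnsupportedCapabilityError extends Error {}

// A capability-gated step refuses to run unless the flag is declared.
function requireCapability(
  caps: SketchCapabilities,
  name: keyof SketchCapabilities,
): void {
  if (!caps[name]) {
    throw new UnsupportedCapabilityError(`adapter does not support ${name}`);
  }
}
```

An adapter that forgets to declare its capabilities inherits the all-false value, so a gated step throws immediately instead of proceeding against a transport that cannot honor it.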

    history: ModelMessage[]
    lastLlmResponse: string | null = null
    lastUserTranscript: string | null = null
    llm: LanguageModel
    name?: string
    responseMaxDuration: number = 30.0

    Hard cap, in seconds, on a single agent turn's audio. Prevents runaway loops if a transport never signals end-of-stream. The default of 30 s leaves room for a long sentence.

    responseTailSilence: number = 0.6

    Tail silence: once the first agent chunk arrives, keep draining receiveAudio until no chunk shows up within this many seconds — that's how we detect the agent finished talking.

    responseTimeout: number = 30.0

    Seconds to wait for agent audio after sending user audio.
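
The interaction of the three timing properties above can be sketched as a pure function over pre-recorded arrival times. This is an illustrative model, not the library's drain loop. `DrainOptions`, `TimedChunk`, and `drainTurn` are hypothetical names, and measuring the cap in accumulated audio seconds (rather than wall-clock time) is an assumption.

```typescript
interface DrainOptions {
  responseTimeout: number;     // seconds to wait for the first chunk
  responseTailSilence: number; // seconds of quiet that end the turn
  responseMaxDuration: number; // hard cap on the turn's audio, in seconds
}

// Hypothetical record: arrival time (seconds since the user audio was sent)
// and the amount of audio the chunk carries.
interface TimedChunk {
  at: number;
  seconds: number;
}

function drainTurn(chunks: TimedChunk[], opts: DrainOptions): TimedChunk[] {
  const collected: TimedChunk[] = [];
  let lastArrival = 0;
  let audioSeconds = 0;
  for (const chunk of chunks) {
    const gap = chunk.at - lastArrival;
    // Before the first chunk the wait is bounded by responseTimeout;
    // afterwards by responseTailSilence.
    const limit =
      collected.length === 0 ? opts.responseTimeout : opts.responseTailSilence;
    if (gap > limit) break;
    collected.push(chunk);
    lastArrival = chunk.at;
    audioSeconds += chunk.seconds;
    // Stop once the accumulated audio reaches the hard cap.
    if (audioSeconds >= opts.responseMaxDuration) break;
  }
  return collected;
}
```

With the documented defaults (30, 0.6, 30), chunks arriving at 1.0, 1.2, and 1.5 seconds are collected, while a chunk arriving at 2.5 seconds is treated as belonging after the end of the turn because the one-second gap exceeds the tail silence.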

    role: AgentRole = AgentRole.AGENT
    streamingTranscript?: string

    Incremental transcript text emitted while the agent speaks. Populated by adapters that advertise capabilities.streamingTranscripts. Read by scenario.interrupt when afterWords: N is set.
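
One plausible way an `afterWords: N` check could read this property is sketched below. `hasSpokenWords` is a hypothetical helper, and whitespace-delimited counting is an assumption. The page does not say how `scenario.interrupt` actually counts words.

```typescript
// Returns true once the incremental transcript contains at least
// `afterWords` whitespace-delimited words.
function hasSpokenWords(
  streamingTranscript: string | undefined,
  afterWords: number,
): boolean {
  if (!streamingTranscript) return false;
  const words = streamingTranscript
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0);
  return words.length >= afterWords;
}
```

An adapter that never populates the transcript leaves the property undefined, so under this sketch the check never fires for it.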

    tts: string
    ttsOptions: SynthesizeOptions
    turnOutputEmitted: boolean = false

    Turn-output guard. The default call() drains receiveAudio until tail silence; on this adapter, each extra receiveAudio call would kick off a second LLM call. Reset by sendAudio (new user turn → new LLM call allowed); set at the end of receiveAudio.
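
The guard's set/reset cycle can be sketched as follows. `GuardedAgent` and its `llmCalls` counter are hypothetical. Returning `null` for a guarded call is an assumption, since the page does not state what the real method yields in that case.

```typescript
class GuardedAgent {
  turnOutputEmitted = false;
  llmCalls = 0; // hypothetical counter, for illustration only

  // New user turn: allow exactly one new LLM call.
  sendAudio(): void {
    this.turnOutputEmitted = false;
  }

  // First call per turn produces output; later calls are no-ops.
  receiveAudio(): string | null {
    if (this.turnOutputEmitted) return null;
    this.llmCalls += 1;
    const reply = `reply ${this.llmCalls}`;
    this.turnOutputEmitted = true; // set at the end of the call
    return reply;
  }
}
```

A drain loop that keeps polling `receiveAudio` until tail silence therefore triggers one LLM call per user turn rather than one per poll.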

    DEFAULT_SYSTEM_PROMPT: string = ...

    Methods

    • Send a first-class interrupt signal to the agent under test.

      Adapters that advertise capabilities.interruption === true override this to send the transport-native interrupt (e.g. Twilio clear, OpenAI Realtime response.cancel). The default raises UnsupportedCapabilityError; callers (scenario.interrupt()) check capabilities.interruption first and fall back to timing-based barge-in when that flag is false.
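
The caller-side decision described above might look like the following sketch. `InterruptibleAdapter` and `interruptOrBargeIn` are hypothetical names, and `interrupt` is synchronous here for brevity, whereas the documented method returns `Promise<void>`.

```typescript
// Hypothetical minimal adapter surface for this example.
interface InterruptibleAdapter {
  capabilities: { interruption: boolean };
  interrupt(): void;
}

// Prefer the transport-native interrupt; otherwise fall back to
// timing-based barge-in supplied by the caller.
function interruptOrBargeIn(
  adapter: InterruptibleAdapter,
  bargeIn: () => void,
): "native" | "barge-in" {
  if (adapter.capabilities.interruption) {
    adapter.interrupt();
    return "native";
  }
  bargeIn();
  return "barge-in";
}
```

Checking the flag before calling means the throwing default is never reached in the normal flow. It only surfaces when a caller invokes the interrupt directly on an adapter that does not support it.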

      Returns Promise<void>

    • Whether the transport is currently open and ready to exchange audio (Gap #11). The default call flow (defaultVoiceCall) consults this before sending audio and raises PendingTransportError uniformly when it returns false. A call() issued before the executor's connect() therefore fails with one clear error across every transport, instead of a transport-specific null dereference or a silent hang.

      Base default is true: adapters with no meaningful "not connected" state (in-process composable, test doubles) never trip the gate. Network-transport leaf adapters override this to report their real socket/session state.
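
The readiness gate can be sketched as below. The method's real name is not shown on this page, so `transportReady` is a hypothetical placeholder, as are `ReadinessProbe` and `guardedCall`. `PendingTransportError` is redefined locally as a stand-in for the library's error.

```typescript
// Local stand-in for the error type named in the docs.
class PendingTransportError extends Error {}

// Hypothetical name for the readiness check documented above.
interface ReadinessProbe {
  transportReady(): boolean;
}

// Consult the gate before sending any audio.
function guardedCall(adapter: ReadinessProbe, send: () => void): void {
  if (!adapter.transportReady()) {
    throw new PendingTransportError("call() issued before connect()");
  }
  send();
}

// In-process adapters keep the base default and never trip the gate.
const inProcess: ReadinessProbe = { transportReady: () => true };
```

A network adapter would instead return its actual socket or session state, so the same gate produces the same error for every transport that is not yet connected.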

      Returns boolean