Function judgeAgent

judgeAgent(cfg?: JudgeAgentConfig): JudgeAgent
Factory function for creating JudgeAgent instances.

JudgeAgent evaluates conversations against success criteria.

The JudgeAgent watches the conversation in real time and decides whether the agent under test is meeting the specified criteria. It can either let the conversation continue or end it with a success or failure verdict.

The judge uses function calling to make structured decisions and provides detailed reasoning for its verdicts. It evaluates each criterion independently and provides comprehensive feedback about what worked and what didn't.
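
The decision flow described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the `CriterionResult` and `JudgeDecision` types and the `decide` function are hypothetical names, and the policy (end early on any failed criterion, otherwise continue until the conversation is done) is an assumption.

```typescript
// Hypothetical shapes for illustration only; not the library's actual types.
type CriterionResult = { criterion: string; met: boolean; reasoning: string };

type JudgeDecision =
  | { kind: "continue" }
  | { kind: "finish"; success: boolean; passed: string[]; failed: string[] };

// Assumption: each criterion is judged on its own, and the overall verdict
// succeeds only when no criterion has failed.
function decide(results: CriterionResult[], conversationDone: boolean): JudgeDecision {
  const passed = results.filter((r) => r.met).map((r) => r.criterion);
  const failed = results.filter((r) => !r.met).map((r) => r.criterion);

  // Nothing has failed yet and there are turns left: let the conversation go on.
  if (failed.length === 0 && !conversationDone) {
    return { kind: "continue" };
  }

  // Otherwise end with a verdict that lists the outcome per criterion.
  return { kind: "finish", success: failed.length === 0, passed, failed };
}
```

Modelling the decision as a discriminated union keeps "continue" and "finish" distinct, so a caller cannot read a verdict that does not exist yet.
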
Parameters
- Optional cfg: JudgeAgentConfig
  Configuration for the judge agent.
  - Optional criteria?: string[]
    The criteria that the judge will use to evaluate the conversation.
  - Optional includeAudio?: boolean | null
    Whether to pass audio content to the judge model.
    
    true / false: explicit; overrides auto-detection.
    
    null (default): auto-detect. Resolves to true when the conversation has audio AND the judge model is known to support multimodal input.
    
    Set includeAudio: false to reduce cost on multimodal models when audio evaluation is not needed.
  - Optional includeTimeline?: boolean | null
    Whether to include a structured voice timeline in the judge input.
    
    true / false: explicit.
    
    null (default): auto. Resolves to true when the conversation has audio.
  - Optional includeTraces?: boolean | null
    Whether to include OTel / LangWatch trace spans in the judge input.
    
    true / false: explicit.
    
    null (default): auto. Resolves to true when LangWatch / OTel is configured.
  - Optional maxDiscoverySteps?: number
    Maximum number of tool-calling steps for progressive trace discovery. Only applies when the trace exceeds the token threshold.
    
    Default: 10
  - Optional maxTokens?: number
  - Optional model?: LanguageModel
  - Optional name?: string
    The name of the agent.
  - Optional spanCollector?: JudgeSpanCollector
    Span collector for telemetry. Defaults to the global singleton.
  - Optional systemPrompt?: string
    A custom system prompt to override the default behavior of the judge.
  - Optional temperature?: number
  - Optional tokenThreshold?: number
    Token threshold for switching to structure-only trace rendering. When the estimated token count of the full trace digest exceeds this value, the judge receives a structure-only view, along with the expand_trace and grep_trace tools for progressive discovery.
    
    Default: 8192
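
The defaulting rules listed above can be sketched as two small helpers. This is an illustrative sketch, not the library's code: `resolveFlag`, `resolveIncludeAudio`, and `pickTraceView` are hypothetical names, and the inputs `hasAudio`, `modelIsMultimodal`, and `estimatedTokens` are assumed to be computed elsewhere.

```typescript
// Hypothetical helpers mirroring the documented defaults; not library code.
type TriState = boolean | null | undefined;

// An explicit true/false wins; null (or an omitted value) falls back to
// whatever auto-detection produced.
function resolveFlag(flag: TriState, autoValue: boolean): boolean {
  return flag ?? autoValue;
}

// includeAudio auto-detects to true only when the conversation has audio
// AND the judge model supports multimodal input.
function resolveIncludeAudio(
  flag: TriState,
  hasAudio: boolean,
  modelIsMultimodal: boolean,
): boolean {
  return resolveFlag(flag, hasAudio && modelIsMultimodal);
}

const DEFAULT_TOKEN_THRESHOLD = 8192; // documented default for tokenThreshold

type TraceView = "full" | "structure-only";

// The judge gets the structure-only view once the estimated size of the
// full trace digest exceeds the threshold (strictly greater than).
function pickTraceView(
  estimatedTokens: number,
  tokenThreshold: number = DEFAULT_TOKEN_THRESHOLD,
): TraceView {
  return estimatedTokens > tokenThreshold ? "structure-only" : "full";
}

console.log(resolveIncludeAudio(null, true, false)); // false
console.log(resolveIncludeAudio(true, false, false)); // true
console.log(pickTraceView(9000)); // structure-only
```

Using `??` rather than `||` matters here: an explicit `false` must override auto-detection, and `||` would discard it.
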
Returns JudgeAgent
Example
```typescript
import { run, judgeAgent, AgentRole, user, agent, AgentAdapter } from '@langwatch/scenario';

// Agent under test: echoes the content of the last message it received.
const myAgent: AgentAdapter = {
  role: AgentRole.AGENT,
  async call(input) {
    return `The user said: ${input.messages.at(-1)?.content}`;
  }
};

async function main() {
  const result = await run({
    name: "Judge Agent Test",
    description: "A simple test to see if the judge agent works.",
    agents: [
      myAgent,
      // The judge evaluates the conversation against these criteria.
      judgeAgent({
        criteria: ["The agent must respond to the user."],
      }),
    ],
    // Scripted turns: one user message, then one reply from the agent under test.
    script: [
      user("Hello!"),
      agent(),
    ],
  });
}
main();
```
- Defined in work/scenario/scenario/javascript/src/agents/judge/judge-agent.ts:866