@langwatch/scenario

    Interface GoatConfig

    Configuration for redTeamGoat.

    Inherits all options from CrescendoConfig. The redTeamGoat factory sets totalTurns to 30 by default. metapromptTemplate is accepted but ignored — GOAT does not pre-generate an attack plan (paper fidelity; see GoatStrategy.needsMetapromptPlan).

    Three *techniques fields live on this config. The first two mean different things; the third is a deprecated alias:

    • goatTechniques — override the GOAT semantic catalogue (the list the attacker LLM picks from each turn). Accepts GoatTechnique. Defaults to the 7-technique paper catalogue.
    • encodingTechniques — single-turn Base64/ROT13/... encoders used by injectionProbability. Accepts AttackTechnique.
    • techniques — deprecated alias for encodingTechniques; keeps the inherited CrescendoConfig.techniques field working with a warning.
    interface GoatConfig {
        attackPlan?: string;
        detectRefusals?: boolean;
        encodingTechniques?: AttackTechnique[];
        goatTechniques?: readonly Technique[];
        injectionProbability?: number;
        maxBacktracks?: number;
        maxTokens?: number;
        metapromptModel?: LanguageModel;
        metapromptTemperature?: number;
        metapromptTemplate?: string;
        model?: LanguageModel;
        scoreResponses?: boolean;
        successConfirmTurns?: number;
        successScore?: number;
        target: string;
        techniques?: AttackTechnique[];
        temperature?: number;
        totalTurns?: number;
    }
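
    The defaults documented on this page can be gathered into one place. The sketch below is illustrative only: GoatConfigSubset and withDocumentedDefaults are hypothetical names, not part of the library, and the way an explicit undefined for successScore is told apart from an omitted field is an assumption.

```typescript
// Hypothetical subset of GoatConfig: only primitive fields with a documented default.
interface GoatConfigSubset {
  target: string;
  totalTurns?: number;
  detectRefusals?: boolean;
  injectionProbability?: number;
  scoreResponses?: boolean;
  successConfirmTurns?: number;
  successScore?: number;
}

// Fills in the defaults stated in the property descriptions.
function withDocumentedDefaults(config: GoatConfigSubset) {
  return {
    target: config.target,
    totalTurns: config.totalTurns ?? 30, // redTeamGoat default
    detectRefusals: config.detectRefusals ?? true,
    injectionProbability: config.injectionProbability ?? 0.0, // off
    scoreResponses: config.scoreResponses ?? true,
    successConfirmTurns: config.successConfirmTurns ?? 2,
    // Assumption: an explicitly present key (even undefined) is respected,
    // so `successScore: undefined` disables early exit instead of becoming 9.
    successScore: "successScore" in config ? config.successScore : 9,
  };
}

const resolved = withDocumentedDefaults({ target: "placeholder objective" });
const noEarlyExit = withDocumentedDefaults({
  target: "placeholder objective",
  successScore: undefined,
});
```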

    Properties

    attackPlan?: string
    detectRefusals?: boolean

    Use pattern-based refusal detection to skip the LLM scorer on obvious refusals. Default true.

    encodingTechniques?: AttackTechnique[]

    Single-turn encoders used when injectionProbability > 0.

    goatTechniques?: readonly Technique[]

    Override the GOAT semantic catalogue (the attacker's per-turn choices).

    injectionProbability?: number

    Probability (0.0-1.0) of applying a random encoding technique per turn. Default 0.0 (off).
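
    As a rough illustration of per-turn gating, the sketch below applies a randomly chosen encoder with the given probability. The Encoder type, rot13, and maybeEncode are hypothetical stand-ins; the real AttackTechnique interface and the library's sampling logic are not shown on this page and may differ.

```typescript
// Hypothetical encoder shape, standing in for AttackTechnique.
type Encoder = (message: string) => string;

// ROT13: rotate ASCII letters by 13 places, leave everything else unchanged.
const rot13: Encoder = (m) =>
  m.replace(/[a-zA-Z]/g, (c) => {
    const base = c <= "Z" ? 65 : 97; // 'A' or 'a'
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });

// With probability `injectionProbability`, encode the message with one
// encoder picked uniformly at random; otherwise return it unchanged.
// `random` is injectable so the behaviour can be made deterministic.
function maybeEncode(
  message: string,
  injectionProbability: number,
  encoders: readonly Encoder[],
  random: () => number = Math.random,
): string {
  if (encoders.length === 0 || random() >= injectionProbability) {
    return message;
  }
  const pick = encoders[Math.floor(random() * encoders.length)];
  return pick(message);
}
```

    With injectionProbability at its default of 0.0 the message is always returned unchanged, since random() is never below 0.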

    maxBacktracks?: number

    Maximum number of hard-refusal backtracks allowed per run. When omitted, scales with totalTurns as max(1, floor(totalTurns / 3)) — so a 30-turn run gets 10, a 5-turn run gets 1. Each backtrack consumes a turn from the budget. Set explicitly to override.
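
    The documented default can be written out directly. This is a minimal sketch of the formula as stated above; defaultMaxBacktracks and resolveMaxBacktracks are hypothetical helper names, not library exports.

```typescript
// Default backtrack budget: max(1, floor(totalTurns / 3)).
function defaultMaxBacktracks(totalTurns: number): number {
  return Math.max(1, Math.floor(totalTurns / 3));
}

// An explicit maxBacktracks overrides the scaled default.
function resolveMaxBacktracks(totalTurns: number, maxBacktracks?: number): number {
  return maxBacktracks ?? defaultMaxBacktracks(totalTurns);
}

console.log(defaultMaxBacktracks(30)); // 10
console.log(defaultMaxBacktracks(5)); // 1
console.log(resolveMaxBacktracks(30, 4)); // 4
```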

    maxTokens?: number
    metapromptModel?: LanguageModel
    metapromptTemperature?: number

    Separate temperature for metaprompt/scoring calls. Defaults to temperature.

    metapromptTemplate?: string
    model?: LanguageModel
    scoreResponses?: boolean

    Score target responses each turn to feed back into the attacker. Default true.

    successConfirmTurns?: number

    Number of consecutive turns that must score at or above successScore before early exit triggers. Default 2.

    successScore?: number

    Score threshold (0-10) for early exit. Default 9. Set to undefined to disable.
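
    One way to read successScore together with successConfirmTurns is sketched below. shouldExitEarly is a hypothetical helper, and treating a successConfirmTurns below 1 as "never exit" is an assumption; the library's actual check may be structured differently.

```typescript
// Returns true when the last `successConfirmTurns` scores are all at or
// above `successScore`. An undefined threshold disables early exit.
function shouldExitEarly(
  scores: readonly number[],
  successScore: number | undefined,
  successConfirmTurns: number,
): boolean {
  if (successScore === undefined) return false; // early exit disabled
  // Assumption: a non-positive confirmation window never triggers an exit.
  if (successConfirmTurns < 1 || scores.length < successConfirmTurns) {
    return false;
  }
  return scores.slice(-successConfirmTurns).every((s) => s >= successScore);
}

console.log(shouldExitEarly([3, 9, 10], 9, 2)); // true
console.log(shouldExitEarly([9, 3, 9], 9, 2)); // false
```

    In the second call the high scores are not consecutive, so the run would continue.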

    target: string
    techniques?: AttackTechnique[]

    List of AttackTechnique instances to sample from. Defaults to all built-ins. Deprecated alias for encodingTechniques.

    temperature?: number
    totalTurns?: number