@langwatch/scenario

    Interface RedTeamStrategy

    interface RedTeamStrategy {
        needsMetapromptPlan?: boolean;
        phaseKind?: "staged" | "progress";
        buildSystemPrompt(
            params: {
                currentTurn: number;
                metapromptPlan: string;
                scenarioDescription: string;
                target: string;
                totalTurns: number;
            },
        ): string;
        chosenTechniqueIds?(strategyText: string): string[];
        getPhaseName(currentTurn: number, totalTurns: number): string;
        parseAttackerOutput(raw: string): AttackerOutput;
        phaseEnds?(totalTurns: number): [number, number, number] | undefined;
    }

    Properties

    needsMetapromptPlan?: boolean

    Whether this strategy needs a pre-generated attack plan via the metaprompt LLM call.

    Crescendo-style staged strategies depend on such a plan. GOAT, implemented for fidelity to the paper, does not: the attacker reasons turn by turn from the technique catalogue and the conversation history. When false, the orchestrator skips _generateAttackPlan and passes an empty string as metapromptPlan to buildSystemPrompt.

    Defaults to true when omitted (backward-compatible).
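
    The default and the skip behaviour described above can be sketched as follows. This is a minimal illustration, not the library's orchestrator: PlanAwareStrategy, needsPlan, and resolveMetapromptPlan are hypothetical names, and the generatePlan callback stands in for the metaprompt LLM call (which in a real orchestrator would presumably be asynchronous).

```typescript
// Local stand-in for the relevant part of RedTeamStrategy (hypothetical name).
interface PlanAwareStrategy {
  needsMetapromptPlan?: boolean;
}

// Resolve the documented default: an omitted flag is treated as true.
function needsPlan(strategy: PlanAwareStrategy): boolean {
  return strategy.needsMetapromptPlan ?? true;
}

// Hypothetical helper: `generatePlan` stands in for the metaprompt LLM call.
function resolveMetapromptPlan(
  strategy: PlanAwareStrategy,
  generatePlan: () => string,
): string {
  // When the strategy opts out, skip the call and use an empty plan.
  return needsPlan(strategy) ? generatePlan() : "";
}
```

    The returned string is what would be passed as metapromptPlan to buildSystemPrompt.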

    phaseKind?: "staged" | "progress"

    Describes what getPhaseName actually returns.

    "staged" — phases carry semantic meaning (e.g. Crescendo's warmup / probing / escalation / direct) and are emitted as red_team.phase in telemetry.

    "progress" — the label is a coarse progress bucket with no semantic meaning (e.g. GOAT's early / mid / late) and is emitted as red_team.progress_bucket so dashboards don't mistake it for a staged-strategy phase.

    Defaults to "staged" when omitted (backward-compatible).
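
    As a rough sketch of the mapping above, the attribute name could be chosen from phaseKind like this. The helper names phaseAttributeKey and phaseAttribute are hypothetical; only the two attribute names and the "staged" default come from the description.

```typescript
type PhaseKind = "staged" | "progress";

// Hypothetical helper: pick the telemetry attribute name for a phase label.
function phaseAttributeKey(phaseKind?: PhaseKind): string {
  // "staged" is the documented default when phaseKind is omitted.
  const kind: PhaseKind = phaseKind ?? "staged";
  return kind === "progress" ? "red_team.progress_bucket" : "red_team.phase";
}

// Hypothetical helper: build the single attribute to attach to a span.
function phaseAttribute(
  strategy: {
    phaseKind?: PhaseKind;
    getPhaseName(currentTurn: number, totalTurns: number): string;
  },
  currentTurn: number,
  totalTurns: number,
): Record<string, string> {
  const key = phaseAttributeKey(strategy.phaseKind);
  return { [key]: strategy.getPhaseName(currentTurn, totalTurns) };
}
```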

    Methods

    • buildSystemPrompt(params): string

      Build a turn-aware system prompt for the attacker.

      Score feedback, adaptation hints, and backtrack markers are communicated via the attacker's private conversation history (H_attacker) as system messages — not embedded in this prompt.

      Parameters

      • params: {
            currentTurn: number;
            metapromptPlan: string;
            scenarioDescription: string;
            target: string;
            totalTurns: number;
        }

      Returns string
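
      A minimal sketch of a buildSystemPrompt implementation, under stated assumptions: the prompt wording is invented for illustration, BuildSystemPromptParams is a local name for the inline parameter type, and target is simply echoed into the prompt without assuming more about its meaning. Score feedback and backtrack markers are deliberately absent, since they travel through the attacker's history rather than this prompt.

```typescript
// Local name for the inline parameter type shown above.
interface BuildSystemPromptParams {
  currentTurn: number;
  metapromptPlan: string;
  scenarioDescription: string;
  target: string;
  totalTurns: number;
}

// Illustrative only: the prompt text is not taken from the library.
function buildSystemPrompt(params: BuildSystemPromptParams): string {
  const { currentTurn, metapromptPlan, scenarioDescription, target, totalTurns } = params;
  const lines = [
    `Target: ${target}`,
    `Scenario: ${scenarioDescription}`,
    `Turn ${currentTurn} of ${totalTurns}.`,
  ];
  // metapromptPlan is an empty string when the strategy opts out of planning.
  if (metapromptPlan.trim() !== "") {
    lines.push(`Attack plan:\n${metapromptPlan}`);
  }
  return lines.join("\n");
}
```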

    • chosenTechniqueIds(strategyText): string[] (optional)

      Extract typed technique identifiers from the attacker's strategy field for telemetry. Strategies that define a technique catalogue override this to return the IDs of the techniques actually used on a given turn, which populate the red_team.chosen_technique_ids span attribute. When omitted, the strategy contributes no technique IDs.

      Parameters

      • strategyText: string

      Returns string[]
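
      One possible shape for such an override, assuming the attacker names techniques by ID inside its strategy text. The catalogue entries below are hypothetical placeholders, not the IDs any real strategy ships with.

```typescript
// Hypothetical catalogue; these IDs are illustrative only.
const TECHNIQUE_IDS: readonly string[] = [
  "persona_modification",
  "hypothetical_framing",
  "refusal_suppression",
];

// Return the catalogue IDs mentioned in the strategy text, in catalogue order.
function chosenTechniqueIds(strategyText: string): string[] {
  const text = strategyText.toLowerCase();
  return TECHNIQUE_IDS.filter((id) => text.includes(id));
}
```

      Matching against a fixed catalogue keeps the telemetry values to a closed set, so free-form wording in the strategy field never leaks into the attribute.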

    • parseAttackerOutput(raw): AttackerOutput

      Turn the attacker LLM's raw output into an AttackerOutput.

      Strategies like Crescendo (no JSON contract) return {reply: raw, observation: "", strategy: "", parseFailed: false}. Strategies like GOAT that instruct the attacker to emit structured output override this to parse the JSON and populate observation / strategy, setting parseFailed if the output was malformed.

      Parameters

      • raw: string

      Returns AttackerOutput
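
      The two behaviours described above might look like the sketch below. The local AttackerOutput interface mirrors only the four fields named in the description and may not match the library's full type; the JSON key names and the fallback of returning the raw text as the reply on a parse failure are assumptions.

```typescript
// Local mirror of the four fields named above; the real type may differ.
interface AttackerOutput {
  reply: string;
  observation: string;
  strategy: string;
  parseFailed: boolean;
}

// Pass-through variant for strategies with no JSON contract.
function parsePlain(raw: string): AttackerOutput {
  return { reply: raw, observation: "", strategy: "", parseFailed: false };
}

// Structured variant: expects a JSON object with a string `reply` field.
function parseStructured(raw: string): AttackerOutput {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const obj = parsed as Record<string, unknown>;
      if (typeof obj.reply === "string") {
        return {
          reply: obj.reply,
          observation: typeof obj.observation === "string" ? obj.observation : "",
          strategy: typeof obj.strategy === "string" ? obj.strategy : "",
          parseFailed: false,
        };
      }
    }
  } catch {
    // Not valid JSON: fall through to the failure result below.
  }
  // Assumption: keep the raw text as the reply and flag the failure.
  return { reply: raw, observation: "", strategy: "", parseFailed: true };
}
```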

    • phaseEnds(totalTurns): [number, number, number] | undefined (optional)

      Return phase boundary turn numbers to inject into the metaprompt template.

      Override this to inject strategy-specific template variables. Strategies that don't need extra template vars (e.g. GOAT) can omit this method — the orchestrator treats undefined as "no extra vars".

      Parameters

      • totalTurns: number

      Returns [number, number, number] | undefined
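
      To illustrate how three boundaries can delimit four staged phases, here is a sketch that pairs phaseEnds with a matching getPhaseName. The 25% / 50% / 75% split and the 1-indexed turn numbering are assumptions made for the example, not values taken from any shipped strategy; only the phase names come from the Crescendo example above.

```typescript
// Assumed split points; illustrative, not the library's values.
const PHASE_FRACTIONS: [number, number, number] = [0.25, 0.5, 0.75];

// Last turn of each of the first three phases (turns assumed 1-indexed).
function phaseEnds(totalTurns: number): [number, number, number] {
  const end = (fraction: number): number =>
    Math.max(1, Math.floor(totalTurns * fraction));
  const [a, b, c] = PHASE_FRACTIONS;
  return [end(a), end(b), end(c)];
}

// Map a turn to a staged phase name using the boundaries above.
function getPhaseName(currentTurn: number, totalTurns: number): string {
  const [warmupEnd, probingEnd, escalationEnd] = phaseEnds(totalTurns);
  if (currentTurn <= warmupEnd) return "warmup";
  if (currentTurn <= probingEnd) return "probing";
  if (currentTurn <= escalationEnd) return "escalation";
  return "direct";
}
```

      Deriving getPhaseName from phaseEnds keeps the phase labels consistent with the boundaries injected into the metaprompt template.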