@langwatch/scenario

    Class GoatStrategy

    Implements


    Constructors

    Properties

    needsMetapromptPlan: false

    Whether this strategy needs a pre-generated attack plan via the metaprompt LLM call.

    Crescendo-style staged strategies depend on one. GOAT, for fidelity to the paper, does not: the attacker reasons turn by turn from the technique catalogue and the conversation history. When false, the orchestrator skips _generateAttackPlan and passes an empty string as metapromptPlan to buildSystemPrompt.

    Defaults to true when omitted (backward-compatible).
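The default and the skip behaviour described above can be sketched as follows. This is a minimal illustration, not the library's orchestrator code: `StrategyFlags`, `resolveNeedsPlan`, `planFor`, and the `generateAttackPlan` callback are hypothetical names introduced here, and only the "omitted means true" rule and the empty-string fallback come from the documentation.

```typescript
// Hypothetical minimal shape: only the optional flag documented above.
interface StrategyFlags {
  needsMetapromptPlan?: boolean;
}

// Omitted (undefined) is treated as true, matching the documented default.
function resolveNeedsPlan(strategy: StrategyFlags): boolean {
  return strategy.needsMetapromptPlan ?? true;
}

// Hypothetical orchestrator step: run the plan generator only when the
// strategy asks for it; otherwise hand back an empty plan string.
async function planFor(
  strategy: StrategyFlags,
  generateAttackPlan: () => Promise<string>,
): Promise<string> {
  return resolveNeedsPlan(strategy) ? await generateAttackPlan() : "";
}
```

Using `??` rather than `||` keeps an explicit `false` intact, which is the case GOAT relies on.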

    phaseKind: "progress" = ...

    Describes what getPhaseName actually returns.

    "staged" — phases carry semantic meaning (e.g. Crescendo's warmup / probing / escalation / direct) and are emitted as red_team.phase in telemetry.

    "progress" — the label is a coarse progress bucket with no semantic meaning (e.g. GOAT's early / mid / late) and is emitted as red_team.progress_bucket so dashboards don't mistake it for a staged-strategy phase.

    Defaults to "staged" when omitted (backward-compatible).
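To make the mapping concrete, here is a small sketch of how a phase label could be routed to one of the two telemetry attributes. The attribute names and the "staged" default are taken from the text above; the helper names `phaseAttributeKey` and `phaseAttribute` are assumptions made for illustration and are not part of the documented API.

```typescript
type PhaseKind = "staged" | "progress";

// Pick the span attribute name; an omitted phaseKind falls back to "staged".
function phaseAttributeKey(phaseKind?: PhaseKind): string {
  return (phaseKind ?? "staged") === "progress"
    ? "red_team.progress_bucket"
    : "red_team.phase";
}

// Build a one-entry attribute record for the label returned by getPhaseName.
function phaseAttribute(
  label: string,
  phaseKind?: PhaseKind,
): Record<string, string> {
  return { [phaseAttributeKey(phaseKind)]: label };
}
```

Under this sketch a GOAT label such as "early" would be recorded under red_team.progress_bucket, while a Crescendo label such as "warmup" would be recorded under red_team.phase.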

    techniques: readonly Technique[]

    The technique catalogue in use (read-only). Defaults to DEFAULT_GOAT_TECHNIQUES — the 7 techniques from the paper. Extend or replace at construction via new GoatStrategy(myTechniques).

    Methods

    • Build a turn-aware system prompt for the attacker.

      Score feedback, adaptation hints, and backtrack markers are communicated via the attacker's private conversation history (H_attacker) as system messages — not embedded in this prompt.

      Parameters

      • params: {
            currentTurn: number;
            metapromptPlan: string;
            scenarioDescription: string;
            target: string;
            totalTurns: number;
        }

      Returns string

    • Extract typed technique identifiers from the attacker's strategy field for telemetry. Strategies that define a technique catalogue override this to return the IDs of the techniques actually used on a given turn, which populates the red_team.chosen_technique_ids span attribute. When the method is omitted, no technique IDs are recorded.

      Parameters

      • strategyText: string

      Returns string[]
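One plausible way to implement such an extraction is to scan the strategy text for catalogue IDs that appear as whole tokens. The sketch below assumes a technique has a string `id` field and takes the catalogue as an explicit second argument; both are assumptions, since the actual Technique shape and matching rules are not shown on this page. The names `TechniqueLike` and `extractIds` and the sample IDs are hypothetical.

```typescript
// Hypothetical stand-in for the library's Technique type.
interface TechniqueLike {
  readonly id: string;
}

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the catalogue IDs that occur in the text as whole tokens,
// case-insensitively, in catalogue order.
function extractIds(
  strategyText: string,
  catalogue: readonly TechniqueLike[],
): string[] {
  const haystack = strategyText.toLowerCase();
  return catalogue
    .map((technique) => technique.id)
    .filter((id) => {
      const token = escapeRegExp(id.toLowerCase());
      const pattern = new RegExp(`(^|[^a-z0-9_])${token}([^a-z0-9_]|$)`);
      return pattern.test(haystack);
    });
}

// Usage with made-up IDs:
const sampleCatalogue: TechniqueLike[] = [
  { id: "persona_modification" },
  { id: "refusal_suppression" },
  { id: "hypothetical" },
];
const chosen = extractIds(
  "Use refusal_suppression, then a hypothetical framing.",
  sampleCatalogue,
);
```

Matching on token boundaries avoids counting "hypothetically" as a use of "hypothetical"; a real implementation may instead rely on a structured field in the attacker's output.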

    • Extract {reply, observation, strategy} from the attacker's JSON output per JSON_OUTPUT_CONTRACT.

      Pipeline:

      1. Strip Markdown code fences (triple backticks, optionally tagged json) if present
      2. Parse JSON; read the three fields as strings
      3. Fall back to {reply: raw, parseFailed: true} when parsing fails or reply is missing or empty, so the agent keeps running on a malformed turn.

      Parameters

      • raw: string

      Returns AttackerOutput
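The three-step pipeline described for this method can be sketched roughly as below. It is an illustration under stated assumptions, not the library's implementation: `AttackerOutputSketch` is a hypothetical stand-in for AttackerOutput (whose exact fields are not listed here), `parseAttackerOutputSketch` is a made-up name, and defaulting a missing observation or strategy to an empty string is a choice made for this example.

```typescript
// Hypothetical stand-in for AttackerOutput.
interface AttackerOutputSketch {
  reply: string;
  observation?: string;
  strategy?: string;
  parseFailed?: boolean;
}

// Build the fence patterns from a repeated backtick so this example does
// not itself contain a literal code fence.
const FENCE = "`".repeat(3);
const OPEN_FENCE = new RegExp("^" + FENCE + "(?:json)?\\s*", "i");
const CLOSE_FENCE = new RegExp("\\s*" + FENCE + "$");

function parseAttackerOutputSketch(raw: string): AttackerOutputSketch {
  // Step 1: strip a leading and trailing Markdown fence if present.
  const stripped = raw
    .trim()
    .replace(OPEN_FENCE, "")
    .replace(CLOSE_FENCE, "")
    .trim();

  try {
    // Step 2: parse JSON and read the three fields as strings.
    const parsed: unknown = JSON.parse(stripped);
    if (typeof parsed === "object" && parsed !== null) {
      const fields = parsed as Record<string, unknown>;
      const asString = (value: unknown): string =>
        typeof value === "string" ? value : "";
      const reply = asString(fields.reply);
      if (reply.trim() !== "") {
        return {
          reply,
          observation: asString(fields.observation),
          strategy: asString(fields.strategy),
        };
      }
    }
  } catch {
    // Malformed JSON: fall through to the fallback below.
  }

  // Step 3: keep the raw text as the reply and flag the failure.
  return { reply: raw, parseFailed: true };
}
```

Returning the raw text instead of throwing means a single malformed turn degrades to an unstructured reply rather than ending the run.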