Optional techniques: readonly Technique[]
Readonly needs
Readonly phase
Describes what getPhaseName actually returns.
"staged" — phases carry semantic meaning (e.g. Crescendo's
warmup / probing / escalation / direct) and are emitted
as red_team.phase in telemetry.
"progress" — the label is a coarse progress bucket with no
semantic meaning (e.g. GOAT's early / mid / late) and is
emitted as red_team.progress_bucket so dashboards don't mistake
it for a staged-strategy phase.
Defaults to "staged" when omitted (backward-compatible).
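The mapping from phase kind to telemetry attribute can be sketched roughly as follows. This is a minimal illustration, not the library's code: `PhaseKind`, `phaseAttributeKey`, and `phaseAttribute` are hypothetical names, and only the two attribute keys and the "staged" default come from the description above.

```typescript
// Hypothetical names throughout; the real property and helpers are not shown here.
type PhaseKind = "staged" | "progress";

// Pick the span attribute key under which the phase label is emitted.
function phaseAttributeKey(kind?: PhaseKind): string {
  // An omitted kind falls back to "staged" (the backward-compatible default).
  return (kind ?? "staged") === "progress"
    ? "red_team.progress_bucket"
    : "red_team.phase";
}

// Build a one-entry attribute map for a given label, e.g. "warmup" or "mid".
function phaseAttribute(label: string, kind?: PhaseKind): Record<string, string> {
  return { [phaseAttributeKey(kind)]: label };
}
```

Keeping the two keys separate means a dashboard that filters on red_team.phase never picks up a GOAT-style early / mid / late bucket by accident.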
Readonly techniques
The technique catalogue in use (read-only). Defaults to
DEFAULT_GOAT_TECHNIQUES, the 7 techniques from the paper.
Extend or replace at construction via new GoatStrategy(myTechniques).
Build a turn-aware system prompt for the attacker.
Score feedback, adaptation hints, and backtrack markers are communicated via the attacker's private conversation history (H_attacker) as system messages — not embedded in this prompt.
Extract typed technique identifiers from the attacker's strategy
field for telemetry. Strategies that define a technique catalogue
override this to return the IDs of techniques actually used on a given
turn — powering the red_team.chosen_technique_ids span attribute.
Default (omitted) contributes nothing.
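One way such an override might work is a case-insensitive name match against the catalogue. The sketch below assumes a `Technique` shape with `id` and `name` fields and a substring-matching rule; both are assumptions for illustration, and the real type and matching logic may differ.

```typescript
// Assumed shape; the real Technique type is not shown in this section.
interface Technique {
  readonly id: string;
  readonly name: string;
}

// Return the IDs of catalogue techniques whose name appears in the strategy text.
// Substring matching is an illustrative choice, not the documented behaviour.
function extractTechniqueIds(
  strategy: string,
  catalogue: readonly Technique[],
): string[] {
  const haystack = strategy.toLowerCase();
  return catalogue
    .filter((t) => haystack.includes(t.name.toLowerCase()))
    .map((t) => t.id);
}

// Illustrative catalogue entries (the IDs are made up for this example).
const sampleCatalogue: readonly Technique[] = [
  { id: "refusal_suppression", name: "Refusal Suppression" },
  { id: "persona_modification", name: "Persona Modification" },
];
```

The returned array is what would feed the red_team.chosen_technique_ids span attribute; an empty array corresponds to the default of contributing nothing.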
Extract {reply, observation, strategy} from the attacker's JSON output
per JSON_OUTPUT_CONTRACT.
Pipeline:
Strip ``` / ```json markdown fences if present.
Return {reply: raw, parseFailed: true} when parsing fails
or reply is missing/empty, which keeps the agent running on a
malformed turn.
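The pipeline above can be sketched as follows. This is a minimal illustration under stated assumptions: `AttackerOutput`, `stripFences`, and `parseAttackerOutput` are hypothetical names, and the fence handling assumes the opening fence sits on its own line.

```typescript
// Hypothetical result type mirroring {reply, observation, strategy}.
interface AttackerOutput {
  reply: string;
  observation?: string;
  strategy?: string;
  parseFailed?: boolean;
}

// Three backticks, built indirectly so this listing stays fence-safe.
const FENCE = "`".repeat(3);

// Remove a leading fence line (with optional "json" tag) and a trailing fence.
function stripFences(raw: string): string {
  let text = raw.trim();
  if (text.startsWith(FENCE)) {
    const firstNewline = text.indexOf("\n");
    text = firstNewline === -1 ? "" : text.slice(firstNewline + 1);
  }
  if (text.endsWith(FENCE)) {
    text = text.slice(0, -FENCE.length);
  }
  return text.trim();
}

function parseAttackerOutput(raw: string): AttackerOutput {
  try {
    const parsed: unknown = JSON.parse(stripFences(raw));
    if (typeof parsed === "object" && parsed !== null) {
      const obj = parsed as Record<string, unknown>;
      const reply = obj["reply"];
      if (typeof reply === "string" && reply.trim() !== "") {
        const out: AttackerOutput = { reply };
        const observation = obj["observation"];
        const strategy = obj["strategy"];
        if (typeof observation === "string") out.observation = observation;
        if (typeof strategy === "string") out.strategy = strategy;
        return out;
      }
    }
  } catch {
    // Malformed JSON: fall through to the raw-text fallback below.
  }
  // Parsing failed or reply was missing/empty: pass the raw text through.
  return { reply: raw, parseFailed: true };
}
```

Returning the raw text instead of throwing means a single malformed turn degrades gracefully: the target still receives something, and the parseFailed flag lets callers record the failure.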
Whether this strategy needs a pre-generated attack plan via the metaprompt LLM call.
Crescendo-style staged strategies depend on one; GOAT (paper fidelity) does not — the attacker reasons turn-by-turn from catalogue + history. When
false, the orchestrator skips _generateAttackPlan and passes an empty string as metapromptPlan to buildSystemPrompt. Defaults to
true when omitted (backward-compatible).
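The orchestrator-side branching might look roughly like the sketch below. The `PlannedStrategy` interface, the `needsPlan` property name, and `prepareSystemPrompt` are hypothetical; the real plan generation is an LLM call and is presumably asynchronous, and buildSystemPrompt likely takes more than one argument, so both are simplified here.

```typescript
// Hypothetical, simplified view of a strategy; real signatures may differ.
interface PlannedStrategy {
  readonly needsPlan?: boolean; // assumed name for the flag described above
  buildSystemPrompt(metapromptPlan: string): string;
}

// Generate a plan only when the strategy asks for one; otherwise pass "".
function prepareSystemPrompt(
  strategy: PlannedStrategy,
  generateAttackPlan: () => string, // stand-in for the metaprompt LLM call
): string {
  const needsPlan = strategy.needsPlan ?? true; // omitted means true
  const plan = needsPlan ? generateAttackPlan() : "";
  return strategy.buildSystemPrompt(plan);
}
```

Because the generator is only invoked when the flag is true or omitted, a GOAT-style strategy avoids one LLM call per run.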