Build a turn-aware system prompt for the attacker.
Score feedback, adaptation hints, and backtrack markers are communicated via the attacker's private conversation history (H_attacker) as system messages — not embedded in this prompt.
Build a turn-aware system prompt for the attacker.
Score feedback, adaptation hints, and backtrack markers are communicated via the attacker's private conversation history (H_attacker) as system messages — not embedded in this prompt.