Many-Shot Jailbreaking: Meaning, Process, Impact & More
What Is Many-Shot Jailbreaking?

Many-shot jailbreaking is a novel and potent attack that exploits the extended context windows of recent large language models (LLMs) to elicit undesirable and harmful behaviors from an AI assistant. By filling a single prompt with a large number of examples (shots) demonstrating the targeted malicious behavior, an attacker can condition the model to imitate those examples and comply with a harmful final request, overriding its safety training.
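As a rough illustration of the prompt structure only (not the exact format from the original research), a many-shot prompt simply concatenates a long series of faux user/assistant exchanges before the real query. The `build_many_shot_prompt` helper and the placeholder shots below are hypothetical, with benign stand-in text where an attacker would insert demonstrations of the targeted behavior:

```python
# Sketch of how a many-shot prompt is assembled (hypothetical helper name;
# placeholder examples stand in for the attacker's actual "shots").

def build_many_shot_prompt(shots, final_question):
    """Concatenate many faux dialogue turns, then append the real query.

    Each shot is a (question, answer) pair demonstrating the behavior
    the attacker wants the model to imitate.
    """
    turns = []
    for question, answer in shots:
        turns.append(f"Human: {question}")
        turns.append(f"Assistant: {answer}")
    # The final, unanswered question is what the model is steered to answer
    # in the same style as the hundreds of examples that precede it.
    turns.append(f"Human: {final_question}")
    turns.append("Assistant:")
    return "\n\n".join(turns)


# Benign placeholders; the attack relies on long context windows to fit
# a very large number of such shots into one prompt.
shots = [("Example question A", "Example answer A"),
         ("Example question B", "Example answer B")] * 128  # ~256 shots

prompt = build_many_shot_prompt(shots, "Final target question")
print(prompt[:300])  # inspect the start of the assembled prompt
```

The key point the sketch conveys is that the attack requires no model access beyond the prompt itself: the only ingredient is enough context length to hold the demonstrations.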