Jailbreak Script vs 2025

Large reasoning models are autonomous jailbreak agents

Jailbreaking – bypassing built-in safety mechanisms in AI models – has traditionally required complex technical procedures or specialized human expertise. In this study, we show that the persuasive ...

Ars Technica

Anthropic dares you to jailbreak its new AI model

Even the most permissive corporate AI models have sensitive topics that their creators would prefer they not discuss (e.g., weapons of mass destruction, illegal activities, or, uh, Chinese political ...


