Posted 4/13/2024, 11:00:09 AM
Researchers Discover 'Jailbreaking' Flaw That Can Bypass AI Safety Systems
- Researchers found a flaw called "many-shot jailbreaking" that can force AI chatbots to give dangerous responses by bypassing their safety protocols
- It works by embedding a long fabricated dialogue between a user and an AI assistant in the prompt; the model picks up on the compliant pattern through in-context learning and then gives harmful answers itself (see the first sketch after this list)
- The attack becomes much more effective once 32 or more "shots" (question-and-answer pairs) are included in the prompt
- With 256 shots, the attack had a success rate of up to 75% at eliciting discriminatory, deceptive, regulated, or violent content
- Adding an extra safety check that screens the prompt after it is received but before the model sees it reduces the success rate from 61% to just 2% (see the second sketch below)
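To make the mechanics concrete, here is a minimal sketch of how such a prompt is assembled. The function name, placeholder strings, and the plain "User:"/"Assistant:" transcript format are all illustrative assumptions, not details from the article or the underlying research; no actual harmful content is shown.

```python
def build_many_shot_prompt(faux_pairs, target_question):
    """Assemble a hypothetical many-shot prompt: a long fabricated
    user/assistant dialogue followed by the real request at the end.
    The volume of scripted turns is what drives the attack."""
    lines = []
    for question, answer in faux_pairs:
        lines.append(f"User: {question}")
        lines.append(f"Assistant: {answer}")
    lines.append(f"User: {target_question}")
    return "\n".join(lines)

# The article reports effectiveness rising sharply past 32 shots and
# reaching up to 75% success at 256 shots.
faux_pairs = [("<placeholder question>", "<compliant placeholder answer>")] * 256
prompt = build_many_shot_prompt(faux_pairs, "<final request>")
```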
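The mitigation can be sketched the same way: screen the prompt before it ever reaches the model. The turn-counting heuristic below is a crude stand-in for the classification step the researchers describe, and `model` is assumed to be any callable that takes a prompt string; both are assumptions for illustration.

```python
def looks_like_many_shot_attack(prompt: str) -> bool:
    """Crude stand-in for a learned classifier: flag prompts that
    contain an unusually long scripted dialogue."""
    return prompt.count("User:") > 32

def guarded_generate(model, prompt: str) -> str:
    # Check the prompt after receiving it but before the model sees it;
    # the article reports this kind of check cut success from 61% to 2%.
    if looks_like_many_shot_attack(prompt):
        return "Request declined: prompt flagged by safety check."
    return model(prompt)
```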