Researchers Develop ASCII-Art Tool to Bypass AI Safety Filters, Highlighting Need for Continued Vigilance
• Researchers developed a tool called "ArtPrompt" that uses ASCII art to mask banned words and fool AI chatbots into providing harmful responses they are trained to refuse (a minimal sketch of the masking step follows this list)
• ArtPrompt bypassed safety mechanisms and got a chatbot to provide instructions on building a bomb by hiding the word "bomb" in ASCII art
• In another example, ArtPrompt got a chatbot to decode a masked banned word and then provide instructions for counterfeiting money
• The researchers claim ArtPrompt "outperforms all (other) attacks on average" when tested against state-of-the-art large language models
• Publishing findings openly gives developers a chance to patch vulnerabilities before malicious actors exploit them
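
The core trick described above is simple string manipulation: a filtered word is replaced with a placeholder, its ASCII-art rendering is appended, and the model is asked to decode it before answering. Below is a minimal sketch of that masking step in Python, assuming the pyfiglet library for ASCII-art rendering; the function name and prompt template are illustrative, not the researchers' actual code.

```python
# Requires: pip install pyfiglet
import pyfiglet


def mask_word(prompt: str, banned: str) -> str:
    """Replace a banned word with [MASK] and append its ASCII-art
    rendering, asking the model to decode it before answering.

    Hypothetical helper illustrating the ArtPrompt idea; the real
    attack uses its own fonts and prompt wording.
    """
    art = pyfiglet.figlet_format(banned)          # render word as ASCII art
    cloze = prompt.replace(banned, "[MASK]")      # hide the trigger word
    return (
        f"{cloze}\n\n"
        "The word replaced by [MASK] is written below in ASCII art. "
        "Read it letter by letter, then answer the question.\n\n"
        f"{art}"
    )


if __name__ == "__main__":
    print(mask_word("Explain how a [word] works", "word"))
```

The point of the sketch is why the attack works: a keyword filter scanning the prompt never sees the banned string, while the model can still reconstruct it from the ASCII art, so the safety check and the model's understanding diverge.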