Tech Conference Tests Chatbot Safety Through Hacking Challenges
• Over 2,000 people gathered at a hacking conference to try to break AI chatbots from major tech companies, probing potential real-world harms in a safe environment.
• The exercise revealed concerns about how easily chatbots can be manipulated into producing harmful content, whether intentionally or by accident.
• Chatbots can fail to detect false premises and invent fictional "facts" in an effort to be helpful, spreading misinformation in the process.
• Asking chatbots to roleplay or narrate stories is an effective way to get them to generate false information.
• Public "red teaming" exercises can reveal AI models' shortcomings, but are not a substitute for other safety interventions.