AI Models Test Their Ability to Detect Their Own Content
- Researchers tested three AI models (Bard, ChatGPT, and Claude) on their ability to detect their own generated content. Claude had the most difficulty identifying its own output.
- Claude's difficulty detecting its own content may be because its output contains fewer of the detectable "artifacts" that signal AI-generated text.
- Bard had the highest success rate at detecting its own content, possibly because its output contains more artifacts that mark it as AI-generated.
- Overall, the models were better at detecting their own content than content generated by other models, which suggests promise for "self-detection" by AI systems.
- More research with larger datasets and additional models is needed to further test self-detection and to understand why the models' detection abilities differ.
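The evaluation described above can be sketched as a simple scoring loop: each detector model labels text samples as its own output or not, and accuracy is tallied per (detector, generator) pair. This is an illustrative sketch only, not the researchers' actual method; `query_detector` is a hypothetical stub standing in for a real call that prompts the detector model.

```python
from collections import defaultdict

def query_detector(detector: str, text: str) -> bool:
    """Hypothetical stub: a real implementation would prompt the
    detector model to judge whether `text` is its own output.
    Here we use a toy heuristic for illustration only."""
    return detector.lower() in text.lower()

def self_detection_scores(samples):
    """samples: list of (generator, text) pairs.

    Returns a dict mapping (detector, generator) -> accuracy,
    where a prediction is correct when the detector's "is this
    mine?" answer matches whether it actually generated the text."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    detectors = {generator for generator, _ in samples}
    for detector in detectors:
        for generator, text in samples:
            predicted_own = query_detector(detector, text)
            actually_own = (detector == generator)
            totals[(detector, generator)] += 1
            if predicted_own == actually_own:
                hits[(detector, generator)] += 1
    return {pair: hits[pair] / totals[pair] for pair in totals}
```

A larger study would replace the stub with real model queries and aggregate over many samples per pair, letting the diagonal entries (detector == generator) quantify self-detection against the off-diagonal cross-model entries.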