Posted 4/8/2024, 9:22:33 PM
AI Companies Explore Potential of Synthetic Data, But Face Risks of Data Quality and Model Reliability
- AI companies are looking into "synthetic data" to address a shortage of training data, but it is unclear whether the approach will work
- Models trained on synthetic data can become "inbred" and develop problems, a phenomenon dubbed "Habsburg AI"
- In one study, an AI model broke down after just five generations of training on synthetic data, a condition termed "Model Autophagy Disorder"
- OpenAI and Anthropic are trying a two-model system to check the accuracy of synthetic data
- Anthropic acknowledges that Claude 3 was trained on "data we generate internally", but the approach remains largely unproven