AI Faces Backlash Over Training Data Scraped from Internet
-
Generative AI models like ChatGPT are trained on massive datasets scraped from the public internet, and those datasets likely include some private data.
-
Writers and artists are suing AI companies for copyright infringement, alleging that their work was used without permission.
-
Your public online data (social media posts, LinkedIn profiles, etc.) may be scraped, while private accounts are harder to access.
-
There's little transparency about what data is used and how it is used, making it hard to hold AI companies accountable.
-
The internet data used for training is skewed toward certain demographics, so AI models can inherit and amplify those biases.