Tech Giants' Data Hunger Drives Questionable Tactics

Tech companies like OpenAI, Google, and Meta are desperately seeking more data to train their AI models, even if it means bending rules or laws.
OpenAI built a tool that transcribed YouTube videos without permission to obtain more text data, and Google may have done the same.
Meta executives discussed buying a publishing house and using copyrighted online content to train their models, arguing that doing so qualifies as fair use.
The tech giants have nearly exhausted the reputable English-language text available online, yet they still need more to advance their AI models.
Facing a data shortage, companies like OpenAI are exploring the use of AI-generated "synthetic" text to train more powerful AI systems.