
Tech Giants Use Personal Data to Secretly Train AI, Raising Privacy Fears

  • Tech companies like Google, Meta, and Microsoft are using personal data from products like Gmail and Instagram to train AI systems without permission. This raises privacy concerns.

  • The scale of data needed to train modern AI systems is massive. Companies take shortcuts to get large volumes of training data.

  • Generative AI like chatbots brings new privacy risks, including the potential for AI systems to regurgitate private information.

  • Companies set their own rules about what data can be used for AI training, a process that remains largely opaque to users.

  • Users have little control or say in how their data is used to develop lucrative new AI products that could disrupt industries.

washingtonpost.com
Relevant topic timeline:
Main Topic: The demise of the sharing economy due to the appropriation of data for AI models by corporations. Key Points: (1) Data, often considered a non-rival resource, was believed to be the basis for a new mode of production and a commons in the sharing economy. (2) The appropriation of our data by corporations for AI training, however, has revealed the hidden costs and rivalrous nature of data. (3) Corporations now pretend to be concerned about AI's disruptive power while profiting from that appropriation, highlighting a "tyranny of the commons" and the need for regulation.
Proper research data management, including the use of AI, is crucial for scientists to reproduce prior results, combine data from multiple sources, and make data more accessible and reusable, ultimately improving the scientific process and benefiting all forms of intelligence.
The author discusses how the sharing economy, built on the notion of data as a non-rival good, has led to the appropriation of our data by corporations and its conversion into training data for AI models, ultimately resulting in a "tyranny of the commons."
The global market for synthetic data generation is rapidly growing as organizations in various industries seek cost-effective and privacy-compliant alternatives to real data for training machine learning models and conducting data-driven research. The market is estimated to reach $REDACTED billion by 2028, with North America leading in adoption due to the presence of leading global companies and advanced technologies like AI and ML.
Meta, the parent company of Facebook and Instagram, has introduced a privacy setting that allows users to request that their data not be used to train its AI models, although the effectiveness of this form is questionable.
The podcast discusses the changing landscape of data gathering, trading, and ownership, including the challenges posed by increasing regulation, the impact of artificial intelligence, and the perspectives from industry leaders.
Artificial intelligence has the potential to transform the financial system by improving access to financial services and reducing risk, according to Google Cloud CEO Thomas Kurian. He suggests leveraging the technology to reach customers with personalized offers, create hyper-personalized customer interfaces, and develop anti-money laundering platforms.
Car companies are collecting excessive personal data from drivers and providing little to no control over its use, according to a report by the Mozilla Foundation, which warns that cars are the worst product for privacy protection and highlights that 84% of car brands share or sell data.
Companies such as Rev, Instacart, and others are updating their privacy policies to allow the collection of user data for training AI models like speech-to-text and generative AI tools.
The generative AI boom has led to a "shadow war for data," as AI companies scrape information from the internet without permission, sparking a backlash among content creators and raising concerns about copyright and licensing in the AI world.
Microsoft inadvertently exposed 38TB of personal data, including sensitive information, due to a data leak during the uploading of training data for AI models, raising concerns about the need for improved security measures as AI usage becomes more widespread.
While many experts are concerned about the existential risks posed by AI, Mustafa Suleyman, cofounder of DeepMind, believes that the focus should be on more practical issues like regulation, privacy, bias, and online moderation. He is confident that governments can effectively regulate AI by applying successful frameworks from past technologies, although critics argue that current internet regulations are flawed and insufficiently hold big tech companies accountable. Suleyman emphasizes the importance of limiting AI's ability to improve itself and establishing clear boundaries and oversight to ensure enforceable laws. Several governments, including the European Union and China, are already working on AI regulations.
AI and big data are closely linked to the surveillance business model, used by companies like Google and Meta, to make determinations and predictions about users, shaping their access to opportunities and resources, according to Signal president Meredith Whittaker. She also highlighted the exploitation of human labor in creating AI systems and the potential negative implications of facial recognition technology.
Big tech firms, including Google and Microsoft, are engaged in a competition to acquire content and data for training AI models, according to Microsoft CEO Satya Nadella, who testified in an antitrust trial against Google and highlighted the race for content among tech firms. Microsoft has committed to assuming copyright liability for users of its AI-powered Copilot, addressing concerns about the use of copyrighted materials in training AI models.