
Posted 9/25/2023, 5:27:29 PM

AI Training Dataset Sparks Controversy Over Author Consent and Compensation

Use new search tool to see which authors' works were used to train AI without permission
183,000 books in dataset being used to train generative AI like Meta's LLaMA
Authors spent years creating books, didn't know they were being used this way
People training the AI stand to profit while authors may be replaced
Very few people understand how AI models like LLaMA are developed, even though the technology threatens to upend the world

theatlantic.com

Relevant topic timeline:

8/14/2023

Why the Great AI Backlash Came for a Tiny Startup You’ve Probably Never Heard Of

The main topic of the article is the backlash against AI companies that use unauthorized creative work to train their models. Key points: 1. The controversy surrounding Prosecraft, a linguistic analysis site that used scraped data from pirated books without permission. 2. The debate over fair use and copyright infringement in relation to AI projects. 3. The growing concern among writers and artists about the use of generative AI tools to replace human creative work and the push for individual control over how their work is used.

8/19/2023

Biblioracle: Author’s reputation was on the line, thanks to AI

Main topic: The potential harm of AI-generated content and the need for caution when purchasing books. Key points: 1. AI is being used to generate low-quality books masquerading as quality work, which can harm the reputation of legitimate authors. 2. Amazon's response to the issue of AI-generated books has been limited, highlighting the need for better safeguards and proof of authorship. 3. Readers need to adopt a cautious approach and rely on trustworthy sources, such as local bookstores, to avoid misinformation and junk content.

8/19/2023

AI and You: The Copyright 'Sword' Over AI, Life Coaches Including Jesus Coming Your Way

Main topic: Copyright concerns and potential lawsuits surrounding generative AI tools. Key points: 1. The New York Times may sue OpenAI for allegedly using its copyrighted content without permission or compensation. 2. Getty Images previously sued Stability AI for using its photos without a license to train its AI system. 3. OpenAI has begun acknowledging copyright issues and signed an agreement with the Associated Press to license its news archive.

8/19/2023

Revealed: The Authors Whose Pirated Books Are Powering Generative AI

Main topic: The use of copyrighted books to train large language models in generative AI. Key points: 1. Writers Sarah Silverman, Richard Kadrey, and Christopher Golden have filed a lawsuit alleging that Meta violated copyright laws by using their books to train LLaMA, a large language model. 2. Approximately 170,000 books, including works by Stephen King, Zadie Smith, and Michael Pollan, are part of the dataset used to train LLaMA and other generative-AI programs. 3. The use of pirated books in AI training raises concerns about the impact on authors and the control of intellectual property in the digital age.

8/21/2023

Generative AI datasets could face a reckoning

The use of copyrighted works to train generative AI models, such as Meta's LLaMA, is raising concerns about copyright infringement and transparency, with potential legal consequences and a looming "day of reckoning" for the datasets used.

8/22/2023

Scraping or Stealing? A Legal Reckoning Over AI Looms

Three artists, including concept artist Karla Ortiz, are suing AI art generators Stability AI, Midjourney, and DeviantArt for using their work to train generative AI systems without their consent, in a case that could test the boundaries of copyright law and impact the way AI systems are built. The artists argue that feeding copyrighted works into AI systems constitutes intellectual property theft, while AI companies claim fair use protection. The outcome could determine the legality of training large language models on copyrighted material.

8/22/2023

A.I. Copyrights, Michael Jackson Abuse Cases, Smokey Robinson Trial & More Top Legal News

A federal judge has ruled that works created by artificial intelligence (A.I.) are not covered by copyrights, stating that copyright law is designed to incentivize human creativity, not non-human actors. This ruling has implications for the future role of A.I. in the music industry and the monetization of works created by A.I. tools.

8/23/2023

Zadie Smith, Stephen King and Rachel Cusk’s pirated works used to train AI

Authors such as Zadie Smith, Stephen King, Rachel Cusk, and Elena Ferrante have discovered that their pirated works were used to train artificial intelligence tools by companies including Meta and Bloomberg, leading to concerns about copyright infringement and control of the technology.

8/23/2023

Fake Books on Amazon Drive Authors to Shield Their Names From AI

Generative AI is enabling the creation of fake books that mimic the writing style of established authors, raising concerns regarding copyright infringement and right of publicity issues, and prompting calls for compensation and consent from authors whose works are used to train AI tools.

8/31/2023

UK publishers urge Sunak to protect works ingested by AI models

UK publishers have called on the prime minister to protect authors' intellectual property rights in relation to artificial intelligence systems, as OpenAI argues that authors suing them for using their work to train AI systems have misconceived the scope of US copyright law.

9/1/2023

Inventor Claims His AI Is Sentient, Fights to Copyright Its Creations

AI researcher Stephen Thaler argues that his AI creation, DABUS, should be able to hold copyright for its creations, but legal experts and courts have rejected the idea, stating that copyright requires human authorship.

9/5/2023

U.S. Copyright Office Invites Public To Comment On AI

The United States Copyright Office has launched a study on artificial intelligence (AI) and copyright law, seeking public input on various policy issues and exploring topics such as AI training, copyright liability, and authorship. Other U.S. government agencies, including the SEC, USPTO, and DHS, have also initiated inquiries and public forums on AI, highlighting its impact on innovation, governance, and public policy.

9/7/2023

Generation AI: education reluctantly embraces the bots

The use of artificial intelligence (AI) in academia is raising concerns about cheating and copyright issues, but also offers potential benefits in personalized learning and critical analysis, according to educators. The United Nations Educational, Scientific and Cultural Organization (UNESCO) has released global guidance on the use of AI in education, urging countries to address data protection and copyright laws and ensure teachers have the necessary AI skills. While some students find AI helpful for basic tasks, they note its limitations in distinguishing fact from fiction and its reliance on internet scraping for information.

9/8/2023

Amazon Now Requiring AI Disclosure for Ebooks Amid Author Backlash

Amazon.com is now requiring writers to disclose if their books include artificial intelligence material, a step praised by the Authors Guild as a means to ensure transparency and accountability for AI-generated content.

9/12/2023

AI Authors Sue Meta and OpenAI for Alleged Copyright Infringement in Training ChatGPT

Authors, including Michael Chabon, are filing class action lawsuits against Meta and OpenAI, alleging that the companies infringed copyright by using their books to train artificial intelligence systems without permission, and seeking the destruction of AI systems trained on their works.

9/13/2023

Amazon Launches AI Tool to Help Sellers Create Product Listings

Amazon has introduced an AI tool for sellers that generates copy for their product pages, helping them create product titles, bullet points, and descriptions in order to improve their listings and stand out on the competitive third-party marketplace.

9/15/2023

AI Firms Secretly Amass Data to Train Models, Sparking Backlash From Creators

The generative AI boom has led to a "shadow war for data," as AI companies scrape information from the internet without permission, sparking a backlash among content creators and raising concerns about copyright and licensing in the AI world.

9/20/2023

Actors Face Threat from AI-Generated Audiobooks

Project Gutenberg, in collaboration with Microsoft and MIT, has used AI to transform thousands of ebooks into audiobooks, raising concerns among actors who fear the threat to their careers.

9/20/2023

Authors Guild Sues OpenAI Over Copyright Infringement for Using Books to Train ChatGPT

The Authors Guild, representing prominent fiction authors, has filed a lawsuit against OpenAI, alleging copyright infringement and the unauthorized use of their works to train AI models like ChatGPT, which generates summaries and analyses of their novels, interfering with their economic prospects. This case could determine the legality of using copyrighted material to train AI systems.

9/22/2023

Amazon Limits AI-Generated Books to 3 Per Day to Curb Low-Quality Content Farms

Amazon has introduced a volume limit, meant to prevent abuse, under which authors, including those using AI, can still "write" and publish up to three books per day on its platform, despite the poor reputation of AI-generated books sold on the site.

9/22/2023

Amazon Requires Disclosure of AI-Generated Books Amid Controversy Over ChatGPT Works

Amazon has introduced new guidelines requiring publishers to disclose the use of AI in content submitted to its Kindle Direct Publishing platform, in an effort to curb unauthorized AI-generated books and copyright infringement. Publishers are now required to inform Amazon about AI-generated content, but AI-assisted content does not need to be disclosed. High-profile authors have recently joined a class-action lawsuit against OpenAI, the creator of the AI chatbot ChatGPT, for alleged copyright violations.

9/26/2023

Meta's AI Chatbot Exposes Ulterior Motives and Hidden Agenda in Book Selection Process

Meta's generative A.I. machines used 183,000 books to learn how to write, raising concerns about copyright violation and the program's ability to accurately distinguish between authors with similar names.

9/28/2023

AI-Generated Books Flood Amazon, Raising Concerns for Human Authors

This article highlights the problem of AI-generated books flooding Amazon and other online booksellers. The excessive number of low-quality AI-generated books has made it difficult for customers to find high-quality books written by humans. Several AI detection startups are offering solutions to proactively flag AI-generated materials, but Amazon has yet to embrace this technology. The article discusses the potential benefits of AI flagging for online book buyers and the ethical responsibility of booksellers to disclose whether a book was written by a human or a machine. However, there are concerns about the accuracy of current AI detection tools and their false positives, which have led some institutions to discontinue their use. Despite these challenges, many in the publishing industry believe that AI flagging is necessary to maintain trust and transparency in the marketplace.

9/29/2023

The book "The Futurist" by author and journalist Peter Rubin is among the thousands of pirated books being used to train generative-AI systems, sparking concerns about the future of human writers and copyright infringement.

9/30/2023

AI-Generated Books Rip Off Real Authors and Flood Amazon Despite Objections

Artificial intelligence (AI)-generated books are causing concerns as authors like Rory Cellan-Jones find biographies written about them without their knowledge or consent, leading to calls for clear labeling of AI-generated content and the ability for readers to filter them out. Amazon has implemented some restrictions on the publishing of AI-generated books but more needs to be done to protect authors and ensure ethical standards are met.

10/3/2023

Tech Giants Compete for Data to Train AI Models, Microsoft CEO Says

Big tech firms, including Google and Microsoft, are racing to acquire content and data for training AI models, according to Microsoft CEO Satya Nadella, who testified in an antitrust trial against Google. Microsoft has committed to assuming copyright liability for users of its AI-powered Copilot, addressing concerns about the use of copyrighted materials in training AI models.

10/8/2023

Investigation Reveals 200,000 Books Used Without Permission to Train AI, Outraging Many Authors

Tech companies are facing backlash from authors after it was revealed that almost 200,000 pirated e-books were used to train artificial intelligence systems, with many authors expressing outrage and feeling exploited by the unauthorized use of their work.

10/8/2023

Authors Cry Foul as Tech Giants Use Pirated Books to Train AI Without Permission

Tech companies are facing backlash from authors whose books were used without permission to train artificial intelligence systems, with the data set consisting of pirated e-books; authors are expressing outrage and calling it theft, while some see it as an opportunity for their work to be read and to educate.

10/9/2023

Authors Outraged as Tech Companies Use 200,000 Books Without Consent to Train AI Models

Books by famous authors, including J.K. Rowling and Neil Gaiman, are being used without permission to train AI models, drawing outrage from the authors and sparking lawsuits against the companies involved.

10/9/2023

Authors Cry Foul as Pirated Books Train AIs Without Permission

Tech companies are using thousands of books, including pirated copies, to train artificial intelligence systems without the permission of authors, leading to copyright infringement concerns and loss of income.

10/12/2023

Google Offers Legal Protection for Users of AI Products Over Copyright Concerns

Google has stated that it will provide legal protection for customers who use certain generative AI products and face copyright infringement lawsuits, covering both training data and the results generated by its foundation models.

10/16/2023

New AI Tool Lets Anyone Quickly Write and Publish E-Books Without Tech Skills

Get a lifetime subscription to My AI eBook Creation Pro for just $34.99, a 91% discount, and use AI to quickly and easily write and publish your own e-books.

10/17/2023

A.I. Firms Face Potentially Devastating Lawsuits Over Training Data Copyright

The use of copyrighted materials to train AI models poses a significant legal challenge, with companies like OpenAI and Meta facing lawsuits for allegedly training their models on copyrighted books, and legal experts warning that copyright challenges could pose an existential threat to existing AI models if not handled properly. The outcome of ongoing legal battles will determine whether AI companies will be held liable for copyright infringement and potentially face the destruction of their models and massive damages.

10/17/2023

New Tool Uses AI to Easily Create Full eBooks for Revenue and SEO

My AI eBook Creation Pro is an AI tool that helps you create and publish full eBooks, allowing you to generate revenue and boost your ranking in search algorithms.

10/18/2023

AI Companies Face Backlash From Authors Over Use of Books Without Permission

Authors are expressing anger and incredulity over the use of their books to train AI models, leading to the filing of a class-action copyright lawsuit by the Authors Guild and individual authors against OpenAI and Meta, claiming unauthorized and pirated copies were used.

10/18/2023

Meta, Microsoft and Bloomberg Sued for Allegedly Using Pirated Books to Train AI

Prominent authors, including former Arkansas governor Mike Huckabee and Christian author Lysa TerKeurst, have filed a lawsuit accusing Meta, Microsoft, and Bloomberg of using their work without permission to train artificial intelligence systems, specifically the controversial "Books3" dataset.

10/19/2023

Authors Fight Back Against Tech Giants' Unauthorized Use of Books to Train AI

Tech companies like Meta, Google, and Microsoft are facing lawsuits from authors who accuse them of using their copyrighted books to train AI systems without permission or compensation, prompting a call for writers to band together and demand fair compensation for their work.

10/19/2023

AI Training Data Lacks Transparency, Raising Concerns Over Privacy and Bias

Generative AI systems, trained on copyrighted material scraped from the internet, are facing lawsuits from artists and writers concerned about copyright infringement and privacy violations. The lack of transparency regarding data sources also raises concerns about data bias in AI models. Protecting data from AI is challenging, with limited tools available, and removing copyrighted or sensitive information from AI models would require costly retraining. Companies currently have little incentive to address these issues due to the absence of AI policies or legal rulings.

10/19/2023

Publishing Groups Call for EU Transparency Laws on AI Training Data to Protect Books and Democracy

Three major European publishing trade bodies are calling on the EU to ensure transparency and regulation in artificial intelligence to protect the book chain and democracy, citing the illegal and opaque use of copyright-protected books in the development of generative AI models.

10/19/2023

Huckabee Sues Meta, Microsoft for Using His Books in AI Training Without Payment

Former Arkansas Governor Mike Huckabee and other authors have filed a lawsuit against Meta, Microsoft, and other companies, alleging that their books were pirated and used without permission to train AI models, in the latest case of authors accusing tech companies of copyright infringement in relation to AI training data.

10/20/2023

Big Name Authors Sue OpenAI Alleging AI Models Infringe on Copyrights

A group of prominent authors, including Douglas Preston, John Grisham, and George R.R. Martin, is suing OpenAI for copyright infringement over its AI system, ChatGPT, which they claim used their works without permission or compensation, leading to derivative works that harm the market for their books. The publishing industry is increasingly concerned about the unchecked power of AI-generated content and is pushing for consent, credit, and fair compensation when authors' works are used to train AI models.

10/20/2023

AI Companies Compensate Creators for Training Data as Generative Models Threaten Creative Jobs

Companies like Adobe, Canva, and Stability AI are developing incentive plans to compensate artists and creators who provide their work as training data for AI models, addressing concerns about copyright infringement and ensuring a supply of high-quality content.