Copyright Concerns Mount Over AI Models' Text Memorization
• Generative AI models like ChatGPT may depend on "memorizing" copyrighted books and articles without permission, exposing their developers to legal risk
• Lawsuits argue training is not "fair use" since models can reproduce long sections of text verbatim
• Memorization seems inherent to large language models - removing it could "cripple" usefulness
• Possible solutions include hiding memorization via "alignment training" or using retrieval to cite sources
• No blanket ruling on fair use expected - models will be judged case-by-case on outputs