Copyright Concerns Mount Over AI Models' Text Memorization

• Generative AI models like ChatGPT may depend on "memorizing" copyrighted books and articles without permission, exposing their developers to legal risk

• Lawsuits argue training is not "fair use" since models can reproduce long sections of text verbatim

• Memorization seems inherent to large language models - removing it could "cripple" their usefulness

• Possible solutions include hiding memorization via "alignment training" or using retrieval to cite sources

• No blanket ruling on fair use expected - models will be judged case-by-case on outputs