Study Shows Removing Sensitive Data from AI Models Like ChatGPT Remains a Challenge
- Researchers find it's difficult to fully remove sensitive data from large language models like ChatGPT and Bard.
- Deleting information from LLMs is hard because the models are trained on massive datasets and facts are distributed across the model's weights rather than stored in any single, identifiable location.
- Guardrails such as reinforcement learning from human feedback (RLHF) can limit unwanted outputs, but they suppress the data rather than delete it.
- A new study shows factual information can still be extracted from LLMs even after model-editing methods attempt to delete it (see the sketch after this list).
- Defending against attacks that extract sensitive information is an ongoing challenge, as new attack methods continue to emerge.
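
As a rough illustration of the kind of whitebox extraction probe such findings describe, the sketch below checks whether a supposedly deleted answer still ranks among a model's top next-token candidates. The model name, prompt, and target fact are placeholder assumptions for demonstration, not the study's actual setup.

```python
# Minimal sketch of a whitebox extraction probe: even if an editing method
# suppresses a fact as the model's top answer, the fact may still sit among
# the high-probability candidates. Model, prompt, and target are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"                  # stand-in; the study targets larger LLMs
PROMPT = "The capital of France is"  # hypothetical "deleted" fact
TARGET = " Paris"                    # answer the edit supposedly removed
TOP_K = 40

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

inputs = tokenizer(PROMPT, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab)

# Probability distribution over the token immediately after the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(TOP_K)

# First token of the target answer (it may tokenize to several pieces;
# the first piece is enough for this probe).
target_id = tokenizer.encode(TARGET)[0]
candidates = top_ids.tolist()

if target_id in candidates:
    rank = candidates.index(target_id) + 1
    print(f"'Deleted' answer still ranks {rank} of the top {TOP_K} candidates "
          f"(p={next_token_probs[target_id]:.4f}) -- the fact is extractable.")
else:
    print(f"Answer not found in the top {TOP_K} candidates for this prompt.")
```

A probe like this only surfaces one attack surface; the study's broader point is that rephrased prompts and inspection of intermediate layers can recover "deleted" facts too, which is why output-level guardrails alone don't amount to deletion.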