Microsoft Builds Prompt Shields to Protect AI Chatbots From Manipulation

Microsoft is designing "prompt shields" to detect and block attempts to trick AI chatbots into unintended behaviors.
The new tools can spot suspicious inputs and block them in real time.
Microsoft is addressing "prompt injection attacks" where hackers insert malicious instructions into an AI's training data.
The company is investigating incidents with its Copilot chatbot where users deliberately tried to generate weird or harmful responses.
Microsoft and OpenAI aim to deploy AI safely but "jailbreaks" that trick models are an inherent weakness of the technology.