Hacker creates false memories in ChatGPT to steal victim data — but it might not be as bad as it sounds

ChatGPT app on an iPhone
(Image credit: Shutterstock / Primakov)

Security researchers have exposed a vulnerability which could allow threat actors to store malicious instructions in a user’s memory settings in the ChatGPT MacOS app.

A report from Johann Rehberger at Embrace The Red noted how an attacker could trigger a prompt injection to take control of ChatGPT, and can then insert a memory into its long-term storage and persistence mechanism. This leads to the exfiltration of the conversation on both sides straight to the attacker’s server.

From then on, the prompt is stored as ‘memory persistent’, so any future conversations with the chatbot will have the same vulnerability. Because ChatGPT remembers things about its users, like names, ages, locations, likes and dislikes, and previous searches, this exploit presents serious risk for users.

Staying safe

In response, OpenAI had introduced an API which means the exploit is no longer possible through ChatGPT’s web interface, and has also launched a fix to prevent memories from being used as an exfiltration vector. However, researchers say that untrusted third-party content can still inject prompts that could exploit the memory tool.

The good news is, whilst the memory tool is automatically turned on by default in ChatGPT, but can be turned off by the user. The feature is great for those who want a more personalized experience using the chatbot, as it can listen to your wants and needs and make suggestions based on the info - but clearly there are dangers.

To mitigate the risks from this, users should be alert when using the chatbot, and particularly look at the ‘new memory added’ messages. By reviewing the stored memories regularly, users can examine for any potentially planted memories.

This isn't the first security flaw that researchers have discovered in ChatGPT, with concerns over the plugins allowing threat actors to take over users' other accounts and potentially access sensitive data.

More from TechRadar Pro

Ellen Jennings-Trace
Staff Writer

Ellen has been writing for almost four years, with a focus on post-COVID policy whilst studying for BA Politics and International Relations at the University of Cardiff, followed by an MA in Political Communication. Before joining TechRadar Pro as a Junior Writer, she worked for Future Publishing’s MVC content team, working with merchants and retailers to upload content.