Ahead of GPT-5 launch, another test shows that people cannot distinguish ChatGPT from a human in a conversation test — is it a watershed moment for AI?

Turing Test chat
(Image credit: UC San Diego)

Popular AI tools such as GPT-4 generate fluent, human-like text and perform so well on various language tasks it’s becoming increasingly difficult to tell if the person you’re conversing with is human or a machine. 

This scenario mirrors Alan Turing's famous thought experiment, where he proposed a test to evaluate if a machine could exhibit human-like behavior to the extent that a human judge could no longer reliably distinguish between man and machine based solely on their responses.

The Department of Cognitive Science at UC San Diego decided to see how modern AI systems fared and evaluated ELIZA (a simple rules-based chatbot from the 1960’s included as a baseline in the experiment), GPT-3.5, and GPT-4 in a controlled Turing Test. Participants had a five-minute conversation with either a human or an AI and then had to decide whether their conversation partner was human.

No better than chance

GPT-4 and  GPT-3.5 were told how to respond to messages. Researchers instructed the model to “adopt the persona of a young person who was very concise, didn’t take the game very seriously, used slang, and made occasional spelling errors”. In addition, the prompt contained “general information about the setup of the game, and information about recent news stories that had occurred after the model’s training cutoff”. The two models were accessed through the OpenAI API with temperature set to 1 and to prevent the AI from responding too quickly, their messages were sent with a delay based on the number of characters in the message.

In the study, 500 participants were assigned to one of five groups. They engaged in a conversation with either a human or one of the three AI systems. The game interface resembled a typical messaging app. After five minutes, participants judged whether they believed their conversation partner was human or AI and provided reasons for their decisions.

The results were interesting. GPT-4 was identified as human 54% of the time, ahead of GPT-3.5 (50%), with both significantly outperforming ELIZA (22%) but lagging behind actual humans (67%). Participants were no better than chance at identifying GPT-4 as AI, indicating that current AI systems can deceive people into believing they are human.

Analysis of the results showed that interrogators often relied on linguistic style, socio-emotional factors, and knowledge-based questions to decide if they were talking to a human or a machine.

Details of the test and the full results can be seen in the paper published on the arXiv preprint server.

More from TechRadar Pro

Wayne Williams
Editor

Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.

Read more
An iPhone showing the ChatGPT logo on its screen
ChatGPT-4.5 is here for Pro users now and Plus users next week, and I can't wait to try it
GPT 4.5
ChatGPT 4.5 understands subtext, but it doesn't feel like an enormous leap from ChatGPT-4o
ChatGPT/DeepSeek
I discovered a surprising difference between DeepSeek and ChatGPT search capabilities
EDMONTON, CANADA - FEBRUARY 10: A woman uses a cell phone displaying the Open AI logo, with the same logo visible on a computer screen in the background, on February 10, 2025, in Edmonton, Canada
I asked ChatGPT to work through some of the biggest philosophical debates of all time – here’s what happened
ChatGPT vs Gemini comparison
I compared GPT-4.5 to Gemini 2.0 Flash and the results surprised me
Man with headache
I asked ChatGPT to invent 6 philosophical thought experiments – and now my brain hurts
Latest in Pro
Epson EcoTank ET-4850 next to a TechRadar badge that reads Big Savings
I searched for the best printer deal you won't find in the Amazon Spring Sale
Microsoft Copiot Studio deep reasoning and agent flows
Microsoft reveals OpenAI-powered Copilot AI agents to bosot your work research and data analysis
Group of people meeting
Inflexible work policies are pushing tech workers to quit
Data leak
Top home hardware firm data leak could see millions of customers affected
Representational image depecting cybersecurity protection
Third-party security issues could be the biggest threat facing your business
An image of network security icons for a network encircling a digital blue earth.
Why multi-CDNs are going to shake up 2025
Latest in News
An image of Pro-Ject's Flatten it closed and opened
Pro-Ject’s new vinyl flattener will fix any warped LPs you inadvertently buy on Record Store Day
EA Sports F1 25 promotional image featuring drivers Oscar Piastri, Carlos Sainz and Oliver Bearman.
F1 25 has been officially announced, with this year's entry marking a return for Braking Point and a 'significant overhaul' for My Team mode
Garmin clippd integration
Garmin's golf watches just got a big software integration upgrade to help you improve your game
Robert Downey Jr reveals himself as Doctor Doom to a delighted crowd at San Diego Comic-Con 2024
Marvel is currently revealing the full cast for Avengers: Doomsday, and I think it's going to be a long-winded announcement
Samsung QN90F on yellow background
Samsung announces US prices for its 2025 mini-LED TV lineup, and it’s good and bad news
Nintendo Switch Lite
Forget the Nintendo Switch 2, the original Switch is getting one last hurrah in a surprise Nintendo Direct tomorrow