DeepSeek is the new AI chatbot that has the world talking – I pitted it against ChatGPT to see which is best
Is there a new sheriff in town?
DeepSeek is the new AI chatbot on everybody’s lips and is currently sitting at the top of Apple’s App Store in the US and the UK. A completely free AI model built by a Chinese start-up, DeepSeek wants to make AI even more accessible to the masses by offering a competitor to OpenAI’s ChatGPT o1 reasoning model without a fee.
New AI apps appear on the App Store almost daily, and there’s often hype around a new model launch as people look for the next alternative to ChatGPT. Whether you’re an avid user of OpenAI’s software or you prefer to use Google Gemini, there’s an AI tool for everyone, and DeepSeek wants to be the next icon on your home screen.
After seeing DeepSeek all over my newsfeed, I knew I had to give the brand-new AI a go and see if it was as good as people online made it out to be. I pitted DeepSeek V3 and DeepThink R1 against ChatGPT 4o and o1 to see just how good the new king of the App Store really is.
AI life hacks
In this test, I wanted to get a full feel for everything DeepSeek offers compared to ChatGPT, so I thought it was only fair to use the AI chatbot the same way I would use AI in my daily life.
Recently, I’ve been wanting to get help from AI to create a daily schedule that fits my needs as a person who works from home and needs to look after a dog. Up until recently, my partner also worked from home, so it was much easier to split tasks, but she is now working from an office, and I need to find a way to juggle life, work, and my French Bulldog Kermit.
I asked ChatGPT 4o and DeepSeek V3 to create a daily schedule with some information on when I wake up, my dog’s potty routine, and a brief breakdown of my workflow. Both created excellent schedules that I could genuinely see myself using daily. However, ChatGPT’s memory feature made OpenAI’s schedule even more coherent.
I had previously told ChatGPT that I like to review AI news and trends at 9 am, and 4o implemented that information from a previous chat into my morning routine. DeepSeek, on the other hand, can only remember information from the same chat and couldn’t bring back information from previous chats to help with its answer.
(ELI5) Explain like I’m 5
Next, I wanted to ask both AI chatbots about the NFL Playoffs, considering we now know the two teams that will face each other at Super Bowl LIX. I asked DeepSeek and ChatGPT to give me a 200-word rundown of the NFL playoffs and how they work. Both provided excellent information that gave me a full understanding of how the seeding system works and the journey a team needs to take to make it to the Super Bowl.
ChatGPT opted for a 200-word paragraph, while DeepSeek broke information down into bullet points. I did notice that ChatGPT gave me more context on how teams become a Wild Card, but the difference between the results is fairly minimal and you’ll like one better than the other purely based on personal preference.
Problem solver
Now that we’ve covered some simple AI prompts, it’s time to get down to the nitty gritty and try out DeepThink R1, the AI model that has everyone talking. People online are saying DeepSeek’s free reasoning model is as good as ChatGPT’s o1, which is free in small doses but requires a paid subscription to access regularly.
To test the AI chatbots’ reasoning capabilities, I looked for some of the hardest problems I could find, and I’m shocked by some of the results:
Question 1: Find the missing word: Apple, Red, Coal
This isn’t a particularly hard question, especially considering the source material was multiple-choice with different color options. I opted to avoid giving R1 and o1 multiple-choice answers and instead just wrote the question and hit enter.
ChatGPT o1 took 1 minute and 29 seconds to determine the answer, and it found links between the words and the fairytale Snow White. The model decided to answer based on this quote: “her lips were red as blood, her hair was black as coal, and her skin was white as snow.” Based on this quote, o1 chose Snow as the missing word. While its thought process was clever, it wasn’t the answer I was looking for.
DeepThink R1, on the other hand, took 1 minute and 14 seconds to answer, and it managed to guess the right word: Black. Apple is red; coal is black. Impressive, to say the least.
Question 2: 1. Finish the sequence: 1, 2, 4, 8, ? 2. Finish the sequence: house, Saturn, dog, burger, ?
These two sequences are completely unrelated, but I thought it would be interesting to ask back-to-back questions to see what happens. While the first sequence is very easy (each number doubles, so the next one is 16), the second is impossible (they are just four random words). Would ChatGPT o1 or DeepThink R1 be able to notice the trap?
Well, no. Both reasoning models attempted to find an answer, and each gave me a completely different one. DeepThink R1 answered “yellow” because it thought the words were related to their color (white house, yellow Saturn, brown dog, yellow burger). ChatGPT o1, on the other hand, answered “car”: it found the sequence almost impossible but decided to offer an answer based on “a common puzzle approach.” The approach it chose was to link each item to the bigger category it belongs to (house = building, Saturn = planet, dog = animal, burger = food, and car = vehicle).
Ultimately, both reasoning models were wrong, and neither responded by saying there were too many variables to give an accurate answer.
Question 3: Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number.
For the final question, I decided to ask ChatGPT o1 and DeepThink R1 a question from Humanity’s Last Exam, the hardest AI benchmark out there. To a mere mortal like myself with no knowledge of hummingbird anatomy, this question is genuinely impossible; these reasoning models, however, seem to be up for the challenge.
ChatGPT o1 answered four, while DeepThink R1 answered two. Unfortunately, the correct answer isn’t available online, which prevents AI chatbots from scraping the internet to find the correct response. That said, from some research, I believe DeepThink might be right here, while o1 is just off the mark.
DeepSeek vs ChatGPT?
So, I’ve run multiple prompts and used both chatbots for an extensive amount of time, but which is the better option? Based on the answers I received, DeepThink R1 is an excellent free reasoning model that makes you question whether it’s worth paying to access o1 regularly. DeepSeek is only available on the web, iOS App Store, and Play Store, so if you want to use a standalone Mac app or iPad app, you’ll need to wait for the company to release one.
According to Humanity’s Last Exam, DeepThink R1 outperforms ChatGPT o1 with a 9.4% accuracy rate compared to OpenAI’s 9.1%; it’s a marginal difference, but considering one is completely free, it may sway you towards using the new kid on the block.
Personally, I’ll be sticking with ChatGPT because I don’t have enormous requirements for reasoning models, and I rely heavily on the memories feature, which allows the AI chatbot to reference previous conversations. I also like the fact that ChatGPT has a standalone Mac and iPad app, as well as the ability to generate images with one of the best AI image generators, DALL-E.
DeepSeek is purely text-based and lacks multi-modal capabilities, but considering how new it is, this is an incredibly promising start to a genuine challenger for OpenAI’s AI crown.
John-Anthony Disotto is TechRadar's Senior Writer, AI, bringing you the latest news on, and comprehensive coverage of, tech's biggest buzzword. An expert on all things Apple, he was previously iMore's How To Editor, and has a monthly column in MacFormat. He's based in Edinburgh, Scotland, where he worked for Apple as a technician focused on iOS and iPhone repairs at the Genius Bar. John-Anthony has used the Apple ecosystem for over a decade, and is an award-winning journalist with years of experience in editorial.