Nvidia rival claims DeepSeek world record as it delivers industry-first performance with 95% fewer chips

SambaNova runs DeepSeek
(Image credit: SambaNova)

  • SambaNova runs DeepSeek-R1 at 198 tokens/sec using 16 custom chips
  • The SN40L RDU chip is reportedly 3X faster, 5X more efficient than GPUs
  • 5X speed boost is promised soon, with 100X capacity by year-end on cloud

Chinese AI upstart DeepSeek has quickly made a name for itself in 2025 with R1, its open source large language model built for advanced reasoning tasks, which performs on par with the industry’s top models while being more cost-efficient.

SambaNova Systems, an AI startup founded in 2017 by experts from Sun/Oracle and Stanford University, has now announced what it claims is the world’s fastest deployment of the DeepSeek-R1 671B LLM to date.

The company says it has achieved 198 tokens per second per user using just 16 custom-built chips, replacing the 40 racks, holding 320 Nvidia GPUs in total, that would typically be required.
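
The headline's "95% fewer chips" figure follows directly from the two chip counts the company quotes. Here is a minimal Python sketch of that arithmetic; the function name `chip_reduction_percent` is purely illustrative, and only the numbers 16 and 320 come from SambaNova's claim:

```python
def chip_reduction_percent(baseline_chips: int, new_chips: int) -> float:
    """Return the percentage reduction in chip count relative to the baseline."""
    if baseline_chips <= 0:
        raise ValueError("baseline_chips must be positive")
    return (baseline_chips - new_chips) / baseline_chips * 100


# Figures as claimed by SambaNova: 320 GPUs replaced by 16 SN40L RDU chips.
reduction = chip_reduction_percent(320, 16)
print(f"{reduction:.0f}% fewer chips")  # prints "95% fewer chips"
```

The calculation compares chip counts only; it says nothing about the relative cost, power draw, or size of an individual chip.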

Independently verified

“Powered by the SN40L RDU chip, SambaNova is the fastest platform running DeepSeek,” said Rodrigo Liang, CEO and co-founder of SambaNova. “This will increase to 5X faster than the latest GPU speed on a single rack - and by year-end, we will offer 100X capacity for DeepSeek-R1.”

While Nvidia’s GPUs have traditionally powered large AI workloads, SambaNova argues that its reconfigurable dataflow architecture offers a more efficient solution. The company claims its hardware delivers three times the speed and five times the efficiency of leading GPUs while maintaining the full reasoning power of DeepSeek-R1.

“DeepSeek-R1 is one of the most advanced frontier AI models available, but its full potential has been limited by the inefficiency of GPUs,” said Liang. “That changes today. We’re bringing the next major breakthrough - collapsing inference costs and reducing hardware requirements from 40 racks to just one - to offer DeepSeek-R1 at the fastest speeds, efficiently.”

George Cameron, co-founder of AI evaluation firm Artificial Analysis, said his company had “independently benchmarked SambaNova’s cloud deployment of the full 671 billion parameter DeepSeek-R1 Mixture of Experts model at over 195 output tokens/s, the fastest output speed we have ever measured for DeepSeek-R1. High output speeds are particularly important for reasoning models, as these models use reasoning output tokens to improve the quality of their responses. SambaNova’s high output speeds will support the use of reasoning models in latency-sensitive use cases.”
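
To see why output speed matters for reasoning models, consider how long a user waits while the model emits its reasoning tokens before the final answer arrives. The following is a rough sketch only: the 2,000-token reasoning budget and the 50 tokens/s comparison speed are hypothetical values chosen for illustration, not figures from SambaNova or Artificial Analysis. Only the 195 tokens/s rate comes from the benchmark quoted above, and the estimate ignores time to first token and network overhead.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate the time to stream a given number of output tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical reasoning budget, chosen only for illustration.
REASONING_TOKENS = 2_000

scenarios = (
    ("benchmarked 195 tokens/s", 195.0),   # rate reported by Artificial Analysis
    ("hypothetical 50 tokens/s", 50.0),    # assumed slower deployment
)

for label, speed in scenarios:
    wait = generation_seconds(REASONING_TOKENS, speed)
    print(f"{label}: about {wait:.1f} s of reasoning output")
```

Under these assumptions the wait scales inversely with output speed, which is the sense in which faster generation helps latency-sensitive uses of reasoning models.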

DeepSeek-R1 671B is now available on SambaNova Cloud, with API access offered to select users. The company is scaling capacity rapidly, and says it hopes to reach 20,000 tokens per second of total rack throughput “in the near future”.

DeepSeek R1 on SambaNova

(Image credit: Artificial Analysis)

Wayne Williams
Editor

Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.
