'iPhone of AI': startup first to deliver trillion-plus parameter AI model that works in symbiosis with its very own chip — SambaNova promises 90% savings on inference costs, but take that with a pinch of salt

SambaNova Samba-1
(Image credit: SambaNova)

Although everyone wants in, the deployment of generative AI at scale has proved a significant challenge for large enterprises and government bodies. 

Although these organizations recognize the technology's potential to streamline processes, reduce costs, and improve supply chains, concerns about cost, complexity, security, data privacy, model ownership, and regulatory compliance have acted as barriers to adoption.

In a potential breakthrough, SoftBank-funded SambaNova Systems has announced the launch of Samba-1, which it bills as the first trillion-parameter generative AI model for the enterprise. Powered by the SambaNova Suite, Samba-1 is designed to meet enterprise requirements for performance, accuracy, scalability, and total cost of ownership (TCO). SambaNova also promises a 90% reduction in inference costs, although this claim should be approached with caution.

Building the 'iPhone of AI'

Unlike other trillion-parameter models, which are built as single, monolithic entities, Samba-1 uses a Composition of Experts (CoE) architecture. This system aggregates multiple smaller "expert" models so that they function together as a single large model. According to SambaNova, this approach offers broader knowledge across various topics, high accuracy, and multimodality.

The CoE model can also reportedly provide greater knowledge and accuracy for specialized domains than other large models. Individual smaller models can be trained for specific domains, such as finance, law, physics, or biology, and added to the CoE, bringing high accuracy for that specific domain without the need to retrain the entire trillion-parameter model.
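To make the idea concrete, here is a minimal Python sketch of how a composition of experts might dispatch a prompt to a domain model. It is an illustration only: the article does not describe how Samba-1 routes requests, so the keyword-overlap router, the `Expert` and `CompositionOfExperts` classes, and the stand-in "models" (plain functions that echo the prompt) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Expert:
    """A stand-in for one small domain model (hypothetical)."""
    name: str
    keywords: FrozenSet[str]
    answer: Callable[[str], str]


def make_expert(name: str, keywords: Iterable[str]) -> Expert:
    # The "model" here just tags the prompt; a real expert would run inference.
    return Expert(name, frozenset(keywords),
                  lambda prompt: f"[{name}] response to: {prompt}")


class CompositionOfExperts:
    """Toy router: picks the expert whose keywords best match the prompt.

    Keyword overlap is an assumption made for this sketch, not
    SambaNova's actual routing method.
    """

    def __init__(self, fallback: Expert) -> None:
        self.fallback = fallback
        self.experts: Dict[str, Expert] = {}

    def add_expert(self, expert: Expert) -> None:
        # New domains are added without touching the existing experts.
        self.experts[expert.name] = expert

    def route(self, prompt: str) -> Expert:
        words = set(prompt.lower().split())
        best, best_score = self.fallback, 0
        for expert in self.experts.values():
            score = len(words & expert.keywords)
            if score > best_score:
                best, best_score = expert, score
        return best

    def generate(self, prompt: str) -> str:
        return self.route(prompt).answer(prompt)


coe = CompositionOfExperts(fallback=make_expert("general", []))
coe.add_expert(make_expert("finance", ["bond", "equity", "interest"]))
coe.add_expert(make_expert("law", ["contract", "liability"]))
print(coe.generate("what is a bond yield"))
```

In this sketch only one small expert handles each prompt, and a new domain is added with a single `add_expert` call while the other experts stay unchanged, which mirrors the article's point about extending the model without retraining it as a whole.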

The release of Samba-1 follows SambaNova's announcement of the SN40L, a smart AI chip designed to rival those from AI behemoth Nvidia. The integration of this chip with the Samba-1 model represents a significant step forward, and SambaNova claims to be the first to deliver an integrated hardware and software system for the enterprise.

“The entire AI industry is talking about building the iPhone of AI - an integrated hardware and software system - and SambaNova is the first to deliver a version of that to the enterprise,” said Rodrigo Liang, Co-founder and CEO of SambaNova Systems. “This past fall, we announced the SN40L, the smartest AI chip, and now we’ve integrated that chip with the first 1T parameter model for the enterprise. Samba-1 rivals GPT-4, however, it’s better suited for the enterprise as it can be delivered on-premises or in private clouds so that customers can fine-tune the model with their private data without ever disclosing it into the public domain.” 

Despite the impressive capabilities of Samba-1, SambaNova's claim of a 90% reduction in inference costs should be taken with a pinch of salt. While the CoE architecture should keep inference costs low, the true size of the saving will only become apparent once the model is deployed in real-world scenarios.

Liang told us: “AI is not a fad, we’re at the start of this journey. Our full-stack solution is focused on large-scale enterprise and government organizations, which no one else can provide on-prem and privately. There’s no escaping how dominant Nvidia is right now, but we’re able to deploy these models at scale for a fraction of the cost.”


Wayne Williams
Editor

Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.
