Microsoft-backed AI startup beats Nvidia H100 on key tests with GPU-like card equipped with 256GB RAM

D-Matrix's Corsair C8 card
(Image credit: d-Matrix)

D-Matrix’s compute platform, the Corsair C8, can lay claim to having outperformed Nvidia’s industry-leading H100 GPU, at least according to some staggering test results the startup has published.

Designed specifically for generative AI workloads, the Corsair C8 differs from GPUs in that it uses d-Matrix’s digital in-memory compute (DIMC) architecture.

The result? According to d-Matrix, a ninefold increase in throughput versus the Nvidia H100, and a 27-fold increase versus the A100.

Corsair C8 power

The startup is one of the most closely watched in Silicon Valley, having raised $110 million in its latest funding round from investors including Microsoft. This builds on a $44 million investment round in April 2022 from backers including Microsoft, SK Hynix, and others.

Its flagship Corsair C8 card includes 2,048 DIMC cores with 130 billion transistors and 256GB of LPDDR5 RAM. It boasts 2,400 to 9,600 TFLOPS of computing performance, and has a chip-to-chip bandwidth of 1TB/s.

These cards can deliver up to 20 times higher throughput for generative inference on large language models (LLMs), up to 20 times lower inference latency for LLMs, and cost savings of up to 30 times when compared with traditional GPUs.

With generative AI rapidly expanding, the industry is locked in a race to build increasingly powerful hardware to run future generations of the technology.

The leading components are GPUs and, more specifically, Nvidia’s A100 and newer H100 units. But GPUs aren’t optimized for LLM inference, according to d-Matrix, and too many GPUs are needed to handle AI workloads, leading to excessive energy consumption. 

This is because the bandwidth demands of running AI inference lead to GPUs spending a lot of time idle, waiting for data to come in from DRAM. Moving data out of DRAM also means higher energy consumption alongside reduced throughput and added latency. The extra energy consumption, in turn, raises cooling demands.
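To see why waiting on DRAM can dominate, a rough back-of-the-envelope sketch helps. The Python below compares the time needed to stream a model’s weights from memory with the time needed to do the arithmetic for one generated token. The function names and every number in it are illustrative assumptions chosen for round arithmetic; they are not specifications of the H100, the A100, or the Corsair C8, and this is not d-Matrix’s own methodology.

```python
def decode_time_per_token(weight_bytes, mem_bandwidth_bytes_per_s,
                          flops_per_token, peak_flops_per_s):
    """Estimate the lower-bound time to generate one token.

    Assumes every weight is read from memory once per token and that
    memory traffic and arithmetic overlap perfectly (a simplification).
    """
    memory_time = weight_bytes / mem_bandwidth_bytes_per_s
    compute_time = flops_per_token / peak_flops_per_s
    bound = "memory" if memory_time > compute_time else "compute"
    return memory_time, compute_time, bound


def compute_utilization(memory_time, compute_time):
    """Fraction of the time the arithmetic units are actually busy."""
    return compute_time / max(memory_time, compute_time)


# Hypothetical accelerator and model, NOT real product figures:
# a 7-billion-parameter model stored at 2 bytes per weight,
# roughly 2 floating-point operations per weight per token.
weight_bytes = 7e9 * 2
flops_per_token = 2 * 7e9
mem_bandwidth = 2e12   # assumed 2 TB/s of memory bandwidth
peak_flops = 1e15      # assumed 1,000 TFLOPS of peak compute

memory_time, compute_time, bound = decode_time_per_token(
    weight_bytes, mem_bandwidth, flops_per_token, peak_flops
)
utilization = compute_utilization(memory_time, compute_time)

print(f"memory time per token:  {memory_time * 1e3:.3f} ms")
print(f"compute time per token: {compute_time * 1e3:.3f} ms")
print(f"bound by: {bound}, compute utilization: {utilization:.2%}")
```

Under these assumed figures, fetching the weights takes far longer than the arithmetic, so the compute units sit idle most of the time. Batching many requests together changes the balance, which this simple sketch does not model.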

The solution, the firm says, is its specialized DIMC architecture, which mitigates many of the issues found in GPUs. D-Matrix claims its approach can reduce costs by 10 to 20 times – and in some cases by as much as 60 times.

Beyond d-Matrix’s technology, other players are beginning to emerge in the race to outpace Nvidia’s H100. In August, IBM presented a new analog AI chip that mimics the human brain and can run up to 14 times more energy-efficiently.

Keumars Afifi-Sabet
Channel Editor (Technology), Live Science

Keumars Afifi-Sabet is the Technology Editor for Live Science. He has written for a variety of publications including ITPro, The Week Digital and ComputerActive. He has worked as a technology journalist for more than five years, having previously held the role of features editor with ITPro. In his previous role, he oversaw the commissioning and publishing of long-form content in areas including AI, cyber security, cloud computing and digital transformation.