AMD releases details of 288GB MI355X accelerator: 80% faster than MI325X, 8TB/s memory bandwidth

AMD Instinct MI355X Accelerator
(Image credit: AMD)

We already knew a lot about AMD’s next-generation accelerator, the Instinct MI325X, from an earlier event in June 2024 - but the company has now revealed more at its Advancing AI event.

First, we knew the Instinct MI325X would be a minor upgrade over the MI300X, keeping the same CDNA 3 architecture but adding just enough oomph to make it a viable alternative to Nvidia’s H200 AI powerhouse.

Eagle-eyed readers will also notice that AMD has cut the onboard HBM3e memory capacity from 288GB to 256GB, leaving it only around 80% ahead of Nvidia’s flagship rather than the more enviable 2x advantage.

Preparing the ground for the MI355X

To make things a bit murkier, AMD also mentioned another SKU, the MI325X OAM, which will have, wait for it, 288GB of memory - we have asked for clarification and will update this article in due course.

AMD provided some carefully selected performance comparisons against Nvidia’s H200:

  • 1.3X the inference performance on Mistral 7B at FP16
  • 1.2X the inference performance on Llama 3.1 70B at FP8
  • 1.4X the inference performance on Mixtral 8x7B at FP16

The company also revealed the accelerator has 153 billion transistors, the same as the MI300X. The H200 has only 80 billion transistors, while Blackwell GPUs will tip the scales at more than 200 billion.

The star of the show, though, had to be the MI355X accelerator, which was also announced at the event with an H2 2025 launch date. Manufactured on TSMC’s 3nm node and featuring AMD’s new CDNA 4 architecture, it introduces FP6 and FP4 formats and is expected to deliver an 80% improvement in FP16 and FP8 performance compared with the current MI325X accelerator.

Elsewhere, the Instinct MI355X will offer 288GB of HBM3E and 8TB/s of memory bandwidth, improvements of 12.5% and 33.3% respectively over its immediate predecessor. An 8-unit OAM platform, which will also launch in H2 2025, will offer a staggering 18.5 petaflops in FP16, 37PF in FP8, and 74PF in FP6 and FP4 (or 9.3PF per OAM).
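For readers who want to see where those percentages come from, here is a minimal arithmetic sketch. It assumes the 256GB capacity mentioned above and the 6TB/s MI325X bandwidth implied by the quoted 33.3% gain; the variable names are illustrative, not official AMD figures.

```python
# Sanity-check the stated gen-on-gen figures.
# Assumed MI325X baseline: 256GB HBM3E, 6TB/s bandwidth (implied by AMD's percentages).
mi325x = {"memory_gb": 256, "bandwidth_tbs": 6.0}
mi355x = {"memory_gb": 288, "bandwidth_tbs": 8.0}

# Percentage improvements of the MI355X over the MI325X
mem_gain = (mi355x["memory_gb"] / mi325x["memory_gb"] - 1) * 100        # 12.5%
bw_gain = (mi355x["bandwidth_tbs"] / mi325x["bandwidth_tbs"] - 1) * 100  # ~33.3%

# Platform-level FP6/FP4 throughput divided across the 8 modules
platform_fp4_pf = 74
per_module_pf = platform_fp4_pf / 8  # ~9.3 PF per accelerator

print(f"Memory: +{mem_gain:.1f}%, bandwidth: +{bw_gain:.1f}%, "
      f"per-module FP6/FP4: {per_module_pf:.2f} PF")
```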

The MI355X will compete against Nvidia’s Blackwell B100 and B200 when it launches in 2025, and will be instrumental in Lisa Su’s push to close the gap with AMD’s rival.

Nvidia remains firmly in the driving seat with more than 90% of the world’s AI accelerator market, and at the time of writing it is the world’s most valuable company, with its share price at an all-time high and a market capitalization of $3.3 trillion.

AMD also unveiled its new family of EPYC 9005 series CPUs, with a 192-core model that costs nearly $15,000.


Désiré Athow
Managing Editor, TechRadar Pro

Désiré has been musing and writing about technology during a career spanning four decades. He dabbled in website builders and web hosting when DHTML and frames were in vogue and started writing about the impact of technology on society just before the start of the Y2K hysteria at the turn of the last millennium.
