AMD lands yet another major cloud deal as Oracle adopts thousands of Instinct MI300X GPUs to power new AI supercluster
Oracle Cloud Infrastructure will add the AI accelerator to its selection of bare metal instances
AMD’s Instinct MI300X is an incredibly powerful AI accelerator, and major cloud companies are beginning to integrate it into their infrastructure to support intensive AI workloads.
Vultr recently announced that it had ordered “thousands” of MI300X units, and now Oracle Cloud Infrastructure (OCI) says it has adopted AMD’s hardware for its new OCI Compute Supercluster instance, BM.GPU.MI300X.8.
The new supercluster is designed for massive AI models containing billions of parameters and supports up to 16,384 GPUs in a single cluster. It uses the same high-speed network fabric as OCI’s other accelerator instances, giving large-scale AI training and inference the memory capacity and throughput that the most demanding tasks require. That makes the configuration particularly well suited to LLMs and complex deep learning operations.
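For a sense of scale, here is a rough back-of-envelope sketch in Python (not an official AMD or Oracle sizing tool) of why the MI300X’s 192GB of HBM3 per GPU matters for models of this size; it assumes FP16 weights at two bytes per parameter and ignores KV cache, activations and optimizer state, which add substantially more in practice.

```python
import math

# Rough sizing sketch only -- not an official AMD or Oracle tool.
HBM_PER_MI300X_GB = 192   # per-GPU HBM3; 8 x 192GB is the 1.5TB quoted for the instance below
BYTES_PER_PARAM = 2       # FP16 weights

def min_gpus_for_weights(params_billions: float) -> int:
    """Smallest number of MI300X GPUs whose combined HBM holds the FP16 weights alone."""
    weights_gb = params_billions * BYTES_PER_PARAM   # 1e9 params x 2 bytes ~= 2GB per billion
    return max(1, math.ceil(weights_gb / HBM_PER_MI300X_GB))

for size_b in (70, 180, 405):
    print(f"{size_b}B parameters ~ {size_b * BYTES_PER_PARAM}GB of weights "
          f"-> at least {min_gpus_for_weights(size_b)} MI300X GPU(s)")
```

Even the weights of a 70B-parameter model fit comfortably on a single GPU’s memory, which is why clusters of this size are aimed at far larger models and at training rather than inference alone.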
Preproduction testing
“AMD Instinct MI300X and ROCm open software continue to gain momentum as trusted solutions for powering the most critical OCI AI workloads,” said Andrew Dieckmann, corporate vice president and general manager, Data Center GPU Business, AMD. “As these solutions expand further into growing AI-intensive markets, the combination will benefit OCI customers with high performance, efficiency, and greater system design flexibility.”
Oracle says preproduction testing of the MI300X validated the GPU’s performance in real-world scenarios. Running the Llama 2 70B model, the MI300X achieved a 65 millisecond time-to-first-token latency and scaled efficiently, generating 3,643 tokens across 256 concurrent user requests. In another test, with 2,048 input tokens and 128 output tokens, it delivered an end-to-end latency of 1.6 seconds, closely matching AMD’s own benchmarks.
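Those two figures, time to first token and end-to-end latency, are straightforward to reproduce against any streaming inference endpoint. The sketch below is purely illustrative and assumes a hypothetical stream_tokens() client; it is not the harness Oracle or AMD used.

```python
import time

def measure_latency(stream_tokens, prompt):
    """Return (time to first token, end-to-end latency, token count) for one request.

    `stream_tokens` is a hypothetical client callable that yields generated tokens
    one at a time -- substitute whatever streaming API your serving stack exposes.
    """
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in stream_tokens(prompt):
        if ttft is None:
            ttft = time.perf_counter() - start   # latency until the first token arrives
        count += 1
    total = time.perf_counter() - start
    return ttft, total, count

# Usage (with your own client):
#   ttft, total, n = measure_latency(client.stream, "Summarise HBM3 in one sentence.")
#   print(f"TTFT {ttft * 1000:.0f}ms, end-to-end {total:.2f}s, {n} tokens")
```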
The OCI BM.GPU.MI300X.8 instance features eight AMD Instinct MI300X accelerators, delivering 1.5TB of HBM3 GPU memory with 5.3TB/s of bandwidth, paired with 2TB of system memory and eight 3.84TB NVMe drives. Oracle will offer the bare metal instance at $6 per GPU per hour.
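At that rate the per-node arithmetic is simple; a quick sketch using only the figures quoted above:

```python
# Back-of-envelope cost, using only the figures quoted above.
PRICE_PER_GPU_HOUR = 6.00   # USD, Oracle's quoted rate
GPUS_PER_INSTANCE = 8

node_hour = PRICE_PER_GPU_HOUR * GPUS_PER_INSTANCE
print(f"One BM.GPU.MI300X.8 node: ${node_hour:.2f}/hour, "
      f"about ${node_hour * 24 * 30:,.0f} for a 30-day month")
# -> $48.00/hour, roughly $34,560 per 30-day month per node, before storage or networking
```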
“The inference capabilities of AMD Instinct MI300X accelerators add to OCI’s extensive selection of high-performance bare metal instances to remove the overhead of virtualized compute commonly used for AI infrastructure,” said Donald Lu, senior vice president of software development at Oracle Cloud Infrastructure. “We are excited to offer more choice for customers seeking to accelerate AI workloads at a competitive price point.”
Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.