Meta showcases the hardware that will power recommendations for Facebook and Instagram: low-cost RISC-V cores and mainstream LPDDR5 memory are at the heart of its MTIA recommendation inference accelerator

MTIA chip
(Image credit: Meta)

Back in 2023, Meta unveiled its first-generation in-house AI inference accelerator, designed to power the ranking and recommendation models that are key components of Facebook and Instagram.

The Meta Training and Inference Accelerator (MTIA) chip, which can handle inference but not training, was updated in April, doubling the compute and memory bandwidth of the first-generation part.

At the Hot Chips symposium last month, Meta gave a presentation on its next-generation MTIA and admitted that using GPUs for recommendation engines is not without challenges. The social media giant noted that peak performance doesn't always translate to effective performance, large deployments can be resource-intensive, and capacity constraints are exacerbated by the growing demand for generative AI.

Mysterious memory expansion

Taking this into account, Meta's development goals for the next generation of MTIA include improving performance per TCO and per watt compared to the previous generation, efficiently handling models across multiple Meta services, and enhancing developer efficiency to quickly achieve high-volume deployments.

Meta's latest MTIA delivers a significant generation-on-generation performance boost. GEMM throughput rises 3.5x to 177 TFLOPS at BF16, hardware-based tensor quantization delivers accuracy comparable to FP32, and optimized support for PyTorch Eager Mode enables job launch times under 1 microsecond and job replacement in less than 0.5 microseconds. Additionally, optimization of table batched embedding (TBE) operations speeds up the download and prefetch of embedding indices, achieving run times 2-3x faster than the previous generation.
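As a rough sanity check on the quoted speedup, the two figures above imply a previous-generation baseline. The short Python sketch below only divides the quoted 177 TFLOPS by the quoted 3.5x; the variable names are illustrative, and the result is a derived estimate rather than a number Meta stated in this presentation.

```python
# Figures quoted in the article.
current_bf16_tflops = 177.0   # GEMM throughput at BF16
gen_on_gen_speedup = 3.5      # quoted improvement over the previous generation

# Implied previous-generation GEMM throughput.
# This is a derived estimate, not a figure quoted in the article.
implied_previous_tflops = current_bf16_tflops / gen_on_gen_speedup

print(f"Implied previous-gen GEMM throughput: {implied_previous_tflops:.1f} TFLOPS")
```

The result lands at roughly 50 TFLOPS, which gives a sense of scale for the earlier part.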

The MTIA chip is built on TSMC's 5nm process and operates at 1.35 GHz with a gate count of 2.35 billion. It offers GEMM performance of 354 TOPS (INT8) and 177 TFLOPS (FP16), and uses 128GB of LPDDR5 memory with a bandwidth of 204.8GB/s, all within a 90-watt TDP.
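To put those headline numbers in context, the following Python sketch derives two back-of-the-envelope ratios from them: peak throughput per watt, and the number of operations the chip would need to perform per byte fetched from LPDDR5 to stay compute-bound (the "ridge point" of a simple roofline model). It uses only the figures quoted above, treats TDP as the power draw, and ignores any on-chip memory, so the results are illustrative estimates, not Meta's own efficiency claims.

```python
# Peak figures quoted in the article.
int8_tops = 354.0        # INT8 GEMM, tera-operations per second
fp16_tflops = 177.0      # FP16 GEMM, tera-operations per second
mem_bw_gb_s = 204.8      # LPDDR5 bandwidth, gigabytes per second
tdp_watts = 90.0         # thermal design power, used here as a stand-in for power draw

# Peak throughput per watt (assumes the chip draws its full TDP at peak).
int8_tops_per_watt = int8_tops / tdp_watts
fp16_tflops_per_watt = fp16_tflops / tdp_watts

# Roofline ridge point: operations per byte of LPDDR5 traffic needed
# before compute, rather than memory bandwidth, becomes the limit.
# Ignores on-chip memory, which would lower the effective requirement.
ridge_int8_ops_per_byte = (int8_tops * 1e12) / (mem_bw_gb_s * 1e9)
ridge_fp16_ops_per_byte = (fp16_tflops * 1e12) / (mem_bw_gb_s * 1e9)

print(f"INT8: {int8_tops_per_watt:.2f} TOPS/W, ridge point {ridge_int8_ops_per_byte:.0f} ops/byte")
print(f"FP16: {fp16_tflops_per_watt:.2f} TFLOPS/W, ridge point {ridge_fp16_ops_per_byte:.0f} ops/byte")
```

The high ops-per-byte ratios suggest why data reuse matters for a design that pairs this much compute with mainstream LPDDR5 rather than HBM.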

The Processing Elements are built on RISC-V cores, featuring both scalar and vector extensions, and Meta's accelerator module includes dual CPUs. At Hot Chips 2024, ServeTheHome noticed a Memory Expansion linked to the PCIe switch and the CPUs. When asked if this was CXL, Meta rather coyly said, “it is an option to add memory in the chassis, but it is not being deployed currently.”

Wayne Williams
Editor

Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.
