Amazon unveils the largest text-to-speech model ever made


Researchers at Amazon have introduced the largest text-to-speech model to date, which is reported to articulate complex sentences more naturally than earlier models.

The model, BASE TTS (text-to-speech), which stands for Big Adaptive Streamable TTS with Emergent abilities, could lay the foundation for more human-like interactions.

The research suggests that extensive training could improve the reliability and versatility of TTS models in the same way it has for the large language models (LLMs) used in artificial intelligence.

Amazon’s BASE TTS impresses researchers

The text-to-speech model has been trained on 100,000 hours of public-domain speech data, which gives the tool a “state-of-the-art naturalness.” The data is predominantly English, with some German, Dutch and Spanish also included.

Moreover, the researchers found that even training a TTS model on 10,000 hours of speech can result in an improved ability to articulate complex sentences more naturally.

At 980 million parameters, BASE-large has been recognized as the largest text-to-speech model ever made. To compare results, the team also trained smaller models with 400 million and 150 million parameters, on 10,000 and 1,000 hours of speech respectively.

Amazon’s team describes BASE TTS as a “high-fidelity model capable of mimicking speaker characteristics with just a few seconds of reference audio,” recognizing the need for more research but acknowledging its potential.

Some of the key areas the researchers focused on were compound nouns, emotions, foreign words, paralinguistics, punctuation, questions, and syntactic complexity. Examples can be found on a dedicated web page.

With revolutionary artificial intelligence headlining most of 2023, text-to-speech breakthroughs like this in 2024 could continue to bring once-futuristic technologies to the masses. However, the research team’s cautious approach highlights a need for proper regulation amid security and privacy fears.

Craig Hale

With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!
