Millions of Meta Llama AI platform users could be at risk from leaked Hugging Face API tokens
Data used to train AI models was compromised
Thousands of valid API tokens were left exposed on an open-source repository for AI projects, potentially granting hackers easy access to major business accounts, researchers have revealed.
A report from Lasso Security claims the access could have been used for supply chain attacks. The researchers ran several substring searches on the Hugging Face platform and manually collected the API tokens that were returned.
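The collection step the researchers describe can be sketched with a simple pattern scan. This is a minimal illustration, not Lasso's actual tooling: Hugging Face user access tokens carry an `hf_` prefix, but the exact token length assumed in the regex below is a guess.

```python
import re

# Hugging Face user access tokens start with "hf_"; the length range
# here is an assumption for illustration, not the documented format.
TOKEN_RE = re.compile(r"\bhf_[A-Za-z0-9]{30,40}\b")

def find_candidate_tokens(text: str) -> list[str]:
    """Return substrings that look like Hugging Face API tokens."""
    return TOKEN_RE.findall(text)
```

Running a scanner like this over public code search results is also how defenders audit their own repositories for accidentally committed credentials.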
Then, by calling Hugging Face's whoami API, the researchers were able to learn whether each token was valid, who it belonged to, what the owner's email address was, and what permissions it carried.
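A validation check along those lines might look like the following sketch. The `whoami-v2` endpoint is Hugging Face's own; the exact response fields parsed in `summarize` (notably `auth.accessToken.role`) are an assumption about the response shape, not a documented guarantee.

```python
import json
import urllib.error
import urllib.request

# Hugging Face Hub's whoami endpoint; returns account details for the
# token in the Authorization header.
WHOAMI_URL = "https://huggingface.co/api/whoami-v2"

def whoami(token):
    """Return account details for a token, or None if it is invalid."""
    req = urllib.request.Request(
        WHOAMI_URL, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError:
        return None  # e.g. 401 Unauthorized for a revoked or bogus token

def summarize(info):
    """Pull out the fields the researchers checked: owner, email, permissions.
    The nested "auth" layout is an assumed response shape."""
    role = info.get("auth", {}).get("accessToken", {}).get("role")
    return {
        "user": info.get("name"),
        "email": info.get("email"),
        "write_access": role == "write",
    }
```

A single GET per token is enough to triage a large haul of candidates into valid write tokens, valid read tokens, and dead credentials.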
Data poisoning
In total, the researchers found at least 1,500 API tokens that gave them access to more than 700 business accounts. Most of the tokens (655) had write permissions, allowing a would-be attacker to modify files in the corresponding repositories. Those 655 tokens belonged to 77 organizations, including big names such as Meta.
So, how could hackers exploit these API tokens? The researchers said attackers could use them to steal or poison training data, or even exfiltrate AI models outright. In its writeup, The Register notes that Google's Gmail anti-spam filters depend on reliably trained artificial intelligence models; should that training data be compromised, spam or malicious emails could start making it into people's inboxes.
The publication also claims data poisoning of this sort could sabotage network traffic classification: "If network traffic isn't correctly identified as email, web browsing, etc, then it could lead to misallocated resources and potential network performance issues."
During their analysis, the researchers concluded, they gained access to more than 10,000 private models.
"The ramifications of this breach are far-reaching, as we successfully attained full access, both read and write permissions to Meta Llama 2, BigScience Workshop, and EleutherAI, all of these organizations own models with millions of downloads – an outcome that leaves the organization susceptible to potential exploitation by malicious actors," says Bar Lanyado, security researcher at Lasso Security.
"The gravity of the situation cannot be overstated."
The three organizations in question have all since revoked the exposed tokens.
Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.