Meta admits it scraped all Australian Facebook posts since 2007 to train its AI

In this photo illustration, the Meta Platforms, Inc. logo is displayed on a smartphone screen.
(Image credit: Photo Illustration by Rafael Henrique/SOPA Images/LightRocket via Getty Images)

Meta has admitted it used public Facebook and Instagram posts from Australian users to train its artificial intelligence models, and that it has scraped information from as far back as 2007.

An Australian parliamentary committee has heard that whilst European users can opt out thanks to the GDPR, Australian users are not given that choice.

Meta has denied using the information of anyone under 18, but did confirm it had used over a decade’s worth of data. The firm could not say whether it had scraped the photos of children who are now adults (i.e. those who created their accounts as children, but have since turned 18).

A turning tide

The process of ‘scraping’ is essential for the development of AI. It is basically data harvesting: information is extracted from websites and fed to Large Language Models (LLMs), which learn from the data. Because this data is often collected from all over the internet without consent from the original source, GDPR rules are becoming troublesome for more and more LLM-based products, such as ChatGPT.

Meta’s global privacy director Melinda Claybaugh appeared before the inquiry and admitted that the company was forced to pause the launch of AI products in Europe due to a lack of certainty, and that it has had to give European users an opt-out because of more robust privacy laws there. Senator Shoebridge grilled the Meta representative:

“The truth of the matter is that, unless you consciously had set those posts to private, since 2007, Meta has just decided you will scrape all of the photos and all of the text from every public post on Instagram or Facebook that Australians have shared since 2007, unless there was a conscious decision to set them on private. But that’s actually the reality, isn’t it?”

Claybaugh replied, “Correct”. She added that users can set their posts to private now to prevent future scraping, but this would have no effect on the data already taken.

The realization seems to be creeping in, for the public and for tech companies alike, that training AI models requires such vast amounts of data that it is ‘impossible’ to do so without using copyrighted materials. Considering that millions of users' posts have been used without their consent, it looks like tech giants might face much stricter regulations in future.

Via The Guardian

Ellen Jennings-Trace
Staff Writer

Ellen has been writing for almost four years, with a focus on post-COVID policy whilst studying for a BA in Politics and International Relations at the University of Cardiff, followed by an MA in Political Communication. Before joining TechRadar Pro as a Junior Writer, she worked for Future Publishing’s MVC content team, helping merchants and retailers upload content.
