Microsoft AI CEO claims any content published online is "freeware" to help train AI — and people are not happy

Microsoft's head of AI has sparked outrage by describing all publicly available information used to train AI models as “freeware.”

Speaking in an interview with CNBC at the Aspen Ideas Festival, Mustafa Suleyman, CEO of Microsoft AI, attempted to differentiate between openly accessible web content and copyrighted material explicitly protected by publishers.

However, he also acknowledged the complexity surrounding content that publishers specifically protect from scraping.

Should AI use content published online for training?

During the broad discussion covering the current state of AI technology, its potential impact on various industries and society, the challenges and concerns surrounding its development and the role of AI in the future, Suleyman also emphasized the need for responsible development and governance.

The conversation also delved into the debate surrounding open-source and closed-source AI models, with Suleyman advocating for cooperation rather than an adversarial approach when it comes to international development, particularly with regard to China.

However, regardless of where AI models are trained, content creators have argued that their intellectual property is being exploited without compensation, with many suggesting that the continued unauthorized use of their work threatens their livelihoods and, to a certain degree, the integrity of generative AI.

Suleyman’s statement that the legal boundaries of AI model training are still unclear is reflected in ongoing court cases. Shortly after the discussion, the Center for Investigative Reporting filed a lawsuit against OpenAI and its major investor, Microsoft, for using the nonprofit’s content without permission or compensation.

The body’s CEO, Monika Bauerlein, stated: “OpenAI and Microsoft started vacuuming up our stories to make their product more powerful, but they never asked for permission or offered compensation, unlike other organizations that license our material.”

While Microsoft faces ongoing scrutiny over its handling of data for AI, it has at least offered users of its GenAI tools protection against copyright cases.

An OpenAI spokesperson told us: "We are working collaboratively with the news industry and partnering with global news publishers to display their content in our products like ChatGPT, including summaries, quotes, and attribution, to drive traffic back to the original articles. A component of the partnerships is the ability to leverage publisher content using various machine learning and training techniques to help us optimize the display of that content and make it more useful to users."

TechRadar Pro asked Microsoft to comment on the recent lawsuit, but we did not receive an immediate response.

With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. Craig is also an avid bargain-hunter, so you can be sure that any deal he finds is top value!
