OpenAI, Microsoft face class-action suit over internet data use for AI models

A class-action complaint filed Wednesday in the Northern District of California alleges tech leaders OpenAI and Microsoft Corp. used "stolen and misappropriated" information from hundreds of millions of internet users without their knowledge to train and develop their artificial intelligence technology, such as the chatbot ChatGPT.

The 16 plaintiffs, who are represented by the Clarkson Law Firm and listed with initials, claimed the defendants "continue to unlawfully collect and feed additional personal data from millions" worldwide to that end and that they systematically scraped 300 billion words from the internet without consent.

The 157-page lawsuit written by Ryan Clarkson, the managing partner of the firm, also asserts that without the "unprecedented theft of private and copyrighted information belonging to real people," the products developed by the companies "would not be the multi-billion-dollar business they are today."

"Once trained on stolen data, defendants saw the immediate profit potential and rushed the products to market without implementing proper safeguards or controls to ensure that they would not produce or support harmful or malicious content and conduct that could further violate the law, infringe rights and endanger lives," Clarkson continued. "Without these safeguards, the products have already demonstrated their ability to harm humans, in real ways."



OpenAI with Microsoft Bing on mobile in a photo illustration March 15, 2023, in Brussels, Belgium. (Jonathan Raa/NurPhoto via Getty Images)

The firm said the defendants' disregard for privacy laws was matched only by their disregard for the "potentially catastrophic risk to humanity," citing a previous statement from OpenAI CEO Sam Altman.

He has warned of the dangers of a misaligned superintelligent AGI before and recently called for AI regulation on Capitol Hill.

"AI will probably most likely lead to the end of the world, but in the meantime, there’ll be great companies," they quoted Altman as saying. Some media outlets, however, have noted he was likely joking.

In addition to calls for "transparency," "accountability" and "control," the lawsuit requests injunctive relief in the form of a temporary freeze on commercial access to and development of the OpenAI products.


Sam Altman, CEO of OpenAI, speaks during the Bloomberg Technology Summit in San Francisco June 22, 2023. (David Paul Morris/Bloomberg via Getty Images)

It also asks for the establishment of an "AI Council" to be responsible for approval of products before they are deployed and "data dividends" as compensation for "the stolen data on which the products depend."

OpenAI did not immediately respond to FOX News' request for comment on the matter. 


A sign outside the Microsoft Campus in Redmond, Wash., March 3, 2022. (Chona Kasinger/Bloomberg via Getty Images)

In March, OpenAI updated its data usage and retention policies, saying it would not use data submitted by customers via its Application Programming Interface (API) to train or improve its models unless the user explicitly decides to share data for that purpose.

Additionally, any data sent through the API would be retained for abuse and misuse monitoring purposes for a maximum of 30 days, after which it would be deleted, unless otherwise required by law.

"We don’t use data for selling our services, advertising or building profiles of people — we use data to make our models more helpful for people," an OpenAI blogger said last week. 

Microsoft, which plans to invest billions in OpenAI, declined to comment.

OpenAI isn't the only company that has used internet data to train AI models, but Clarkson told The Washington Post on Wednesday that OpenAI was the "natural first target" after igniting an "AI arms race."
