OpenAI is challenging a federal court order requiring the company to retain all user data, including deleted chats, in a lawsuit brought by The New York Times.
“We strongly believe this is an overreach by The New York Times,” said OpenAI COO Brad Lightcap, adding that the company will continue to challenge the decision in order to protect users’ trust and privacy.
OpenAI’s challenge follows a May 13 order to “retain and set aside all output log data that would otherwise be deleted until further determined by the court.”
The New York Times filed a lawsuit against OpenAI and Microsoft in December 2023, alleging that both companies used content from the newspaper without permission to train large language models such as ChatGPT and Bing Chat.
The newspaper claims this infringes on its copyright and threatens its business model of original journalism. Last month, it argued that potential evidence of copyright infringement could be lost as users clear their chat history.
At the heart of the case is whether using copyrighted material to train generative AI models qualifies as “fair use.” The Times alleges that OpenAI’s tools sometimes generate almost verbatim text from its articles and are also able to bypass the paywall with AI-generated summaries.
Both sides claim to be taking the moral high road. The Times has stated that it is protecting journalism and the ability of the media to do their job and get paid for it.
OpenAI Chief Executive Officer Sam Altman has accused the newspaper of being “on the wrong side of history,” while the company claims The Times selectively chose the dates in the lawsuit.
As the generative AI industry expands, courts are becoming important battlegrounds in the fight over data, privacy, and intellectual property.
The lawsuit is one of several high-profile copyright claims against OpenAI and other AI companies. In April, Ziff Davis, which owns media outlets including PCMag and Mashable, sued OpenAI, alleging that the company used its content without permission.
This week, Reddit filed a lawsuit against another AI company, Anthropic, alleging that it copied Reddit data without permission. Anthropic is also being sued by music publishers and authors.
What is the reason for the lawsuit against OpenAI?
The lawsuit was filed by The New York Times, which alleges that OpenAI and Microsoft made unauthorized use of its copyrighted material to train their language models.
Why did OpenAI appeal the court order?
OpenAI argues that the requirement to retain all user data, including deleted chats, is a violation of privacy and not relevant to the lawsuit.
How could the outcome of this lawsuit impact the broader AI industry?
The outcome could set precedents for future use of copyrighted material in the training of AI models and the way privacy and intellectual property are handled in the digital environment.