The New York Times is suing OpenAI and Microsoft for copyright infringement
The AI companies used the newspaper’s articles for training.
The New York Times is suing OpenAI and Microsoft for using published news articles to train their artificial intelligence chatbots without an agreement that compensates the newspaper for its intellectual property. The lawsuit, which was filed in a Federal District Court in Manhattan, marks the first time a major news organization has pursued the ChatGPT developers for copyright infringement. The NYT did not specify how much it seeks in payout from the companies but said that “this action seeks to hold them responsible for the billions of dollars in statutory and actual damages.”
The NYT claims that OpenAI and Microsoft, the makers of ChatGPT and Copilot, “seek to free-ride on The Times’s massive investment in its journalism” without having any licensing agreements. In one part of the complaint, the NYT highlights that its domain (www.nytimes.com) was the most-used proprietary source mined for content to train GPT-3.
It alleges more than 66 million records, ranging from breaking news articles to op-eds, published across the NYT websites and other affiliated brands were used to train the AI models. The lawsuit alleges that the defendants in the case have used “almost a century’s worth of copyrighted content,” causing significant harm to the Times’ bottom line. The NYT also says that OpenAI and Microsoft’s products can “generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style.” This mirrors other complaints from comedians and authors like Sarah Silverman and Julian Sancton who claim OpenAI has profited off their works.
The New York Times sued OpenAI and Microsoft for copyright infringement, a new front in the debate over the use of published work to train AI. https://t.co/u8qZ247dCl
— The New York Times (@nytimes) December 27, 2023
“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” an OpenAI spokesperson told Engadget. In an email, the representative explained that the two parties were engaged in ongoing “productive conversations” and described the lawsuit as unexpected. “We are surprised and disappointed with this development,” the spokesperson said. Still, OpenAI is hopeful that the two will find a “mutually beneficial way to work together.”
If the lawsuit makes any headway, it could create opportunities for other publishers to pursue similar legal action and make training AI models for commercial purposes more costly. Competitors in the space, like CNN and BBC News, have already tried limiting what data AI web crawlers can scrape for training and development purposes.
While it’s unclear if the NYT is open to a licensing agreement after its earlier negotiations failed, leading to the lawsuit, OpenAI has reached a few deals recently. This month, it agreed to pay publisher Axel Springer for access to its content in a deal projected to be worth millions. And articles from Politico and Business Insider will be made available to train OpenAI’s next-gen AI tools as part of a three-year deal. It also previously made a deal with the AP to use its archival content dating back to 1985. Microsoft did not respond to a request for comment.
Update, December 27, 2023, 8:36 PM ET: This story has been updated to include comments from an OpenAI spokesperson on the lawsuit.