C.A.G. Insights

REMOVAL OF COPYRIGHT MANAGEMENT INFORMATION NOT ACTIONABLE FOR TRAINING OPENAI LLM

Raw Story Media and AlterNet sued OpenAI in the Southern District of New York, alleging that OpenAI violated the Digital Millennium Copyright Act (DMCA). The plaintiffs argued that using copyrighted content to train OpenAI’s large language model (LLM) created a risk that their works would be reproduced with the copyright management information (CMI) removed, thus violating Section 1202(b) of the DMCA.

Lack of Standing to Bring DMCA Claim

OpenAI moved to dismiss the lawsuit, and the court granted the motion, reasoning that the plaintiffs failed to explain how they were harmed if nobody ever saw ChatGPT’s internal training models. Notably, the court found that because the model was trained on a large and diverse data set, it was unlikely that ChatGPT would output plagiarized content from one of the plaintiffs’ articles.

The court also rejected the argument that removing the identifying information from the copyrighted work was a DMCA violation:  “I am not convinced that the mere removal of identifying information from a copyrighted work — absent dissemination — has any historical or common-law analog.” The court reasoned, “Plaintiffs allege that their copyrighted works (absent copyright management information) were used to train an AI-software program and remain in ChatGPT’s repository of text. But Plaintiffs have not alleged any actual adverse effects stemming from this alleged DMCA violation.”

The court distinguished cases filed by other news outlets where claims of copyright infringement were alleged, stating that it remains to be seen whether those other legal theories can survive, “but that question is not before the court today.”

According to the court, Section 1202 is not intended to grant copyright holders exclusive control over their works’ modifications; instead, it ensures the integrity of CMI in the digital marketplace. Unlike copyright infringement claims, this provision of the DMCA does not protect owners’ right to control their work’s future iterations. Based on this interpretation, and with no allegations to support tangible harm, such as the dissemination of plaintiffs’ works by ChatGPT without CMI, the plaintiffs’ claims failed the test of concrete injury required to meet Article III’s standing requirement.

Plaintiffs argued there was a “substantial risk” that ChatGPT would produce their works without attribution. However, the court rejected this argument, finding no substantial likelihood of harm. According to the court, ChatGPT’s large and diverse training dataset, drawn from internet sources, reduced the probability that a response would match plaintiffs’ content verbatim.

Although Plaintiffs cited instances of previous ChatGPT versions generating plagiarized outputs, the court found this insufficient to establish a “substantial risk” that the current version would violate their copyrights. Consequently, the court held that the plaintiffs lacked the standing required to pursue injunctive relief based on future harm.

Broader Implications of the DMCA

The court observed that the plaintiffs’ core complaint addressed concerns beyond the DMCA’s intended protections, focusing instead on OpenAI’s use of their content for training purposes without licensing or compensation. Judge McMahon emphasized that Section 1202 aims to prevent misinformation and uphold CMI integrity in digital markets, not to guarantee compensation for unlicensed data use in AI training. Although other legal theories or statutes might more appropriately address the plaintiffs’ issues, this provision of the DMCA does not recognize their alleged harm as actionable injury.

Denial of Leave to Amend

The plaintiffs requested leave to amend their complaint if it was dismissed. The court denied the request without prejudice, allowing them to refile an amended complaint if they can demonstrate valid grounds for relief. However, the court doubted their ability to craft a viable DMCA claim under these facts.

Conclusion

The court concluded that the plaintiffs lacked Article III standing to seek damages or injunctive relief and dismissed the case in favor of OpenAI. This decision illustrates the challenge of proving concrete injury under the DMCA without evidence of actual dissemination or infringement. The decision also suggests that it may be difficult even to bring a claim of copyright infringement based on current versions of AI, but again, copyright infringement was not before the court in this case.