ARTICLE
5 December 2024

Generative AI Is Facing Decisive Battles In War Over Fair Use

Katten Muchin Rosenman LLP

  • Katten partner examines two court cases over AI training data
  • Copyright Office guidance will influence fair use in AI

The legality of generative artificial intelligence rests in part on copyright law, and the law's tolerance for training generative AI models with third-party works without permission could affect the overall legal viability of current technology.

Model developers argue that generative AI models need third-party copyrighted data because the public domain and licensed data currently available can't supply the vast volume of data needed for training. So, the argument goes, without copyrighted data in training data sets, we wouldn't have today's or tomorrow's generative AI breakthroughs.

Developers also argue that since the training data isn't copied verbatim into the generative AI model, the model shouldn't produce verbatim copies of copyrighted content. They say the copyrighted content is transformed into something new and unrecognizable within the model.

That concept is called fair use—and it's being hotly contested in courts, legislatures, and agencies worldwide. While answers remain elusive, several legal actions could offer some substantive guidance over the next several months. Two federal court cases in particular present a real possibility of substantive answers in the generative AI copyright wars.

A Delaware federal judge is poised to rule on renewed motions for summary judgment in Thomson Reuters v. Ross Intelligence. The case involves allegations that Ross Intelligence Inc.'s competing AI legal research tool used content from Thomson's Westlaw legal research database as training data without authorization. Oral argument on the summary judgment motions will be held on Dec. 5 and 6.

The court postponed trial to decide the renewed summary judgment motions, so it must believe there are issues that may be resolved before trial. A substantive ruling in Thomson would become a roadmap for litigating generative AI infringement and fair use issues. For example, if the court rules that particular AI training practices are or aren't fair use as a matter of law, litigants around the country undoubtedly would seek to apply or distinguish those rulings.

A California federal court heard oral arguments on motions Monday in Concord Music Group v. Anthropic, involving Anthropic's alleged unauthorized use of the plaintiffs' music lyrics to train the Claude series of generative AI models. The defendant moved to dismiss the case, while the plaintiffs moved for a preliminary injunction—the only generative AI copyright case featuring such a motion.

A preliminary injunction motion forces the court, early in the case, to confront the question of the plaintiff's "likelihood of success on the merits" of the copyright infringement claim and the defendant's fair use arguments.

Like the summary judgment ruling in Thomson, the court's preliminary injunction decision in Concord may become a guiding light in the dozens of other cases that raise similar fundamental issues.

Practitioners also are eagerly awaiting guidance from the Copyright Office on applying the fair use doctrine to generative AI training and the copyrightability of works created using generative AI. Register of Copyrights Shira Perlmutter recently testified before the Senate Judiciary Committee that she expects those reports to arrive by the end of 2024.

The Copyright Office's report should influence courts and others, particularly if it takes a clear position on fair use under certain factual scenarios, provides additional considerations or factors to weigh in the generative AI fair use analysis, or calls for Congress to deal with the fair use issue.

Today, US copyright law has a strict "human authorship" requirement that essentially prohibits fully AI-generated works from receiving protection because a human didn't make the creative decisions. Meanwhile, generative AI adoption across creative industries shows no signs of slowing. New or different IP protection strategies may emerge, depending on whether the Copyright Office softens its stance on human authorship (and whether Congress weighs in).

For example, companies that rely heavily on copyright protection for core products and services may implement new policies on generative AI uses and corresponding human involvement and recordkeeping to increase chances of copyright protection. They also may seek to build out enhanced contractual protections for their work and shift some resources towards trade secrets and other IP systems.

Since there is no comprehensive federal AI law, state legislatures remain focused on regulating generative AI, addressing issues such as algorithmic discrimination and bias, AI use disclosures, and right of publicity protections against AI lookalikes, soundalikes, and deepfakes.

The Colorado AI Act is a prominent example; it's considered the first comprehensive state AI law addressing algorithmic discrimination across a number of industries. California recently enacted a law requiring generative AI developers to post information on their websites about the data used to train their AI systems, including whether copyrighted works were included in the training data.

The EU AI Act, which became law earlier this year, may apply to US companies doing business in the EU, but how much detail its required training data summaries must contain remains unclear.

To the extent generative AI model developers must provide training data summaries that contain sufficient detail for content owners to identify use of their works to train models, this could lead to a surge in copyright litigation and/or licensing deals.

While generative AI developers assert fair use and defend their model training practices against copyright challenges, they're not merely waiting for answers from courts or legislatures.

Developers have pursued licensing deals with major publishers and content owners for permission to use copyrighted content as model training data and/or retrieval-augmented search data (that is, sources from which generative AI answers are pulled). They're also exploring technological guardrails to reduce regurgitation—verbatim copying in generative AI outputs.

As a result of these efforts, future generative AI models may not be as dependent on fair use arguments or as vulnerable to output-side infringement claims. The level of investment in these safeguards, and their effectiveness in reducing legal challenges, is another front in the generative AI wars worth monitoring.

Originally published by Bloomberg Law.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
