Key Takeaways
Judge Chhabria recently granted summary judgment for Meta Platforms, Inc. (Meta) in two key rulings finding that:
- Meta's use of copyrighted books to train LLMs is fair use due to its highly transformative nature and the lack of market harm evidence.
- The training use is transformative, as its purpose is to train tools to generate varied outputs, unlike the original works, which are meant for entertainment/education.
- The argument that training use harmed the licensing market for training content is circular and should not be considered.
- AI-generated works could cause market harm through indirect substitution, but the plaintiffs lacked sufficient evidence to prove such harm.
- Meta's LLMs do not produce substantial portions of plaintiffs' works, supporting its fair use arguments.
- Plaintiffs' DMCA claims fail because fair use is a full defense to infringement.
Judge Vince Chhabria of the U.S. District Court for the Northern District of California recently issued a pair of significant summary judgment rulings in Kadrey v. Meta, one of the first copyright cases to squarely address "fair use" in the context of training AI models. While the first ruling addressed the fair use question, the second addressed the plaintiffs' claims for removal of copyright management information under the Digital Millennium Copyright Act (DMCA).
Meta prevailed in both rulings, although Judge Chhabria suggested that he expects future litigants may be able to present stronger arguments against fair use if they can show sufficient evidence of market impacts.
Fair Use Ruling
The parties filed cross-motions for partial summary judgment on the question of fair use. Fair use permits certain unauthorized uses of copyrighted works that would otherwise constitute infringement. The Copyright Act (17 U.S.C. §107) sets out four nonexclusive factors for determining whether an unauthorized use of another's work is a fair use. In a June 25 order, Judge Chhabria reviewed each of these factors in turn and ultimately denied the plaintiffs' motion and granted Meta's motion, finding that the training use constituted fair use.
Factor 1: Purpose and Character of the Use
Under the first factor, courts assess the purpose and character of the use. While a number of elements—including whether a use is of a commercial nature or for nonprofit educational purposes—may influence this analysis, Judge Chhabria focused primarily on whether the accused use was "transformative."1
Transformative Use
Judge Chhabria found that this factor strongly favored Meta. He concluded that Meta's use of the plaintiffs' books to train its large language models (LLMs) was highly transformative because the purpose and character of the training use were fundamentally different from those of the original works. He emphasized that while the plaintiffs' books were meant to be read for entertainment or education, Meta used them to train innovative software tools that generate a wide variety of outputs and perform other tasks.
The court rejected the argument that training LLMs serves the same purpose as the books because it is similar to a human reading a book. The court distinguished the technical nature of LLMs from human learning and noted that Meta's creation of a general-purpose tool available to anyone has the potential to exponentially multiply creative and functional expression.
The court also rejected the plaintiffs' argument that Meta's use is not transformative because its LLMs can, if prompted to do so, output materials that "mimic" the plaintiffs' work or writing styles, so that the use amounts to mere "repackaging." Instead, the court found that the plaintiffs had presented no evidence that Meta's LLMs reproduced substantial portions of their works, even when prompted adversarially to do so. It concluded that, at most, the plaintiffs' evidence suggested Meta wanted its LLMs to emulate certain writing styles, which are not protectable under copyright law.
Commercial Use
The court also addressed the commercial nature of Meta's use. While it acknowledged that Meta expected to generate significant revenue from the models, it found this did not outweigh the transformative character of the use. Although commercial uses tend to weigh against fair use, courts have found such weight diminished where the use is transformative. Thus, the commercial nature was relevant but not dispositive.
Bad Faith
The plaintiffs argued that knowingly using unauthorized sources, such as shadow libraries, to acquire the books used for training should preclude a fair use finding. However, the court explained that the use of shadow libraries (which the plaintiffs referred to as "piracy") is not itself determinative under fair use, which, by definition, allows for certain unauthorized copying. While the manner of acquisition could be relevant to questions of bad faith, the court expressed skepticism about the significance of bad faith in the fair use analysis and found it was not dispositive here, given the overall transformative purpose of the use.
Downloading
Judge Chhabria also rejected the argument that downloading the books should be evaluated separately from the use of the books in training. He concluded that downloading must be considered in light of its ultimate, highly transformative purpose (i.e., training), and because the ultimate training use was transformative, so too was the intermediate use of downloading the books. Judge Chhabria reasoned that even works that were downloaded for training datasets but were not ultimately used for training were still part of the same transformative process and thus could not be meaningfully separated from the overall purpose, which justified the intermediate copying as fair use.
Factor 2: Nature of the Copyrighted Works
Under the second factor, courts look at whether the work is more creative or more informational and factual, recognizing that fair use is more difficult to establish when the works are highly creative. Judge Chhabria concluded that this factor favored the plaintiffs because the plaintiffs' works, which are primarily novels, memoirs, and plays, are "highly expressive." Even as to factual works, such as autobiographies, the court emphasized that copyright still protects the expressive choices authors make in presenting the facts.
Nevertheless, the court noted that this factor is typically of limited importance in the overall fair use analysis, especially when the works in question have already been published.
Factor 3: Amount and Substantiality of the Portion Used
The third factor considers "the amount and substantiality of the portion used in relation to the copyrighted work as a whole," assessed in light of the purpose of the use. Judge Chhabria explained that this factor is closely linked to the first factor, since the amount of permissible copying depends on the nature and purpose of the secondary use.
In this case, the court concluded that the third factor favored Meta, even though Meta copied the plaintiffs' books in full, because copying the entire work was reasonably necessary to achieve the transformative purpose of training its models. The court also noted that the amount copied was not especially relevant because Meta's LLMs did not output any meaningful amount of the plaintiffs' books, and therefore the amount copied would not increase the risk that the use would serve as a market substitute for the original.
Factor 4: Effect of the Use Upon the Potential Market for or Value of the Copyrighted Work
The fourth fair use factor examines the effect of the use upon the potential market for or value of the copyrighted work. Often considered the most important factor in fair use analysis, it considers both actual market harm and whether there is a potential for harm to the market for the work if defendant's conduct becomes widely adopted. The court noted that the relevant harm at issue was market substitution for the original works and addressed three possible types of market harm a plaintiff could try to establish.
Direct Substitution
The court acknowledged that if the tool could be used to generate significant portions of the plaintiffs' books, its output could serve as a market substitute. However, it found no evidence of direct substitution because both parties' experts agreed that Meta's LLMs would not output more than 50 words from any plaintiff's book, even when provoked with "adversarial" prompts. The court found that this minimal capacity for reproduction posed no realistic threat of market substitution.
Licensing Market for AI Training
The plaintiffs relied heavily on the theory that Meta's unauthorized use of their books undermined a potential market for licensing those books for training LLMs. However, the court rejected this argument, explaining that if accepted, it would always favor plaintiffs because it assumes that the use is not transformative and is therefore subject to licensing. If the use is transformative, on the other hand, it may be a fair use for which no license is necessary. Thus, the court held that to prevent this analysis from becoming circular, the harm from the loss of fees paid to license the work at issue for a transformative purpose is not considered.
Indirect Substitution
The third theory of potential market harm discussed is that LLMs trained on copyrighted books may enable the mass generation of competing works that, even if not infringing themselves, may dilute the market for the original works by saturating the field. The potential harm is a form of market dilution or "indirect" market substitution. Judge Chhabria recognized that such indirect substitution may vary based on the work or the author but noted the possibility that LLMs could generate vast numbers of books quickly, cheaply, and with minimal creativity, potentially crowding out new or lesser-known authors.
The decision rejected Meta's argument that harm caused by outputs is relevant only if the outputs themselves are infringing. The court noted that while it would be easier to show market harm in that situation, less similar outputs, such as books on the same topic or in the same genre, can still compete for sales with the plaintiffs' books. It also rejected the argument that "legitimate" competition from non-infringing works is not cognizable as market harm, in part because, in the court's view, the LLMs benefit from the creative expression in the works they are trained on.
Although Judge Chhabria was open to theories of market harm based on indirect substitution, he found for Meta on this issue because plaintiffs had not raised this theory in their complaint or in their summary judgment motion. Judge Chhabria concluded that the plaintiffs had therefore failed to produce sufficient evidence to support this theory.
Overall Analysis
The court noted that because Meta's use was found to be highly transformative, the plaintiffs needed a decisive win on the market harm factor to prevail. They fell short of this standard because they did not produce meaningful evidence of market harm, and the court therefore granted Meta summary judgment. However, the court also noted that fair use is a fact-specific affirmative defense and made clear that other cases with different fact patterns, and better developed evidence of the market effects of the defendant's use, could have a different outcome.
DMCA Ruling
The parties filed cross-motions for partial summary judgment on the plaintiffs' claims under § 1202(b)(1) of the DMCA. Under that provision, the intentional removal of copyright management information is prohibited when the person removing it knows or has reason to know that such removal will "induce, enable, facilitate, or conceal an infringement." In a June 27 order, the court denied the plaintiffs' motion and granted Meta's motion, holding that because Meta's copying of the plaintiffs' books was fair use, there was no underlying infringement to facilitate or conceal, and therefore no DMCA violation.
Other Judicial Activity
Judge Chhabria's decisions follow closely on the heels of another ruling from the Northern District of California addressing whether training generative AI tools is a fair use. In Bartz v. Anthropic, Judge William Alsup also ruled, in a June 23 order, that using copyrighted works to train an LLM constituted a fair use. However, Judge Alsup held that other uses of the books, such as making copies to create a central library, were not entitled to summary judgment on fair use on the record before him and permitted the plaintiffs to proceed to trial on those uses. The two sets of decisions are notable for some key differences in the reasoning they employ.
Both decisions found that training generative AI systems is a highly transformative use and placed substantial weight on this finding. Given this substantial weight, they both rejected the argument that training LLMs on content will displace a market for licensing the content for that purpose. However, Judge Alsup separately analyzed use of the copies to create a central library, versus for training purposes, and he further distinguished between purchased copies and pirated copies. He concluded that the use of pirated copies to build a central library was not transformative and did not grant summary judgment with respect to copies made from the central library that were not used for training. Judge Chhabria, by contrast, was not concerned about making copies to create a central library, even if they were not ultimately used for training.
In his analysis of market harm, though, Judge Alsup did not view indirect market substitution as sufficient market harm. He analogized LLM training to using books to teach children how to write, an activity that could also lead to the creation of numerous competing works, but that he concluded was manifestly not an infringing use of a work under current law. In his view, such downstream competition was not the kind of displacement the Copyright Act was designed to prevent. Rather, he said that the Act seeks to advance original works of authorship, not to protect authors against competition. Judge Chhabria explicitly rejected the analogy to teaching children how to write. He emphasized that human learning is not comparable to creating a software tool that enables a single user to produce massive quantities of directly competing content with minimal time and effort.
These rulings leave many questions unanswered, but they offer valuable insights for content creators, rights holders, and technology companies navigating copyright issues in the generative AI landscape.
Footnote
1 The test for whether a use is transformative is whether and to what extent it "merely supersedes the objects of the original creation, or instead adds something new, with a further purpose or different character, altering the first with new expression, meaning, or message." Campbell v. Acuff-Rose Music, 510 U.S. 569 (1994).