ARTICLE
4 July 2025

Making Sense Of The First Two—Often Contradictory, Sometimes Confounding—Gen AI Fair Use Rulings

Manatt, Phelps & Phillips LLP


After a long wait, we received the first two generative AI (GAI) fair use rulings in rapid succession. The headline findings of fair use for these particular GAI trainings mask deeper lessons in these mixed rulings and belie judicial disagreements about how to reckon with machine learning's disruptive power.

On June 23, 2025, District Judge William Alsup of the Northern District of California issued the first substantive fair use ruling regarding a GAI platform in Bartz v. Anthropic. Two days later, on June 25, District Judge Vince Chhabria issued the second in Kadrey v. Meta.

1. Bartz v. Anthropic – A Mixed Result on Vulnerable Footing?

Bartz considered the Claude chatbot, which was trained on books and texts—without the authors' permission—that Anthropic either pirated from various sources (including Books3, Library Genesis and Pirate Library Mirror) or bought and then destructively scanned (taking off covers and bindings). These sets were placed into a central "research library" and copied again in whole or part to create "data mixes" used to train large language models (LLMs) that supported the Claude software. Anthropic retained these libraries after training.

Anthropic moved for summary judgment on the issue of fair use only, arguing "that pirating initial copies of Authors' books and millions of other books was justified because all those copies were at least reasonably necessary for training LLMs." Judge Alsup—himself a coder, who also presided over Oracle v. Google—made multiple rulings: (1) He held that "the use of the books at issue to train Claude and its precursors was exceedingly transformative" and a fair use. (2) The digitization of the print books purchased by Anthropic was also a fair use because it only replaced them with scanned and compressed digital copies. (3) However, the Court held that Anthropic "had no entitlement to use pirated copies for its central library." (4) Finally, creating a "permanent, general-purpose library was not itself a fair use excusing Anthropic's piracy."

Is the reasoning sound? The Court rejected the argument that using the Authors' works for training in reading and writing requires a license, reasoning that "[e]veryone reads texts, [] then writes new texts," and to require payment "specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable." The Court's reasoning makes no allowances for the unprecedented, machine-based consumption of such works and how such models are engaging in activities that humans have never performed with such speed and capacity. Thus, the Court's rationale that, "[l]ike any reader aspiring to be a writer, Anthropic's LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different," anthropomorphizes and normalizes machine learning in ways other courts might reject on similar facts.

Indeed, two days later, Judge Chhabria disagreed with Judge Alsup, writing that "[w]hat copyright law cares about, above all else, is preserving the incentive for human beings to create artistic and scientific works." Judge Chhabria then took issue with his colleague's learning analogy, calling it "inapt" and writing that "when it comes to market effects, using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a minuscule fraction of the time and creativity it would otherwise take."

The lens through which a judge considers the power of machine learning thus plays a significant role in a fair use determination. And there appears to be a disconnect between Bartz's reasoning that the "technology at issue was among the most transformative many of us will see in our lifetimes," and its conclusion that the training is permissible because the machine learns in the way a human would. In such a framing, human creators and rightsholders will nearly always come out on the losing end to more powerful computers. Is that the result the Copyright Act intends? Surely not, especially as courts and the Copyright Office are unanimous in holding that only humans can be authors for purposes of the Copyright Act.

Other Key Takeaways: The legitimacy of the source and acquisition of the training material matters. Using pirated works did not support fair use, while purchasing copies did. How this plays out for musical, visual or film works remains to be seen, but books are a much different medium when it comes to purchasing and disposing of them through destructive copying than are music, images or audiovisual works. And the Bartz ruling may render vulnerable LLMs that have already been trained on unlawfully acquired works, as the Court "doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use."

The Limits of Bartz: Here, only inputs and training were at issue, not outputs. The Court noted that "if the outputs seen by users had been infringing, Authors would have a different case. And, if the outputs were ever to become infringing, Authors could bring such a case." Thus, the legal questions of fair use for outputs are for another day (and another case).

What Happens Next? The court said it will hold a trial "on the pirated copies used to create Anthropic's central library and the resulting damages, actual or statutory (including for willfulness)." Given the potential size of the class (certification remains pending), and the possibility of millions of registered works, a damages award could be significant. However, it seems reasonably likely that one side or the other will seek interlocutory review of Judge Alsup's order.

2. Kadrey v. Meta – A Narrow Platform Victory and Roadmap for Finding Market Harm?

Two days later and two floors down in San Francisco, Judge Chhabria ruled for Meta and against book authors in what he called a "limited," case-specific ruling—one that is unlikely to give GAI platforms much comfort. The "upshot," Judge Chhabria wrote, "is that in many circumstances it will be illegal to copy copyright-protected works to train generative AI models without permission." But this case, he found, was different, and that "given the state of the record, the Court has no choice but to grant summary judgment to Meta." In the same breath he noted that "in the grand scheme of things, the consequences of this ruling are limited," and chastised the plaintiffs for having "made the wrong arguments and fail[ing] to develop a record in support of the right one."

So if the plaintiffs' attorneys here made the wrong arguments, what would have been the right one? According to the Court, the "potentially winning argument" was that the authors' works were copied "to create a product that will likely flood the market with similar works, causing market dilution." But the evidentiary record did not support that argument, and the Court held that Meta's Llama model—which was trained from works downloaded from "shadow libraries," including LibGen, Z-Library, and Books3—was a fair use.

The Fair Use Factors: Marching through the fair use test, the Court found that the first factor—the purpose and character of the use—favored Meta because the training of Llama on the plaintiffs' works was "highly transformative." Factor two, the nature of the copyrighted work—though typically not a dispositive factor—favored the plaintiffs because their books are "highly expressive works," including novels, memoirs and plays. Factor three—the amount and substantiality of the portion used—is related to the first and favored Meta, in part because "Meta's LLMs won't output any meaningful amount of the plaintiffs' books" and also because the amount copied "was reasonable given its relationship to Meta's transformative purpose."

Factor 4: Market Harm: The Court recognized that the fourth factor—effect upon the potential market for or value of the copyrighted work—was the single most important and, on the record before it, ruled for Meta. Judge Chhabria noted three potential market harm arguments here:

First, that the model regurgitates the plaintiffs' works, thereby substituting for them for free. Here, however, the Court found that Llama "does not allow users to generate any meaningful portion of the plaintiffs' books," noting the record showed no more than an output of 50 words from any of their books, even in response to adversarial prompting.

Second, there might be harm to the market (or development of the market) for licensing works for AI training. But the Court found that whether such a market currently exists or might develop "is irrelevant," because it is not one that the plaintiffs are entitled to monopolize, and the harm from the loss of licensing fees is not a cognizable harm for this factor, lest the test become circular.

Third, the Court noted that plaintiffs might argue that the model will generate outputs that will compete with the originals and might substitute for them. But here it found that "the plaintiffs' presentation is so weak" that it could not support this position, as they had "never so much as mentioned it in their complaint," nor in their summary judgment motion. The Court suggested that such an argument, properly developed and presented, could have carried the day in the fair use analysis, observing that this case "involves a technology that can generate literally millions of secondary works, with a minuscule fraction of the time and creativity used to create the original works it was trained on." And it noted that the novelty of the market and the lack of precedent would not be an impediment to making such an argument, underscoring that "Courts can't stick their heads in the sand to an obvious way that a new technology might severely harm the incentive to create, just because the issue has not come up before."

Finally, Judge Chhabria brushed away policy arguments that have been made in favor of unlicensed training, writing that "the suggestion that adverse copyright rulings would stop this technology in its tracks is ridiculous."

Key Takeaways & Silver Linings for AI Plaintiffs: The lessons from Kadrey can thus be summed up as follows: copyright owners have a path to defeating fair use defenses, but only where they adequately develop a record of market harm sufficient to overcome the transformative nature of GAI training.

And with these two orders now issued, we turn our eyes back to the horizon to see how other courts will reckon with similar issues and how they may cite, adopt, or distinguish these rulings.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
