Two federal judges have ruled that using copyrighted works to train an AI model is a "fair use" under copyright law. A brief overview of the cases and decisions is below.
Kadrey v. Meta Platforms (N.D. Cal. – Judge Vince Chhabria)
Thirteen well-known authors, led by novelist Richard Kadrey, accused Meta of downloading their books from "shadow libraries" and feeding them into its Llama models. Judge Chhabria denied the authors' motion and granted Meta partial summary judgment, holding that Meta's training use was fair as to these plaintiffs.
- Factor 1 (purpose/character): Training an LLM is "highly transformative" because it turns books into statistical patterns powering a tool that can draft emails, translate text, or write code; Meta's commercial motive did not outweigh that transformative purpose.
- Factor 2 (nature): The books are expressive, but that factor "rarely plays a major role" and carried little weight here.
- Factor 3 (amount): Copying entire works was "reasonably necessary" because LLMs perform better when trained on complete, high-quality texts.
- Factor 4 (market effect): This "single most important" factor sank the authors' case. The court found no evidence that Llama can spit out meaningful chunks of their books or that its existence has harmed sales, and it ruled that any lost opportunity to license books as training data is not a cognizable market under copyright law.
Because the writers "made the wrong arguments and failed to develop a record in support of the right one," Meta escaped liability, but only as to these thirteen plaintiffs, not the "countless others" whose books it used. The court thus left the door open for other authors and artists to sue Meta for copyright infringement.
Bartz v. Anthropic (N.D. Cal. – Judge William Alsup)
Three authors — Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson — sued Anthropic over millions of books that the start-up either pirated from Books3/LibGen or bought, scanned, and stored in its internal library before using subsets to train Claude. Judge Alsup ruled as follows:
- Training Copies: Copying books to derive "statistical relationships" and build an LLM was "exceedingly transformative" and therefore fair use.
- Scanned Purchases: Turning lawfully bought print books into digital files for in-house reference was also fair, likened to making space-saving, searchable backups.
- Pirated Library: Maintaining unauthorized copies downloaded from pirate sites in a permanent reference library was not fair use; the court held that building a private research archive did not justify wholesale infringement.
The ruling leaves Anthropic potentially liable for damages tied to the pirated cache but shields its model-training pipeline and its digitization of lawfully acquired books.
Take-aways
The two decisions show that fair-use outcomes turn on case-specific factors, including what proof of market harm a plaintiff can offer and how the copies were obtained and used before training. Judge Chhabria's opinion spotlights plaintiffs' evidentiary burdens, while Judge Alsup draws a bright line between innovative machine-learning uses and plain old piracy.
The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.