There is a lot of hype around providing data subject rights for the use of personal data in training AI models. I have always thought there is no real need to provide rights to delete, correct, or even access personal information "in" a trained AI model, because the trained model, by itself, does not actually contain any personal information (perhaps a subject for a longer blog post). The court's reasoning in favor of Meta Platforms' motion to dismiss supports this theory: if the AI model does not contain a derivative work of copyrighted material, it stands to reason that the trained large language model (LLM) does not contain personal information (under any modern definition) either.

Of course, time will tell whether a court directly addresses the need to provide data subject rights for personal information used to train an AI model. For now, all we know is that the U.S. Federal Trade Commission requires that notice be provided when personal information is used to train AI models.

The court's order makes the point directly: The plaintiffs allege that the "LLaMA language models are themselves infringing derivative works" because the "models cannot function without the expressive information extracted" from the plaintiffs' books. This is nonsensical. A derivative work is "a work based upon one or more preexisting works" in any "form in which a work may be recast, transformed, or adapted." 17 U.S.C. § 101. There is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs' books.
