Recent Ruling Regarding Copyright Protected Data And AI Tool Training

The meteoric rise of AI has brought about new legal issues, especially in the copyright space. Currently, the following ongoing litigations are at the forefront of the intersection of AI and copyright infringement: Authors Guild v. OpenAI Inc., Case No. 1:23-cv-08292 (S.D.N.Y.), consolidated with Alter v. OpenAI Inc., Case No. 1:23-cv-10211, and Basbanes v. Microsoft Corporation, Case No. 1:24-cv-00084; Kadrey v. Meta Platforms Inc., Case No. 23-cv-03417 (N.D. Cal.), consolidated with Farnsworth v. Meta Platforms, Inc., Case No. 24-cv-06893; and Thomson Reuters v. Ross Intelligence, Inc., Case No. 20-cv-00613 (D. Del.). The parties involved in these cases are copyright owners and companies that develop and employ AI-based tools, such as OpenAI/Microsoft and Meta Platforms. These pending cases concern legal issues that courts have not previously addressed, and thus any decisions in these cases may have a profound impact on the future of AI.

In fact, in the Thomson Reuters case, Judge Stephanos Bibas issued a noteworthy ruling earlier this month, which found that Ross Intelligence, Inc. (“Ross”) engaged in direct copyright infringement of Thomson Reuters' data in developing its AI-based legal research tool. Judge Bibas rejected a fair use defense asserted by Ross. Judge Bibas' opinion is the first of its kind in considering how courts may view the use of copyright protected materials in training non-generative / traditional AI-based products.

Thomson Reuters owns copyrights to the headnotes in its Westlaw legal research platform, which contain summaries of key points of law and case holdings in legal opinions. In this case, Ross, a legal search engine and competitor of Westlaw, obtained “Bulk Memos” through a third-party vendor, LegalEase, to train its AI-based legal search tool. The Bulk Memos were compilations of legal questions and answers that were derived from Westlaw's headnotes. Judge Bibas noted that “Ross built its competing product using Bulk Memos, which in turn were built from Westlaw headnotes.” Ross had neither a license nor authorization to use the data from Westlaw's headnotes and thus, to avoid a finding of infringement, Ross claimed a fair use defense.

Under U.S. copyright law, a court analyzes the following four factors to determine whether a party has a valid fair use defense:

(1) the purpose and character of the use, including whether the use is of a commercial nature or for nonprofit educational purposes;
(2) the nature of the copyrighted work;
(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
(4) the likely effect of the use on the potential market for the copyrighted work. See 17 U.S.C. § 107(1)-(4).

The first and fourth factors generally hold the most weight in a fair use analysis. For the first factor, Ross acknowledged that its use of the data from Westlaw's headnotes was for a commercial purpose. In addition, Judge Bibas did not find that Ross' use was transformative, as the copying of the Westlaw headnotes was not “reasonably necessary to achieve the user's new purpose.” The first factor therefore weighed against Ross. The second and third factors were found in favor of Ross. For the second factor, Judge Bibas found that there was limited creativity in the headnotes. For the third factor, Judge Bibas concluded that “Ross's output to an end user does not include a West[law] headnote” and thus the Westlaw headnotes were not made accessible to the public. Judge Bibas weighed the fourth factor most heavily here, finding that Ross used the copyrighted material from the Westlaw headnotes to train its AI-based tool for the purpose of developing a competing legal research product. Judge Bibas further stated in his reasoning that the “effect on a potential market for AI training is enough.”

Judge Bibas cautiously limited his ruling to “traditional AI,” meaning that Ross used the copyright protected data from the Westlaw headnotes as input to train its AI legal research tool to classify such information. As Judge Bibas explained, “when a user enters a legal question, Ross spits back relevant judicial opinions that have already been written.” It was undisputed between the parties that Ross' AI tool did not use generative AI, which is “AI that writes new content itself.” Because the scope of Judge Bibas' ruling is narrow, it leaves open the question of fair use involving generative AI tools. Cases involving such tools, including the AI technology at issue in the pending OpenAI and Meta Platforms cases, may have a different result.

Yet, regardless of the type of AI technology one employs, copyright concerns should be at the forefront of AI training, specifically with respect to the data used for such training. One consideration is to obtain a license from the owner that authorizes a company to access and use copyright protected data/information for the purpose of training the AI tool.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
