Introduction
Generative Artificial Intelligence ("GAI") has revolutionized the way we create and interact with content. From creating realistic images and videos to generating complex text outputs, GAI models have transformed almost all sectors. With its growing prominence, it has, of late, also sparked debates on underlying legal, ethical, and regulatory concerns across the globe.
The lawsuit filed by The New York Times against OpenAI LLC ("OpenAI") and Microsoft Corporation in December 2023 highlighted concerns over the unpermitted use of its published articles by these GAI companies to train their large language models ("LLMs").1 The New York Times expressed concern that these LLMs generate near-verbatim reproductions of, and further expressive content based on, its works, resulting in possible financial loss to it. Earlier in 2024, the copyright infringement lawsuits filed by Michael Chabon,2 Paul Tremblay,3 and Sarah Silverman4 against OpenAI, accusing the latter of using their works without consent, were consolidated. The court has yet to adjudicate whether the large-scale scraping of data amounts to infringement within the copyright framework.
Interestingly, a recent Public Interest Litigation filed before the Hon'ble High Court of Delhi ("Delhi HC") has raised concerns over the unauthorized use of original artistic works by GAI and sought judicial intervention.5 The petitioners further seek amendments to the Indian Copyright Act, 1957 ("Copyright Act") as well as the formulation of explicit rules and regulations to protect authors and creators.
From BARTZ to ANI: Two Nations, One Battle
While the above cases remain under judicial consideration, authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson ("Plaintiffs") also filed a class action lawsuit against AI company Anthropic in August 2024, claiming copyright infringement. The Plaintiffs alleged that Anthropic used their copyrighted works, such as books and writings, to train its AI model, 'Claude', without permission or proper compensation. They contended that Anthropic downloaded known pirated versions of the Plaintiffs' works, made copies of them, and fed such pirated copies into its models. The Plaintiffs argued that Anthropic's commercial gains came at the expense of creators and rightsholders, depriving authors of fair compensation and licensing revenues.
The risks of GAI have sparked legal discourse in India as well, with a lawsuit filed by Asian News International Media Private Limited ("ANI") against OpenAI before the Delhi HC in November 2024, accusing the latter of copyright infringement through its AI model, 'ChatGPT'.6 ANI claims that 'ChatGPT' reproduces its proprietary content without authorization, depriving it of control over its content and of potential revenue streams, and has thus sought an injunction against any further use of its content. Pursuant to the Delhi HC's order dated 19 November 2024, OpenAI has confirmed that ANI's domain name will soon be blocked from its servers, thereby excluding it from any future training processes. The Delhi HC has framed three crucial issues: (a) whether the use and storage of ANI's data for training purposes amounts to copyright infringement under the Copyright Act; (b) whether the use of the said data amounts to 'fair use'; and (c) most importantly, whether the courts in India have the jurisdiction to even entertain the present suit. The ANI lawsuit happens to be the first case against ChatGPT in India and holds immense potential to shape the future jurisprudence on copyright infringement by AI systems in India.
Challenges and Way Forward
One of the most common concerns in almost all these litigations centres on copyright infringement. In order to train their LLMs and produce sound and usable outputs, AI companies require vast amounts of data such as books, music, articles, and images. It is well known that, in doing so, such GAI models make active use of publicly available data. This raises concerns, especially when data on open-source platforms may itself infringe third-party copyrights. Moreover, could training LLMs on such unauthenticated online data compromise the quality of their output?
That said, content creators should also have the right to opt in or out of having their work used for AI training. Recently, a private member's bill, viz., the Generative AI Copyright Disclosure Act ("Act"), was introduced in the United States Congress on April 9, 2024. The Act requires any company launching an AI model to submit to the Register of Copyrights a notice identifying the copyrighted works used in training its model, along with their URL addresses. Interestingly, the Act does not ban the use of copyrighted works for AI training but shifts the onus onto creators to be on the lookout for such curated databases to ensure their works are not being utilized. Nevertheless, such steps certainly hint at an attempt to build a transparent and accountable framework around AI.
Another significant concern is the lack of fair compensation for authors. A majority of the lawsuits demand that GAI companies provide financial compensation for the use of copyrighted materials, without which content creators stand to lose out on potential revenue. Thus, clear legislative guidelines providing for royalties and licensing fees could go a long way towards resolving some of the complexities involved.
Beyond copyright infringement and lack of compensation, output generated by GAI has also led to personality rights violations, as seen in recent cases concerning Anil Kapoor7 and Arijit Singh.8 The Hon'ble Bombay High Court in the Arijit Singh case noted, "making AI tools available that enable the conversion of any voice into that of a celebrity without his/her permission constitutes a violation of the celebrity's personality rights." Recently, U.S. lawmakers introduced the Nurture Originals, Foster Art, and Keep Entertainment Safe Act ("NO FAKES Act") to protect individuals against unauthorized, highly realistic digital replicas that use an individual's voice or likeness. While the federal law is yet to be enacted, the state of Tennessee has enacted the Ensuring Likeness, Voice, and Image Security Act ("ELVIS Act"), which took effect in July 2024, to protect musicians from the unauthorized use of their voices through AI technologies and against audio deepfakes and voice cloning.9
Conclusion
In addition to the uncertainty over the ownership of AI-generated content under the Copyright Act, the Indian legislative framework does not at present seem to adequately address the concerns surrounding AI training on copyrighted data, leaving creators vulnerable and prone to exploitation. As AI continues to evolve, clear guidelines and legislative efforts to protect creators' rights and ensure that AI development is ethically and legally compliant remain indispensable. Thus, provisions setting out the obligations of developers and users of GAI, identifying the potential harms from GAI, imposing penal liabilities arising therefrom, and establishing efficient redressal mechanisms for affected parties would foster a more secure and balanced ecosystem for the growth and responsible integration of AI.
Footnotes
1. The New York Times Company v. Microsoft Corporation, 1:23-cv-11195, (S.D.N.Y.)
2. Chabon v. OpenAI, Inc., 3:23-cv-04625, (N.D. Cal.)
3. Tremblay v. OpenAI, Inc., 23-cv-03223-AMO (N.D. Cal. Jul. 30, 2024)
4. Silverman v. OpenAI, Inc., 3:23-cv-03416, (N.D. Cal.)
5. Kanchan Nagar & Ors v. Union Of India & Ors W.P.(C) 16739/2024 & C.M.Nos.70790-70791/2024
6. ANI Media (P) Ltd. v. Open AI Inc, 2024 SCC OnLine Del 8120
7. Anil Kapoor v. Simply life & Ors., Manu/Deor/248558/2023.
8. Arijit Singh v. Codible Ventures LLP, 2024 SCC OnLine Bom 2445.
9. The Ensuring Likeness, Voice, and Image Security Act of 2024 (HB 2091/SB 2096) (replacing the Personal Rights Protection Act of 1984, Tenn. Code Ann. § 47-25-1103 (2021) ("the 1984 Act"))