A potentially landmark copyright case has been referred to the Court of Justice of the European Union (CJEU), raising, for the first time, the question of how EU copyright law applies to generative AI outputs and the training of large language models (LLMs).
The case, Like Company v. Google Ireland (Case C-250/25), concerns the use of protected press content by Google's AI chatbot Gemini, formerly Bard. The outcome is expected to shape the obligations of AI providers across the EU and clarify the scope of rights held by press publishers in the digital age.
Background
The plaintiff, Like Company, is a Hungarian news publisher. It alleges that Google's generative AI service Gemini reproduced and made available portions of its protected news articles without consent. The example before the Hungarian court involves an article about a Hungarian singer's plan to introduce dolphins to Lake Balaton. When prompted by a user, Gemini generated a detailed summary of the article, which Like Company claims included substantial elements of the original protected content.
The referring court, the Budapest Környéki Törvényszék, has asked the CJEU to interpret provisions of both the 2001 Copyright and Information Society Directive (InfoSoc Directive) and the 2019 Copyright in the Digital Single Market Directive (CDSM Directive). The core legal question is whether the training and functioning of an LLM, and the responses it produces, can amount to acts of reproduction or communication to the public under EU copyright law.
Legal Questions Before the CJEU
The Hungarian court has referred four questions for a preliminary ruling:
- Whether displaying content in chatbot responses that mirrors press articles and exceeds what can be considered a "very short extract" constitutes communication to the public under Article 15(1) of the CDSM Directive and Article 3(2) of the InfoSoc Directive.
- Whether the act of training an LLM, through tokenisation and the learning of linguistic patterns from protected works, constitutes reproduction under Article 2 of the InfoSoc Directive.
- If such training is a form of reproduction, whether it qualifies for the text and data mining exception under Article 4 of the CDSM Directive.
- Whether the generation of a response by the chatbot that includes protected content, in response to a user query, qualifies as reproduction by the AI provider and requires authorisation from the rightsholder.
Arguments Presented by the Parties
Like Company argues that both the training and the output of Gemini constitute unlawful use of protected works. It claims the chatbot's responses exceed the limits of what can be lawfully used without permission, particularly under Article 15 CDSM, which only permits very short extracts. The publisher submits that the training phase involved the systematic reproduction of its content and that this use falls outside the text and data mining (TDM) exception because it causes economic harm and was not carried out for legitimate scientific research purposes. It also argues that the responses given by Gemini effectively replace visits to its website, undermining its advertising-based revenue model.
Google Ireland argues that Gemini does not store or retrieve copies of articles. Instead, it tokenises training data and generates new text based on probabilistic modelling. The company maintains that any similarity to Like Company's content is incidental or the result of hallucination, a well-documented phenomenon in generative AI systems. Google contends that no "new public" is reached, since users of the chatbot could have accessed the original content online. It also relies on EU exceptions for temporary acts of reproduction and the TDM exception, and frames Gemini as a creative support tool rather than a database.
What We Expect the CJEU to Clarify
This case offers the CJEU an opportunity to define several crucial points of law:
- Whether AI-generated content that mirrors protected work can amount to communication to the public under EU law.
- Whether the conversion of protected works into tokens for pattern analysis during training is a reproduction within the meaning of Article 2 of the InfoSoc Directive.
- Whether the training of LLMs on publicly available content falls within the text and data mining exception under Article 4 of the CDSM Directive, or whether the commercial nature of that training takes it outside the exception.
- Whether content generated in response to a prompt that includes identifiable elements of protected material amounts to infringement.
Implications for the AI Sector and Publishers
This case is the first referral to the CJEU specifically addressing the use of press publisher content by generative AI systems. The decision is likely to have wide-reaching consequences for AI developers, publishers, online platforms and rights holders.
If the CJEU finds that copyright rules govern both the training and the output of LLMs, and that neither is covered by an exception, AI companies may be required to obtain licences before training on or reproducing protected content. This would increase legal and financial pressure on AI developers and reinforce the enforcement rights of creators and press publishers.
On the other hand, if the CJEU determines that AI outputs are too indirect, probabilistic or transformative to qualify as reproduction or communication to the public, then developers may continue to rely on publicly available data for training and generation, without necessarily requiring individual licences. That outcome would likely encourage the continued development of LLMs across Europe, particularly in the commercial sector.
The case is also likely to have major implications for the interpretation of the text and data mining exception under Article 4 of the CDSM Directive, which many AI developers currently rely on to justify large-scale ingestion of publicly available content. The CJEU is expected to rule on the limits of that exception and its applicability to commercial LLM training.
Copyright Compliance in the Age of AI
Companies in AI development, digital publishing, content licensing, or any data-driven industry should closely monitor this case. Depending on a company's needs and data use, copyright compliance in the age of AI may call for steps such as:
- Conducting a due diligence review of training datasets and auditing whether press content is included.
- Considering contractual approaches to licensing third-party content or excluding high-risk categories.
- Reviewing how the organisation's AI tools handle the generation of summaries, quotations or outputs that may reflect protected works.
- Monitoring EU and national enforcement trends in the application of Article 15 CDSM and the text and data mining exceptions.
- Preparing for the potential need to renegotiate or establish licences with press publishers, should the judgment require it.
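As a purely illustrative sketch of the first step above, the short Python example below counts how many records in a training dataset originate from a watch list of press publisher domains. It assumes that each record carries a source URL. The domain names, the function names `press_domain` and `audit_sources`, and the sample URLs are hypothetical placeholders, not part of any real tool or dataset.

```python
from collections import Counter
from typing import Iterable, Optional, Set
from urllib.parse import urlsplit

# Hypothetical watch list: replace with the publishers relevant to the review.
PRESS_DOMAINS: Set[str] = {"example-news.hu", "example-daily.com"}


def press_domain(url: str, watch_list: Set[str] = PRESS_DOMAINS) -> Optional[str]:
    """Return the watch-list domain that a URL belongs to, or None."""
    host = (urlsplit(url).hostname or "").lower()
    for domain in watch_list:
        # Match the domain itself or any subdomain of it (for example "www.").
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def audit_sources(urls: Iterable[str]) -> Counter:
    """Count how many dataset records come from each watch-listed domain."""
    hits: Counter = Counter()
    for url in urls:
        domain = press_domain(url)
        if domain is not None:
            hits[domain] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical source URLs standing in for a dataset manifest.
    sample = [
        "https://www.example-news.hu/culture/some-article",
        "https://example-daily.com/tech/ai",
        "https://blog.example.org/post/1",
        "https://notexample-news.hu/item",
    ]
    for domain, count in sorted(audit_sources(sample).items()):
        print(domain, count)
```

A count of this kind only shows where content came from. Whether its use is lawful, licensed or covered by an exception remains a legal assessment, and datasets that do not record source URLs would need a different approach.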
Conclusion
Like Company v. Google Ireland is the most significant copyright case to date concerning generative AI in the European Union. The CJEU's findings should clarify how AI-generated content is treated under the CDSM Directive and whether current LLM training practices fall within the scope of existing exceptions. The judgment is likely to be a cornerstone decision, with long-term implications for the development, regulation and commercialisation of AI in Europe.
For organisations developing or deploying AI tools, the case signals that the regulatory and rights environment is catching up with the technology. Preparing for either outcome is now a strategic imperative.