A Landmark Ruling for AI and Copyright
The Munich Regional Court has delivered a landmark judgment that will have far-reaching consequences for the artificial intelligence sector. In GEMA v. OpenAI, a case brought by Germany's music rights society GEMA, the court ruled that ChatGPT unlawfully used copyrighted German song lyrics in its training data and outputs. The case represents the first time a European court has held an AI developer directly liable for using protected works without a licence.
What the Court Found
The dispute centred on whether OpenAI's models had been trained on GEMA's repertoire of song lyrics, allowing ChatGPT to reproduce them word for word. GEMA argued that this amounted to direct reproduction, while OpenAI maintained that its models did not store or copy any specific text. The court found in favour of GEMA, concluding that:
- the models contained “reproducible determinations” of copyrighted lyrics, and that both the training and the outputs constituted acts of reproduction under Article 2 of the InfoSoc Directive and Section 16 of the German Copyright Act.
- the EU's text and data mining (TDM) exceptions could not shield the company. According to the decision, the TDM provisions permit temporary copies for analysis but do not extend to permanent embodiment of works in trained models. The court reasoned that memorisation of copyrighted material cannot be considered text and data mining, since it involves a lasting fixation that interferes with authors' economic rights. This interpretation narrows the scope of the TDM exception significantly and challenges one of the key legal arguments relied upon by AI developers operating in Europe.
The Role of Memorisation – Conflicting Approaches
The concept of memorisation lies at the heart of the judgment. Memorisation occurs when an AI model overfits to specific content that appears repeatedly in its training data, enabling the model to reproduce that content verbatim. In this case, ChatGPT's consistent reproduction of lyrics convinced the court that the works were “embodied” in the model. The reasoning equates statistical patterns within neural networks with stored copies, a conclusion that has attracted criticism from both technical and legal experts, and a point on which the decision may well be challenged on appeal.
The Munich Court's findings contrast sharply with the approach taken in the United Kingdom and the United States.
UK
In the recent decision of Getty Images v. Stability AI, the English High Court found that the Stable Diffusion AI model did not contain any reproductions of Getty's photographs and therefore did not infringe copyright.
US
Bartz v. Anthropic went further, with Judge William Alsup determining that AI training can qualify as fair use (pursuant to the U.S. fair use doctrine which is not applicable in the UK and EU) when it is transformative and non-substitutive. Under that reasoning, training is viewed as an analytical process that identifies general linguistic patterns rather than reproducing expression.
EU
The European approach differs in that it treats copyright as an economic property right requiring prior authorisation. The Munich judgment suggests that, in the EU, using copyrighted material for AI training without a licence constitutes reproduction, regardless of whether the model later emits verbatim copies.
Implications of the Judgment
Blanket Licensing of Training Data
The Munich judgment establishes a high compliance threshold for AI companies and signals that blanket licensing of training data may become the norm.
This is particularly relevant given the copyright obligations of General Purpose AI (GPAI) model providers under Article 53 of the AI Act (and the associated Recital 106), under which models must be trained in compliance with EU copyright law, regardless of the copyright law of the jurisdiction in which the model was actually trained. This is also reflected in the GPAI Code of Practice, to which most major AI developers have signed up.
Transparency and Traceability under the AI Act
The Court's treatment of memorisation also has implications for transparency and traceability under the AI Act. The decision supports the European Commission's policy direction, which places greater emphasis on documenting training data and protecting rights holders. It reinforces the expectation that AI developers will be required to maintain records of datasets and demonstrate that appropriate licences have been obtained.
Meaning of “Storage” in the Digital Realm and Knock-On Implications
The Court's interpretation that an AI model can constitute a “copy” even though it contains only probabilistic representations rather than text extends copyright law into new territory. It blurs the line between the physical storage of works and the abstract mathematical relationships created during machine learning. If upheld on appeal, this reasoning could make it difficult for developers to train large language models without licensed data, even where the models themselves do not reproduce any identifiable content.
If EU courts go down this route, our legal understanding of what constitutes “storage” in the digital realm will change to include the probabilistic mathematical parameters of AI models. This would have major knock-on effects, not least for areas such as data protection. The European Data Protection Board's Opinion on data protection and AI models appeared to take a similar view to the Munich court, stating that personal data could be “stored” in the probabilistic parameters of AI models.
Cultural Knowledge and Legal Boundaries
The judgment also raises practical questions about cultural and educational use and illustrates the tension between copyright protection and shared cultural knowledge. If AI systems are prevented from recognising or referencing widely known lyrics, their ability to reflect human culture may be diminished.
Key Takeaways for Developers and the AI Industry
For the AI sector, the GEMA judgment marks a turning point. Companies developing or deploying generative models in the EU may now face an increased risk of copyright litigation if they cannot demonstrate that training data was lawfully sourced. It highlights the need for a more transparent and legally sound approach to training data.
OpenAI has announced its intention to appeal. The case will likely reach the Munich Higher Regional Court (Oberlandesgericht) in 2026. Until then, developers operating within the EU must assume that any unlicensed use of protected material could expose them to liability.
Key Takeaways for Rights Holders
For rights holders, this decision represents a major victory and strengthens the argument for collective licensing mechanisms similar to those long used in the music industry.
Conclusion
Europe's first ruling on AI training and copyright has arrived, and it is unequivocal. The Court has placed the balance of power firmly in the hands of rights holders, reaffirming that innovation in artificial intelligence must operate within the established framework of copyright law. It underscores the deepening divergence between the United States, where fair use offers broad flexibility for innovation, and Europe, where copyright remains an enforceable economic right that demands prior consent.
With an appeal expected, only time will tell whether this position endures.
The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.