24 November 2025

Munich Court Finds Copyright Infringement Of Song Lyrics 'Memorised' By ChatGPT

Herbert Smith Freehills Kramer LLP


Germany Intellectual Property

The Munich I Regional Court has held that the "memorisation" of training data and the reproduction of song lyrics in ChatGPT outputs infringe copyright, in a case brought against OpenAI by GEMA, the collecting society representing the rights of composers, lyricists and music publishers in Germany (GEMA v OpenAI, 42 O 14139/24, 11 November 2025). The Court concluded that such acts do not fall within the "text and data mining" exception in section 44b of the German Copyright Act, which implements Article 4 of the Directive on copyright and related rights in the Digital Single Market (DSM Directive). Note that only ChatGPT models 4 and 4o were the subject of this lawsuit (the current model is 5.1).

Background

This case concerned the reproduction of German song lyrics by ChatGPT, the well-known AI chatbot developed by OpenAI based on a generative pre-trained transformer large language model (LLM). That the lyrics were used as training data was not in dispute. The principal issue was whether the ChatGPT outputs constituted an unauthorised reproduction. This turned on two questions: whether "memorised" lyrics amounted to a "reproduction", as required for an infringement of copyright under section 16 of the German Copyright Act (Article 2 InfoSoc Directive), and whether such reproductions were covered by the text and data mining exception to copyright infringement in section 44b of that Act.

Memorisation in an LLM occurs (as described by the Court) when "non-specific parameters during training not only extract information from the training data set, but also completely adopt the training data in the parameters specified after training". This means that the training data can be contained in models and reproduced to a significant degree of accuracy and reliability. Memorisation can be demonstrated by testing whether the LLM repeatedly gives the same or largely similar response to simple prompts without conducting an online search.
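The test described above can be illustrated with a minimal sketch. It assumes the responses to a repeated prompt have already been collected as plain strings; no real chatbot API is called, and the function names, the placeholder responses and the choice of a word-level similarity ratio are illustrative assumptions rather than the method used in the proceedings.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(a: str, b: str) -> float:
    """Word-level similarity ratio between two responses (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def consistency_score(responses: list[str]) -> float:
    """Mean similarity over all pairs of responses to the same prompt."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        raise ValueError("need at least two responses to compare")
    return sum(pairwise_similarity(a, b) for a, b in pairs) / len(pairs)


# Placeholder strings standing in for repeated outputs to one simple prompt
# (hypothetical data, not real model output and not real lyrics).
stable = ["alpha beta gamma delta epsilon"] * 3
varied = [
    "alpha beta gamma delta epsilon",
    "one two three four five",
    "red green blue yellow pink",
]

print(consistency_score(stable))
print(consistency_score(varied))
```

A score near 1.0 across many repetitions would be consistent with memorised content, while a low score would suggest freshly generated text. Any cut-off between the two is a judgment call and not something the decision quantifies.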

The decision of the Munich Regional Court

The Munich Court held that ChatGPT models 4 and 4o had memorised the disputed song lyrics, making them "reproducibly contained in the model and thus embodied".

"For copyright reproduction, it can be left open how memorisation works in detail. It is irrelevant whether there is talk of saving or copying the training data, or, as the defendants put it, the parameters of the model reflect what it has learned based on the entire training data set, namely relationships and patterns of all words or tokens that represent the diversity of human language and its contexts. This is because it is crucial that the song lyrics that serve as training data are reproducibly contained in the model and thus embodied".

The resulting outputs reproducing the lyrics were thus held to infringe, even where the lyrics were altered or only partially reproduced. In one example, the reproduction of 15 words from a lyric was held to be an infringement.

To the extent the chatbot altered the song lyrics as a result of "hallucination", the Court regarded this as an adaptation within the meaning of section 23 of the German Copyright Act. The Court left open the distinction from reproduction, noting that every adaptation also entails a reproduction.

The Court also held that, because the public was given access from places and at times of its own choosing, the works were made available to the public within the meaning of section 19a of the German Copyright Act (Art 3(1) InfoSoc Directive).

The defences offered by OpenAI, discussed below, failed:

  • The "rare bug" defence failed: One of the defences put forward by OpenAI was that "a regurgitation of training data is a rare bug that is being continuously worked on". The Court considered that the examples of infringing output showed that the lyrics were unlikely to have been reproduced by accident. It found that, even with artificial randomisation in a model, LLMs will "reproduce memorised content consistently and with minimal variance because, when generating memorised content, the probabilities of the tokens that make up the memorised content reached very high values – often close to 100%".
  • No exception from infringement via the text and data mining exception: OpenAI further argued that the model's training and memorisation of song lyrics fell under the general "text and data mining" exception to copyright infringement under section 44b of the German Copyright Act (Article 4 DSM Directive). However, the Munich Court did not consider the requirements for this exception to have been met, concluding that memorisation does not serve any ongoing "mining" function and was equivalent to a reproduction remaining in the trained model. With reference to the LAION decision (Regional Court of Hamburg, judgment of 27 September 2024, 310 O 227/23), the Court distinguished three successive phases for consideration of infringement and the possible application of the text and data mining exception: (1) creating the training data material: extracting and converting the training material into a machine-readable format; (2) the training of the model: the analysis of the data material and its enrichment with meta-information; and (3) the subsequent use of the trained model through prompts and outputs. The Court held that the exception in section 44b of the German Copyright Act, and its EU-law basis in Article 4 of the DSM Directive, applies to text and data mining when training artificial intelligence; accordingly, acts of reproduction undertaken to "prepare" the training corpus fall within its scope. However, the Court found that the reproduction of the song lyrics at issue within the AI model does not constitute text and data mining within the meaning of that provision, because the training data (the lyrics) were not merely analysed but were "copied in their entirety into the model's parameters". Acts of reproduction for text and data mining would not jeopardise authors' exploitation interests, because only information is extracted and the work as such is not reproduced. By contrast, reproductions within the model would materially impair the exploitation of the work and thereby infringe rightholders' legitimate interests. The Court elaborated that, according to the recitals of the DSM Directive (particularly Recitals 3, 8 and 18), the introduction of the text and data mining exceptions is intended not only to promote innovation and new technologies but also to protect authors. In other words, section 44b (and Article 4 of the DSM Directive) can justify only the creation of the training dataset (phase 1 above), not the training of the model itself (phase 2), nor phase 3, although the Court did not specify this last point.
  • Claims that the rightholder had consented to the use of the work also failed: The Court held that the infringement was not authorised by consent. Whether the song lyrics that were already available on the internet had been made available with the rightholder's consent was disputed between the parties. GEMA maintained that it had expressly reserved rights in its works and asserted opt-outs. Leaving this question open, the Court held that the training of models (phase 2) cannot be deemed a "customary and foreseeable" mode of exploitation that the rightholder must reasonably expect. Assuming consent would moreover undermine the new exception in section 44b of the German Copyright Act (Article 4 DSM Directive), as it permits reproduction only for the purpose of text and data mining and does not extend to reproductions beyond that.
  • Neither the quotation nor the pastiche exception applied: The Court also declined to apply other copyright exceptions.
      - The Court held that the outputs were not covered as quotations by the limitation in section 51 of the German Copyright Act, because models are structurally incapable of pursuing the required quotation purpose. The Court of Justice of the European Union describes the purpose of a quotation (Article 5 (3) lit. d InfoSoc Directive) as explaining statements, defending an opinion or enabling an intellectual engagement between the work and the user's statements. According to the Court, such an intention on the part of the model, as the subjective element of the quotation purpose, cannot be determined.
      - By similar reasoning, the Court held that the outputs were not covered by the pastiche exception under section 51a of the German Copyright Act (Article 5 (3) lit. k InfoSoc Directive). This provision would require use for the purpose of pastiche; consequently, an artistic engagement with a pre-existing work or other point of reference would be necessary. Such engagement would not be possible for the models, as they would lack any personality through which they could artistically express themselves using protectable elements. In the Court's view, the straightforward prompts that led to the outputs would not display any artistic content either.
  • Claims by OpenAI that the users of ChatGPT were liable failed: According to the Court, the prompts that led to the infringing outputs would be simple and would not prescribe any content. Given such open-ended prompts, any copyright infringement in the outputs could not be attributed to the user. Since OpenAI would be responsible for the architecture and content of its models, and that content would be reproduced in response to simple prompts, OpenAI would be directly responsible (and liable), and not merely indirectly responsible like operators of hosting platforms and providers of hardware and software. The Court made clear that the mere triggering of the act of reproduction by entering a simple prompt does not render the user the party effecting the reproduction.
  • Further defences: Finally, the defendants' further arguments surrounding the exemption for non-profit research institutions, the proportionality of the injunction and that the song lyrics are also available on third party websites, all failed.
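
The Court's observation in the "rare bug" discussion, that artificial randomisation does little when token probabilities are close to 100%, can be illustrated with a short arithmetic sketch. It assumes a simplified sampler in which each token is drawn independently from a fixed distribution rescaled by a temperature parameter. The function names, the example distributions and the sequence length of 15 are illustrative assumptions and are not taken from the judgment or from OpenAI's systems.

```python
def apply_temperature(probs: list[float], temperature: float) -> list[float]:
    """Rescale a next-token distribution: p_i^(1/T), then renormalise.

    This is equivalent to dividing the logits by T before the softmax.
    """
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


def verbatim_probability(top_token_prob: float, length: int) -> float:
    """Chance that sampling picks the top token at every one of `length` steps."""
    return top_token_prob ** length


# Hypothetical next-token distributions (illustrative numbers only).
peaked = [0.99, 0.005, 0.005]   # one continuation dominates, as with memorised text
flat = [0.4, 0.3, 0.3]          # several plausible continuations

for name, dist in (("peaked", peaked), ("flat", flat)):
    for temperature in (1.0, 1.5):
        top = apply_temperature(dist, temperature)[0]
        print(name, temperature, round(verbatim_probability(top, 15), 6))
```

Under these assumptions the peaked distribution reproduces the same 15-token sequence in most runs even at the higher temperature, whereas the flat distribution almost never does. This is the sense in which consistent reproduction points away from an accidental result.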

Commentary

Thus, the Munich court held that memorisation of song lyrics and reproduction thereof by ChatGPT models 4 and 4o amounted to an infringing copy for which OpenAI was liable. Should OpenAI want to use song lyrics to train its ChatGPT models beyond text and data mining allowed by law or to provide such lyrics as an output in response to users' requests, it needs to obtain a GEMA licence.

The decision only concerns ChatGPT models 4 and 4o, and OpenAI has already released ChatGPT model 5.1. When asked for the lyrics of Helene Fischer's Atemlos (one of the songs which featured in the case), the current model (5.1) explicitly states that it cannot output such lyrics for copyright reasons. Instead, it offers a short summary or analysis of the song.

An interesting aspect of this case is the evidentiary path. The Court relied on section 286 of the German Code of Civil Procedure (ZPO), under which the Court can decide on the basis of its free evaluation of the evidence, to infer that the model had memorised the song lyrics, even though the technical processes were not fully set out in the proceedings. The Court considered a comparison of the lyrics with the outputs sufficient for it to establish that memorisation had occurred. The Court argued that even in response to very simple prompts (for example, "What are the lyrics to [song title]?"), the outputs reproduced the lyrics in a clearly recognisable form. The use of the lyrics at issue as training data was undisputed.
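
The kind of comparison the Court relied on, setting the protected text against the output, can be sketched as a search for the longest run of consecutive words that both texts share. This is only an illustration: the function name and the placeholder texts are assumptions, and the Court's assessment was a qualitative evaluation of the evidence, not a numerical threshold.

```python
from difflib import SequenceMatcher


def longest_shared_run(reference: str, output: str) -> list[str]:
    """Return the longest sequence of consecutive words found in both texts."""
    ref_words = reference.lower().split()
    out_words = output.lower().split()
    matcher = SequenceMatcher(None, ref_words, out_words, autojunk=False)
    match = matcher.find_longest_match(0, len(ref_words), 0, len(out_words))
    return ref_words[match.a : match.a + match.size]


# Placeholder tokens standing in for a protected text and a chatbot output
# (hypothetical data; no real lyrics are used).
reference = " ".join(f"w{i}" for i in range(1, 21))
output = "here it is " + " ".join(f"w{i}" for i in range(3, 18)) + " and so on"

run = longest_shared_run(reference, output)
print(len(run), run[0], run[-1])
```

A long shared run in response to an open-ended prompt is the pattern described above; how long a run must be before it amounts to reproduction of a protected part of a work remains a legal question.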

The case differs from the UK court's decision in Getty Images v Stability AI (see our blog post of 4 November 2025), in particular because the UK court did not find that the Stable Diffusion model had stored reproductions of copyright works; it found that the model weights are "purely the product of the patterns and features which they have learnt over time during the training process".

The German court's decision is significant for three reasons:

  • First, it sharpens the line between lawful text and data mining and infringing reproduction, emphasising that near-verbatim outputs can evidence memorisation and undermine reliance on the text and data mining exception, particularly where rightsholders have reserved rights and licensing is available through collecting societies.
  • Second, it highlights practical compliance: providers need robust filtering, output suppression and provenance controls, especially for high‑risk works such as lyrics, poetry and scripts.
  • Third, it may catalyse coordinated enforcement by collecting societies and prompt recalibration of model governance across the EU, given the court's receptive reading of the DSM recitals and the authors' exploitation interests.

An appeal may yet narrow or refine the analysis, but the immediate signal to the market is clear: where outputs track protected works too closely, courts may infer memorisation, and licensing obligations will follow. As the provisions on which the judgment is based largely stem from EU law, the judgment is expected to have a significant impact on the evolution of the law on AI in other EU jurisdictions.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
