The "training" of AI models generally requires the ingestion of large quantities of data in the form of existing musical, literary, graphic or other works. If these works are protected by copyright, then their reproduction as a result of this process will, if unlicensed by the copyright owners, be an infringement of that copyright, at least in the UK.
The UK government is committed to making the UK an attractive environment for the development of AI models and has proposed a change to the UK's Copyright, Designs and Patents Act 1988 under which AI developers would have the right to train their models on copyright works unless the copyright owners "opt out". Following an outcry from the creative industry, including from well-known artists such as Paul McCartney and Kate Bush, the government is reconsidering its position.
WHY THE GOVERNMENT'S PROPOSAL IS CONTROVERSIAL
When the government made its proposal for a general "text and data mining" (TDM) exception from copyright infringement in its consultation paper of December 2024, it probably did not expect quite such a hue and cry in response. After all, the proposal merely follows the scheme that has already been implemented in the EU.
Response of AI developers...
As it turns out, however, objections have come not just from copyright owners but also from AI developers. For example, OpenAI, while welcoming the proposed TDM exception, considers that the opt-out for copyright owners is both misconceived in principle and unworkable in practice. It says that an easy opt-out would make too many lawfully accessible works unavailable for training without express licensing, the cost of which would particularly disadvantage SMEs and start-ups. It argues that developers should be free to use any such works in training their models, provided only that they take reasonable steps to prevent users of the models from generating infringing copies of those works.
...and of the creative industry
This sector tends to agree that the proposed opt-out is unworkable in the current state of technology. More importantly, however, the creators of copyright works consider that any unlicensed use of copyright works in AI training should remain unlawful. This is, in part, because of the risk of such use leading to the generation of infringing copies.
More generally, though, it is because AI models that are capable of generating works such as songs, images and stories may diminish the market for such works created by human authors. As a result, the income of human authors (already falling) may be reduced to the point where very few can still earn a living from their creative endeavours.
AI DEVELOPMENT AND COPYRIGHT IN THE EU AND US
As noted above, the EU has already adopted the broad TDM exception to copyright infringement proposed by the UK government. In addition, from August 2026, developers of general-purpose AI models will be obliged to provide at least a high-level summary of the works on which their models have been trained.
The UK government will want to make the UK's environment for the development of AI models at least as attractive as that of the EU. Furthermore, there appears to have been an assumption in at least some responses to the UK government's consultation that the EU's environment is itself less favourable to AI developers than that of the United States. This is based on a belief that the training of AI models will generally fall within that country's "fair use" exceptions from copyright infringement (whereas the UK's "fair dealing" framework, which provides for specific exceptions only, is generally considered narrower).
As it happens, though, a pre-publication report on this topic prepared by the US Copyright Office in May 2025 indicates that, in the view of the Office, not all training of AI models is likely to benefit from the "fair use" exception. This is particularly the case where the end-use of the AI model is not "transformative", that is to say, where it is designed, for example, to produce images which, though not infringements, are nevertheless similar to the images on which it was trained. That said, the sacking of the director of the US Copyright Office shortly after the release of this report may indicate the US government's direction of travel on this matter.
TAKEAWAYS
Under UK copyright law as it currently stands, AI developers who train their models on existing copyright works risk infringement if they do not obtain licences from all of the relevant rightsholders (which may sometimes be a practical impossibility). On the other hand, rightsholders generally do not know if their rights are being infringed by AI model training unless use of the model results in the generation of an infringing work.
In the circumstances, it is highly unlikely that the UK government's further consideration of this matter following the responses to its consultation will result in a decision to leave copyright law as it is. The enactment of some kind of general TDM exception seems inevitable. One solution to at least part of the creative industry's concerns may be to limit the exception to situations where there is "fair use" of the rightsholders' works, taking into account some or all of the factors that apply to the "fair use" doctrine in the US.
Any general TDM exception will undoubtedly require from the AI industry, as a quid pro quo, an obligation to disclose information as to the materials on which its AI models are trained. Defining the ambit of that obligation in a way that is satisfactory to both AI developers and the creative industry is likely to be at least as challenging as formulating the exception itself.