23 September 2025

WilmerHale's Guide To AI And GDPR

WilmerHale

WilmerHale provides legal representation across a comprehensive range of practice areas critical to the success of its clients. With a staunch commitment to public service, the firm is a leader in pro bono representation. WilmerHale is 1,000 lawyers strong with 12 offices in the United States, Europe and Asia.
By Dr. Martin Braun

The rise of AI and its widespread availability offer significant growth opportunities for businesses. However, harnessing them requires a robust governance framework to ensure compliance with regulatory requirements, especially under the EU AI Act (see our Guide to the AI Act) and the EU GDPR.

The reason GDPR compliance is so important is that (personal) data is a key pillar of AI. For AI to function effectively, it requires good-quality and abundant data so that it can be trained to identify patterns and relationships. Additional personal data is often gathered during deployment and incorporated into AI to assist with individual decision-making.

This guide discusses GDPR compliance throughout the AI development lifecycle and when using AI.

Data Protection by Design

GDPR compliance plays a key role throughout the AI development lifecycle, starting from the very first stages. This reflects one of the key requirements and guiding principles of the GDPR, "data protection by design" (Article 25 GDPR). Businesses are required to implement appropriate technical and organisational measures, such as pseudonymisation, both at the time of determining the means of processing and during the processing itself. These measures should implement data protection principles, such as data minimisation, and integrate the necessary safeguards into the processing to ensure GDPR compliance and protect individuals' data protection rights.
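In engineering terms, pseudonymisation typically means replacing direct identifiers with tokens whose mapping back to individuals requires separately held additional information (Article 4(5) GDPR). A minimal illustrative sketch in Python, using keyed hashing; the field names and key handling are assumptions for the example, not requirements of the GDPR:

```python
import hmac
import hashlib

# The key is the "additional information" enabling re-identification;
# under Article 4(5) GDPR it must be kept separately from the
# pseudonymised data and protected (e.g., in a key management system).
SECRET_KEY = b"store-me-separately-in-a-key-management-system"

def pseudonymise(record: dict, identifier_fields=("name", "email")) -> dict:
    """Replace direct identifiers with truncated keyed HMAC-SHA256 tokens."""
    out = dict(record)
    for field in identifier_fields:
        if field in out:
            digest = hmac.new(SECRET_KEY, out[field].encode("utf-8"),
                              hashlib.sha256).hexdigest()
            out[field] = digest[:16]  # stable token standing in for the value
    return out

record = {"name": "Jane Doe", "email": "jane@example.com", "age": 34}
print(pseudonymise(record))
```

Because the tokens are deterministic under a given key, records about the same person can still be linked for analysis or training, while re-identification requires access to the separately held key.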

AI Development Lifecycle

The AI development lifecycle encompasses four distinct phases: planning, design, development and deployment. In this context, in accordance with the terminology of the EU AI Act, we will refer to both AI models and AI systems:

AI models are a component of an AI system and are the engines that drive the functionality of AI systems. AI models require the addition of further components, such as a user interface, to become AI systems.

AI systems present two characteristics: (1) they operate with varying levels of autonomy and (2) they infer from the input they receive how to generate outputs such as predictions, content, recommendations or decisions that can influence physical or virtual environments.

How We Can Help

WilmerHale has a leading practice in EU law and regulation, advising clients on high-profile matters in both established and emerging market sectors across a wide variety of industries. With around 1,100 lawyers located throughout 12 offices in the United States, Europe and the United Kingdom, we offer a global perspective on EU law issues and offer single-team transatlantic and Europe-wide services. We practice at the very top of the legal profession and offer a cutting-edge blend of capabilities that enables us to handle cases of any size and complexity.

Our European offices in Brussels, Frankfurt, Berlin and London are best known for high-quality regulatory work before authorities and appellate work before EU courts. Clients entrust us with complex cases because of our expertise, reliability, responsiveness, precision and reputation with authorities. Our European team is involved in a large number of cases in various areas of EU law, including several major data protection law cases setting breakthrough principles. In addition, many of our lawyers are qualified in several jurisdictions across the European Union, its neighbouring countries, and the United States and can handle the most complex cases requiring native-speaker proficiency in multiple languages.

Our European team works seamlessly with our US AI and Cybersecurity and Privacy teams, leveraging our combined legal expertise to provide comprehensive, cross-border support on data protection and AI-related matters. This close collaboration ensures that our clients benefit from globally informed legal strategies.

1. First Phase of the AI Development Lifecycle: Planning

The first phase of the AI development lifecycle involves understanding the business problem, defining objectives and requirements, and developing a solid AI governance structure to ensure regulatory compliance. During this phase, it is essential to determine the scope of (personal) data needed and identify any constraints related to such data, with a focus on the availability of the relevant datasets.

In this context, key GDPR compliance considerations involve evaluating whether the data is personal data, ensuring the processing of the data has a valid legal basis, and verifying that the processing respects the principle of purpose limitation, including with regard to other key principles under the GDPR.

Personal Data

The GDPR applies only to personal data, i.e., any information relating to an identified or identifiable natural person. A key question, therefore, is whether AI input or output data constitutes personal data.

Input data is information provided to or directly obtained by an AI system, based on which the system generates an output.

Output data varies depending on the type of AI model and its intended usage. There are three main types of output: predictions, recommendations and classifications.

The European Data Protection Board (EDPB), the umbrella group of the EU's data protection authorities, issued a non-binding Opinion on AI Models in December 2024. In the opinion, the EDPB considered whether and how AI models trained with personal data can be deemed anonymous. The EDPB identified two scenarios:

The AI model is designed to provide personal data. When an AI model is specifically designed to provide personal data regarding individuals whose personal data was used to train the model or in some way to make such data available, it cannot be regarded as anonymous and the GDPR necessarily applies. According to the EDPB, examples of such AI models include a generative model fine-tuned on the voice recordings of an individual to mimic their voice or a model designed to reply with personal data from the training when prompted for information regarding a specific person.

The AI model is not designed to provide personal data. The EDPB considers that, even when an AI model has not been designed to produce personal data from the training data, it is still possible that personal data from the training dataset remains absorbed in the parameters of the model and can be extracted from that model. Whether the outputs of such AI models can be considered anonymous should be determined on a case-by-case basis. The EDPB appears to agree that an AI model may be anonymous, although it considers such a scenario highly unlikely. According to the EDPB, an AI model can only be anonymous provided it meets the following conditions:

  • The likelihood that individuals whose data was used to build the model may be identified (directly or indirectly) is insignificant; and
  • The likelihood of obtaining, intentionally or not, such personal data from queries is also insignificant.

The EDPB considers that examining whether these conditions are met must take into account the Article 29 Working Party's Guidance on Anonymisation. This guidance treats pseudonymisation merely as a security measure. However, in SRB v EDPS, the Court of Justice of the European Union (CJEU) held that pseudonymised data should not be regarded as personal data in all cases and for every person (see Chapter 2).

More fundamentally, the EDPB considers that determining whether the above conditions are met must take into account whether the risk of identification has been assessed, considering all the means reasonably likely to be used to identify individuals (Recital 26 GDPR). According to the EDPB, the determination of those means should be based on objective factors, such as:

  • The characteristics of the training data (e.g., the uniqueness of the records in the training data, the precision of the information, aggregation and randomisation, and how these affect the vulnerability to identification), the AI model, and the training procedure.
  • The context in which the AI model is released and/or processed, with contextual elements including measures such as legal safeguards and limiting access only to some persons.
  • The additional information that would allow the identification and may be available to the given person.
  • The costs and amount of time that the person would need to expend to obtain such additional information.
  • The technology available at the time of the processing, and technological developments.
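One of the objective factors above, the uniqueness of records in the training data, can be estimated with a simple k-anonymity-style count over quasi-identifying attributes. The sketch below is purely illustrative (the column choices and toy dataset are assumptions, and a real assessment would cover far more factors); it measures the share of records that are unique on a chosen attribute combination and could therefore be singled out:

```python
from collections import Counter

def singling_out_rate(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination is unique
    within the dataset, i.e., records vulnerable to being singled out."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    unique = sum(1 for r in records
                 if combos[tuple(r[q] for q in quasi_identifiers)] == 1)
    return unique / len(records)

# Toy dataset: the third record is unique on (zip, birth_year, sex).
training_data = [
    {"zip": "10115", "birth_year": 1980, "sex": "F"},
    {"zip": "10115", "birth_year": 1980, "sex": "F"},
    {"zip": "10117", "birth_year": 1975, "sex": "M"},
]
print(singling_out_rate(training_data, ["zip", "birth_year", "sex"]))
```

A higher rate suggests greater vulnerability to identification and may point toward additional aggregation or generalisation before training; it is only one input into the overall, case-by-case assessment the EDPB describes.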

The EDPB Opinion on AI Models provides a non-exhaustive and non-prescriptive list of possible elements that may be considered when assessing AI's anonymity. These include the steps controllers take in the design stage to minimise or stop the gathering of training-related personal data and make it less identifiable, AI model testing and resistance to attacks, and documentation regarding processing operations, including anonymisation (see Chapter 2).

Legal Basis

Under the GDPR, the processing of personal data is only lawful if the controller can demonstrate a valid legal basis. The most relevant legal bases for AI under the GDPR are consent and legitimate interests. According to the EDPB, the development and deployment phases entail different processing activities that call for different legal bases and should be evaluated individually.

– Consent. Valid consent is often difficult to obtain because it must be individual, specific, informed, unambiguous and provided by a clear affirmative action. These conditions are generally interpreted restrictively. In addition, consent can be withdrawn at any time, and it should be as easy to withdraw consent as it is to give it.

– Legitimate interests. Personal data may be processed if the processing is necessary to pursue a legitimate interest and such interest is not overridden by the interests or fundamental rights and freedoms of the individuals concerned. Legitimate interests may only be relied on provided the following three-step test is satisfied, and this test must be assessed on a case-by-case basis:

  • Legitimate interest. The processing must pursue a legitimate interest. An interest is considered legitimate if it is lawful, clearly and precisely articulated, and real and present (i.e., not hypothetical). For example, the EDPB considers that the use of a chatbot to assist users and the use of AI to improve cyber threat detection may be legitimate interests.
  • Necessity. The processing must be necessary to pursue the legitimate interest in question. The EDPB sets a very high bar for necessity: the assessment must evaluate not only whether the volume of personal data involved is proportionate to the legitimate interest pursued, but also whether there are less intrusive alternatives to achieve it, in accordance with the data minimisation principle. In other words, the processing of personal data is not necessary if the legitimate interest can be pursued through an AI model that does not entail such processing. This is obviously a very restrictive approach.
  • Balancing test. The legitimate interest must not be overridden by the interests or fundamental rights and freedoms of the individuals concerned. This step consists of identifying and describing the different opposing rights and interests at stake. The interests of the individuals concerned may include, for example, their interest in retaining control over their personal data, financial interests (e.g., where an AI model is used by an individual to generate revenues), personal benefits (e.g., where the individual is using AI to improve accessibility to services), or socioeconomic interests (e.g., AI that improves access to healthcare or education). Opposing interests would typically include the AI developer's fundamental right to conduct business.

The impact of the processing on individuals may be influenced by the nature of the data processed by the models (e.g., financial or location data may be particularly sensitive), the context of the processing (e.g., whether personal data is combined with other datasets, the overall volume of data and number of individuals affected, and whether those individuals are vulnerable), and its consequences (e.g., violation of fundamental rights, damage or discrimination). Importantly, the analysis of such possible consequences must take into account the likelihood of their materialising, especially considering the measures in place and the circumstances of the case.

To view the full article, click here.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
