An Artificial Intelligence (AI) model or AI tool is a machine or computer system that simulates human intelligence capabilities. AI tools are becoming a trend because they can help increase efficiency in the workplace. The popularity of AI models and AI tools has also raised many data privacy concerns.


Throughout modern history, there has been a constant endeavour to make computers and machines as intelligent as humans, if not more so. Along these lines, Artificial Intelligence (AI), a branch of computer science, has seen exponential growth, especially in recent times. The primary intention behind AI was to enable machines to perform tasks that otherwise require human intelligence.

In most spheres, such as robotics, finance, medicine and e-commerce, AI has already taken over many tasks that were initially performed by humans alone, and has also helped to reduce the margin of human error.

AI tools are fast gaining momentum, and data scraping is one related technique that has come into the limelight.

Data scraping is a technique in which a computer program extracts valuable data generated by another program. One of the most popular kinds of data scraping is "website scraping", in which valuable information is scraped from various websites regardless of any access restrictions. Data scraping is done for various purposes: content scraping to duplicate and reproduce content, price scraping to identify competitors, contact scraping to collect contact information available on websites and use it for bulk emailing and advertising, etc.
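
To make the idea of website scraping concrete, the following is a minimal sketch of price scraping in Python. It is an illustration only: the page markup, the class names "name" and "price", and the helper names PriceScraper and scrape_prices are all assumptions made for this example, and the HTML is an inline sample rather than a page fetched from a real website.

```python
from html.parser import HTMLParser

# Hypothetical page markup (an assumption for this sketch).
# A real scraper would download this from a website instead.
SAMPLE_PAGE = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">19.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">5.50</span></li>
</ul>
"""


class PriceScraper(HTMLParser):
    """Collects (name, price) pairs from <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self._field = None  # which field the parser is currently inside, if any
        self._name = None   # most recent product name seen
        self.items = []     # extracted (name, price) pairs

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "price":
            self.items.append((self._name, float(data.strip())))

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_prices(html):
    """Return the list of (name, price) pairs found in the given HTML."""
    parser = PriceScraper()
    parser.feed(html)
    return parser.items


if __name__ == "__main__":
    for name, price in scrape_prices(SAMPLE_PAGE):
        print(name, price)
```

The same pattern applies to content or contact scraping, which is why scraping at scale can sweep up personal data that the website's users never expected to be collected.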

There are growing concerns about companies aggressively deploying AI techniques to create innovative products, as seen in the rapid regulatory and enforcement action in this area.


Privacy Concerns

  • Generative AI is posing complex privacy risks to individuals and societies at large
  • Leading global entities have reported leaks of sensitive information and chat histories
  • Generative AI has tended not to align fully with globally accepted privacy principles (data quality, data collection limitation, purpose specification, etc.)
  • Generative AI chatbots use large language models built on a variety of data sets, which makes it a challenge to adhere to regulatory requirements across the globe
  • Transparency is another challenge, with personal data being collected from individuals without their explicit consent

Leading Cases*

*The cases discussed here are recent, and other such enforcement actions may emerge

Leakage of data

  • A leading telecommunications company unwittingly leaked top secret data while using a leading AI Chatbot tool
  • The leak occurred while the tool was being used to help fix problems in source code
  • Private and sensitive data was entered into the tool in the process
  • The Company did warn its workers about the potential hazards of using such AI Chatbot tools, but many perceive it as a case of 'too little, too late'
  • The Company has committed to developing its own such tool

Ban on Chatbot Tool

  • Recently, Italy banned a major advanced Chatbot tool
  • The lack of a justifiable legal basis for amassing, collecting and storing large amounts of personal data was a crucial reason behind the ban
  • The exposure of minors to unsuitable answers, due to the lack of controls to verify the age of users, was another key reason
  • The regulators banned the tool with immediate effect and also opened a further investigation into it, citing privacy concerns
  • The Chatbot Tool's executives have reiterated that they comply with privacy laws and will continue with their work


AI models are known to increase business efficiency, but they also carry risks. Organizations need to adhere to privacy and cybersecurity principles when using AI tools in their business models, and to continuously monitor the risks posed by this innovative technology.

Fairness, Lawfulness and Transparency
  • Entities should ensure that their data collection practices through AI tools do not violate data privacy law in any geography
  • A thorough understanding of the fundamentals behind privacy laws, and alignment of AI-based data collection practices with them, is critical
Purpose Limitation
  • Entities should ascertain and outline the end goal behind the collection of data through AI tools
  • The collected data should be used only for its predetermined purpose, which should be communicated in advance via privacy notices, privacy policies, etc.
Data Minimization
  • The less data collected, the lower the chance of security-related issues
  • Entities should establish controls so that only the required data is collected; excessive collection of data can increase the compliance burden and the risk of a breach
Accuracy
  • Data collected should be accurate to avoid incorrect results. Inaccurate data might also lead to incorrect personal data being stored
  • The accuracy of collected data should be maintained when using logic-based techniques to analyse large amounts of data and find patterns in them
Storage Limitations
  • Personal data collected through AI tools should not be retained for longer than is necessary
  • Policies, practices, processes and controls need to be in place to ensure that data is deleted once it is no longer required
Integrity & Confidentiality
  • Entities should have appropriate controls in place to ensure the integrity & confidentiality of data sets, even when the personal data is being collected by AI
  • Personal information should be safeguarded from internal and external threats
Accountability
  • Entities are responsible for the data being collected and retained, irrespective of the operations or processes for which it is being used
  • Entities should have the same level of accountability for data collected through AI tools
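
As an illustration of how the Data Minimization and Storage Limitations principles above might translate into controls, here is a minimal Python sketch. It is not a prescribed implementation: the field names, the ALLOWED_FIELDS allowlist, the 30-day RETENTION period and the helper names minimize and purge_expired are all assumptions chosen for this example. In practice, the allowed fields and retention period would follow from the documented purpose and the applicable law.

```python
from datetime import datetime, timedelta, timezone

# Assumed allowlist: only the fields needed for the stated purpose.
ALLOWED_FIELDS = {"user_id", "prompt", "collected_at"}

# Assumed retention period; the real value depends on purpose and law.
RETENTION = timedelta(days=30)


def minimize(record):
    """Data minimization: keep only allowlisted fields, drop everything else."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def purge_expired(records, now):
    """Storage limitation: keep only records still inside the retention period."""
    cutoff = now - RETENTION
    return [record for record in records if record["collected_at"] >= cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    raw = [
        {"user_id": 1, "prompt": "hello", "email": "a@example.com",
         "collected_at": now - timedelta(days=5)},
        {"user_id": 2, "prompt": "old request", "email": "b@example.com",
         "collected_at": now - timedelta(days=45)},
    ]
    stored = purge_expired([minimize(r) for r in raw], now)
    print(stored)  # only the recent record remains, without the email field
```

Applying the allowlist at the point of collection, rather than after storage, means data that is not required is never retained in the first place.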

Key Governance Requirements

  • Conducting risk assessments, such as ROPAs and DPIAs, of any AI tools / systems
  • Ensuring that organization-wide processes, including processes based on AI tools, are aligned with Privacy by Design
  • Frequent monitoring of regulatory compliance requirements, such as the UK's white paper on AI
  • A robust mechanism for Data Subject / Individual rights management, which should also cover data collected via AI tools
  • Updated privacy policies, processes, controls, SOPs, etc., including policy mechanisms for AI tools
