On August 24, 2023, 12 international privacy and data protection authorities, including the Office of the Privacy Commissioner of Canada and the UK Information Commissioner, released a Joint Statement on data scraping and data protection (the "Joint Statement"). 'Data scraping' is an automated technology that pulls large volumes of data from the internet, raising significant privacy concerns if the data contains personal information, even when the data is accessible to the public. The growing popularity of publicly available generative AI tools has brought significant attention to the prevalence of these technologies. The Joint Statement calls attention to the risks of data scraping and sets out what social media companies, as well as the individuals whose information is at stake, can do to better protect personal information.


In most jurisdictions that have enacted modern data protection or privacy legislation, personal information, whether publicly accessible or not, has the benefit of the protections of those laws. This means that there are certain rules and guardrails around the collection, use, disclosure and other "processing" of personal information. These rules may not be entirely harmonized internationally but, generally speaking, the collection and use of personal information requires a "legal basis" or justification. This may be consent in some countries (such as Canada) or an alternative legal basis, such as processing in the "legitimate interests" of an organization. There may also be exceptions to consent, but those are specifically set out in the legislation itself, for example, in Canada's Personal Information Protection and Electronic Documents Act ("PIPEDA").

PIPEDA provides for an exception to consent for "publicly available" information, but the definition is much narrower than the name suggests and is limited to certain prescribed information only. It is limited to "information appearing in telephone directories, professional or business directories, government registry information, and records of quasi-judicial bodies that are available to the public." Personal information on someone's social media profile, for example, may be accessible to the public, but this does not mean it is "publicly available."

Therefore, social media companies and other online platforms are legally obligated to secure personal information protected by privacy and data protection laws against unlawful scraping activities. Such activities, if widespread, can be considered reportable data breaches.


The Joint Statement highlights multiple risks tied to data scraping. These risks include:

  • Targeted cyber-attacks
  • Identity fraud
  • Monitoring, profiling and surveilling individuals
  • Use for unauthorized political or intelligence-gathering purposes
  • Unwanted direct marketing or spam

Essentially, data scraping erodes individuals' control over their personal data, leading to a loss of trust in digital platforms. This is why controlling data scraping is in the interests of both businesses and the individuals whose data is at risk. Businesses looking to leverage AI tools should ensure, through contractual or other means, that non-compliant data scraping techniques were not used to deliver the final product.


To address these issues, the Joint Statement calls on social media and other online platforms to implement robust, multi-layered security measures proportionate to the sensitivity of the data at risk. These measures could include limiting the number of profile visits per account and detecting unusual automated or bot activity. Where data scraping is confirmed and amounts to a breach, the Joint Statement also recommends taking legal action and appropriately notifying affected users and regulatory bodies.
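For readers on the technical side, the first of these measures (limiting profile visits per account) can be illustrated with a minimal Python sketch. It is not taken from the Joint Statement, which does not prescribe any implementation. The class name `ProfileVisitLimiter`, the cap of 100 visits and the one-hour window are hypothetical values chosen only for illustration, and a real platform would likely combine such a cap with other controls.

```python
import time
from collections import defaultdict, deque


class ProfileVisitLimiter:
    """Sliding-window cap on profile visits per account (illustrative only).

    The default cap and window are assumptions, not recommended values.
    """

    def __init__(self, max_visits=100, window_seconds=3600.0, clock=time.monotonic):
        self.max_visits = max_visits
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._visits = defaultdict(deque)  # account_id -> recent visit timestamps

    def allow_visit(self, account_id):
        """Return True and record the visit if the account is under its cap."""
        now = self._clock()
        recent = self._visits[account_id]
        # Discard visits that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_visits:
            # Over the cap: the platform could block, challenge, or flag for review.
            return False
        recent.append(now)
        return True
```

An account that exceeds the cap is simply refused further profile views until older visits fall outside the window. Repeated refusals for the same account could also serve as one signal of the unusual automated activity that the Joint Statement asks platforms to detect.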

The Joint Statement does not leave the responsibility solely to organizations. It outlines several steps individuals can take to safeguard their own privacy and control over personal information, such as understanding a platform's privacy policies, managing what information is shared online, and using privacy settings effectively.

The Joint Statement is a global effort endorsed by various countries, aiming to create a cohesive approach to the challenges data scraping presents. It underlines the imperative for social media platforms to continuously adapt to new threats and for individuals to be proactive about their data privacy. It sets a one-month deadline for feedback from online platforms on how they plan to comply with these expectations, emphasizing the issue's urgency.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.