Firstly, it is important to consider what "big data" is and why it matters. There is no industry-agreed definition of big data; however, Gartner's definition is commonly used and has been adopted by the Office of the Australian Information Commissioner (OAIC). This definition states that big data is:
"[...] high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing for enhancing insight, decision-making, and process optimisation." 1
What this means, practically speaking, is that big data is "a data set that is extremely large so that it can be mined for patterns, trends and associations, as in relation to human behaviour online". 2 Both of these definitions highlight that the emergence of new technology has changed the way individuals interact with organisations. Consequently, many organisations now hold large datasets that can be analysed to identify trends and business opportunities. The term "data" suggests that it refers simply to figures, statistics or other elements of information. On this basis, it is reasonable to think that if data is being collected by an organisation, then that organisation should be able to use it as it pleases. However, where that data contains personal information, the Privacy Act 1988 (Cth) (Privacy Act) applies and, in many cases, places restrictions on its use.
What does the Privacy Act regulate?
Personal information is defined as "any information or an opinion about an identified individual, or an individual who is reasonably identifiable whether the information or opinion is true or not and whether the information is recorded in a material form or not". 3 Accordingly, for information to be personal information, the individual does not need to be directly identified (such as by name and address) but needs to be reasonably identifiable, or able to be singled out from the crowd. This test must be applied to each particular set of circumstances. In providing examples of what might constitute personal information in a particular situation, the OAIC Guidelines state that:
"Most entities and individuals would encounter difficulty in using a licence plate number to identify the registrant of the car, as they would not have access to the car registration database. By contrast, an agency or individual with access to that database may be able to identify the registrant. Accordingly, the licence plate number may be "personal information" held by that agency or individual, but may not be personal information if held by another entity." 4
Contextual analysis similarly applies in the analysis of big data. The Commissioner's Guidelines state at paragraph B.94 "where it is unclear whether an individual is "reasonably identifiable", an APP entity should err on the side of caution and treat the information as personal information". 5 This caution applies when engaging in analysis of all data sets, whether they, in fact, contain personal information or not.
Why is an individual's consent relevant?
As stated above, personal information is regulated under the Privacy Act 6 and, as such, can only be used by organisations with the consent of the identified individual. This consent must be specific to the use of that information and must be given at the time the data is collected. The OAIC Guidelines state that, to be valid, consent must be informed, voluntary, current and specific. Given that the proposed uses for data seem to grow exponentially, it is likely that in many cases an individual will not be found to have reasonably consented to the use of the data for the particular analytics. On this basis, the personal information collected cannot be used without contravening the Privacy Act.
Furthermore, if the data is to be shared with other organisations (disclosed) or matched with other data sets, then the privacy risk to the organisation is heightened. Accordingly, one way to minimise the risk of a breach of the Privacy Act is to de-identify the information. However, the process of de-identification raises further questions regarding what steps are necessary to effectively de-identify information, what the risks of re-identification are and whether this process will potentially give rise to the creation of new personal information.
De-identification – what is it?
The obvious response to restrictions on the uses of personal information is to de-identify or anonymise the data. There are a number of resources available to businesses to assist them in considering how this might best be achieved. These include the OAIC Guide7 (considered below) and a further comprehensive guide issued by the Information Commissioner's Office in the United Kingdom.8 Recently the Australian Information Commissioner, Timothy Pilgrim, spoke at the CeBIT conference in Sydney about privacy, de-identification and data analytics, where he said:
"de-identification is a smart and contemporary response to the privacy challenges of big data – using the same technology that allows data analytics to strip data sets of their personal identification potential, while retaining the research utility of the data." 9
This emphasises that de-identification is something that needs to be factored into any data analytics equation.
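By way of illustration only, the following Python sketch shows two common de-identification steps: removing direct identifiers and generalising other details that could help single a person out. The field names (name, email, age, postcode, spend) and the helper deidentify are hypothetical and are not drawn from any OAIC publication. Whether steps of this kind are sufficient in a given case depends on the context, including what other data the information may later be matched with.

```python
from typing import Dict

# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def deidentify(record: Dict[str, object]) -> Dict[str, object]:
    """Drop direct identifiers and coarsen age and postcode."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in out:
        # Replace an exact age with a ten-year band, e.g. 34 becomes "30-39".
        decade = (int(out.pop("age")) // 10) * 10
        out["age_band"] = f"{decade}-{decade + 9}"
    if "postcode" in out:
        # Keep only the first two digits of the postcode.
        out["postcode"] = str(out["postcode"])[:2] + "**"
    return out


records = [
    {"name": "A. Citizen", "email": "a@example.com",
     "age": 34, "postcode": "2000", "spend": 120.5},
]
deidentified = [deidentify(r) for r in records]
```

The analytic value (here, the spend figure) is retained while the details most likely to identify the individual are removed or blurred.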
New draft Guide on Big Data released
In late May the OAIC released its consultation draft Guide to Big Data and the Australian Privacy Principles (the Guide). 10 While this draft is for consultation and submissions are open until 25 July 2016, the Guide is useful in setting out the OAIC view on how big data and big data analytics interact with the Australian Privacy Principles (APPs). The OAIC has already released some guidelines in relation to de-identifying personal information in its publication Privacy Business Resource 4: De-identification of Data and Information. 11 This publication focusses on the ways in which personal information may be de-identified so as to avoid breaching privacy laws, which is in itself a significant and complex undertaking. The Guide then builds on this foundational information and considers how information that is de-identified may be used.
What does the Guide say?
The Guide considers the application of each of the 13 APPs in the context of big data. It is clear from a review of the Guide that the OAIC is concerned with individuals being fully informed about the potential ways in which their personal information may be used. In doing so, the Guide places an enhanced focus on the form, content and delivery of privacy collection notices and consents that allow for information to be collected and used for various purposes. The requirement for clear communication of secondary uses that may arise as a consequence of data analytics involving personal information potentially creates additional challenges for many organisations.
The Guide also points to the importance of the relationship between privacy notices, which communicate information handling practices, and Privacy Impact Assessments (PIAs). PIAs act as a tool for assessing big data usage and big data practices and aim to minimise the risk of breaching the APPs. The use of PIAs to formally record the risks that have been considered and the steps that have been put in place to mitigate them provides a basis to demonstrate privacy compliance in the event of a breach.
The Guide also suggests that PIAs be undertaken in conjunction with information security risk assessments so that the analysis of the technical and legal risks and the approaches to their mitigation can be aligned.
A further issue considered in the Guide is re-identification of personal information where various data sets are combined such that new personal information is created by the analytics. In this case, compliance with consent and collection notices becomes highly problematic for organisations.
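The re-identification risk described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a method set out in the Guide: the data sets, the field names and the helpers smallest_group and link are all hypothetical. The first helper counts how many records share the same combination of indirect details; the second matches two data sets on those details.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical indirect identifiers shared by both data sets.
QUASI_IDENTIFIERS = ("age_band", "postcode", "gender")


def smallest_group(rows: List[Dict[str, str]],
                   keys: Sequence[str] = QUASI_IDENTIFIERS) -> int:
    """Size of the smallest group of rows sharing the same key values."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return min(counts.values())


def link(left: List[Dict[str, str]], right: List[Dict[str, str]],
         keys: Sequence[str] = QUASI_IDENTIFIERS) -> List[Dict[str, str]]:
    """Pair each left row with every right row that has the same key values."""
    index: Dict[tuple, List[Dict[str, str]]] = {}
    for b in right:
        index.setdefault(tuple(b[k] for k in keys), []).append(b)
    return [{**a, **b}
            for a in left
            for b in index.get(tuple(a[k] for k in keys), [])]


# A "de-identified" data set with no names in it.
purchases = [
    {"age_band": "30-39", "postcode": "2000", "gender": "F", "purchase": "item A"},
    {"age_band": "30-39", "postcode": "2000", "gender": "M", "purchase": "item B"},
    {"age_band": "40-49", "postcode": "3000", "gender": "F", "purchase": "item C"},
]
# A second data set, from another source, that does contain names.
register = [
    {"name": "Person X", "age_band": "40-49", "postcode": "3000", "gender": "F"},
]

k = smallest_group(purchases)      # a value of 1 means someone is unique
linked = link(purchases, register)  # rows where a name now sits beside a purchase
```

In this example every combination of details in the purchase data is unique, so matching it with the second data set attaches a name to a purchase. The combined record is new personal information that neither data set contained on its own.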
In the context of privacy risks, it is important that contracts for data analytics factor in all of the above issues and make relevant provisions for them in the transaction documents. It is an area where the form of the contract may not adequately cover all of the possible outcomes and all of the risks for the organisation holding the data set. This is particularly true if, for example, one is using an old-style contract for software services or other services.
While there are common themes such as confidentiality of information and Privacy Act implications for personal information, there is also a range of issues around inputs, outputs and the various uses that might be made of various elements by either the data set owner organisation or the analytics provider. It is, therefore, necessary to undertake some form of scenario planning to ensure that all possible outcomes are covered. While there are no inherent intellectual property rights which attach to data of itself, there are many rights attaching to outputs and to the processes that are used. Therefore, significant care must be taken to ensure that the intent of the parties to be able to continue to use, or to limit the use of, either processes or outputs is clearly specified in any agreement.
What action should I take?
For those organisations that propose to use big data they hold for analytics, either on its own or together with data from other sources, the Guide is a worthwhile starting point to consider the design principles that might be employed to ensure compliance with the Privacy Act. This is because it clearly sets out the common compliance issues faced by organisations and provides a roadmap of solutions that may be employed to address them. The use of a formal or informal PIA process will also assist in identifying risks and mitigation strategies.
Finally, the data analytics contract and the rights granted by the data holder to the use of the data need to align with identified risks and mitigations.
The old saying "a stitch in time saves nine" applies to the data analytics space, where planning ahead and establishing a strategic approach to identifying and mitigating risks in contracts is a sound investment.
As Timothy Pilgrim, the Australian Information Commissioner, stated in his CeBIT speech in May this year,
"smart privacy solutions and smart data solutions are therefore not mutually exclusive, nor elusive, but mutually supportive". 12
We enjoy assisting organisations to reach smart privacy solutions.
1 Gartner, The Importance of 'Big Data': A Definition, cited in Department of Finance and Deregulation, The Australian Public Service Big Data Strategy, 2013, p 8, see: http://www.finance.gov.au/sites/default/files/Big-Data-Strategy.pdf
2 Susan Butler (ed), Macquarie Dictionary (online ed, at 14 June 2016) 'big data'
3 Privacy Act 1988 (Cth) s 6(1)
4 Office of the Australian Information Commissioner, Australian Privacy Principles Guidelines: Privacy Act 1988, 31 March 2015, p 20
5 Office of the Australian Information Commissioner, Australian Privacy Principles Guidelines: Privacy Act 1988, 31 March 2015, p 21
6 Privacy Act 1988 (Cth)
7 Office of the Australian Information Commissioner, Australian Privacy Principles Guidelines: Privacy Act 1988, 31 March 2015, p 13
8 United Kingdom Information Commissioner's Office, Anonymisation: managing data protection risk code of practice, 2012, see: https://ico.org.uk/for-organisations/guide-to-data-protection/
9 Timothy Pilgrim, 'Privacy, Data and De-identification' (Speech delivered at the CeBIT conference, Sydney, 2 May 2016), see: https://www.oaic.gov.au/media-and-speeches/speeches/privacy-data-de-identification
10 Office of the Australian Information Commissioner, Consultation Draft: Guide to big data and the Australian Privacy Principles, May 2016, see: https://www.oaic.gov.au/engage-with-us/consultations/guide-to-big-data-and-the-australian-privacy-principles/consultation-draft-guide-to-big-data-and-the-australian-privacy-principles
11 Office of the Australian Information Commissioner, Privacy business resource 4: De-identification of data and information, see: https://www.oaic.gov.au/agencies-and-organisations/business-resources/privacy-business-resource-4-de-identification-of-data-and-information
12 Timothy Pilgrim, 'Privacy, Data and De-identification' (Speech delivered at the CeBIT conference, Sydney, 2 May 2016), see: https://www.oaic.gov.au/media-and-speeches/speeches/privacy-data-de-identification
This publication does not deal with every important topic or change in law and is not intended to be relied upon as a substitute for legal or other advice that may be relevant to the reader's specific circumstances. If you have found this publication of interest and would like to know more or wish to obtain legal advice relevant to your circumstances please contact one of the named individuals listed.