Australia: Defining big data - large data sets of information

Big data: Legal challenges
Last Updated: 7 February 2014
Article by Mark Vincent and Katrina Crooks


The cost of data storage has plummeted and our computing power to use and analyse data has increased exponentially.

"Big Data" is the term for a collection of data sets so large and complex that they cannot readily be processed with traditional data processing applications in a reasonable amount of time. Attempts to meaningfully define "Big Data" focus on the size of the data sets (volume), the real-time way in which much of the information is captured by a system (velocity) and the increasing variety of disparate data sets that can be accessed (variety).

Data sets grow in size, in part, because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks.

In addition, movement sensors, building access systems, smart power meters, surveillance cameras, smart cars, smart televisions, and all manner of devices that are now internet-enabled facilitate the collection and use of information logs. This has led to the emergence of the so-called "internet of things", or IOT, referring to the increasing number of connected objects relative to the population. The IOT means more data created at an accelerated rate, which compounds and feeds into the volume, velocity and variety of Big Data sets. The emergence of the IOT is illustrated in the following diagram:

The shift towards consuming computing as a hosted, utility-style service (cloud) has revolutionised the way in which organisations can analyse data. An organisation needing massive amounts of processing power to run complex data queries can ramp up its access to processing power for the duration of a project and scale that access back afterwards. This is all possible at steadily declining prices using new models of access to computing power that were not available five years ago.

To illustrate, when scientists first decoded the human genome in 2003, it had taken ten years of work to sequence the three billion base pairs. That much DNA can now be sequenced in a day [4].

There are numerous examples of organisations collecting and harvesting large amounts of data in an attempt to differentiate themselves in the marketplace. Among the more notable:

  • Walmart, the world's largest retailer, generates vast quantities of data every day from its online presence and over 10,000 retail stores. It has over 1.5 million customer transactions every hour and a data warehouse with over three petabytes of information. Walmart records every purchase by every customer for future analysis and claims to have driven over US$1 billion in incremental revenue (10-15 percent boost in sales) on decisions resulting from this analysis. Walmart knows, for example, that any hurricane warning in a given area will inevitably be followed by an increase in the sale of Kellogg's Pop-Tarts.
  • The New York Police Department, among others, uses computerised mapping and analysis of variables like historical arrest patterns, paydays, sporting events, rainfall and holidays to try to predict likely crime "hot spots" and deploy police officers there in advance [5].

The uses of big data are many and varied, the only constant being that it will underpin much commercial advantage in the post-industrial economy. Studies by the World Economic Forum (WEF) have described personal data as a new asset class - the new "oil" for business. The WEF has further predicted the use of big data tools for better targeting of services, predicting and preventing crises, understanding population health and improving health care, reducing food spoilage, identifying areas in food-related distress, and tracking the movements and conditions of refugees [6]. In its 2011 paper entitled "Personal Data: The Emergence of a New Asset Class", the WEF noted [7]:

"We are moving towards a "Web of the world" in which mobile communications, social technologies and sensors are connecting people, the Internet and the physical world into one interconnected network. ... Personal data will be the new "oil" - a valuable resource of the 21st century. It will emerge as a new asset class touching all aspects of society. At its core, personal data represents a post-industrial opportunity. ... As personal data increasingly becomes a critical source of innovation and value, business boundaries are being redrawn. Profit pools, too, are shifting towards companies that automate and mine the vast amounts of data we continue to generate."
'Big Data, Big Impact: New Possibilities for International Development'

A McKinsey study has further concluded that Big Data could be the defining basis for success in the post-industrial economy. Large gaps are already emerging between the growth of traditional businesses that fail to heed the impact of Big Data on their decisions and that of companies embracing the power of Big Data to direct decisions [8]:

"Big data will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus."

Structured and unstructured data

A key feature of the 'big data' era is growth in the volume and variety of unstructured data.

Most traditional databases are 'structured', relational databases - in which each field in a database is known and named and the relationship between each field is defined. This would apply to everything from your bank account and your TV guide to corporate databases such as customer relationship management databases, HR/payroll databases and document management databases.

While 'Big Data' can include structured data in a relational database, in the majority of cases large tracts of data will be stored with no apparent meaning beyond their face value. Skills are required to create meaning from disparate data sets by pursuing inquiries or associations between those data sets. The logs created by a smart power meter or a building's air conditioning system, machine logs, and most of the so-called "digital exhaust" created by use of the internet will not be stored in a traditional relational database.

Furthermore, much of the relevant data is transient and traditionally has not been stored after the purpose of its creation has passed (for example, the data of an internet phone call or the watching of a YouTube video). Increasingly, however, new opportunities and optimised products and services can be created by applying analytical models to these data sources. Such data is generally defined as "unstructured" data, as opposed to traditional "structured" data.
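
To make the distinction concrete, the following is a minimal Python sketch and is not drawn from the report itself. It assumes a hypothetical two-table customer database and an invented smart meter log format; the table names, field names, log lines and the extract_readings helper are all illustrative assumptions. The first half shows structured data, where every field is named and the relationship between tables is defined. The second half shows unstructured log text, where meaning has to be extracted by the analyst.

```python
import re
import sqlite3

# Structured: every field is named and typed up front (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, postcode TEXT)"
)
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id), balance REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'A. Citizen', '2000')")
conn.execute("INSERT INTO account VALUES (10, 1, 250.0)")

# The declared relationship lets a query join the two tables directly.
structured_rows = conn.execute(
    "SELECT c.name, a.balance "
    "FROM customer c JOIN account a ON a.customer_id = c.id"
).fetchall()

# Unstructured: free-form log lines (invented format) with no declared fields.
raw_meter_log = [
    "2014-02-07T18:00 meter=A17 reading 1.9kWh ok",
    "2014-02-07T18:30 meter=A17 reading 2.4kWh ok",
    "meter A17 rebooted, no reading",
]

# The analyst has to decide what counts as a "reading" and impose a pattern.
READING = re.compile(r"meter=(?P<meter>\w+) reading (?P<kwh>\d+(?:\.\d+)?)kWh")


def extract_readings(lines):
    """Return (meter, kWh) pairs from lines that match; skip the rest."""
    readings = []
    for line in lines:
        match = READING.search(line)
        if match:
            readings.append((match.group("meter"), float(match.group("kwh"))))
    return readings


meter_readings = extract_readings(raw_meter_log)
print(structured_rows)
print(meter_readings)
```

In this sketch the relational query works because the fields and their relationship were declared in advance. The meter readings only become usable once a pattern is imposed on the raw text, and the one line that does not fit the pattern is silently left out, which is part of why skill is needed to draw reliable meaning from such data.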

Attempts to measure the amount of unstructured or semi-structured data are imprecise, but such data is generally accepted to account for 80 to 90 percent of the world's data.

The IDC "Digital Universe" study sought to analyse total enterprise data growth in the period 2005-2015 [9]. By 2015, by far the dominant form of data will be so-called "unstructured" data.


[4] Big Data, supra note 1, page 8
[5] The Age of Big Data, Steve Lohr, February 11, 2012
[6] Big Data, Big Impact: New Possibilities for International Development, World Economic Forum 2012
[7] Personal Data: The Emergence of a New Asset Class, World Economic Forum, January 2011, pages 5 and 6
[8] Big Data: The next frontier for innovation, competition, and productivity, May 2011, James Manyika et al
[9] "THE DIGITAL UNIVERSE IN 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East", John Gantz and David Reinsel, December 2012

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.

Shelston IP has been awarded the MIP Global Award for Australian IP Firm of the Year 2013.
