Australia: Further advances in Technology Assisted Review (TAR)

Last Updated: 15 July 2016
Article by Craig Macaulay

1 Introduction

Technology Assisted Review (TAR), or predictive coding, is an alternative to the traditional manual review of documents.

It involves:

  • The manual review by senior members of a legal team of a small subset of documents to identify whether they belong to certain categories (relevant, not relevant, privileged, etc.)
  • Computer analysis to apply the characteristics of the subset to the full population of documents to group them into the same categories. The resulting document set is consistently categorised using a process which is both auditable and repeatable.

Computer Assisted Review (CAR) and Technology Assisted Review (TAR) are interchangeable terms.

More information on TAR is available in our earlier publications:

March 2016 – Forensic Matters: Technology assisted momentum is building
A recent English Case has further paved the way for the use of Technology Assisted Review ('TAR') in Common Law Jurisdictions. In this article we comment on the ten factors seen as pivotal in making this decision.

June 2015 – Forensic Matters: Predicting the future of electronic discovery
Adoption of Technology Assisted Review to increase worldwide following decision by the Irish High Court. In this article, we discuss the landmark decision made by the Irish High Court, which may pave the way for the use of TAR in Australia.

February 2013 – Forensics Matters: Death, taxes and computer assisted review
Has the time come for a radical change in how eDiscovery is undertaken? When we consider all of the events that are taking place in courts the world over, perhaps computer assisted review (CAR), which includes predictive coding, should be adopted as best practice, especially in cases involving large amounts of electronically stored information (ESI).

July 2012 – Forensics Matters: Is Predictive coding the electronic discovery 'Magic Bullet'?
Predictive coding is the emerging tool of choice in the fight against the escalating size and related cost of managing and disclosing electronically stored information ('ESI'). In this article, we consider how some recent cases involving predictive coding may affect the future of eDiscovery.

To some it is bewildering that TAR has not been more widely adopted. Cynics might suggest that lawyers have fees to lose. But we believe it's not as simple as that. The complex statistics involved in some of the TAR protocols used, together with a natural level of inertia and fear of the unknown, have combined to slow the acceptance and use of TAR. Another key factor is that TAR requires an acceptance that any review process is imperfect.

At KordaMentha Forensic we maintain the position that a more intuitive non-statistical TAR protocol will always be more palatable. Recent studies show that human-aligned protocols are both more workable and more efficient.

A recent article¹ by Maura Grossman and Gordon Cormack on the current state of TAR in the electronic document review marketplace points out that:

  1. Several protocols can be used when undertaking TAR, and recently developed protocols can be even more efficient at reviewing documents.
  2. Different TAR products use different algorithms to perform TAR. The authors discuss the effectiveness of the various algorithms.

Grossman and Cormack are leading the research into TAR. They have published a number of ground-breaking studies on document review. Their latest study² in late 2014 compares the effectiveness of the protocols currently available.

2 The adoption of TAR in the United States has been slow

Grossman and Cormack lament that the adoption of TAR has been very slow in the USA despite strong judicial support and significant cost and time advantages. They suggest that the most commonly used protocols, such as Simple Active Learning (SAL) and Simple Passive Learning (SPL), come with complex statistical vocabulary and rituals which dissuade practitioners from using TAR, even though none of these are essential. They argue that a new, simpler protocol, more closely resembling a Web-search methodology, will encourage greater adoption of the technology.

Continuous Active Learning (CAL) is more efficient

The CAL protocol removes the complexity of statistical control sets, random samples etc. Instead, it relies on the ongoing stream of documents coded by reviewers. The TAR algorithm uses this coding to continually re-define the documents that are presented for review and further coding, until the legal team is comfortable that they have identified and reviewed the potentially relevant documents.
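The loop described above can be sketched roughly as follows. This is a minimal illustration only, not the Grossman and Cormack implementation or the algorithm of any TAR product. The word-count scorer, the `review` callback (standing in for a human reviewer), the batch size and the stopping rule (stop after a batch that yields no relevant documents) are all assumptions made for the example. In practice the decision to stop rests with the legal team.

```python
from collections import Counter
from typing import Callable, Dict, List


def score(text: str, relevant_terms: Counter, irrelevant_terms: Counter) -> float:
    """Toy relevance score: words seen in relevant documents count for,
    words seen in non-relevant documents count against (assumed scorer)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(relevant_terms[w] - irrelevant_terms[w] for w in words) / len(words)


def continuous_active_learning(
    corpus: Dict[str, str],
    seed_ids: List[str],
    review: Callable[[str], bool],
    batch_size: int = 2,
) -> Dict[str, bool]:
    """Return the reviewer's coding (True = relevant) for every document reviewed."""
    relevant_terms: Counter = Counter()
    irrelevant_terms: Counter = Counter()
    coded: Dict[str, bool] = {}
    batch = list(seed_ids)
    while batch:
        found_in_batch = 0
        for doc_id in batch:
            is_relevant = review(doc_id)  # human coding decision
            coded[doc_id] = is_relevant
            target = relevant_terms if is_relevant else irrelevant_terms
            target.update(corpus[doc_id].lower().split())  # "retrain" on the new coding
            if is_relevant:
                found_in_batch += 1
        # Assumed stopping rule: a whole batch with nothing relevant.
        if found_in_batch == 0:
            break
        # Re-rank everything not yet reviewed and present the top of the list next.
        remaining = [d for d in corpus if d not in coded]
        remaining.sort(
            key=lambda d: score(corpus[d], relevant_terms, irrelevant_terms),
            reverse=True,
        )
        batch = remaining[:batch_size]
    return coded


# Hypothetical eight-document corpus; the reviewer treats three documents as relevant.
corpus = {
    "a": "loan agreement breach",
    "b": "loan default notice",
    "c": "lunch menu friday",
    "d": "breach of loan covenant",
    "e": "office party invite",
    "f": "parking permit renewal",
    "g": "holiday roster update",
    "h": "cafeteria survey results",
}
truth = {"a", "b", "d"}
coded = continuous_active_learning(corpus, ["a", "c"], lambda d: d in truth)
print(sorted(d for d, rel in coded.items() if rel), len(coded), "of", len(corpus), "reviewed")
```

Note that there is no control set and no random sample in this sketch: the only input to each round is the coding produced by the reviewers in earlier rounds.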

The research by Grossman and Cormack showed that CAL produces a much more efficient form of TAR. Using CAL, manual review was lower, but the number of relevant documents found was higher. The average saving in the manually reviewed documents was 5%. This represented an average of 36,250 documents per case: the equivalent of 72.5 lawyer review days³. These savings are over and above the significant savings that can be achieved by moving from traditional manual review to the earlier protocols of TAR (often referred to as TAR version 1.0).
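As a quick check of the arithmetic, the conversion from documents saved to lawyer review days can be reproduced in a few lines. The rate of 500 documents per reviewer per day is the one given in the endnotes; the function name is ours and purely illustrative.

```python
def review_days(documents: int, docs_per_reviewer_day: int = 500) -> float:
    """Convert a number of documents into lawyer review days."""
    if docs_per_reviewer_day <= 0:
        raise ValueError("docs_per_reviewer_day must be positive")
    return documents / docs_per_reviewer_day


print(review_days(36_250))  # 72.5
```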

3 How does CAL differ from other forms of TAR?

Using the CAL protocol both reduces and simplifies the steps involved in the process when compared to SAL or SPL: see the Appendix.

As well as being more efficient, CAL has a number of other benefits:

  • Advantage: CAL is more flexible when new documents are introduced to a corpus.
    Explanation: The control set for SAL or SPL needs to be statistically representative of the corpus. If the corpus changes, a new control set is needed that is representative of the new corpus.
  • Advantage: CAL is more flexible if the criteria for a relevant document change during the legal proceedings.
    Explanation: If the criteria change, the control set needs to be created again from the start.
  • Advantage: The legal team does not need to pre-determine an acceptable level of risk.
    Explanation: Following the CAL process, the legal team continues to review documents and train the algorithm until they are comfortable that they have reviewed the potentially relevant documents. The SAL and SPL protocols require the legal team to determine an F-Score, which is a measure of the recall⁴ and precision⁵ the legal team wish to accept. In essence this is a measure of how much error (not finding relevant documents) is acceptable. This is traditionally something that legal teams have struggled with.
  • Advantage: No need to create a control set.
    Explanation: Control sets often encounter problems. For example, a control set may turn out not to contain even one relevant document, rendering it useless for the SAL protocol. In our experience it is common to create many control sets which fail. This destroys much of the benefit of using TAR.
  • Advantage: No need to create random samples.
    Explanation: As with control sets, random samples from a large corpus will often not include any relevant documents. While further training of the algorithm can occur, it is not very efficient unless relevant documents are included in the re-training.
  • Advantage: CAL gives the legal team much more control over the process.
    Explanation: Rather than a statistical formula telling the legal team when to stop reviewing documents, the decision is made by the legal team. This allows the reviewers to quickly identify legally significant documents, and to adapt the process when new documents are added, or new issues or interpretations arise.
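For readers unfamiliar with the measures that SAL and SPL ask the legal team to set in advance, the sketch below computes recall and precision as defined in the endnotes, together with an F-Score. We assume here the common balanced form (F1, the harmonic mean of precision and recall); a given TAR product may weight the two differently. The document identifiers are invented for the example.

```python
from typing import Set, Tuple


def recall_precision_f1(identified: Set[str], truly_relevant: Set[str]) -> Tuple[float, float, float]:
    """Return (recall, precision, F1) for a review effort.

    identified      -- documents the search or review effort marked as relevant
    truly_relevant  -- documents that are in fact relevant
    """
    true_positives = len(identified & truly_relevant)
    recall = true_positives / len(truly_relevant) if truly_relevant else 0.0
    precision = true_positives / len(identified) if identified else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # assumed balanced F-Score
    return recall, precision, f1


# Hypothetical example: 4 relevant documents exist, 5 were identified, 3 correctly.
recall, precision, f1 = recall_precision_f1(
    identified={"d1", "d2", "d3", "d9", "d10"},
    truly_relevant={"d1", "d2", "d3", "d4"},
)
print(f"recall={recall:.2f} precision={precision:.2f} F1={f1:.2f}")
```

Choosing a target F-Score up front amounts to deciding how many relevant documents it is acceptable to miss, which is the decision CAL leaves to the legal team's ongoing judgement instead.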

KordaMentha Forensic's experience using TAR is consistent with Grossman and Cormack's findings. In practice, random sampling in large corpuses of data becomes very inefficient, especially if there are few relevant documents in the corpus. Often a random sample contains no relevant documents at all, and so contributes little to the further training of the algorithm.
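A rough calculation shows why this happens. If a fraction p of a very large corpus is relevant, the chance that a random sample of n documents contains no relevant document at all is approximately (1 - p)^n. This treats each draw as independent, which is a reasonable approximation only when the corpus is much larger than the sample. The prevalence figures below are hypothetical and chosen purely for illustration.

```python
def prob_sample_has_no_relevant(prevalence: float, sample_size: int) -> float:
    """Approximate probability that a random sample contains no relevant documents.

    Assumes the corpus is much larger than the sample, so that each draw can be
    treated as independent (binomial approximation).
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must not be negative")
    return (1.0 - prevalence) ** sample_size


# Hypothetical prevalence rates for a sample of 500 documents.
for prevalence in (0.01, 0.001):
    p_empty = prob_sample_has_no_relevant(prevalence, 500)
    print(f"prevalence {prevalence:.1%}: chance of an empty sample is about {p_empty:.1%}")
```

At a prevalence of one relevant document in a thousand, a 500-document sample comes back empty more often than not.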

Interestingly, the CAL protocol follows generally accepted methods of implementing artificial intelligence (AI). Supervised machine-learning algorithms, including 'deep learning' models, work by a human telling the algorithm what he or she thinks is correct or important, based on a small set of documents. The algorithm uses this input to analyse all of the data to determine what is correct or important and what is not. This is undertaken as an iterative process similar to CAL.

4 KordaMentha Forensic's Input to the CAL process

KordaMentha Forensic has been using a form of CAL, which we call Continuous Review, when implementing TAR. As part of the ongoing training process, our Continuous Review protocol determines the next sets of documents to review using four key criteria. These identify documents which:

  1. Are categorised as 'highly relevant' by the software.
  2. Are on the threshold of being categorised relevant or not relevant by the software.
  3. Have been tagged as 'non-relevant' by a reviewer, but which, based on analytics, appear to contain concept and textual similarities to documents which were tagged as relevant by a reviewer.
  4. Based on analytics, show volatility in categorisation over a number of training rounds, for example where a document moves from being categorised as relevant to not relevant and back again.

Reviewing these types of documents will improve the accuracy of the results from the algorithm and allow the legal team to see the documents being identified as most likely to be relevant by the algorithms, and the issues that these documents raise.
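To make the four criteria concrete, the sketch below expresses them as a simple filter over per-document analytics. It is an illustration under stated assumptions, not a description of how any review platform implements them: the `DocState` fields, the relevance scores between 0 and 1, and every threshold value (`cutoff`, `high`, `margin`, `similar`, `min_flips`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DocState:
    scores: List[float]                    # predicted relevance after each training round (0 to 1)
    reviewer_tag: Optional[bool] = None    # reviewer's coding, or None if not yet reviewed
    similarity_to_relevant: float = 0.0    # similarity to documents tagged relevant (0 to 1)


def flips(scores: List[float], cutoff: float = 0.5) -> int:
    """Count how many times the predicted category changed between rounds."""
    labels = [s >= cutoff for s in scores]
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


def reasons_to_review(
    doc: DocState,
    cutoff: float = 0.5,
    high: float = 0.9,
    margin: float = 0.05,
    similar: float = 0.8,
    min_flips: int = 2,
) -> List[str]:
    """Return which of the four criteria (if any) a document meets."""
    if not doc.scores:
        return []
    reasons = []
    latest = doc.scores[-1]
    if latest >= high:                                   # criterion 1
        reasons.append("highly relevant")
    if abs(latest - cutoff) <= margin:                   # criterion 2
        reasons.append("on threshold")
    if doc.reviewer_tag is False and doc.similarity_to_relevant >= similar:  # criterion 3
        reasons.append("non-relevant tag but similar to relevant")
    if flips(doc.scores, cutoff) >= min_flips:           # criterion 4
        reasons.append("volatile")
    return reasons


# A document predicted relevant, then not relevant, then relevant, then not relevant again.
print(reasons_to_review(DocState(scores=[0.7, 0.3, 0.7, 0.2])))
```

Documents returned with one or more reasons would form the next review batch; documents meeting none of the criteria would wait for a later training round.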

5 Not all TAR algorithms are the same

Different eDiscovery tools use different underlying algorithms to perform TAR. Grossman and Cormack also compare the effectiveness of the different types of algorithm.

6 Conclusion

We believe that simplified and intuitive CAL protocols and workflows, such as our 'continuous review', will help to remove many of the current barriers – real or perceived – to the legal profession embracing TAR. Ongoing cost pressures from general counsel will also help to encourage litigators to consider TAR. Further, the Australian judiciary is showing increasing interest in the use of these sorts of technologies to ensure that discovery/disclosure is undertaken in a proportionate manner. We believe that a successful Australian test case on TAR is unlikely to be far away as the eDiscovery revolution continues.

Endnotes

1. Grossman, Maura R and Cormack, Gordon V: Continuous Active Learning for TAR, April/May 2016 E-Discovery Bulletin.

2. Grossman, Maura R and Cormack, Gordon V: Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery, Proceedings of the 37th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 153-62 (2014).

3. Based on a reviewer reviewing 500 documents per day.

4. Recall: the fraction of relevant documents that are identified as relevant by a search or review effort.

5. Precision: the fraction of documents identified as relevant by a search or review effort that are in fact relevant.

6. Based on Grossman, Maura R and Cormack, Gordon V: Continuous Active Learning for TAR, April/May 2016 E-Discovery Bulletin.

7. Irish Bank Resolution Corporation Ltd & Ors -v- Quinn & Ors [2015] IEHC 175.

8. Da Silva Moore v. Publicis Groupe, Case No. 1:11-cv-01279.

9. Overturns are documents which have been predicted by the algorithm as relevant but which, after another round of training, are re-predicted as not relevant.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
