United States: Efficiency: A Discovery Philosophy, And All You Really Need To Know About Predictive Coding

Last Updated: December 28 2016
Article by Marty J. Solomon

The main problem with discovery is the cost. In a very small number of truly bet-the-company cases (for example, where the CEO's emails must be produced) the greater risk can be failing to do discovery perfectly. But 99 times out of 100, cost is the most important factor in discovery. My guiding principle in handling discovery is, therefore, to reduce cost.

One of the biggest problems with the way many lawyers approach discovery is that they perform its steps in the wrong order, and the order of steps can have an enormous impact on cost. Many law firms will get a case in the door, then begin thinking about discovery, then send and receive discovery requests, and only then approach the client to determine what documents the company actually has, and what systems and capabilities it has to retrieve them. The lawyer then begins the good-faith conference process to negotiate with opposing counsel what procedures will be used in discovery, but only after objections have already been sent. This means the lawyer will often have written responses and objections to discovery requests without knowing what the company actually has or what problems it may face in retrieving documents from its systems. In today's world this is simply irresponsible.

The better order of steps is to receive the case, then do a preliminary analysis of what will likely be relevant and what the other side will likely want. Next, contact the company to discuss this preliminary analysis, begin to determine what documents the company has, and gain an understanding of the company's systems. The lawyer should begin the good-faith conference process with opposing counsel before any discovery requests are sent or received. That way, negotiations with opposing counsel will let the lawyer and the company know how reasonable the other side intends to be (and reveal the likelihood that discovery will become a problem) before discovery responses are even prepared. If the other side plans to be obstreperous, then it becomes important to frame discovery responses and objections carefully so they will look good to the court in motion practice. However, if the other side plans to cooperate, a more cooperative spirit governs responses. The tone is different, and far less time and money need be spent on the discovery responses. Only after these factors are known, and the company's documents and systems have been taken into account, should discovery requests be sent or responded to.

Another difficult aspect of discovery for many companies is uncertainty associated with what it will cost to respond to requests, gather responsive documents, get them produced, and, if necessary, persuade the court that the company has fulfilled all its discovery obligations. The best way to deal with uncertainty is to use your experience in discovery to establish either capped fees or piece-rates that are fair, reasonable, and give the company a measure of certainty in advance about discovery costs. I believe the best way to establish such rates is transparently and collaboratively with the company.

For some large-scale discovery projects, the way to save money is to use a document review vendor or computer-assisted review. For matters in which the company expects to review hundreds of thousands of documents, it should be company policy to employ outside legal consultants to conduct document review, rather than having higher-priced outside counsel be solely responsible for all aspects of review. Some outside counsel resist this idea because they don't want to give up control. But with reviewers offshore charging $50 per hour, the efficiencies can be tremendous, and outside counsel should overcome their fear.

In the largest of large-scale projects, computer-assisted review tools are likely to become more efficient than even a document production consultant. In some matters, it is appropriate to use both computer-assisted review tools and a document production consultant. But in many matters, using computer-assisted review tools can obviate the need for a document production consultant. The following discussion of computer-assisted review tools will familiarize you with what they are and what they do.

Understanding computer assisted review

Certain important standard technical terms and techniques have been developed in the electronic discovery literature. See Aaron Goodman, "Predictive Coding," 43 Litigation 23 (Fall 2016); Lea Malani Bay et al., "Technology-Assisted Review: Advice For Requesting Parties," Practical Law (Nov. 2016). 

In my opinion, the best way to explain these terms and techniques to the court is to submit an affidavit that defines and explains them using common sense examples. You can save money, if appropriate, by having a company employee submit this affidavit. But if litigation is more contentious or involves higher stakes, it might be worthwhile to have an independent electronic discovery expert submit this affidavit.

Here are the most important terms and techniques:

Proportionality: the legal concept that the burden of discovery should not outweigh and overwhelm the value of what is at stake in litigation. Proportionality is now specifically mentioned in the Federal Rules of Civil Procedure. The concept is also present in the reasoning of many state court decisions, although most states have not yet amended their Rules of Civil Procedure to mirror the federal rules.

Good faith conference: the procedure by which parties meet, discuss their discovery needs, and try to agree on particular methods that will be used to identify relevant documents for production. The Federal Rules of Civil Procedure require good faith conference, and spell out much of what parties are required to do with respect to conferring about electronic discovery. In addition, a growing number of state courts require good faith conference to one extent or another. While some litigants take an aggressive or obstreperous approach to good faith conference, my opinion is that aggressiveness only costs more money and rarely advances the company's interests. A transparent and cooperative approach to good faith conference can save tremendous amounts of money and, even if agreement cannot be reached, puts you in the best position to win discovery motion practice before most courts. It is rare that a transparent and cooperative approach will reveal attorney work product or compromise the company's approach to litigation, which is often the concern litigants have with this approach. So, for example, sharing the list of custodians, sharing a list of repositories, trying to come to a reasonable agreement on keywords, and being reasonable with the use of technology assisted review for both sides will, in my opinion, benefit the company by saving money and making victory more likely. See Hyles v. New York City, 2016 WL 4077114 (S.D.N.Y. Aug. 1, 2016) (an example of contentious discovery driving up costs that involves electronic discovery and technology assisted review); United States v. Education Management LLC, 2013 WL 12140442 (W.D. Pa. Nov. 24, 2013) (same); see also Apple, Inc. v. Samsung Electronics Co. Ltd., 2013 WL 1942163 (N.D. Cal. May 9, 2013) (explaining parties' duty to confer cooperatively and transparently about electronic discovery); Romero v. Allstate Ins. Co., 271 F.R.D. 96 (E.D. Pa. 2010) (same).

Clawback: an agreement by the parties, sometimes enforced by a court order, that if privileged documents or work product are inadvertently produced, the party receiving the privileged documents will return them and not argue that the privilege has been waived.

Custodians: all of the people within the company who may have relevant documents.

Repositories: all the places (email boxes, files, folders, etc.) where custodians may keep potentially relevant documents.

Universe: all the documents that you intend to review. In other words, your universe contains all of the documents found in all of the repositories belonging to all of the custodians.

Keyword search: a review of the universe that searches for particular words and returns only those documents that contain those words. The usual criticism of keyword searching is that the people trying to think of the keywords will not be able to come up with all of the terms that may be found in relevant documents, so relevant documents can be missed. Another weakness of keyword searching is that some keyword search tools can search only a precise word. So, for example, if you search for "closings", but the word "closing" appears in your document, the search tool may miss it.

Boolean search: a type of search that addresses this last problem. More sophisticated systems are able to use things like connectors, wildcards, etc. that ensure the computer will return documents similar to the keywords you used, even if not an exact match. This can dramatically improve the effectiveness of a keyword search.
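To make the difference concrete, here is a minimal sketch in Python that contrasts an exact keyword search with a wildcard search and a simple "AND" connector. The three sample documents and the function names are invented for illustration, and real review platforms use their own search syntax, so treat this only as a picture of the idea.

```python
import re

# Hypothetical three-document universe, invented for this illustration.
documents = {
    "doc1": "The closing is scheduled for Friday.",
    "doc2": "All closings were delayed by the lender.",
    "doc3": "The office is closed on Monday.",
}

def exact_keyword_search(docs, keyword):
    """Return ids of documents containing the keyword as a whole word, exactly as typed."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return sorted(doc_id for doc_id, text in docs.items() if pattern.search(text))

def wildcard_search(docs, stem):
    """Return ids of documents containing any word that begins with the stem (like 'closing*')."""
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    return sorted(doc_id for doc_id, text in docs.items() if pattern.search(text))

def and_search(docs, stems):
    """Return ids of documents containing a word beginning with every stem (an AND connector)."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if all(re.search(r"\b" + re.escape(s) + r"\w*", text, re.IGNORECASE) for s in stems)
    )

print(exact_keyword_search(documents, "closings"))  # ['doc2']
print(wildcard_search(documents, "closing"))        # ['doc1', 'doc2']
print(and_search(documents, ["clos", "lender"]))    # ['doc2']
```

The exact search for "closings" misses the document that says "closing", while the wildcard search catches both.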

Technology assisted review: sometimes called TAR, this is any tool used for a review in which a computer is trained to identify relevant documents, so your review team need not manually review the entire universe.

Predictive coding: the most common and well-accepted form of TAR. In predictive coding, a review team manually reviews a sample of documents of a set size from within the universe, their coding is fed into the computer, and the computer then reviews the rest of the universe and returns the documents it predicts are relevant.

Seed set: a group of documents, sampled from the universe, that is manually reviewed to determine relevance, then coded and used to train the predictive coding software. The seed set is sometimes reviewed only by the producing party, but sometimes both producing and requesting parties can agree to cooperate and review the seed set together. This can minimize the likelihood of discovery disputes and save cost.
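For readers who want to see the mechanics, the following is a toy sketch in Python of how coding decisions on a seed set can train a computer to classify unreviewed documents. It uses a deliberately simplified word-weighting scheme that I chose for illustration. It is not the algorithm of any commercial predictive coding product, and the sample documents are invented.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(seed_set):
    """Learn a weight per word from (text, is_relevant) pairs coded by human reviewers."""
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in seed_set:
        counts[is_relevant].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        # Add-one smoothing so a word seen under only one label still gets a finite weight.
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_not = (counts[False][word] + 1) / (totals[False] + len(vocab))
        weights[word] = math.log(p_rel / p_not)
    return weights

def score(weights, text):
    """Positive scores lean relevant, negative scores lean not relevant; unknown words count 0."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))

# Hypothetical seed set: four documents already coded by the review team.
seed_set = [
    ("merger closing delayed by lender", True),
    ("draft closing documents for merger", True),
    ("lunch menu for friday", False),
    ("office holiday party friday", False),
]
weights = train(seed_set)

# The rest of the universe, which no human has reviewed.
unreviewed = {
    "doc_a": "lender asked about merger closing",
    "doc_b": "friday lunch party",
}
predicted = {doc_id: score(weights, text) > 0 for doc_id, text in unreviewed.items()}
print(predicted)  # {'doc_a': True, 'doc_b': False}
```

With only four seed documents the result is trivial, but the same pattern (humans code a sample, the computer generalizes to the rest) is what the definitions above describe.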

Continuous active learning: a less common and less well-accepted form of TAR, but one that may represent the future because the empirical literature suggests it may be more effective. Usually abbreviated CAL, it also involves training a computer to identify relevant documents, but it does not begin with a fixed seed set. Instead, the computer continues to learn from reviewers' coding decisions as they do their work.
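The CAL workflow can be pictured as a loop: the computer ranks the unreviewed documents, a reviewer codes the top of the pile, and the computer updates itself before ranking again. The Python sketch below is a rough illustration under my own assumptions. The crude word-count scoring, the `human_review` callback, and the four sample documents are all hypothetical stand-ins, not any vendor's method.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def score(text, relevant_words, irrelevant_words):
    """Crude score: +1 for each prior sighting of a word in a relevant doc, -1 for irrelevant."""
    return sum(relevant_words[w] - irrelevant_words[w] for w in tokenize(text))

def continuous_review(universe, human_review, batch_size=1, max_batches=10):
    """Rank, review the top batch, learn from the coding, and repeat."""
    relevant_words, irrelevant_words = Counter(), Counter()
    unreviewed = dict(universe)
    coded = {}
    for _ in range(max_batches):
        if not unreviewed:
            break
        # Re-rank what is left using everything learned so far.
        ranked = sorted(
            unreviewed,
            key=lambda d: score(unreviewed[d], relevant_words, irrelevant_words),
            reverse=True,
        )
        for doc_id in ranked[:batch_size]:
            text = unreviewed.pop(doc_id)
            label = human_review(doc_id)  # the reviewer's relevance call
            coded[doc_id] = label
            # The model is updated immediately, with no fixed seed set.
            (relevant_words if label else irrelevant_words).update(tokenize(text))
    return coded

# Hypothetical universe and reviewer decisions.
universe = {
    "d1": "merger closing schedule",
    "d2": "lunch menu",
    "d3": "closing checklist for merger",
    "d4": "holiday party menu",
}
truth = {"d1": True, "d2": False, "d3": True, "d4": False}

coded = continuous_review(universe, human_review=truth.get, max_batches=3)
print(list(coded))  # ['d1', 'd3', 'd2']
```

After the reviewer codes d1 as relevant, the loop pulls d3 ahead of d2 because it shares words with d1. That is the practical appeal of this approach: likely relevant documents surface early.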

Recall: the percentage of relevant documents in the universe that are identified in your review. For example, if you have a universe of 100,000 documents, 30,000 of those documents are actually relevant, and your review correctly identifies 15,000 of those relevant documents, then your recall is 50 percent.

Precision: the percentage of documents your review identifies as relevant that are in fact relevant. So if your review identifies 30,000 documents as relevant, but a quality control review shows that only 10,000 of them are actually relevant, then your precision is 33 percent.

Richness: the percentage of documents in your universe that are actually relevant. So if you have a universe of 100,000 documents and 30,000 of those documents are actually relevant, then your richness is 30 percent.
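The three measures above are simple ratios. As a quick check on the arithmetic, this short Python sketch reproduces the figures from the examples. The function and parameter names are my own, chosen for readability.

```python
def recall(relevant_found, total_relevant):
    """Share of the truly relevant documents that the review found."""
    return relevant_found / total_relevant

def precision(relevant_found, total_flagged):
    """Share of the documents flagged as relevant that really are relevant."""
    return relevant_found / total_flagged

def richness(total_relevant, universe_size):
    """Share of the whole universe that is truly relevant."""
    return total_relevant / universe_size

# Figures taken from the examples in the definitions above.
print(f"recall:    {recall(15_000, 30_000):.0%}")     # recall:    50%
print(f"precision: {precision(10_000, 30_000):.0%}")  # precision: 33%
print(f"richness:  {richness(30_000, 100_000):.0%}")  # richness:  30%
```

Note that recall and precision share the same numerator (relevant documents actually found) and differ only in what they are divided by.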

F1: a highly technical term that can be difficult for courts to understand, but it is the gold standard for measuring the quality of a computer assisted review overall. It is the "harmonic mean" (a particular kind of average) of your recall and your precision. It is intended to measure effectiveness in a practical and conservative way, so it tends to be closer to the lower (less effective) of your recall and your precision.
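For those who want the formula, F1 equals 2 times precision times recall, divided by the sum of precision and recall. The Python sketch below compares it with an ordinary average for a hypothetical review that finds most of the relevant documents but flags a great deal of junk along the way. The input figures are invented for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; treated as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def arithmetic_mean(precision, recall):
    """Ordinary average, shown only for comparison."""
    return (precision + recall) / 2

# Hypothetical review: 90 percent recall but only 10 percent precision.
p, r = 0.10, 0.90
print(round(f1_score(p, r), 2))         # 0.18
print(round(arithmetic_mean(p, r), 2))  # 0.5
```

An ordinary average would report 50 percent, which flatters the review. F1 reports 18 percent, much closer to the weaker of the two numbers, which is the conservative behavior described above.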

Nested review: a system in which manual review, keyword searching, and technology assisted review, are used together in an agreed-upon order to identify all the relevant documents within the universe that will be produced. For example, a review might begin by running a Boolean keyword search on the email boxes of all of the document custodians to generate a subset of documents. That subset will then be reviewed using predictive coding to winnow the documents further. All of the documents identified by these two techniques as relevant might then be reviewed manually to identify privileged documents and make final relevance calls before a production is made. See In re Lithium Ion Batteries Antitrust Litig., 2015 WL 833681 (N.D. Cal. Feb. 24, 2015) (describing a typical nested review); Progressive Cas. Ins. Co. v. Delaney, 2014 WL 3563467 (D. Nev. July 18, 2014) (same).
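The nested review described above can be pictured as a three-stage funnel. The Python sketch below is only an illustration under stated assumptions: `toy_score` is a hypothetical stand-in for a trained predictive coding model, and the search stems, threshold, and sample emails are invented.

```python
import re

def boolean_filter(universe, stems):
    """Stage 1: keep documents containing any word that begins with one of the stems."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(s) for s in stems) + r")\w*", re.IGNORECASE
    )
    return {doc_id: text for doc_id, text in universe.items() if pattern.search(text)}

def predictive_filter(docs, score_fn, threshold):
    """Stage 2: keep documents whose model score meets the agreed threshold."""
    return {doc_id: text for doc_id, text in docs.items() if score_fn(text) >= threshold}

def manual_review_queue(docs):
    """Stage 3: what remains goes to humans for privilege and final relevance calls."""
    return sorted(docs)

def toy_score(text):
    """Hypothetical stand-in for a trained model: fraction of words that are deal-related."""
    deal_words = {"merger", "closing", "lender", "escrow"}
    words = [w.strip(".,") for w in text.lower().split()]
    return sum(w in deal_words for w in words) / max(len(words), 1)

# Hypothetical email universe.
universe = {
    "e1": "Merger closing moved to Friday per lender.",
    "e2": "Closing time for the cafeteria is 3pm.",
    "e3": "Holiday party invitation.",
}

after_search = boolean_filter(universe, ["clos", "merg"])
after_model = predictive_filter(after_search, toy_score, threshold=0.3)
print(sorted(after_search))              # ['e1', 'e2']
print(manual_review_queue(after_model))  # ['e1']
```

Each stage shrinks the pile, so the most expensive step (human eyes on a document) is spent only on what survives the cheaper steps.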

A large and growing body of empirical research shows that technology assisted review is actually more effective than manual review of the entire universe of documents. See Maura R. Grossman et al., "Technology-Assisted Review In E-Discovery Can Be More Effective And More Efficient Than Exhaustive Manual Review," 17 Rich. J. of Law & Tech. 11 (Spring 2011); Nicholas Barry, "Man Versus Machine Review: The Showdown Between Hordes of Discovery Lawyers and a Computer-Utilizing Predictive-Coding Technology," 15 Vand. J. Ent. & Tech. L. 343 (Winter 2013).

In other words, computer tools have gotten so good that they can produce better recall, better precision, and better F1 than a team of manual reviewers who actually lay eyes on every document in the universe. This is an amazing feat. It is the reason why courts today generally accept technology assisted review when the parties in litigation agree to its use. See Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012) (the seminal opinion by Magistrate Judge Andrew J. Peck, widely respected and active in the Sedona Conference, that first approved predictive coding in court); see also Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125 (S.D.N.Y. 2015) (Magistrate Judge Peck's follow-up to Da Silva Moore).

Although it is much less common for parties or courts to consider using technology assisted review if either side insists on manual review, this is probably the future. See Hinterberger v. Catholic Health System, Inc., 2013 WL 2250603 (W.D.N.Y. May 21, 2013) (refusing to compel the use of TAR over the other party's objection); Bridgestone Americas, Inc. v. IBM, 2014 WL 4923014 (M.D. Tenn. July 22, 2014) (ordering parties to confer in good faith about TAR); but see also Dynamo Holdings Ltd. Partnership v. Comm. of IRS, 2016 WL 4204067 (U.S. Tax. Ct. July 13, 2016) (after parties agreed to TAR, one party could not compel the other to do a further review).


In an appropriate case we would urge a court to order the use of technology assisted review, even against opposition, to save money, make litigation more efficient, and be more consistent with the proportionality concept. I believe that if all these tools, techniques, and services are used, we can maximize the likelihood that the company will achieve optimum efficiency in driving down the total cost of its discovery throughout the nation.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
