What Litigators Should Know About Statistical Sampling in Labor and Employment Disputes
With statistical sampling, counsel can simplify damage analyses, avoid potential issues with incomplete or missing data, and minimize the risk of error.
Questions Counsel Should Ask When Determining if Sampling Is Appropriate:
- Are there significant gaps in the data required to complete an analysis?
- Is the class too large to review each member individually?
- Is the data unorganized, messy, or too complicated to analyze in a timely fashion?
- Would a representative sample allow for more detail and care to be paid to a review or analysis?
- How will the sampling process be documented and shared between parties?
- Do I have an expert who can conduct the sampling, and is the sampling methodology reproducible?
Sampling in Labor and Employment Litigation
Statistical sampling is a generally accepted methodology for making inferences about populations. When done correctly, statistical samples can produce valid and reliable results of the kind relied on in academic research and by courts and regulatory agencies. In Tyson Foods, Inc. v. Bouaphakeo, No. 14-1146, 2016 WL 1092414 (U.S. March 22, 2016), respondents introduced "a representative sample to fill an evidentiary gap created by the employer's failure to keep adequate records." The Supreme Court upheld the use of statistical sampling, noting that in this case, the sample was "reliable in proving or disproving the elements of the relevant cause of action." Consistent with Tyson, where the data is incomplete or an individualized review of all class members is necessary but infeasible, statistical sampling can be used to simplify an otherwise complicated damage analysis.
Simplifying Complicated Data
In labor and employment class actions, a wide variety of data sources are needed to complete a thorough analysis. Typically, sensitive records from human resources, timekeeping, and payroll systems are produced and transmitted to multiple parties to complete the analysis. Depending on the size of the class, this data can quickly become large, unorganized, and unwieldy to work with. Statistical sampling can be used to select a subset of employees, pay periods, and/or locations to limit the amount of data that needs to be analyzed. A valid, statistically representative sample of the population can provide nearly as precise results without needing to analyze the entire population.
Filling in Gaps in Data
In some industries or lines of work, employee time records are more likely to be incomplete. In Tyson, timekeeping records did not specifically record the amount of time employees spent putting on and taking off specific equipment. A statistical sample of employee shifts was used to infer the average amount of unpaid time spent putting on and taking off that equipment before and after a shift. Similarly, in other instances where all worked time has not been appropriately captured or data was not properly retained, statistical sampling of complete time periods can be used to fill in the gaps.
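As a rough illustration of this kind of gap-filling, the sketch below averages the unpaid minutes observed in a sample of shifts and applies that average to every shift in the population. It is a minimal sketch under stated assumptions, not the method used in Tyson: the function name, the minute values, and the shift count are all hypothetical, and the result is a point estimate only that says nothing about the precision of the estimate.

```python
from statistics import mean


def extrapolate_unpaid_minutes(sampled_minutes, total_shifts):
    """Apply the sample's average unpaid time to every shift in the population.

    sampled_minutes: unpaid minutes observed for each sampled shift (hypothetical input)
    total_shifts:    number of shifts in the full population at issue
    """
    if not sampled_minutes:
        raise ValueError("the sample must contain at least one shift")
    average = mean(sampled_minutes)
    return average, average * total_shifts


# Hypothetical observations for six sampled shifts, in minutes.
sampled_minutes = [18.0, 21.5, 19.0, 22.5, 20.0, 19.0]

# Hypothetical population of 10,000 shifts.
avg_minutes, total_minutes = extrapolate_unpaid_minutes(sampled_minutes, total_shifts=10_000)

print(avg_minutes)    # 20.0 minutes per shift
print(total_minutes)  # 200000.0 minutes across all shifts
```

The extrapolation is only as good as the sample behind it, which is why the sample must be drawn randomly from a properly defined population.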
Limiting Specific Review
If each individual in the class requires a detailed, individualized review, the analysis may become too time-consuming or costly. Statistical sampling offers several benefits in that situation. First, when done properly, it allows a subset of the population to be reviewed while maintaining nearly the same accuracy as if the entire population had been analyzed. Second, a well-designed representative sample may actually yield more accurate results than individualized review of the entire population, because more care and supervision can be applied to a smaller subset of information.[1] Third, statistical sampling can limit the quantity of sensitive data that is transferred between parties in the litigation proceedings. Fourth, when electronic data is not available (for example, when the company maintains timekeeping information on paper time cards or log sheets), a random sample of documents can yield accurate results without necessitating an extensive and expensive data-entry process.
Sampling Process
For a sample to provide statistically valid results, certain steps must be taken to ensure the sample is random[2] and reliable:
- Definition of Sampling Frame. Before the sample is drawn, the entire population must be identified. This can be a specific list of employees, shifts, pay periods, or locations that are at issue in the litigation. Similarly, the sampling unit, or level of data that will be randomly selected, must be properly defined.
- Ordering of Sampling Units. The data must be sorted and ordered using a specific data field or methodology that can be replicated. For example, you can order the data by employee name, date, and shift start time, or keep the original order in which the data was produced.
- Random Number Generation. After the data has been properly ordered, a random number is assigned to each sampling unit. Random numbers can be generated using a number of tools. Conceptually, these tools generate random numbers in such a way that each number has an equal chance of being generated, so that each sampling unit also has an equal chance of being selected. For litigation, it is equally important that the sample be reproducible and verifiable, which is achieved by using a recorded seed value when generating the random numbers.
- Reorder the Sampling Frame. Once the random numbers have been assigned to the data, the data is re-sorted using those random numbers.
- Select the Sample. Given a set sample size, sampling units are selected from the top of the data, working downwards. For example, if the sample size is 75 units, the first 75 units listed after the data is ordered by the generated random number are selected as the sample.
- Document the Sampling Process. Following the selection of the sample, the details pertaining to the five steps listed above must be properly documented and shared with relevant parties so that the sampling process can be replicated and reviewed.
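The steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than a definitive implementation: the function name `draw_sample`, the employee and date identifiers, and the seed value are all hypothetical, and Python's standard `random` module stands in for whatever tool the expert actually uses. Any tool that accepts a seed can follow the same pattern.

```python
import random


def draw_sample(frame, sample_size, seed, sort_key=None):
    """Draw a reproducible simple random sample from a sampling frame.

    frame:       list of sampling units (here, hypothetical employee/date pairs)
    sample_size: number of units to select
    seed:        seed value, recorded so the draw can be replicated
    sort_key:    optional function defining the replicable ordering
    """
    if sample_size > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")

    # Order the sampling units in a way that can be replicated.
    ordered = sorted(frame, key=sort_key)

    # Assign a random number to each unit using a seeded generator.
    rng = random.Random(seed)
    keyed = [(rng.random(), unit) for unit in ordered]

    # Re-sort the frame by the assigned random numbers.
    keyed.sort(key=lambda pair: pair[0])

    # Select units from the top of the re-sorted data.
    return [unit for _, unit in keyed[:sample_size]]


# Hypothetical sampling frame: 50 employees x 10 dates = 500 employee shifts.
frame = [
    (f"E{emp:03d}", f"2016-03-{day:02d}")
    for emp in range(1, 51)
    for day in range(1, 11)
]

# Hypothetical seed; the value itself is arbitrary but must be documented.
sample = draw_sample(frame, sample_size=75, seed=20160322)
print(len(sample))  # 75
```

Because the frame is sorted before the random numbers are assigned, anyone who has the same frame, ordering rule, seed, and tool version can reproduce the identical sample, regardless of the order in which the data was originally produced. Those four items are the natural contents of the documentation described in the final step.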
Takeaways
Statistical sampling, when done properly, can allow for a simplified analysis while producing reliable results. Courts and regulatory agencies alike have accepted the use of statistical sampling in situations where data is incomplete or too unorganized to analyze in its entirety. Litigators should consider statistical sampling in their future labor and employment class action cases as the volume of data continues to grow and companies continue to use varying technologies to capture relevant employee data.
In our next installment of this three-part series, we will explore the key questions counsel should consider once the decision to sample has been made:
- What is statistical inference?
- What do the margin of error (MOE) and confidence level mean?
- What types of sampling methods are there?
- How can I be confident in the results of the sample?
Footnotes
1. Cochran, W. G. (1977). Sampling Techniques (3rd ed.). New York, NY: John Wiley & Sons.
2. A random sample minimizes any systematic bias that could be introduced if the sample was not drawn randomly. (See Dattalo, P. (2010). Strategies to Approximate Random Sampling and Assignment. New York, NY: Oxford University Press, p.20.)