Focus on the Merits to Find What is Relevant, Not Search Terms Alone

Responding to a discovery request marries the practice of law to search technology. Parties in Rule 26 conferences in federal court often spend a significant amount of time exchanging “search terms” to determine the most effective discovery protocol for a case.

I think focusing on “search terms” alone is a mistake. Parties in a meet and confer should focus on search concepts, such as who the relevant individuals are, the date ranges, and the core terms. The goal should be identifying what is relevant to the claims and defenses in a lawsuit, not the minutiae of “search terms.”

Consider the case of In re Lithium Ion Batteries Antitrust Litig. The parties negotiated and developed a search protocol using search terms. The Court summarized the ESI protocol as follows[1]:

  1. The producing/responding party will develop an initial list of proposed search terms and provide those terms to the requesting party;
  2. Within 30 days, the requesting party may propose modifications to the list of terms or provide additional terms (up to 125 additional terms or modifications); and
  3. Upon receipt of any additional terms or modifications, the producing/responding party will evaluate the terms, and
  4. Run all additional/modified terms upon which the parties can agree and review the results of those searches for responsiveness, privilege, and necessary redactions (Proposed Search Term Protocol § B4), or
  5. For those additional/modified terms to which the producing/responding party objects on the basis of overbreadth or identification of a disproportionate number of irrelevant documents, that party will provide the requesting party with certain quantitative metrics and meet and confer to determine whether the parties can agree on modifications to such terms. Among other things, the quantitative metrics include the number of documents returned by a search term and the nature and type of irrelevant documents that the search term returns. In the event the parties are unable to reach agreement regarding additional/modified search terms, the parties may file a joint letter regarding the dispute.
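
The quantitative metrics in step 5 start with a simple count: how many documents each disputed term returns. The following is a minimal Python sketch of that count, not the parties' actual tooling. The `documents` collection, the document IDs, and the plain case-insensitive substring matching are all assumptions for illustration; a real review platform would use indexed search with its own syntax.

```python
def hit_counts(documents, terms):
    """Return {term: number of documents whose text contains the term}."""
    counts = {}
    for term in terms:
        needle = term.lower()
        # Case-insensitive substring match (an assumption, not a platform feature).
        counts[term] = sum(1 for text in documents.values() if needle in text.lower())
    return counts


# Hypothetical documents, keyed by a made-up document ID.
documents = {
    "DOC-001": "Pricing call with competitor about battery cells.",
    "DOC-002": "Lunch order for the pricing team.",
    "DOC-003": "Quarterly battery shipment schedule.",
}

metrics = hit_counts(documents, ["pricing", "battery", "cartel"])
print(metrics)
```

A count like this says how much a term returns, not whether what it returns is relevant, which is why the protocol pairs it with a description of the irrelevant documents.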

The Plaintiffs recommended that, if there were disputed search terms after “quantitative metrics evaluation, the parties would then conduct a randomized qualitative sampling.” This would be done by a “random number generator” that would “generate a statistically valid number of ordinal positions of the identified documents,” and the “randomly selected documents can be viewed by the Requesting Party immediately after the appropriate privilege check.”[2]
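
The “ordinal positions” idea can be sketched in a few lines of Python: number the documents a search returned from 1 to N, then draw positions at random without replacement. This is an illustration under stated assumptions, not the procedure the parties used. The function name, the hit count of 12,000, the sample size of 400, and the seed are all hypothetical; the opinion as summarized here does not fix a sample size.

```python
import random


def sample_ordinal_positions(total_hits, sample_size, seed=None):
    """Pick sample_size distinct 1-based positions from a hit list of total_hits documents."""
    if sample_size > total_hits:
        raise ValueError("sample_size cannot exceed total_hits")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(range(1, total_hits + 1), sample_size))


# Hypothetical numbers: 12,000 hits for a disputed term, 400 documents sampled.
positions = sample_ordinal_positions(total_hits=12000, sample_size=400, seed=2015)
print(positions[:10])
```

Recording the seed is one way to let both sides reproduce the same draw, though nothing in the source says the parties did so.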

The Defendants were not keen on this plan, because the sampling protocol would result in production of irrelevant information.[3]

The Court agreed with the Plaintiffs, stating that the “point of random sampling is to eliminate irrelevant documents from the group identified by a computerized search and focus the parties’ search on relevant documents only.”[4]

Judge Donna Ryu grounded the Court’s holding in the fact that keywords can be “overinclusive” and can return a large number of irrelevant documents in addition to relevant ones.[5] Invoking the Moore v. Publicis Groupe opinion, the Court stated the goal of “quality control test[ing]” is “to assure accuracy in retrieval and elimination of ‘false positives.’”[6]

The Court pointed out the Plaintiffs’ common-sense argument: if a random sample shows that a search is returning a high proportion of irrelevant documents, the search is a bad one and needs to be modified to improve its precision in identifying relevant documents.[7]
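
That argument is, in effect, a precision estimate: the share of sampled hits that reviewers code as relevant. The sketch below shows one way to compute it in Python with a Wilson score interval around the estimate. The function name, the 95% confidence level, and the example figures (60 relevant documents in a sample of 400) are assumptions for illustration; the source does not say what proportion of irrelevant documents would count as “high.”

```python
import math


def precision_estimate(relevant_in_sample, sample_size, z=1.96):
    """Return (point estimate, lower bound, upper bound) for the share of relevant hits.

    The bounds are a Wilson score interval; z=1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = relevant_in_sample / sample_size
    denom = 1 + z**2 / sample_size
    center = (p + z**2 / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return p, center - half_width, center + half_width


# Hypothetical review result: 60 of 400 sampled documents coded relevant.
point, low, high = precision_estimate(relevant_in_sample=60, sample_size=400)
print(f"precision ~ {point:.1%} (about {low:.1%} to {high:.1%})")
```

A low estimate with a tight interval is the kind of evidence that supports narrowing a term, which is the use of sampling the Court endorsed.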

The Court further stated that the proposed sampling procedure was designed to prevent irrelevant documents from being reviewed and would obviate motion practice over search terms.[8]

The Defendants’ concerns did not fall on deaf ears. The Court explained that the Defendants could remove any irrelevant files from the random qualitative sample for any reason as long as the removed files were replaced with an equal number of randomly generated files.[9]
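
The replacement rule keeps the sample size constant: every document pulled from the sample is swapped for a newly drawn one. A minimal Python sketch follows. It assumes, as a design choice the order does not address, that replacement draws exclude every position already sampled, including the removed ones; the function name and the numbers are hypothetical.

```python
import random


def replace_removed(sample, removed, total_hits, seed=None):
    """Drop removed positions from the sample and draw the same number of new ones."""
    removed = set(removed)
    if not removed <= set(sample):
        raise ValueError("can only remove positions that are in the sample")
    kept = [pos for pos in sample if pos not in removed]
    # Assumption: new draws exclude everything already sampled, removed or not.
    excluded = set(sample)
    pool = [pos for pos in range(1, total_hits + 1) if pos not in excluded]
    if len(removed) > len(pool):
        raise ValueError("not enough remaining documents to replace the removed ones")
    rng = random.Random(seed)
    return sorted(kept + rng.sample(pool, len(removed)))


# Hypothetical sample of five positions out of 100 hits; two are removed.
original = [3, 17, 42, 88, 95]
updated = replace_removed(original, removed=[17, 88], total_hits=100, seed=7)
print(updated)
```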

The Court’s order was very specific on the following points[10]:

The parties agreed that the procedure for qualitative sampling shall apply only after exhaustion of the quantitative evaluation process.

Irrelevant documents in the sample shall be used only for the purpose of resolving disputes regarding search terms in this action, and for no other purpose in this litigation or in any other litigation; those irrelevant documents, as well as any attorney notes regarding the sample, shall be destroyed within fourteen days of resolution of the search term dispute, with such destruction confirmed in an affidavit by counsel. 

In addition, the Court held that access to the random sample shall be limited to one attorney from each law firm designated co-lead class counsel for Direct Purchaser Plaintiffs and Indirect Purchaser Plaintiffs (total of six attorneys).

Plaintiffs could invoke the random sampling process with respect to no more than five search terms per defendant group. 

A defendant family would run one combined search for up to five disputed terms, rather than creating separate samples for each disputed term. The parties were ordered to meet and confer regarding the sample size, as well as the overall limit on the number of sample documents generated per defendant family.
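
Running one combined search means taking the union of the hits for the disputed terms, so a document matching several terms appears in the sampling pool only once. The Python sketch below illustrates that, along with the five-term cap. The function name, the sample documents, and the substring matching are assumptions for illustration, not a description of any party's search tool.

```python
MAX_DISPUTED_TERMS = 5  # cap per defendant group, as described in the order


def combined_hits(documents, disputed_terms):
    """Return sorted IDs of documents matching at least one disputed term."""
    if len(disputed_terms) > MAX_DISPUTED_TERMS:
        raise ValueError("no more than five disputed terms per defendant group")
    needles = [term.lower() for term in disputed_terms]
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(needle in text.lower() for needle in needles)
    )


# Hypothetical documents, keyed by a made-up document ID.
docs = {
    "DOC-001": "Pricing call about battery cells.",
    "DOC-002": "Lunch order for the pricing team.",
    "DOC-003": "Holiday party schedule.",
}

pool = combined_hits(docs, ["pricing", "battery"])
print(pool)
```

The random sample would then be drawn from this combined pool rather than from a separate hit list for each term.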

Bow Tie Thoughts 

This search protocol was very specific on sampling. It highlights how complex “search” can be in litigation, and also the danger of using only “search terms” in discovery.

“Search terms” are recognized as easily being both over- and under-inclusive. As such, there is no meet and confer that will ever determine every possible search term. If there were, ESI Protocols would look like the Napoleonic Code of Discovery.

The issue with discovery is determining how to find the ESI that is relevant to the claims and defenses of a lawsuit. The first steps include determining the key players in the litigation, the date ranges, how they communicated, and the terms of art they used. The analysis should go beyond “search terms” to the concepts at issue in the lawsuit, so that today’s eDiscovery software can truly be used as “technology-assisted review” to help lawyers find responsive ESI.

 Footnotes 

[1] In re Lithium Ion Batteries Antitrust Litig., 2015 U.S. Dist. LEXIS 22915, *48-49 (N.D. Cal. Feb. 24, 2015).

[2] Id., at *49.

[3] Id., at *55.

[4] Id., at *54.

[5] Id., at *54, citing Moore v. Publicis Groupe, 287 F.R.D. 182, 191 (S.D.N.Y. 2012).

[6] Id., at *54, citing Moore at 191, citing William A. Gross Constr. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 136 (S.D.N.Y. 2009).

[7] Id., at *55.

[8] Id.

[9] Id., at *55.

[10] Id., at *55-56.