This one shows a lot of forethought, but I am puzzled by the form of production.
Technology Assisted Review is Good for You and Me
There is nothing magical about using Technology Assisted Review. There is also no rule requiring specific technology to find responsive electronically stored information. The issue is always one of whether a production was adequate.
The Case Management Order in Green v. Am. Modern Home Ins. Co. states the following on Technology Assisted Review:
- Technology Assisted Review in Lieu of Search Terms. In lieu of identifying responsive ESI using the search terms and custodians/electronic systems as described in Sections II.C & II.D above, a party may use a technology assisted review platform to identify potentially relevant documents and ESI.
Green v. Am. Modern Home Ins. Co., 2014 U.S. Dist. LEXIS 165956, 4 (W.D. Ark. Nov. 24, 2014).
I would argue such a decree in a Case Management Order is unnecessary under the Federal Rules of Civil Procedure and case law, but such a specific order should preemptively end any question on whether predictive coding, data analytics, “find similar,” conceptual search, and any other available search technology can be used in the case.
The Form of Production
I am not a fan of converting native files to TIFFs and then running OCR on them, absent the need to redact confidential or privileged information. That is exactly what this order prescribed, except for spreadsheets:
- Format. All ESI, other than databases or spreadsheets, shall be produced in a single- or multi-page 300 dpi TIFF image with a Concordance DAT file with standard delimiters and OPT file for image loading. The documents shall also be processed through Optical Character Recognition (OCR) Software with OCR text files provided along with the production. Extracted Text shall be provided for all documents unless it cannot be obtained. To the extent a document is redacted, OCR text files for such document shall not contain text for the redacted portions of the document. Each TIFF image will be assigned a Bates number that: (1) is unique across the entire document production; (2) maintains a constant length across the entire production padded to the same number of characters; (3) contains no special characters or embedded spaces; and (4) is sequential within a given document. If a Bates number or set of Bates numbers is skipped in a production, the Producing Party will so note in a cover letter or production log accompanying the production. Each TIFF image file shall be named with the Bates Number corresponding to the number assigned to the document page contained in that image. In the event a party determines that it is unable to produce in the format specified in this section without incurring unreasonable expense, the parties shall meet and confer to agree upon an alternative format for production.
- Metadata. To the extent that any of the following metadata fields associated with all applicable documents are available, the Producing Party will produce those metadata fields to the Requesting Party: file name, file size, author, application date created, file system date created, application date last modified, file system date last modified, date last saved, original file path, subject line, date sent, time sent, sender/author, recipient(s), copyee(s), and blind copyee(s). For emails with attachments, the Producing Party will indicate when a parent-child relationship between the message and the attachment exists. A Producing Party shall also produce a load file with each production with the following fields: Starting Bates; Ending Bates; Begin Attach; End Attach; and Source (custodian/location from which document was collected). If any metadata described in this section does not exist, is not reasonably accessible, is not reasonably available, or would be unduly burdensome to collect or provide, nothing in this ESI Order shall require any party to extract, capture, collect or produce such metadata.
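For anyone who has to quality-check a production against the four Bates numbering requirements in the Format paragraph, here is a minimal sketch in Python. It is my own illustration, not anything the order or the parties specified. The function name `check_bates` is hypothetical, and the pattern assumes a Bates number is a letter prefix followed by zero-padded digits, which is a common convention but not one the order mandates.

```python
import re

# Assumption: a Bates number is a letter prefix plus zero-padded digits,
# e.g. "ABC000001". The order only bars special characters and spaces.
BATES_PATTERN = re.compile(r"([A-Za-z]+)([0-9]+)")


def check_bates(documents):
    """Check page-level Bates numbers against the order's four requirements.

    documents: a list of lists; each inner list holds the Bates numbers of
    one document's pages, in page order. Returns a list of problem strings.
    """
    problems = []
    all_numbers = [n for doc in documents for n in doc]

    # (1) unique across the entire production
    if len(set(all_numbers)) != len(all_numbers):
        problems.append("duplicate Bates number in production")

    # (2) constant length across the entire production
    if len({len(n) for n in all_numbers}) > 1:
        problems.append("Bates numbers are not a constant length")

    # (3) no special characters or embedded spaces
    for n in all_numbers:
        if BATES_PATTERN.fullmatch(n) is None:
            problems.append(f"special character or space in {n!r}")

    # (4) sequential within a given document
    for doc in documents:
        matches = [BATES_PATTERN.fullmatch(n) for n in doc]
        pages = [int(m.group(2)) for m in matches if m]
        if any(b - a != 1 for a, b in zip(pages, pages[1:])):
            problems.append(f"pages not sequential in document starting {doc[0]}")

    return problems
```

A clean production returns an empty list. Anything else is a list of issues to fix before the production goes out the door.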
The order does include extracted text, but why go to the trouble of requiring production as TIFFs in the first place? The statement about OCR could be misconstrued as requiring OCR of the TIFFs even when searchable text is already available in the form of extracted text, which makes the OCR both redundant and an added cost. The only reason to OCR a TIFF is that it has been redacted, because producing the extracted text would inadvertently produce the redacted content.
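The rule of thumb I am describing for searchable text can be sketched as a small decision function. This is my reading of how the choice should work, not language from the order. The function name `text_source` is hypothetical, and the fallback to OCR for files with no extractable text (scanned paper, for example) is my assumption based on the order's "unless it cannot be obtained" language.

```python
def text_source(has_extracted_text: bool, is_redacted: bool) -> str:
    """Pick where a produced document's searchable text should come from."""
    if is_redacted:
        # OCR the redacted image so the redacted content never appears
        # in the produced text.
        return "ocr"
    if has_extracted_text:
        # Extracted text already exists; OCR would be redundant and add cost.
        return "extracted"
    # Assumption: nothing to extract (e.g. scanned paper), so OCR is the
    # only way to get searchable text.
    return "ocr"
```

Under this logic an unredacted email or Word document never gets OCRed at all.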
Most review applications today do a great job of ingesting native files and allowing users to review in near-native. If the native file needs to be accessed, most applications allow for reviewing the native within the review application or a copy downloaded for review in the native application.
Requiring conversion to static images is not the default under Federal Rule of Civil Procedure 34. I do not recommend requiring conversion to TIFF for production, unless a substantial number of redactions must be made.
There are many types of metadata, from embedded, to substantive, to system. The above order treats metadata as if it were objective coding, seeking specific pieces of information. While all of it is useful information, I would encourage parties to think more in terms of types of metadata, in addition to how the information should appear in a review application.
Spreadsheets in Native File Format
The order stated the following on spreadsheets:
- Spreadsheets. Absent special circumstances, Excel files, .csv files and other similar spreadsheet files will be produced in native format (“Native Files”). Native Files will be provided in a self-identified “Natives” directory. Each Native File will be produced with a corresponding single-page TIFF placeholder image, which will contain language indicating that the document is being produced as a Native File. Native Files will be named with the beginning Bates number that is assigned to that specific record in the production. A “NativeLink” entry for each spreadsheet will be included in the .DAT load file indicating the relative file path to each native file on the Production Media. Native Files will be produced with extracted text and applicable metadata fields if possible and consistent with Section III.A.2 above. For documents that contain redacted text, the parties may either apply the redactions directly on the native file itself or produce TIFF image files with burned-in redactions in lieu of a Native File and TIFF placeholder image. Each Producing Party will make reasonable efforts to ensure that Native Files, prior to conversion to TIFF, reveal hidden data from redacted Native Files that are produced as TIFF image files and will be formatted so as to be readable. (For example, column widths should be formatted so that numbers do not appear as “#########”.) Under these circumstances, all single-page TIFF images shall include row and column headings.
Green, at *8-9.
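To picture what the renaming and the “NativeLink” entry look like in practice, here is a rough Python sketch. The helper names `native_link` and `dat_row` are hypothetical, the forward-slash path is an assumption (many load files use backslashes), and the delimiters are what I understand to be the usual Concordance defaults (ASCII 20 between fields, ASCII 254 around values). Confirm them against your own review platform before relying on this.

```python
import os

# Assumed Concordance-style default delimiters; verify for your platform.
FIELD_SEP = "\x14"  # ASCII 20, separates fields
QUOTE = "\xfe"      # ASCII 254 (þ), wraps each value


def native_link(beg_bates, original_name):
    """Rename a native file to its beginning Bates number, keeping the
    original extension, and return its relative path in "Natives"."""
    _, ext = os.path.splitext(original_name)
    return f"Natives/{beg_bates}{ext}"


def dat_row(values):
    """Join field values into one delimited load-file line."""
    return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in values)


# Example: a one-page placeholder, so beginning and ending Bates match.
link = native_link("ABC000010", "Q3 budget.xlsx")
row = dat_row(["ABC000010", "ABC000010", link])
```

Note what the renaming costs: the original file name survives only as a metadata field in the load file, which is part of why I wonder how case managers feel about this requirement.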
I am glad the default for spreadsheets did not deviate from Rule 34. I am curious whether any of my case manager friends would agree with the order requiring TIFF placeholders and renaming the native files.
The past year has seen parties become more detailed in their case management orders regarding electronically stored information. This is a good thing. However, I strongly encourage parties not to deviate from the Federal Rules of Civil Procedure without reason, to leverage the search abilities of their review applications, and to make sure the case management order helps the case comply with Federal Rule of Civil Procedure 1.
Josh Gilliland is a California attorney who focuses his practice on eDiscovery. Josh is the co-creator of The Legal Geeks, which has made the ABA Journal Top Blawg 100 Blawg from 2013 to 2016 and was nominated for Best Podcast for the 2015 Geekie Awards. Josh has presented at legal conferences and comic book conventions across the United States. He also ties a mean bow tie.