Did a Judge Say Scanning of a Native File and OCR?

Taxation of costs cases can get funky. One such example is a case where a party sought costs for producing discovery in “OCR” format. The Court went on to state: “The scanning and conversion of native files to the agreed-upon format for production of ESI constitutes ‘making copies of materials’ as pursuant to §1920(4)” and found the OCR costs recoverable. Kuznyetsov v. West Penn Allegheny Health Sys., 2014 U.S. Dist. LEXIS 150503 (W.D. Pa. Oct. 23, 2014).


The Court further held that the cost of 5 cents per page for TIFF-ing was not unreasonable and that the cost of 24 cents per page for scanning paper documents was also not unreasonably high. Id.
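To put those per-page rates in perspective, here is a minimal arithmetic sketch. The two rates come from the opinion as described above; the page count is purely hypothetical and is not a figure from the case.

```python
# Per-page rates in cents, as reported in the opinion discussed above.
TIFF_CENTS_PER_PAGE = 5
SCAN_CENTS_PER_PAGE = 24


def cost_in_dollars(pages: int, cents_per_page: int) -> float:
    """Return the total cost in dollars for a page count and a per-page rate."""
    return pages * cents_per_page / 100


# Hypothetical page count, chosen only for illustration (not from the case).
pages = 10_000

tiff_cost = cost_in_dollars(pages, TIFF_CENTS_PER_PAGE)
scan_cost = cost_in_dollars(pages, SCAN_CENTS_PER_PAGE)

print(f"TIFF conversion: ${tiff_cost:,.2f}")
print(f"Paper scanning:  ${scan_cost:,.2f}")
```

At these rates, scanning paper costs nearly five times as much per page as TIFF-ing, which is one reason not to route native files through paper.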

What is strange is seeing the words “native file” and “scanning” and “OCR format” in the same sentence. Native files are already electronically stored information. Business data such as email messages, text messages, Word documents, and Excel files is already searchable. There is no reason to print the ESI to paper, then scan it, and then perform optical character recognition on it. All of that would be extremely strange and drive up costs.

There is a chance the Court was discussing both scanning paper and conversion of native files in the same sentence, without directly saying “paper documents” before “conversion of native files.” Even if that were the case, I would recommend producing native files natively to keep costs down. Only convert privileged or confidential native files to TIFFs or PDFs for redaction.

If native files need to be converted to static images such as TIFFs in order to redact confidential information, making the static images searchable with OCR would make sense: you want to produce information that is searchable, but without the confidential information also being searchable. However, if the native files converted into static images did not need to be redacted, there would be no reason to make them OCR searchable. A producing party could simply produce the “extracted text” from the native files, thus including searchable information for a review database to comply with the Federal Rules of Civil Procedure.
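The workflow I am describing can be sketched as a simple decision rule. This is only an illustration of the reasoning above, not a description of any review platform: the `Document` class, its flags, and the step names are all hypothetical, and running OCR on scanned paper is my assumption about common practice rather than something the opinion spells out.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical record for one item in a production set."""
    name: str
    is_paper: bool = False          # exists only on paper
    needs_redaction: bool = False   # privileged or confidential content
    image_required: bool = False    # parties agreed to static images


def production_plan(doc: Document) -> list[str]:
    """Return the processing steps for a document, cheapest path first."""
    if doc.is_paper:
        # Paper has no text layer, so scanning plus OCR is the only route
        # to searchable text (assumption: the paper should be searchable).
        return ["scan to image", "OCR the image"]
    if doc.needs_redaction:
        # OCR runs after redaction so the redacted text is not searchable.
        return ["convert to TIFF or PDF", "redact", "OCR the redacted image"]
    if doc.image_required:
        # No redaction: the native file's own text is already available.
        return ["convert to TIFF or PDF", "produce extracted text"]
    # Default: native files are already searchable ESI.
    return ["produce native file"]


for doc in (
    Document("contract.docx"),
    Document("memo.docx", needs_redaction=True),
    Document("ledger.xlsx", image_required=True),
    Document("handwritten-notes", is_paper=True),
):
    print(doc.name, "->", production_plan(doc))
```

Notice that no branch prints a native file to paper and scans it back in, which is the step that made the Court's sentence read so strangely.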

Josh Gilliland is a California attorney who focuses his practice on eDiscovery. Josh is the co-creator of The Legal Geeks, which made the ABA Journal Blawg 100 from 2013 to 2016 and was nominated for Best Podcast for the 2015 Geekie Awards. Josh has presented at legal conferences and comic book conventions across the United States. He also ties a mean bow tie.
