A party fighting over not producing metadata is like seeking a protective order to produce a document without ink on the paper. Metadata is part of the native file. The “data about data” is captured during collection and processed for production. A party would have to make an active effort to strip a native file of its metadata, which arguably is, at best, a failure to produce the discovery in the form in which it is ordinarily maintained and, at worst, spoliation of evidence.
Yet these battles still happen.
Consider the following: the producing party in Younes v. 7-Eleven, Inc., brought a motion for a protective order against the production of metadata. The Court denied the protective order and ordered the production of the requested metadata. Younes v. 7-Eleven, Inc., 2015 U.S. Dist. LEXIS 33793, at *2-3 (D.N.J. Mar. 18, 2015).
The Plaintiffs sought the metadata for 38 documents and two Excel spreadsheets to identify “the date of origination, author, custodian, date of each modification and author of each modification, and to the extent available, any data which established to whom the document had been electronically distributed.” Younes, at *11.
The parties had entered a discovery agreement, before any productions were made, under which metadata was not produced. The discovery that was produced did not identify dates of creation, authors, or to whom the files were distributed. Younes, at *10. Depositions to “get to the bottom” of the produced discovery were unsuccessful, with deponents denying the existence of programs discussed in discovery, not answering who created specific documents, or claiming not to have knowledge about projects. Younes, at *11.
The Court explained that under Fed. R. Civ. P. 34(b)(2)(E) a party can request electronically stored information in a specific form with metadata. Younes, at *13. However, some Courts have required a “particularized need” for metadata. Id.
The Court held the Plaintiffs demonstrated a particularized need for metadata, stating:
To the extent it is necessary, plaintiffs have shown a particularized need for the requested metadata. Plaintiffs have demonstrated that many of the paper documents produced to date are missing source, date, and other key background information. This missing information is plainly relevant and discoverable. Further, the requested metadata is relevant to authenticating 7-Eleven’s documents, especially since the authors or creators of some important documents are unknown.
Id.
The Defendant argued against producing metadata for the following reasons:
(1) The parties agreed at the outset of the case that documents need only be produced in PDF format without metadata;
(2) 7-Eleven does not possess much of the requested metadata;
(3) The metadata that is available is “extremely limited, minimally meaningful and potentially misleading”; and
(4) It would be “unreasonably burdensome to require 7-Eleven to reproduce [its] … documents with metadata.”
Younes, at *14.
The Defendant explained that the only metadata it maintained was the embedded metadata in the file preserved by the native application (such as PowerPoint or Excel). Id. As such, the available fields were limited to the document author (which could only be the company name, based on the Microsoft license), the date created, the date of last modification, and the author who made the modification. Younes, at *15.
The Court did not find any of the producing party’s arguments persuasive. Moreover, good cause existed to modify the original agreement to not produce metadata. Younes, at *16. The Court explained:
Had plaintiffs known at the outset of the case the difficulties they would face in obtaining relevant information regarding 7-Eleven’s documents, it is unlikely they would have agreed to forego requesting metadata. The changed circumstances plaintiffs face justify modifying their earlier agreement not to request metadata.
Younes, at *16-17.
The Court ordered the production of metadata for the specifically requested discovery. The Court stated in its holding:
Thus far plaintiffs have undertaken substantial efforts to obtain key information about some of 7-Eleven’s documents without success. This includes basic information such as who prepared a document and when and to whom the document was distributed. There is no justifiable reason to deny plaintiffs this basic discovery that may exist in 7-Eleven’s metadata, especially when 7-Eleven has not offered to voluntarily provide the requested information. 7-Eleven’s suggestion that plaintiffs should question its witnesses at their depositions about the requested information is unnecessary and impractical. Plaintiffs should not have to use their limited deposition time to question witnesses about basic issues such as when scores of documents were prepared, who prepared them, who received them, etc. The requested metadata is unquestionably relevant to important issues in the case. It cannot be gainsaid that plaintiff is entitled to this information.
Magistrate Judge Joel Schneider drove home the pointlessness of not producing metadata that contained the basic information of who did what and when. This information should be produced as a matter of standard operating procedure.
Requested metadata might not contain everything a requesting party desires, but if there is relevant information, it should be produced. Moreover, it is unlikely for a native file to NOT have some type of embedded metadata. Word, PowerPoint, Excel, and similar applications virtually all track some information about the files they create.
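To make the point about embedded metadata concrete, here is a minimal sketch of reading the basic "who and when" fields from a modern Office file (.docx, .xlsx, .pptx). It assumes the file is an Office Open XML package that stores its core properties at the conventional path docProps/core.xml. The function name, the output field names, and the sample values are hypothetical and for illustration only; this is not a description of how any particular processing tool works.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical output names mapped to the elements that hold them.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source):
    """Return basic embedded metadata from a .docx/.xlsx/.pptx file.

    `source` may be a file path or a binary file-like object.
    Assumes the conventional docProps/core.xml location; a missing
    part or element yields None rather than an error.
    """
    with zipfile.ZipFile(source) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {name: None for name in FIELDS}
    root = ET.fromstring(xml_bytes)
    result = {}
    for name, tag in FIELDS.items():
        node = root.find(tag, NS)
        result[name] = node.text if node is not None else None
    return result


def make_sample_package():
    """Build a tiny in-memory package with made-up values for the demo."""
    core_xml = (
        '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}" '
        'xmlns:dcterms="{dcterms}">'
        "<dc:creator>Example Company</dc:creator>"
        "<cp:lastModifiedBy>J. Doe</cp:lastModifiedBy>"
        "<dcterms:created>2014-01-15T09:30:00Z</dcterms:created>"
        "<dcterms:modified>2014-02-01T16:45:00Z</dcterms:modified>"
        "</cp:coreProperties>"
    ).format(**NS)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for field, value in read_core_properties(make_sample_package()).items():
        print(field, "=", value)
```

Note that the author value in the sample is a company name rather than a person, which mirrors the limitation 7-Eleven described. The fields are still there to be read, even if they are not always as informative as a requesting party would like.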
Parties should think of metadata as free objective coding for their review databases. It should not be feared, because the data can be used to expedite document review and leverage analytical tools. The alternative is reducing discovery review back to clicking through files like paper in a box. Such tactics only drive up the cost of discovery, contrary to Federal Rule of Civil Procedure 1’s mandate to handle cases in a cost-effective manner.
Good article. I concur that the production of a jointly agreed-upon list of metadata fields is usually the most appropriate production format. (I would exclude “embedded data” such as Track Changes, cell formulae, cell comments, and slide notes from this discussion. Embedded data is not metadata in any respect, but is instead simply document content that can be toggled on or off as an application design feature.)
I suspect that many parties resist the generalized production of metadata because it can include hundreds of metadata fields for a typical collection of corporate files, virtually all of which contain no information, inaccurate information, or non-relevant information. Unfortunately, it’s difficult to determine which fields are being used by each workgroup within your client’s corporation, how they are being used (and whether usage patterns changed during the period at issue), the apparent accuracy of the usage, and the degree to which one or more of these fields has relevance, however tangential, to the issues in the matter.
Clients aren’t standing in line to throw more budget and time at discovery. It takes a lot of both to conduct a thorough data mapping exercise to profile potential data sources subject to discovery, and it’s rarely done. Expecting legal teams to go beyond data mapping to profile the dizzying array of metadata fields for each common application (Word, Excel, PowerPoint, Acrobat, SAP, Facebook, Twitter, Instagram, Snapchat, LinkedIn, Visio, Project, Bloomberg mail, Lync, SameTime, SMS text messages, etc.), not to mention an even larger number of databases and file types, is not practical, especially considering the very low probability those fields will contain probative information.
The major cost factor though is attorney review. For every additional metadata field (however empty or meaningless) included in the production, the producing party has to incur the expense of reviewing those fields multiple times to ensure they’re not missing something that might bite them later on. This is the primary reason for most parties’ reluctance to automatically produce “[all] metadata”.
Reaching an equitable compromise is not that difficult. It’s a simple exercise to list the “basic” metadata fields for the common data types (.doc, .xls, .ppt, .msg, etc.) which should be included in most productions, and a secondary list of metadata fields that should be evaluated for inclusion based on certain issue criteria. The requesting party (as in the 7-Eleven case) should be required to justify the inclusion of more than a handful of purely “bibliographic coding” metadata fields, and the producing party should likewise be allowed to exclude otherwise “standard” production metadata based on a showing that such fields are either not used by the producing party or are consistently inaccurate (for the usual well-established reasons).
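One way to support a "not used" showing is to measure how often each metadata field is actually populated across a collection. The sketch below is only an illustration under stated assumptions: each record is a plain dictionary of field names to values, the 5% threshold is arbitrary, and the function and field names are hypothetical. It measures whether a field is filled in, not whether its contents are accurate.

```python
from collections import Counter


def field_fill_rates(records):
    """Return {field: fraction of records with a non-blank value}."""
    records = list(records)
    if not records:
        return {}
    filled = Counter()
    fields = set()
    for record in records:
        for field, value in record.items():
            fields.add(field)
            # Treat None and whitespace-only strings as "not populated".
            if value is not None and str(value).strip():
                filled[field] += 1
    return {field: filled[field] / len(records) for field in sorted(fields)}


def split_fields(rates, basic_fields, min_fill=0.05):
    """Sort fields into three lists using a hypothetical fill threshold.

    produce: "basic" fields that are populated often enough to produce
    review:  other populated fields to evaluate against the issues
    unused:  fields populated in fewer than `min_fill` of the records
    """
    produce = [f for f in rates if f in basic_fields and rates[f] >= min_fill]
    review = [f for f in rates if f not in basic_fields and rates[f] >= min_fill]
    unused = [f for f in rates if rates[f] < min_fill]
    return produce, review, unused


if __name__ == "__main__":
    # Made-up records standing in for a processing tool's export.
    sample = [
        {"Author": "Example Company", "DateCreated": "2014-01-15", "Keywords": ""},
        {"Author": "Example Company", "DateCreated": "", "Keywords": None},
    ]
    rates = field_fill_rates(sample)
    print(split_fields(rates, basic_fields={"Author", "DateCreated"}))
```

A report like this gives both sides something objective to negotiate over: the producing party can point to fields that are empty across the collection, and the requesting party can see which populated fields sit outside the basic list.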
There are a lot of metadata fields out there, mostly unused and inaccurate. As my favorite eDiscovery pundit, Gertrude Stein, put it: When you’re talking about metadata, there’s not a lot of there, there.
Great post, Josh, and very timely. I recently wrote about a trade secret case (Selectica v. Novatus) where files stored in a personal cloud were deleted (along with the metadata). The court wrote:
“In most cases, metadata is unlikely to have any evidentiary value to the parties. But in cases involving the alleged theft and misuse of electronically stored information, the parties may very well utilize metadata to establish their claims and defenses.”
Interesting juxtaposition.