1. Good article. I concur that the production of a jointly agreed-upon list of metadata fields is usually the most appropriate production format. (I would exclude “embedded data” such as Track Changes, cell formulae, cell comments, and slide notes from this discussion. Embedded data is not metadata in any respect, but is instead simply document content that can be toggled on or off as an application design feature.)

    I suspect that many parties resist the generalized production of metadata because it can include hundreds of metadata fields for a typical collection of corporate files, virtually all of which contain no information, inaccurate information, or non-relevant information. Unfortunately, it’s difficult to determine which fields are being used by each workgroup within your client’s corporation, how they are being used (and whether usage patterns changed during the period at issue), how accurate that usage appears to be, and the degree to which one or more of these fields has relevance, however tangential, to the issues in the matter.

    Clients aren’t standing in line to throw more budget and time at discovery. It takes a lot of both to conduct a thorough data mapping exercise to profile potential data sources subject to discovery, and it’s rarely done. Expecting legal teams to go beyond data mapping to profile the dizzying array of metadata fields for each common application (Word, Excel, PowerPoint, Acrobat, SAP, Facebook, Twitter, Instagram, Snapchat, LinkedIn, Visio, Project, Bloomberg mail, Lync, SameTime, SMS text messages, etc.), not to mention an even larger number of databases and file types, is not practical, especially considering the very low probability that those fields will contain probative information.

    The major cost factor, though, is attorney review. For every additional metadata field (however empty or meaningless) included in the production, the producing party has to incur the expense of reviewing that field multiple times to ensure it’s not missing something that might bite it later on. This is the primary reason for most parties’ reluctance to automatically produce “[all] metadata”.

    Reaching an equitable compromise is not that difficult. It’s a simple exercise to list the “basic” metadata fields for the common data types (.doc, .xls, .ppt, .msg, etc.) that should be included in most productions, along with a secondary list of metadata fields that should be evaluated for inclusion based on certain issue criteria. The requesting party (as in the 7-Eleven case) should be required to justify the inclusion of more than a handful of purely “bibliographic coding” metadata fields, and the producing party should likewise be allowed to exclude otherwise “standard” production metadata based on a showing that such fields are either not used by the producing party or are consistently inaccurate (for the usual well-established reasons).

    There are a lot of metadata fields out there, mostly unused and inaccurate. As my favorite eDiscovery pundit, Gertrude Stein, put it: When you’re talking about metadata, there’s not a lot of there, there.

  2. Great post, Josh, and very timely. I recently wrote about a trade secret case (Selectica v. Novatus) where files stored in a personal cloud account were deleted (along with their metadata). The court wrote:

    “In most cases, metadata is unlikely to have any evidentiary value to the parties. But in cases involving the alleged theft and misuse of electronically stored information, the parties may very well utilize metadata to establish their claims and defenses.”

    Interesting juxtaposition.