BOSTON – When the New England Journal of Medicine used a word-processing function to reveal that Merck & Co. had deleted study data about Vioxx and heart attacks, the pharmaceutical giant joined a long line of organizations bitten by information lurking in electronic files.
Metadata is data about data. A word-processing document, for instance, has metadata on who authored it, when someone saved it and what that person did to it. Microsoft Corp.’s Word has a “track changes” feature that preserves a file’s original text and shows another person’s edits. All that is metadata.
But because metadata doesn’t show up when a document is printed and doesn’t appear on screen in normal settings, it’s easy to forget about.
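To make this concrete, here is a minimal sketch of how hidden author and timestamp fields can be read from a file. It assumes the newer zip-based .docx format rather than the older binary .doc format, and the function name read_core_metadata is hypothetical, chosen for illustration only. It is not the method the New England Journal used, and it does not look at tracked changes, which are stored elsewhere in the file.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml inside a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly label -> element that holds the value (labels are our own choice).
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_metadata(docx):
    """Return author and timestamp metadata from a .docx path or file object."""
    # A .docx file is a zip archive; core properties live in one XML part.
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        node = root.find(tag, NS)
        # Skip fields that are absent or empty in this document.
        if node is not None and node.text:
            found[label] = node.text
    return found
```

Calling read_core_metadata("report.docx") on a real file (the filename here is a placeholder) would return a dictionary of whichever of these fields the document carries, none of which appear on the printed page.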
In the Merck case, the company said the Vioxx data that was uncovered by the New England Journal had been deleted merely because the heart attacks in question came after a cutoff date for collecting information in the study.
Meanwhile, the next generation of Microsoft’s widely used Office software, which includes Word, Excel and PowerPoint, will make it simpler to strip metadata from files before they are disseminated.
Even so, Gartner analyst Michael Silver says the problem will remain: Metadata will exist in documents unless users make a point of getting rid of it.
Automated tools to help protect against metadata releases have existed for a while, but they are beginning to see wider use.
For example, Workshare sells a product called Trace (a free version can be downloaded) that scans documents for metadata and ranks the findings by risk level.
For most of Workshare’s six years in existence, the company’s customers were primarily lawyers, who are particularly sensitive about client information escaping to the opposing side. But in the past year, Workshare has seen business expand to 60 percent of the Fortune 1000, CEO Joe Fantuzzi said.