As a Document Lifecycle Evangelist, Chuck advocates best practices and tools covering the complete document lifecycle. For over 35 years in legal IT, Chuck has worked for and with all sorts of professional practices, from sole practitioner to global enterprise. Whether as a developer, trainer, business process analyst, or consultant, he has served as a translator between business users and technical professionals, facilitating solutions that meet the practical needs of professional users and support staff and that give IT departments the opportunity to make a real difference for their end users.
A true story about metadata in embedded images, and unintended consequences:
Back in 2012, John McAfee (of McAfee anti-virus fame) was holed up in a Central American jungle, hiding from a variety of government investigations as a “person of interest” (drugs, guns, money, and murder). On social media, he regularly taunted the Central American authorities: “catch me if you can, but you can’t, because I cover my tracks,” that sort of thing. For weeks, the police hadn’t been able to locate him.
McAfee had a wide social media following, including an admiring blogger who asked if he could interview him in person. McAfee agreed and, during the interview, let the blogger take a selfie with him – there was nothing in the picture itself that revealed his location. The blogger published the interview with the snapshot, McAfee got an ego-hit of extra exposure, and the local authorities downloaded the photo from the blogger’s website. Just for kicks, they looked at the photo’s metadata and hit the jackpot: the blogger had location services turned on when he took the picture, so the GPS coordinates were stored in the image file. Shortly afterward, McAfee was in custody, exposed by GPS coordinates in an image embedded in a blog.
Image Metadata is Good Metadata – Sometimes
Whenever you take a picture, the device you’re using adds metadata to the image file: the time and date when you click the shutter, information about the camera and, if location services are enabled, the GPS coordinates where the picture was taken. Some devices also record copyright information that asserts ownership of the image. Editing software, like Lightroom or Photoshop, enables adding and changing metadata in images.
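To make "hidden" metadata more concrete, here is a minimal Python sketch of where it lives in a JPEG file. A JPEG starts with a series of marked segments before the compressed picture data, and metadata typically sits in a few of them: APP1 (Exif or XMP), APP13 (Photoshop/IPTC), and COM (comments). The function names and the synthetic sample bytes below are illustrative assumptions, not part of any product or library, and the sketch only detects these segments rather than decoding their contents.

```python
import struct

# Marker bytes for JPEG segments that commonly carry metadata.
APP1 = 0xE1   # Exif or XMP
APP13 = 0xED  # Photoshop / IPTC
COM = 0xFE    # free-text comment
SOS = 0xDA    # start of scan: compressed image data follows


def iter_segments(data):
    """Yield (marker, payload) for each header segment of a JPEG byte string."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker: skip it
            pos += 1
            continue
        if marker == SOS:   # header segments end here
            return
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def metadata_kinds(data):
    """Return labels for the metadata segments found in a JPEG byte string."""
    kinds = []
    for marker, payload in iter_segments(data):
        if marker == APP1 and payload.startswith(b"Exif\x00\x00"):
            kinds.append("Exif")
        elif marker == APP1 and payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
            kinds.append("XMP")
        elif marker == APP13 and payload.startswith(b"Photoshop 3.0\x00"):
            kinds.append("IPTC/Photoshop")
        elif marker == COM:
            kinds.append("Comment")
    return kinds


def make_segment(marker, payload):
    """Build one JPEG segment (hypothetical helper for the synthetic sample)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic, non-displayable JPEG skeleton used only to exercise the parser.
sample = (
    b"\xff\xd8"
    + make_segment(0xE0, b"JFIF\x00")
    + make_segment(APP1, b"Exif\x00\x00" + b"placeholder TIFF data")
    + make_segment(COM, b"taken on holiday")
    + b"\xff\xda\x00\x02"
    + b"\xff\xd9"
)
print(metadata_kinds(sample))  # ['Exif', 'Comment']
```

With a real photo, the same function could be called on the file's bytes, for example `metadata_kinds(open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder path. GPS coordinates, when present, are stored inside the Exif segment.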
Usually that extra (hidden) information is innocuous, even useful: photographers and designers often leverage that information when editing images; social media platforms use it for tagging and socializing. In the design and social spheres, it’s almost irresponsible not to retain and work with image metadata.
Legal Responsibility and Image Metadata
Lawyers work under different, stricter requirements when it comes to metadata: preserve only what’s required to serve their clients’ needs, legal requirements, and professional responsibilities – delete everything else. Not coincidentally, that approach aligns with good data governance practice. It applies to anything they share – documents, images, images embedded in documents – wherever and however they share it.
That’s why professional firms routinely scrub metadata from files that leave the organization as email attachments or that are saved to removable media or cloud storage. Many firms clean metadata from internal documents as well. A clearly documented, standardized approach to metadata cleaning provides insurance against inadvertent data leakage, and serves as a defense if discovery questions arise.
Embedded Image Metadata: Expect the Unexpected
When you insert a photo in an MS Office document using Insert > Pictures, the Office application automatically makes a copy of the image, renames it generically (“image1.jpg”), and deletes the image metadata.
However, PDF editing applications do not delete metadata when embedding images. In addition, applications that generate Word, Excel, or PowerPoint files automatically, bypassing the Office application UI, usually leave image metadata intact when embedding images directly in an Office file.
So don’t expect metadata-free images in any document; always assume embedded images retain some sort of metadata, and handle them accordingly. The metadata in embedded images may not pose any threat; then again, it might prove valuable in an investigation or discovery scenario.
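One way to check rather than assume: modern Word, Excel, and PowerPoint files (.docx, .xlsx, .pptx) are ZIP archives, and embedded pictures are stored in a media folder inside the archive. The Python sketch below lists embedded JPEGs that appear to carry an Exif block. It is a rough heuristic under stated assumptions: the function name is hypothetical, it looks only at JPEG files, and it simply searches the first 64 KB of each image for the Exif identifier instead of parsing the file properly.

```python
import io
import zipfile

# Folders where Word, Excel, and PowerPoint packages keep embedded media.
MEDIA_DIRS = ("word/media/", "xl/media/", "ppt/media/")


def images_with_exif(office_file):
    """Return names of embedded JPEGs that appear to contain an Exif block.

    `office_file` may be a path or a binary file-like object.
    """
    flagged = []
    with zipfile.ZipFile(office_file) as archive:
        for name in archive.namelist():
            is_media = name.startswith(MEDIA_DIRS)
            is_jpeg = name.lower().endswith((".jpg", ".jpeg"))
            if is_media and is_jpeg:
                # Heuristic: the Exif identifier normally sits near the start.
                head = archive.read(name)[:65536]
                if b"Exif\x00\x00" in head:
                    flagged.append(name)
    return flagged


# Build a tiny in-memory stand-in for a .docx to exercise the function.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as package:
    package.writestr("word/document.xml", "<w:document/>")
    package.writestr(
        "word/media/image1.jpg",
        b"\xff\xd8\xff\xe1\x00\x10Exif\x00\x00" + b"\x00" * 8 + b"\xff\xd9",
    )
    package.writestr("word/media/image2.jpeg", b"\xff\xd8\xff\xd9")
demo.seek(0)

print(images_with_exif(demo))  # ['word/media/image1.jpg']
```

The same call should work on a real file path, such as `images_with_exif("contract.docx")`, where the filename is a placeholder. An empty result from this heuristic does not prove an image is clean, since XMP, IPTC, and metadata in other image formats are not checked here.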
Complete Metadata Cleaning
To protect their clients, their firms, and themselves, professionals need a metadata cleaning solution that provides extended insurance against data leakage.
Many metadata cleaning applications focus on documents (Word, Excel, PowerPoint, and PDF), but ignore images. Very few solutions clean metadata in images attached to emails.
It’s essential to confirm that your metadata solution covers three cases: images embedded inside documents (Word, Excel, PowerPoint, PDF), which must be cleaned while remaining intact in the document; images embedded in an email body; and images attached to an email as separate files. Out of the box, Metadact’s default cleaning profile deletes all image metadata wherever images are found. Metadact’s configurable cleaning profiles allow firms and users to adjust cleaning settings, including leaving image metadata in place if desired.
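As a generic illustration of what "cleaning the metadata while leaving the image intact" can mean for a JPEG, the Python sketch below copies a file's bytes while dropping the segments that usually hold metadata: APP1 (Exif/XMP), APP13 (Photoshop/IPTC), and COM (comments). This is not how Metadact or any particular product works; the function name, the marker selection, and the synthetic sample are assumptions made for the example.

```python
import struct

# Segments dropped by this sketch: APP1 (Exif/XMP), APP13 (IPTC), COM.
METADATA_MARKERS = {0xE1, 0xED, 0xFE}
SOS = 0xDA  # start of scan: everything after this is picture data


def strip_jpeg_metadata(data):
    """Return a copy of a JPEG byte string without its metadata segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker: safe to drop
            pos += 1
            continue
        if marker == SOS:
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in METADATA_MARKERS:
            out += data[pos:end]  # keep everything needed to decode the image
        pos = end
    out += data[pos:]  # compressed picture data is copied unchanged
    return bytes(out)


def make_segment(marker, payload):
    """Build one JPEG segment (hypothetical helper for the synthetic sample)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic, non-displayable JPEG skeleton used only to exercise the function.
soi = b"\xff\xd8"
app0 = make_segment(0xE0, b"JFIF\x00")
exif = make_segment(0xE1, b"Exif\x00\x00" + b"placeholder GPS data")
dqt = make_segment(0xDB, b"\x00" * 8)
scan = b"\xff\xda\x00\x02" + b"\x12\x34" + b"\xff\xd9"

sample = soi + app0 + exif + dqt + scan
cleaned = strip_jpeg_metadata(sample)
print(b"Exif" in sample, b"Exif" in cleaned)  # True False
```

Because the picture data is copied byte for byte, the image is not recompressed. One trade-off to be aware of: the Exif block also holds the orientation flag, so a photo that relied on it may display rotated after cleaning. Color profiles stored in APP2 segments are left in place by this sketch.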
Cleaning embedded images provides an additional level of protection against unknown, unexpected metadata. It satisfies every professional’s responsibility to share only what they intend to share and nothing more, giving everyone greater peace of mind.