Metadata removal tool
A metadata removal tool, or metadata scrubber, is a type of privacy software that protects its users' privacy by removing potentially privacy-compromising metadata from files before they are shared with others, e.g., by sending them as e-mail attachments or by posting them on the Web.[1][2]
Overview
Metadata can be found in many types of files, such as documents, spreadsheets, presentations, images, and audio files. It can include information such as details about the file's authors, file creation and modification dates, geographical location, document revision history, thumbnail images, and comments.[3] Metadata may be added to files by users, but some metadata is often added automatically, without user intervention, by authoring applications or by the devices used to produce the files.
Since metadata is sometimes not clearly visible in authoring applications (depending on the application and its settings), there is a risk that the user will be unaware of its existence or will forget about it and, if the file is shared, private or confidential information will inadvertently be exposed. The purpose of metadata removal tools is to minimize the risk of such data leakage.[4]
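As an illustration of metadata that an authoring application may record without the user noticing, the following minimal Python sketch lists the core properties stored in an Office Open XML file (such as .docx or .xlsx). It assumes the usual package layout, in which these properties live in a docProps/core.xml part inside the ZIP container; the function name read_core_properties is hypothetical, and files without that part simply yield an empty result.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed location of the core properties part in an Office Open XML package.
CORE_PROPS = "docProps/core.xml"


def read_core_properties(path_or_file) -> dict[str, str]:
    """Return the non-empty core properties (author, dates, ...) of a package."""
    with zipfile.ZipFile(path_or_file) as archive:
        if CORE_PROPS not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(CORE_PROPS))

    props = {}
    for child in root:
        # Tags look like "{namespace}creator"; keep only the local name.
        name = child.tag.rsplit("}", 1)[-1]
        if child.text:
            props[name] = child.text
    return props
```

Calling read_core_properties("report.docx") on a typical document would be expected to reveal fields such as the creator and the last person to modify the file, which are the kind of details a metadata removal tool targets.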
The metadata removal tools that exist today can be divided into four groups:
- Integral metadata removal tools, which are included in some applications, like the Document Inspector in Microsoft Office.
- Batch metadata removal tools, which can process multiple files.
- E-mail client add-ins, which are designed to remove metadata from e-mail attachments just before they are sent.
- Server-based systems, which are designed to automatically remove metadata at the network gateway.
To securely delete the metadata of a PDF file, it is important to linearize the file afterwards; otherwise the changes are reversible and the metadata can be recovered.[5][6]
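The two-step PDF procedure can be sketched in Python as a thin wrapper around the ExifTool and qpdf command-line programs, both of which are assumed to be installed and on the PATH. The qpdf invocation is the one quoted in the ExifTool documentation; the ExifTool options -all:all= (delete all tags) and -overwrite_original (do not keep a backup copy that would still contain the metadata) are choices made for this sketch, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_scrub_commands(src: Path, dst: Path) -> list[list[str]]:
    """Return the commands for scrubbing a PDF: ExifTool first, then qpdf."""
    return [
        # Step 1: delete all metadata tags. ExifTool edits `src` in place, and
        # for PDFs this edit alone is reversible.
        ["exiftool", "-all:all=", "-overwrite_original", str(src)],
        # Step 2: rewrite the file in linearized form so the old data is dropped.
        ["qpdf", "--linearize", str(src), str(dst)],
    ]


def scrub_pdf(src: Path, dst: Path) -> None:
    """Run both steps, raising CalledProcessError if either tool fails."""
    for command in build_scrub_commands(src, dst):
        subprocess.run(command, check=True)
```

Because the first step modifies the input file in place, it is safer to call scrub_pdf on a working copy; only the linearized output file should then be shared.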
Metadata removal tools are also commonly used to reduce file sizes, particularly of image files posted on the Web. For example, a small image on a website may carry as much metadata (including an embedded thumbnail image) as image data, so removing that metadata can halve the file size.
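The size effect can be illustrated with a minimal, standard-library-only Python sketch that drops metadata segments from a JPEG stream. It assumes a simple, well-formed file in which every segment before the image data has a length field, and it treats the APP1 (Exif/XMP), APP13 (Photoshop/IPTC) and COM (comment) segments as metadata; that selection, like the function name strip_jpeg_metadata, is an assumption of this sketch and not the behaviour of any particular tool.

```python
import struct

# Marker bytes treated as metadata in this sketch (an assumption):
# 0xE1 = APP1 (Exif/XMP), 0xED = APP13 (Photoshop/IPTC), 0xFE = COM (comment).
METADATA_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without its metadata segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(data):
            raise ValueError(f"bad segment length at offset {pos}")
        if marker not in METADATA_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")
```

Comparing len(data) with len(strip_jpeg_metadata(data)) for a small photograph gives a direct measure of how much of the file was metadata. The image data itself is copied unchanged, so no recompression takes place.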
References
- Hassan, Nihad; Hijazi, Rami (2 July 2017). Digital Privacy and Security Using Windows: A Practical Guide. Apress. ISBN 978-1-4842-2799-2. Retrieved 12 December 2022.
- "The Many Faces of Fraud" (PDF). LAWPRO Magazine. Lawyers’ Professional Indemnity Company. June 2004. Archived from the original (PDF) on 12 December 2022. Retrieved 12 December 2022.
- "A Guardian guide to your metadata – Interactive Graphic". elearningexamples.com. Elearning Examples. Archived from the original on 5 March 2014.
- O'Reilly, Dennis. "Remove metadata from Office files, PDFs, and images". CNET. Archived from the original on 1 August 2022. Retrieved 12 December 2022.
- "PDF Tags". "All metadata edits are reversible. While this would normally be considered an advantage, it is a potential security problem because old information is never actually deleted from the file. (However, after running ExifTool the old information may be removed permanently using the "qpdf" utility with this command: "qpdf --linearize in.pdf out.pdf".)"
- "exiftool Application Documentation".