Tools and techniques for redacting files that contain personal information
When source documents or files contain sensitive information, it is often necessary to remove the sensitive components whilst retaining the useful information. Examples where this might be necessary include:
- MRI Safety request medical history
- Brain volumes (in particular high resolution structurals)
- Subject response forms
There are several techniques available depending on the type of data; a small selection appears here.
Redacting PDFs - software and techniques
On macOS 11+ (Big Sur onwards) you can use the Preview tool for simple redaction. It is not always able to fully redact a document: for example, if text extends beyond the left or right edge of a page, it may remain after saving, because Preview can only remove what you can see on screen. If the redacted document is to be shared externally, this may not be adequately secure and you should consider converting it to a PNG file. Be aware that this may not be acceptable on accessibility grounds, as the text will no longer be easily machine-readable.
To use Preview's tool, open the PDF and choose Tools > Redact. You will get a warning that the content you redact will be permanently removed; if you need to retain an unredacted copy, ensure you are editing a copy of the original file. Click OK. Now select the text to redact by dragging over it; the text will be covered by a black rectangle with crosses. You can undo and edit this until you save the document.
For ultimate confidence that the text is gone you can use File > Export... to save to a PNG.
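Because Preview can leave off-screen text behind, it can be worth checking a redacted PDF for leftover text before sharing it. The following is a minimal sketch only: it assumes the pdftotext command-line tool (part of Poppler, not mentioned elsewhere on this page) is installed, and the function names find_leaks, extract_pdf_text and check_pdf are hypothetical helpers introduced for illustration.

```python
import shutil
import subprocess


def find_leaks(extracted_text, sensitive_terms):
    """Return the sensitive terms that still occur in the extracted text.

    The comparison ignores case. An empty result does not prove the file
    is clean: text held in images, or split oddly by the extractor, will
    not be found this way.
    """
    haystack = extracted_text.casefold()
    return [term for term in sensitive_terms if term.casefold() in haystack]


def extract_pdf_text(pdf_path):
    """Extract the text layer of a PDF with pdftotext (assumed installed)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    # The "-" argument asks pdftotext to write the text to standard output.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def check_pdf(pdf_path, sensitive_terms):
    """Return the sensitive terms still extractable from the PDF."""
    return find_leaks(extract_pdf_text(pdf_path), sensitive_terms)
```

For example, check_pdf("redacted_copy.pdf", ["Jane Doe"]) (a hypothetical file name and term) returns an empty list when the term can no longer be extracted, and a list containing the term when it can.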
Windows/macOS Adobe Acrobat DC
The subscription-ware Adobe Acrobat tool can be used to redact content - see https://www.adobe.com/acrobat/hub/how-to/how-to-black-out-text-in-a-pdf-file.
Subscriptions to Adobe Acrobat can be purchased via WIN IT help.
The LibreOffice Draw program can be used to redact PDFs. The output from this process may not be identical to the original document: the redacted document is an image rather than text, so the text layout may not be as good as the original.
To redact content, open the PDF in Draw and use Tools > Redact > Rectangle (or Freeform) to blank out the areas of the document to be redacted. The redacted version can then be generated using Tools > Redact > Export Redacted PDF (black) (or white).
Techniques for redacting MRI volumes
High-resolution structural MRI volumes can potentially be used to identify the individual they depict through facial recognition techniques. With the rise of machine learning techniques and the plethora of images on social media, the automated identification of individuals is now practical. Any data that is being shared, or stored/processed on platforms shared with others, should have as many identifying features removed as is feasible without preventing its use in the intended processing pipelines.
The minimum recommended de-identification step is the removal of the facial features and the ear canals. If the skull volume is not required for the processing pipeline, skull stripping should be carried out; with FSL you would use BET for this.
Where the skull volume is required, you should take steps to de-face the volume. FSL 6.0.3+ includes the tool fsl_deface, which can be used for this purpose. A summary of a range of tools and their effectiveness can be found in this paper https://www.sciencedirect.com/science/article/pii/S1053811921001221.
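The choice between the two FSL tools described above can be sketched as a small wrapper. This is an illustrative sketch only: it assumes FSL is installed with bet and fsl_deface on the PATH, uses only their basic "input output" invocation with default settings, and the function names build_deidentify_command and deidentify are hypothetical. Always inspect the output volume visually, as the default settings may not suit every image.

```python
import shutil
import subprocess


def build_deidentify_command(input_path, output_path, skull_required):
    """Build the FSL command line for de-identifying a structural volume.

    If the skull is needed downstream, de-face with fsl_deface;
    otherwise skull-strip with BET. Both are called with defaults.
    """
    if skull_required:
        return ["fsl_deface", input_path, output_path]
    return ["bet", input_path, output_path]


def deidentify(input_path, output_path, skull_required):
    """Run the chosen FSL tool and return the command that was executed."""
    command = build_deidentify_command(input_path, output_path, skull_required)
    if shutil.which(command[0]) is None:
        raise RuntimeError(command[0] + " was not found on PATH; is FSL set up?")
    # check=True raises CalledProcessError if the tool reports a failure.
    subprocess.run(command, check=True)
    return command
```

For example, deidentify("T1.nii.gz", "T1_defaced.nii.gz", skull_required=True) (hypothetical file names) would run fsl_deface, whereas passing skull_required=False would run bet instead.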
Redacting bitmap images
Bitmap images are relatively easy to redact: paint over the section you need to hide and save the file.
You should be aware that some image formats (in particular TIFF and bitmap editors' native formats) support layers - the redaction painting may be placed in a layer above the original content and thus this will be recoverable. If you need to share in one of these formats make sure that the image is 'flattened' before you save it for sharing.
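To see why flattening matters, here is a toy model of a layered image in plain Python. It is not a real image format or editor API: each layer is a grid of RGB tuples with None standing in for a transparent pixel, and the names flatten, original and redaction are made up for illustration. In practice you would use your image editor's own flatten or export command.

```python
def flatten(layers):
    """Composite layers (bottom layer first) into a single pixel grid.

    A pixel value of None means "transparent", so the layer beneath
    shows through. Only the topmost opaque value survives per pixel.
    """
    height = len(layers[0])
    width = len(layers[0][0])
    flat = [[None] * width for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    flat[y][x] = layer[y][x]
    return flat


WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
SECRET = (200, 30, 30)  # stands in for the sensitive content

# Bottom layer: the original image, including the sensitive pixel.
original = [[WHITE, SECRET, WHITE],
            [WHITE, WHITE, WHITE]]
# Top layer: a black redaction mark painted over the sensitive pixel.
redaction = [[None, BLACK, None],
             [None, None, None]]

# Saved with layers, the file still carries the original pixel underneath.
layered_file = [original, redaction]
# Saved flattened, only the painted-over result remains.
flattened = flatten(layered_file)
```

In the layered version the SECRET value is still stored in the bottom layer and could be recovered by hiding the top layer; in the flattened version it has been replaced by BLACK and is no longer present.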