On the Freedom to Tinker blog, Timothy Lee writes that he has been conducting detailed research on documents uploaded to PACER where the “parties tried to redact sensitive information but the redactions failed for technical reasons.” Lee explains that the problem is rooted in how the PDF format stores a document: it uses “vector graphics,” which represent a page as a series of drawing commands such as lines, rectangles and lines of text. While vector graphics have various advantages, Lee says they have at least one significant disadvantage: they may contain more information than is visible to the naked eye because they can have multiple “layers.” As a result, a PDF document may appear to have a black rectangle blocking out text while the text still exists under the box, where it can often be recovered with a simple copy and paste. Working from a collection of 1.8 million PACER documents, Lee identified approximately 2,000 documents with redaction rectangles. Of these redacted documents, Lee found 194 with “failed redactions,” mainly from commercial litigation in which the parties attempted to redact trade secrets, medical information, addresses, dates of birth, witness names, juror information, and more. Based on this study,…
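To make the "text under the box" problem concrete, here is a minimal sketch in Python. It is not Lee's method and does not parse real PDF files. It assumes a toy model in which a page is just a list of text runs and filled rectangles, each with a bounding box. The names `Box`, `TextRun`, `FilledRect` and `find_failed_redactions`, and the sample page contents, are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in page coordinates (hypothetical model)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def covers(self, other: "Box") -> bool:
        """Return True when this box fully contains the other box."""
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and self.x1 >= other.x1 and self.y1 >= other.y1)


@dataclass(frozen=True)
class TextRun:
    """A 'draw this text here' command; the string stays in the file."""
    text: str
    box: Box


@dataclass(frozen=True)
class FilledRect:
    """A 'draw a filled rectangle here' command."""
    box: Box
    color: str


def find_failed_redactions(text_runs, rects):
    """Return text runs that a black rectangle hides visually but that
    are still present as text in the page's drawing commands."""
    black = [r for r in rects if r.color == "black"]
    return [t for t in text_runs if any(r.box.covers(t.box) for r in black)]


# Toy page with invented content: the second text run sits entirely
# under a black rectangle, so it looks redacted but is still stored.
page_text = [
    TextRun("Plaintiff's date of birth is", Box(72, 700, 230, 712)),
    TextRun("01/02/1970", Box(235, 700, 300, 712)),
]
page_rects = [FilledRect(Box(233, 698, 302, 714), "black")]

exposed = find_failed_redactions(page_text, page_rects)
print([t.text for t in exposed])
```

In this model the rectangle only changes what is drawn on top; the covered string is still in the list of commands, which is why selecting and copying the region can reveal it. A check against real documents would need a PDF parsing library to extract text positions and rectangle fills, and would have to handle partial overlaps and non-black fills, which this sketch ignores.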
