A very small PDF for testing

When testing, I tend to prefer small data sets because they are easier to comprehend. A small data set also emphasizes what is being tested, since it isn’t cluttered with irrelevant data points.

Today I needed a PDF for testing. Following the same principles, I sought out “the smallest (valid) PDF” and ended up on StackOverflow.

In most cases, the size of the PDF probably doesn’t matter much, but I needed a base64-encoded PDF inside a JSON file. A huge PDF would not fit on my screen and would force me to scroll, making it harder to get an overview of the entire JSON file.
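To illustrate the kind of JSON file I mean, here is a minimal sketch in Python. The field names "filename" and "content" and the helper names are made up for this example and are not taken from my actual application. Base64 grows the data by roughly a third, which is why the size of the PDF shows up so clearly in the JSON.

```python
import base64
import json


def pdf_to_json(pdf_bytes: bytes) -> str:
    """Wrap raw PDF bytes in a small JSON document as base64 text."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    # "filename" and "content" are hypothetical field names for illustration.
    return json.dumps({"filename": "dummy.pdf", "content": encoded}, indent=2)


def pdf_from_json(document: str) -> bytes:
    """Read the base64 field back out and return the raw PDF bytes."""
    return base64.b64decode(json.loads(document)["content"])
```

A round trip through both functions should return exactly the bytes that went in, which makes for an easy sanity check in a test.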

Back at StackOverflow, specifically at the question “What is the smallest possible valid PDF?”, I found several suggestions, but none that would both show text and work with the default PDF viewer (Evince) on my Fedora Linux laptop.

Having my test PDF contain visible text was important to me. A blank page can easily be a symptom of something not working, so using a blank page for testing could potentially hide actual errors.

I started searching for the PDF specification, pondering if I could write one by hand. I quickly got discouraged because I didn’t want to put too much time and effort into something so “unimportant”. But just as I was about to settle for something slightly larger, I stumbled over PrintMyFolders and its owner: Steve.

He provided a PDF sample from which I was able to easily create an A4 document with the (almost) centered text “Dummy PDF”. Not only was the test PDF usable by my application, it also worked flawlessly in Firefox, Chrome and Evince.

For a similar file, Inkscape would create a 5.2 kilobyte PDF file, while LibreOffice Writer would create a 7.8 kilobyte file.

At 630 bytes, my test PDF is of course larger than the other StackOverflow examples, but it is still very small. The slightly larger size seems like a small price to pay for the satisfaction of a well-working test PDF, so it is worth the compromise.

It is even fairly easy to edit the text using just a normal text editor. PDF viewers don’t seem picky about it, but startxref is supposed to hold the byte offset at which the xref (cross-reference) table starts, and changing the length of the text shifts that offset.
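If you want the startxref value to be correct after editing, it can be recomputed rather than counted by hand. The following is a rough sketch, assuming a simple file with LF line endings and a single classic xref table; fix_startxref is a name I made up. It only rewrites the number after startxref and leaves everything else (such as the stream /Length and the offsets inside the xref table) untouched.

```python
import re


def fix_startxref(pdf: bytes) -> bytes:
    """Point the startxref value at the byte offset of the 'xref' keyword."""
    # '^' with MULTILINE only matches at the start of a line, so the
    # 'xref' inside 'startxref' is not picked up here.
    matches = list(re.finditer(rb"^xref\b", pdf, flags=re.MULTILINE))
    if not matches:
        raise ValueError("no xref table found")
    offset = str(matches[-1].start()).encode("ascii")
    # The number comes after the xref table, so changing its length
    # does not move the table itself.
    return re.sub(rb"(startxref\s+)\d+", lambda m: m.group(1) + offset, pdf)
```

Running an edited file through this function before saving it keeps the trailer consistent with where the table actually sits.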

The base64 encoded version: JVBERi0xLjQKMSAwIG9iago8PC9UeXBlIC9DYXRhbG9nCi9QYWdlcyAyIDAgUgo+PgplbmRvYmoK MiAwIG9iago8PC9UeXBlIC9QYWdlcwovS2lkcyBbMyAwIFJdCi9Db3VudCAxCj4+CmVuZG9iagoz IDAgb2JqCjw8L1R5cGUgL1BhZ2UKL1BhcmVudCAyIDAgUgovTWVkaWFCb3ggWzAgMCA1OTUgODQy XQovQ29udGVudHMgNSAwIFIKL1Jlc291cmNlcyA8PC9Qcm9jU2V0IFsvUERGIC9UZXh0XQovRm9u dCA8PC9GMSA0IDAgUj4+Cj4+Cj4+CmVuZG9iago0IDAgb2JqCjw8L1R5cGUgL0ZvbnQKL1N1YnR5 cGUgL1R5cGUxCi9OYW1lIC9GMQovQmFzZUZvbnQgL0hlbHZldGljYQovRW5jb2RpbmcgL01hY1Jv bWFuRW5jb2RpbmcKPj4KZW5kb2JqCjUgMCBvYmoKPDwvTGVuZ3RoIDUzCj4+CnN0cmVhbQpCVAov RjEgMjAgVGYKMjIwIDQwMCBUZAooRHVtbXkgUERGKSBUagpFVAplbmRzdHJlYW0KZW5kb2JqCnhy ZWYKMCA2CjAwMDAwMDAwMDAgNjU1MzUgZgowMDAwMDAwMDA5IDAwMDAwIG4KMDAwMDAwMDA2MyAw MDAwMCBuCjAwMDAwMDAxMjQgMDAwMDAgbgowMDAwMDAwMjc3IDAwMDAwIG4KMDAwMDAwMDM5MiAw MDAwMCBuCnRyYWlsZXIKPDwvU2l6ZSA2Ci9Sb290IDEgMCBSCj4+CnN0YXJ0eHJlZgo0OTUKJSVF T0YK
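Turning the string back into a file takes only a few lines. This is a small sketch; decode_pdf and the output name dummy.pdf are my own choices for the example. The spaces in the string come from line wrapping and are removed before decoding.

```python
import base64


def decode_pdf(b64_text: str) -> bytes:
    """Decode a (possibly wrapped) base64 string into PDF bytes."""
    # Drop spaces and newlines introduced by wrapping the string.
    compact = "".join(b64_text.split())
    pdf = base64.b64decode(compact, validate=True)
    if not pdf.startswith(b"%PDF-"):
        raise ValueError("decoded data does not start with a PDF header")
    return pdf


# Example usage (paste the string from above into b64_text):
#   from pathlib import Path
#   Path("dummy.pdf").write_bytes(decode_pdf(b64_text))
```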

… and as an inline “data” link to the PDF.
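Such a “data” link is just the base64 text with a media-type prefix in front. A minimal sketch, with pdf_data_uri being a name I picked for the example:

```python
import base64


def pdf_data_uri(pdf_bytes: bytes) -> str:
    """Build a data: URI that embeds the PDF directly in a link."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return "data:application/pdf;base64," + encoded
```

The result can be used as the href of a link. Whether a browser opens it inline or offers it as a download depends on the browser.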

Happy testing!