A very small PDF for testing

For testing, I often prioritize small data sets because I find them easier to comprehend. A small data set also emphasizes what is being tested, since it isn’t cluttered with irrelevant data points.

Today I needed a PDF for testing. Following the same principles, I sought out “the smallest (valid) PDF” and ended up on StackOverflow.

In most cases, the size of a PDF probably doesn’t matter much, but I needed a PDF base64-encoded in a JSON file. A huge PDF would not fit on my screen and would force me to scroll, making it harder to get an overview of the JSON file in its entirety.
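As a rough illustration of that setup, here is a minimal Python sketch that stores a PDF as a base64 string inside a JSON object and reads it back. The function names, the `document` key and the placeholder bytes are my own assumptions for the example, not part of any particular application.

```python
import base64
import json


def embed_pdf_in_json(pdf_bytes: bytes, key: str = "document") -> str:
    """Return a JSON object (as text) holding the PDF as a base64 string."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return json.dumps({key: encoded}, indent=2)


def extract_pdf_from_json(payload: str, key: str = "document") -> bytes:
    """Decode the base64 field of the JSON object back into raw bytes."""
    return base64.b64decode(json.loads(payload)[key])


# Placeholder bytes standing in for a real file; this is NOT a valid PDF.
sample = b"%PDF-1.4\n% placeholder body\n%%EOF\n"
payload = embed_pdf_in_json(sample)
print(payload)
```

Base64 emits four characters for every three input bytes, so a file of 630 bytes, for instance, becomes 840 characters of JSON text. That is why the size of the PDF matters for readability here.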

Back at StackOverflow, specifically at the question “What is the smallest possible valid PDF?”, I found several suggestions, but none that would both show text and work with the default PDF viewer (Evince) on my Fedora Linux laptop.

Having my test PDF contain visible text was important to me. A blank page can easily be a symptom of something not working, so using a blank page for testing could potentially hide actual errors.

I started searching for the PDF specification, pondering if I could write one by hand. I quickly got discouraged because I didn’t want to put too much time and effort into something so “unimportant”. But just as I was about to settle for something slightly larger, I stumbled over PrintMyFolders and its owner: Steve.

He provided a PDF sample from which I was able to easily create an A4 document with the (almost) centered text “Dummy PDF”. Not only was the test PDF usable by my application, it also worked flawlessly in Firefox, Chrome and Evince.

For a similar file, Inkscape would create a 5.2 kilobyte PDF file, while LibreOffice Writer would create a 7.8 kilobyte file.

At 630 bytes, my PDF is of course larger than the other StackOverflow examples, but still very small. The slightly larger size seems like a small price for the satisfaction of a well-working test PDF - worth the compromise.

It is even fairly easy to edit the text using just a normal text editor. PDF viewers don’t seem picky about it, but the number after startxref is supposed to give the byte offset at which xref (the cross-reference table) starts, and changing the text length shifts that offset.
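If you would rather not count bytes by hand after an edit, the offset can be recomputed. The sketch below is only an illustration under a few assumptions: the file has a single classic xref table, the `xref` keyword sits at the start of a line ended by `\n`, and the function name `fix_startxref` and the toy bytes are hypothetical, not the actual test PDF.

```python
import re


def fix_startxref(pdf_bytes: bytes) -> bytes:
    """Rewrite the number after 'startxref' to the byte offset of 'xref'."""
    # Find the 'xref' keyword at the start of a line; 'startxref' does not
    # match because its line starts with 's'.
    match = re.search(rb"(?m)^xref\s", pdf_bytes)
    if match is None:
        raise ValueError("no xref table found")
    offset = str(match.start()).encode("ascii")
    # Replace only the digits following the startxref keyword. This cannot
    # move the table itself, because startxref comes after xref in the file.
    return re.sub(
        rb"(startxref\s+)\d+",
        lambda m: m.group(1) + offset,
        pdf_bytes,
        count=1,
    )


# Toy PDF-like structure with a deliberately wrong startxref value (0).
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"xref\n0 2\n"
    b"0000000000 65535 f \n"
    b"0000000009 00000 n \n"
    b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
    b"startxref\n0\n%%EOF\n"
)
fixed = fix_startxref(sample)
```

This sketch only touches the startxref value. The per-object offsets inside the xref table and the `/Length` of an edited stream are left as they are.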


… and as an inline “data” link to the PDF.
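For reference, such a “data” link can be built from the bytes of any PDF. The following is a small sketch, assuming the usual `data:` URI form with the `application/pdf` media type; the function name `pdf_data_uri` and the placeholder bytes are made up for the example.

```python
import base64


def pdf_data_uri(pdf_bytes: bytes) -> str:
    """Build a data URI that carries the PDF inline as base64."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return "data:application/pdf;base64," + encoded


# Placeholder bytes standing in for a real file; this is NOT a valid PDF.
uri = pdf_data_uri(b"%PDF-1.4\n% placeholder body\n%%EOF\n")
print(uri)
```

The resulting string can be used as the `href` of a link, which keeps a small test PDF and the page that links to it in a single file.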

Happy testing!