Recipe: Converting PDF Documents Into PNG
For a little side project involving DeepZooming documents, I needed to convert PDF documents into images, and with controllable quality at that.
Of course, the program convert (part of the trusty ImageMagick library) can do that:
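A minimal sketch of that call, wrapped in a small shell function. The name pdf_to_png is a made-up helper, not something ImageMagick ships, and paper.pdf is simply the example file used throughout this recipe.

```shell
# Hypothetical helper: convert a PDF into PNG with ImageMagick's defaults.
pdf_to_png() {
    local pdf="$1"                       # input document, e.g. paper.pdf
    local png="${2:-${pdf%.pdf}.png}"    # output name, defaults to paper.png
    # For a multi-page PDF, convert writes paper-0.png, paper-1.png, ...
    convert "$pdf" "$png"
}

# Usage (needs ImageMagick and Ghostscript installed):
#   pdf_to_png paper.pdf
```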
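It will not produce a single PNG, though, but one per page, numbered consecutively. Other formats such as TIFF can hold several images in one file, so there you have to ask explicitly for multiple files with +adjoin (note the plus sign: -adjoin, which is the default, joins all pages into one file).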
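For a multi-image format, the split could look like the sketch below. The helper name pdf_to_tiff_pages and the output naming pattern are assumptions for illustration; +adjoin and the %d placeholder in the output name are regular ImageMagick features.

```shell
# Hypothetical helper: write one TIFF per page instead of one multi-page TIFF.
pdf_to_tiff_pages() {
    local pdf="$1"
    local stem="${pdf%.pdf}"
    # +adjoin switches joining off; %d is replaced by the page index (0, 1, ...).
    convert "$pdf" +adjoin "${stem}-%d.tiff"
}

# Usage:
#   pdf_to_tiff_pages paper.pdf
```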
If you need larger dimensions, say 1000 px width, you can use the -scale option:
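A sketch of such a call, again as a small function. pdf_to_png_scaled is a hypothetical name and the 1000 px default only mirrors the number mentioned above. A geometry consisting of a width alone leaves the height to be derived from the aspect ratio.

```shell
# Hypothetical helper: render a PDF and scale each page to a target width.
pdf_to_png_scaled() {
    local pdf="$1" width="${2:-1000}"
    # -scale comes after the input so it is applied to the pages just read.
    convert "$pdf" -scale "$width" "${pdf%.pdf}.png"
}

# Usage:
#   pdf_to_png_scaled paper.pdf        # 1000 px wide
#   pdf_to_png_scaled paper.pdf 500    # 500 px wide
```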
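But the resulting image is rather fuzzy, so you need to raise the precision, both when rasterizing the PDF (-density, in dots per inch) and when writing the PNG (-quality). Strictly speaking, PNG is lossless, and for PNG the -quality value only selects the zlib compression level and filter, so it is -density that buys the sharpness: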
convert \
    +adjoin \
    -density 600x600 \
    -quality 90 \
    paper.pdf paper.png
That works, but only for rather small PDF documents. As soon as you have a whole paper, you will notice a substantial slowdown of your machine as convert is sucking up memory.
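Why the memory goes so quickly is simple arithmetic: the PDF is rasterized at the requested density, so the pixel count grows with the square of the DPI. Here is a rough estimate as a sketch. It assumes a Q16 build of ImageMagick that keeps 4 channels of 2 bytes each per pixel (8 bytes; other builds differ), and estimate_page_mib is a hypothetical helper name.

```shell
# Hypothetical helper: estimate the uncompressed size of one rasterized page.
# Arguments: page width and height in PostScript points (1/72 inch), density in DPI,
# and optionally the bytes per pixel (assumed 8, i.e. a Q16 build with 4 channels).
estimate_page_mib() {
    local width_pt="$1" height_pt="$2" dpi="$3"
    local bytes_per_pixel="${4:-8}"
    local w_px=$(( width_pt * dpi / 72 ))    # points -> pixels at the given density
    local h_px=$(( height_pt * dpi / 72 ))
    # Integer MiB, rounded down.
    echo $(( w_px * h_px * bytes_per_pixel / 1048576 ))
}

# Usage: US Letter is 612 x 792 pt
#   estimate_page_mib 612 792 600    # prints 256
```

Under these assumptions a single US Letter page at 600 DPI is 5100 x 6600 pixels and takes about 256 MiB before it is ever written out as PNG.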
This is where professional software can distinguish itself, and ImageMagick does so: You can limit the amount of allocated memory (for the image data):
convert \
    +adjoin \
    -density 600x600 \
    -quality 90 \
    -limit memory 256mb -limit map 256mb \
    paper.pdf paper.png
convert now spills the image data to disk (under TMPDIR, to be exact) instead of holding it in memory. Of course, that is much slower, but you cannot have everything.
As it happened, my /tmp/ partition was too small to hold everything. Luckily (OK, not luck but design) there is another environment variable that moves the cache somewhere else:
export MAGICK_TMPDIR=/home/rho/projects/deepzoom/xxx/
convert -limit memory 256mb -limit map 256mb .....
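If you would rather not export the variable for the whole shell session, the same idea can be sketched as a wrapper that sets it for a single call. convert_with_tmpdir is a hypothetical helper name, and the cache directory in the usage line is a placeholder to be replaced with a path on a partition with enough room.

```shell
# Hypothetical helper: run convert with its temporary files in a given directory.
# First argument: cache directory; the remaining arguments go to convert unchanged.
convert_with_tmpdir() {
    local tmpdir="$1"
    shift
    # Make sure the directory exists before ImageMagick tries to write there.
    mkdir -p "$tmpdir" || return 1
    # The subshell keeps MAGICK_TMPDIR from leaking into the calling shell.
    (
        export MAGICK_TMPDIR="$tmpdir"
        convert "$@"
    )
}

# Usage (placeholder path):
#   convert_with_tmpdir /path/with/space/cache \
#       -limit memory 256mb -limit map 256mb \
#       -density 600x600 paper.pdf paper.png
```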
Very nice. Me likez.
- rho's blog
