Does anyone have any recommendations or procedures for repairing a corrupt PDF? When I open the file I get 'There was an error opening this document. The file is damaged and cannot be repaired.' There seem to be a myriad of tools out there, but none that I could describe as reputable. Are there any open-source, Linux-based solutions for this?

Ghostscript will repair your corrupted PDF automatically... if it can open it in the first place (that is, if it is not damaged beyond repair). But afterwards you'll still need to double-check the result...

On Linux, try this command:
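A minimal sketch of such a command, assuming Ghostscript is installed as gs. The wrapper function and the file names corrupted.pdf and repaired.pdf are placeholders:

```shell
# Sketch: rewrite a damaged PDF through Ghostscript's pdfwrite device.
# Usage: repair_pdf_gs corrupted.pdf repaired.pdf   (placeholder names)
repair_pdf_gs() {
    # -o names the output file and implies -dBATCH -dNOPAUSE
    gs -o "$2" -sDEVICE=pdfwrite "$1"
}
```

If Ghostscript can parse the input at all, the output file gets a freshly written structure; pages that could not be read may still come out blank or incomplete.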

On Windows, try this one:
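The console build of Ghostscript on Windows is named gswin64c.exe (gswin32c.exe for 32-bit installs) and takes the same options as gs. A sketch for a POSIX-style shell such as Git Bash, where GS_BIN, the function name and the file names are assumptions:

```shell
# Sketch: same repair, but with the Windows console executable.
# Override GS_BIN if your install uses gswin32c.exe or a full path.
GS_BIN="${GS_BIN:-gswin64c.exe}"

repair_pdf_gs_win() {
    # -o names the output file and implies -dBATCH -dNOPAUSE
    "$GS_BIN" -o "$2" -sDEVICE=pdfwrite "$1"
}
```

From cmd.exe the equivalent single command would be `gswin64c.exe -o repaired.pdf -sDEVICE=pdfwrite corrupted.pdf`, provided the Ghostscript bin directory is on PATH.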

I had a corrupted PDF file, print.pdf, that Ghostscript couldn't open but that the usual graphical Linux PDF viewers (Okular, Evince) opened fine. (Viewed in a hex editor, the file had garbage at the start instead of a PDF header.)

These PDF viewers use Poppler as a back-end PDF renderer. So you can repair the PDF using Poppler's command-line tools. In Ubuntu these are in the poppler-utils package. I used:
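For example, pdftocairo from that package can re-render the file into a new PDF. This is a sketch; the output name print_repaired.pdf and the wrapper function are assumptions:

```shell
# Sketch: let Poppler re-render the damaged file into a fresh PDF.
# Usage: repair_pdf_poppler print.pdf print_repaired.pdf
repair_pdf_poppler() {
    # -pdf selects PDF as the output format
    pdftocairo -pdf "$1" "$2"
}
```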

which generated a PDF file with correct headers, which tools like Ghostscript now accepted.

mutool (project page, manpage) will repair broken PDFs without printing them.

  • Installation e.g. on Ubuntu: sudo apt-get install mupdf-tools
  • Run it like this: mutool clean input.pdf output.pdf
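Since a repaired file still needs to be double-checked, here is a minimal sketch that runs mutool clean and then confirms the output begins with a PDF header. The function name and the header check are illustrative assumptions, not part of mutool:

```shell
# Sketch: repair with mutool, then sanity-check the first bytes of the result.
# Usage: repair_pdf_mutool input.pdf output.pdf
repair_pdf_mutool() {
    mutool clean "$1" "$2" || return 1
    # A PDF starts with "%PDF-"; this checks only the header, not the pages.
    [ "$(head -c 5 "$2")" = "%PDF-" ]
}
```

A passing check only means the file looks like a PDF again; opening it in a viewer is still the real test.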

Alternatively, there are a few tools and frameworks that can decompose/decompile PDFs into their components without rendering them. These could be useful for extracting text, scripts, and images. See this answer for a list of such tools: https://reverseengineering.stackexchange.com/q/1526/8210. For example, you can try Origami, the current top answer there; it has a GTK-based viewer.

I had a corrupted PDF file because the PHP script used to download it echoed some errors (in HTML) and NUL characters at the end.

The solution was to open the pdf with Notepad++ and remove all text after the line
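A well-formed PDF ends with a `%%EOF` marker, so anything after the last one is not part of the document. As a command-line alternative to editing by hand, the following sketch keeps everything up to and including the last `%%EOF`. It assumes GNU grep, and the function name and file names are illustrative:

```shell
# Sketch: drop trailing junk (HTML errors, NUL bytes) after the last "%%EOF".
# Usage: trim_after_eof broken.pdf fixed.pdf
trim_after_eof() {
    # -a: treat binary as text, -b: print byte offsets, -o: print only the match,
    # so each output line looks like "<offset>:%%EOF". Keep the last offset.
    offset=$(LC_ALL=C grep -abo '%%EOF' "$1" | tail -n 1 | cut -d: -f1)
    [ -n "$offset" ] || { echo "no %%EOF marker found in $1" >&2; return 1; }
    # Copy the bytes through the end of the 5-byte marker, then add a newline.
    head -c "$((offset + 5))" "$1" > "$2"
    printf '\n' >> "$2"
}
```

Writing to a second file leaves the damaged original untouched in case the marker turns out to be in the wrong place.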

