Corrupted / blank page PDF downloads #145
Comments
Are they all from EPMC? I have downloaded PMC4841245 and it gives a 38 MB PDF which doesn't open.
The header shows it to be a PDF:
Are they all from EUPMC? Yes.
The header shows it to be a PDF: Yes.
Some of the PDFs open (for me with
I have this problem on two independent machines too, so it is reproducible. It's not just that specific query either. Other EUPMC API queries (this one with just 3 open access hits) also give the same problem:
The downloading of fulltext XML (
This bug also affects PDFs downloaded from the arXiv API. I tried both sample queries, both of which return corrupted PDFs, all the same size, ~2.1 kB:
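Since the arXiv downloads are reported to all be the same size, one quick check is whether they are in fact byte-for-byte identical, which would suggest the same response is being saved for every hit. Below is a minimal sketch of that check, not part of getpapers itself. The function name `group_identical` is hypothetical, and it assumes you pass it the paths of the downloaded files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(paths):
    """Return lists of paths whose contents are byte-identical.

    Hypothetical helper for diagnosis only: files are grouped by the
    SHA-256 digest of their raw bytes, and only groups with more than
    one member are returned.
    """
    groups = defaultdict(list)
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        groups[digest].append(str(p))
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

If every download lands in one group, the files are copies of a single payload rather than distinct papers, and opening one of them in a text editor may show what was actually returned.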
Just to say, I also appear to be getting blank-page PDFs on Windows 8.1.
Very bizarre. Getpapers appears to be downloading PDF files of the right size for me (they are not 0-byte files), but when I open them they are completely blank. Blank pages. The right number of pages, but just completely blank. Nor is it a problem with my local PDF viewing software: cloud PDF viewing services also show that these PDF files are seemingly blank pages despite being megabytes in size.
I have zipped up the entire output project folder so you can inspect the files yourself (only 12 'hits' for the search): https://github.com/rossmounce/tmpfilestorage/raw/master/testaardvark.zip
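For anyone inspecting the files, here is a minimal sketch of a byte-level check. It is not taken from getpapers. It reports the file size, whether a `%PDF-` header and a `%%EOF` marker are present, and how many UTF-8 replacement-character sequences (`EF BF BD`) the file contains. The last check rests on an assumption: a large count would be consistent with binary data having been decoded as text somewhere before being written, which is one general way a PDF can keep its page structure but lose its content. That cause is not confirmed for this issue. The names `inspect_pdf_bytes` and `inspect_folder` are hypothetical.

```python
from pathlib import Path

# UTF-8 encoding of U+FFFD, the replacement character that text decoders
# substitute for bytes they cannot interpret.
REPLACEMENT = b"\xef\xbf\xbd"


def inspect_pdf_bytes(data: bytes) -> dict:
    """Summarise basic structural signs of a PDF from its raw bytes."""
    return {
        "size": len(data),
        # The header is expected near the start of the file.
        "has_pdf_header": b"%PDF-" in data[:1024],
        # The end-of-file marker is expected near the end of the file.
        "has_eof_marker": b"%%EOF" in data[-1024:],
        # Assumption: many of these suggest a lossy bytes-to-text round trip.
        "replacement_chars": data.count(REPLACEMENT),
    }


def inspect_folder(folder: str) -> dict:
    """Run the check on every .pdf file under the given folder."""
    return {
        str(p): inspect_pdf_bytes(p.read_bytes())
        for p in sorted(Path(folder).rglob("*.pdf"))
    }
```

Running `inspect_folder` on the unzipped project folder gives one summary per PDF. A file with a valid header and trailer but thousands of replacement sequences would point towards an encoding problem rather than a truncated download.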