Incorrect (I think?) MD5IntegrityError #105
Oh wow, I can reproduce this error on the download side. I'll have to investigate. This is pretty weird.
I ran into this issue when writing jpg/png images with CloudVolume and had gzip compression explicitly turned on (overriding the CloudVolume default for jpg/png).
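For context, this is roughly the write pattern that produces the problem files (a minimal sketch with hypothetical bucket path and dimensions; the relevant part is pairing `encoding='jpeg'` with an explicit `compress='gzip'`):

```python
import numpy as np
from cloudvolume import CloudVolume

# Hypothetical path and dimensions; the key detail is forcing
# compress='gzip' onto a jpeg-encoded volume.
info = CloudVolume.create_new_info(
    num_channels=1, layer_type='image', data_type='uint8',
    encoding='jpeg', resolution=[4, 4, 40],
    voxel_offset=[0, 0, 0], volume_size=[1024, 1024, 64],
)
vol = CloudVolume('gs://my-bucket/my-dataset', info=info, compress='gzip')
vol.commit_info()
# An all-zero (black) volume, matching the chunk that triggers the error below.
vol[:, :, :] = np.zeros(vol.shape, dtype=np.uint8)
```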
Yes, that's right, thanks Nico – I have a little library for format conversion, npimage, to which I just added support for saving as neuroglancer precomputed via cloudvolume, and if the user asks for compression I turn on gzip. I thought this was the default behavior of cloudvolume, but perhaps it's not, and so I've ended up in a rarely used corner case? Are jpeg-encoded precomputed volumes typically not gzipped?
Correct: for compressed image file formats such as JPG and PNG, gzip should not be necessary, because compression is already part of the format. (PNG uses the exact same compression algorithm as gzip, namely DEFLATE, and JPG uses Huffman coding, which is also a component of DEFLATE.) CloudVolume determines whether or not gzip compression is required here: https://github.com/seung-lab/cloud-volume/blob/master/cloudvolume/datasource/precomputed/common.py#L12-L19 Still, somewhere CloudFiles seems to compare gzipped with ungzipped checksums for these "double compressed" files?
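Paraphrasing the idea behind that linked logic (a hedged sketch, not the actual code; the names here are made up):

```python
# Encodings that are already compressed internally and gain nothing from gzip.
PRECOMPRESSED_ENCODINGS = ('jpeg', 'png')

def choose_compression(encoding, user_setting=None):
    if user_setting is not None:
        return user_setting  # an explicit request (e.g. 'gzip') wins, hence this issue
    if encoding in PRECOMPRESSED_ENCODINGS:
        return None          # DEFLATE/Huffman already applied by the format itself
    return 'gzip'            # raw-ish encodings compress well
```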
Great, skipping gzip when using jpeg encoding got rid of the checksum issues from cloudfiles. Thanks for the input, Nico. If it really never makes sense to gzip jpeg-encoded data, you might think about enforcing that on the cloudvolume side, Will. (There still remains the question of why cloudfiles gets confused about the checksums for these double-compressed files, but if you update cloudvolume to refuse to make such files, the problem is probably 90% solved in practice.) Feel free to close this issue or not, depending on whether you plan to dive in and fix the checksum bug.
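Such a guard on the cloudvolume side might look like this (a hypothetical sketch, not existing API):

```python
# Hypothetical validation CloudVolume could run before uploading chunks.
def validate_compression(encoding, compress):
    if encoding in ('jpeg', 'png') and compress in (True, 'gzip', 'br'):
        raise ValueError(
            f"Refusing to layer {compress!r} on top of already-compressed "
            f"{encoding!r} chunks; use compress=False instead."
        )
```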
I wonder if this is a bug in Google's library? I played around with this and it seems like
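One way to probe this is to compare GCS's stored MD5 against an MD5 computed locally (a sketch using the google-cloud-storage client, with made-up bucket and object names). GCS records the MD5 of the bytes as stored, so if a `Content-Encoding: gzip` object is transparently decompressed during download, the local hash of the decompressed bytes will not match:

```python
import base64
import hashlib

from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client()
# Hypothetical bucket and chunk names.
blob = client.bucket('my-bucket').get_blob('my-dataset/4_4_40/0-64_0-64_0-64')
data = blob.download_as_bytes()  # may be gunzipped in transit (decompressive transcoding)
local_md5 = base64.b64encode(hashlib.md5(data).digest()).decode()
print(local_md5, blob.md5_hash)  # blob.md5_hash covers the *stored* (gzipped) bytes
```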
Hi Will,
I used cloudvolume to upload a simple greyscale image volume in precomputed format to Google Cloud, as I've done a million times. The upload seemed to succeed without issue. But if I try to download the data from Google Cloud using cloudvolume, I get a scary error:
I've never seen this before. I tried re-uploading the dataset and got the same problem, so I don't think it was a failed upload or corrupted data. The dataset also loads into neuroglancer just fine. I can also download the files using a `gcloud storage cp` command just fine. So I suspect that the issue may not actually be with the files but with how `cloudfiles` is attempting to validate the checksum. Not sure if it's relevant, but the specific cube that triggers the error is in fact all black (pixel values all 0) and is the top-left-most block in the dataset.

Do you have any idea what could be going on here? Can you reproduce the issue if you try to load this exact volume into memory via cloudvolume?
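For reference, the read that triggers it is just the ordinary pattern (hypothetical cloudpath):

```python
from cloudvolume import CloudVolume

vol = CloudVolume('gs://my-bucket/my-dataset')  # hypothetical path to the affected volume
img = vol[:, :, :]  # raises an MD5IntegrityError from cloudfiles on the all-zero corner chunk
```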
Thanks a lot!