md5sum mismatch #1
@senwu Thanks for your interest in our work. Let's see if we can narrow down the issue. The checksums of the original TACRED files (train.json, dev.json, and test.json) match, so that part is fine. Could you tell me a little more about your setup, e.g., operating system and Python version? It could be that writing JSON behaves differently on different platforms, e.g., in its line endings. Your observation is correct: there are fewer samples in the patch files than reported in the paper. The reason is that the number of "revised" samples also includes those that were assigned a second label by our annotators. As the TACRED format does not support multiple labels per sample, we chose not to patch those instances.
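To illustrate the line-ending hypothesis above, here is a minimal sketch showing that the same JSON text yields a different MD5 digest once `\n` is replaced by `\r\n`. The sample bytes and the `md5_hex` helper are made up for illustration; this is not the project's patch script and not real TACRED data.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical toy content, standing in for a patched JSON file.
unix_bytes = b'[\n  {"id": "example-1", "relation": "no_relation"}\n]\n'
# The same text as it would look with Windows-style line endings.
windows_bytes = unix_bytes.replace(b"\n", b"\r\n")

same_as_is = md5_hex(unix_bytes) == md5_hex(windows_bytes)
# Normalizing the line endings back makes the digests agree again.
same_after_normalizing = (
    md5_hex(windows_bytes.replace(b"\r\n", b"\n")) == md5_hex(unix_bytes)
)

print(same_as_is, same_after_normalizing)
```

If the digests of two files agree only after normalizing line endings like this, the mismatch is a platform formatting difference rather than a difference in the data.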
Just wanted to report, in case it's helpful, that I originally had the same MD5 mismatch as @senwu when using Python 2. When running with Python 3, the MD5 checksums were consistent with the ones published by @ChristophAlt. I have not spent time identifying whether the problem is just JSON formatting differences or whether there are other potentially important content differences.
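One way to separate formatting differences from content differences is to hash the parsed JSON in a fixed serialization instead of hashing the raw bytes. The sketch below is only an assumption about how such a check could look: `file_md5` and `canonical_md5` are hypothetical helper names, and the records are toy data rather than TACRED samples. As a possible (unconfirmed) source of the mismatch, the `json` module's output can differ between Python versions, for example in key order when dumping plain dicts, and in the default separators used when `indent` is set.

```python
import hashlib
import json
import os
import tempfile


def file_md5(path):
    """MD5 of the raw bytes on disk (what `md5sum` reports)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def canonical_md5(path):
    """MD5 of the parsed JSON, re-serialized in one fixed form."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Sorted keys and fixed separators remove formatting differences.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Toy records, not real TACRED data.
records = [{"id": "example-1", "relation": "no_relation"}]

with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.json")
    path_b = os.path.join(tmp, "b.json")
    # Same content, written with two different formatting choices.
    with open(path_a, "w", encoding="utf-8") as f:
        json.dump(records, f)
    with open(path_b, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, separators=(", ", ": "))
    raw_equal = file_md5(path_a) == file_md5(path_b)
    canonical_equal = canonical_md5(path_a) == canonical_md5(path_b)

print(raw_equal, canonical_equal)
```

If the canonical digests of two patched files match while the raw digests differ, the files hold the same data and only the formatting differs. If the canonical digests differ too, the content itself is different and worth a closer look.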
@liviosoares Thank you! Your feedback is very much appreciated. I'll try to identify the root cause of the problem.
Thanks for sharing this awesome work. I really appreciate it!
I want to use this revised TACRED dataset for my study, but I found that my MD5 checksums don't match the ones mentioned in the README.
Here are my MD5 checksums:
Also, from the patch files, I found there are 1590 and 936 samples in the dev and test files, respectively. It seems those numbers don't match the numbers reported in the paper.
Please let me know if I am doing anything wrong. Thanks!