Enable Parallel Hashing #3

Open
wants to merge 3 commits into
base: master

Conversation

@clbg commented Sep 28, 2019

Dear John:
Your treehash library is fantastic, and I really love it!
Your code is clean and powerful, and it's convenient to use because I can install it from pip no matter where I am.
I noticed that this treehash lib uses only one thread to both read the file and compute the hash, so I thought maybe I could help speed it up a little.
So I added a multithreading trick to speed up hashing: the main thread reads the file and hands out hashing jobs to a thread pool.

Pros: speed
I tested a 37 GB file using 4 hashing threads on my computer, and the elapsed time dropped from 4'23" to 1'39". That's a real speedup (roughly 2.7x).

Cons: compatibility
I imported concurrent.futures to use the thread pool. It is part of the standard library only in Python 3, so this PR is not compatible with Python 2.

@clbg changed the title from "Enable Parrael Hashing" to "Enable Parallel Hashing" on Sep 29, 2019

@jdswinbank (Owner) commented

Thank you so much for this useful contribution! I am travelling at the moment, so I haven't been able to look at your work properly yet; I will do so when I return. I very much appreciate the time you took to submit a PR!
