The goal here is to allow for more concurrency when inserting data by breaking up the single `Mutex` that guards the `DataBuilder`, sharding it S ways, with each shard guarded by its own `Mutex`.

The method chosen here is intentionally not particularly clever: it just distributes entries among the 16 hashmaps based on their hash. This keeps lookups fairly simple, at the cost of not fully exploiting what we know about our data layout (e.g. how many shards the input has, and that the input is sorted). The obvious downside is one extra layer of indirection on every lookup.

The constant of 16 has been chosen somewhat arbitrarily: I think it should correlate well with how many CPUs we have available. In any case, bumping it up to 128 made performance far worse.

Testing with our 100G dataset suggests that with S = 1 it takes 40-45 minutes to upload all data; with S = 16 it takes 19-25 minutes.

Bug: 337062283
Change-Id: I55ac133f2587df93c73b41748738252078eb0131
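
For readers who want a concrete picture of the sharding described above, here is a minimal sketch. It is not the code from this commit: `ShardedMap`, `SHARDS`, and `shard_index` are hypothetical names, and the use of `DefaultHasher` with a modulo to pick a shard is an assumption, since the message only says entries are distributed "based on their hash".

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Number of shards; 16 mirrors the constant mentioned in the commit message.
const SHARDS: usize = 16;

/// Hypothetical stand-in for the sharded structure; not the real `DataBuilder`.
pub struct ShardedMap<K, V> {
    /// One independently locked hashmap per shard.
    shards: Vec<Mutex<HashMap<K, V>>>,
}

impl<K: Hash + Eq, V: Clone> ShardedMap<K, V> {
    pub fn new() -> Self {
        let shards = (0..SHARDS).map(|_| Mutex::new(HashMap::new())).collect();
        ShardedMap { shards }
    }

    /// Picks a shard from the key's hash (assumed scheme: hash modulo SHARDS).
    fn shard_index(key: &K) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % SHARDS
    }

    /// Locks only the shard that owns `key`, so writers that hit
    /// different shards do not block each other.
    pub fn insert(&self, key: K, value: V) -> Option<V> {
        let idx = Self::shard_index(&key);
        let mut shard = self.shards[idx].lock().unwrap();
        shard.insert(key, value)
    }

    /// Lookups pay one extra step: compute the shard, then query its map.
    pub fn get(&self, key: &K) -> Option<V> {
        let idx = Self::shard_index(key);
        let shard = self.shards[idx].lock().unwrap();
        shard.get(key).cloned()
    }

    /// Total number of entries, summed across all shards.
    pub fn len(&self) -> usize {
        self.shards.iter().map(|s| s.lock().unwrap().len()).sum()
    }
}
```

Because each call locks exactly one shard, two threads contend only when their keys land in the same shard. The sketch says nothing about why 128 shards performed worse in the author's testing.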
Showing 12 changed files with 93 additions and 86 deletions.