
bzip2 to lbzip2 migration to use all CPU cores #480

Closed
loadit1 opened this issue Feb 23, 2020 · 10 comments · Fixed by #483

Comments

@loadit1
Contributor

loadit1 commented Feb 23, 2020

Hi!

Line 30 in sqlite_download.sh:

bunzip2 -f "${LOCAL_BZ2_PATH}" > "${LOCAL_DB_PATH}"

Line 53 in download_sqlite_all.js:

extract = bunzip2;

Would you be so kind as to advise: is there any reason not to migrate from bzip2 to lbzip2 in order to increase performance by using all CPU cores?

Thanks!
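For illustration, a minimal sketch of the proposed substitution (the path variables are hypothetical examples; lbzip2's command-line interface is decompression-compatible with bzip2):

```shell
#!/bin/sh
# Minimal sketch of the proposed swap; the paths below are hypothetical examples.
LOCAL_BZ2_PATH="whosonfirst-data-latest.db.bz2"
LOCAL_DB_PATH="whosonfirst-data-latest.db"

# Current, single-threaded:
#   bunzip2 -f "${LOCAL_BZ2_PATH}" > "${LOCAL_DB_PATH}"
# Proposed, using all CPU cores (same decompression flags, drop-in replacement):
#   lbzip2 -d -f -c "${LOCAL_BZ2_PATH}" > "${LOCAL_DB_PATH}"
echo "lbzip2 -d -f -c ${LOCAL_BZ2_PATH} > ${LOCAL_DB_PATH}"
```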

@missinglink
Member

Sounds reasonable, I'm not familiar with lbzip2.

The only negative I could see is that it uses more RAM but this is probably a decent trade-off considering the reduction in wall-clock decompression time.

https://vbtechsupport.com/1614/

Could you please open a PR and do some testing to ensure compatibility?

@orangejulius
Member

Yeah, agreed. Speeding up the process of decompressing massive WOF bzip2 archives sounds good to me :) I'd be happy to see something like this tested out.

Since I imagine lbzip2 is not commonly installed on most systems, perhaps we can use it only if it's present? We could add it to the Pelias docker images if it does show a significant performance benefit.
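In shell terms, the availability check could be sketched like this (a sketch only; binary names are the real CLIs, the variable name is made up):

```shell
#!/bin/sh
# Prefer lbzip2 when available, otherwise fall back to plain bzip2.
# POSIX `command -v` exits non-zero when the binary is not on PATH.
if command -v lbzip2 >/dev/null 2>&1; then
  DECOMPRESS="lbzip2"
else
  DECOMPRESS="bzip2"
fi
echo "decompressing with ${DECOMPRESS}"

# Either binary accepts the same decompression flags:
#   "${DECOMPRESS}" -d -c archive.bz2 > archive
```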

@missinglink
Member

Hmmm, so according to the benchmark I linked, lbzip2 used over 100x as much RAM as bzip2.

Definitely worth investigating a little bit more and possibly detecting/logging OOM errors on modest systems.

@loadit1
Contributor Author

loadit1 commented Feb 24, 2020

Hi!

PR is just created: #481

Summary of changes:

  1. If lbzip2 is installed on the system we use it; otherwise we fall back to the legacy bzip2.
  2. Updated the Dockerfile to install lbzip2.
  3. Added a dependency: command-exists.

Tested the lbzip2 version on my machine with npm run download. Results below:

Using lbzip2

542.68user 101.13system 1:49.16elapsed 589%CPU (0avgtext+0avgdata 210392maxresident)k
201176inputs+15375728outputs (1001major+1221746minor)pagefaults 0swaps

Using bzip2

526.88user 168.91system 3:01.95elapsed 382%CPU (0avgtext+0avgdata 48292maxresident)k
91608inputs+15375736outputs (275major+70098minor)pagefaults 0swaps

Frankly speaking, the speed results are not that different, simply because we download in parallel and run several instances of bzip2 in parallel (see const simultaneousDownloads in download_data_all.js, for example). Memory consumption is about 4 times higher for lbzip2 (200MB vs 50MB).

But the reason I started investigating this is that PiP hardcodes one single sqlite file, whosonfirst-data-latest.db, which is extracted extremely slowly by one CPU core when not using lbzip2.

@NickStallman

Seems like a reasonable trade-off to me. It's unlikely anyone is running this on an extremely low-memory machine.

@loadit1 how many CPU cores did you have for your test?
I just did a test with my Threadripper 16 core test server with tons of ram on whosonfirst-data-latest.db.bz2.
bunzip2 was 10 mins 43 seconds.
lbunzip2 was 1 minute 16 seconds.

Even with the parallel download, on the Threadripper the CPU was never pushed very hard with bunzip2 so it would make a significantly bigger difference on larger systems.

@loadit1
Contributor Author

loadit1 commented Feb 25, 2020

Hi @NickStallman

I don't remember the exact specs of the VM for my tests above, so I decided to test once again, with 2 options:

  1. Single file extract: whosonfirst-data-latest.db.bz2
    This scenario is important when following the pelias/docker instructions, since during the pelias download all step the extraction of whosonfirst-data-latest.db.bz2 consumes a lot of time, because bzip2 utilizes only one CPU core during the extraction process.
  2. "Real" scenario: pelias/whosonfirst downloading and extraction. This is not "clean" from the perspective of network speed, and the files are downloaded in parallel and then extracted in parallel by bzip2, so multicore systems should show better performance here and less difference in extraction time compared to the single-file extract.

Both options tested on Medium and Tiny VM setups.

HW specs of my Host machine:
Core i5-8250U, 24GB RAM, Samsung SSD 840 EVO
I use Hyper-V, so below specs of Hyper-V VMs and their vCPUs.

Test 1: Medium VM setup
8 vCPUs (100% Host machine resource allocation), 16GB RAM, Ubuntu Server 19.10

Option 1. Single file whosonfirst-data-latest.db.bz2
Download time: 2min 32sec (to consider network influence for pelias/whosonfirst test)
bzip2 extract: 13min 30sec
lbzip2 extract: 4min 0sec
Option 2. pelias/whosonfirst downloading and extraction:
bzip2 version time: 3min 10sec
lbzip2 version time: 1min 40sec

Test 2: Tiny VM setup
2 vCPUs (25% Host machine resource allocation), 2GB RAM, Ubuntu Server 19.10

Option 1. Single file whosonfirst-data-latest.db.bz2
Download time: 2min 36sec (to consider network influence for pelias/whosonfirst test)
bzip2 extract: 13min 42sec
lbzip2 extract: 5min 33sec
Option 2. pelias/whosonfirst downloading and extraction:
bzip2 version time: 4min 3sec
lbzip2 version time: 4min 17sec

My test bash script:

#!/bin/bash
set -x

#This test is meant to run on a freshly installed OS. Consider changing or removing the lines below if you already have these packages installed.
apt-get -y update
apt-get -y upgrade
apt-get -y install lbzip2 bzip2 curl
curl -sL https://deb.nodesource.com/setup_12.x | bash -
apt-get -y install nodejs
apt-get -y autoremove
apt-get -y autoclean

mkdir /tmp/test
cd /tmp/test

# Download whosonfirst-data-latest.db.bz2
SECONDS=0
wget -O whosonfirst-data-latest.db.bz2 https://dist.whosonfirst.org/sqlite/whosonfirst-data-latest.db.bz2
echo "Download time: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"

SECONDS=0
# test bzip2
bzip2 -dk whosonfirst-data-latest.db.bz2
echo "bzip2 extract: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"
rm -rf whosonfirst-data-latest.db

SECONDS=0
# test lbzip2
lbzip2 -dk whosonfirst-data-latest.db.bz2
echo "lbzip2 extract: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"
#remove both files
rm -rf whosonfirst-data-latest.*

#test bzip2 on real pelias/whosonfirst. 
git clone https://github.com/pelias/whosonfirst.git
cd /tmp/test/whosonfirst
npm install
SECONDS=0
npm run download
echo "bzip2 version time: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"

cd /tmp/test
rm -rf whosonfirst

#test lbzip2 on real pelias/whosonfirst
git clone -b lbzip2 https://github.com/loadit1/whosonfirst
cd /tmp/test/whosonfirst
npm install
SECONDS=0
npm run download
echo "lbzip2 version time: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"

cd /tmp
rm -rf test

You may use my bash test script to test other HW configs, even with less RAM, if smaller setups are widely used.
It would also make sense to review the other Pelias repositories and add lbzip2 support in their Dockerfiles and scripts, since using bzip2, or tar with the -j flag (instead of --use-compress-program=lbzip2), slows down the extraction process on multicore systems.
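As a sketch of the tar point above (archive name hypothetical; the parallel path requires GNU tar and lbzip2 on PATH):

```shell
#!/bin/sh
# Choose a tar extraction command for .tar.bz2 archives:
# `-j` pipes through single-threaded bzip2, while GNU tar's
# `--use-compress-program=lbzip2` decompresses on all cores.
if command -v lbzip2 >/dev/null 2>&1; then
  TAR_CMD="tar --use-compress-program=lbzip2 -xf"
else
  TAR_CMD="tar -xjf"
fi
echo "extract with: ${TAR_CMD} <archive>.tar.bz2"
```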

@missinglink
Member

Thanks for the benchmarks, they really help to put my mind at ease!

Our docs say the absolute minimum RAM for Pelias is 8GB and we really recommend 16GB.

We just recently suffered a bug where running another program on a 64 core machine caused OOM errors due to each core potentially using >2GB RAM at peak, so I wanted to avoid that here.

It's a fairly recent issue since virtualization/containerization allows for uncommon CPU/RAM ratios these days 🤷‍♂️

Looking at your tests it seems that a ratio of 1GB per CPU core is adequate to avoid OOM errors.

I'm happy to merge this, thanks for taking the time to investigate 😃
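For a cautious setup following that 1GB-per-core heuristic, lbzip2's worker-thread count can be capped explicitly with its -n flag. A Linux-only sketch (the file name in the usage comment is hypothetical):

```shell
#!/bin/sh
# Sketch: cap lbzip2 worker threads at min(RAM in GB, CPU cores),
# following the ~1GB-per-core heuristic from the benchmarks above.
# Linux-only: reads /proc/meminfo and uses nproc.
MEM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}')
MEM_GB=$(( MEM_KB / 1024 / 1024 ))
CORES=$(nproc)
THREADS=$(( MEM_GB < CORES ? MEM_GB : CORES ))
if [ "${THREADS}" -lt 1 ]; then THREADS=1; fi
echo "lbzip2 worker threads: ${THREADS}"

# Usage (hypothetical file):
#   lbzip2 -n "${THREADS}" -d -c whosonfirst-data-latest.db.bz2 > whosonfirst-data-latest.db
```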

@missinglink
Member

Thanks @loadit1, I have added you to our @pelias/contributors team which means you can create your own branches within the pelias org repos.

When you do that, please prefix the branch with your username, e.g. loadit1/name-of-branch. The advantage of this is that docker images will be automatically built for every version of your branch like this, and one for the tip of the branch like this.

Thanks 🎉

[Screenshot 2020-02-25 at 10:56:35]

@loadit1
Contributor Author

loadit1 commented Feb 25, 2020

@missinglink good to know, thank you!

@orangejulius
Member

Thanks @loadit1 for all the PRs and the extensive benchmarks :)

The faster extraction should really help!
