Parallel py3 #43
Open
mxndrwgrdnr wants to merge 41 commits into master from parallel_py3
Commits
e40dc69 parallelized synthpop and updated for python 3 (mxndrwgrdnr)
e2b946d cleaned up parallel processing code (mxndrwgrdnr)
6caa811 added tqdm to travis config (mxndrwgrdnr)
014b9d4 more packages for travis config (mxndrwgrdnr)
1d2aea6 python 3 specifications for tests (mxndrwgrdnr)
1a7de6f more python3 fixes for tests (mxndrwgrdnr)
cae7992 update to ipu test to account for the fact that max_iterations no lon… (mxndrwgrdnr)
e2b8b2a fixed ipu test for py3 (mxndrwgrdnr)
49138a2 pep8 fix (mxndrwgrdnr)
dd648f4 script to generate 9 county bay area population in parallel (mxndrwgrdnr)
af117fd script to generate 9 county bay area population in parallel (mxndrwgrdnr)
a8a2876 fixed relative imports for tests (mxndrwgrdnr)
feaebc4 replaced pep8 with pycodestyle per pep8 UserWarning (mxndrwgrdnr)
87a8929 travis fixes (mxndrwgrdnr)
4072946 pycodestyle does not like bare 'except' clauses (mxndrwgrdnr)
b446b96 this might take too long for travis. let's see (mxndrwgrdnr)
8416521 changed test county for starter2 to something smaller bc travis is ti… (mxndrwgrdnr)
046166c edited travis config to try and fix the issue (mxndrwgrdnr)
d97deeb edited travis config to try and fix the issue (mxndrwgrdnr)
8ab5cc7 still trying to fix memory error in travis (mxndrwgrdnr)
8c94652 still trying to fix memory error in travis (mxndrwgrdnr)
3783b02 still trying to fix memory error in travis (mxndrwgrdnr)
8d1cc00 added unit test for census cache (mxndrwgrdnr)
39601fb added test for parallel synthesizer (mxndrwgrdnr)
2f0e8b5 fixed indentation (mxndrwgrdnr)
32c3e75 Merge branch 'master' into parallel_py3 (mxndrwgrdnr)
82a8f67 relaxed fit quality requirements for tests (mxndrwgrdnr)
fd70c64 Merge branch 'parallel_py3' of github.com:UDST/synthpop into parallel… (mxndrwgrdnr)
9922683 retain runtime error for max_iterations in IPU and add ignore_max_ite… (mxndrwgrdnr)
de52355 increase wait time for travis build (mxndrwgrdnr)
97c793b porting latest changes from rome to oslo (mxndrwgrdnr)
6acc0b8 oslo back to rome (mxndrwgrdnr)
0c10761 use new parallel method in tests (mxndrwgrdnr)
6098f17 updated travis yaml to use specific version of tqdm that should hopef… (mxndrwgrdnr)
315bdaa fixed style errors should pass tests now (mxndrwgrdnr)
c180116 fixed style errors should pass tests now (mxndrwgrdnr)
5ab4799 starter2 parallel test (cvanoli)
6b9b5e2 Add ignore_max_iterations var to synthesize_all functions (cvanoli)
0631425 update setup.py (cvanoli)
8c9e9ac Correct the deleted acsyear missing in query function, add h_acs self (cvanoli)
d3a3a53 Merge branch 'parallel_py3' of https://github.com/UDST/synthpop into … (cvanoli)
@@ -0,0 +1,134 @@
import os
import pandas as pd
from glob import glob
import warnings
from datetime import date
from multiprocessing import freeze_support

from synthpop.census_helpers import Census
from synthpop.recipes.starter2 import Starter
from synthpop.synthesizer import synthesize_all_in_parallel, \
    synthesize_all_in_parallel_mp, \
    synthesize_all_in_parallel_full

warnings.filterwarnings('ignore')

today = str(date.today())

counties = [
    # "Alpine County",
    # "Napa County",
    "Santa Clara County",
    # "Solano County",
    # "San Mateo County",
    # "Marin County",
    # "San Francisco County",
    # "Sonoma County",
    # "Contra Costa County",
    # "Alameda County"
]

if __name__ == '__main__':

    freeze_support()

    for county in counties:
        print('#' * 80)
        print(' Processing {0} '.format(county).center(80, '#'))
        c = Census(os.environ["CENSUS"])
        starter = Starter(os.environ["CENSUS"], "CA", county)
        # county_dfs = synthesize_all(starter, num_geogs=1)
        county_dfs = synthesize_all_in_parallel_full(
            starter,
            # max_workers=20,
            # num_geogs=100
        )
        print('#' * 80)

        # hh_all = county_dfs[0]
        # p_all = county_dfs[1]
        # fits_all = county_dfs[2]

        # hh_all.index.name = 'household_id'
        # p_all.index.name = 'person_id'
        # p_all.rename(columns={'hh_id': 'household_id'}, inplace=True)

        # hh_all['age_of_head'] = p_all[p_all.RELP == 0].groupby(
        #     'household_id').AGEP.max()
        # hh_all['race_of_head'] = p_all[p_all.RELP == 0].groupby(
        #     'household_id').RAC1P.max()
        # hh_all['workers'] = p_all[p_all.ESR.isin([1, 2, 4, 5])].groupby(
        #     'household_id').size()
        # hh_all['children'] = p_all[p_all.AGEP < 18].groupby(
        #     'household_id').size()
        # hh_all['tenure'] = 2
        # hh_all.tenure[hh_all.TEN < 3] = 1  # tenure coded 1:own, 2:rent
        # hh_all['recent_mover'] = 0
        # hh_all.recent_mover[hh_all.MV < 4] = 1  # 1 if recent mover
        # hh_all = hh_all.rename(columns={
        #     'VEH': 'cars', 'HINCP': 'income', 'NP': 'persons',
        #     'BLD': 'building_type'})

        # for col in hh_all.columns:
        #     if col not in [
        #         'persons', 'income', 'age_of_head', 'race_of_head',
        #         'hispanic_head', 'workers', 'children', 'cars', 'tenure',
        #         'recent_mover', 'building_type', 'serialno', 'state',
        #         'county', 'tract', 'block group']:
        #         del hh_all[col]

        # p_all.rename(columns={
        #     'AGEP': 'age', 'RAC1P': 'race_id', 'NP': 'persons',
        #     'SPORDER': 'member_id', 'HISP': 'hispanic', 'RELP': 'relate',
        #     'SEX': 'sex', 'WKHP': 'hours', 'SCHL': 'edu', 'PERNP': 'earning',
        #     'JWTR': 'primary_commute_mode'},
        #     inplace=True)
        # p_all['student'] = 0
        # p_all.loc[p_all.SCH.isin([2, 3]), 'student'] = 1
        # p_all['work_at_home'] = 0
        # p_all.loc[p_all.primary_commute_mode == 11, 'work_at_home'] = 1
        # p_all['worker'] = 0
        # p_all.loc[p_all.ESR.isin([1, 2, 4, 5]), 'worker'] = 1
        # p_all['self_employed'] = 0
        # p_all.loc[p_all['COW'].isin([6, 7]), 'self_employed'] = 1

        # for col in p_all.columns:
        #     if col not in ['household_id', 'member_id',
        #                    'relate', 'age', 'sex', 'race_id', 'hispanic',
        #                    'student', 'worker', 'hours',
        #                    'work_at_home', 'edu', 'earning', 'self_employed']:
        #         del p_all[col]

        # hh_all.to_csv('{0}_hh_synth_parallel_{1}.csv'.format(
        #     county.replace(' ', '_'), today))
        # p_all.to_csv('{0}_p_synth_parallel_{1}.csv'.format(
        #     county.replace(' ', '_'), today))

    # # concat all the county dfs
    # hh_fnames = glob('*hh*.csv')

    # p_df_list = []
    # hh_df_list = []
    # hh_index_start = 0
    # p_index_start = 0

    # for hh_file in hh_fnames:
    #     county = hh_file.split('_hh')[0]
    #     hh_df = pd.read_csv(hh_file, index_col='household_id', header=0)
    #     p_df = pd.read_csv(
    #         glob(county + '_p*.csv')[0], index_col='person_id', header=0)
    #     print(county + ': {0}'.format(str(hh_df.iloc[0].county)))
    #     hh_df.index += hh_index_start
    #     p_df.household_id += hh_index_start
    #     p_df.index += p_index_start
    #     hh_df_list.append(hh_df)
    #     p_df_list.append(p_df)
    #     hh_index_start = hh_df.index.values[-1] + 1
    #     p_index_start = p_df.index.values[-1] + 1

    # hh_all = pd.concat(hh_df_list)
    # p_all = pd.concat(p_df_list)
    # print(len(hh_all.iloc[hh_all.index.duplicated(keep=False)]))
    # print(len(p_all.iloc[p_all.index.duplicated(keep=False)]))
    # p_all.to_csv('sfbay_persons_2018_09_27.csv')
    # hh_all.to_csv('sfbay_households_2018_09_27.csv')
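The commented-out concatenation step at the end of the script shifts each county's household and person indices so that ids stay unique once the county tables are stacked. A minimal sketch of that idea follows, using small in-memory frames instead of the CSV files. The helper name `stack_counties` and the toy data are assumptions made for illustration and are not part of this PR; the sketch also assumes every frame is non-empty and uses the index maximum rather than the last index value.

```python
import pandas as pd


def stack_counties(county_tables):
    """Concatenate (households, persons) pairs while keeping ids unique.

    Hypothetical helper, not part of synthpop: it mirrors the
    commented-out loop in the script but takes in-memory frames.
    Assumes each frame is non-empty.
    """
    hh_parts, p_parts = [], []
    hh_start = 0
    p_start = 0
    for hh_df, p_df in county_tables:
        # Work on copies so the caller's frames are left untouched.
        hh_df = hh_df.copy()
        p_df = p_df.copy()
        # Shift household ids, and the persons' foreign key, by the same offset.
        hh_df.index = hh_df.index + hh_start
        p_df["household_id"] = p_df["household_id"] + hh_start
        # Shift person ids by their own running offset.
        p_df.index = p_df.index + p_start
        hh_parts.append(hh_df)
        p_parts.append(p_df)
        # The next county starts right after the largest id used so far.
        hh_start = hh_df.index.max() + 1
        p_start = p_df.index.max() + 1
    return pd.concat(hh_parts), pd.concat(p_parts)


# Toy data: two counties whose ids both start at 0.
county_a = (
    pd.DataFrame({"persons": [2, 1]},
                 index=pd.Index([0, 1], name="household_id")),
    pd.DataFrame({"household_id": [0, 0, 1]},
                 index=pd.Index([0, 1, 2], name="person_id")),
)
county_b = (
    pd.DataFrame({"persons": [1]},
                 index=pd.Index([0], name="household_id")),
    pd.DataFrame({"household_id": [0]},
                 index=pd.Index([0], name="person_id")),
)

hh_all, p_all = stack_counties([county_a, county_b])
print(hh_all)
print(p_all)
```

As in the script, a check such as `hh_all.index.duplicated(keep=False).any()` can confirm that no ids collide after stacking.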
Same comment as on ipu.py, line 263.