Project Meeting 2021.01.12
- Discuss Doyle completing estimation mode for trip mode choice
- Trip mode choice estimation is now done; it was quite complex and required updating the 2- and 3-zone examples as well
- Results are not exactly the same since we merged in some other changes that affected results
- I can now review the estimation branch
- Doyle to now work on performance tuning (see below)
- Discuss Jeff Newman's nmtf estimation notebook
- Will rebase his cdap and nmtf notebooks off the estimation branch
- And send a note to review when ready
- Jeff will also test the other estimation notebooks and update them to be more automatable/testable, since right now they're part software and part training materials
- Then we'll pause the task and discuss how to make fewer, smarter/more generic notebooks rather than a separate notebook for each submodel
- Discuss Joe's copyright and licensing history
- The basic plan is to:
- Update the license so it says Copyright AMPORF
- Make sure work for hire is in the contracts
- Add a Contributor License Agreement like this one and have GitHub automatically manage it with Pull Requests
- Refactor out the orca code under a bench contract work order
- The license will stay BSD-3
- Let's discuss the partner MOU/payment agreement on Thursday
- Discuss PSRC progress and need for location sampling improvements
- 40k MAZ model with 100k HHs is up and running from start to finish
- Ran into issues with location sampling since it's considering all 40k alternatives every time for every chooser
- Would be good to do something like DaySim's two-stage sampling approach: sample at the TAZ level first, then pick an MAZ within the sampled TAZ based on the MAZ's share of the size term. DaySim also pre-calculates the TAZ-to-TAZ probability matrix and then just draws random numbers and picks a zone (see the sketch after this list).
- We could also filter on size term > 0 before solving expressions, which would help with sparse alternative sets like school/university
- These issues are similar to the performance improvements tasks below so let's add this to that discussion for consideration
- Using importance sampling like DaySim is a good approach and we'll need to implement something like it for the SANDAG cross border model
- Could do it at the start of each process since it's fast, and then we don't have to persist/share it across processes
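A minimal sketch of the two-stage idea in numpy; the names (two_stage_sample, taz_probs, maz_size, maz_taz) are hypothetical, not DaySim or ActivitySim code, and it assumes a precomputed TAZ-to-TAZ probability row for the chooser's origin:

```python
import numpy as np

def two_stage_sample(taz_probs, maz_size, maz_taz, n_draws, rng=None):
    """Draw destination MAZs in two stages: sample a TAZ from a
    precomputed TAZ-to-TAZ probability row, then sample an MAZ within
    that TAZ in proportion to its size term."""
    rng = rng or np.random.default_rng()

    # stage 1: sample destination TAZs from the precomputed probabilities
    tazs = rng.choice(len(taz_probs), size=n_draws, p=taz_probs)

    # stage 2: pick an MAZ within each sampled TAZ by size-term share.
    # MAZs with a zero size term drop out automatically, which also covers
    # the "filter on size term > 0" idea for sparse alternative sets.
    # (Assumes every sampled TAZ has at least one MAZ with size > 0.)
    draws = np.empty(n_draws, dtype=int)
    for i, taz in enumerate(tazs):
        mazs = np.flatnonzero(maz_taz == taz)
        shares = maz_size[mazs] / maz_size[mazs].sum()
        draws[i] = rng.choice(mazs, p=shares)
    return draws
```

Stage 2 could be vectorized with cumulative probabilities and searchsorted, but the loop keeps the idea readable.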
- Discuss Doyle's list of potential performance improvements
- expression file optimization
- good to make these templates as smart as possible since new users rely on them
- could speed up the model maybe 50%?
- finish adaptive chunking
- automatically determine chunk size based on available memory
- explore adaptive chunking based on actual memory usage rather than 'registered' data objects (see the sketch at the end of these notes)
- deduplicate alternatives as discussed for ARC
- cache logsums, which needs categories/market segments defined
- would help a lot; would require a lot of plumbing updates (see the sketch at the end of these notes)
- data format and size optimization (e.g. right-size numbers and convert strings to factors; see the sketch at the end of these notes)
- pipeline optimization
- alternative pipeline file format (e.g. feather)
- improve control over checkpointing (pipeline footprint and read/write time)
- Alex likes this one, see his email
- two-stage location sampling for PSRC, like DaySim
- run ARC and optimize where appropriate
- run PSRC and optimize where appropriate
- We will run the ARC and PSRC versions of the model and review submodel and component runtimes and memory usage to help inform our discussion of what to work on
- Review logs and snakeviz profiler
- Stefan and Guy/Clint to share setups
- It would be good to understand level of effort, since maybe we can do a few smaller, easier ones and then a big one or two?
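On adaptive chunking, a minimal sketch of sizing chunks from memory that is actually available at run time; psutil and the helper name are assumptions for illustration, not the ActivitySim chunking implementation:

```python
import psutil

def adaptive_chunk_size(num_rows, est_bytes_per_row, headroom=0.8):
    """Pick a chunk size from currently available memory rather than a
    user-supplied chunk_size setting. est_bytes_per_row would ideally
    come from observed peak usage while the submodel runs (the 'actual
    memory usage' idea above), not just registered data objects."""
    budget = psutil.virtual_memory().available * headroom
    rows_that_fit = int(budget // est_bytes_per_row)
    # one chunk if everything fits; otherwise cap at what fits, minimum 1
    return max(1, min(num_rows, rows_that_fit))
```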
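On logsum caching, a sketch of why categories/market segments need to be defined first: a cached logsum is only reusable if choosers collapse onto a finite, hashable segment key. compute_segment_logsums is a stand-in, not an ActivitySim function:

```python
from functools import lru_cache

def compute_segment_logsums(segment_key):
    # stand-in for an expensive mode choice logsum calculation over all
    # destinations for one market segment
    ...

@lru_cache(maxsize=None)
def segment_logsums(segment_key):
    # reuse the logsum for every chooser in the same segment, e.g.
    # segment_key = ("low_income", "auto_sufficient", "AM")
    return compute_segment_logsums(segment_key)
```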
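On data format and size optimization, a sketch in plain pandas of right-sizing numeric columns and storing repeated strings as categoricals (pandas' analogue of factors); the helper is illustrative and the right treatment would vary by table:

```python
import pandas as pd

def rightsize(df: pd.DataFrame) -> pd.DataFrame:
    """Shrink a table: downcast numeric columns to the smallest dtype
    that holds their values, and convert object (string) columns to
    categoricals."""
    for col in df.columns:
        if pd.api.types.is_integer_dtype(df[col]):
            df[col] = pd.to_numeric(df[col], downcast="integer")
        elif pd.api.types.is_float_dtype(df[col]):
            df[col] = pd.to_numeric(df[col], downcast="float")
        elif pd.api.types.is_object_dtype(df[col]):
            df[col] = df[col].astype("category")
    return df
```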