Check for performance regressions in testing suite #174

Closed
gsarma opened this issue Jul 2, 2015 · 32 comments

@gsarma
Member

gsarma commented Jul 2, 2015

With @travs:
Yes, this is super meta, but we want to make sure the testing suite itself isn't taking longer than it should. Informally, it looks like there is about 20% variation in the time these tests take to run on my machine; I don't really know why.

Open questions:

  1. For individual tests, we can hard-code timing information to catch performance regressions (a rough sketch is below). What about the suite as a whole? Presumably we don't want a build to fail just because the tests took too long, but that information should be stored somewhere.
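
For the individual-test case, this is roughly what a hard-coded check could look like; the test name, the workload, and the 2-second budget are all made up for illustration:

# Hypothetical example of a hard-coded timing budget on a single test.
import time
import unittest

def run_operation_under_test():
    # placeholder for the real call being timed
    sum(range(100000))

class TimingRegressionTest(unittest.TestCase):
    MAX_SECONDS = 2.0  # hypothetical budget for this operation

    def test_runs_within_budget(self):
        start = time.time()
        run_operation_under_test()
        elapsed = time.time() - start
        self.assertLess(elapsed, self.MAX_SECONDS)

if __name__ == '__main__':
    unittest.main()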
@cheelee
Contributor

cheelee commented Jul 3, 2015

@gsarma @travs Hi, you don't have to hard-code timing information. Running the test with "python -m cProfile -o testPerf.out test.py" will give you a (possibly overly) detailed performance profile to help you get an idea of where the time is spent. I can try to find ways to reduce the profiling overhead, but it currently stands at about 10%.
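
For reference, a minimal way to inspect the dump that command produces, using the standard-library pstats module:

# Inspect the binary profile written by "python -m cProfile -o testPerf.out test.py".
import pstats

stats = pstats.Stats('testPerf.out')
stats.strip_dirs().sort_stats('cumulative').print_stats(10)  # top 10 entries by cumulative time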

@gsarma
Member Author

gsarma commented Jul 3, 2015

Thanks Chee Wai, that's useful information! Maybe the bigger question, then, is how to store this information and keep track of it over time. In the future, I think the right thing is to have some kind of anomaly detection framework so that we are notified if anything unusual happens. And as you say, this would let us avoid hard-coding timing information.
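
As a rough illustration of the anomaly-detection idea (not a proposal for the framework itself; the numbers and the three-sigma threshold are arbitrary, and the history would come from wherever the timings end up being stored):

def is_regression(history, latest, n_sigma=3.0):
    # flag the latest run if it is more than n_sigma standard deviations above the historical mean
    mean = sum(history) / float(len(history))
    variance = sum((t - mean) ** 2 for t in history) / len(history)
    return latest > mean + n_sigma * variance ** 0.5

print(is_regression([98.0, 101.0, 99.5, 102.0], 140.0))  # True: 140s is well outside the usual range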

@cheelee
Contributor

cheelee commented Jul 4, 2015

@gsarma It ought to be possible. I can do a quick check of what people are using to store and process performance information gathered this way for performance regression testing of their codes. The output from "-m cProfile" is a binary data file, and Python's pstats module provides an API for accessing and displaying the data:

https://docs.python.org/2/library/profile.html (26.4.3)
http://pymotw.com/2/profile/

I'm assuming there ought to be a way to store the profile as entries in some MySQL performance database for performance regression purposes. If such tools do not yet exist, this might be a very good impetus for me to design and create something of this nature. I had been thinking for some time that such a tool would be a very useful addition to any software engineering workflow.
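
As a sketch of that idea (using sqlite3 as a stand-in for MySQL; the file, table, and column names are hypothetical), the per-function entries in a cProfile dump can be pulled out of pstats and written as rows:

import pstats
import sqlite3
import time

conn = sqlite3.connect('perf_history.db')
conn.execute('CREATE TABLE IF NOT EXISTS profile_entries '
             '(run_ts REAL, function TEXT, ncalls INTEGER, cumtime REAL)')

# pstats keeps its data in a dict keyed by (filename, lineno, funcname)
stats = pstats.Stats('testPerf.out')
run_ts = time.time()
for func, (cc, nc, tt, ct, callers) in stats.stats.items():
    name = '%s:%d(%s)' % func
    conn.execute('INSERT INTO profile_entries VALUES (?, ?, ?, ?)',
                 (run_ts, name, nc, ct))
conn.commit()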

@gsarma
Member Author

gsarma commented Jul 4, 2015

@cheelee that sounds great! I think this should be part of a larger effort to have a web-based information dashboard for tracking tests across all repos.

@travs @slarson

@cheelee
Contributor

cheelee commented Jul 4, 2015

@gsarma Commercial systems do exist for performance regression via web dashboards, but the one I'm familiar with (NewRelic - http://newrelic.com/) is somewhat expensive, and holds your data. Might be nice to design an extensible open source skeleton for supporting something similar for groups like ours.

@gsarma
Member Author

gsarma commented Jul 4, 2015

@cheelee We could conceivably make a bare bones one to serve our needs with Jupyter notebooks.

@cheelee
Contributor

cheelee commented Jul 4, 2015

@gsarma hmmm this looks interesting. Thanks, I'll check it out! https://jupyter.org/

@gsarma
Member Author

gsarma commented Jul 4, 2015

@cheelee I think the most basic thing would be to simply have the .out files dumped somewhere that can be accessed from the web and to write some simple tools to summarize the results. We'll need to figure out where to store these and then set up an IPython server for OpenWorm.

@slarson @travs
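
A sketch of the kind of "simple tool" meant here, assuming the dumps land in a profiles/ directory (the directory name is made up):

import glob
import pstats

for path in sorted(glob.glob('profiles/*.out')):
    total = pstats.Stats(path).total_tt  # total time recorded across all functions
    print('%-40s %8.2fs' % (path, total))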

@gsarma
Member Author

gsarma commented Jul 7, 2015

Suggestion from @travs:
https://www.pythonanywhere.com

@cheelee
Contributor

cheelee commented Jul 8, 2015

@gsarma That's a python development/execution cloud service. What we're gonna need is a way to store/process/present performance regression data. Some of my thoughts on this are:

  1. Storage - depending on the frequency, nature and size of the data (I'll elaborate below), we'll need some large-ish free hosting ... my guess is in the 50 GB to 100 GB range over the long term. The 500 MB offered by PythonAnywhere will not be anything close to sufficient.
  2. Nature of data - I do not expect us to have to go past collecting profile information. This should limit our per-experiment footprint (i.e., each test, not each battery of tests) to 50-100 KB of data. Heaven help us if we find ourselves requiring detailed performance log traces.
  3. Frequency of data collection - in the long term, I'm expecting this to be a per-week thing, unless we extend this to collecting performance regression data for some extensive test of the production scientific simulation of the C. elegans worm itself - that may require a daily performance regression test.

So given those rough estimates, we're talking about 30-50 tests a week, which amounts to roughly 5 MB per week. That is fairly optimistic, so while I guess 500 MB kind of works (less than 2 years of data under the above regime), I'd be far more comfortable with more. Someone's local machine, or some kind of larger generic free hosting, might work. Thoughts?
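
Back-of-envelope version of that estimate (all figures are the rough guesses above, not measurements):

tests_per_week = 50
kb_per_test = 100
mb_per_week = tests_per_week * kb_per_test / 1000.0   # ~5 MB per week
weeks_to_fill_500mb = 500 / mb_per_week                # ~100 weeks, i.e. just under 2 years
print('%.1f MB/week -> %.0f weeks to reach 500 MB' % (mb_per_week, weeks_to_fill_500mb))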

@gsarma
Member Author

gsarma commented Jul 8, 2015

@cheelee thanks for the great analysis! I think @travs suggested PythonAnywhere in the context of figuring out how to host our own server for Jupyter notebooks. What would be the right way to do that?

In terms of hosting, this doesn't sound like it will be very expensive. Once we absolutely nail down what we need, we can talk to Stephen.

travs added this to the Generic testing milestone Jul 9, 2015
@travs

travs commented Jul 9, 2015

@cheelee @gsarma
At a glance, I think codespeed might be exactly what we're looking for...
Here's one implementation running for PyPy.

It's a django app, so that's 40+MB of overhead, but I think the framework it provides may outweigh the small amount of lost storage.

In any case, we can start on PythonAnywhere and scale accordingly.

@cheelee
Contributor

cheelee commented Jul 26, 2015

Argh! Dropped the ball on this one. I'll put this on my TODOs, try out codespeed, and see if it can be configured for our purposes.

@cheelee
Contributor

cheelee commented Jul 26, 2015

Alright, tested codespeed using its instructions and it sort of works. The necessary caveats:

  1. The release version has not been updated in a while, and will not work with modern environments.
  2. The trick is to use the current repository version (still maintained) and virtualenv to create a sandbox pip environment. The steps can be found in this gist: https://gist.github.com/cheelee/3423e1c580e079bf09e5

The next step is to get our data into the necessary JSON format, making changes to suit our needs along the way, and we should have a basic framework for regression analysis that we can improve upon.

cheelee self-assigned this Jul 26, 2015
@travs

travs commented Jul 27, 2015

@cheelee
Awesome stuff here! Have you and @kevcmk discussed where we could potentially deploy this? Is PythonAnywhere suitable, or should we go somewhere else?

@cheelee
Contributor

cheelee commented Jul 27, 2015

@travs Not yet. PythonAnywhere looks like a good place to start. Codespeed's workflow expects the performance data generation to make use of codespeed too, so any initial deployment of this will be a prototype which we can use to craft something more suited to our needs. Right now, I think @kevcmk's and my plan is to quickly shoehorn the performance data we get from Python's cProfile output into the JSON format codespeed's visualization/analysis unit expects. With this early prototype, we should be able to see enough to start coming up with ideas on crafting something that works better for us.

@cheelee
Contributor

cheelee commented Jul 27, 2015

Realized I should post some screenshots of their sample data to give people a feel for what kind of visualizations to expect from codespeed, and drive some discussion on what more we could/would like to see. This is running on my Mac, but I think it can be served from PythonAnywhere.

[Screenshots from 2015-07-28: three views of the codespeed sample-data dashboard]

@travs

travs commented Jul 27, 2015

Ok, that plan definitely makes sense to me too. As for the PythonAnywhere side of things, I recently deployed a Flask app to that service with minimal trouble. I used the Flask counterpart to this Django tutorial to get the WSGI server up and running.

@cheelee Would you be interested in giving the django setup a shot?

@kevcmk
Contributor

kevcmk commented Jul 28, 2015

@travs are you using Flask for the webserver? Or are you planning on running Apache/nginx?

@cheelee
Contributor

cheelee commented Jul 28, 2015

@travs I can take a look. I'll first have to familiarize myself with PythonAnywhere and Django. Shouldn't be more than a few days! Do we have an OpenWorm account set up for PythonAnywhere?

@travs

travs commented Jul 28, 2015

@kevcmk To be clear here, the Flask app is a separate project, but I used Gunicorn as the webserver in that one locally, so when I pushed it to PythonAnywhere I just kept it (for now). PythonAnywhere can also use uWSGI+nginx to serve up the Flask app directly, so my gunicorn setup is kind of needless.

@cheelee I think that account I showed you a few weeks ago should be the one we use for now. If you need the creds again let me know!

@cheelee
Contributor

cheelee commented Jul 28, 2015

@travs Ah ok. Time Machine to the rescue! If I can't rescue it, I'll let you know! Thanks! Still chugging through that long Django tutorial, but I think I got the gist of what PythonAnywhere will support. I'll soon put codespeed on PythonAnywhere and try it out.

@cheelee
Contributor

cheelee commented Jul 29, 2015

@gsarma @travs @kevcmk Codespeed with sample data is up and running at my own PythonAnywhere account. You ought to be able to try it at http://cheewai1972.pythonanywhere.com/

Next step is to design a workflow for automatically getting performance data incrementally added to the tool for analysis and an updated display.

@kevcmk
Contributor

kevcmk commented Jul 29, 2015

@cheelee Should we just follow their use case, using codespeed's REST routes to submit the data from Jenkins? Or do you see any advantages to writing our own inserts?

@cheelee
Contributor

cheelee commented Jul 29, 2015

@kevcmk I'm not entirely familiar with REST, and not at all with Jenkins. Would you mind giving me a summary of their features and capabilities? We can do this over our private gitter chat, and I can also spend a few hours tonight familiarizing myself with codespeed's relationship to the REST model, and with Jenkins. My current thought is to naively use codespeed as a template for creating our own tool, which could involve quite a bit of effort. Anything that lets us reach for low-hanging fruit without unnecessary effort would be a great boon to getting a performance regression framework for OpenWorm off the ground :)

@travs

travs commented Jul 29, 2015

@cheelee Awesome stuff! 😄

@kevcmk If I am understanding this correctly, using the built-in REST routes seems like a good idea. I wonder if there is a way to submit data from Travis CI rather than Jenkins? I've only ever used the former myself.

@slarson
Member

slarson commented Jul 29, 2015

@gsarma @cheelee @kevcmk Good stuff -- I've moved this issue under the milestone: "Make all functions return in 1 second or less".

@cheelee @kevcmk I wonder if we can use the functions named in the specific issues #42, #90, #21 as our first functions in PyOpenWorm to put under performance regression? This would help move the whole milestone forward as well as giving a specific focus in a lean way to this effort.

@kevcmk
Contributor

kevcmk commented Jul 31, 2015

@travs Correction -- we'll run the submits from Travis, not Jenkins. And regarding the REST routes (also @cheelee): Codespeed has a JSON API (/result/add/json/); I'll be sitting down to figure that out in the next couple of days.

@slarson I think that's a good idea. Chee and I will use those as a benchmark.
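
For what it's worth, here is a minimal sketch of what a submission to that JSON endpoint might look like. The field names follow codespeed's sample client, the URL points at the prototype instance mentioned earlier in this thread, and the project, executable, benchmark, environment, commit id, and result value below are all placeholders; the codespeed README should be checked for the exact payload it expects.

import json
import requests

result = {
    'commitid': 'abc123',                              # placeholder commit hash
    'branch': 'default',
    'project': 'PyOpenWorm',
    'executable': 'python2.7',
    'benchmark': 'NeuronTest.test_same_name_same_id',  # hypothetical benchmark name
    'environment': 'travis-ci',                        # must match an environment defined in codespeed
    'result_value': 4.2,                               # e.g. seconds pulled from the cProfile dump
}
response = requests.post('http://cheewai1972.pythonanywhere.com/result/add/json/',
                         data={'json': json.dumps([result])})
print(response.status_code, response.text)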

@cheelee
Contributor

cheelee commented Jul 31, 2015

@kevcmk Awesome! Meanwhile, I'm trying to get a handle on those specific tests (e.g. which tests, and how to run just those tests).

We don't really have to limit ourselves to running just those tests, since your changes to the testing framework pretty much allow us to get performance profiles on every single test, but keeping our performance data compact and relevant might be the right way to go at first.

If anyone knows how I can run each test individually, please share (I'm looking at the testing framework code right now to figure this out; as far as I know it isn't documented).

@travs

travs commented Jul 31, 2015

@cheelee to run an individual test you can do (from the project root):

# example: python -m unittest tests.test.TestCase.test_name
python -m unittest tests.test.NeuronTest.test_same_name_same_id
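
And, tying this back to the profiling discussion above, one possible way (untested here) to capture a cProfile dump for just that one test programmatically:

# Profile a single named test and write the result to neuron_test.prof.
import cProfile
import unittest

suite = unittest.defaultTestLoader.loadTestsFromName(
    'tests.test.NeuronTest.test_same_name_same_id')
runner = unittest.TextTestRunner()
cProfile.runctx('runner.run(suite)', globals(), locals(), 'neuron_test.prof')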

@cheelee
Contributor

cheelee commented Aug 1, 2015

@travs Thanks! I'll update the README.md documentation if that's not already done.

@mwatts15
Contributor

Some good ideas here (particularly the use of Python's cProfile for tracking performance in a cross-platform way), but there are no real acceptance criteria. For now, the build time on Travis-CI is good enough for identifying regressions, so closing this issue.
