feat: Create performance tests [WEB-1458] #7741
Co-authored-by: Calderon, Carolina <[email protected]>
✅ Deploy Preview for determined-ui canceled.
"@babel/core": "7.13.16",
"@babel/plugin-proposal-class-properties": "7.13.0",
"@babel/plugin-proposal-object-rest-spread": "7.13.8",
"@babel/preset-env": "7.13.15",
"@babel/preset-typescript": "7.13.0",
"@types/k6": "^0.45.3",
"@types/webpack": "5.28.0",
"babel-loader": "8.2.2",
"clean-webpack-plugin": "4.0.0-alpha.0",
"copy-webpack-plugin": "^9.0.1",
"typescript": "4.2.4",
"webpack": "5.76.1",
"webpack-cli": "5.0.1",
"webpack-glob-entries": "^1.0.1"
given that the webui uses vite/esbuild and this has us using webpack/babel, consider tagging some work to migrate this to vite using library mode: https://vitejs.dev/guide/build.html#library-mode
🙏
ticket here: https://hpe-aiatscale.atlassian.net/browse/WEB-1623
This looks great! I was able to run through all the steps and see the results and JUNIT export. Left some comments.
How did you generate the JUNIT html pages?
🙏
Another thing to note: the HTML was actually created using a Python library. I support adding a CI step to create the
That sounds great, having it as an artifact would be sweet!
Thanks for the updates!
Description
The goal of this PR is both to introduce load tests that benchmark the current performance of our system and to set up a framework that makes it easy to add future tests. This PR describes the current testing setup, outlines future possibilities and enhancements, and adds a few notes about k6 that may matter in future updates.
For reference, there is a prototype branch, web-1458-prototype, that contains a more in-depth setup for the load tests.
Current Setup
Running the test
1. Install the `junit2html` python package.
2. (within performance/determined) `npm install` to install dependencies.
3. `npm start` to build the `api_performance_tests.js` test file.
4. `k6 run -e DET_MASTER=http://localhost:8080 build/api_performance_tests.js` to run the file built in (3). The `DET_MASTER` env var sets the url for the test cluster.
5. A `junit.xml` file will be generated containing a test report.
6. `junit2html junit.xml` to create an html report from the generated xml.

The results of the test are:

1. A `junit.xml` file with pass/fail test results as well as http request duration statistics for each test.
2. An `html` file created from (2) with similar information.

To note, I had originally planned on implementing the test schema in web-1458-prototype, but after discussions with @ashtonG we decided that it was a bit excessive for the ultimate goal of this ticket.
Example Results
Console output example:
![653B195D-F509-4ECD-A6BD-2B7972C2470A](https://private-user-images.githubusercontent.com/103522725/264444775-221cc9c0-43d7-4ba3-88ef-cad3008415bb.jpeg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5ODU1NzUsIm5iZiI6MTczOTk4NTI3NSwicGF0aCI6Ii8xMDM1MjI3MjUvMjY0NDQ0Nzc1LTIyMWNjOWMwLTQzZDctNGJhMy04OGVmLWNhZDMwMDg0MTViYi5qcGVnP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTlUMTcxNDM1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9Mzk0MTkyZmE2N2UxMDIwZTc5MzI5NDU5MzQ4MDllOWZjMzViZjM5NjFkNWZlMWQ0ZGJmYmFkMWUxYjI5NjNkMSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.3O54_WGQDlli_B9vNyRLZbfb_f-wimA_WfbtGd6PcGA)
jUnit output example (html version):
![815CF64C-66E2-4E6C-BE74-E798B2C1C4B3](https://private-user-images.githubusercontent.com/103522725/264444556-dacdbc59-29c4-447f-b04e-27ce3150f93d.jpeg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5ODU1NzUsIm5iZiI6MTczOTk4NTI3NSwicGF0aCI6Ii8xMDM1MjI3MjUvMjY0NDQ0NTU2LWRhY2RiYzU5LTI5YzQtNDQ3Zi1iMDRlLTI3Y2UzMTUwZjkzZC5qcGVnP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTlUMTcxNDM1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzMyODdmN2I0NjU1NjA5NGQ3ZjgzNWMyMDY4ZWEwNzVhMTFjNjhiZTgwNTQ4ZTYyOTU3ZWU4NDM3NzQ5ZTdhMCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.0RwWjfHEapZcLlnjiED2BNrXg0o-Keu0DVGJeQPiK_E)
Test Logic
The current testing setup allows us to benchmark the current system performance by implementing a single "average load" test that simulates a ramping number of queries to the `master`. The current test simulates `25` users querying the master endpoint. The test ramps up to `25` users over the course of `5` minutes, then sustains that request rate for `10` minutes, then ramps down to `0` users over a period of 5 minutes. The total test runtime is `20` minutes. This is to simulate an `average` load on the system. The figure of `25` users was based on the number of users that recursion has in their system, which is about `20`.

`k6` allows for setting important thresholds for a given test. Currently, two thresholds are set. First, I added a `request failed` threshold that will abort and fail the test if more than `1` percent of the HTTP requests fail. The idea is that if we are seeing that many requests fail, we will likely want to investigate the cause and should not allow the test to pass.

Secondly, I have added a threshold for the request duration. The threshold expects more than `95%` of all http requests to have a duration of less than `1 second`, which is the overall performance goal for our system. The test suite is currently set up so that tests will not fail if this threshold is crossed. However, this gives us the ability to easily view this metric in test reports.

Sample Extension
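As a reference point before extending the suite, the average-load profile and the two thresholds described under Test Logic might be expressed as k6 options along these lines. This is a minimal sketch, not the PR's actual file: the `Stage` type and the `totalMinutes` helper are hypothetical, and only the `stages`, `thresholds`, `http_req_failed`, `http_req_duration`, and `abortOnFail` names come from k6 itself. Plain k6 treats a crossed threshold as a failed run, so the suite presumably keeps the duration threshold non-failing by some other means; that part is not sketched here.

```typescript
// Hypothetical sketch; the PR's real options object is not shown in this description.
type Stage = { duration: string; target: number };

// Ramp to 25 VUs over 5m, hold for 10m, ramp down to 0 over 5m.
const stages: Stage[] = [
  { duration: "5m", target: 25 },
  { duration: "10m", target: 25 },
  { duration: "5m", target: 0 },
];

// In a k6 script this object would be exported as `options`.
const options = {
  stages,
  thresholds: {
    // Abort and fail the run once more than 1% of HTTP requests fail.
    http_req_failed: [{ threshold: "rate<=0.01", abortOnFail: true }],
    // 95% of requests should complete in under 1 second (1000 ms).
    http_req_duration: ["p(95)<1000"],
  },
};

// Sum stage durations that are written as whole minutes, e.g. "5m".
function totalMinutes(s: Stage[]): number {
  return s.reduce((sum, st) => sum + parseInt(st.duration, 10), 0);
}

console.log(totalMinutes(options.stages)); // 20
```

Shortening the `duration` values is one way to get a quick local run while iterating on a new test.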
Additional tests can be added using the `test` construct created in this PR. An example extra test to query the telemetry endpoint is:

Example output after the test addition (the testing stages were made shorter for example purposes):
console output:
![Screen Shot 2023-08-30 at 3 00 43 PM](https://private-user-images.githubusercontent.com/103522725/264445931-4ca325f7-f797-4614-bdd4-8c73dc4bf31b.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5ODU1NzUsIm5iZiI6MTczOTk4NTI3NSwicGF0aCI6Ii8xMDM1MjI3MjUvMjY0NDQ1OTMxLTRjYTMyNWY3LWY3OTctNDYxNC1iZGQ0LThjNzNkYzRiZjMxYi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE5JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxOVQxNzE0MzVaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT01MTQ0OGU1MmViYzMzNzI3YzFhYWQ2MDA5OTNkNjcyNzM0MDVlZmJkNTRkMzBkZmU5YjM1OWI5NjA5YTViZjYyJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.9vU9jGkn9JjQVaw2MJ_HQrTxEzqjmHOdquuAl52jRDk)
jUnit output example (html version):
![Screen Shot 2023-08-30 at 3 01 24 PM](https://private-user-images.githubusercontent.com/103522725/264446117-5e2c3b2f-dfd5-47d5-a97c-11be5590d58c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5ODU1NzUsIm5iZiI6MTczOTk4NTI3NSwicGF0aCI6Ii8xMDM1MjI3MjUvMjY0NDQ2MTE3LTVlMmMzYjJmLWRmZDUtNDdkNS1hOTdjLTExYmU1NTkwZDU4Yy5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE5JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxOVQxNzE0MzVaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1mZDI3YWQ5YTc2YjU2ZjU3NGQyYmRkNjk1ZGI5Nzk2MWM3ZDYzN2EyNTE2Mjg5OWNlOWM3OTFhMzFmYWY4ZDAwJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.bZ-ZHOYkaCkW5vfnxjS_g8JGDANvkTvdGHN1sh4Yyyg)
Test Plan
Commentary
Future Possibilities and Enhancements and Important Notes
Reference: web-1458-prototype
Test Structure
Load testing will often implement different types of test scenarios in order to track system performance under different test situations. The most common scenarios are smoke, average load, stress, soak, spike, and breakpoint tests.
An example setup in `k6` would look similar to this:

Example console output:
![84B51C06-AC27-437A-B6F2-9FFAAF0CF73E](https://private-user-images.githubusercontent.com/103522725/263350901-ccd07f96-651a-448a-b26f-72018a5d5104.jpeg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5ODU1NzUsIm5iZiI6MTczOTk4NTI3NSwicGF0aCI6Ii8xMDM1MjI3MjUvMjYzMzUwOTAxLWNjZDA3Zjk2LTY1MWEtNDQ4YS1iMjZmLTcyMDE4YTVkNTEwNC5qcGVnP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTlUMTcxNDM1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZDAzOTYyNDVlNGI3NWNhOTE2YzlhMWJjZTEzZmQ1MTViOWMyOTcxOTgyNzgzOWNmODI5YjU3ODczYWQwNDZlZiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.tlqJ6E2yxAmuDvaDVUy_ZiNsP2ntHJzYORvKCe1mVUA)
example jUnit output:
![7EB467A7-B5B2-4CB3-B809-44136DB99822](https://private-user-images.githubusercontent.com/103522725/263350960-44f10ae5-8e05-43e0-a07d-b8d9e24928ac.jpeg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5ODU1NzUsIm5iZiI6MTczOTk4NTI3NSwicGF0aCI6Ii8xMDM1MjI3MjUvMjYzMzUwOTYwLTQ0ZjEwYWU1LThlMDUtNDNlMC1hMDdkLWI4ZDllMjQ5MjhhYy5qcGVnP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTlUMTcxNDM1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MmJlMTlhZDBhNzM5Mzk0Y2E3YTk5ZDAzMGZhNDdmODhkMGEzOTVmODRjNmYyMWJlZWE2OTExNTc1YjQwNzVjMSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.HdCsk-52CaJZUH6pkAcYodQshjhgkRp_T_CKs49-HMs)
In the above example, virtual users are spun up and down over a specified duration of time to simulate variations in web traffic. You can reference the k6 scenario documentation to learn more about how this configuration works.
In the future we will likely want to move to this sort of test scheme so that we can gather a holistic view of our system performance under different load types. Additionally, we will likely want to implement much longer running tests for some scenarios.
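A multi-scenario configuration of the kind discussed in this section could be sketched as follows. The scenario names, VU counts, and timings are illustrative assumptions and are not taken from web-1458-prototype. The `constant-vus` and `ramping-vus` executors and the `vus`, `duration`, `startVUs`, `stages`, and `startTime` fields are k6 scenario options; the `peakVus` helper is hypothetical and only summarizes the config.

```typescript
// Illustrative values only; not the configuration used in the prototype branch.
type RampStage = { duration: string; target: number };
type Scenario =
  | { executor: "constant-vus"; vus: number; duration: string; startTime?: string }
  | { executor: "ramping-vus"; startVUs: number; stages: RampStage[]; startTime?: string };

// In a k6 script this would sit under `options.scenarios`.
const scenarios: Record<string, Scenario> = {
  // Smoke: one user for a minute to confirm the script and cluster work.
  smoke: { executor: "constant-vus", vus: 1, duration: "1m" },
  // Average load: starts once the smoke scenario has finished.
  average_load: {
    executor: "ramping-vus",
    startVUs: 0,
    startTime: "1m",
    stages: [
      { duration: "5m", target: 25 },
      { duration: "10m", target: 25 },
      { duration: "5m", target: 0 },
    ],
  },
  // Spike: a short burst after the 20 minute average-load run ends.
  spike: {
    executor: "ramping-vus",
    startVUs: 0,
    startTime: "21m",
    stages: [
      { duration: "1m", target: 100 },
      { duration: "1m", target: 0 },
    ],
  },
};

// Highest number of virtual users a scenario will reach.
function peakVus(s: Scenario): number {
  if (s.executor === "constant-vus") return s.vus;
  return s.stages.reduce((max, st) => Math.max(max, st.target), s.startVUs);
}

for (const name of Object.keys(scenarios)) {
  console.log(name, peakVus(scenarios[name]));
}
```

Staggering scenarios with `startTime` keeps their load from overlapping, which makes per-scenario results easier to compare.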
User Initialization
During implementation planning with @stoksc we discussed that we will likely want to be able to track unique users per test. For example, we might want to track performance for RBAC users with differing permissions.
`k6` has a few utilities for using unique data within tests; however, there are some caveats. The largest one is that `k6` does not allow making `http` requests during the test initialization phase, meaning we cannot implement logic such as:

There are alternative workflows we could implement, but I did find it worth calling this fact out.
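One such alternative, sketched under assumptions: load a static list of users during initialization (no HTTP needed) and give each virtual user a stable entry from that list, leaving any login request to `setup()` or the test function, where k6 does allow HTTP calls. The `TestUser` type, the sample users, and `userForVu` are hypothetical; in a k6 script the id would come from the `__VU` global.

```typescript
// Hypothetical user records; real ones would come from a data file or env vars.
type TestUser = { username: string; role: string };

const users: TestUser[] = [
  { username: "admin-user", role: "admin" },
  { username: "editor-user", role: "editor" },
  { username: "viewer-user", role: "viewer" },
];

// k6 VU ids start at 1 inside test code, so shift to a zero-based index
// and wrap around when there are more VUs than users.
function userForVu(vuId: number, pool: TestUser[]): TestUser {
  return pool[(vuId - 1) % pool.length];
}

console.log(userForVu(1, users).username); // admin-user
console.log(userForVu(5, users).username); // editor-user
```

Because the mapping is deterministic, results tagged by role stay comparable from one run to the next.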
The current setup does not implement any sort of `login` or `unique user` configuration.

API Bindings and Typescript
The `k6` recommended k6-template-typescript project was used to generate a typescript project for our test suite. The current setup does not use our generated typescript bindings, but in the future we may want to. This was one of the main considerations for making this a typescript project. I wanted to add this note since it came up in discussions with @loksonarius.

Result Reporting
There are a few quirks around metric reporting that were found during this implementation that are worth calling out.
Limitations around reporting results within k6
k6 gives the ability to tag and group tests in various ways. Tests can easily be tagged via custom tags, endpoint, groups, etc. k6 also gives the ability to render a custom report output via a `handleSummary` method that is defined within the test suite. However, all information and details regarding tags are scrubbed from the `data` that the `handleSummary` method receives. You can read more in this github issue grafana/k6#1321 as well as this thread about the lack of tag data: https://community.grafana.com/t/show-tag-data-in-output-or-summary-json-without-threshold/99320

For example, no matter how you tag any metrics, even custom metrics, the data available for writing the report will look as follows:

As you can see, there are no mentions of any tags. The workaround is to add `thresholds` for each tag that you want to follow; this will cause `k6` to show more information regarding the tag in the output. A code example can be seen here: https://github.com/determined-ai/determined/compare/web-1458-prototype#diff-31e2b17ee608e49eacf18f0b0b17988d36964f621b588600e701bfce8466649aR69 and you can see in the example outputs above how the tag information becomes available in the test output.

Future Results Reporting in k6
Thankfully, all information regarding tags is kept within the individual data points created during testing. This data is what will be sent to Grafana for example when we want to implement external result viewing, so we will still be able to build custom dashboards and charts when we decide to enable viewing results sent to a time series db.
Additionally, the individual data points mentioned above can be written to a `json` or `csv` file. In the future we could write custom file parsing logic to build a more in-depth report from the data in the output file.

For reference, here is an example point from the file mentioned above; you will notice that all tag data is present.
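As a hedged sketch of the custom parsing idea, assuming each line of the `json` output is an object with `type`, `metric`, and `data.tags` fields: the sample lines below are invented for illustration rather than taken from a real run, and the `test` tag name and `meanByTag` helper are hypothetical.

```typescript
// Assumed shape of one "Point" line in k6's JSON output.
type Point = {
  type: string;
  metric: string;
  data: { time: string; value: number; tags: { [key: string]: string } };
};

// Made-up sample lines standing in for the contents of the output file.
const sampleLines: string[] = [
  '{"type":"Point","metric":"http_req_duration","data":{"time":"2023-08-30T15:00:00Z","value":120,"tags":{"test":"master"}}}',
  '{"type":"Point","metric":"http_req_duration","data":{"time":"2023-08-30T15:00:01Z","value":80,"tags":{"test":"master"}}}',
  '{"type":"Point","metric":"http_req_duration","data":{"time":"2023-08-30T15:00:02Z","value":300,"tags":{"test":"telemetry"}}}',
  '{"type":"Point","metric":"http_reqs","data":{"time":"2023-08-30T15:00:02Z","value":1,"tags":{"test":"telemetry"}}}',
];

// Average one metric's values, grouped by the value of a single tag.
function meanByTag(lines: string[], metric: string, tag: string): { [key: string]: number } {
  const sums: { [key: string]: { total: number; count: number } } = {};
  for (const line of lines) {
    const p = JSON.parse(line) as Point;
    // Skip metric declarations and points that belong to other metrics.
    if (p.type !== "Point" || p.metric !== metric) continue;
    const key = p.data.tags[tag] || "untagged";
    if (!sums[key]) sums[key] = { total: 0, count: 0 };
    sums[key].total += p.data.value;
    sums[key].count += 1;
  }
  const means: { [key: string]: number } = {};
  for (const key of Object.keys(sums)) {
    means[key] = sums[key].total / sums[key].count;
  }
  return means;
}

console.log(meanByTag(sampleLines, "http_req_duration", "test"));
```

The same grouping could be extended to percentiles, which would line up with the 95% duration goal described earlier.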
Checklist
`docs/release-notes/`. See Release Note for details.
Ticket
WEB-1458