Improve test flakiness #3
lydell added a commit that referenced this issue on Jul 16, 2022:
As a stop-gap solution for #3, retrying flaky tests improves confidence by getting those green checkmarks. The retried tests are logged so I should be able to scrape which ones are the flakiest and improve them over time. The Windows tests seem to fail consistently in CI though (but not locally) even with retry.
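The thread doesn't show how the retried tests are logged, so here is a hypothetical sketch of a Jest custom reporter that could produce scrapeable lines. The `RETRY ERRORS` line format and its fields are assumptions, shaped to fit the regex in the script below; Jest does record an `invocations` count per test when `jest.retryTimes` is enabled, which is what the sketch keys off.

```ts
// retryReporter.ts — a hypothetical sketch, NOT elm-watch's actual code.
// One way retried tests could be logged so that CI logs can be scraped later.
// Register it in jest.config.js via: reporters: ["default", "<rootDir>/retryReporter.js"]
import type { TestResult } from "@jest/test-result";

export default class RetryReporter {
  // Jest calls this after each test file finishes.
  onTestResult(_test: unknown, testResult: TestResult): void {
    for (const assertion of testResult.testResults) {
      // Jest sets `invocations` > 1 when `jest.retryTimes` re-ran a test.
      const invocations = assertion.invocations ?? 1;
      if (invocations > 1) {
        // Assumed format: timestamp, file path, "(title, count)", then the full test name.
        console.log(
          `RETRY ERRORS ${new Date().toISOString()} ${testResult.testFilePath} ` +
            `(${assertion.title}, ${invocations}) retried: ${assertion.fullName}`
        );
      }
    }
  }
}
```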
Here is a script (fish shell) to see which tests are retried the most:

```fish
# A GitHub personal access token, needed to download workflow run logs.
set token FILL_ME_IN
set dir (status dirname)/scrape
mkdir -p $dir

# Fetch up to 100 workflow runs created on or after 2022-07-16.
set workflow_runs (curl -H "Accept: application/vnd.github+json" "https://api.github.com/repos/lydell/elm-watch/actions/runs?per_page=100&created=>=2022-07-16&exclude_pull_requests=true" | jq -c '.workflow_runs[]')
set count (count $workflow_runs)
for i in (seq $count)
    set workflow_run $workflow_runs[$i]
    set name (string join \n -- $workflow_run | jq -r '.name')

    # Only look at runs of the "Test" workflow.
    if test $name != Test
        echo "### $i/$count: Skipping: $name"
        continue
    end

    set created_at (string join \n -- $workflow_run | jq -r '.created_at')
    set logs_url (string join \n -- $workflow_run | jq -r '.logs_url')
    set subdir $dir/$created_at
    set zip $subdir/logs.zip
    echo "### $i/$count: Download logs from $created_at to $subdir"
    rm -rf $subdir
    mkdir -p $subdir

    # Download and unpack this run's logs (the logs endpoint requires the token).
    curl -L -H "Authorization: token $token" -H "Accept: application/vnd.github+json" $logs_url >$zip
    unzip -d $subdir $zip
end
# Pull the RETRY ERRORS lines out of the unpacked logs and reshape them
# into a TSV (timestamp first, test in the last column).
set results_file $dir/results.tsv
rg 'RETRY ERRORS (.+)' $dir -or '$1' | rg '([^/]+Z)[^(]+\(([^,]+), (\d+)\)[^:]+:(.+)' -or '$1'\t'$2'\t'$3'\t'$4' >$results_file

# Count how often each value in the fourth column appears, most frequent first.
cut -f 4 $results_file | sort | uniq -c | sort -nr
```
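If the `RETRY ERRORS` lines end with the failing test's name, as assumed in the reporter sketch above, the final `uniq -c | sort -nr` pipeline is effectively a flakiness leaderboard: the tests that were retried most often end up at the top.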
The tests are very comprehensive, and I’m very happy that they helped me find so many edge cases. They are written at a very high level, which gives a lot of confidence (and should make a potential rewrite in another language nice). However, the high level involves real time passing, real file system watching and real WebSockets. While that did help me understand, for example, file watching better (like: how many watcher events do you get if the same file changes rapidly?), it does make the tests a bit flaky. I’ve used some pretty … clever … hacks to stabilize many tests, but not all.
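That watcher question is easy to poke at in isolation. Here is a tiny standalone experiment (not elm-watch code) that counts the events Node’s `fs.watch` fires for rapid writes to one file:

```ts
// watcherEvents.ts — a standalone experiment, not part of the elm-watch test suite.
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Watch a fresh file in a temporary directory.
const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "watch-")), "file.txt");
fs.writeFileSync(file, "initial");

let events = 0;
const watcher = fs.watch(file, (eventType) => {
  events++;
  console.log(`event ${events}: ${eventType}`);
});

// Ten rapid writes to the same file …
for (let i = 0; i < 10; i++) {
  fs.writeFileSync(file, `change ${i}`);
}

// … give the watcher a moment to flush, then report.
setTimeout(() => {
  watcher.close();
  console.log(`total events: ${events}`);
}, 200);
```

The total is typically not exactly ten: events can coalesce or duplicate, and the behavior differs between Linux, macOS and Windows, which is exactly the kind of nondeterminism that makes these high-level tests flaky.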
Currently, the tests pass locally on Linux, Windows and macOS. In CI, `jest.retryTimes(2)` is needed, but even with that one or two jobs usually fail; by manually restarting them it’s possible to get all green checkmarks. So the tests still give a lot of confidence, they’re just a little bit annoying. It doesn’t help that I got tired of testing and “fixed” some tests with arbitrary sleeps (in the tests, not in the source code). The readme says “elm-watch is serious about stability” – so this is a bit embarrassing.
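For reference, a minimal sketch of wiring that up, with retries gated to CI since the tests pass locally (whether elm-watch does it exactly like this is an assumption):

```ts
// jest.setup.ts — a minimal sketch; register it in jest.config.js via:
//   setupFilesAfterEnv: ["<rootDir>/jest.setup.ts"]
// Gating on the CI env variable (set by GitHub Actions) is an assumption.
if (process.env.CI === "true") {
  // Re-run each failing test up to 2 more times before reporting it as failed.
  // Requires the jest-circus runner (the default since Jest 27).
  jest.retryTimes(2);
}
```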
As soon as I get some more energy, I want to get back to this and clean the tests up.