Improve test flakiness #3

Open
lydell opened this issue Mar 12, 2022 · 1 comment

lydell commented Mar 12, 2022

The tests are very comprehensive, and I’m very happy that they helped me find so many edge cases. They are written at a very high level, which gives a lot of confidence (and should make a potential rewrite in another language nice). However, that high level involves real time passing, real file system watching and real WebSockets. While that did help me understand file watching better, for example (like, how many watcher events do you get if the same file changes rapidly?), it does make the tests a bit flaky. I’ve used some pretty … clever … hacks to stabilize many tests, but not all.

Currently, the tests pass locally on Linux, Windows and macOS. In CI, jest.retryTimes(2) is needed, but even with that one or two jobs usually fail; by manually restarting them it’s possible to get all green checkmarks. So the tests still give a lot of confidence, they’re just a little bit annoying.
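
For reference, here is a minimal sketch of that retry setup, assuming a dedicated setup file loaded via Jest’s setupFilesAfterEnv (the file name and exact wiring here are assumptions, not necessarily how this repo does it):

// retry.setup.ts (hypothetical file name)
// Requires the jest-circus runner (the default since Jest 27).
// Each failing test is re-run up to 2 more times before Jest reports it as failed.
jest.retryTimes(2);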

It doesn’t help that I got tired of testing and “fixed” some tests with arbitrary sleeps (in the tests, not in the source code). The readme says “elm-watch is serious about stability” – so this is a bit embarrassing.

As soon as I get some more energy I want to get back to this and clean the tests up.

lydell added a commit that referenced this issue Jul 16, 2022
As a stop-gap solution for #3, retry flaky tests. This improves confidence by getting those green checkmarks.

The retried tests are logged so I should be able to scrape which ones are the flakiest and improve them over time.

The Windows tests seem to fail consistently in CI (but not locally), though, even with retries.
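
One way that retry logging could be wired up is with a custom test environment. This is a hypothetical sketch, not necessarily how elm-watch does it, and it assumes the jest-circus runner forwards a "test_retry" event to handleTestEvent:

// retryLoggingEnvironment.ts (hypothetical file name)
import NodeEnvironment from "jest-environment-node";
import type { Circus } from "@jest/types";

export default class RetryLoggingEnvironment extends NodeEnvironment {
  handleTestEvent(event: Circus.Event): void {
    if (event.type === "test_retry") {
      // One greppable line per retried test, in the spirit of the
      // "RETRY ERRORS" lines scraped by the script further down.
      console.error(`RETRY: ${event.test.name}`);
    }
  }
}

Such an environment would then be pointed at from the testEnvironment option in the Jest config.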

lydell commented Jul 23, 2022

Here is a fish script to see which tests are retried the most:

# Fill in a GitHub personal access token (needed to download the CI logs):
set token FILL_ME_IN

# Directory next to this script where the logs are collected:
set dir (status dirname)/scrape
mkdir -p $dir

# Fetch the latest workflow runs, one compact JSON object per list element:
set workflow_runs (curl -H "Accept: application/vnd.github+json" "https://api.github.com/repos/lydell/elm-watch/actions/runs?per_page=100&created=>=2022-07-16&exclude_pull_requests=true" | jq -c '.workflow_runs[]')

set count (count $workflow_runs)

# Download the logs of every "Test" workflow run into its own directory:
for i in (seq $count)
    set workflow_run $workflow_runs[$i]
    set name (string join \n -- $workflow_run | jq -r '.name')
    if test $name != Test
        echo "### $i/$count: Skipping: $name"
        continue
    end
    set created_at (string join \n -- $workflow_run | jq -r '.created_at')
    set logs_url (string join \n -- $workflow_run | jq -r '.logs_url')
    set subdir $dir/$created_at
    set zip $subdir/logs.zip
    echo "### $i/$count: Download logs from $created_at to $subdir"
    rm -rf $subdir
    mkdir -p $subdir
    curl -L -H "Authorization: token $token" -H "Accept: application/vnd.github+json" $logs_url >$zip
    unzip -d $subdir $zip
end

# Pull every "RETRY ERRORS" line out of the logs, split it into tab-separated fields,
# and count the last field (the test name) to see which tests are retried the most:
set results_file $dir/results.tsv
rg 'RETRY ERRORS  (.+)' $dir -or '$1' | rg '([^/]+Z)[^(]+\(([^,]+), (\d+)\)[^:]+:(.+)' -or '$1'\t'$2'\t'$3'\t'$4' >$results_file
cut -f 4 $results_file | sort | uniq -c | sort -nr
