miri: Enable tests using epoll_wait #19956

def- · 2023-06-14T22:12:36Z

Seems to have been implemented recently in rust-lang/miri#2764, I'll work on fixing this up. Probably an alternative to #19950

Based on recent segfaults we definitely want more Miri testing.

Motivation

Tips for reviewer

Checklist

This PR has adequate test coverage / QA involvement has been duly considered.
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
This PR includes the following user-facing behavior changes:

danhhz · 2023-06-15T15:24:39Z

src/persist/src/s3.rs

@@ -926,7 +926,6 @@ mod tests {

    #[mz_ore::test(tokio::test(flavor = "multi_thread"))]
    #[cfg_attr(coverage, ignore)] // https://github.com/MaterializeInc/materialize/issues/18898
-    #[cfg_attr(miri, ignore)] // unsupported operation: can't call foreign function `epoll_wait` on OS `linux`


I think it'd also be totally reasonble to continue ignoring the persist stuff that actually uses external network, if that's easier or lets us use more aggressive timeouts:

file_blob

postgres_consensus

s3_blob

(off the top of my head, I think everything else under src/persist* uses in-mem impls). if you do, could also delete these env whitelisted above

--env MZ_PERSIST_EXTERNAL_STORAGE_TEST_S3_BUCKET --env MZ_PERSIST_EXTERNAL_STORAGE_TEST_POSTGRES_URL

def- · 2023-06-16T07:20:47Z

One problem is that this increases the miri runtime to 39 minutes. At least it can run in parallel with the normal build + tests but this will make our normal tests CI slower. If it's not acceptable I can disable some more tests, or have to figure out a way to split this into multiple parts (like testdrive and cloudtest), but there is some overhead in that case since they all have to build the tests again, and use builder agents for that.

The counter argument is that this found quite a few potential issues, so seems valuable. Could also maybe cut down the number of tests here further, and move the full suite to nightly. But I already put in more work than expected into this the last 2 days.

Based on recent segfaults we definitely want more Miri testing.

def- · 2023-07-13T09:51:00Z

One problem is that this increases the miri runtime to 39 minutes. At least it can run in parallel with the normal build + tests but this will make our normal tests CI slower. If it's not acceptable I can disable some more tests, or have to figure out a way to split this into multiple parts (like testdrive and cloudtest), but there is some overhead in that case since they all have to build the tests again, and use builder agents for that.

The counter argument is that this found quite a few potential issues, so seems valuable. Could also maybe cut down the number of tests here further, and move the full suite to nightly. But I already put in more work than expected into this the last 2 days.

@philip-stoev @benesch Opinions? With 6 workers it's possible to keep this in normal tests. The main disadvantage is that each worker has to build the tests again. The main disadvantage of moving to nightly is that devs wouldn't notice Miri failures immediately and QA has more to do.

My preference would be a split approach, same as for SLT. Keep a small set of fast Miri running in tests, put the rest into a slow nightly. But I'd like some confirmation before I rework that again. Maybe we're fine with the extra cost of keeping it in tests.

philip-stoev · 2023-07-13T10:16:57Z

So you are saying that miri regularily detects regressions and is not one of those CI jobs that always pass without anything of interest? If this is the case, a split approach would be preferable to parallelism of 6 . six is a lot

def- · 2023-07-13T10:30:59Z

Not sure how regular, devs fix them before submitting since it's part of tests run currently.

def- · 2023-07-13T11:35:30Z

@philip-stoev Implemented. Should be good to go in now. Nightly to verify successfully completed: https://buildkite.com/materialize/nightlies/builds/2817#_

ParkMyCar

Surface changes LGTM, glad we're getting more miri coverage!

benesch · 2023-07-13T18:02:06Z

My preference would be a split approach, same as for SLT. Keep a small set of fast Miri running in tests, put the rest into a slow nightly.

My preference as well!

def- mentioned this pull request Jun 14, 2023

test: Have mz_ore::test spawn it's own runtime #19950

Closed

5 tasks

def- force-pushed the pr-epoll-miri branch 5 times, most recently from 1ac4112 to 8a3b731 Compare June 15, 2023 07:47

def- changed the title ~~miri: Try enabling epoll_wait~~ miri: Enable tests using epoll_wait Jun 15, 2023

def- force-pushed the pr-epoll-miri branch 3 times, most recently from 8e3d2d9 to d5b2c0b Compare June 15, 2023 14:15

danhhz reviewed Jun 15, 2023

View reviewed changes

def- force-pushed the pr-epoll-miri branch 6 times, most recently from 954bccd to d5fd537 Compare June 15, 2023 22:52

def- marked this pull request as ready for review June 16, 2023 07:18

def- requested review from a team June 16, 2023 07:18

def- requested review from benesch, a team and umanwizard as code owners June 16, 2023 07:18

def- requested a review from nrainer-materialize June 16, 2023 07:18

def- force-pushed the pr-epoll-miri branch from d5fd537 to 90d6131 Compare June 16, 2023 07:23

nrainer-materialize approved these changes Jun 16, 2023

View reviewed changes

def- force-pushed the pr-epoll-miri branch from 90d6131 to 6c794cc Compare July 13, 2023 09:18

miri: Enable tests using epoll_wait

c019367

Based on recent segfaults we definitely want more Miri testing.

def- force-pushed the pr-epoll-miri branch from 6c794cc to c019367 Compare July 13, 2023 09:51

def- requested review from philip-stoev and nrainer-materialize and removed request for nrainer-materialize July 13, 2023 10:15

def- force-pushed the pr-epoll-miri branch from f169e2e to 71ec550 Compare July 13, 2023 10:43

Split miri into two parts, run slow only in nightly

8f7dcec

def- force-pushed the pr-epoll-miri branch from 71ec550 to 8f7dcec Compare July 13, 2023 10:48

ParkMyCar approved these changes Jul 13, 2023

View reviewed changes

def- merged commit 5149d3d into MaterializeInc:main Jul 14, 2023

def- deleted the pr-epoll-miri branch July 14, 2023 07:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

miri: Enable tests using epoll_wait #19956

miri: Enable tests using epoll_wait #19956

def- commented Jun 14, 2023 •

edited

Loading

danhhz Jun 15, 2023

def- commented Jun 16, 2023 •

edited

Loading

def- commented Jul 13, 2023 •

edited

Loading

philip-stoev commented Jul 13, 2023

def- commented Jul 13, 2023

def- commented Jul 13, 2023 •

edited

Loading

ParkMyCar left a comment

benesch commented Jul 13, 2023

miri: Enable tests using epoll_wait #19956

miri: Enable tests using epoll_wait #19956

Conversation

def- commented Jun 14, 2023 • edited Loading

Motivation

Tips for reviewer

Checklist

danhhz Jun 15, 2023

Choose a reason for hiding this comment

def- commented Jun 16, 2023 • edited Loading

def- commented Jul 13, 2023 • edited Loading

philip-stoev commented Jul 13, 2023

def- commented Jul 13, 2023

def- commented Jul 13, 2023 • edited Loading

ParkMyCar left a comment

Choose a reason for hiding this comment

benesch commented Jul 13, 2023

def- commented Jun 14, 2023 •

edited

Loading

def- commented Jun 16, 2023 •

edited

Loading

def- commented Jul 13, 2023 •

edited

Loading

def- commented Jul 13, 2023 •

edited

Loading