Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo test doesn't check against conjure #508

Open
YehorBoiar opened this issue Nov 27, 2024 · 11 comments
Open

cargo test doesn't check against conjure #508

YehorBoiar opened this issue Nov 27, 2024 · 11 comments
Assignees
Labels
area::conjure-oxide Related to conjure_oxide. kind::discussion General discussion and high-level planning. kind::testing Testing and Correctness

Comments

@YehorBoiar
Copy link
Contributor

Conjure-oxide checks solutions against savile row only when ACCEPT=true is passed, but shouldn't it always do that?

I was thinking that because in PR #435 I currently have a couple of solutions that don't match conjure, but it shows that all tests are being passed. It could lead to some problems in future PRs. However, I'm not sure about it, what do you think? @niklasdewally

@YehorBoiar YehorBoiar added area::conjure-oxide Related to conjure_oxide. kind::testing Testing and Correctness kind::discussion General discussion and high-level planning. labels Nov 27, 2024
@niklasdewally
Copy link
Collaborator

I presume you mean Conjure?

When you run ACCEPT=true, it should save the expected solutions to the file, checking these against Conjure. When testing our model we compare our solutions to these. As they have already been checked by Conjure during code-gen, we shouldn't need to do so again?

@niklasdewally
Copy link
Collaborator

The fact that the integration tests pass but conjure fails is definitely a bug though.

We have started having issues with CI tests failing for no reason, in particular macos-latest, which might be related

@ozgurakgun
Copy link
Contributor

I agree we shouldn't run conjure every time we run integration tests since the output shouldn't change.

If there is a case where it did, that would definitely be a bug (or some sort of a major conjure version upgrade, but that didn't happen)

@YehorBoiar can you explain what's happening? How can we reproduce the issue?

@ozgurakgun ozgurakgun changed the title cargo test doesn't check against savilerow cargo test doesn't check against conjure Nov 27, 2024
@YehorBoiar
Copy link
Contributor Author

Here is the bug:

When I run ACCEPT=true cargo test I get my tests_integration_basic_max_02 failed because solutions don't match conjure
image

However, despite that, my expected parse.serialised, rewrite.serialised, and minion.solutions are still being rewritten to its genereted files. So next time I run cargo test it shows that my tests_integration_basic_max_02 is fine.
image

@ozgurakgun
Copy link
Contributor

ozgurakgun commented Nov 28, 2024

All I can say without looking too much into it is that a=3 shouldn't be a solution. the test case Essence is:

find a : int(0..3)
such that max([2,a]) <= 2

Something is going wrong, not sure what.

@YehorBoiar
Copy link
Contributor Author

YehorBoiar commented Nov 28, 2024

I know that solution isn't correct, but I want to point out a separate issue. We still rewrite from generated model into an expected model even though the test isn't being passed. I was expecting that when we pass an ACCEPT=true it would rewrite only those generated models that have passed the test. Is my expectation reasonable? Or accept flag should work this way?

@ozgurakgun
Copy link
Contributor

wait a minute. conjure generating an incorrect solution would break our flow. why do we get a=3 as a solution from conjure?

can you check conjure solve --number-of-solutions=all FILENAME on that Essence file to see why this is happening?

@YehorBoiar
Copy link
Contributor Author

It seems there's a problem with the behavior of the ACCEPT=true flag. Here's what I'm observing:

  • Conjure generates the correct solution: a = 0, a = 1, a = 2.
  • My incorrect solution is: a = 0, a = 1, a = 2, a = 3.
  • When running ACCEPT=true cargo test, the test fails (as expected) because my solution does not match Conjure's.
  • However, if I run cargo test after that, it shows all tests as passing. This indicates that ACCEPT=true rewrote the expected solution despite the test failing, causing a mismatch in behavior between the two runs.

The core issue is that ACCEPT=true cargo test appears to rewrite the expected solution regardless of the test result. This breaks the flow by essentially "accepting" incorrect solutions.

Is this the intended behavior of the ACCEPT=true flag, or should it only rewrite the expected models if the test passes?

@ozgurakgun
Copy link
Contributor

I see, thanks for the investigation. It sounds like we should only accept the outputs if the tests pass, indeed. Do you want to work on a fix?

@YehorBoiar
Copy link
Contributor Author

I can do that

@YehorBoiar YehorBoiar self-assigned this Nov 28, 2024
@YehorBoiar
Copy link
Contributor Author

YehorBoiar commented Nov 28, 2024

Just had a chat with Felix and he gave me a good idea. Should we just create a separate flag CONJURE_CHECK that would run conjure on cargo test when passed, and add that check to CI pipeline as well? @ozgurakgun @niklasdewally

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area::conjure-oxide Related to conjure_oxide. kind::discussion General discussion and high-level planning. kind::testing Testing and Correctness
Projects
None yet
Development

No branches or pull requests

3 participants