
Refactor the integration testing code #549

Status: Open. Wants to merge 2 commits into main.
Conversation

YehorBoiar (Contributor)

This PR addresses issue #523.

@YehorBoiar YehorBoiar marked this pull request as draft December 7, 2024 19:36
@YehorBoiar YehorBoiar marked this pull request as ready for review December 7, 2024 19:39
@YehorBoiar (Contributor, Author)

@ozgurakgun Please review

@niklasdewally (Collaborator)

Thank you for improving the docs as part of this :)


@niklasdewally niklasdewally left a comment


LGTM, aside from some minor nitpicks!

(Review comments on conjure_oxide/tests/generated_tests.rs: resolved)
@niklasdewally niklasdewally linked an issue Dec 9, 2024 that may be closed by this pull request
In PR #421, tests were unintentionally passing due to expected files
being overwritten regardless of whether the generated solutions matched
Conjure’s solutions.

This commit refactors the integration_test_inner function and related
file I/O utilities so that:

* When ACCEPT=false, tests still compare generated outputs (parse,
  rewrite, solutions, and traces) against the expected results as
  before.

* When ACCEPT=true, tests now first verify that the generated solutions
  match the Conjure solutions. Only if they match are the expected files
  updated. If they don’t match, the test fails without updating the
  expected files.
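The accept logic described in this commit message could look roughly like the following sketch. All names here (`check_or_accept`, `TestOutcome`) are illustrative, not the actual conjure_oxide test helpers:

```rust
#[derive(Debug, PartialEq)]
enum TestOutcome {
    Passed,
    ExpectedUpdated,
    Failed(String),
}

/// When `accept` is false, compare the generated output against the expected file.
/// When `accept` is true, only overwrite the expected output after the generated
/// solutions have been checked against Conjure's solutions.
fn check_or_accept(
    accept: bool,
    generated: &str,
    expected: &mut String,
    conjure_solutions: &str,
) -> TestOutcome {
    if !accept {
        if generated == expected.as_str() {
            TestOutcome::Passed
        } else {
            TestOutcome::Failed("generated output differs from expected".into())
        }
    } else if generated == conjure_solutions {
        // Solutions agree with Conjure: safe to update the expected file.
        *expected = generated.to_string();
        TestOutcome::ExpectedUpdated
    } else {
        // Do NOT overwrite expected files on mismatch (the bug from #421).
        TestOutcome::Failed("generated solutions do not match Conjure".into())
    }
}
```

The key point is the final arm: in accept mode a mismatch fails the test without touching the expected file, which is exactly the behaviour #421 lacked.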

github-actions bot commented Dec 9, 2024

Code and Documentation Coverage Report

Documentation Coverage

This PR:

conjure_core: 4% with examples, 49% documented -- 8/98/223
conjure_oxide: 0% with examples, 22% documented -- 0/7/35
minion_rs: 30% with examples, 82% documented -- 2/23/112
tree-morph: 17% with examples, 50% documented -- 1/4/10

Main:

conjure_core: 4% with examples, 49% documented -- 8/98/223
conjure_oxide: 0% with examples, 19% documented -- 0/4/35
minion_rs: 30% with examples, 82% documented -- 2/23/112
tree-morph: 17% with examples, 50% documented -- 1/4/10

View full documentation coverage for main, this PR

Code Coverage Summary

This PR: Detailed Report

  lines......: 76.9% (5131 of 6669 lines)
  functions..: 61.7% (448 of 726 functions)
  branches...: no data found

Main: Detailed Report

  lines......: 76.7% (5145 of 6704 lines)
  functions..: 61.7% (448 of 726 functions)
  branches...: no data found

Coverage Change (Main → PR)

Lines coverage changed by 0.20%; covered lines changed by -14.
Functions coverage changed by 0.00%; covered functions changed by 0.
Branches: no comparison data available.

@ozgurakgun (Contributor)

I am not sure of this one yet @YehorBoiar - I feel we can tidy this up a bit more, but I need to spend a bit more time formulating my thoughts. Thanks for the work so far!

@niklasdewally (Collaborator)

niklasdewally commented Dec 11, 2024

@ozgurakgun @YehorBoiar

Maybe we need to think deeper about how we customise each stage of the tester.

Currently we have the following steps

  1. Parse
  2. Rewrite
  3. Solve

(thanks Yehor for documenting this)

We could think about all the other stuff the tester does as stages too:

1a. Parse
1b. Check against other parser(s)
2a. Rewrite
2b. Check model properties (extra_asserts)
3a. Solve
3b. Check solutions against Conjure

These steps fall into two categories:

  • Optional steps, enabled by an env var / configuration value: 1b, 2b, 3b
  • Required steps with multiple concrete implementations: e.g. the native parser vs the JSON parser, or the naive rewriter vs the optimising one.

Thinking of everything as steps would make configuration simpler. For each optional step, we would parse a config.toml value enable_<step>. For each required step, we would keep a list of possible implementations, which could be changed in config by <step>_impl.
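As a rough illustration of that scheme, assuming a flat string map stands in for the parsed config.toml (the `enabled` and `implementation` helpers are hypothetical, as are the key names beyond those in the comment above):

```rust
use std::collections::HashMap;

/// Is the optional step `<step>` enabled? Looks up `enable_<step>`,
/// defaulting to disabled when the key is absent.
fn enabled(config: &HashMap<String, String>, step: &str) -> bool {
    config
        .get(&format!("enable_{step}"))
        .map(|v| v == "true")
        .unwrap_or(false)
}

/// Which implementation of the required step `<step>` should run?
/// Looks up `<step>_impl`, falling back to a default.
fn implementation<'a>(
    config: &'a HashMap<String, String>,
    step: &str,
    default: &'a str,
) -> &'a str {
    config
        .get(&format!("{step}_impl"))
        .map(String::as_str)
        .unwrap_or(default)
}
```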

@niklasdewally (Collaborator)

niklasdewally commented Dec 11, 2024

It would also be nice to have parser-only tests: e.g. for treesitter development, or for when I'm adding more complicated stuff to the AST. Also no-solver tests, so I can test normalisation rule PRs.

A possible approach for the above:

  1. Parse test configuration / What are we testing? At the start of testing, we read config.toml and the environment variables, and construct a list of steps to perform.
  2. Run tests: we run each step in the list, erroring if it fails.
  3. Update tests: if accept=true, we run some custom on_accept() function for each step in our list.

This is a vague, high level proposal, so I'm happy to give a more concrete example in Rust.
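One possible shape for such a concrete example, under the proposal's assumptions: a `TestStep` trait with a default `on_accept`, and a runner that executes each configured step in order. All names are hypothetical:

```rust
trait TestStep {
    fn name(&self) -> &str;
    /// Run the step, erroring if it fails.
    fn run(&self) -> Result<(), String>;
    /// Called for each step when ACCEPT=true, e.g. to rewrite expected files.
    fn on_accept(&self) -> Result<(), String> {
        Ok(())
    }
}

/// A trivial concrete step, standing in for parse/rewrite/solve etc.
struct NamedStep {
    name: String,
    ok: bool,
}

impl TestStep for NamedStep {
    fn name(&self) -> &str {
        &self.name
    }
    fn run(&self) -> Result<(), String> {
        if self.ok { Ok(()) } else { Err("step failed".into()) }
    }
}

/// Run the step list built from config.toml / env vars, stopping on the
/// first failure; in accept mode, also run each step's `on_accept`.
fn run_steps(steps: &[Box<dyn TestStep>], accept: bool) -> Result<Vec<String>, String> {
    let mut ran = Vec::new();
    for step in steps {
        step.run().map_err(|e| format!("{}: {e}", step.name()))?;
        if accept {
            step.on_accept()?;
        }
        ran.push(step.name().to_string());
    }
    Ok(ran)
}
```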

@ozgurakgun (Contributor)

(1) The steps have a similar shape to each other; a step abstraction that takes in a function and implements the generate/check-against/accept logic once is closer to what I am thinking.

(2) We can also think about a global properties/assertions/warnings structure that is enabled/disabled depending on what we are looking for in a particular run. An example is the prop_ we added in the naive rewriter. A test run or an executable run of oxide should be able to enable/disable different properties at different levels. Consider it an error/warning/trace etc.

I have been looking into the existing tracing libraries for (2) but didn't manage to find one that does everything I think we need. I think we need named properties, registered centrally (maybe through codegen), and at the call site we decide how to interpret each property. (1) is this PR; (2) is a separate but related one...
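A minimal sketch of the step abstraction in (1): the generate/check-against/accept logic written once, parameterised by a generating function. The `step` helper and its signature are hypothetical:

```rust
/// Run one step: call `generate`, then either check the result against the
/// recorded expectation (check mode) or record it as the new expectation
/// (accept mode). Written once, reused for parse/rewrite/solve outputs.
fn step<T: PartialEq + Clone>(
    generate: impl Fn() -> T,
    expected: &mut Option<T>,
    accept: bool,
) -> Result<T, String> {
    let generated = generate();
    if accept {
        // Accept mode: record the generated value as the new expectation.
        *expected = Some(generated.clone());
        return Ok(generated);
    }
    // Check mode: compare against the recorded expectation.
    match expected {
        Some(exp) if *exp == generated => Ok(generated),
        Some(_) => Err("generated value differs from expected".to_string()),
        None => Err("no expected value recorded".to_string()),
    }
}
```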

@niklasdewally (Collaborator)

niklasdewally commented Dec 13, 2024

(2) We can also think about a global properties/assertions/warnings structure that is enabled/disabled depending on what we are looking for in a particular run. An example is the prop_ we added in the naive rewriter. A test run or an executable run of oxide should be able to enable/disable different properties at different levels. Consider it an error/warning/trace etc.

To clarify, these need to be dynamic, not static? If build time is fine, feature flags and simple declarative macros could take us most of the way there...

Can we store these flags in the Context? Context already acts as a global variable store, accessible by the entire program, which is initialised by the caller of the Conjure Oxide API (be that the tester, the CLI, or user crates): this sounds like it fits the bill?

As we have serde, it would be easy to set these flags through a toml file instead of the command line. And we already read toml for the tests anyway, so we could do:

# config.toml
[test]
# test settings here
rewriter = "naive"

[debug_flags]
# conjure oxide debug settings here
debug_check_only_one_applicable_rule = true

For example, we could parse the debug_flags section into a DebugFlags struct which is part of our Context.
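A std-only sketch of what that could look like. In practice the struct would likely derive serde's `Deserialize` and be parsed from the `[debug_flags]` table by the toml machinery the tests already use; the hand-rolled parser below is purely illustrative:

```rust
/// Debug flags that would live inside the Context. The field name is
/// taken from the config.toml example above.
#[derive(Debug, Default, Clone, PartialEq)]
struct DebugFlags {
    debug_check_only_one_applicable_rule: bool,
}

impl DebugFlags {
    /// Parse `key = value` lines from a `[debug_flags]` toml section.
    /// (Stand-in for a serde `Deserialize` derive.)
    fn from_toml_section(section: &str) -> Self {
        let mut flags = DebugFlags::default();
        for line in section.lines() {
            // Strip trailing comments before splitting on '='.
            let line = line.split('#').next().unwrap_or("");
            if let Some((key, value)) = line.split_once('=') {
                if key.trim() == "debug_check_only_one_applicable_rule" {
                    flags.debug_check_only_one_applicable_rule = value.trim() == "true";
                }
            }
        }
        flags
    }
}
```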

@niklasdewally (Collaborator)

niklasdewally commented Dec 13, 2024

Moving our existing properties to the context and making it serialisable from the test toml should be possible to do immediately without too much effort.

Adding macros / a nicer API would hopefully then go on top of our existing stuff without much large scale refactoring.

@ozgurakgun (Contributor)

Dynamic as in runtime instead of compile time, yes.

I expect we won't change this during a run, but I guess it should be possible. A kind of debugging option where we increase the message levels during a run could be useful.

Putting them in context is fine.


The pull request is marked as stale because it has been inactive for 30 days. If there is no new activity it will be automatically closed in 5 days.

@github-actions github-actions bot added the Stale label Jan 13, 2025

Successfully merging this pull request may close the issue: Refactor the integration testing code.

3 participants