feat(infra): concurrent materializer tests #1243

Open · wants to merge 14 commits into base: main

Conversation

@LePremierHomme (Contributor) commented on December 28, 2024

This PR introduces concurrent tests in the materializer, enabling the generation of traceable workloads without significantly changing how test scenarios are written.

Existing tests have been refactored to use make_testnet instead of with_testnet.

Progress

  • scenario 1: native coin transfer (docker_tests::benches::test_native_coin_transfer)
  • scenario 2: contract deployment
  • scenario 3: contract call (simplecoin transfer)
  • framework: assert that block gas limit isn't the TPS bottleneck
  • framework: TPS reporting
  • framework: latency reporting
  • framework: reach 1,000 max concurrency
  • framework: standardize report publication and regression checks
  • framework: NonceManager was introduced because get_transaction_count isn't reliable; this still needs to be double-checked

For follow-up PRs

  • Create manifest files dynamically
  • Introduce profiling features
  • Introduce tracing / statistical analysis features

@LePremierHomme requested a review from a team as a code owner on December 28, 2024 at 20:52
@LePremierHomme marked this pull request as a draft on December 28, 2024 at 20:56
@LePremierHomme changed the title from feat(tests): concurrent materializer tests to [WIP] feat(tests): concurrent materializer tests on December 28, 2024
fendermint/testing/materializer/src/concurrency.rs (two outdated review threads, marked resolved)
@@ -59,13 +54,14 @@ fn read_manifest(file_name: &str) -> anyhow::Result<Manifest> {

/// Parse a manifest file in the `manifests` directory, clean up any corresponding
/// testnet resources, then materialize a testnet and run some tests.
-pub async fn with_testnet<F, G>(manifest_file_name: &str, alter: G, test: F) -> anyhow::Result<()>
+pub async fn with_testnet<F, G>(manifest_file_name: &str, concurrency: Option<concurrency::Config>, alter: G, test: F) -> anyhow::Result<()>
Contributor:

  1. I consider this inversion of control (IoC) a convenient, rather bespoke utility that the original developer of the materializer used to create an initial batch of tests. I would not glorify it and turn it into a focal entrypoint for every future test.
  2. There isn't a single clear-cut way we'll want all tests that potentially require some concurrency to behave, so I'm not sold on introducing concurrency as a framework feature.

Instead, I think we're better served if we introduced a non-IoC API here:

  1. Extract the logic that actually materializes the definition into a separate function that returns the Manifest, DockerMaterializer, DockerTestnet (see the sketch after this list).
  2. The test can now call this function to materialize a definition, do whatever it wants (using whatever concurrency it desires).
  3. Something needs to have a drop guard here that destroys the materialized testnet, probably the DockerTestnet? Not sure if that's implemented.
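A minimal sketch of that shape, for illustration only: the exact signature, generics, and body are assumptions rather than the final API, while read_manifest, Manifest, DockerMaterializer, and DockerTestnet are the existing names from the codebase.

```rust
/// Sketch only: materialize the testnet described by a manifest and hand the
/// pieces back to the caller, instead of driving the test through a closure.
pub async fn make_testnet<G>(
    manifest_file_name: &str,
    alter: G,
) -> anyhow::Result<(Manifest, DockerMaterializer, DockerTestnet)>
where
    G: FnOnce(&mut Manifest),
{
    let mut manifest = read_manifest(manifest_file_name)?;
    alter(&mut manifest);
    // Clean up resources left over from a previous run, then materialize the
    // testnet with the DockerMaterializer (details elided in this sketch).
    todo!("materialize `manifest` and return (manifest, materializer, testnet)")
}
```

Point 3 (tearing down the materialized testnet) could then be handled either by a Drop guard on DockerTestnet or by an explicit cleanup call.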

Contributor (author):

I've inverted the inversion, as you suggested, and now provide a cleanup function. I don't think it's helpful to maintain two patterns, so I migrated the previous tests.

It also helped to untangle concurrency from the framework, making it a simple test lib utility.
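For illustration, a migrated test could look roughly like the sketch below, assuming a tokio test runtime; the manifest file name and the run_transfers/cleanup helpers are hypothetical placeholders, not the PR's actual code.

```rust
#[tokio::test]
async fn test_native_coin_transfer() -> anyhow::Result<()> {
    // Materialize the testnet up front instead of passing a closure to `with_testnet`.
    let (manifest, mut materializer, mut testnet) =
        make_testnet("example.yaml", |_| {}).await?;

    // The test owns the control flow and can apply whatever concurrency it wants.
    let result = run_transfers(&manifest, &testnet).await;

    // Explicit teardown replaces the old IoC-style automatic cleanup.
    cleanup(&mut materializer, &mut testnet).await?;
    result
}
```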

@LePremierHomme changed the title from [WIP] feat(tests): concurrent materializer tests to feat(test): concurrent materializer tests on January 7, 2025
@LePremierHomme marked this pull request as ready for review on January 7, 2025 at 12:23
@LePremierHomme changed the title from feat(test): concurrent materializer tests to feat(infra): concurrent materializer tests on January 7, 2025
}

pub async fn record(&mut self, label: String) {
    let duration = self.start_time.unwrap().elapsed();
Contributor:

I feel expect here is better if you assume the caller knows that calling start should happen first.
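In code, the suggestion amounts to something like the following one-liner (a sketch, assuming start_time is the Option set by start as in the snippet above):

```rust
let duration = self
    .start_time
    .expect("`start` must be called before `record`")
    .elapsed();
```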

Contributor (author):

I'll revise this API once the reporting summary is more solid.

Ok(bencher) => (Some(bencher), None),
Err(err) => (None, Some(err)),
};
step_results.lock().await.push(TestResult {
Contributor:

If this triggers a lot of threads, then everyone is waiting on this as well; also, as step_results gets big, allocation might take some time. Just curious: if step_results were not updated at all, would there be a big difference?

Contributor (author):

This is unlikely to be a bottleneck; it can only impose a small delay in the after-test lifecycle of the future, which isn't recorded and isn't time-sensitive. But I'll double-check that once I get to high max-concurrency figures.
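If the shared Vec ever does show up as a hot spot, one lock-free alternative (a sketch under assumed names, not what this PR does) is to have each spawned future return its own TestResult and collect everything once at join time; run_one_test and inputs are hypothetical, TestResult is the type used above.

```rust
use futures::future::join_all;

// Each task owns its TestResult until it is joined; no Mutex around the results.
let handles: Vec<_> = inputs
    .into_iter()
    .map(|input| tokio::spawn(run_one_test(input)))
    .collect();

let results: Vec<TestResult> = join_all(handles)
    .await
    .into_iter()
    .collect::<Result<_, _>>()?; // a JoinError (e.g. a panicked task) surfaces here
```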


#[derive(Default)]
pub struct NonceManager {
    nonces: Arc<Mutex<HashMap<H160, U256>>>,
Contributor:

I think this is a bottleneck as well; every address is waiting on the same lock. Maybe this might help: https://github.com/xacrimon/dashmap
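For reference, a DashMap-backed version might look roughly like this; the import paths and the next_nonce method are assumptions, while H160/U256 are the key and value types shown above.

```rust
use dashmap::DashMap;
use ethers::types::{H160, U256};

#[derive(Default)]
pub struct NonceManager {
    // DashMap shards the map internally, so different senders only contend
    // when they hash to the same shard, instead of all waiting on one Mutex.
    nonces: DashMap<H160, U256>,
}

impl NonceManager {
    /// Return the next nonce for `addr` and bump the stored value.
    pub fn next_nonce(&self, addr: H160) -> U256 {
        let mut entry = self.nonces.entry(addr).or_insert_with(U256::zero);
        let next = *entry;
        *entry = next + U256::one();
        next
    }
}
```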

Contributor (author):

Yes, this is just a temporary solution; I was hoping to remove it entirely. If not, I'll optimize it.

Labels: none yet
Projects: Status: Backlog
3 participants