
System tests #267

Closed
pdyraga opened this issue Mar 2, 2020 · 5 comments

Comments

@pdyraga
Member

pdyraga commented Mar 2, 2020

EDIT: Replaced by #382

Semi-automated tests proving all elements work as expected (a rough harness sketch follows the list):

  • opening new keeps, generating keys
  • signing with existing keeps
  • closing keeps and archiving public keys
  • tests at scale stressing the network
  • fuzz testing for protocol and network routines
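
A minimal sketch of how the first three items could be exercised end-to-end. The `KeepClient` interface and all of its methods are hypothetical stand-ins for whatever harness we end up building, not an existing API:

```go
package systemtest

import (
	"context"
	"testing"
	"time"
)

// KeepClient is a hypothetical harness wrapping a running client node;
// none of these methods exist yet.
type KeepClient interface {
	OpenKeep(ctx context.Context) (keepID string, err error)
	Sign(ctx context.Context, keepID string, digest [32]byte) ([]byte, error)
	CloseKeep(ctx context.Context, keepID string) error
}

// runOpenSignClose exercises the first three items in order: open a keep
// (which triggers key generation), sign with it, then close it.
func runOpenSignClose(t *testing.T, client KeepClient) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	keepID, err := client.OpenKeep(ctx)
	if err != nil {
		t.Fatalf("opening keep: %v", err)
	}

	digest := [32]byte{0x01} // arbitrary test digest
	signature, err := client.Sign(ctx, keepID, digest)
	if err != nil {
		t.Fatalf("signing: %v", err)
	}
	if len(signature) == 0 {
		t.Fatal("empty signature")
	}

	if err := client.CloseKeep(ctx, keepID); err != nil {
		t.Fatalf("closing keep: %v", err)
	}
}
```
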
@pdyraga pdyraga added this to the v0.10.0 milestone Mar 2, 2020
@pdyraga
Member Author

pdyraga commented Mar 3, 2020

Two scenarios I am very interested in:

  • the same set of members participates in two or more keeps and signing is requested in all of them at the same time (sketched below)
  • an operator initiates an unstake operation; that operator should no longer be eligible for work selection, but existing signing requests should be processed as usual
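
For the first scenario, a rough sketch in the same assumed `systemtest` package, reusing the hypothetical `KeepClient` harness from the sketch above; `keepA` and `keepB` would be two keeps backed by the same member set:

```go
// runConcurrentSigning requests signatures from two keeps backed by the
// same operators at the same time.
func runConcurrentSigning(t *testing.T, client KeepClient, keepA, keepB string) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	digest := [32]byte{0x02}
	errs := make(chan error, 2)

	// Fire both signing requests at once; the protocol must keep the two
	// sessions' network messages and key shares isolated even though the
	// same operators participate in both keeps.
	for _, keepID := range []string{keepA, keepB} {
		go func(id string) {
			_, err := client.Sign(ctx, id, digest)
			errs <- err
		}(keepID)
	}

	for i := 0; i < 2; i++ {
		if err := <-errs; err != nil {
			t.Errorf("concurrent signing failed: %v", err)
		}
	}
}
```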

@sthompson22
Contributor

This perspective is going to come largely from a "potential operator of a client or clients on the network".

Two things I'm primarily curious about:

  • Behavior at scale, meaning the behavior of multiple clients on the network, and of a single client participating in multiple simultaneous activities.
  • Behavior during non-happy path scenarios

What is scale?

We keep talking about scale, but we haven't defined its borders. Do we know:

  1. The minimum number of unique keep-tecdsa and beacon operator accounts participating to have a viable network?
  2. The potential maximum number of unique keep-tecdsa and beacon operator accounts participating at launch?

Ideally we can get an idea of 1 and 2, then test at those values and somewhere in between.

Behavior at scale

  • Network overhead when a client is connected to other clients at scale
  • CPU/Mem/Network utilization when a client is working with 1, 10, 100 keeps in parallel. Is there a limit here?
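
One way to start answering the "is there a limit" question, again against the hypothetical harness above. This only records wall-clock time; CPU/Mem/Network figures should be scraped from the node itself (e.g. via `kubectl top` or whatever metrics the client exposes) while the probe runs:

```go
// runScaleProbe opens 1, 10, then 100 keeps against one client and logs
// how long each batch takes, then cleans up.
func runScaleProbe(t *testing.T, client KeepClient) {
	for _, n := range []int{1, 10, 100} {
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)

		start := time.Now()
		keepIDs := make([]string, 0, n)
		for i := 0; i < n; i++ {
			id, err := client.OpenKeep(ctx)
			if err != nil {
				cancel()
				t.Fatalf("opening keep %d of %d: %v", i+1, n, err)
			}
			keepIDs = append(keepIDs, id)
		}
		t.Logf("n=%d keeps opened in %v", n, time.Since(start))

		// Close everything before moving to the next batch size so the
		// measurements don't compound.
		for _, id := range keepIDs {
			_ = client.CloseKeep(ctx, id)
		}
		cancel()
	}
}
```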

Non-happy path scenarios

I think this is what concerns me the most, as it relates to being slashed. For a given client we should be able to:

  • create a scenario in which a client should be slashed and observe the client being slashed appropriately
  • create a scenario that is unexpected but shouldn't result in slashing and observe the client not being slashed

In addition to being able to force these scenarios, it would be beneficial to monitor the overall slashed state of the network, possibly even as time-series data, to see if there are any runaway trends that may impact network stability.
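
A sketch of what that monitoring could look like, using go-ethereum's log filtering. The RPC endpoint, the staking contract address, and the `TokensSlashed(address,uint256)` event signature (including the assumption that the operator address is indexed) are all placeholders to verify against the deployed keep-core contracts:

```go
package main

import (
	"context"
	"log"
	"math/big"

	"github.com/ethereum/go-ethereum"
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/crypto"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	// Hypothetical endpoint; point at whichever Ropsten node we use.
	client, err := ethclient.Dial("wss://ropsten.example/ws")
	if err != nil {
		log.Fatal(err)
	}

	// Assumed staking contract address and event signature; both need to
	// be checked against the real contracts.
	staking := common.HexToAddress("0x0000000000000000000000000000000000000000")
	slashedTopic := crypto.Keccak256Hash([]byte("TokensSlashed(address,uint256)"))

	logs, err := client.FilterLogs(context.Background(), ethereum.FilterQuery{
		FromBlock: big.NewInt(0), // narrow this window in practice
		Addresses: []common.Address{staking},
		Topics:    [][]common.Hash{{slashedTopic}},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Block numbers give the time axis; topics/data give the series values.
	// Feed these into whatever time-series store the monitoring stack uses.
	for _, l := range logs {
		operator := "unknown"
		if len(l.Topics) > 1 { // assumes the operator address is indexed
			operator = l.Topics[1].Hex()
		}
		log.Printf("block=%d tx=%s operator=%s", l.BlockNumber, l.TxHash.Hex(), operator)
	}
}
```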


Outside of slashing, we need a way to exercise retries and recoveries where we expect them. Testing scenarios here may involve decoupling a running client from its associated Kube service, or pod cycling a client mid-activity. I'm a bit ignorant here so I can't split this out into specifics, but it feels like an area we should exercise.
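
One concrete shape the pod-cycling idea could take, reusing the hypothetical harness above plus an `os/exec` import; the namespace and label selector are placeholders:

```go
// runPodCycleRecovery kills the client's pod mid-signing and asserts the
// request still completes once the pod restarts.
func runPodCycleRecovery(t *testing.T, client KeepClient, keepID string) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()

	done := make(chan error, 1)
	go func() {
		_, err := client.Sign(ctx, keepID, [32]byte{0x03})
		done <- err
	}()

	// Let the signing protocol start, then cycle the pod underneath it.
	time.Sleep(10 * time.Second)
	out, err := exec.CommandContext(ctx, "kubectl", "delete", "pod",
		"--namespace", "keep-test", "--selector", "app=keep-ecdsa").CombinedOutput()
	if err != nil {
		t.Fatalf("pod cycle failed: %v\n%s", err, out)
	}

	// The request should still complete once the pod comes back and the
	// client resumes or retries the signing.
	if err := <-done; err != nil {
		t.Errorf("signing did not recover after pod cycle: %v", err)
	}
}
```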

@sthompson22
Contributor

I'll keep noodling on this; I wanted to brain-dump what's been flying around in my head for a bit.

@sthompson22
Contributor

Meeting notes (Piotr / Kuba / Antonio / Sloan) 2020/03/18

  • We're not sure how many nodes the network can support, for either the beacon or the ECDSA clients.
  • Latent failures, such as groups not forming a month into deployment.
  • Round-trip integration tests after each PR merge (the deployment pipeline needs to be sorted out as well). This will be time-consuming; we're worried about it impacting tests at scale.
  • We're looking at tackling the tests in a more manual fashion to sort things out, automating where we can, but the focus is on getting the tests (whatever they are) done first.
  • What are the parameters that we want to test?
    • opening / closing deposits
    • beacon requests
    • serial and burst requests (sketched after this list)
    • opening and closing keeps (in a vacuum)
  • We need more Ropsten ETH.
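
A sketch of the serial-versus-burst parameter from the list above, in the same vein as the earlier harness sketches; `requestRelayEntry` is a hypothetical helper that triggers one beacon request and blocks until the relay entry is produced:

```go
// runSerialThenBurst exercises the beacon with one request at a time,
// then with a concurrent batch.
func runSerialThenBurst(t *testing.T, requestRelayEntry func(context.Context) error) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
	defer cancel()

	// Serial: each request completes before the next one starts.
	for i := 0; i < 5; i++ {
		if err := requestRelayEntry(ctx); err != nil {
			t.Fatalf("serial request %d failed: %v", i, err)
		}
	}

	// Burst: fire a batch concurrently and collect the results.
	const burst = 10
	errs := make(chan error, burst)
	for i := 0; i < burst; i++ {
		go func() { errs <- requestRelayEntry(ctx) }()
	}
	for i := 0; i < burst; i++ {
		if err := <-errs; err != nil {
			t.Errorf("burst request failed: %v", err)
		}
	}
}
```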

Next steps:

  • Sloan / Antonio: deployment work (Tenderly / scaling beacon and ECDSA nodes)
  • Kuba: testing framework / test plan

@pdyraga pdyraga modified the milestones: v0.11.0, v0.12.0, v0.13.0, v0.14.0 Mar 31, 2020
@pdyraga
Member Author

pdyraga commented Apr 9, 2020

Replaced by #382

@pdyraga pdyraga closed this as completed Apr 9, 2020
@pdyraga pdyraga removed this from the v0.14.0 milestone Apr 13, 2020