
Initial draft of local test framework RFC #8896

Closed
wants to merge 2 commits

Conversation

MartinMinkov (Contributor)

Proposes a containerization-focused solution for the architecture and overall workflow of the local test framework.

@MartinMinkov MartinMinkov linked an issue May 17, 2021 that may be closed by this pull request
@nholland94 nholland94 changed the base branch from develop to compatible May 17, 2021 20:30
@nholland94 (Member) left a comment

This all looks great, with the exception of the approach for streaming logs back to the test executive. I think we need to think of a more robust solution upfront for this, since our goal here is to run local networks with snarks turned off (see my other comment for details of why this is relevant).


# Unresolved Questions

- Is compiling a docker-compose file the right approach for scheduling the containers? The nice thing about using a docker-compose file is that all network management should be automatic.
Member:

Yeah, I think this is the right way to go, especially for the network management bit.
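To make the trade-off concrete, here is a minimal sketch of what a compiled docker-compose file for such a network might look like. The service names, image tag, and daemon flags are illustrative assumptions, not taken from the RFC:

```yaml
version: "3.8"
services:
  seed:
    image: minaprotocol/mina-daemon:local-test   # hypothetical tag
    command: daemon -seed
  block-producer-1:
    image: minaprotocol/mina-daemon:local-test   # hypothetical tag
    command: daemon -peer seed
    depends_on:
      - seed

# docker-compose attaches every service to a shared default network,
# so containers can reach each other by service name without any
# manual network wiring by the test executive.
```

The automatic default network is the main draw here: peer addresses can be expressed as service names, and the test executive never has to manage bridges or port mappings itself.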


- Is using a different service for each type of node the most effective approach? Would it be better to launch all nodes under the same service in the docker-compose file?

- Is polling each service and then aggregating those logs the best approach? Would it be better to do filtering before aggregating?
Member:

I think polling logs in the local integration test engine is not the correct approach.

We take a pull-based approach for the cloud engine, and that works out fine for us right now because we pre-filter the logs (so the volume we handle per poll is very low), but more importantly because we are only running networks with snarks enabled. When snarks are enabled, the networks move pretty slowly, so small delays on the order of 5-10 seconds are not a problem.

However, with the local engine, the goal is to run snarkless networks, which move extremely quickly by comparison. In that world, a 5-10 second delay in logs could break the test. Furthermore, the log volume per second of a snarkless network is massive by comparison due to the speed at which the network operates. Because of this, I think we need both to pre-filter the logs in the docker containers (to bring the log volume the single-threaded test_executive needs to handle down to a fraction of the size) and to read those logs off of a pipe in a more push-based fashion.

Another advantage of a push-based pipe approach is that you don't need to deal with re-reading logs you have already handled, which appears to be a problem you would need to solve with the docker swarm logging interface. You might be able to get by with asking docker swarm to only return logs recorded since the last time you asked, but in that approach you have some risk of missing logs, and you are relying on docker swarm's ability to quickly perform that query on its log cache.
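The filter-then-push design described above can be sketched in a few lines of Python. This is a toy simulation, not the test_executive's actual implementation: the log lines, field names, and filter predicate are all illustrative assumptions, with a thread standing in for the node container on the write end of a FIFO.

```python
import os
import tempfile
import threading

# Hypothetical raw log lines as a node might emit them; the field
# names are illustrative, not an actual daemon log schema.
RAW_LOGS = [
    '{"level":"Trace","msg":"gossip received"}',
    '{"level":"Info","msg":"block produced"}',
    '{"level":"Trace","msg":"heartbeat"}',
    '{"level":"Info","msg":"breadcrumb added"}',
]

fifo = os.path.join(tempfile.mkdtemp(), "logs.fifo")
os.mkfifo(fifo)

def node_side():
    # Pre-filtering happens on the container side, so only a fraction
    # of the snarkless network's log volume ever reaches the pipe.
    with open(fifo, "w") as pipe:
        for line in RAW_LOGS:
            if '"Info"' in line:  # illustrative pre-filter predicate
                pipe.write(line + "\n")
                pipe.flush()

writer = threading.Thread(target=node_side)
writer.start()

# Test-executive side: the read blocks until data arrives, so there is
# no polling interval, and closed-pipe EOF means no risk of re-reading
# lines that were already handled.
with open(fifo) as pipe:
    received = [line.rstrip("\n") for line in pipe]
writer.join()

print(received)
```

Because the reader simply blocks on the pipe, log latency is bounded by scheduling rather than a polling interval, and the "give me only logs since my last query" bookkeeping disappears entirely.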

Contributor Author:

Followed up with a restructuring of the proposed polling solution to use the push-based pipe approach. 👍



- Does this plan capture the overall direction we want the local testing framework to go?
Member:

👍


The new local testing engine must implement all existing features, which include:

- Starting/Stopping nodes dynamically
Member:

Just wanted to mention as a note here: If you continue using the puppeteer container (which I recommend), then this is very easy to do. You can peek at how this works in the cloud engine, but starting/stopping a puppeteered node is as simple as running a shell script in the container which communicates with the puppeteer process.

A good point was made about restructuring the proposed logging architecture. Instead of polling on an interval with the Docker CLI, which can introduce latency issues, we can redirect all stdout to a pipe shared with the test executive. This commit updates the RFC to reflect that feedback.
@MartinMinkov (Contributor Author):

Moved to o1-labs/rfcs#40

@MartinMinkov MartinMinkov deleted the rfc/local-test-framework branch January 11, 2024 17:50
Successfully merging this pull request may close these issues: Local Execution Engine RFC