
[Local Test Framework] - Docker Network Config #9000

Closed
wants to merge 18 commits into compatible from feature/local-engine-network-config

Conversation

MartinMinkov
Contributor

@MartinMinkov MartinMinkov commented Jun 8, 2021

Adds the initial implementation of the local test framework. This PR implements the Network_config interface, which defines a Docker Swarm for the specified network. It also adds logic for polling the status of the services and starting each puppeteered node once the services are live.

An example command to invoke the local test framework after building is:

test_executive local payments --debug --coda-image codaprotocol/coda-daemon-puppeteered:1.1.5-compatible | logproc -i inline -f '!(.level in ["Spam", "Debug"])'

This will create a directory in the ./automation folder that contains the compose file specifying the Docker Swarm, along with test configuration files such as the runtime config and block producer keys. The following is an example compose file:

{
  "version": "3.9",
  "services": {
    "seed": {
      "image": "codaprotocol/coda-daemon-puppeteered:1.1.5-compatible",
      "volumes": [
        {
          "type": "bind",
          "source": "./runtime_config",
          "target": "/root/runtime_config"
        }
      ],
      "deploy": { "replicas": 1 },
      "command": [
        "daemon", "-log-level", "Debug", "-log-json",
        "-log-snark-work-gossip", "true", "-client-port", "8301",
        "-generate-genesis-proof", "true", "-peer",
        "/dns4/mina_seed-node/tcp/10401/p2p/12D3KooWCoGWacXE4FRwAX8VqhnWVKhz5TTEecWEuGmiNrDt2XLf",
        "-seed", "-config-file", "/root/runtime_config"
      ],
      "environment": {
        "CODA_LIBP2P_PASS": "",
        "DAEMON_CLIENT_PORT": "8301",
        "DAEMON_METRICS_PORT": "10001",
        "DAEMON_REST_PORT": "3085"
      }
    },
    "test-block-producer-1": {
      "image": "codaprotocol/coda-daemon-puppeteered:1.1.5-compatible",
      "volumes": [
        ...
      ],
      "deploy": { "replicas": 1 },
      "command": [
        ...
      ],
      "environment": {
       ...
      }
    },
    "test-block-producer-2": {
      "image": "codaprotocol/coda-daemon-puppeteered:1.1.5-compatible",
      "volumes": [
        ...
      ],
      "deploy": { "replicas": 1 },
      "command": [
        ...
      ],
      "environment": {
        ...
      }
    },
    "test-block-producer-3": {
      "image": "codaprotocol/coda-daemon-puppeteered:1.1.5-compatible",
      "volumes": [
        ...
      ],
      "deploy": { "replicas": 1 },
      "command": [
        ...
      ],
      "environment": {
        ...
      }
    }
  }
}

The local engine runs this file with docker stack deploy -c ... <swarm_name>. This initializes the Docker Swarm and starts each puppeteered node once its service passes a health check.
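Based on the deploy call and stack-name construction later in this PR, the concrete invocation looks like:

docker stack deploy -c compose.json it-<user>-<git_commit>-<test_name>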

TODO:

  • Implement a more sophisticated Network_manager, since most of the current implementation is copied from the cloud version.
  • Implement the archive node.
  • QA the snark coordinator node.

Import the engine interface that will be implemented for the local
engine. Additionally add the local engine as a CLI flag to the
test_executive
This adds the functionality to create a base docker-compose file from
the specified test configs in the integration framework. Each node in
the network is assigned to a service that runs a container for it.

This implementation currently supports creating a docker-compose file,
creating a docker stack to run each container, and basic monitoring
before continuing the rest of the test_executive execution.

Lots of code in this PR has been stubbed and copied to make the compiler
happy. Much of this code will be refactored to support the changes
needed for the local engine later.
@MartinMinkov MartinMinkov requested a review from a team as a code owner June 8, 2021 19:27
@MartinMinkov MartinMinkov linked an issue Jun 8, 2021 that may be closed by this pull request
@MartinMinkov MartinMinkov changed the base branch from develop to compatible June 8, 2021 22:18

let deploy t =
if t.deployed then failwith "network already deployed" ;
[%log' info t.logger] "Deploying network" ;
Member

@psteckler psteckler Jun 9, 2021

you could add let logger = t.logger in ... and then just use %log instead of %log' throughout this function
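For reference, a minimal sketch of that refactor applied to the snippet above:

(* bind the logger once so the shorter %log form can be used throughout *)
let deploy t =
  let logger = t.logger in
  if t.deployed then failwith "network already deployed" ;
  [%log info] "Deploying network"
  (* ... rest of the function body unchanged ... *)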

Util.run_cmd_exn cwd "sh" ["-c"; kubectl_cmd]

(*
Maybe we get rid of this? We are redirecting all logs to stdout which the test executive will pick up
Member

the redirected logs are the ones from the container?

Contributor Author

Sorry, this comment is a little misleading. It was more of a reminder to myself that the logging infrastructure for the local engine will be different from the cloud engine's, and that this function is subject to change later down the road.

Adds support for the snark coordinator and workers based on the test
configurations passed into the test executive. Right now, we only
implement one snark coordinator shared by all snark workers.

Additionally added a new file for all the default Mina node commands and
environment variables.
run_cmd_exn t "docker"
["stack"; "deploy"; "-c"; "compose.json"; t.stack_name]
in
[%log' info t.logger] "DEPLOYED STRING: %s" s ;
Contributor Author

This snuck in while debugging, will remove.

@MartinMinkov MartinMinkov linked an issue Jun 11, 2021 that may be closed by this pull request
@@ -0,0 +1,19 @@
open Cmdliner
Member

Looks like you don't need any cli inputs for the local engine (at least not yet). You can just replace this file with:

open Cmdliner
type t = unit
let term = Term.const ()

in
let all_stacks = String.split ~on:'\n' all_stacks_str in
let testnet_dir =
network_config.mina_automation_location
Member

You can probably just make this a constant directory that is relative to where the executable is run. I don't think you will need the mina_automation_location for anything else.


module Compose = struct
module DockerMap = struct
type 'a t = (string, 'a, String.comparator_witness) Map.t
Member

This is better summarized as type 'a t = 'a String.Map.t.

Given that this module only has this type def and the empty value, you should probably just delete this module and use String.Map directly (it has both 'a t and empty on it already).
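For reference, the first suggestion applied to the snippet above:

(* Core's String.Map already provides both the map type and empty *)
type 'a t = 'a String.Map.t

let empty : 'a t = String.Map.empty

Though, as noted, deleting DockerMap and using String.Map directly at each call site is the simpler route.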

let create name =
{type_= "bind"; source= "." ^/ name; target= "/root" ^/ name}

let to_yojson {type_; source; target} =
Member

You shouldn't need a custom deriver for this. [@@deriving to_yojson] should work on this type def if you use the [@key "type"] decorator for the type_ field (see https://github.com/ocaml-ppx/ppx_deriving_yojson#key).
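A sketch of the derived version, using the field names from the snippet above:

(* [@key "type"] serializes the type_ field under the JSON key "type",
   so no hand-written to_yojson is needed *)
type t =
  { type_ : string [@key "type"]
  ; source : string
  ; target : string }
[@@deriving to_yojson]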

type t = string DockerMap.t

let create =
List.fold ~init:DockerMap.empty ~f:(fun accum env ->
Member

This can just be String.Map.of_alist
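For instance, assuming the input is already a list of (name, value) pairs:

(* note: Core's String.Map.of_alist returns a variant to flag duplicate
   keys; of_alist_exn raises instead, which is reasonable for static
   configuration like this *)
let create envs = String.Map.of_alist_exn envs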

open Mina_base
open Integration_test_lib

let version = "3.9"
Member

Can we call this docker_version so it's more obvious what this string is versioning?

@@ -0,0 +1,462 @@
open Core
Member

File name nit: It's a little confusing to call this module mina_automation given that it doesn't interact with our automation/ codebase. Recommend renaming to Docker_network or Mina_docker or something like that.

; docker: docker_config }
[@@deriving to_yojson]

let expand ~logger ~test_name ~(cli_inputs : Cli_inputs.t) ~(debug : bool)
Member

We can work later to improve the code-sharing between this and the existing cloud implementation (there are some shared steps).

let open Node_constants in
let runtime_config = Service.Volume.create "runtime_config" in
let blocks_seed_map =
List.fold network_config.docker.block_producer_configs
Member

Instead of folding over empty maps to fill them up, you can also build an associative list and construct the map from that. Eg, List.map ... ~f:(fun ... -> (k, v)) |> String.Map.of_alist.
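A hypothetical sketch of that pattern applied to the fold above (the config field names here are assumptions, not necessarily the PR's):

(* build (service_name, keypair) pairs, then construct the map in one step *)
let blocks_seed_map =
  List.map network_config.docker.block_producer_configs
    ~f:(fun config -> (config.service_name, config.keypair))
  |> String.Map.of_alist_exn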

open Async
open Integration_test_lib

(* Parts of this implementation are still untouched from the Kubernetes implementation. This is done
Member

We should work on extracting the graphql definitions from the engine layer. Those should be in the shared Integration_test_lib library, and the engine should just be in charge of the transport logic for sending the queries. We can revisit this later though, not in this PR.

; archive_node_count: int
; mina_archive_schema: string
; runtime_config: Yojson.Safe.t
[@to_yojson fun j -> `String (Yojson.Safe.to_string j)] }
Member

Do you know how the resource allocation works? Maybe that needs to be configurable as well. I know that the default memory allocated by Docker on Mac is not sufficient to run a node.

Contributor Author

Good point! Will address in a follow-up PR.
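For reference, compose v3 supports per-service limits via deploy.resources; a possible shape in the generated JSON (the 8G figure is illustrative, not a tested value):

"deploy": {
  "replicas": 1,
  "resources": { "limits": { "memory": "8G" } }
}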

Addressed the feedback given in the Network_Config PR. Made the
docker_compose.ml file simpler to read, with additional refactors to
mina_docker.ml. Additionally removed a lot of the copied-over code in
swarm_network to clean up the file.
…Protocol/mina into feature/local-engine-network-config
let stack_name = "it-" ^ user ^ "-" ^ git_commit ^ "-" ^ test_name in
(* GENERATE ACCOUNTS AND KEYPAIRS *)
let num_block_producers = List.length block_producers in
let block_producer_keypairs, runtime_accounts =
Contributor

we should probably stick this logic in some shared function that can be called

}
[@@deriving to_yojson]

let expand ~logger ~test_name ~cli_inputs:_ ~(debug : bool)
Contributor

A large amount of this function doesn't look implementation-specific; I feel like we should find a place in the /integration_test_lib dir for many of the pieces: keys, block producer configs, constants, names of things, etc. We probably should have done this to begin with, haha. If we want to merge this quicker then no worries, but it would be nice to rationalize it a bit more, if not now then later for sure.

Contributor Author

Totally, Nathan mentioned refactoring a lot of the test setup steps into something more reusable. I didn't go down that route just because I didn't want to be refactoring the cloud engine at the same time in this PR. This will be an easy enough change in a separate PR as long as we keep the new interface consistent.

let%bind.Deferred cwd = Unix.getcwd () in
Malleable_error.return (Util.run_cmd_exn cwd "sh" [ "-c"; docker_cmd ])

let start ~fresh_state node : unit Malleable_error.t =
Contributor

All the stuff with the /stop /start puppeteer scripts was kind of a hacky solution that mostly had to do with problems with the GCP deployment; maybe @nholland94 can give more background on the gory details. I think we should avoid this if we have time. Ideally, the local engine would just figure out some docker command to make docker shut the service down and start it back up again, or something like that.

Contributor Author

@MartinMinkov MartinMinkov Jun 29, 2021

I do agree that a better solution would be preferable, but I think the puppeteer scripts are actually extremely useful for getting this started. I'm not sure if we have plans to refactor the puppeteer scripts out of the cloud engine, but being able to start and stop all node containers via start and stop commands is really useful. I don't think there is a way to do this natively in docker-compose unless I missed something :/
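For what it's worth, a hypothetical sketch of the docker-native approach using docker service scale (the run_cmd_exn helper and the surrounding monad are assumed from the snippets in this PR, so this is a sketch rather than drop-in code):

(* stop a swarm service by scaling it to zero replicas, then bring it
   back up, instead of calling the puppeteer start/stop scripts *)
let restart_service t service_id =
  let%bind _ =
    run_cmd_exn t "docker" [ "service"; "scale"; service_id ^ "=0" ]
  in
  run_cmd_exn t "docker" [ "service"; "scale"; service_id ^ "=1" ]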

Added support for archive nodes in the local engine. If archive nodes
are specified in the test plan, the test executive will download the
archive schema to the working directory to use the schema as a bind
mount for the postgres container. The archive node will then connect to
the postgres container as usual.
@MartinMinkov MartinMinkov linked an issue Jun 29, 2021 that may be closed by this pull request
Renamed node_config to docker_node_config and refactored some of the
modules to have more static references baked in. Additionally refactored
mina_docker to use these static references instead of string values in
many places.
@MartinMinkov MartinMinkov changed the title [Local Test Framework] [WIP] - Docker Network Config [Local Test Framework] - Docker Network Config Jul 7, 2021
MartinMinkov and others added 3 commits July 15, 2021 09:55
Added functionality for GraphQL networking to the docker containers.
This work was mostly just adding the '--insecure-rest-server' flag to
the nodes and opening up the corresponding rest port within the
docker-compose file.

To allow communication with the docker network from localhost, we map
host ports starting at 7000 to port 3085 on each node, which is used
for GraphQL communication.
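A small sketch of how those mappings could be generated (the names here are illustrative):

(* assign host ports 7000, 7001, ... each forwarding to the container's
   GraphQL rest port 3085 *)
let graphql_port_mappings node_names =
  List.mapi node_names ~f:(fun i name ->
      (name, Printf.sprintf "%d:3085" (7000 + i)))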
@MartinMinkov MartinMinkov requested a review from a team as a code owner December 21, 2021 13:14
@MartinMinkov
Contributor Author

Moved to #14648

@MartinMinkov MartinMinkov deleted the feature/local-engine-network-config branch January 11, 2024 02:59