terraform is used to create the infrastructure needed to run ingest. Currently, scripts are provided to run the ingest on Google Cloud, but it can be run locally or on another cloud provider with appropriate setup.
- GCloud SDK (tested with 455.0.0)
- Terraform (v1.6.3)
- Helm (v3.13.2)
- kubectl (v1.28.2)
Terraform (docs)
IMPORTANT: This setup assumes that a bigtable instance has already been created. To reduce latency, it is recommended that all resources be co-located in the same region as the bigtable instance.
The provided scripts create a VPC network, a subnet, a redis instance, and a cluster with separately managed node pools to run the master and workers. Customize the variables in terraform/terraform.tfvars to create the infrastructure in your Google Cloud project.
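As a rough illustration only (the actual variable names are declared in terraform/variables.tf and may differ), a minimal terraform.tfvars could look like:
project_id = "<your_project_id>"
region     = "us-east1"   # use the same region as your bigtable instance
zone       = "us-east1-b"
The worker pool sizing variables (count and machine, see the NOTE in the Helm section) are set in the same file.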
You will need at least the following roles:
- Service Account Admin
- Cloud Memorystore Redis Admin
- Kubernetes Engine Cluster Admin
- Compute Network Admin
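If your account is missing any of these, they can be granted with gcloud; for example (hypothetical member shown, role IDs corresponding to the names above):
for role in roles/iam.serviceAccountAdmin roles/redis.admin roles/container.clusterAdmin roles/compute.networkAdmin; do
  gcloud projects add-iam-policy-binding <your_project_id> --member="user:<you@example.com>" --role="$role"
done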
Run the following to create the required resources.
cd terraform/
export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token) # temporary token
terraform init # only needed the first time
terraform plan
terraform apply
This will output some variables useful for next steps:
kubernetes_cluster_context = "gcloud container clusters get-credentials chunkedgraph-ingest --zone us-east1-b --project neuromancer-seung-import"
kubernetes_cluster_name = "chunkedgraph-ingest"
project_id = "neuromancer-seung-import"
redis_host = "10.128.211.211"
region = "us-east1"
zone = "us-east1-b"
Use the value of kubernetes_cluster_context to connect to your cluster. Use the value of redis_host in helm/pychunkedgraph/values.yaml (more info in the Helm section). You can also look these up again with terraform show from within the terraform/ directory.
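If you only need a single value, terraform output also works:
cd terraform/
terraform output redis_host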
If your project does not have enough quota for In-Use IP addresses in the given region, you may need to use a private cluster. GCP assigns an external IP to each VM that is part of a public cluster, which can easily hit the quota when you use a large number of workers. To use a private cluster, add the following changes to terraform/cluster.tf within the google_container_cluster resource block.
addons_config {
  http_load_balancing {
    disabled = true
  }
  network_policy_config { // Calico
    disabled = false
  }
}
network_policy {
  enabled = true
}
master_authorized_networks_config {}
private_cluster_config {
  enable_private_nodes    = true
  enable_private_endpoint = false
  master_ipv4_cidr_block  = "172.16.0.16/28"
}
You will also need to enable the Service Networking API.
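For example:
gcloud services enable servicenetworking.googleapis.com --project=<your_project_id>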
After this, run the terraform commands to create the required infrastructure. Once it completes, you will need to disable authorized networks to allow kubectl access to the GKE control plane.
gcloud container clusters update <CLUSTER_NAME> --no-enable-master-authorized-networks
This method allows us to use a private cluster easily. If you need to give cluster access only to specific IPs, you may follow the instructions in this guide; it involves a few additional components and steps to access the cluster with kubectl.
Build the PyChunkedGraph Docker image:
git clone https://github.com/seung-lab/PyChunkedGraph.git
cd PyChunkedGraph
gcloud builds submit . --tag=gcr.io/<your_project_id>/pychunkedgraph:<custom_tag> --project=<your_project_id>
Then set the image and tag in values.yaml (refer to example_values.yaml):
image:
  repository: gcr.io/<your_project_id>/pychunkedgraph
  tag: <custom_tag>
Helm (docs)
helm is used to run the ingest. The provided chart installs kubernetes resources such as configmaps, secrets, and deployments needed to run the ingest. Refer to the example file helm/pychunkedgraph/example_values.yaml for more information.
IMPORTANT: If you have a large dataset to ingest, it is recommended to do this layer by layer. See scaling.
NOTE: Depending on your dataset, you will need to figure out the optimal limits for cpu and memory in your worker deployments. To do that, adjust the count and machine variables in terraform.tfvars. The optimal values can vary with chunk size, size of supervoxels (atomic segments in layer 1), number of edges per chunk, and so on.
When all variables are ready, rename your values file to values.yaml (ignored by git because it can contain sensitive information). If a different name is preferred (for different datasets/projects), use the format values*.[yml|yaml], which will also be ignored by git. The file name will then need to be explicitly passed to helm install with -f <values_file.yml>.
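For example, with a dataset-specific values file (hypothetical name):
helm install <release_name> . -f values-mydataset.yaml --debug --dry-run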
Then run:
cd helm/pychunkedgraph
helm dependencies update
helm install <release_name> . --debug --dry-run --set workerDeployments[0].enabled=true
helm upgrade <release_name> . # upgrade an existing release after changing values.yaml
You can enable one deployment and disable another without changing the values file; for example, when L2 chunks are done, L2 workers are no longer needed and their resources can be freed up by disabling them (note that the default in the values file is enabled: false, so only enabling needs to be set explicitly).
helm upgrade <release_name> . --set workerDeployments[1].enabled=true
If successful, run the same command without --dry-run. This will create the master and worker kubernetes deployments.
Pods will have the dataset configuration mounted in /app/datasets, and /app is the WORKDIR.
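You can verify this from a running pod, for example:
kubectl exec -ti deploy/master -- ls /app/datasets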
NOTE: Refer to dataset_config.md for more information about the config structure.
The number of workerDeployments in values.yaml relates to the number of layers in the graph. For example, a graph with 6 layers must have 5 deployments in workerDeployments (l2 - l6). To disable a deployment, set the enabled key to false; you may want this for higher-layer deployments because there will be no jobs for them until the lower layers finish.
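A rough sketch of this part of values.yaml for a 6-layer graph (hypothetical keys; see example_values.yaml for the actual schema):
workerDeployments:
  - name: l2
    enabled: true    # layer 2 jobs run first
  - name: l3
    enabled: false   # enable once layer 2 finishes
  # ... one entry per layer up to l6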
Tracker workers listen for completion of jobs at each layer and queue the parent chunks. Set trackerDeployments.count to the root_layer of the chunkedgraph. The root_layer can be determined as follows:
import numpy as np

# voxel_counts - size of the dataset along each axis
# ex) voxel_counts = np.array([4096, 4096, 1024])
# chunk_size - chunkedgraph chunk size
# ex) chunk_size = np.array([512, 512, 128])
# fanout - chunkedgraph fan-out factor (e.g. 2)

chunks = np.ceil(voxel_counts / chunk_size).astype(int)
max_chunks = max(chunks)  # most chunks along any dimension
log_chunks = np.log(max_chunks) / np.log(fanout)
root_layer = int(np.ceil(log_chunks)) + 2
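For the example values above with a fanout of 2: chunks = [8, 8, 8], max_chunks = 8, log_chunks = 3, and root_layer = 5, so trackerDeployments.count would be set to 5.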
The process is named ingest because you are ingesting data into a bigtable.
Pods should now be in Running status, provided there were no issues. Run the following to create a bigtable and enqueue jobs.
kubectl exec -ti deploy/master -- bash
# now you're in the container
> ingest graph <unique_test_bigtable_name> datasets/test.yml --test
RQ is used to create jobs. This library uses redis as a task queue.
The --test flag will queue 8 child chunks that share the same parent. When the child chunk jobs finish, the worker listening on the t2 (tracker for layer 2) queue should enqueue the parent chunk. (To avoid race conditions there should be only one worker listening on the tracker queue for each layer. The provided helm chart makes sure of this, but it is important not to forget.)
NOTE: For large datasets, the ingest graph command can take a long time because the queue is buffered so as not to exceed redis memory. You will need to figure out how big the redis instance must be accordingly. See the scaling section for the recommended way to run ingest for large datasets.
You can check the progress with ingest status. In another shell, run:
kubectl exec -ti deploy/master -- bash
> ingest status
2 : 8 / 64 # progress at each layer
3 : 1 / 8
4 : 0 / 1 # this graph has 4 layers
The output should look like this if successful. Now you can rerun the ingest without the --test flag (make sure to use a different bigtable name).
NOTE: Make sure to flush redis (ingest flush_redis) after running an ingest and before another helm install. Residuals from previous ingest runs can lead to inaccurate information.
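The flush can be run from the master pod:
kubectl exec -ti deploy/master -- bash
> ingest flush_redis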
When creating a large chunkedgraph with tens of thousands of level 2 chunks, it is recommended to do it layer by layer. This is because there is simply too much information to keep track of when automatically enqueueing parent chunks, which easily leads to exhausting redis memory and disrupts the process. In addition, the enqueueing of parent chunks is slow because there can only be one tracker worker per layer.
To do this, add this env var to the helm/pychunkedgraph/values.yaml file:
env:
  - name: &commonEnvVars "pychunkedgraph"
    vars:
      REDIS_HOST: "<redis_host>" # refer to output of terraform apply
      REDIS_PORT: 6379
      REDIS_PASSWORD: ""
      ...
      DO_NOT_AUTOQUEUE_PARENT_CHUNKS: <any_non_empty_string>
This essentially disables all the tracking needed to automatically enqueue parent chunks. The trackerDeployments can be disabled as well because they will not be necessary.
Follow the same process as above to enqueue level 2 chunks with the ingest graph command. When all of them complete, you can proceed to layer 3.
> ingest layer 3
Then layer 4 and so on, up to the max layer. Proceed to the next layer only after all the jobs in the current layer finish.
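For example, for a graph whose root layer is 5:
> ingest layer 4
# wait until ingest status shows all layer 4 chunks are done, then
> ingest layer 5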
If the ingest graph command is killed by mistake, or if kubernetes restarts the master pod for any reason while an ingest is in progress, you can restart the process with --retry. This option must only be used when the table has already been created and the ingest process was interrupted.
> ingest graph <same_name_as_above> datasets/test.yml --retry
Sometimes jobs fail for any number of reasons. Assuming the causes were external, you can requeue them using rqx requeue <queue_name> --all. Refer to pychunkedgraph/ingest/cli.py and pychunkedgraph/ingest/rq_cli.py for more commands.
RQ Cheatsheet:
rq info # show info for all existing queues
rqx status <queue_name>
rqx failed <queue_name> # will print a list of failed jobs
rqx failed <queue_name> <job_id> # will print the traceback of error why the job failed
rqx requeue <queue_name> --all
rqx requeue <queue_name> <job_ids_space_separated>
Sometimes, even after all jobs have been queued and none of the workers are busy anymore, you might see a mismatch between completed jobs and total jobs at a given layer. This does not happen often and is due to preemptible VMs failing to communicate when they get killed. Ideally rq should notice and mark the job as failed, but sometimes that doesn't happen.
To re-queue such jobs, you will need to log in to redis and check which of them might be stuck.
If you are using a private cluster: apt will not work because the nodes have no public IP address. You might want to create a new Docker image (with redis-server preinstalled, as below) and create the cluster again.
FROM seunglab/pychunkedgraph:graph-tool_dracopy
COPY override/timeout.conf /etc/nginx/conf.d/timeout.conf
COPY override/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY requirements.txt /app
RUN pip install pip==20.2 && pip install --no-cache-dir --upgrade -r requirements.txt \
&& apt update; exit 0
RUN apt install redis-server -y
COPY . /app
kubectl exec -ti deploy/master -- bash
> apt update
> apt install redis -y
> redis-cli -h $REDIS_HOST
redis> keys rq:job:2_*
The command keys rq:job:2_* will give you a list of jobs stuck in the l2 queue. The output looks something like this:
1) "rq:job:2_319_290_14"
2) "rq:job:2_932_683_2"
Enqueue these chunks individually with ingest chunk <queue> <layer> <X> <Y> <Z>, for instance:
> ingest chunk l2 2 319 290 14
Once all these stuck jobs finish, completed count and total count should match.
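If many jobs are stuck, a rough helper like the following (hypothetical, run inside the master pod where redis-cli and the ingest CLI are available) can turn the key listing into ingest chunk commands; keys have the form rq:job:<layer>_<x>_<y>_<z>:
redis-cli -h $REDIS_HOST --scan --pattern 'rq:job:2_*' \
  | sed 's/^rq:job://' | tr '_' ' ' \
  | while read layer x y z; do ingest chunk l2 "$layer" "$x" "$y" "$z"; done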