Skip to content

Latest commit

 

History

History
272 lines (202 loc) · 11.7 KB

2.meshing.md

File metadata and controls

272 lines (202 loc) · 11.7 KB

Introduction

This is documentation was originally created by @manuel-castro.

The document describes the procedure to mesh a ChunkedGraph dataset. This assumes that the associated ChunkedGraph is in the new format (based on main branch). Meshes created are in the sharded format.

Prerequisites

The segmentation data has been completely ingested into the ChunkedGraph. The segmentation data has been downsampled to the desired mip to mesh at. You have the appropriate credentials to create an instance of the ChunkedGraph.

Setup

Before starting, set the mesh key in the info file of the chunkedgraph's watershed/segmentation CloudVolume layer to be the directory where meshes will be stored, if it has not been already.

It is recommended to run meshing before any edits on the chunkedgraph. But can also be run it after, make sure to use a timestamp before the first edit.

from pychunkedgraph.graph import ChunkedGraph
cg = ChunkedGraph(graph_id="<>")
cg.meta.ws_cv.info["mesh"] =  "graphene_meshes"
cg.meta.ws_cv.commit_info()

This will create a new directory named graphene_meshes in the segmentation path.

The sharding specification also needs to be set. Here’s a typical-looking sharding specification:

sharding_spec = {
 '@type': 'neuroglancer_legacy_mesh',
 'spatial_index': None,
 'mip': 2,
 'chunk_size': [X, Y, Z],
 'sharding': {
  '2': {'@type': 'neuroglancer_uint64_sharded_v1',
   'preshift_bits': 0,
   'hash': 'murmurhash3_x86_128',
   'minishard_bits': 1,
   'shard_bits': 0,
   'minishard_index_encoding': 'gzip',
   'data_encoding': 'raw'},
  '3': {'@type': 'neuroglancer_uint64_sharded_v1',
   'preshift_bits': 0,
   'hash': 'murmurhash3_x86_128',
   'minishard_bits': 3,
   'shard_bits': 0,
   'minishard_index_encoding': 'gzip',
   'data_encoding': 'raw'},
  '4': {'@type': 'neuroglancer_uint64_sharded_v1',
   'preshift_bits': 0,
   'hash': 'murmurhash3_x86_128',
   'minishard_bits': 6,
   'shard_bits': 0,
   'minishard_index_encoding': 'gzip',
   'data_encoding': 'raw'}
   }
}

Make sure that the chunk_size is the same as the chunk_size for the ChunkedGraph taking into account the mip level, and that the mip is the mip you want. For more information about the other settings, please see the link at the top. To set this specification for the watershed layer:

cg.meta.ws_cv.mesh.meta.info = sharding_spec
cg.meta.ws_cv.mesh.meta.commit_info()

Main Procedure

Meshing then begins by meshing every chunk at layer 2. This means running marching cubes on each layer 2 chunk with 1 voxel of overlap in the positive direction. The script to use to run meshing is meshing_batch.py.

Local Meshing

python meshing_batch.py --layer 2 --chunk_start 0 0 0 --chunk_end 5 5 8 --mip 2 --cg_name my_awesome_cg

This will run local execution of 200 layer 2 chunks at mip 2 of the CG with graph_id my_awesome_cg (the chunks within the bounding box 0,0,0 to 5,5,8). To obtain the chunk range for the entire dataset, divide the size of the segmentation layer by the CG’s chunk size, and take the ceiling. You could get chunk_end from cg.meta.layer_chunk_bounds.

One typically also runs higher layers after creating meshes at layer 2. What this does is simply stitch together the appropriate initial meshes to create meshes for nodes at a higher layer. Example:

python meshing_batch.py --layer 3 --chunk_start 0 0 0 --chunk_end 3 3 4 --mip 2 --cg_name my_awesome_cg

The benefit of stitching meshes is to consolidate many small meshes stored that are stored in different locations into much fewer bigger meshes. This makes it much easier for the CG server to retrieve the appropriate meshes for a given neuron, which in turn means a user gets the meshes much more quickly.

NOTE: Meshes should be created layer by layer sequentially up to the highest layer that one wishes to have meshes at (i.e. first run the script for layer 2, then 3, soo on). Currently, we stop at layer 6 or 7, because of the memory requirements needed to create meshes of higher layers than that.

Two more options to the meshing_batch.py script: --queue_name and --skip_cache. queue_name specifies the AWS queue you wish to push meshing tasks to, if you wish to run meshing in a distributed manner (necessary for any reasonably sized dataset). skip_cache disables gcloud caching and should only be used for testing/development.

Parameter Setting

To properly view the created meshes and make edits, additional parameters have to be set in two places: cg.meta.custom_data and cg.meta.ws_cv.info.

cg.meta.custom_data needs to look somewhat like this:

{
  'mesh': {
    'max_layer': 6,
    'verify': False,
    'mip': 2,
    'max_error': 40,
    'dir': 'graphene_meshes',
    'initial_ts': <unix_timestamp>
  }
}

These parameters are necessary for proper remeshing. Here, max_layer is the highest layer we meshed. max_error is the simplification error for marching cubes, typically chosen to be the largest dimension of the resolution/mip level that was used to create the meshes.

initial_ts should be a timestamp that is after the meshes were created but before the first edit.

Here’s how to set this:

cg.meta.custom_data["mesh"] = {
    'max_layer': <layer>,
    'mip': 2,
    'max_error': 40,
    'dir': 'graphene_meshes',
    'initial_ts': int(datetime.datetime(<year>, <month>, <day>).timestamp())
}
cg.update_meta(cg.meta, overwrite=True)

The watershed layer’s info file also needs a mesh_metadata object. Here’s what that should look like:

cg.meta.ws_cv.info["mesh_metadata"]
> {'uniform_draco_grid_size': 21, 'unsharded_mesh_dir': 'dynamic'}

Here the unsharded_mesh_dir is the gcloud directory name to store dynamically created meshes and is typically just dynamic. uniform_draco_grid_size is used in CloudVolume to properly deduplicate vertices at chunk boundaries, and calculated like so:

from pychunkedgraph.meshing import meshgen

draco_encoding_settings = meshgen.get_draco_encoding_settings_for_chunk(
  cg, cg.get_chunk_id(layer=2,x=0,y=0,z=0), mip=<mip>)
draco_encoding_settings['quantization_range'] / (2 ** draco_encoding_settings['quantization_bits'] - 1)
> 21.0 # uniform_draco_grid_size

Downloading meshes through cloud-volume

If you need to download the meshes programmatically through cloud-volume, the mesh info file needs to be updated:

cg.meta.ws_cv.info["mesh_metadata"]["max_meshed_layer"] = cg.meta.custom_data["mesh"]["max_layer"]
cg.meta.ws_cv.commit_info()

Distributed Meshing

Distributed meshing is done with a Kubernetes cluster and an AWS SQS queue.

Once you have setup access to SQS service, run the same meshing command above but with --queue_name:

python meshing_batch.py --layer 2 \
  --chunk_start 0 0 0 \
  --chunk_end 5 5 8 \
  --mip 2 \
  --cg_name my_awesome_cg \
  --queue_name <sqs_queue_url>

<sqs_queue_url> typically looks something like https://sqs.us-east-2.amazonaws.com/622009480892/my_awesome_queue.

You can do this on your workstation (recommended) or from one of the meshing workers. If you're working on a really large dataset with millions of chunks, the above command will likely get killed if you run for the entire range. To avoid that run it in batches. Here is a bash script that can do it for you:

#!/bin/sh

max=<X>
step=1
j=0
for idx in `seq 0 $(expr $max - 1)`
do
    i=`expr $idx \* $step`
    j=`expr $j + $step`
    python meshing_batch.py --layer 2 --chunk_start $i 0 0 --chunk_end $j <Y> <Z> --mip 2 --cg_name <cg_name> --queue_name <sqs_url>
done

The above script was contributed by @jakobtroidl.

To build your docker image and upload to gcloud, run:

gcloud builds submit --tag gcr.io/the-bestest-gcloud-project/pychunkedgraph:MyCoolDockerImage .

Creating Secrets

Connect to the cluster and create kubernets secrets the following files:

# secret to access google cloud resources
google-secret.json: |-
  {
    ...
  }
# secret to access the chunkedgraph server
cave-secret.json: |-
  {
    "token": "<>"
  }
# secret to access aws sqs service
aws-secret.json: |-
  {
    "AWS_ACCESS_KEY_ID": "<>",
    "AWS_SECRET_ACCESS_KEY": "<>"
  }

For each of these files, run:

kubectl create secret generic google-secret --from-file=path/to/google-secret.json
kubectl create secret generic cave-secret --from-file=path/to/cave-secret.json
kubectl create secret generic aws-secret --from-file=path/to/aws-secret.json

Example Deployment

Then apply a Kubernetes deployment to your cluster. Example deployment here: example-meshing.yaml. Rename file to meshing.yaml and change relevant fields, then apply:

kubectl apply -f meshing.yaml

When the process is compelete, delete the deployment:

kubectl delete -f meshing.yaml

Make sure the GOOGLE_APPLICATION_CREDENTIALS, BIGTABLE_PROJECT and BIGTABLE_INSTANCE environment variables are correctly set.

NOTE: lease_seconds and requested pod memory in the deployment. These will change from layer to layer. Lower layer chunks require exponentially less time to process and less memory. Experiment with local execution to find the best settings before running the whole dataset.

Be careful when scaling the number of workers to not overload the BigTable cluster, which will slow down meshing and the overall use of the ChunkedGraph. Check the CPU utilization graph in the monitoring section of the BigTable cluster, and increase the number of nodes if the current number of workers is causing CPU utilization to be at or above the recommended max percentage.

Memory and Time Limitations

Because of the way the sharded format works, it is much slower to download, say, 500 meshes at once than using the unsharded format. Therefore, the mesh stitching code is written differently to be more efficient: instead of finding which child meshes to stitch together and requesting those, we simply download all the shards we need at the beginning. Then for each parent node we find the relevant shards of those downloaded that have the child meshes and do local unpacking of the shard. The upside of this is way faster download time and minimal latency, since we make way fewer requests. However, we are now holding all of these shards in memory, and the amount of memory we need at each layer increases by ~8x. Typically layer 7 is the highest layer we can mesh at, since memory requirements at that layer could surpass 30 or 50GB. Time also starts to become a factor since we are deduplicating millions of vertices for each mesh, and a higher layer chunk contains many mesh fragments. SQS has a maximum lease time of 12 hours and layer 7 has also come somewhat close to hitting that limit.

If in the future we switch to some kind of multi-resolution format, where higher layer meshes are more simplified than their lower layer counterparts, that could help with this issue.

For Private Cluster

Setup Cloud NAT To enqueue jobs to the AWS SQS, you need to setup outbound connections to your private cluster. Please find step 6 in this setup tutorial, create Cloud NAT gateway.