The distributed stitching workflow ingests expansion microscopy data as a series of TIFF slices and runs the following processing steps:
- Conversion to N5
- Flatfield correction
- Deconvolution
- Stitching
- Export to N5 and TIFF slice series
All steps besides deconvolution use the stitching-spark code from the Saalfeld Lab at Janelia.
Deconvolution uses a MATLAB script (details TBD).
Usage (example):

```bash
./stitching.nf [arguments]
```
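For example, a minimal invocation using the required arguments plus a Spark working directory (all paths below are hypothetical placeholders) might look like:

```bash
./stitching.nf \
    --images_dir /path/to/tiffs \
    --output_dir /path/to/output \
    --psf_dir /path/to/psfs \
    --spark_work_dir /path/to/spark-work
```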
| Argument | Description |
|---|---|
| `--images_dir` | Path to the directory containing the TIFF slices and the `ImageList.csv` file |
| `--output_dir` | Path to the output directory |
| `--psf_dir` | Path to the directory containing a point-spread function (PSF) TIFF stack for each channel, acquired from the microscope |
| Argument | Default | Description |
|---|---|---|
| `--spark_container_repo` | public.ecr.aws/janeliascicomp/exm-analysis | Docker registry and repository for the Spark container |
| `--spark_container_name` | stitching | Name of the container in `spark_container_repo` |
| `--spark_container_version` | 1.8.1 | Version of the container in `spark_container_repo` |
| `--spark_work_dir` | | Path to the directory containing Spark working files and logs during stitching |
| `--stitching_app` | /app/app.jar | Path to the JAR file containing the stitching application |
| `--skip` | | Specifies the steps to be skipped. The valid values are: prestitching, deconvolution, stitch, fuse, tiff-export. See *Tweaking the stitching process* below for how to use this |
| `--stitching_json_inputs` | 488nm-decon,560nm-decon,642nm-decon | Default JSON inputs for the stitch step. See *Tweaking the stitching process* below for how to use this |
| `--fuse_to_n5_json_inputs` | 488nm-decon-final,560nm-decon-final,642nm-decon-final | Default JSON inputs for the fuse and export step. See *Tweaking the stitching process* below for how to use this |
| `--workers` | 4 | Number of Spark workers to use for stitching |
| `--worker_cores` | 4 | Number of cores allocated to each Spark worker |
| `--gb_per_core` | 15 | Amount of memory (in GB) allocated for each core of a Spark worker. The total memory used for stitching is `workers` × `worker_cores` × `gb_per_core` |
| `--driver_memory` | 15g | Amount of memory to allocate for the Spark driver |
| `--driver_stack_size` | 128m | Amount of stack space to allocate for the Spark driver |
| `--stitching_output` | | Output directory for stitching (relative to `--output_dir`) |
| `--resolution` | 0.104,0.104,0.18 | Voxel resolution of the input imagery |
| `--axis` | -y,-x,z | Axis mapping for the objective-to-pixel coordinate conversion when parsing metadata. A minus sign flips the axis |
| `--channels` | 488nm,560nm,642nm | List of channels to stitch |
| `--block_size` | 128,128,64 | Block size to use when converting to N5 before stitching |
| `--stitching_mode` | incremental | |
| `--stitching_padding` | 0,0,0 | |
| `--stitching_blur_sigma` | 2 | |
| `--deconv_cpus` | 4 | Number of CPUs to use for deconvolution |
| `--background` | 100 | Background value subtracted during deconvolution |
| `--psf_z_step_um` | 0.1 | Z step size (in μm) of the PSF used during deconvolution |
| `--iterations_per_channel` | 10,10,10 | Number of deconvolution iterations per tile, one value per channel |
| `--export_level` | 0 | Scale level to export after stitching |
| `--allow_fusestage` | false | Allow fusing tiles using their stage coordinates. Set to true to quickly stitch and export the image volume based on stage coordinates |
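As a worked example of the memory formula above, the defaults request 4 workers × 4 cores × 15 GB = 240 GB in total for the Spark workers. A hypothetical invocation that halves this footprint by using two workers might look like:

```bash
# 2 workers × 4 cores × 15 GB/core = 120 GB total Spark worker memory
./stitching.nf \
    --images_dir /path/to/tiffs \
    --output_dir /path/to/output \
    --psf_dir /path/to/psfs \
    --spark_work_dir /path/to/spark-work \
    --workers 2 \
    --worker_cores 4 \
    --gb_per_core 15
```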
## Tweaking the stitching process

In some cases, especially when the signal-to-noise ratio is low, the deconvolution process may fail to properly deconvolve the tiles, which in turn causes the stitch step to fail; this is a typical cause of failure seen during stitching. One way to deal with this is to perform the stitching on the raw tiles and then substitute the deconvolved tiles back into the final file generated by the stitch step.
Typically the stitch step takes the `<ch>-decon.json` files generated by the deconvolution step as input and produces `<ch>-decon-final.json` files. If you want to use the raw tiles instead, pass `--stitching_json_inputs "488nm,560nm,642nm"`. With this parameter set, the input for the stitch step will be `-i <data_dir>/488nm.json -i <data_dir>/560nm.json -i <data_dir>/642nm.json`, and it will generate the files `488nm-final.json`, `560nm-final.json`, and `642nm-final.json`.
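For example, to stitch from the raw tiles for all three default channels (paths hypothetical):

```bash
# Stitch using the raw (non-deconvolved) tile JSON inputs.
./stitching.nf \
    --images_dir /path/to/tiffs \
    --output_dir /path/to/output \
    --psf_dir /path/to/psfs \
    --spark_work_dir /path/to/spark-work \
    --stitching_json_inputs "488nm,560nm,642nm"
```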
One more thing worth mentioning: if you don't use all channels for stitching, for example only `488nm`, then the stitch step will only generate `488nm-final.json`, but the pipeline will "artificially" create the `-final.json` files for all the other channels even though they were not actually used for stitching. To "artificially" create a `-final` file, the pipeline takes one of the `-final` files actually generated by the stitch step and replaces its tiles with the corresponding tiles from the raw channel files. For example, if `488nm-final.json` was generated by the stitch application and the 642nm channel was not used at all, the pipeline copies `488nm-final.json` to `642nm-final.json` and, in the new `642nm-final.json`, replaces all file fields with the corresponding fields from `642nm.json`.
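Conceptually, the substitution the pipeline performs could be sketched with `jq` as below. This is only an illustrative sketch, assuming each channel JSON is an array of tile objects, aligned by array position across channels, each carrying a `file` field; the pipeline does this for you automatically.

```bash
# Hypothetical sketch: copy the stitched 488nm result and swap in the 642nm tile files.
jq --slurpfile raw 642nm.json '
    to_entries
  | map(.value.file = $raw[0][.key].file | .value)
' 488nm-final.json > 642nm-final.json
```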
If this works and you are happy with the result of the stitching step, you can proceed to the next step, fuse and N5 export, which takes the output of the stitch step, fuses the tiles, and exports them to N5. If you used raw tiles for stitching but still want to export the deconvolved tiles, all you need is the parameter `--fuse_to_n5_json_inputs "488nm-decon-final"`. In this case the pipeline will copy `488nm-final.json` to `488nm-decon-final.json` and replace the tile files with the corresponding deconvolved tiles. If you want to export the raw tiles instead, use `--fuse_to_n5_json_inputs "488nm-final"`. This example only exports the 488nm channel, so if you want to export all channels you must specify them all: `--fuse_to_n5_json_inputs "488nm-final,560nm-final,642nm-final"`. If you want to export `-decon-final.json` for all channels, simply omit `--fuse_to_n5_json_inputs`, because that is the default behavior anyway. Note that if a `-decon-final.json` file already exists for a channel it will not be overwritten, so if you want it regenerated make sure you remove or rename the old file.
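For example, to fuse and export the raw tiles for all three channels after stitching on raw inputs (paths hypothetical):

```bash
# Fuse and export to N5 using the raw-tile stitching results.
./stitching.nf \
    --images_dir /path/to/tiffs \
    --output_dir /path/to/output \
    --psf_dir /path/to/psfs \
    --spark_work_dir /path/to/spark-work \
    --stitching_json_inputs "488nm,560nm,642nm" \
    --fuse_to_n5_json_inputs "488nm-final,560nm-final,642nm-final"
```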
Also, to try out only the steps you want, you can use the `--skip` parameter to skip certain steps of the pipeline. Keep in mind, however, that if a skipped step never ran and the data it was supposed to generate is required downstream, the downstream process will fail.
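For example, if the earlier steps already completed on a previous run, you could resume at stitching (assuming, as with the other multi-value parameters, that the list is comma-separated; paths hypothetical):

```bash
# Skip the already-completed steps and re-run only stitching, fusing, and export.
./stitching.nf \
    --images_dir /path/to/tiffs \
    --output_dir /path/to/output \
    --psf_dir /path/to/psfs \
    --spark_work_dir /path/to/spark-work \
    --skip prestitching,deconvolution
```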