
Compression Choices

William Silversmith edited this page Oct 25, 2024 · 11 revisions

CloudVolume offers several codecs for each type of data. This short guide (to be improved upon) gives some direction on which one to choose.

Some encodings can be layered with a second-stage bitstream compression. We support gzip and brotli (br) mainly because browsers (and hence Neuroglancer) decompress them automatically. Support for other compressors such as zstd could be added in the future, but Neuroglancer would need a codec for it. Note that brotli is currently not supported for sharded data (Neuroglancer only has a gzip decompression JS module).
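To make the layering concrete, here is a minimal sketch using only the Python standard library. It does not use CloudVolume; the synthetic `raw_chunk` is a made-up stand-in for one raw-encoded chunk, and gzip plays the role of the second stage.

```python
import gzip

# Stage 1 (encoding): "raw" is just the voxel values serialized as bytes.
# Hypothetical stand-in for a 64x64x16 uint8 chunk with slowly varying intensity.
raw_chunk = bytes((i // 4) % 256 for i in range(64 * 64 * 16))

# Stage 2 (bitstream compression): a general-purpose compressor applied to
# the already-encoded bytes. This corresponds to "raw+gzip".
stored = gzip.compress(raw_chunk)

# Reading reverses the stages: decompress first, then decode.
restored = gzip.decompress(stored)

print(len(raw_chunk), "bytes raw,", len(stored), "bytes after gzip")
```

The second stage knows nothing about voxels, which is why a browser can undo it on its own before the encoding-specific decoder runs.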

EM Images

Generally grayscale 8- or 16-bit electron or light microscopy images.

Choices: raw, raw+gzip, raw+br, png, jpeg, jxl

  • If you can tolerate lossy compression, jpeg will be very fast and gives good compression.
  • jxl (JPEG XL) gives good performance with both lossy and lossless options. It can generally compress about 2x as much as jpeg for similar quality. Lossless compression can be better than png and done faster. Results and performance can be tuned with quality and effort parameters. Not yet integrated into mainline Neuroglancer.
  • png is slow, but will give better lossless compression by about 25% compared with raw+gzip.
  • raw+gzip and raw+br have slightly different performance profiles but will give similar compression at the default settings.
  • raw means uncompressed. Very fast on SSD, not so much on remote networks. Untenable for large datasets.
  • jpeg is not available for 16-bit images. The format technically supports them, but only with a specially recompiled library, so in practice treat jpeg as 8-bit only.
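The bullets above can be condensed into a small decision helper. This is only a sketch of the guidance, not part of CloudVolume: `pick_em_encoding` and its parameters are hypothetical names, and returning no second stage for jpeg, png, and jxl is an assumption based on the choices listed above, where only raw is paired with gzip or br.

```python
def pick_em_encoding(bits, lossy_ok, mainline_neuroglancer=True, prefer_smaller=True):
    """Return (encoding, second_stage) for a grayscale EM image layer.

    Hypothetical helper that mirrors the guide's bullets.
    """
    if bits not in (8, 16):
        raise ValueError("this guide covers 8- or 16-bit images")

    if not mainline_neuroglancer:
        # jxl covers both lossy and lossless, but is not in mainline Neuroglancer.
        # The guide states no bit-depth limit for jxl; verify for your data.
        return ("jxl", None)

    if lossy_ok and bits == 8:
        # jpeg: fast, good compression, 8-bit only in practice.
        return ("jpeg", None)

    if prefer_smaller:
        # png: slow, but about 25% smaller than raw+gzip.
        return ("png", None)

    # raw+gzip and raw+br compress similarly at default settings.
    return ("raw", "gzip")


print(pick_em_encoding(bits=8, lossy_ok=True))    # ('jpeg', None)
print(pick_em_encoding(bits=16, lossy_ok=True))   # ('png', None)
```

In CloudVolume the first element would typically be used as the layer's encoding and the second as the compression setting when writing.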

Segmentation

These are usually uint32 or uint64 densely labeled data.

Choices: raw, compressed_segmentation (cseg), compresso, crackle (all +gzip or +br)

  • compressed_segmentation will generally give the best Neuroglancer rendering performance as it's the native representation. Other formats are decoded to raw and then re-encoded as compressed_segmentation in the browser.
  • For smooth segmentation, generally go with compresso+br for the best compression ratio and almost top performance. crackle+br gives superior compression and performance to compresso, but is experimental.
  • For noisy segmentation, go with cseg+br for the best compression and top performance.
  • If you use crackle, please communicate with Will Silversmith.

compresso, crackle, and cseg are codecs designed for connectomics data. Crackle and compresso are novel high compression codecs.
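As a sketch of how the smooth versus noisy recommendation might be applied, the helper below builds keyword arguments for `CloudVolume.create_new_info` and picks the second-stage compression. `segmentation_layer_settings` and `create_layer` are hypothetical names; the data type, chunk size, and zero offset are placeholder assumptions, and `create_layer` is untested here, so check it against the CloudVolume README before relying on it.

```python
def segmentation_layer_settings(smooth, volume_size, resolution):
    """Return (info_kwargs, compress) following the guide's bullets.

    Hypothetical helper: smooth labels -> compresso+br, noisy -> cseg+br.
    """
    encoding = "compresso" if smooth else "compressed_segmentation"
    info_kwargs = dict(
        num_channels=1,
        layer_type="segmentation",
        data_type="uint64",            # assumption: could also be uint32
        encoding=encoding,
        resolution=list(resolution),   # nm per voxel
        voxel_offset=[0, 0, 0],        # assumption: volume starts at the origin
        chunk_size=[64, 64, 64],       # assumption: placeholder chunk size
        volume_size=list(volume_size),
    )
    # Unsharded layers only: brotli is not supported for sharded data.
    return info_kwargs, "br"


def create_layer(cloudpath, info_kwargs, compress):
    """Create the layer's info file. Requires the cloudvolume package."""
    from cloudvolume import CloudVolume  # imported lazily so the helper above runs without it

    info = CloudVolume.create_new_info(**info_kwargs)
    vol = CloudVolume(cloudpath, info=info, compress=compress)
    vol.commit_info()
    return vol


kwargs, compress = segmentation_layer_settings(
    smooth=True, volume_size=[1024, 1024, 128], resolution=[8, 8, 40]
)
print(kwargs["encoding"], "+", compress)
```

Keeping the settings in a plain dictionary makes it easy to inspect or log the choice before anything is written to storage.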

Voxel-Wise Affinities

Intermediate float32 xyz neighbor affinity predictions used for creating segmentation and region graphs. These are very heavy: three float32 channels per voxel make them 12x bigger than an 8-bit base image. More information: https://github.com/seung-lab/cloud-volume/wiki/Advanced-Topic:-fpzip-and-kempressed-Encodings

Choices: raw, raw+gzip, raw+br, fpzip, kempressed

  • Use kempressed for best compression.
  • Note that the official Neuroglancer client cannot display fpzip or kempressed, so you'll have to use raw+X if that's a requirement.
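The 12x figure for affinities follows from simple byte counting, sketched below. The function name and the assumption of an 8-bit base image with three affinity channels (x, y, z) are illustrative.

```python
BYTES_PER_VOXEL = {"uint8": 1, "uint16": 2, "float32": 4}


def affinity_size_ratio(image_dtype="uint8", affinity_channels=3):
    """Uncompressed size of a float32 affinity map relative to its base image."""
    affinity_bytes = affinity_channels * BYTES_PER_VOXEL["float32"]
    return affinity_bytes / BYTES_PER_VOXEL[image_dtype]


print(affinity_size_ratio("uint8"))   # 12.0: 3 channels x 4 bytes vs 1 byte
print(affinity_size_ratio("uint16"))  # 6.0 for a 16-bit base image
```

At that ratio, a 1 TB uncompressed 8-bit image corresponds to 12 TB of uncompressed affinities, which is why a strong codec matters here.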

Alignment Vectors

These are usually float32 images with an X and a Y component. Some older versions are int16; this advice does not apply to them.

Choices: raw, raw+gzip, raw+br, fpzip, zfpc

  • The current best choice is raw+br.
  • zfpc is an experimental lossy codec that will likely be the go-to option in the future. Don't pick it for now unless you are in communication with Will Silversmith.

Visualizing Experimental Codecs

Experimental Codecs: fpzip, kempressed, crackle, and zfpc

These codecs are not integrated into mainline Neuroglancer. However, you can visualize them using a Neuroglancer fork.