Add subsection on spruce case study (DRAFT) #176

Merged
merged 5 commits into from
Nov 18, 2024
12 changes: 12 additions & 0 deletions paper.bib
@@ -2323,3 +2323,15 @@ @article{hemstrom2024next
year={2024},
publisher={Nature Publishing Group}
}

@article{nystedt_NorwaySpruceGenome_2013,
title = {The {{Norway}} Spruce Genome Sequence and Conifer Genome Evolution},
author = {Nystedt, Bj{\"o}rn and Street, Nathaniel R. and Wetterbom, Anna and Zuccolo, Andrea and Lin, Yao-Cheng and Scofield, Douglas G. and Vezzi, Francesco and Delhomme, Nicolas and Giacomello, Stefania and Alexeyenko, Andrey and Vicedomini, Riccardo and Sahlin, Kristoffer and Sherwood, Ellen and Elfstrand, Malin and Gramzow, Lydia and Holmberg, Kristina and H{\"a}llman, Jimmie and Keech, Olivier and Klasson, Lisa and Koriabine, Maxim and Kucukoglu, Melis and K{\"a}ller, Max and Luthman, Johannes and Lysholm, Fredrik and Niittyl{\"a}, Totte and Olson, {\AA}ke and Rilakovic, Nemanja and Ritland, Carol and Rossell{\'o}, Josep A. and Sena, Juliana and Svensson, Thomas and {Talavera-L{\'o}pez}, Carlos and Thei{\ss}en, G{\"u}nter and Tuominen, Hannele and Vanneste, Kevin and Wu, Zhi-Qiang and Zhang, Bo and Zerbe, Philipp and Arvestad, Lars and Bhalerao, Rishikesh and Bohlmann, Joerg and Bousquet, Jean and Garcia Gil, Rosario and Hvidsten, Torgeir R. and {de Jong}, Pieter and MacKay, John and Morgante, Michele and Ritland, Kermit and Sundberg, Bj{\"o}rn and Lee Thompson, Stacey and {Van de Peer}, Yves and Andersson, Bj{\"o}rn and Nilsson, Ove and Ingvarsson, P{\"a}r K. and Lundeberg, Joakim and Jansson, Stefan},
year = {2013},
month = may,
journal = {Nature},
volume = {497},
number = {7451},
pages = {579--584},
publisher = {Nature Publishing Group},
}
79 changes: 79 additions & 0 deletions paper.tex
@@ -979,6 +979,85 @@ \subsection{Case study: Genomics England 100,000 genomes}
to the ``mask-oriented'' analysis of large immutable, single-source
datasets is a potentially transformational change enabled by Zarr.

\subsection{Case study: 1,063 spruce whole-genome samples}

To demonstrate the versatility of \texttt{vcf2zarr}, this section
presents a case study of a dataset from a species whose genome
properties differ substantially from those of humans. The conifer
Norway spruce, \emph{Picea abies}, is one of the largest and most
ecologically important species on Earth and is distributed over large
parts of the Northern Hemisphere. A notable feature is its large
genome, consisting of 12 autosomes with a genome size of 19.6
Gb~\cite{nystedt_NorwaySpruceGenome_2013}. Here, we assess the
performance of \texttt{vcf2zarr} as applied to 1,063 whole-genome
resequenced spruce individuals from a recent study (in preparation),
in which the resequencing data were mapped to a chromosome-scale
reference genome. The dataset consists of approximately 3,745 million
single-nucleotide variants and small indels from the 12 autosomes,
totalling 7.33~TiB of VCF data after \texttt{bgzip} compression,
distributed over 165 VCF files. Because the dataset was originally
generated for downstream analyses that require genotype likelihoods,
the genotype calls include the PL field.

Following the conversion process detailed in the previous section, we
first converted the 165 VCF files (7.33~TiB) to ICF and then converted
the ICF to Zarr. The tasks were run as single jobs on a compute
node with 2,048~GB RAM and 128 cores. Conversion to ICF took
13h45min and produced 6.77~TiB in 1,342,569 data storage files.
Conversion to Zarr took 18h51min and produced a dataset of 32 arrays
occupying 6.6~TiB in 11,984,581 chunk files. The reduction in total
storage relative to the compressed VCF is small (1.1-fold, from 7.33
to 6.6~TiB), mainly because the PL field is inherently difficult to
compress. The fields using the most storage are listed in
Table~\ref{tab-spruce-data}.

\begin{table}
\caption{Summary of the 20 largest VCF Zarr columns produced for
1,063 spruce samples across 12 chromosomes using \texttt{vcf2zarr}
default settings. For column details see the caption of Table~\ref{tab-genomics-england-data}.
\label{tab-spruce-data}}
\begin{tabular}{llS[table-format=3.1]S[table-format=3.2]S[table-format=3.2]}
\toprule
{Field} & {type} & {storage} & {compress} & {\%total} \\
\midrule
/call\_PL & int16 & 6.12 TiB & 12.0 & 92.75\% \\
/call\_genotype & int8 & 282.45 GiB & 26.0 & 4.18\% \\
/call\_genotype\_mask & bool & 42.15 GiB & 180.0 & 0.62\% \\
/variant\_DP4 & int32 & 26.24 GiB & 2.1 & 0.39\% \\
/variant\_MQSBZ & float32 & 13.08 GiB & 1.1 & 0.19\% \\
/variant\_RPBZ & float32 & 13.04 GiB & 1.1 & 0.19\% \\
/variant\_MQBZ & float32 & 12.97 GiB & 1.1 & 0.19\% \\
/variant\_SCBZ & float32 & 12.94 GiB & 1.1 & 0.19\% \\
/variant\_VDB & float32 & 12.84 GiB & 1.1 & 0.19\% \\
/variant\_quality & float32 & 12.84 GiB & 1.1 & 0.19\% \\
/variant\_BQBZ & float32 & 12.64 GiB & 1.1 & 0.19\% \\
/variant\_SGB & float32 & 12.12 GiB & 1.2 & 0.18\% \\
/variant\_position & int32 & 9.65 GiB & 1.4 & 0.14\% \\
/variant\_AC & int16 & 7.42 GiB & 2.8 & 0.11\% \\
/variant\_DP & int32 & 6.71 GiB & 2.1 & 0.10\% \\
/variant\_allele & object & 5.37 GiB & 21.0 & 0.08\% \\
/variant\_AN & int16 & 3.05 GiB & 2.3 & 0.05\% \\
/call\_genotype\_phased & bool & 2.03 GiB & 1800.0 & 0.03\% \\
/variant\_filter & bool & 1.46 GiB & 2.4 & 0.02\% \\
/variant\_MQ & int8 & 489.83 MiB & 7.3 & 0.01\% \\
\bottomrule
\end{tabular}
\end{table}

Table~\ref{tab-spruce-data} shows that storage is dominated by a
single array, call\_PL, which accounts for nearly 93\% of the total.
As noted above, the inherent noisiness of the PL values limits how
well this field can be compressed.

%% FIXME: reference regarding VCF POS?
The spruce dataset poses an interesting challenge to VCF-based
workflows because of its large chromosome sizes ($>$1\,Gbp). Although
POS values are stored as 32-bit integers, the widely used tabix (TBI)
index only supports coordinates up to $2^{29}$ (approximately
537\,Mbp), which mandates splitting larger chromosomes into segments
and adds complexity to downstream analyses. Zarr does not
impose this limitation, and we therefore demonstrate the versatility
of Zarr by updating contig information and the coordinate system for
the spruce dataset.

\section{Discussion}
% Zarr is great
VCF is a central element of modern genomics, facilitating
205 changes: 205 additions & 0 deletions spruce_example/spruce.ipynb
@@ -0,0 +1,205 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 15,
"id": "47ce48d8-1106-4f83-b36e-81e8de9a34c5",
"metadata": {},
"outputs": [],
"source": [
"import sgkit\n",
"import glob\n",
"import os\n",
"import struct\n",
"import numpy as np\n",
"import humanfriendly\n",
"import tabulate\n",
"from distributed import Client\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"import xarray\n",
"import sys\n",
"#import dask.array as da\n",
"import pandas\n",
"from pathlib import Path\n",
"from bio2zarr.vcf2zarr import vcz"
]
},
{
"cell_type": "markdown",
"id": "3a7e6c6c-3834-43b1-8e97-87eec9f0e4b3",
"metadata": {},
"source": [
"#### Data sources"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "915fc1bc-2cf1-4b01-b65c-af9a9b00c6a1",
"metadata": {},
"outputs": [],
"source": [
"WORKDIR=Path(os.environ[\"WORKDIR\"])\n",
"VCFDIR=Path(os.environ[\"VCFDIR\"])\n",
"VCF_FILE_PATTERN=VCFDIR / \"PA_chr*.vcf.gz\"\n",
"ICF_DIR=WORKDIR / \"results\" / \"icf\" / \"spruce.icf\"\n",
"ZARR_DIR=WORKDIR / \"results\" / \"vcz\" / \"spruce.vcz\""
]
},
{
"cell_type": "markdown",
"id": "778e2308-299e-4e0d-8a44-9adf8ccc0d97",
"metadata": {},
"source": [
"#### VCF size and partitions"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "ce62775c-f00c-48bb-a533-5de2aa6f7820",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total compressed VCF: 7.33 TiB across 165 files\n"
]
}
],
"source": [
"files = glob.glob(str(VCF_FILE_PATTERN))\n",
"total_size = sum(os.path.getsize(file) for file in files)\n",
"print(f\"Total compressed VCF: {humanfriendly.format_size(total_size, binary=True)} across {len(files)} files\")"
]
},
{
"cell_type": "markdown",
"id": "5a05128e-874f-4769-a9fc-94f91da48327",
"metadata": {},
"source": [
"#### Inspect Zarr and ICF"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "7980a881-a55a-44d0-84a4-d1bab5ff9e59",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 5min 28s, sys: 1h 15min 43s, total: 1h 21min 12s\n",
"Wall time: 3h 19min 18s\n"
]
}
],
"source": [
"%%time\n",
"zarrvcf_inspec = vcz.inspect(ZARR_DIR)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "cd551096-d0f0-432b-aa5f-2af795956a23",
"metadata": {},
"outputs": [],
"source": [
"zarrdf = pandas.DataFrame(zarrvcf_inspec)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "69dfc006-2a86-4708-9029-84725d7ac8c2",
"metadata": {},
"outputs": [],
"source": [
"zarrdf.to_csv(\"spruce_zarr_inspect.csv\", index=False)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "ffd419ac-fcc9-410a-9542-6759fd44cb82",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 21.3 ms, sys: 4.73 ms, total: 26 ms\n",
"Wall time: 105 ms\n"
]
}
],
"source": [
"%%time\n",
"icf_inspec = vcz.inspect(ICF_DIR)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "f53a8f4c-c5be-400e-b3b2-5fadca016fe6",
"metadata": {},
"outputs": [],
"source": [
"icfdf = pandas.DataFrame(icf_inspec)\n",
"icfdf.to_csv(\"spruce_icf_inspect.csv\", index=False)"
]
},
{
"cell_type": "markdown",
"id": "f3aaf4a4-98eb-4edd-90e6-e66f29df4871",
"metadata": {},
"source": [
"#### Count Zarr files"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "1dc06082-f240-4d6f-b157-ce4a2800e628",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"17227933\n"
]
}
],
"source": [
"! find {ZARR_DIR} | wc -l"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
27 changes: 27 additions & 0 deletions spruce_example/spruce_icf_inspect.csv
@@ -0,0 +1,27 @@
name,type,chunks,size,compressed,max_n,min_val,max_val
CHROM,String,6753,390.65 GiB,164.43 MiB,1,n/a,n/a
POS,Integer,6753,390.65 GiB,5.36 GiB,1, 1, 1e+08
QUAL,Float,6753,390.65 GiB,13.51 GiB,1, 3, 2.7e+05
ID,String,1145,55.81 GiB,1.27 MiB,0,n/a,n/a
FILTERS,String,6753,390.65 GiB,108.89 MiB,0,n/a,n/a
REF,String,6753,390.65 GiB,2.77 GiB,1,n/a,n/a
ALT,String,6997,407.81 GiB,3.93 GiB,3,n/a,n/a
INFO/INDEL,Flag,1257,57.67 GiB,55.59 MiB,1, 1, 1
INFO/IDV,Integer,1257,57.67 GiB,83.68 MiB,1, 1, 4.3e+02
INFO/IMF,Float,1257,57.67 GiB,117.22 MiB,1, 0.007, 1
INFO/DP,Integer,6753,390.65 GiB,7.07 GiB,1, 1, 3.3e+05
INFO/VDB,Float,6751,390.47 GiB,14.5 GiB,1, 0, 1
INFO/RPBZ,Float,6753,390.59 GiB,16.19 GiB,1,-1.6e+04, 1.8e+04
INFO/MQBZ,Float,6753,390.59 GiB,15.61 GiB,1,-4.3e+03, 2.2e+04
INFO/BQBZ,Float,6753,390.56 GiB,15.57 GiB,1,-8.5e+03, 4.2e+03
INFO/MQSBZ,Float,6751,390.29 GiB,13.72 GiB,1,-4.4e+04, 3e+04
INFO/SCBZ,Float,6753,390.59 GiB,15.85 GiB,1,-1.4e+04, 1.4e+04
INFO/FS,Float,6753,390.65 GiB,62.99 MiB,1, 0, 0
INFO/SGB,Float,6753,390.65 GiB,14.55 GiB,1,-2.8e+05, 9.8e+04
INFO/MQ0F,Float,6753,390.65 GiB,62.99 MiB,1, 0, 0
FORMAT/PL,Integer,836866,51.06 TiB,6.02 TiB,10, 0, 4e+02
FORMAT/GT,Integer,363571,22.16 TiB,590.94 GiB,3,-1, 3
INFO/AC,Integer,6879,399.86 GiB,7.64 GiB,3, 1, 2.1e+03
INFO/AN,Integer,6753,390.65 GiB,4.32 GiB,1, 2, 2.1e+03
INFO/DP4,Integer,8543,502.27 GiB,27.36 GiB,4, 0, 3.2e+05
INFO/MQ,Integer,6753,390.65 GiB,900.17 MiB,1, 5, 60
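The `size` and `compressed` columns above can be turned into per-field compression ratios. The sketch below assumes that `size` is the uncompressed size and that all values use the binary units shown in this file; `parse_size` and `compression_ratio` are hypothetical helpers written for illustration.

```python
# Binary unit multipliers for the size strings used in the inspect output.
UNITS = {"bytes": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def parse_size(text):
    """Convert a string such as '51.06 TiB' to a number of bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def compression_ratio(size, compressed):
    """Ratio of uncompressed to compressed size (assumed column meanings)."""
    return parse_size(size) / parse_size(compressed)


# FORMAT/PL row from the table above.
print(round(compression_ratio("51.06 TiB", "6.02 TiB"), 2))
```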
33 changes: 33 additions & 0 deletions spruce_example/spruce_zarr_inspect.csv
@@ -0,0 +1,33 @@
name,dtype,stored,size,ratio,nchunks,chunk_size,avg_chunk_stored,shape,chunk_shape,compressor,filters
/call_PL,int16,6.12 TiB,72.42 TiB, 12,749036,101.38 MiB,8.57 MiB,"(3745170452, 1063, 10)","(10000, 1000, 10)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/call_genotype,int8,282.45 GiB,7.24 TiB, 26,749036,10.14 MiB,395.4 KiB,"(3745170452, 1063, 2)","(10000, 1000, 2)","Blosc(cname='zstd', clevel=7, shuffle=BITSHUFFLE, blocksize=0)",None
/call_genotype_mask,bool,42.15 GiB,7.24 TiB, 1.8e+02,749036,10.14 MiB,59.01 KiB,"(3745170452, 1063, 2)","(10000, 1000, 2)","Blosc(cname='zstd', clevel=7, shuffle=BITSHUFFLE, blocksize=0)",None
/variant_DP4,int32,26.24 GiB,55.81 GiB, 2.1,374518,156.25 KiB,73.47 KiB,"(3745170452, 4)","(10000, 4)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_MQSBZ,float32,13.08 GiB,13.95 GiB, 1.1,374518,39.06 KiB,36.63 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_RPBZ,float32,13.04 GiB,13.95 GiB, 1.1,374518,39.06 KiB,36.51 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_MQBZ,float32,12.97 GiB,13.95 GiB, 1.1,374518,39.06 KiB,36.32 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_SCBZ,float32,12.94 GiB,13.95 GiB, 1.1,374518,39.06 KiB,36.22 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_VDB,float32,12.84 GiB,13.95 GiB, 1.1,374518,39.06 KiB,35.95 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_quality,float32,12.84 GiB,13.95 GiB, 1.1,374518,39.06 KiB,35.95 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_BQBZ,float32,12.64 GiB,13.95 GiB, 1.1,374518,39.06 KiB,35.38 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_SGB,float32,12.12 GiB,13.95 GiB, 1.2,374518,39.06 KiB,33.93 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_position,int32,9.65 GiB,13.95 GiB, 1.4,374518,39.06 KiB,27.02 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_AC,int16,7.42 GiB,20.93 GiB, 2.8,374518,58.59 KiB,20.77 KiB,"(3745170452, 3)","(10000, 3)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_DP,int32,6.71 GiB,13.95 GiB, 2.1,374518,39.06 KiB,18.79 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_allele,object,5.37 GiB,111.61 GiB, 21,374518,312.5 KiB,15.04 KiB,"(3745170452, 4)","(10000, 4)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",[VLenUTF8()]
/variant_AN,int16,3.05 GiB,6.98 GiB, 2.3,374518,19.53 KiB,8.54 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/call_genotype_phased,bool,2.03 GiB,3.62 TiB, 1.8e+03,749036,5.07 MiB,2.84 KiB,"(3745170452, 1063)","(10000, 1000)","Blosc(cname='zstd', clevel=7, shuffle=BITSHUFFLE, blocksize=0)",None
/variant_filter,bool,1.46 GiB,3.49 GiB, 2.4,374518,9.77 KiB,4.09 KiB,"(3745170452, 1)","(10000, 1)","Blosc(cname='zstd', clevel=7, shuffle=BITSHUFFLE, blocksize=0)",None
/variant_MQ,int8,489.83 MiB,3.49 GiB, 7.3,374518,9.77 KiB,1.34 KiB,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_IMF,float32,161.37 MiB,13.95 GiB, 89,374518,39.06 KiB,451 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_IDV,int16,121.14 MiB,6.98 GiB, 59,374518,19.53 KiB,339 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_INDEL,bool,72.6 MiB,3.49 GiB, 49,374518,9.77 KiB,203 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=BITSHUFFLE, blocksize=0)",None
/variant_id,object,34.95 MiB,27.9 GiB, 8.2e+02,374518,78.12 KiB,97 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",[VLenUTF8()]
/variant_id_mask,bool,33.16 MiB,3.49 GiB, 1.1e+02,374518,9.77 KiB,92 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=BITSHUFFLE, blocksize=0)",None
/variant_contig,int16,32.44 MiB,6.98 GiB, 2.2e+02,374518,19.53 KiB,90 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_FS,float32,32.09 MiB,13.95 GiB, 4.5e+02,374518,39.06 KiB,89 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/variant_MQ0F,float32,32.09 MiB,13.95 GiB, 4.5e+02,374518,39.06 KiB,89 bytes,"(3745170452,)","(10000,)","Blosc(cname='zstd', clevel=7, shuffle=NOSHUFFLE, blocksize=0)",None
/contig_length,int64,6.77 KiB,8.18 KiB, 1.2,1,8.18 KiB,6.77 KiB,"(1047,)","(1047,)","Blosc(cname='zstd', clevel=7, shuffle=SHUFFLE, blocksize=0)",None
/contig_id,object,6.2 KiB,8.18 KiB, 1.3,1,8.18 KiB,6.2 KiB,"(1047,)","(1047,)","Blosc(cname='zstd', clevel=7, shuffle=SHUFFLE, blocksize=0)",[VLenUTF8()]
/sample_id,object,5.46 KiB,8.3 KiB, 1.5,2,4.15 KiB,2.73 KiB,"(1063,)","(1000,)","Blosc(cname='zstd', clevel=7, shuffle=SHUFFLE, blocksize=0)",[VLenUTF8()]
/filter_id,object,4.43 KiB,8 bytes, 0.0018,1,8.0 bytes,4.43 KiB,"(1,)","(1,)","Blosc(cname='zstd', clevel=7, shuffle=SHUFFLE, blocksize=0)",[VLenUTF8()]
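The total number of chunks can be obtained by summing the `nchunks` column of this file. The sketch below uses only the standard library and a four-row excerpt with a reduced set of columns, so the printed total covers the excerpt only; `total_chunks` is a hypothetical helper.

```python
import csv
import io

# Excerpt of the inspect output above, reduced to four columns for brevity.
SAMPLE = """name,dtype,stored,nchunks
/call_PL,int16,6.12 TiB,749036
/call_genotype,int8,282.45 GiB,749036
/variant_position,int32,9.65 GiB,374518
/sample_id,object,5.46 KiB,2
"""


def total_chunks(handle):
    """Sum the nchunks column of an inspect CSV opened as a text handle."""
    reader = csv.DictReader(handle)
    return sum(int(row["nchunks"]) for row in reader)


print(total_chunks(io.StringIO(SAMPLE)))
```

To run it on the full file, open it with `open("spruce_zarr_inspect.csv", newline="")` and pass the handle to `total_chunks`. Summing the 32 rows listed above gives 11,984,581, the chunk-file count quoted in the paper text.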