From 40b092c9acea8a57ea6d7c4ad84b640190751599 Mon Sep 17 00:00:00 2001
From: dweemx
Date: Thu, 20 Feb 2020 14:23:01 +0100
Subject: [PATCH 01/32] Add case study Kurmangaliyev Y Z et al., 2019

Add the 2 config files (bbknn and harmony)
Add new section in docs
Update the Hung R et al., 2019 case study with correct link
---
 docs/case-studies.rst | 97 ++++++-
 examples/hungr_2019/10x_bbknn_scenic.config | 12 +-
 .../10x_bbknn_scenic.config | 264 ++++++++++++++++++
 .../10x_harmony_scenic.config | 198 +++++++++++++
 4 files changed, 554 insertions(+), 17 deletions(-)
 create mode 100644 examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config
 create mode 100644 examples/kurmangaliyevyz_2019/10x_harmony_scenic.config

diff --git a/docs/case-studies.rst b/docs/case-studies.rst
index 56ed122f..62e55b6c 100644
--- a/docs/case-studies.rst
+++ b/docs/case-studies.rst
@@ -1,33 +1,108 @@
 Case Studies
 =============
 
+Kurmangaliyev Y Z et al., 2019 - Modular transcriptional programs separately define axon and dendrite connectivity
+-------------------------------------------------------------------------------------------------------------------
+
+Some links related to the case study:
+
+- Paper: https://elifesciences.org/articles/50822
+- GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE126139
+
+Analysis of 10xGenomics Samples
+*******************************
+
+BBKNN and SCENIC
+++++++++++++++++
+
+The following command was used to generate the config:
+
+.. code:: bash
+
+    nextflow config \
+        ~/vib-singlecell-nf/vsn-pipelines \
+        -profile sra,cellranger,pcacv,bbknn,dm6,scenic,scenic_use_cistarget_motifs,scenic_use_cistarget_tracks,singularity,qsub \
+        > nextflow.config
+
+The generated config is available at the ``vsn-pipelines`` GitHub repository: `examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config`_. You should update the lines commented with " TO EDIT" with the correct information.
+
+.. 
_`examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config`: https://github.com/vib-singlecell-nf/vsn-pipelines/blob/master/examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config
+
+To start the pipeline, run the following command:
+
+.. code:: bash
+
+    nextflow \
+        -C nextflow.config \
+        run ~/vib-singlecell-nf/vsn-pipelines \
+        -entry sra_cellranger_bbknn_scenic -resume
+
+The resulting loom file is available at `kurmangaliyevyz_2019_10x_bbknn_scenic`_ and is ready to be explored in `SCope `_.
+
+.. _`kurmangaliyevyz_2019_10x_bbknn_scenic`: https://cloud.aertslab.org/index.php/s/dpmQyKAW5cWn9RF
+
+Harmony and SCENIC (append mode)
+++++++++++++++++++++++++++++++++
+
+.. code:: bash
+
+    nextflow config \
+        ~/vib-singlecell-nf/vsn-pipelines \
+        -profile tenx,pcacv,harmony,scenic_append_only,singularity \
+        > nextflow.config
+
+The generated config is available at the ``vsn-pipelines`` GitHub repository: `examples/kurmangaliyevyz_2019/10x_harmony_scenic.config`_. You should update the lines commented with " TO EDIT" with the correct information.
+
+.. _`examples/kurmangaliyevyz_2019/10x_harmony_scenic.config`: https://github.com/vib-singlecell-nf/vsn-pipelines/blob/master/examples/kurmangaliyevyz_2019/10x_harmony_scenic.config
+
+To start the pipeline, run the following command:
+
+.. code:: bash
+
+    nextflow \
+        -C nextflow.config \
+        run ~/vib-singlecell-nf/vsn-pipelines \
+        -entry harmony_scenic -resume
+
+The resulting loom file is available at `kurmangaliyevyz_2019_harmony_scenic`_ and is ready to be explored in `SCope `_.
+
+.. 
_`kurmangaliyevyz_2019_harmony_scenic`: https://cloud.aertslab.org/index.php/s/92bR4LfLDbtDM8F
+
 Hung R et al., 2019 - A cell atlas of the adult Drosophila midgut
 -----------------------------------------------------------------
 
 Some links related to the case study:
 
-- GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120537
 - Paper: https://www.pnas.org/content/117/3/1514.abstract
+- GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120537
 
-Analysis of 10x Samples
-************************
+Analysis of 10xGenomics Samples
+*******************************
 
-The following command was used to generate the config::
+The following command was used to generate the config:
+
+.. code:: bash
 
     nextflow config \
-       ~/vib-singlecell-nf \
+       ~/vib-singlecell-nf/vsn-pipelines \
        -profile singularity,sra,cellranger,pcacv,bbknn,scenic \
        > nextflow.config
 
-The generated config is available in ``examples/hungr_2019/10x_bbknn_scenic.config``.
+The generated config is available at the ``vsn-pipelines`` GitHub repository: `examples/hungr_2019/10x_bbknn_scenic.config`_. You should update the lines commented with " TO EDIT" with the correct information.
+
+.. _`examples/hungr_2019/10x_bbknn_scenic.config`: https://github.com/vib-singlecell-nf/vsn-pipelines/blob/master/examples/hungr_2019/10x_bbknn_scenic.config
 
-To start the pipeline, run the following command::
+To start the pipeline, run the following command:
+
+.. code:: bash
 
     nextflow \
-      -C nextflow.config \
-      run ~/vib-singlecell-nf \
-      -entry sra_cellranger_bbknn_scenic
+       -C nextflow.config \
+       run ~/vib-singlecell-nf/vsn-pipelines \
+       -entry sra_cellranger_bbknn_scenic
+
+The resulting loom file is available at `hungr_2019_bbknn_scenic.loom`_ and is ready to be explored in `SCope `_.
 
-The resulting loom file is available here: ``examples/hungr_2019/10x_bbknn_scenic.loom`` and is ready to be explored in `SCope `_.
+.. 
_`hungr_2019_bbknn_scenic.loom`: https://cloud.aertslab.org/index.php/s/PeBcfa8ggzbjZRr
\ No newline at end of file
diff --git a/examples/hungr_2019/10x_bbknn_scenic.config b/examples/hungr_2019/10x_bbknn_scenic.config
index fb8afca8..6d69ae2a 100644
--- a/examples/hungr_2019/10x_bbknn_scenic.config
+++ b/examples/hungr_2019/10x_bbknn_scenic.config
@@ -115,13 +115,13 @@ params {
             cistarget {
                 adj = 'adj.tsv'
                 // motif feather format databases
-                mtfDB = "/staging/leuven/res_00001/databases/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/gene_based/dm6-5kb-upstream-full-tx-11species.mc8nr.feather"
+                mtfDB = "dm6-5kb-upstream-full-tx-11species.mc8nr.feather" // TO EDIT
                 // motif annotations
-                mtfANN = "/staging/leuven/res_00001/databases/cistarget/motif2tf/motifs-v8-nr.flybase-m0.001-o0.0.tbl"
+                mtfANN = "motifs-v8-nr.flybase-m0.001-o0.0.tbl" // TO EDIT
                 // track feather format databases
-                trkDB = "/ddn1/vol1/staging/leuven/stg_00002/lcb/saibar/Projects/epiSCENIC/2019-06_ChipDB/Chip_dbs_v2_20190627/encode_modERN_20190621__ChIP_seq.max_GENEBASED.feather"
+                trkDB = "encode_modERN_20190621__ChIP_seq.max_GENEBASED.feather" // TO EDIT
                 // track annotations
-                trkANN = "/ddn1/vol1/staging/leuven/stg_00002/lcb/dwmax/documents/resources/scenic/db/dm6/encode_modERN_20190621_dm6_annotation.track_to_tf_in_motif_to_tf_format.tsv"
+                trkANN = "encode_modERN_20190621_dm6_annotation.track_to_tf_in_motif_to_tf_format.tsv" // TO EDIT
                 type = ''
                 output = 'reg.csv'
                 rank_threshold = 5000
@@ -154,7 +154,7 @@ params {
                 processExecutor = 'qsub'
             }
             count {
-                transcriptome = '/ddn1/vol1/staging/leuven/stg_00002/lcb/dwmax/documents/resources/refs/flybase/r6.31_premrna_v3/cellranger/3.1.0/flybase_r6.31_premrna_v3.1'
+                transcriptome = 'cellranger/3.1.0/flybase_r6.31_premrna_v3.1' // TO EDIT
                 ppn = 34
                 pmem = '5400mb'
                 maxForks = 2
@@ -264,5 +264,5 @@ dag {
 singularity {
     enabled = true
     autoMounts = true
-    runOptions = '-B /ddn1/vol1/staging/leuven/stg_00002/,/staging/leuven/stg_00002/'
+
runOptions = '-B ' // TO EDIT } diff --git a/examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config b/examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config new file mode 100644 index 00000000..ebfa1ea4 --- /dev/null +++ b/examples/kurmangaliyevyz_2019/10x_bbknn_scenic.config @@ -0,0 +1,264 @@ +manifest { + name = 'vib-singlecell-nf/vsn-pipelines' + description = 'A repository of pipelines for single-cell data in Nextflow DSL2' + homePage = 'https://github.com/vib-singlecell-nf/vsn-pipelines' + version = '0.11.0' + mainScript = 'main.nf' + defaultBranch = 'master' + nextflowVersion = '!19.12.0-edge' +} + +params { + global { + project_name = 'Kurmangaliyev2019_10x_Brain_Pupa_T4-T5' + outdir = 'out' + qsubaccount = '' + species = 'fly' + genome { + assembly = 'dm6' + } + } + sc { + scope { + genome = 'Flybase r6.31 pre-mRNA v3.1 (https://github.com/FlyCellAtlas/genome_references/commit/a0c77d922ce88d9df1f72b79ab59996898bac78c)' + tree { + level_1 = 'Brain' + level_2 = 'Pupa' + level_3 = 'T4/T5' + } + } + file_converter { + iff = '10x_cellranger_mex' + off = 'h5ad' + tagCellWithSampleId = true + useFilteredMatrix = true + } + file_concatenator { + join = 'outer' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + scanpy { + container = 'vibsinglecellnf/scanpy:0.5.0' + filter { + report_ipynb = '/src/scanpy/bin/reports/sc_filter_qc_report.ipynb' + cellFilterMinNGenes = 1000 + cellFilterMaxNGenes = 2000 + cellFilterMaxPercentMito = 0.05 + geneFilterMinNCells = 3 + iff = '10x_cellranger_mex' + off = 'h5ad' + outdir = 'out' + } + data_transformation { + dataTransformationMethod = 'log1p' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + normalization { + normalizationMethod = 'cpx' + countsPerCellAfter = 10000 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + feature_selection { + report_ipynb = '/src/scanpy/bin/reports/sc_select_variable_genes_report.ipynb' + featureSelectionMethod = 'mean_disp_plot' + featureSelectionMinMean = 0.01 + featureSelectionMaxMean = 5 + 
featureSelectionMinDisp = 0.5 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + feature_scaling { + featureScalingMthod = 'zscore_scale' + featureScalingMaxSD = 10 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + neighborhood_graph { + iff = '10x_cellranger_mex' + off = 'h5ad' + } + dim_reduction { + report_ipynb = '/src/scanpy/bin/reports/sc_dim_reduction_report.ipynb' + pca { + dimReductionMethod = 'PCA' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + umap { + dimReductionMethod = 'UMAP' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + tsne { + dimReductionMethod = 't-SNE' + nJobs = 10 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + } + clustering { + report_ipynb = '/src/scanpy/bin/reports/sc_clustering_report.ipynb' + clusteringMethods = ['louvain','leiden'] + resolutions = [0.4, 0.8, 1.0, 1.2, 1.6, 2.0, 4.0] + iff = '10x_cellranger_mex' + off = 'h5ad' + } + marker_genes { + method = 'wilcoxon' + ngenes = 0 + groupby = 'louvain' + off = 'h5ad' + } + batch_effect_correct { + batchEffectCorrectionMethod = 'bbknn' + report_ipynb = '/src/scanpy/bin/reports/sc_bbknn_report.ipynb' + neighborsWithinBatch = 5 + trim = 0 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + } + scenic { + container = 'vibsinglecellnf/scenic:0.9.19' + scenicoutdir = 'out/scenic/' + report_ipynb = '/src/scenic/bin/reports/scenic_report.ipynb' + filteredLoom = '' + scenicOutputLoom = 'SCENIC_output.loom' + scenicScopeOutputLoom = 'SCENIC_SCope_output.loom' + mode = 'dask_multiprocessing' + client_or_address = '' + numWorkers = 8 + cell_id_attribute = 'CellID' + gene_attribute = 'Gene' + grn { + seed = '' + pmem = '2gb' + maxForks = 1 + numWorkers = 16 + tfs = 'allTFs_dmel.txt' // TO EDIT + } + cistarget { + adj = 'adj.tsv' + type = '' + output = 'reg.csv' + rank_threshold = 5000 + auc_threshold = 0.05 + nes_threshold = 3.0 + min_orthologous_identity = 0.0 + max_similarity_fdr = 0.001 + annotations_fname = '' + thresholds = '0.75,0.90' + top_n_targets = 50 + top_n_regulators = '5,10,50' + 
min_genes = 20 + pmem = '2gb' + maxForks = 1 + numWorkers = 8 + motifsDb = 'dm6-5kb-upstream-full-tx-11species.mc8nr.feather' // TO EDIT + motifsAnnotation = 'motifs-v8-nr.flybase-m0.001-o0.0.tbl' // TO EDIT + tracksDb = 'encode_modERN_20190621__ChIP_seq.max_GENEBASED.feather' // TO EDIT + tracksAnnotation = 'encode_modERN_20190621_dm6_annotation.track_to_tf_in_motif_to_tf_format.tsv' // TO EDIT + } + aucell { + output = 'aucell_output.loom' + rank_threshold = 5000 + auc_threshold = 0.05 + nes_threshold = 3.0 + pmem = '2gb' + maxForks = 1 + numWorkers = 8 + } + } + cellranger { + container = 'vibsinglecellnf/cellranger:3.1.0' + labels { + processExecutor = 'local' + } + count { + transcriptome = 'cellranger/3.1.0/flybase_r6.31_premrna_v3.1' // TO EDIT + ppn = 16 + pmem = '6gb' + walltime = '24:00:00' + maxForks = 1 + } + } + } + utils { + container = 'vibsinglecellnf/utils:0.2.1' + workflow_configuration { + report_ipynb = '/src/utils/bin/reports/workflow_configuration_template.ipynb' + } + sra_metadata { + mode = 'web' + } + } + parseConfig = { sample, paramsGlobal, paramsLocal -> + def pL = paramsLocal.collectEntries { k,v -> + if (v instanceof Map) { + if (v.containsKey(sample)) + return [k, v[sample]] + if (v.containsKey('default')) + return [k, v['default']] + throw new Exception("Not a valid entry in " + k + ". 
The sample " + sample + " is not found in " + v +" ; Make sure your samples are correctly specified when using the multi-sample feature.") + } else { + return [k,v] + } + } + return [global: paramsGlobal, local: pL] + } + data { + sra = [[id:'SRP184201', samples:['*']]] + } + sratoolkit { + container = 'vibsinglecellnf/sratoolkit:2.9.4-1.1.0' + downloadFastqs { + threads = 8 + maxForks = 1 + } + } + pcacv { + container = 'vibsinglecellnf/pcacv:0.1.0' + find_optimal_npcs { + accessor = '@assays$RNA@scale.data' + nCores = 8 + } + } +} + +process { + withLabel:qsub { + executor = 'pbs' + } + withLabel:local { + executor = 'local' + } +} + +timeline { + enabled = true + file = 'out/nextflow_reports/execution_timeline.html' +} + +report { + enabled = true + file = 'out/nextflow_reports/execution_report.html' +} + +trace { + enabled = true + file = 'out/nextflow_reports/execution_trace.txt' +} + +dag { + enabled = true + file = 'out/nextflow_reports/pipeline_dag.svg' +} + +singularity { + enabled = true + autoMounts = true + runOptions = '-B ' // TO EDIT +} diff --git a/examples/kurmangaliyevyz_2019/10x_harmony_scenic.config b/examples/kurmangaliyevyz_2019/10x_harmony_scenic.config new file mode 100644 index 00000000..6eea0d36 --- /dev/null +++ b/examples/kurmangaliyevyz_2019/10x_harmony_scenic.config @@ -0,0 +1,198 @@ +manifest { + name = 'vib-singlecell-nf/vsn-pipelines' + description = 'A repository of pipelines for single-cell data in Nextflow DSL2' + homePage = 'https://github.com/vib-singlecell-nf/vsn-pipelines' + version = '0.12.0' + mainScript = 'main.nf' + defaultBranch = 'master' + nextflowVersion = '!19.12.0-edge' +} + +params { + global { + project_name = 'Kurmangaliyev2019_10x_Brain_Pupa_T4-T5' + outdir = 'out' + qsubaccount = '' + species = 'fly' + genome { + assembly = 'dm6' + } + } + sc { + scope { + genome = 'Flybase r6.31 pre-mRNA v3.1 (https://github.com/FlyCellAtlas/genome_references/commit/a0c77d922ce88d9df1f72b79ab59996898bac78c)' + tree { + level_1 
= 'Brain' + level_2 = 'Pupa' + level_3 = 'T4/T5' + } + } + file_converter { + iff = '10x_cellranger_mex' + off = 'h5ad' + tagCellWithSampleId = true + useFilteredMatrix = true + } + file_concatenator { + join = 'outer' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + scanpy { + container = 'vibsinglecellnf/scanpy:0.5.0' + filter { + report_ipynb = '/src/scanpy/bin/reports/sc_filter_qc_report.ipynb' + cellFilterMinNGenes = 1000 + cellFilterMaxNGenes = 2000 + cellFilterMaxPercentMito = 0.05 + geneFilterMinNCells = 3 + iff = '10x_cellranger_mex' + off = 'h5ad' + outdir = 'out' + } + data_transformation { + dataTransformationMethod = 'log1p' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + normalization { + normalizationMethod = 'cpx' + countsPerCellAfter = 10000 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + feature_selection { + report_ipynb = '/src/scanpy/bin/reports/sc_select_variable_genes_report.ipynb' + featureSelectionMethod = 'mean_disp_plot' + featureSelectionMinMean = 0.01 + featureSelectionMaxMean = 5 + featureSelectionMinDisp = 0.5 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + feature_scaling { + featureScalingMthod = 'zscore_scale' + featureScalingMaxSD = 10 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + neighborhood_graph { + iff = '10x_cellranger_mex' + off = 'h5ad' + } + dim_reduction { + report_ipynb = '/src/scanpy/bin/reports/sc_dim_reduction_report.ipynb' + pca { + dimReductionMethod = 'PCA' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + umap { + dimReductionMethod = 'UMAP' + iff = '10x_cellranger_mex' + off = 'h5ad' + } + tsne { + dimReductionMethod = 't-SNE' + nJobs = 10 + iff = '10x_cellranger_mex' + off = 'h5ad' + } + } + clustering { + report_ipynb = '/src/scanpy/bin/reports/sc_clustering_report.ipynb' + clusteringMethods = ['louvain','leiden'] + resolutions = [0.4, 0.8, 1.0, 1.2, 1.6, 2.0, 4.0] + iff = '10x_cellranger_mex' + off = 'h5ad' + } + marker_genes { + method = 'wilcoxon' + ngenes = 0 + groupby = 'louvain' + off = 'h5ad' 
+ } + } + harmony { + container = 'vibsinglecellnf/harmony:1.0' + report_ipynb = '/src/harmony/bin/reports/sc_harmony_report.ipynb' + varsUse = ['batch'] + } + scenic { + container = 'vibsinglecellnf/scenic:0.9.19' + report_ipynb = '/src/scenic/bin/reports/scenic_report.ipynb' + existingScenicLoom = 'out/scenic/Kurmangaliyev2019_10x_Brain_Pupa_T4-T5.SCENIC_output.loom' // TO EDIT + sampleSuffixWithExtension = '.SCENIC_output.loom' + scenicoutdir = 'out/scenic/' + scenicScopeOutputLoom = 'SCENIC_SCope_output.loom' + } + } + utils { + container = 'vibsinglecellnf/utils:0.2.1' + workflow_configuration { + report_ipynb = '/src/utils/bin/reports/workflow_configuration_template.ipynb' + } + } + parseConfig = { sample, paramsGlobal, paramsLocal -> + def pL = paramsLocal.collectEntries { k,v -> + if (v instanceof Map) { + if (v.containsKey(sample)) + return [k, v[sample]] + if (v.containsKey('default')) + return [k, v['default']] + throw new Exception("Not a valid entry in " + k + ". The sample " + sample + " is not found in " + v +" ; Make sure your samples are correctly specified when using the multi-sample feature.") + } else { + return [k,v] + } + } + return [global: paramsGlobal, local: pL] + } + data { + tenx { + cellranger_outs_dir_path = 'out/counts/T4_T5_*/outs/' // TO EDIT + } + } + pcacv { + container = 'vibsinglecellnf/pcacv:0.1.0' + find_optimal_npcs { + accessor = '@assays$RNA@scale.data' + nCores = 8 + } + } +} + +process { + executor = 'local' + withLabel:qsub { + executor = 'pbs' + } + withLabel:local { + executor = 'local' + } +} + +timeline { + enabled = true + file = 'out/nextflow_reports/execution_timeline.html' +} + +report { + enabled = true + file = 'out/nextflow_reports/execution_report.html' +} + +trace { + enabled = true + file = 'out/nextflow_reports/execution_trace.txt' +} + +dag { + enabled = true + file = 'out/nextflow_reports/pipeline_dag.svg' +} + +singularity { + enabled = true + autoMounts = true + runOptions = '-B ' // TO EDIT +} From 
d3b277eb41a7c98e75a3359414aa1dd2eedbcfb5 Mon Sep 17 00:00:00 2001 From: dweemx Date: Thu, 20 Feb 2020 21:48:10 +0100 Subject: [PATCH 02/32] Fix for incorrect input cardinality --- workflows/single_sample.nf | 4 +++- workflows/single_sample_star.nf | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/workflows/single_sample.nf b/workflows/single_sample.nf index 734926ed..6ea600cd 100644 --- a/workflows/single_sample.nf +++ b/workflows/single_sample.nf @@ -60,7 +60,9 @@ workflow single_sample_base { // Publishing SC__PUBLISH_H5AD( - CLUSTER_IDENTIFICATION.out.marker_genes, + CLUSTER_IDENTIFICATION.out.marker_genes.map { + it -> tuple(it[0], it[1], null) + }, params.global.project_name+".single_sample.output" ) diff --git a/workflows/single_sample_star.nf b/workflows/single_sample_star.nf index 1fec17c5..e94e68d2 100644 --- a/workflows/single_sample_star.nf +++ b/workflows/single_sample_star.nf @@ -55,7 +55,9 @@ workflow single_sample_star { // Publishing SC__PUBLISH_H5AD( - CLUSTER_IDENTIFICATION.out.marker_genes, + CLUSTER_IDENTIFICATION.out.marker_genes.map { + it -> tuple(it[0], it[1], null) + }, "single_sample.output" ) From b6fe7bfcf283d74d45d756bda710d26f54084aa5 Mon Sep 17 00:00:00 2001 From: dweemx Date: Thu, 20 Feb 2020 21:49:08 +0100 Subject: [PATCH 03/32] Remove redundant prefixes in params --- conf/test__bbknn.config | 2 +- conf/test__harmony.config | 2 +- conf/test__mnncorrect.config | 2 +- conf/test__single_sample.config | 2 +- conf/test__single_sample_scenic.config | 2 +- conf/test__single_sample_scenic_multiruns.config | 2 +- docs/development.rst | 4 ++-- docs/features.rst | 2 +- 8 files changed, 9 insertions(+), 9 deletions(-) diff --git a/conf/test__bbknn.config b/conf/test__bbknn.config index 5dcc55d8..ddd3f694 100644 --- a/conf/test__bbknn.config +++ b/conf/test__bbknn.config @@ -21,7 +21,7 @@ params { } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' nComps = 2 } } diff --git a/conf/test__harmony.config 
b/conf/test__harmony.config index 3ecd4dd6..0957eee2 100644 --- a/conf/test__harmony.config +++ b/conf/test__harmony.config @@ -21,7 +21,7 @@ params { } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' nComps = 2 } } diff --git a/conf/test__mnncorrect.config b/conf/test__mnncorrect.config index 5dcc55d8..ddd3f694 100644 --- a/conf/test__mnncorrect.config +++ b/conf/test__mnncorrect.config @@ -21,7 +21,7 @@ params { } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' nComps = 2 } } diff --git a/conf/test__single_sample.config b/conf/test__single_sample.config index 0d11a2e4..f3539112 100644 --- a/conf/test__single_sample.config +++ b/conf/test__single_sample.config @@ -21,7 +21,7 @@ params { } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' nComps = 2 } } diff --git a/conf/test__single_sample_scenic.config b/conf/test__single_sample_scenic.config index 93e0d57c..42fb2fad 100644 --- a/conf/test__single_sample_scenic.config +++ b/conf/test__single_sample_scenic.config @@ -21,7 +21,7 @@ params { } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' nComps = 2 } } diff --git a/conf/test__single_sample_scenic_multiruns.config b/conf/test__single_sample_scenic_multiruns.config index 497ca14e..a10e27f4 100644 --- a/conf/test__single_sample_scenic_multiruns.config +++ b/conf/test__single_sample_scenic_multiruns.config @@ -21,7 +21,7 @@ params { } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' nComps = 2 } } diff --git a/docs/development.rst b/docs/development.rst index f29f267e..50e40ec3 100644 --- a/docs/development.rst +++ b/docs/development.rst @@ -602,11 +602,11 @@ The parameter structure internally (post-merge) is: } dim_reduction { pca { - dimReductionMethod = 'PCA' + method = 'PCA' ... } umap { - dimReductionMethod = 'UMAP' + method = 'UMAP' ... 
                 }
             }
diff --git a/docs/features.rst b/docs/features.rst
index 98fad58d..e9cd7932 100644
--- a/docs/features.rst
+++ b/docs/features.rst
@@ -166,7 +166,7 @@ Since ``v0.9.0``, it is possible to explore several combinations of parameters.
 
 - ``method`` ::
 
-    clusteringMethods = ['louvain','leiden']
+    methods = ['louvain','leiden']
 
 - ``resolution`` ::
 
From d60194771c8a5388096f76cfe12a6ed1af9b68b2 Mon Sep 17 00:00:00 2001
From: dweemx
Date: Fri, 21 Feb 2020 09:59:06 +0100
Subject: [PATCH 04/32] Lowercase and remove special chars from the dim reduction method names

---
 conf/test__bbknn.config | 2 +-
 conf/test__harmony.config | 2 +-
 conf/test__mnncorrect.config | 2 +-
 conf/test__single_sample.config | 2 +-
 conf/test__single_sample_scenic.config | 2 +-
 conf/test__single_sample_scenic_multiruns.config | 2 +-
 docs/development.rst | 4 ++--
 7 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/conf/test__bbknn.config b/conf/test__bbknn.config
index ddd3f694..c18bae14 100644
--- a/conf/test__bbknn.config
+++ b/conf/test__bbknn.config
@@ -21,7 +21,7 @@ params {
             }
             dim_reduction {
                 pca {
-                    method = 'PCA'
+                    method = 'pca'
                     nComps = 2
                 }
             }
diff --git a/conf/test__harmony.config b/conf/test__harmony.config
index 0957eee2..fa1fc18c 100644
--- a/conf/test__harmony.config
+++ b/conf/test__harmony.config
@@ -21,7 +21,7 @@ params {
             }
             dim_reduction {
                 pca {
-                    method = 'PCA'
+                    method = 'pca'
                     nComps = 2
                 }
             }
diff --git a/conf/test__mnncorrect.config b/conf/test__mnncorrect.config
index ddd3f694..c18bae14 100644
--- a/conf/test__mnncorrect.config
+++ b/conf/test__mnncorrect.config
@@ -21,7 +21,7 @@ params {
             }
             dim_reduction {
                 pca {
-                    method = 'PCA'
+                    method = 'pca'
                     nComps = 2
                 }
             }
diff --git a/conf/test__single_sample.config b/conf/test__single_sample.config
index f3539112..9f9122ab 100644
--- a/conf/test__single_sample.config
+++ b/conf/test__single_sample.config
@@ -21,7 +21,7 @@ params {
             }
             dim_reduction {
                 pca {
-                    method = 'PCA'
+                    method = 'pca'
                     nComps = 2
                 }
             }
diff --git 
a/conf/test__single_sample_scenic.config b/conf/test__single_sample_scenic.config
index 42fb2fad..0baa738f 100644
--- a/conf/test__single_sample_scenic.config
+++ b/conf/test__single_sample_scenic.config
@@ -21,7 +21,7 @@ params {
             }
             dim_reduction {
                 pca {
-                    method = 'PCA'
+                    method = 'pca'
                     nComps = 2
                 }
             }
diff --git a/conf/test__single_sample_scenic_multiruns.config b/conf/test__single_sample_scenic_multiruns.config
index a10e27f4..d2a9410a 100644
--- a/conf/test__single_sample_scenic_multiruns.config
+++ b/conf/test__single_sample_scenic_multiruns.config
@@ -21,7 +21,7 @@ params {
             }
             dim_reduction {
                 pca {
-                    method = 'PCA'
+                    method = 'pca'
                     nComps = 2
                 }
             }
diff --git a/docs/development.rst b/docs/development.rst
index 50e40ec3..31ad648a 100644
--- a/docs/development.rst
+++ b/docs/development.rst
@@ -602,11 +602,11 @@ The parameter structure internally (post-merge) is:
                 }
                 dim_reduction {
                     pca {
-                        method = 'PCA'
+                        method = 'pca'
                         ...
                     }
                     umap {
-                        method = 'UMAP'
+                        method = 'umap'
                         ...
                     }
                 }
From 923a960500e0517fe8cf78542a4e9117351e394d Mon Sep 17 00:00:00 2001
From: dweemx
Date: Fri, 21 Feb 2020 23:50:35 +0100
Subject: [PATCH 05/32] Add regressing out variable step to the pipelines

This is not done by default
---
 src/utils/bin/sc_file_concatenator.py | 1 +
 src/utils/bin/sc_file_converter.py | 4 ++++
 workflows/bbknn.nf | 8 +++++++-
 workflows/harmony.nf | 8 +++++++-
 workflows/mnncorrect.nf | 11 ++++++++---
 workflows/single_sample.nf | 8 +++++++-
 workflows/single_sample_star.nf | 8 +++++++-
 7 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/utils/bin/sc_file_concatenator.py b/src/utils/bin/sc_file_concatenator.py
index d4eb6032..7b6e428a 100755
--- a/src/utils/bin/sc_file_concatenator.py
+++ b/src/utils/bin/sc_file_concatenator.py
@@ -76,6 +76,7 @@
         join=args.join,
         index_unique=index_unique
     )
+    adata = adata[:, np.sort(adata.var.index)]
 else:
     raise Exception("Concatenation of .{} files is not implemented.".format(args.format))
 
diff --git 
a/src/utils/bin/sc_file_converter.py b/src/utils/bin/sc_file_converter.py index 9772cced..2eb6d8ac 100755 --- a/src/utils/bin/sc_file_converter.py +++ b/src/utils/bin/sc_file_converter.py @@ -4,6 +4,7 @@ import os import re import scanpy as sc +import numpy as np in_formats = [ '10x_cellranger_mex', @@ -114,6 +115,7 @@ def add_sample_id(adata, args): # If is tag_cell_with_sample_id is given, add the sample ID as suffix if args.tag_cell_with_sample_id: adata.obs.index = map(lambda x: re.sub('-[0-9]+', f"-{args.sample_id}", x), adata.obs.index) + adata = adata[:, np.sort(adata.var.index)] print("Writing 10x data to h5ad...") adata.write_h5ad(filename="{}.h5ad".format(FILE_PATH_OUT_BASENAME)) @@ -132,6 +134,7 @@ def add_sample_id(adata, args): # If is tag_cell_with_sample_id is given, add the sample ID as suffix if args.tag_cell_with_sample_id: adata.obs.index = map(lambda x: re.sub('-[0-9]+', f"-{args.sample_id}", x), adata.obs.index) + adata = adata[:, np.sort(adata.var.index)] print("Writing 10x data to h5ad...") adata.write_h5ad(filename="{}.h5ad".format(FILE_PATH_OUT_BASENAME)) @@ -146,6 +149,7 @@ def add_sample_id(adata, args): delimiter=delim, first_column_names=True ).T + adata = adata[:, np.sort(adata.var.index)] adata.write_h5ad(filename="{}.h5ad".format(FILE_PATH_OUT_BASENAME)) else: diff --git a/workflows/bbknn.nf b/workflows/bbknn.nf index c57b9823..b703f6a7 100644 --- a/workflows/bbknn.nf +++ b/workflows/bbknn.nf @@ -9,6 +9,7 @@ include '../src/utils/processes/utils.nf' params(params.sc.file_concatenator + p include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params) include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params + params.global) include HVG_SELECTION from '../src/scanpy/workflows/hvg_selection.nf' params(params + params.global) +include SC__SCANPY__REGRESS_OUT from '../src/scanpy/processes/regress_out.nf' params(params) include NEIGHBORHOOD_GRAPH from 
'../src/scanpy/workflows/neighborhood_graph.nf' params(params) include DIM_REDUCTION_PCA from '../src/scanpy/workflows/dim_reduction_pca.nf' params(params + params.global) include DIM_REDUCTION_TSNE_UMAP from '../src/scanpy/workflows/dim_reduction.nf' params(params + params.global) @@ -45,7 +46,12 @@ workflow bbknn_base { ) NORMALIZE_TRANSFORM( SC__FILE_CONCATENATOR.out ) HVG_SELECTION( NORMALIZE_TRANSFORM.out ) - DIM_REDUCTION_PCA( HVG_SELECTION.out.scaled ) + if(params.sc.scanpy.containsKey("regress_out")) { + preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) + } else { + preprocessed_data = HVG_SELECTION.out.scaled + } + DIM_REDUCTION_PCA( preprocessed_data ) NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) diff --git a/workflows/harmony.nf b/workflows/harmony.nf index 3d386854..f30a037a 100644 --- a/workflows/harmony.nf +++ b/workflows/harmony.nf @@ -9,6 +9,7 @@ include '../src/utils/processes/utils.nf' params(params.sc.file_concatenator + p include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params) include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params + params.global) include HVG_SELECTION from '../src/scanpy/workflows/hvg_selection.nf' params(params + params.global) +include SC__SCANPY__REGRESS_OUT from '../src/scanpy/processes/regress_out.nf' params(params) include NEIGHBORHOOD_GRAPH from '../src/scanpy/workflows/neighborhood_graph.nf' params(params) include DIM_REDUCTION_PCA from '../src/scanpy/workflows/dim_reduction_pca.nf' params(params + params.global) include DIM_REDUCTION_TSNE_UMAP from '../src/scanpy/workflows/dim_reduction.nf' params(params + params.global) @@ -46,7 +47,12 @@ workflow harmony_base { ) NORMALIZE_TRANSFORM( SC__FILE_CONCATENATOR.out ) HVG_SELECTION( NORMALIZE_TRANSFORM.out ) - DIM_REDUCTION_PCA( HVG_SELECTION.out.scaled ) + if(params.sc.scanpy.containsKey("regress_out")) { + preprocessed_data = 
SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) + } else { + preprocessed_data = HVG_SELECTION.out.scaled + } + DIM_REDUCTION_PCA( preprocessed_data ) NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) diff --git a/workflows/mnncorrect.nf b/workflows/mnncorrect.nf index 4bf48c3e..680e64d5 100644 --- a/workflows/mnncorrect.nf +++ b/workflows/mnncorrect.nf @@ -8,7 +8,7 @@ include '../src/utils/processes/utils.nf' params(params.sc.file_concatenator + p include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params) include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params) -// include SC__SCANPY__ADJUSTMENT from '../src/scanpy/processes/adjust.nf' params(params) +include SC__SCANPY__REGRESS_OUT from '../src/scanpy/processes/regress_out.nf' params(params) include HVG_SELECTION from '../src/scanpy/workflows/hvg_selection.nf' params(params) include NEIGHBORHOOD_GRAPH from '../src/scanpy/workflows/neighborhood_graph.nf' params(params) include DIM_REDUCTION_PCA from '../src/scanpy/workflows/dim_reduction_pca.nf' params(params + params.global) @@ -42,7 +42,12 @@ workflow mnncorrect { ) NORMALIZE_TRANSFORM( SC__FILE_CONCATENATOR.out ) HVG_SELECTION( NORMALIZE_TRANSFORM.out ) - DIM_REDUCTION_PCA( HVG_SELECTION.out.scaled ) + if(params.sc.scanpy.containsKey("regress_out")) { + preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) + } else { + preprocessed_data = HVG_SELECTION.out.scaled + } + DIM_REDUCTION_PCA( preprocessed_data ) NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) @@ -55,7 +60,7 @@ workflow mnncorrect { BEC_MNNCORRECT( NORMALIZE_TRANSFORM.out, - HVG_SELECTION.out.scaled, + preprocessed_data, clusterIdentificationPreBatchEffectCorrection.marker_genes ) diff --git a/workflows/single_sample.nf b/workflows/single_sample.nf index 6ea600cd..7052d5b3 100644 --- a/workflows/single_sample.nf +++ 
b/workflows/single_sample.nf @@ -8,6 +8,7 @@ include '../src/utils/processes/utils.nf' params(params) include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params) include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params) include HVG_SELECTION from '../src/scanpy/workflows/hvg_selection.nf' params(params) +include SC__SCANPY__REGRESS_OUT from '../src/scanpy/processes/regress_out.nf' params(params) include NEIGHBORHOOD_GRAPH from '../src/scanpy/workflows/neighborhood_graph.nf' params(params) include DIM_REDUCTION_PCA from '../src/scanpy/workflows/dim_reduction_pca.nf' params(params) include '../src/scanpy/workflows/dim_reduction.nf' params(params) @@ -38,7 +39,12 @@ workflow single_sample_base { QC_FILTER( data ) NORMALIZE_TRANSFORM( QC_FILTER.out.filtered ) HVG_SELECTION( NORMALIZE_TRANSFORM.out ) - DIM_REDUCTION_PCA( HVG_SELECTION.out.scaled ) + if(params.sc.scanpy.containsKey("regress_out")) { + preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) + } else { + preprocessed_data = HVG_SELECTION.out.scaled + } + DIM_REDUCTION_PCA( preprocessed_data ) NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) CLUSTER_IDENTIFICATION( diff --git a/workflows/single_sample_star.nf b/workflows/single_sample_star.nf index e94e68d2..9b25074d 100644 --- a/workflows/single_sample_star.nf +++ b/workflows/single_sample_star.nf @@ -9,6 +9,7 @@ include star as STAR from '../workflows/star.nf' params(params) include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params) include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params) include HVG_SELECTION from '../src/scanpy/workflows/hvg_selection.nf' params(params) +include SC__SCANPY__REGRESS_OUT from '../src/scanpy/processes/regress_out.nf' params(params) include NEIGHBORHOOD_GRAPH from '../src/scanpy/workflows/neighborhood_graph.nf' params(params) include DIM_REDUCTION_PCA 
from '../src/scanpy/workflows/dim_reduction_pca.nf' params(params) include '../src/scanpy/workflows/dim_reduction.nf' params(params) @@ -35,7 +36,12 @@ workflow single_sample_star { QC_FILTER( data ) NORMALIZE_TRANSFORM( QC_FILTER.out.filtered ) HVG_SELECTION( NORMALIZE_TRANSFORM.out ) - DIM_REDUCTION_PCA( HVG_SELECTION.out.scaled ) + if(params.sc.scanpy.containsKey("regress_out")) { + preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) + } else { + preprocessed_data = HVG_SELECTION.out.scaled + } + DIM_REDUCTION_PCA( preprocessed_data ) NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) CLUSTER_IDENTIFICATION( From b3ac939b19616d4c5b7aca7171f43c892b6feee2 Mon Sep 17 00:00:00 2001 From: dweemx Date: Fri, 21 Feb 2020 23:51:06 +0100 Subject: [PATCH 06/32] Add profile to regress out vars with scanpy in nextflow.config --- nextflow.config | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/nextflow.config b/nextflow.config index 54ca98c7..26f58505 100644 --- a/nextflow.config +++ b/nextflow.config @@ -3,7 +3,7 @@ manifest { name = 'vib-singlecell-nf/vsn-pipelines' description = 'A repository of pipelines for single-cell data in Nextflow DSL2' homePage = 'https://github.com/vib-singlecell-nf/vsn-pipelines' - version = '0.12.1' + version = '0.13.0' mainScript = 'main.nf' defaultBranch = 'master' nextflowVersion = '!19.12.0-edge' // with ! prefix, stop execution if current version does not match required version. 
@@ -119,6 +119,14 @@ profiles { includeConfig 'src/dropletutils/dropletutils.config' } + // scanpy profiles + + scanpy_regress_out { + includeConfig 'src/scanpy/conf/regress_out.config' + } + + // cellranger profiles + cellranger { includeConfig 'src/cellranger/cellranger.config' } From bc289fb068a4996c5c7b2e398572aed956dc0149 Mon Sep 17 00:00:00 2001 From: dweemx Date: Sat, 22 Feb 2020 01:30:09 +0100 Subject: [PATCH 07/32] Rename params.data.tenx.cellranger_outs_dir_path to params.data.tenx.cellranger_outs_dir_paths Update docs --- conf/test__bbknn.config | 2 +- conf/test__harmony.config | 2 +- conf/test__mnncorrect.config | 2 +- conf/test__single_sample.config | 2 +- conf/test__single_sample_scenic.config | 2 +- ...est__single_sample_scenic_multiruns.config | 2 +- docs/development.rst | 35 +++---------------- docs/pipelines.rst | 10 +++--- src/channels/conf/tenx.config | 7 ---- src/channels/conf/tenx_cellranger_mex.config | 7 ++++ src/channels/conf/tsv.config | 2 +- src/utils/main.test.nf | 8 ++--- 12 files changed, 28 insertions(+), 53 deletions(-) delete mode 100644 src/channels/conf/tenx.config create mode 100644 src/channels/conf/tenx_cellranger_mex.config diff --git a/conf/test__bbknn.config b/conf/test__bbknn.config index c18bae14..bdacc6d1 100644 --- a/conf/test__bbknn.config +++ b/conf/test__bbknn.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_path = "testdata/*/outs/" + cellranger_outs_dir_paths = "testdata/*/outs/" } } sc { diff --git a/conf/test__harmony.config b/conf/test__harmony.config index fa1fc18c..d6920dd7 100644 --- a/conf/test__harmony.config +++ b/conf/test__harmony.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_path = "testdata/*/outs/" + cellranger_outs_dir_paths = "testdata/*/outs/" } } sc { diff --git a/conf/test__mnncorrect.config b/conf/test__mnncorrect.config index c18bae14..bdacc6d1 100644 --- a/conf/test__mnncorrect.config +++ b/conf/test__mnncorrect.config @@ -5,7 +5,7 @@ params { }
data { tenx { - cellranger_outs_dir_path = "testdata/*/outs/" + cellranger_outs_dir_paths = "testdata/*/outs/" } } sc { diff --git a/conf/test__single_sample.config b/conf/test__single_sample.config index 9f9122ab..52524004 100644 --- a/conf/test__single_sample.config +++ b/conf/test__single_sample.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_path = 'sample_data/outs' + cellranger_outs_dir_paths = 'sample_data/outs' } } sc { diff --git a/conf/test__single_sample_scenic.config b/conf/test__single_sample_scenic.config index 0baa738f..f8600b05 100644 --- a/conf/test__single_sample_scenic.config +++ b/conf/test__single_sample_scenic.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_path = 'sample_data/outs' + cellranger_outs_dir_paths = 'sample_data/outs' } } sc { diff --git a/conf/test__single_sample_scenic_multiruns.config b/conf/test__single_sample_scenic_multiruns.config index d2a9410a..a3821e12 100644 --- a/conf/test__single_sample_scenic_multiruns.config +++ b/conf/test__single_sample_scenic_multiruns.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_path = 'sample_data/outs' + cellranger_outs_dir_paths = 'sample_data/outs' } } sc { diff --git a/docs/development.rst b/docs/development.rst index 31ad648a..e4215865 100644 --- a/docs/development.rst +++ b/docs/development.rst @@ -329,7 +329,7 @@ This step is not required. However it this step is skipped, the code would still include SC__SCANPY__REPORT_TO_HTML from '../src/scanpy/processes/reports.nf' params(params + params.global) - workflow harmony_base { + workflow harmony { take: data @@ -380,8 +380,8 @@ This step is not required. 
However it this step is skipped, the code would still ).combine( BEC_HARMONY.out.harmony_report, by: 0 - ).map { - tuple( it[0], it.drop(1) ) + ).map { + tuple( it[0], it.drop(1) ) } // reporting: def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) ) @@ -398,31 +398,6 @@ This step is not required. However it this step is skipped, the code would still } - workflow harmony_standalone { - - main: - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_path ).view() - harmony_base( data ) - - emit: - filteredloom = harmony_base.out.filteredloom - scopeloom = harmony_base.out.scopeloom - - } - - workflow harmony { - - take: - data - - main: - harmony_base( data ) - - emit: - filteredloom = harmony_base.out.filteredloom - scopeloom = harmony_base.out.scopeloom - - } 11. Add a new Nextflow profile in ``nextflow.config`` of the ``vsn-pipelines`` repository @@ -430,8 +405,8 @@ This step is not required. However it this step is skipped, the code would still workflow harmony { - include harmony_standalone as HARMONY from './workflows/harmony' params(params) - HARMONY() + include harmony as HARMONY from './workflows/harmony' params(params) + getDataChannel | HARMONY } diff --git a/docs/pipelines.rst b/docs/pipelines.rst index fbd958a0..82da7591 100644 --- a/docs/pipelines.rst +++ b/docs/pipelines.rst @@ -29,7 +29,7 @@ The tool-specific parameters, as well as Docker/Singularity profiles, are includ In particular, the following parameters are frequently modified in practice: * ``params.global.project_name``: a project name which will be included in some of the output file names. - * ``params.data.tenx.cellranger_outs_dir_path``, which should point to the ``outs/`` folder generated by CellRanger (if using 10x data). See ``Information on using 10x Genomics datasets`` for additional info. + * ``params.data.tenx.cellranger_outs_dir_paths``, which should point to the ``outs/`` folder generated by CellRanger (if using 10x data). 
See ``Information on using 10x Genomics datasets`` for additional info. * Filtering parameters (``params.sc.scanpy.filter``): filtering parameters, which will be applied to all samples, can be set here: min/max genes, mitochondrial read fraction, and min cells. See ``Multi-sample parameters`` for additional info on how to specify sample-specific parameters. * Louvain cluster resolution: ``params.sc.scanpy.clustering.resolution``. * For cell- and sample-level annotations, see ``here`` for additional info. @@ -207,15 +207,15 @@ Cell Ranger (10xGenomics) -profiles tenx -In the generated .config file, make sur the ``cellranger_outs_dir_path`` parameter is set with the paths to the Cell Ranger ``outs`` folders:: +In the generated .config file, make sure the ``cellranger_outs_dir_paths`` parameter is set with the paths to the Cell Ranger ``outs`` folders:: [...] tenx { - cellranger_outs_dir_path = "data/10x/1k_pbmc/1k_pbmc_*/outs/" + cellranger_outs_dir_paths = "data/10x/1k_pbmc/1k_pbmc_*/outs/" } [...] -- The ``cellranger_outs_dir_path`` parameter accepts glob patterns and also comma separated paths. +- The ``cellranger_outs_dir_paths`` parameter accepts glob patterns and also comma separated paths. Information on using 10x Genomics datasets @@ -238,7 +238,7 @@ Let's say the file structure of your data looks like this, Setting the input directory appropriately will collect all the samples listed in the ``filtered_[feature|gene]_bc_matrix`` directories listed above. For example, in ``params.data.tenx``, setting:: - cellranger_outs_dir_path = "/home/data/cellranger/Sample*/outs/" + cellranger_outs_dir_paths = "/home/data/cellranger/Sample*/outs/" will recursively find all 10x samples in that directory.
diff --git a/src/channels/conf/tenx.config b/src/channels/conf/tenx.config deleted file mode 100644 index c78def01..00000000 --- a/src/channels/conf/tenx.config +++ /dev/null @@ -1,7 +0,0 @@ -params { - data { - tenx { - cellranger_outs_dir_path = 'data/10x/1k_pbmc/1k_pbmc_*/outs/' - } - } -} diff --git a/src/channels/conf/tenx_cellranger_mex.config b/src/channels/conf/tenx_cellranger_mex.config new file mode 100644 index 00000000..8a37d05f --- /dev/null +++ b/src/channels/conf/tenx_cellranger_mex.config @@ -0,0 +1,7 @@ +params { + data { + tenx { + cellranger_outs_dir_paths = 'data/10x/1k_pbmc/1k_pbmc_*/outs/' + } + } +} diff --git a/src/channels/conf/tsv.config b/src/channels/conf/tsv.config index 406ae44c..e9ce9cb9 100644 --- a/src/channels/conf/tsv.config +++ b/src/channels/conf/tsv.config @@ -1,6 +1,6 @@ params { data { - h5ad { + tsv { file_paths = '' suffix = '.tsv' } diff --git a/src/utils/main.test.nf b/src/utils/main.test.nf index a4ab4271..52252cc3 100644 --- a/src/utils/main.test.nf +++ b/src/utils/main.test.nf @@ -41,17 +41,17 @@ workflow { switch(params.test) { case "SC__FILE_CONVERTER": include SC__FILE_CONVERTER from './processes/utils' params(params) - test_SC__FILE_CONVERTER( getTenXChannel( params.data.tenx.cellranger_outs_dir_path ) ) + test_SC__FILE_CONVERTER( getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) ) break; case "SC__FILE_CONCATENATOR": - test_SC__FILE_CONCATENATOR( getTenXChannel( params.data.tenx.cellranger_outs_dir_path ) ) + test_SC__FILE_CONCATENATOR( getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) ) break; case "FILTER_BY_CELL_METADATA": // Imports include FILTER_BY_CELL_METADATA from './workflows/filterByCellMetadata' params(params) // Run if(params.sc.cell_filter) { - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_path ) + data = getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) SC__FILE_CONVERTER( data ) FILTER_BY_CELL_METADATA( SC__FILE_CONVERTER.out ) } @@ -62,7 +62,7 @@ 
workflow { include SC__ANNOTATE_BY_CELL_METADATA from './processes/h5adAnnotate' params(params) // Run if(params.sc.cell_filter && params.sc.cell_annotate) { - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_path ) + data = getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) SC__FILE_CONVERTER( data ) FILTER_BY_CELL_METADATA( SC__FILE_CONVERTER.out ) SC__ANNOTATE_BY_CELL_METADATA( FILTER_BY_CELL_METADATA.out ) From 741d0783a4e4a0026160e07bb69513db20f9a2dc Mon Sep 17 00:00:00 2001 From: dweemx Date: Sat, 22 Feb 2020 01:47:06 +0100 Subject: [PATCH 08/32] Add data channel for cellranger h5 files Create new config for cellranger h5 files --- src/channels/conf/tenx_cellranger_h5.config | 7 +++++ src/channels/tenx.nf | 33 +++++++++++++++++++-- 2 files changed, 37 insertions(+), 3 deletions(-) create mode 100644 src/channels/conf/tenx_cellranger_h5.config diff --git a/src/channels/conf/tenx_cellranger_h5.config b/src/channels/conf/tenx_cellranger_h5.config new file mode 100644 index 00000000..0e2ce869 --- /dev/null +++ b/src/channels/conf/tenx_cellranger_h5.config @@ -0,0 +1,7 @@ +params { + data { + tenx { + cellranger_h5_file_paths = '' + } + } +} diff --git a/src/channels/tenx.nf b/src/channels/tenx.nf index 4a5d5ae0..bbfc6c6c 100644 --- a/src/channels/tenx.nf +++ b/src/channels/tenx.nf @@ -1,12 +1,39 @@ nextflow.preview.dsl=2 -def extractSample(path) { +def extractSampleFromH5(path) { + // Allow to detect data generated by CellRanger prior and post to version 3. 
+ (full, parentDir, id, filename) = (path =~ /(.+)\/(.+)\/outs\/(.+)\.h5/)[0] + return id +} + +workflow getH5Channel { + + take: + glob + + main: + // Check whether multiple globs are provided + if(glob.contains(',')) { + glob = Arrays.asList(glob.split(',')); + } + channel = Channel + .fromPath(glob, type: 'file', checkIfExists: true) + .map { + filePath -> tuple(extractSampleFromH5( "${filePath}" ), file("${filePath}")) + } + + emit: + channel + +} + +def extractSampleFromMEX(path) { // Allow to detect data generated by CellRanger prior and post to version 3. (full, parentDir, id) = (path =~ /(.+)\/(.+)\/outs/)[0] return id } -workflow getChannel { +workflow getMEXChannel { take: glob @@ -19,7 +46,7 @@ workflow getChannel { channel = Channel .fromPath(glob, type: 'dir', checkIfExists: true) .map { - filePath -> tuple(extractSample( "${filePath}" ), file("${filePath}")) + filePath -> tuple(extractSampleFromMEX( "${filePath}" ), file("${filePath}")) } emit: From 3df011a9705f23993f3d07e071e7af9821b89df1 Mon Sep 17 00:00:00 2001 From: dweemx Date: Sat, 22 Feb 2020 01:56:14 +0100 Subject: [PATCH 09/32] Implement #110 Create main channels handling all different data input types Update SC__FILE_CONVERTER with the input data type and output data type Hence update and clean main.nf Update RtD --- docs/pipelines.rst | 11 ----- main.nf | 94 ++++++++++++++---------------------- src/channels/channels.nf | 64 ++++++++++++++++++++++++ src/utils/processes/utils.nf | 22 ++++++--- 4 files changed, 115 insertions(+), 76 deletions(-) create mode 100644 src/channels/channels.nf diff --git a/docs/pipelines.rst b/docs/pipelines.rst index 82da7591..c418d12f 100644 --- a/docs/pipelines.rst +++ b/docs/pipelines.rst @@ -262,10 +262,6 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit - The ``suffix`` parameter is used to infer the sample name from the file paths (it is removed from the input file path to derive a sample name). 
- The ``file_paths`` accepts glob patterns and also comma separated paths. -Make sure that ``sc.file_converter.iff`` is set to ``h5ad``. - -Currently H5AD input is only implemented in the ``h5ad_single_sample`` entry point. - TSV --- :: @@ -285,9 +281,6 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit - The ``suffix`` parameter is used to infer the sample name from the file paths (it is removed from the input file path to derive a sample name). - The ``file_paths`` accepts glob patterns and also comma separated paths. -Make sure that ``sc.file_converter.iff`` is set to ``tsv``. - -Currently H5AD input is only implemented in the ``tsv_single_sample`` entry point. CSV --- @@ -307,7 +300,3 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit - The ``suffix`` parameter is used to infer the sample name from the file paths (it is removed from the input file path to derive a sample name). - The ``file_paths`` accepts glob patterns and also comma separated paths. - -Make sure that ``sc.file_converter.iff`` is set to ``csv``. - -Currently H5AD input is only implemented in the ``csv_single_sample`` entry point. 
\ No newline at end of file diff --git a/main.nf b/main.nf index 6dbc2168..8e1a40ea 100644 --- a/main.nf +++ b/main.nf @@ -15,11 +15,13 @@ if(!params.global.containsKey('seed')) { } } +include './src/channels/channels' params(params) + // run multi-sample with bbknn, output a scope loom file workflow bbknn { - include bbknn_standalone as BBKNN from './workflows/bbknn' params(params) - BBKNN() + include bbknn as BBKNN from './workflows/bbknn' params(params) + getDataChannel | BBKNN } @@ -27,35 +29,41 @@ workflow bbknn { workflow mnncorrect { include mnncorrect as MNNCORRECT from './workflows/mnncorrect' params(params) - MNNCORRECT() + getDataChannel | MNNCORRECT } // run multi-sample with bbknn, output a scope loom file workflow harmony { - include harmony_standalone as HARMONY from './workflows/harmony' params(params) - HARMONY() + include harmony as HARMONY from './workflows/harmony' params(params) + getDataChannel | HARMONY } // run multi-sample with bbknn, then scenic from the filtered output: workflow bbknn_scenic { - include bbknn_standalone as BBKNN from './workflows/bbknn' params(params) - include SCENIC_append from './src/scenic/main.nf' params(params) - BBKNN() - SCENIC_append( BBKNN.out.filteredloom, BBKNN.out.scopeloom ) + include bbknn as BBKNN from './workflows/bbknn' params(params) + include scenic_append as SCENIC_APPEND from './src/scenic/main.nf' params(params) + getDataChannel | BBKNN + SCENIC_APPEND( + BBKNN.out.filteredloom, + BBKNN.out.scopeloom + ) } // run multi-sample with harmony, then scenic from the filtered output: workflow harmony_scenic { - include harmony_standalone as HARMONY from './workflows/harmony' params(params) - include SCENIC_append from './src/scenic/main.nf' params(params) - HARMONY() - SCENIC_append( HARMONY.out.filteredloom, HARMONY.out.scopeloom ) + include harmony as HARMONY from './workflows/harmony' params(params) + include scenic_append as SCENIC_APPEND from './src/scenic/main.nf' params(params) + getDataChannel | 
HARMONY + SCENIC_APPEND( + HARMONY.out.filteredloom, + HARMONY.out.scopeloom + ) } @@ -63,8 +71,8 @@ workflow harmony_scenic { // run single_sample, output a scope loom file workflow single_sample { - include single_sample_standalone as SINGLE_SAMPLE from './workflows/single_sample' params(params) - SINGLE_SAMPLE() + include single_sample as SINGLE_SAMPLE from './workflows/single_sample' params(params) + getDataChannel | SINGLE_SAMPLE } @@ -72,10 +80,13 @@ workflow single_sample { // run single_sample, then scenic from the filtered output: workflow single_sample_scenic { - include SCENIC_append from './src/scenic/main.nf' params(params) + include scenic_append as SCENIC_APPEND from './src/scenic/main.nf' params(params) include single_sample_standalone as SINGLE_SAMPLE from './workflows/single_sample' params(params) - SINGLE_SAMPLE() - SCENIC_append( SINGLE_SAMPLE.out.filteredloom, SINGLE_SAMPLE.out.scopeloom ) + getDataChannel | SINGLE_SAMPLE + SCENIC_APPEND( + SINGLE_SAMPLE.out.filteredloom, + SINGLE_SAMPLE.out.scopeloom + ) } @@ -83,8 +94,8 @@ workflow single_sample_scenic { // run scenic directly from an existing loom file: workflow scenic { - include SCENIC as SCENIC_WF from './src/scenic/main.nf' params(params) - SCENIC_WF( Channel.of( tuple("foobar", file(params.sc.scenic.filteredLoom))) ) + include scenic as SCENIC from './src/scenic/main.nf' params(params) + SCENIC( Channel.of( tuple("foobar", file(params.sc.scenic.filteredLoom))) ) } @@ -103,8 +114,10 @@ workflow cellranger { workflow cellranger_metadata { - include CELLRANGER_COUNT_WITH_METADATA from './src/cellranger/workflows/cellRangerCountWithMetadata' params(params) - CELLRANGER_COUNT_WITH_METADATA(file(params.sc.cellranger.count.metadata)) + include CELLRANGER_COUNT_WITH_METADATA from './src/cellranger/workflows/cellRangerCountWithMetadata' params(params) + CELLRANGER_COUNT_WITH_METADATA( + file(params.sc.cellranger.count.metadata) + ) } @@ -117,39 +130,6 @@ workflow single_sample_cellranger { } 
-workflow h5ad_single_sample { - - include getChannel as getH5ADChannel from './src/channels/file' params(params) - include single_sample as SINGLE_SAMPLE from './workflows/single_sample' params(params) - data = getH5ADChannel( - params.data.h5ad.file_paths, - params.data.h5ad.suffix - ).view() | SINGLE_SAMPLE - -} - -workflow tsv_single_sample { - - include getChannel as getTSVChannel from './src/channels/file' params(params) - include single_sample as SINGLE_SAMPLE from './workflows/single_sample' params(params) - data = getTSVChannel( - params.data.tsv.file_paths, - params.data.tsv.suffix - ).view() | SINGLE_SAMPLE - -} - -workflow csv_single_sample { - - include getChannel as getCSVChannel from './src/channels/file' params(params) - include single_sample as SINGLE_SAMPLE from './workflows/single_sample' params(params) - data = getCSVChannel( - params.data.csv.file_paths, - params.data.csv.suffix - ).view() | SINGLE_SAMPLE - -} - workflow star { @@ -199,9 +179,9 @@ workflow sra_cellranger_bbknn { workflow sra_cellranger_bbknn_scenic { - include SCENIC_append from './src/scenic/main.nf' params(params) + include scenic_append as SCENIC_APPEND from './src/scenic/main.nf' params(params) sra_cellranger_bbknn() - SCENIC_append( + SCENIC_APPEND( sra_cellranger_bbknn.out.filteredLoom, sra_cellranger_bbknn.out.scopeLoom ) diff --git a/src/channels/channels.nf b/src/channels/channels.nf new file mode 100644 index 00000000..d85cbcc2 --- /dev/null +++ b/src/channels/channels.nf @@ -0,0 +1,64 @@ +nextflow.preview.dsl=2 + +include getMEXChannel as getTenXCellRangerMEXChannel from './tenx' params(params) +include getH5Channel as getTenXCellRangerH5Channel from './tenx' params(params) +include getChannel as getFileChannel from './file' params(params) + +workflow getDataChannel { + + main: + data = Channel.empty() + if(params.data.containsKey("tenx") && params.data.tenx.containsKey("cellranger_outs_dir_paths")) { + data = data.concat( + getTenXCellRangerMEXChannel( + 
params.data.tenx.cellranger_outs_dir_paths + ).map { + it -> tuple(it[0], it[1], "10x_cellranger_mex", "h5ad") + } + ).view() + } + if(params.data.containsKey("tenx") && params.data.tenx.containsKey("cellranger_h5_file_paths")) { + data = data.concat( + getTenXCellRangerH5Channel( + params.data.tenx.cellranger_h5_file_paths + ).map { + it -> tuple(it[0], it[1], "10x_cellranger_h5", "h5ad") + } + ).view() + } + if(params.data.containsKey("h5ad")) { + data = data.concat( + getFileChannel( + params.data.h5ad.file_paths, + params.data.h5ad.suffix + ).map { + it -> tuple(it[0], it[1], "h5ad", "h5ad") + } + ).view() + } + if(params.data.containsKey("tsv")) { + data = data.concat( + getFileChannel( + params.data.tsv.file_paths, + params.data.tsv.suffix + ).map { + it -> tuple(it[0], it[1], "tsv", "h5ad") + } + ).view() + } + if(params.data.containsKey("csv")) { + data = data.concat( + getFileChannel( + params.data.csv.file_paths, + params.data.csv.suffix + ).map { + it -> tuple(it[0], it[1], "csv", "h5ad") + } + ).view() + } + data.ifEmpty { exit 1, "Pipeline cannot run: no data provided."
} + + emit: + data + +} \ No newline at end of file diff --git a/src/utils/processes/utils.nf b/src/utils/processes/utils.nf index 19bef314..87aad3cd 100644 --- a/src/utils/processes/utils.nf +++ b/src/utils/processes/utils.nf @@ -40,16 +40,22 @@ process SC__FILE_CONVERTER { publishDir "${params.global.outdir}/data/intermediate", mode: 'symlink', overwrite: true input: - tuple val(sampleId), path(f) + tuple \ + val(sampleId), \ + path(f), \ + val(inputDataType), \ + val(outputDataType) output: - tuple val(sampleId), path("${sampleId}.SC__FILE_CONVERTER.${processParams.off}") + tuple \ + val(sampleId), \ + path("${sampleId}.SC__FILE_CONVERTER.${outputDataType}") script: def sampleParams = params.parseConfig(sampleId, params.global, params.sc.file_converter) processParams = sampleParams.local - switch(processParams.iff) { + switch(inputDataType) { case "10x_cellranger_mex": // Reference: https://kb.10xgenomics.com/hc/en-us/articles/115000794686-How-is-the-MEX-format-used-for-the-gene-barcode-matrices- // Check if output was generated with CellRanger v2 or v3 @@ -78,11 +84,11 @@ process SC__FILE_CONVERTER { break; default: - throw new Exception("The given input format ${processParams.iff} is not recognized.") + throw new Exception("The given input format ${inputDataType} is not recognized.") break; } - if(processParams.iff == "h5ad") + if(inputDataType == "h5ad") """ cp ${f} "${sampleId}.SC__FILE_CONVERTER.h5ad" """ @@ -91,10 +97,10 @@ process SC__FILE_CONVERTER { ${binDir}sc_file_converter.py \ --sample-id "${sampleId}" \ ${(processParams.containsKey('tagCellWithSampleId')) ? 
'--tag-cell-with-sample-id' : ''} \ - --input-format $processParams.iff \ - --output-format $processParams.off \ + --input-format $inputDataType \ + --output-format $outputDataType \ ${f} \ - "${sampleId}.SC__FILE_CONVERTER.${processParams.off}" + "${sampleId}.SC__FILE_CONVERTER.${outputDataType}" """ } From 1f9dcf06c4a7fc0dbc7b4227f232dd057d31ad15 Mon Sep 17 00:00:00 2001 From: dweemx Date: Sat, 22 Feb 2020 01:57:44 +0100 Subject: [PATCH 10/32] Clean configs and main workflows --- nextflow.config | 26 +++++--- src/utils/conf/base.config | 2 - workflows/bbknn.nf | 31 +-------- workflows/harmony.nf | 31 +-------- workflows/mnncorrect.nf | 133 +++++++++++++++++++------------------ workflows/single_sample.nf | 31 +-------- 6 files changed, 88 insertions(+), 166 deletions(-) diff --git a/nextflow.config b/nextflow.config index 26f58505..6dc4a16a 100644 --- a/nextflow.config +++ b/nextflow.config @@ -60,6 +60,7 @@ profiles { } // workflow-specific profiles: + star { includeConfig 'src/star/star.config' } @@ -119,13 +120,13 @@ profiles { includeConfig 'src/dropletutils/dropletutils.config' } - // scanpy profiles + // scanpy profiles: scanpy_regress_out { includeConfig 'src/scanpy/conf/regress_out.config' } - // cellranger profiles + // cellranger profiles: cellranger { includeConfig 'src/cellranger/cellranger.config' @@ -140,9 +141,13 @@ profiles { includeConfig 'src/cellranger/conf/count_metadata.config' } - // data profiles + // data profiles: + tenx { - includeConfig 'src/channels/conf/tenx.config' + includeConfig 'src/channels/conf/tenx_cellranger_mex.config' + } + tenx_h5 { + includeConfig 'src/channels/conf/tenx_cellranger_h5.config' } h5ad { includeConfig 'src/channels/conf/h5ad.config' @@ -159,7 +164,8 @@ profiles { includeConfig 'src/sratoolkit/sratoolkit.config' } - // metadata profiles + // metadata profiles: + dm6 { includeConfig 'src/scenic/conf/min/tfs/fly-v0.0.1.config' includeConfig 'conf/genomes/dm6.config' @@ -170,12 +176,14 @@ profiles { includeConfig 
'conf/genomes/hg38.config' } - // feature profiles + // feature profiles: + pcacv { includeConfig 'src/pcacv/pcacv.config' } - // scenic profiles + // scenic profiles: + scenic_use_cistarget_motifs { includeConfig "src/scenic/conf/min/dbs/cistarget-motifs-${params.global.species}-${params.global.genome.assembly}-v0.0.1.config" } @@ -188,7 +196,8 @@ profiles { includeConfig 'src/scenic/conf/test.config' } - // utility profiles + // utility profiles: + utils_sample_annotate { includeConfig 'src/utils/conf/sample_annotate.config' } @@ -200,6 +209,7 @@ profiles { } // test profiles: + test__single_sample { includeConfig 'conf/test__single_sample.config' } diff --git a/src/utils/conf/base.config b/src/utils/conf/base.config index 47cc3e0f..2b158bcc 100644 --- a/src/utils/conf/base.config +++ b/src/utils/conf/base.config @@ -7,8 +7,6 @@ params { } sc { file_converter { - iff = '10x_cellranger_mex' - off = 'h5ad' tagCellWithSampleId = true useFilteredMatrix = true } diff --git a/workflows/bbknn.nf b/workflows/bbknn.nf index b703f6a7..43e03bdc 100644 --- a/workflows/bbknn.nf +++ b/workflows/bbknn.nf @@ -20,16 +20,13 @@ include SC__H5AD_TO_FILTERED_LOOM from '../src/utils/processes/h5adToLoom.nf' pa include FILE_CONVERTER from '../src/utils/workflows/fileConverter.nf' params(params) include BEC_BBKNN from '../src/scanpy/workflows/bec_bbknn.nf' params(params) -// data channel to start from 10x data: -include getChannel as getTenXChannel from '../src/channels/tenx.nf' params(params) - // reporting: include UTILS__GENERATE_WORKFLOW_CONFIG_REPORT from '../src/utils/processes/reports.nf' params(params) include SC__SCANPY__MERGE_REPORTS from '../src/scanpy/processes/reports.nf' params(params + params.global) include SC__SCANPY__REPORT_TO_HTML from '../src/scanpy/processes/reports.nf' params(params + params.global) -workflow bbknn_base { +workflow bbknn { take: data @@ -111,29 +108,3 @@ workflow bbknn_base { scopeloom } - -workflow bbknn_standalone { - - main: - data = 
getTenXChannel( params.data.tenx.cellranger_outs_dir_path ).view() - bbknn_base( data ) - - emit: - filteredloom = bbknn_base.out.filteredloom - scopeloom = bbknn_base.out.scopeloom - -} - -workflow bbknn { - - take: - data - - main: - bbknn_base( data ) - - emit: - filteredloom = bbknn_base.out.filteredloom - scopeloom = bbknn_base.out.scopeloom - -} diff --git a/workflows/harmony.nf b/workflows/harmony.nf index f30a037a..61895aa3 100644 --- a/workflows/harmony.nf +++ b/workflows/harmony.nf @@ -21,16 +21,13 @@ include BEC_HARMONY from '../src/harmony/workflows/bec_harmony.nf' params(params include SC__H5AD_TO_FILTERED_LOOM from '../src/utils/processes/h5adToLoom.nf' params(params + params.global) include FILE_CONVERTER from '../src/utils/workflows/fileConverter.nf' params(params) -// data channel to start from 10x data: -include getChannel as getTenXChannel from '../src/channels/tenx.nf' params(params) - // reporting: include UTILS__GENERATE_WORKFLOW_CONFIG_REPORT from '../src/utils/processes/reports.nf' params(params) include SC__SCANPY__MERGE_REPORTS from '../src/scanpy/processes/reports.nf' params(params + params.global) include SC__SCANPY__REPORT_TO_HTML from '../src/scanpy/processes/reports.nf' params(params + params.global) -workflow harmony_base { +workflow harmony { take: data @@ -111,29 +108,3 @@ workflow harmony_base { scopeloom } - -workflow harmony_standalone { - - main: - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_path ).view() - harmony_base( data ) - - emit: - filteredloom = harmony_base.out.filteredloom - scopeloom = harmony_base.out.scopeloom - -} - -workflow harmony { - - take: - data - - main: - harmony_base( data ) - - emit: - filteredloom = harmony_base.out.filteredloom - scopeloom = harmony_base.out.scopeloom - -} diff --git a/workflows/mnncorrect.nf b/workflows/mnncorrect.nf index 680e64d5..f79bbc40 100644 --- a/workflows/mnncorrect.nf +++ b/workflows/mnncorrect.nf @@ -20,9 +20,6 @@ include BEC_MNNCORRECT from 
'../src/scanpy/workflows/bec_mnncorrect.nf' params(p include SC__H5AD_TO_FILTERED_LOOM from '../src/utils/processes/h5adToLoom.nf' params(params) include FILE_CONVERTER from '../src/utils/workflows/fileConverter.nf' params(params) -// data channel to start from 10x data: -include getChannel as getTenXChannel from '../src/channels/tenx.nf' params(params) - // reporting: include UTILS__GENERATE_WORKFLOW_CONFIG_REPORT from '../src/utils/processes/reports.nf' params(params) include SC__SCANPY__MERGE_REPORTS from '../src/scanpy/processes/reports.nf' params(params + params.global) @@ -30,74 +27,78 @@ include SC__SCANPY__REPORT_TO_HTML from '../src/scanpy/processes/reports.nf' par workflow mnncorrect { - // Run the pipeline - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_path ).view() - QC_FILTER( data ) // Remove concat - SC__FILE_CONCATENATOR( - QC_FILTER.out.filtered.map { - it -> it[1] - }.toSortedList( - { a, b -> getBaseName(a) <=> getBaseName(b) } + take: + data + + main: + + // Run the pipeline + QC_FILTER( data ) // Remove concat + SC__FILE_CONCATENATOR( + QC_FILTER.out.filtered.map { + it -> it[1] + }.toSortedList( + { a, b -> getBaseName(a) <=> getBaseName(b) } + ) ) - ) - NORMALIZE_TRANSFORM( SC__FILE_CONCATENATOR.out ) - HVG_SELECTION( NORMALIZE_TRANSFORM.out ) - if(params.sc.scanpy.containsKey("regress_out")) { - preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) - } else { - preprocessed_data = HVG_SELECTION.out.scaled - } - DIM_REDUCTION_PCA( preprocessed_data ) - NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) - DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) + NORMALIZE_TRANSFORM( SC__FILE_CONCATENATOR.out ) + HVG_SELECTION( NORMALIZE_TRANSFORM.out ) + if(params.sc.scanpy.containsKey("regress_out")) { + preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled ) + } else { + preprocessed_data = HVG_SELECTION.out.scaled + } + DIM_REDUCTION_PCA( preprocessed_data ) + NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out ) + 
DIM_REDUCTION_TSNE_UMAP( NEIGHBORHOOD_GRAPH.out ) - // Perform the clustering step w/o batch effect correction (for comparison matter) - clusterIdentificationPreBatchEffectCorrection = CLUSTER_IDENTIFICATION( - NORMALIZE_TRANSFORM.out, - DIM_REDUCTION_TSNE_UMAP.out.dimred_tsne_umap, - "Pre Batch Effect Correction" - ) + // Perform the clustering step w/o batch effect correction (for comparison matter) + clusterIdentificationPreBatchEffectCorrection = CLUSTER_IDENTIFICATION( + NORMALIZE_TRANSFORM.out, + DIM_REDUCTION_TSNE_UMAP.out.dimred_tsne_umap, + "Pre Batch Effect Correction" + ) - BEC_MNNCORRECT( - NORMALIZE_TRANSFORM.out, - preprocessed_data, - clusterIdentificationPreBatchEffectCorrection.marker_genes - ) + BEC_MNNCORRECT( + NORMALIZE_TRANSFORM.out, + preprocessed_data, + clusterIdentificationPreBatchEffectCorrection.marker_genes + ) - // Conversion - // Convert h5ad to X (here we choose: loom format) - filteredloom = SC__H5AD_TO_FILTERED_LOOM( SC__FILE_CONCATENATOR.out ) - scopeloom = FILE_CONVERTER( - BEC_MNNCORRECT.out.data.groupTuple(), - 'loom', - SC__FILE_CONCATENATOR.out, - ) + // Conversion + // Convert h5ad to X (here we choose: loom format) + filteredloom = SC__H5AD_TO_FILTERED_LOOM( SC__FILE_CONCATENATOR.out ) + scopeloom = FILE_CONVERTER( + BEC_MNNCORRECT.out.data.groupTuple(), + 'loom', + SC__FILE_CONCATENATOR.out, + ) - project = CLUSTER_IDENTIFICATION.out.marker_genes.map { it -> it[0] } - UTILS__GENERATE_WORKFLOW_CONFIG_REPORT( - file(workflow.projectDir + params.utils.workflow_configuration.report_ipynb) - ) - // Collect the reports: - ipynbs = project.combine( - UTILS__GENERATE_WORKFLOW_CONFIG_REPORT.out - ).join( - HVG_SELECTION.out.report - ).join( - BEC_MNNCORRECT.out.cluster_report - ).combine( - BEC_MNNCORRECT.out.mnncorrect_report, - by: 0 - ).map { - tuple( it[0], it.drop(1) ) - } - // Reporting: - def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) ) - SC__SCANPY__MERGE_REPORTS( - ipynbs, - 
"merged_report", - clusteringParams.isParameterExplorationModeOn() - ) - SC__SCANPY__REPORT_TO_HTML(SC__SCANPY__MERGE_REPORTS.out) + project = CLUSTER_IDENTIFICATION.out.marker_genes.map { it -> it[0] } + UTILS__GENERATE_WORKFLOW_CONFIG_REPORT( + file(workflow.projectDir + params.utils.workflow_configuration.report_ipynb) + ) + // Collect the reports: + ipynbs = project.combine( + UTILS__GENERATE_WORKFLOW_CONFIG_REPORT.out + ).join( + HVG_SELECTION.out.report + ).join( + BEC_MNNCORRECT.out.cluster_report + ).combine( + BEC_MNNCORRECT.out.mnncorrect_report, + by: 0 + ).map { + tuple( it[0], it.drop(1) ) + } + // Reporting: + def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) ) + SC__SCANPY__MERGE_REPORTS( + ipynbs, + "merged_report", + clusteringParams.isParameterExplorationModeOn() + ) + SC__SCANPY__REPORT_TO_HTML(SC__SCANPY__MERGE_REPORTS.out) emit: filteredloom diff --git a/workflows/single_sample.nf b/workflows/single_sample.nf index 7052d5b3..795adc2a 100644 --- a/workflows/single_sample.nf +++ b/workflows/single_sample.nf @@ -17,15 +17,12 @@ include CLUSTER_IDENTIFICATION from '../src/scanpy/workflows/cluster_identificat include SC__H5AD_TO_FILTERED_LOOM from '../src/utils/processes/h5adToLoom.nf' params(params) include FILE_CONVERTER from '../src/utils/workflows/fileConverter.nf' params(params) -// data channel to start from 10x data: -include getChannel as getTenXChannel from '../src/channels/tenx.nf' params(params) - // reporting: include UTILS__GENERATE_WORKFLOW_CONFIG_REPORT from '../src/utils/processes/reports.nf' params(params) include SC__SCANPY__MERGE_REPORTS from '../src/scanpy/processes/reports.nf' params(params) include SC__SCANPY__REPORT_TO_HTML from '../src/scanpy/processes/reports.nf' params(params) -workflow single_sample_base { +workflow single_sample { take: data @@ -91,29 +88,3 @@ workflow single_sample_base { scopeloom } - -workflow single_sample_standalone { - - main: - data = getTenXChannel( 
params.data.tenx.cellranger_outs_dir_path ).view() - single_sample_base( data ) - - emit: - filteredloom = single_sample_base.out.filteredloom - scopeloom = single_sample_base.out.scopeloom - -} - -workflow single_sample { - - take: - data - - main: - single_sample_base( data ) - - emit: - single_sample_base.out.filteredloom - single_sample_base.out.scopeloom - -} From 4185381b85a02be0d98ee54f657cd70a7bf1c156 Mon Sep 17 00:00:00 2001 From: dweemx Date: Sun, 23 Feb 2020 10:59:59 +0100 Subject: [PATCH 11/32] Rename tenx input data params Rename cellranger_h5_file_paths to cellranger_h5 Rename cellranger_outs_dir_paths to cellranger_mex --- conf/test__bbknn.config | 2 +- conf/test__harmony.config | 2 +- conf/test__mnncorrect.config | 2 +- conf/test__single_sample.config | 2 +- conf/test__single_sample_scenic.config | 2 +- conf/test__single_sample_scenic_multiruns.config | 2 +- src/channels/channels.nf | 8 ++++---- src/channels/conf/tenx_cellranger_h5.config | 2 +- src/channels/conf/tenx_cellranger_mex.config | 2 +- src/utils/main.test.nf | 8 ++++---- 10 files changed, 16 insertions(+), 16 deletions(-) diff --git a/conf/test__bbknn.config b/conf/test__bbknn.config index bdacc6d1..617129cb 100644 --- a/conf/test__bbknn.config +++ b/conf/test__bbknn.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_paths = "testdata/*/outs/" + cellranger_mex = "testdata/*/outs/" } } sc { diff --git a/conf/test__harmony.config b/conf/test__harmony.config index d6920dd7..a01cfefd 100644 --- a/conf/test__harmony.config +++ b/conf/test__harmony.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_paths = "testdata/*/outs/" + cellranger_mex = "testdata/*/outs/" } } sc { diff --git a/conf/test__mnncorrect.config b/conf/test__mnncorrect.config index bdacc6d1..617129cb 100644 --- a/conf/test__mnncorrect.config +++ b/conf/test__mnncorrect.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_paths = "testdata/*/outs/" + cellranger_mex = 
"testdata/*/outs/" } } sc { diff --git a/conf/test__single_sample.config b/conf/test__single_sample.config index 52524004..75df7745 100644 --- a/conf/test__single_sample.config +++ b/conf/test__single_sample.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_paths = 'sample_data/outs' + cellranger_mex = 'sample_data/outs' } } sc { diff --git a/conf/test__single_sample_scenic.config b/conf/test__single_sample_scenic.config index f8600b05..9be4b3c5 100644 --- a/conf/test__single_sample_scenic.config +++ b/conf/test__single_sample_scenic.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_paths = 'sample_data/outs' + cellranger_mex = 'sample_data/outs' } } sc { diff --git a/conf/test__single_sample_scenic_multiruns.config b/conf/test__single_sample_scenic_multiruns.config index a3821e12..c9484e61 100644 --- a/conf/test__single_sample_scenic_multiruns.config +++ b/conf/test__single_sample_scenic_multiruns.config @@ -5,7 +5,7 @@ params { } data { tenx { - cellranger_outs_dir_paths = 'sample_data/outs' + cellranger_mex = 'sample_data/outs' } } sc { diff --git a/src/channels/channels.nf b/src/channels/channels.nf index d85cbcc2..60d04376 100644 --- a/src/channels/channels.nf +++ b/src/channels/channels.nf @@ -8,19 +8,19 @@ workflow getDataChannel { main: data = Channel.empty() - if(params.data.containsKey("tenx") && params.data.tenx.containsKey("cellranger_outs_dir_paths")) { + if(params.data.containsKey("tenx") && params.data.tenx.containsKey("cellranger_mex")) { data = data.concat( getTenXCellRangerMEXChannel( - params.data.tenx.cellranger_outs_dir_paths + params.data.tenx.cellranger_mex ).map { it -> tuple(it[0], it[1], "10x_cellranger_mex", "h5ad") } ).view() } - if(params.data.containsKey("tenx") && params.data.tenx.containsKey("cellranger_h5_file_paths")) { + if(params.data.containsKey("tenx") && params.data.tenx.containsKey("cellranger_h5")) { data = data.concat( getTenXCellRangerH5Channel( - 
params.data.tenx.cellranger_h5_file_paths + params.data.tenx.cellranger_h5 ).map { it -> tuple(it[0], it[1], "10x_cellranger_h5", "h5ad") } diff --git a/src/channels/conf/tenx_cellranger_h5.config b/src/channels/conf/tenx_cellranger_h5.config index 0e2ce869..7524f392 100644 --- a/src/channels/conf/tenx_cellranger_h5.config +++ b/src/channels/conf/tenx_cellranger_h5.config @@ -1,7 +1,7 @@ params { data { tenx { - cellranger_h5_file_paths = '' + cellranger_h5 = '' } } } diff --git a/src/channels/conf/tenx_cellranger_mex.config b/src/channels/conf/tenx_cellranger_mex.config index 8a37d05f..b6e93236 100644 --- a/src/channels/conf/tenx_cellranger_mex.config +++ b/src/channels/conf/tenx_cellranger_mex.config @@ -1,7 +1,7 @@ params { data { tenx { - cellranger_outs_dir_paths = 'data/10x/1k_pbmc/1k_pbmc_*/outs/' + cellranger_mex = 'data/10x/1k_pbmc/1k_pbmc_*/outs/' } } } diff --git a/src/utils/main.test.nf b/src/utils/main.test.nf index 52252cc3..63ccabd7 100644 --- a/src/utils/main.test.nf +++ b/src/utils/main.test.nf @@ -41,17 +41,17 @@ workflow { switch(params.test) { case "SC__FILE_CONVERTER": include SC__FILE_CONVERTER from './processes/utils' params(params) - test_SC__FILE_CONVERTER( getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) ) + test_SC__FILE_CONVERTER( getTenXChannel( params.data.tenx.cellranger_mex ) ) break; case "SC__FILE_CONCATENATOR": - test_SC__FILE_CONCATENATOR( getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) ) + test_SC__FILE_CONCATENATOR( getTenXChannel( params.data.tenx.cellranger_mex ) ) break; case "FILTER_BY_CELL_METADATA": // Imports include FILTER_BY_CELL_METADATA from './workflows/filterByCellMetadata' params(params) // Run if(params.sc.cell_filter) { - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) + data = getTenXChannel( params.data.tenx.cellranger_mex ) SC__FILE_CONVERTER( data ) FILTER_BY_CELL_METADATA( SC__FILE_CONVERTER.out ) } @@ -62,7 +62,7 @@ workflow { include 
SC__ANNOTATE_BY_CELL_METADATA from './processes/h5adAnnotate' params(params) // Run if(params.sc.cell_filter && params.sc.cell_annotate) { - data = getTenXChannel( params.data.tenx.cellranger_outs_dir_paths ) + data = getTenXChannel( params.data.tenx.cellranger_mex ) SC__FILE_CONVERTER( data ) FILTER_BY_CELL_METADATA( SC__FILE_CONVERTER.out ) SC__ANNOTATE_BY_CELL_METADATA( FILTER_BY_CELL_METADATA.out ) From 1dc8db6d40e3f02d7707ebb36a079fe9487d6b22 Mon Sep 17 00:00:00 2001 From: dweemx Date: Sun, 23 Feb 2020 11:00:59 +0100 Subject: [PATCH 12/32] Update input data format in pipelines docs --- docs/pipelines.rst | 63 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 49 insertions(+), 14 deletions(-) diff --git a/docs/pipelines.rst b/docs/pipelines.rst index c418d12f..de349dcf 100644 --- a/docs/pipelines.rst +++ b/docs/pipelines.rst @@ -29,7 +29,7 @@ The tool-specific parameters, as well as Docker/Singularity profiles, are includ In particular, the following parameters are frequently modified in practice: * ``params.global.project_name``: a project name which will be included in some of the output file names. - * ``params.data.tenx.cellranger_outs_dir_paths``, which should point to the ``outs/`` folder generated by CellRanger (if using 10x data). See ``Information on using 10x Genomics datasets`` for additional info. + * ``params.data.tenx.cellranger_mex``, which should point to the ``outs/`` folder generated by CellRanger (if using 10x data). See ``Information on using 10x Genomics datasets`` for additional info. * Filtering parameters (``params.sc.scanpy.filter``): filtering parameters, which will be applied to all samples, can be set here: min/max genes, mitochondrial read fraction, and min cells. See ``Multi-sample parameters`` for additional info on how to specify sample-specific parameters. * Louvain cluster resolution: ``params.sc.scanpy.clustering.resolution``. * For cell- and sample-level annotations, see ``here`` for additional info. 
@@ -202,20 +202,46 @@ Depending on the type of data you run the pipeline with, one or more appropriate Cell Ranger (10xGenomics) ------------------------- -:: - -profiles tenx +Use the following profile when generating the config file: +- either using the Cell Ranger ``MEX`` output folder, -In the generated .config file, make sur the ``cellranger_outs_dir_paths`` parameter is set with the paths to the Cell Ranger ``outs`` folders:: + +.. code:: + + -profile tenx + +In the generated .config file, make sure the ``cellranger_mex`` parameter is set with the paths to the Cell Ranger ``outs`` folders: + +.. code:: [...] tenx { - cellranger_outs_dir_paths = "data/10x/1k_pbmc/1k_pbmc_*/outs/" + cellranger_mex = "data/10x/1k_pbmc/1k_pbmc_*/outs/" } [...] -- The ``cellranger_outs_dir_paths`` parameter accepts glob patterns and also comma separated paths. +The ``cellranger_mex`` parameter accepts glob patterns and also comma separated paths. + + +- or the Cell Ranger ``h5`` file, + +.. code:: + + -profile tenx_h5 + + +In the generated .config file, make sure the ``cellranger_h5`` parameter is set with the paths to the Cell Ranger ``outs`` files: + +.. code:: + + [...] + tenx { + cellranger_h5 = "data/10x/1k_pbmc/1k_pbmc_*/outs/" + } + [...] + +- The ``cellranger_mex`` parameter accepts glob patterns and also comma separated paths. Information on using 10x Genomics datasets @@ -236,18 +262,23 @@ Let's say the file structure of your data looks like this, └── ... Setting the input directory appropriately will collect all the samples listed in the ``filtered_[feature|gene]_bc_matrix`` directories listed above. -For example, in ``params.data.tenx``, setting:: +For example, in ``params.data.tenx``, setting: + +.. code:: + + cellranger_mex = "/home/data/cellranger/Sample*/outs/" - cellranger_outs_dir_paths = "/home/data/cellranger/Sample*/outs/" will recursively find all 10x samples in that directory. 
H5AD (Scanpy) ------------- -:: +Use the following profile when generating the config file: + +.. code:: - -profiles h5ad + -profile h5ad In the generated .config file, make sure the ``file_paths`` parameter is set with the paths to the ``.h5ad`` files:: @@ -264,9 +295,11 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit TSV --- -:: +Use the following profile when generating the config file: - -profiles tsv +.. code:: + + -profile tsv In the generated .config file, make sure the ``file_paths`` parameter is set with the paths to the ``.tsv`` files:: @@ -284,9 +317,11 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit CSV --- -:: +Use the following profile when generating the config file: + +.. code:: - -profiles csv + -profile csv In the generated .config file, make sure the ``file_paths`` parameter is set with the paths to the ``.csv`` files:: From 100efef4bf7d129ed8cad8c91decb01a3555656e Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 11:13:53 +0100 Subject: [PATCH 13/32] Improve docs to define the data input paths --- docs/pipelines.rst | 35 +++++++++++++++++++++++++--------- tests/publish_with_renaming.nf | 31 ------------------------------ 2 files changed, 26 insertions(+), 40 deletions(-) delete mode 100644 tests/publish_with_renaming.nf diff --git a/docs/pipelines.rst b/docs/pipelines.rst index de349dcf..13ec1594 100644 --- a/docs/pipelines.rst +++ b/docs/pipelines.rst @@ -200,12 +200,36 @@ Input Data Formats Depending on the type of data you run the pipeline with, one or more appropriate profiles should be set when running ``nextflow config``. +All the input data parameters are compatible with the following features: + +- Glob patterns + +.. code:: + + "data/10x/1k_pbmc/1k_pbmc_*/outs/" + +- Comma separated paths (paths can contain glob patterns) + +.. 
code:: + + "data/10x/1k_pbmc/1k_pbmc_v2_chemistry/outs/, data/10x/1k_pbmc/1k_pbmc_v3_chemistry/outs/" + +- Array of paths (paths can contain glob patterns) + +.. code:: + + [ + "data/10x/1k_pbmc/1k_pbmc_v2_chemistry/outs/", + "data/10x/1k_pbmc/1k_pbmc_v3_chemistry/outs/" + ] + + Cell Ranger (10xGenomics) ------------------------- Use the following profile when generating the config file: -- either using the Cell Ranger ``MEX`` output folder, +- Either using the Cell Ranger ``MEX`` output folder, .. code:: @@ -221,10 +245,8 @@ In the generated .config file, make sur the ``cellranger_mex`` parameter is set } [...] -The ``cellranger_mex`` parameter accepts glob patterns and also comma separated paths. - -- or the Cell Ranger ``h5`` file, +- Or the Cell Ranger ``h5`` file, .. code:: @@ -241,8 +263,6 @@ In the generated .config file, make sur the ``cellranger_h5`` parameter is set w } [...] -- The ``cellranger_mex`` parameter accepts glob patterns and also comma separated paths. - Information on using 10x Genomics datasets ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -291,7 +311,6 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit [...] - The ``suffix`` parameter is used to infer the sample name from the file paths (it is removed from the input file path to derive a sample name). -- The ``file_paths`` accepts glob patterns and also comma separated paths. TSV --- @@ -312,7 +331,6 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit [...] - The ``suffix`` parameter is used to infer the sample name from the file paths (it is removed from the input file path to derive a sample name). -- The ``file_paths`` accepts glob patterns and also comma separated paths. CSV @@ -334,4 +352,3 @@ In the generated .config file, make sure the ``file_paths`` parameter is set wit [...] - The ``suffix`` parameter is used to infer the sample name from the file paths (it is removed from the input file path to derive a sample name). 
-- The ``file_paths`` accepts glob patterns and also comma separated paths. diff --git a/tests/publish_with_renaming.nf b/tests/publish_with_renaming.nf deleted file mode 100644 index 30c9459e..00000000 --- a/tests/publish_with_renaming.nf +++ /dev/null @@ -1,31 +0,0 @@ -nextflow.preview.dsl=2 - -process PUBLISH { - - publishDir "out", mode: 'link', overwrite: true, saveAs: { filename -> params.out_filename } - - input: - file f - - output: - file f - - """ - """ -} - -process TOUCH { - - output: - file("foo.txt") - - """ - touch foo.txt - """ - -} - -workflow { - main: - PUBLISH( TOUCH() ) -} \ No newline at end of file From 879fa686e47da976fe5ddfcc20659904720f5e89 Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 11:57:24 +0100 Subject: [PATCH 14/32] Fix bug for h5ad data input Plus remove unecessary parameter of h5ad_concatenate.config --- src/channels/channels.nf | 2 +- src/utils/conf/h5ad_concatenate.config | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/src/channels/channels.nf b/src/channels/channels.nf index 60d04376..381127d2 100644 --- a/src/channels/channels.nf +++ b/src/channels/channels.nf @@ -32,7 +32,7 @@ workflow getDataChannel { params.data.h5ad.file_paths, params.data.h5ad.suffix ).map { - it -> tuple(it[0], it[1], "tsv", "h5ad") + it -> tuple(it[0], it[1], "h5ad", "h5ad") } ).view() } diff --git a/src/utils/conf/h5ad_concatenate.config b/src/utils/conf/h5ad_concatenate.config index 140f38cb..aeb602ba 100644 --- a/src/utils/conf/h5ad_concatenate.config +++ b/src/utils/conf/h5ad_concatenate.config @@ -2,7 +2,6 @@ params { sc { file_concatenator { join = 'outer' - iff = '10x_cellranger_mex' off = 'h5ad' } } From bc04cc8de53b336e13a4ae53ca301b626839ad9d Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 14:36:26 +0100 Subject: [PATCH 15/32] Add regress out variable docs section --- docs/features.rst | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/docs/features.rst b/docs/features.rst 
index e9cd7932..f99f0eaf 100644 --- a/docs/features.rst +++ b/docs/features.rst @@ -171,3 +171,23 @@ Since ``v0.9.0``, it is possible to explore several combinations of parameters. - ``resolution`` :: resolutions = [0.4, 0.8] + +Regress out variables +--------------------- + +By default, no variables are regressed out. To enable this feature, the ``scanpy_regress_out`` profile should be added when generating the main config using ``nextflow config``. This will add the following entry in the config: + +.. code:: groovy + + params { + sc { + scanpy { + regress_out { + variablesToRegressOut = [] + off = 'h5ad' + } + } + } + } + +Add the variables to regress out to ``variablesToRegressOut``, e.g. 'n_counts', 'percent_mito'. From 41300db9ce0b254b399da5ec44bb0c8fe81c19b6 Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 14:36:46 +0100 Subject: [PATCH 16/32] Populate cell and sample-based docs sections --- docs/features.rst | 59 +++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 57 insertions(+), 2 deletions(-) diff --git a/docs/features.rst b/docs/features.rst index f99f0eaf..da3a76fa 100644 --- a/docs/features.rst +++ b/docs/features.rst @@ -110,12 +110,67 @@ Currently, only the Scanpy related pipelines have this feature implemented. Cell-based metadata annotation ------------------------------ -If you have (pre-computed) cell-based metadata and you'd like to add them as annotations, please read `cell-based metadata annotation `_. +The profile ``utils_cell_annotate`` should be added when generating the main config using ``nextflow config``. This will add the following entry in the config: + +.. 
code:: groovy + + params { + sc { + cell_annotate { + iff = '10x_cellranger_mex' + off = 'h5ad' + cellMetaDataFilePath = '' + indexColumnName = '' + sampleColumnName = '' + annotationColumnNames = [''] + } + } + } + +Then, the following parameters should be updated to use the module feature: + +- ``cellMetaDataFilePath`` is a TSV file (with header) with at least 2 columns: a column containing all the cell IDs and an annotation column. +- ``indexColumnName`` is the name of the column in ``cellMetaDataFilePath`` containing the cell IDs. +- ``sampleColumnName`` is the name of the column in ``cellMetaDataFilePath`` containing the sample IDs/names. +- ``annotationColumnNames`` is an array of column names from ``cellMetaDataFilePath`` containing the annotation metadata to add. Sample-based metadata annotation -------------------------------- -If you have sample-based metadata and you'd like to annotate the cells with these annotations, please read `sample-based metadata annotation `_. +The profile ``utils_sample_annotate`` should be added when generating the main config using ``nextflow config``. This will add the following entry in the config: + +.. code:: groovy + + params { + sc { + sample_annotate { + iff = '10x_cellranger_mex' + off = 'h5ad' + type = 'sample' + metaDataFilePath = 'data/10x/1k_pbmc/metadata.tsv' + } + } + } + +Then, the following parameters should be updated to use the module feature: + +- ``metaDataFilePath`` is a TSV file (with header) with at least 2 columns, where the first column must match the sample IDs. All other columns are added as annotations in the final loom, i.e. every cell is annotated with the values given for its sample. + +.. list-table:: Sample-based Metadata Table + :widths: 40 40 20 + :header-rows: 1 + + * - id + - chemistry + - ... + * - 1k_pbmc_v2_chemistry + - v2 + - ... + * - 1k_pbmc_v3_chemistry + - v3 + - ... 
+ +Annotating the samples this way allows any user to query all the annotations in the SCope portal. This is especially relevant when samples need to be compared across specific annotations (see the Compare tab in SCope). Multi-sample parameters ------------------------ From 71867714634b5a1d1508d2c62e72fb92367f2e5f Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 15:34:07 +0100 Subject: [PATCH 17/32] Update tools pcacv, scanpy, scenic --- src/pcacv | 2 +- src/scanpy | 2 +- src/scenic | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/pcacv b/src/pcacv index a22d4ef0..b4574930 160000 --- a/src/pcacv +++ b/src/pcacv @@ -1 +1 @@ -Subproject commit a22d4ef02db47e33c41fc56f351bd28f06effda3 +Subproject commit b4574930b43148136bf5351f22cc154f99557ead diff --git a/src/scanpy b/src/scanpy index 8ab5d125..342bc17b 160000 --- a/src/scanpy +++ b/src/scanpy @@ -1 +1 @@ -Subproject commit 8ab5d125442dd0d52934e4d265b06ab2e3129a63 +Subproject commit 342bc17b4346950f85eb203e132b8bd53faf272c diff --git a/src/scenic b/src/scenic index 2339677d..e3b6dbf1 160000 --- a/src/scenic +++ b/src/scenic @@ -1 +1 @@ -Subproject commit 2339677dda1881cb05d85548124a32ffa4397d72 +Subproject commit e3b6dbf1acd3ad63c7e9493acd4253135c001d8d From d4919cf4b2900b2199c1fc5a50ee7a0e3eeb0a5c Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 16:05:13 +0100 Subject: [PATCH 18/32] Fix bug cannot find single_sample_standalone --- docs/features.rst | 5 ----- main.nf | 2 +- 2 files changed, 1 insertion(+), 6 deletions(-) diff --git a/docs/features.rst b/docs/features.rst index da3a76fa..b13c2cbb 100644 --- a/docs/features.rst +++ b/docs/features.rst @@ -210,11 +210,6 @@ You'll just have to repeat the following structure for the parameters which you Parameter exploration ---------------------- -The latest version only implements this feature for the following pipelines: - -- ``single_sample`` -- ``bbknn`` - Since ``v0.9.0``, it is possible to 
explore several combinations of parameters. The latest version of the VSN-Pipelines allows to explore the following parameters: - ``params.sc.scanpy.clustering`` diff --git a/main.nf b/main.nf index 8e1a40ea..d59b6e02 100644 --- a/main.nf +++ b/main.nf @@ -81,7 +81,7 @@ workflow single_sample { workflow single_sample_scenic { include scenic_append as SCENIC_APPEND from './src/scenic/main.nf' params(params) - include single_sample_standalone as SINGLE_SAMPLE from './workflows/single_sample' params(params) + include single_sample as SINGLE_SAMPLE from './workflows/single_sample' params(params) getDataChannel | SINGLE_SAMPLE SCENIC_APPEND( SINGLE_SAMPLE.out.filteredloom, From c85120741afdbe571771a004dcc0151c3ac96619 Mon Sep 17 00:00:00 2001 From: dweemx Date: Mon, 24 Feb 2020 18:42:59 +0100 Subject: [PATCH 19/32] Fix bug unpaired reports (clustering and bec) --- src/utils/workflows/utils.nf | 4 ++-- workflows/bbknn.nf | 34 ++++++++++++++++++++++++++----- workflows/harmony.nf | 37 ++++++++++++++++++++++++++++------ workflows/mnncorrect.nf | 39 +++++++++++++++++++++++++++++------- 4 files changed, 94 insertions(+), 20 deletions(-) diff --git a/src/utils/workflows/utils.nf b/src/utils/workflows/utils.nf index 24925eb6..aa4f87e8 100644 --- a/src/utils/workflows/utils.nf +++ b/src/utils/workflows/utils.nf @@ -11,7 +11,7 @@ nextflow.preview.dsl=2 workflow COMBINE_BY_PARAMS { take: - // Expects (sampleId, data, *params) + // Expects (sampleId, data, unstashedParams) A // Expects (sampleId, data, [stashedParams]) B @@ -26,7 +26,7 @@ workflow COMBINE_BY_PARAMS { }.groupTuple( by: [0, params.numParams()-1] ).map { - it -> tuple(it[1], *it[2]) + it -> tuple(it[1], *it[2], it[0]) } } else { out = A.join(B) diff --git a/workflows/bbknn.nf b/workflows/bbknn.nf index 43e03bdc..53cfe62c 100644 --- a/workflows/bbknn.nf +++ b/workflows/bbknn.nf @@ -5,6 +5,7 @@ nextflow.preview.dsl=2 include '../src/utils/processes/files.nf' params(params.sc.file_concatenator + params.global + params) 
include '../src/utils/processes/utils.nf' params(params.sc.file_concatenator + params.global + params) +include '../src/utils/workflows/utils.nf' params(params) include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params) include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params + params.global) @@ -82,20 +83,43 @@ workflow bbknn { ) // Collect the reports: + // Define the parameters for clustering + def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) ) + // Pairing clustering reports with bec reports + if(!clusteringParams.isParameterExplorationModeOn()) { + clusteringBECReports = BEC_BBKNN.out.cluster_report.map { + it -> tuple(it[0], it[1]) + }.combine( + BEC_BBKNN.out.bbknn_report.map { + it -> tuple(it[0], it[1]) + }, + by: 0 + ) + } else { + clusteringBECReports = COMBINE_BY_PARAMS( + BEC_BBKNN.out.cluster_report.map { + it -> tuple(it[0], it[1], *it[2]) + }, + BEC_BBKNN.out.bbknn_report, + clusteringParams + ).map { + it -> tuple(it[0], it[1], it[2]) + } + } ipynbs = project.combine( UTILS__GENERATE_WORKFLOW_CONFIG_REPORT.out ).join( - HVG_SELECTION.out.report - ).join( - BEC_BBKNN.out.cluster_report + HVG_SELECTION.out.report.map { + it -> tuple(it[0], it[1]) + } ).combine( - BEC_BBKNN.out.bbknn_report, + clusteringBECReports, by: 0 ).map { tuple( it[0], it.drop(1) ) } + // reporting: - def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) ) SC__SCANPY__MERGE_REPORTS( ipynbs, "merged_report", diff --git a/workflows/harmony.nf b/workflows/harmony.nf index 61895aa3..3245cbd0 100644 --- a/workflows/harmony.nf +++ b/workflows/harmony.nf @@ -5,6 +5,7 @@ nextflow.preview.dsl=2 include '../src/utils/processes/files.nf' params(params.sc.file_concatenator + params.global + params) include '../src/utils/processes/utils.nf' params(params.sc.file_concatenator + params.global + params) +include '../src/utils/workflows/utils.nf' params(params) 
 include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params)
 include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params + params.global)
@@ -81,21 +82,45 @@ workflow harmony {
     UTILS__GENERATE_WORKFLOW_CONFIG_REPORT(
         file(workflow.projectDir + params.utils.workflow_configuration.report_ipynb)
     )
-    // collect the reports:
+
+    // Collect the reports:
+    // Define the parameters for clustering
+    def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) )
+    // Pairing clustering reports with bec reports
+    if(!clusteringParams.isParameterExplorationModeOn()) {
+        clusteringBECReports = BEC_HARMONY.out.cluster_report.map {
+            it -> tuple(it[0], it[1])
+        }.combine(
+            BEC_HARMONY.out.harmony_report.map {
+                it -> tuple(it[0], it[1])
+            },
+            by: 0
+        )
+    } else {
+        clusteringBECReports = COMBINE_BY_PARAMS(
+            BEC_HARMONY.out.cluster_report.map {
+                it -> tuple(it[0], it[1], *it[2])
+            },
+            BEC_HARMONY.out.harmony_report,
+            clusteringParams
+        ).map {
+            it -> tuple(it[0], it[1], it[2])
+        }
+    }
     ipynbs = project.combine(
         UTILS__GENERATE_WORKFLOW_CONFIG_REPORT.out
     ).join(
-        HVG_SELECTION.out.report
-    ).join(
-        BEC_HARMONY.out.cluster_report
+        HVG_SELECTION.out.report.map {
+            it -> tuple(it[0], it[1])
+        }
     ).combine(
-        BEC_HARMONY.out.harmony_report,
+        clusteringBECReports,
         by: 0
     ).map {
         tuple( it[0], it.drop(1) )
     }
+
     // reporting:
-    def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) )
     SC__SCANPY__MERGE_REPORTS(
         ipynbs,
         "merged_report",
diff --git a/workflows/mnncorrect.nf b/workflows/mnncorrect.nf
index f79bbc40..ad9cebd4 100644
--- a/workflows/mnncorrect.nf
+++ b/workflows/mnncorrect.nf
@@ -5,6 +5,7 @@ nextflow.preview.dsl=2
 
 include '../src/utils/processes/files.nf' params(params.sc.file_concatenator + params.global + params)
 include '../src/utils/processes/utils.nf' params(params.sc.file_concatenator + params.global + params)
+include '../src/utils/workflows/utils.nf' params(params)
 
 include QC_FILTER from '../src/scanpy/workflows/qc_filter.nf' params(params)
 include NORMALIZE_TRANSFORM from '../src/scanpy/workflows/normalize_transform.nf' params(params)
@@ -46,7 +47,7 @@ workflow mnncorrect {
     if(params.sc.scanpy.containsKey("regress_out")) {
         preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled )
     } else {
-        preprocessed_data = HVG_SELECTION.out.scaled
+        preprocessed_data = HVG_SELECTION.out.hvg
     }
     DIM_REDUCTION_PCA( preprocessed_data )
     NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out )
@@ -78,21 +79,45 @@ workflow mnncorrect {
     UTILS__GENERATE_WORKFLOW_CONFIG_REPORT(
         file(workflow.projectDir + params.utils.workflow_configuration.report_ipynb)
     )
 
+    // Collect the reports:
+    // Define the parameters for clustering
+    def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) )
+    // Pairing clustering reports with bec reports
+    if(!clusteringParams.isParameterExplorationModeOn()) {
+        clusteringBECReports = BEC_MNNCORRECT.out.cluster_report.map {
+            it -> tuple(it[0], it[1])
+        }.combine(
+            BEC_MNNCORRECT.out.mnncorrect_report.map {
+                it -> tuple(it[0], it[1])
+            },
+            by: 0
+        )
+    } else {
+        clusteringBECReports = COMBINE_BY_PARAMS(
+            BEC_MNNCORRECT.out.cluster_report.map {
+                it -> tuple(it[0], it[1], *it[2])
+            },
+            BEC_MNNCORRECT.out.mnncorrect_report,
+            clusteringParams
+        ).map {
+            it -> tuple(it[0], it[1], it[2])
+        }
+    }
     ipynbs = project.combine(
         UTILS__GENERATE_WORKFLOW_CONFIG_REPORT.out
     ).join(
-        HVG_SELECTION.out.report
-    ).join(
-        BEC_MNNCORRECT.out.cluster_report
+        HVG_SELECTION.out.report.map {
+            it -> tuple(it[0], it[1])
+        }
     ).combine(
-        BEC_MNNCORRECT.out.mnncorrect_report,
+        clusteringBECReports,
         by: 0
     ).map {
         tuple( it[0], it.drop(1) )
     }
-    // Reporting:
-    def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) )
+
+    // reporting:
     SC__SCANPY__MERGE_REPORTS(
         ipynbs,
         "merged_report",

From 07ce04c92eff64957faf20b1b3052d0021f50082 Mon Sep 17 00:00:00 2001
From: dweemx
Date: Mon, 24 Feb 2020 21:42:46 +0100
Subject: [PATCH 20/32] Fix for vibsinglecellnf/scanpy:0.5.0#22

Pass log-transformed matrices/AnnData objects to mnn_correct, and use HVGs
instead of all the genes. (see mnnpy github README)
---
 workflows/mnncorrect.nf | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/workflows/mnncorrect.nf b/workflows/mnncorrect.nf
index ad9cebd4..96d88487 100644
--- a/workflows/mnncorrect.nf
+++ b/workflows/mnncorrect.nf
@@ -47,7 +47,7 @@ workflow mnncorrect {
     if(params.sc.scanpy.containsKey("regress_out")) {
         preprocessed_data = SC__SCANPY__REGRESS_OUT( HVG_SELECTION.out.scaled )
     } else {
-        preprocessed_data = HVG_SELECTION.out.hvg
+        preprocessed_data = HVG_SELECTION.out.scaled
     }
     DIM_REDUCTION_PCA( preprocessed_data )
     NEIGHBORHOOD_GRAPH( DIM_REDUCTION_PCA.out )
@@ -62,7 +62,7 @@ workflow mnncorrect {
 
     BEC_MNNCORRECT(
         NORMALIZE_TRANSFORM.out,
-        preprocessed_data,
+        HVG_SELECTION.out.hvg,
         clusterIdentificationPreBatchEffectCorrection.marker_genes
     )
 

From 1d90181b2ec99ac8133476525e01855c5b737124 Mon Sep 17 00:00:00 2001
From: dweemx
Date: Mon, 24 Feb 2020 21:53:26 +0100
Subject: [PATCH 21/32] Update tools harmony, scanpy

---
 src/harmony | 2 +-
 src/scanpy  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/harmony b/src/harmony
index 5373bca7..ddffb428 160000
--- a/src/harmony
+++ b/src/harmony
@@ -1 +1 @@
-Subproject commit 5373bca7a6b907b8f1aeaa0bdd40e0d02d2bdeb0
+Subproject commit ddffb428676293f1851e9fd3706a714f31f39663
diff --git a/src/scanpy b/src/scanpy
index 342bc17b..57e29c7b 160000
--- a/src/scanpy
+++ b/src/scanpy
@@ -1 +1 @@
-Subproject commit 342bc17b4346950f85eb203e132b8bd53faf272c
+Subproject commit 57e29c7bac5e1e1707dc58257d657611e1c57ca3

From 6b886d162305fcf43b669de6fc7fde7aeb75f1d4 Mon Sep 17 00:00:00 2001
From: dweemx
Date: Mon, 24 Feb 2020 22:38:48 +0100
Subject: [PATCH 22/32] Fix for single_sample pipelines

---
 workflows/single_sample.nf | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/workflows/single_sample.nf b/workflows/single_sample.nf
index 795adc2a..b6bb0fcb 100644
--- a/workflows/single_sample.nf
+++ b/workflows/single_sample.nf
@@ -72,11 +72,19 @@ workflow single_sample {
     // Reporting:
     def clusteringParams = SC__SCANPY__CLUSTERING_PARAMS( clean(params.sc.scanpy.clustering) )
     SC__SCANPY__MERGE_REPORTS(
-        QC_FILTER.out.report.mix(
+        QC_FILTER.out.report.map {
+            it -> tuple(it[0], it[1])
+        }.mix(
             samples.combine(UTILS__GENERATE_WORKFLOW_CONFIG_REPORT.out),
-            HVG_SELECTION.out.report,
-            DIM_REDUCTION_TSNE_UMAP.out.report,
-            CLUSTER_IDENTIFICATION.out.report
+            HVG_SELECTION.out.report.map {
+                it -> tuple(it[0], it[1])
+            },
+            DIM_REDUCTION_TSNE_UMAP.out.report.map {
+                it -> tuple(it[0], it[1])
+            },
+            CLUSTER_IDENTIFICATION.out.report.map {
+                it -> tuple(it[0], it[1])
+            }
         ).groupTuple(),
         "merged_report",
         clusteringParams.isParameterExplorationModeOn()

From d24fb4af6b914ca3cdb6b81bf1ba4ef0357a5b5c Mon Sep 17 00:00:00 2001
From: dweemx
Date: Tue, 25 Feb 2020 01:16:51 +0100
Subject: [PATCH 23/32] Fix bug mnncorrect fails with sample_data_tiny dataset

It seems that mnncorrect does not work with tiny sample_data dataset.
The following error is raised:
IndexError: index 100 is out of bounds for axis 0 with size 100.
Haven't dived into it to understand the real problem
---
 .github/workflows/mnncorrect.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/mnncorrect.yml b/.github/workflows/mnncorrect.yml
index 8a4a8ee2..8f338110 100644
--- a/.github/workflows/mnncorrect.yml
+++ b/.github/workflows/mnncorrect.yml
@@ -25,8 +25,8 @@ jobs:
       - name: Get sample data
         run: |
           mkdir testdata
-          wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/sample_data_tiny.tar.gz
-          tar xvf sample_data_tiny.tar.gz
+          wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/sample_data.tar.gz
+          tar xvf sample_data.tar.gz
           cp -r sample_data testdata/sample1
           mv sample_data testdata/sample2
       - name: Run single_sample test

From 2af9ab52658d93775588176585224438da4f98fc Mon Sep 17 00:00:00 2001
From: dweemx
Date: Tue, 25 Feb 2020 10:57:12 +0100
Subject: [PATCH 24/32] Update DAGs

---
 assets/images/bbknn.svg        | 2214 ++++++++++++++------------
 assets/images/bbknn_scenic.svg | 2669 +++++++++++++++++---------------
 assets/images/harmony.svg      | 2350 ++++++++++++++++------------
 assets/images/mnncorrect.svg   | 2327 +++++++++++++++-------------
 4 files changed, 5349 insertions(+), 4211 deletions(-)

diff --git a/assets/images/bbknn.svg b/assets/images/bbknn.svg
index 92d6e074..66ead53e 100644
--- a/assets/images/bbknn.svg
+++ b/assets/images/bbknn.svg
@@ -1,1350 +1,1640 @@
[SVG markup not recoverable. Regenerated Nextflow "pipeline_dag" graph for the bbknn workflow: process labels change from bbknn:BBKNN:bbknn_base:* to bbknn:BBKNN:*, and the isBenchmarkMode input label becomes isParameterExplorationModeOn.]
\ No newline at end of file
diff --git a/assets/images/bbknn_scenic.svg b/assets/images/bbknn_scenic.svg
index 7d5a643a..d9fd35d6 100644
--- a/assets/images/bbknn_scenic.svg
+++ b/assets/images/bbknn_scenic.svg
@@ -1,1715 +1,1912 @@
[SVG markup not recoverable. Regenerated Nextflow "pipeline_dag" graph for the bbknn_scenic workflow: process labels change from bbknn_scenic:BBKNN:bbknn_base:* to bbknn_scenic:BBKNN:* and from bbknn_scenic:SCENIC_append:SCENIC:* to bbknn_scenic:SCENIC_APPEND:scenic:*.]
\ No newline at end of file
diff --git a/assets/images/harmony.svg b/assets/images/harmony.svg
index cde27ede..41905e7f 100644
---
a/assets/images/harmony.svg +++ b/assets/images/harmony.svg @@ -1,1367 +1,1805 @@ - + --> + pipeline_dag - + p0 - -Channel.fromPath + +Channel.from p1 - -map + +view p0->p1 - - + + p2 - -view + p1->p2 - - -channel + + p3 - -harmony:HARMONY:harmony_base:QC_FILTER:SC__FILE_CONVERTER + +Channel.empty - + + +p7 + +concat + + -p2->p3 - - -data +p3->p7 + + +data + + + +p8 + +view + + + +p7->p8 + + - + p4 - -harmony:HARMONY:harmony_base:QC_FILTER:SC__SCANPY__COMPUTE_QC_STATS - - - -p3->p4 - - + +Channel.fromPath - + p5 - -harmony:HARMONY:harmony_base:QC_FILTER:SC__SCANPY__GENE_FILTER + +map - + p4->p5 - - - - - -p8 - -join - - - -p4->p8 - - + + - + p6 - -harmony:HARMONY:harmony_base:QC_FILTER:SC__SCANPY__CELL_FILTER + +map - + p5->p6 - - + + +channel - - -p15 - -map + + +p6->p7 + + - - -p6->p15 - - + + +p9 + +ifEmpty - - -p16 - -collect + + +p8->p9 + + +data - - -p15->p16 - - + + +p11 + +harmony:HARMONY:QC_FILTER:SC__FILE_CONVERTER - - -p7 - + + +p8->p11 + + +data - - -p7->p8 - - + + +p10 + + + + +p9->p10 + + - + p12 - -harmony:HARMONY:harmony_base:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT + +harmony:HARMONY:QC_FILTER:SC__SCANPY__COMPUTE_QC_STATS - - -p8->p12 - - -data + + +p11->p12 + + - + p13 - -harmony:HARMONY:harmony_base:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML + +harmony:HARMONY:QC_FILTER:SC__SCANPY__GENE_FILTER - + p12->p13 - - - - - -p9 - - - - -p9->p12 - - -ipynb + + - - -p10 - + + +p16 + +join - - -p10->p12 - - -reportTitle + + +p12->p16 + + - - -p11 - + + +p14 + +harmony:HARMONY:QC_FILTER:SC__SCANPY__CELL_FILTER - + -p11->p12 - - -isBenchmarkMode +p13->p14 + + - + -p14 - +p25 + +map - + + +p14->p25 + + + + + +p26 + +toSortedList + + + +p25->p26 + + + + + +p15 + + + -p13->p14 - - +p15->p16 + + - + p17 - -harmony:HARMONY:harmony_base:SC__FILE_CONCATENATOR + +map p16->p17 - - + + - - -p18 - -harmony:HARMONY:harmony_base:NORMALIZE_TRANSFORM:SC__SCANPY__NORMALIZATION + + +p21 + 
+harmony:HARMONY:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT - + -p17->p18 - - -rawFilteredData +p17->p21 + + +data - - -p78 - -harmony:HARMONY:harmony_base:SC__H5AD_TO_FILTERED_LOOM + + +p22 + +map - - -p17->p78 - - -rawFilteredData + + +p21->p22 + + - - -p87 - -combine + + +p18 + - - -p17->p87 - - -rawFilteredData + + +p18->p21 + + +ipynb - + p19 - -harmony:HARMONY:harmony_base:NORMALIZE_TRANSFORM:SC__SCANPY__DATA_TRANSFORMATION + - - -p18->p19 - - + + +p19->p21 + + +reportTitle - + p20 - -harmony:HARMONY:harmony_base:HVG_SELECTION:SC__SCANPY__FEATURE_SELECTION - - - -p19->p20 - - -normalizedTransformedData - - - -p46 - -join - - - -p19->p46 - - -normalizedTransformedData - - - -p64 - -join - - - -p19->p64 - - -normalizedTransformedData - - - -p21 - -harmony:HARMONY:harmony_base:HVG_SELECTION:SC__SCANPY__FEATURE_SCALING + p20->p21 - - - - - -p24 - -harmony:HARMONY:harmony_base:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - - -p21->p24 - - -data - - - -p27 - -map - - - -p21->p27 - - -data - - - -p25 - -harmony:HARMONY:harmony_base:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - - - -p24->p25 - - - - - -p99 - -join - - - -p24->p99 - - - - - -p22 - - - - -p22->p24 - - -ipynb + + +isParameterExplorationModeOn p23 - + +harmony:HARMONY:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML + + + +p22->p23 + + + + + +p24 + p23->p24 - - -reportTitle + + - - -p26 - + + +p27 + +harmony:HARMONY:SC__FILE_CONCATENATOR - + -p25->p26 - - +p26->p27 + + p28 - -harmony:HARMONY:harmony_base:DIM_REDUCTION:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA + +harmony:HARMONY:NORMALIZE_TRANSFORM:SC__SCANPY__NORMALIZATION - + p27->p28 - - -data + + +rawFilteredData + + + +p108 + +harmony:HARMONY:SC__H5AD_TO_FILTERED_LOOM + + + +p27->p108 + + +rawFilteredData + + + +p117 + +combine + + + +p27->p117 + + +rawFilteredData p29 - -map + +harmony:HARMONY:NORMALIZE_TRANSFORM:SC__SCANPY__DATA_TRANSFORMATION - + p28->p29 - - -data 
- - - -p48 - -map - - - -p28->p48 - - -data + + p30 - -harmony:HARMONY:harmony_base:DIM_REDUCTION:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE + +harmony:HARMONY:HVG_SELECTION:SC__SCANPY__FIND_HIGHLY_VARIABLE_GENES - + p29->p30 - - + + +normalizedTransformedData + + + +p62 + +join + + + +p29->p62 + + +normalizedTransformedData + + + +p92 + +join + + + +p29->p92 + + +normalizedTransformedData p31 - -map + +harmony:HARMONY:HVG_SELECTION:SC__SCANPY__SUBSET_HIGHLY_VARIABLE_GENES - + p30->p31 - - -dimred_pca_tsne + + +data + + + +p35 + +harmony:HARMONY:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT + + + +p30->p35 + + +data p32 - -harmony:HARMONY:harmony_base:DIM_REDUCTION:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP + +harmony:HARMONY:HVG_SELECTION:SC__SCANPY__FEATURE_SCALING - + p31->p32 - - + + - + -p33 - -map +p40 + +map - - -p32->p33 - - -data + + +p32->p40 + + +data - - -p39 - -map + + +p41 + +harmony:HARMONY:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA - - -p32->p39 - - -data + + +p40->p41 + + +data - + -p36 - -harmony:HARMONY:harmony_base:DIM_REDUCTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT +p33 + - - -p33->p36 - - -data + + +p33->p35 + + +ipynb - + -p37 - -harmony:HARMONY:harmony_base:DIM_REDUCTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML +p36 + +map - - -p36->p37 - - + + +p35->p36 + + - + p34 - + - + -p34->p36 - - -ipynb +p34->p35 + + +reportTitle - - -p35 - + + +p37 + +map - - -p35->p36 - - -reportTitle + + +p36->p37 + + +report_notebook + + + +p136 + +map + + + +p36->p136 + + +report_notebook - + p38 - + +harmony:HARMONY:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML p37->p38 - - + + - + -p40 - -harmony:HARMONY:harmony_base:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING +p39 + - - -p39->p40 - - + + +p38->p39 + + - - -p43 - -harmony:HARMONY:harmony_base:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT + + +p42 + +harmony:HARMONY:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH - + 
-p40->p43 - - -data +p41->p42 + + +dimReductionData - - -p40->p46 - - -data + + +p64 + +map - - -p44 - -harmony:HARMONY:harmony_base:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML + + +p41->p64 + + +dimReductionData - - -p43->p44 - - + + +p66 + +map - - -p41 - + + +p41->p66 + + +dimReductionData - - -p41->p43 - - -ipynb + + +p69 + +map - + + +p41->p69 + + +dimReductionData + + -p42 - +p43 + +harmony:HARMONY:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE - + p42->p43 - - -reportTitle + + +data + + + +p44 + +harmony:HARMONY:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP + + + +p43->p44 + + p45 - + +map - + p44->p45 - - - - - -p47 - -harmony:HARMONY:harmony_base:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES - - - -p46->p47 - - - - - -p71 - -join - - - -p47->p71 - - -A + + +data - - -p94 - -map + + +p53 + +map - - -p47->p94 - - -A + + +p44->p53 + + +data - - -p75 - -harmony:HARMONY:harmony_base:BEC_HARMONY:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT + + +p48 + +harmony:HARMONY:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - -p71->p75 - - -data + + +p45->p48 + + +data - + p49 - -harmony:HARMONY:harmony_base:BEC_HARMONY:SC__HARMONY__HARMONY_MATRIX + +map - + p48->p49 - - -dimReductionData + + + + + +p46 + + + + +p46->p48 + + +ipynb + + + +p47 + + + + +p47->p48 + + +reportTitle - + p50 - -join - - - -p48->p50 - - -dimReductionData + +map - + p49->p50 - - + + +report_notebook - + p51 - -harmony:HARMONY:harmony_base:BEC_HARMONY:SC__H5AD_UPDATE_X_PCA + +harmony:HARMONY:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - + p50->p51 - - + + - + p52 - -map + - + p51->p52 - - -data + + - + -p53 - -harmony:HARMONY:harmony_base:BEC_HARMONY:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE +p54 + +harmony:HARMONY:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING - - -p52->p53 - - + + +p53->p54 + + - + -p54 - -map +p57 + 
+harmony:HARMONY:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - + + +p54->p57 + + +data + + + +p54->p62 + + +data + + + +p58 + +map + + -p53->p54 - - +p57->p58 + + p55 - -harmony:HARMONY:harmony_base:BEC_HARMONY:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP + - - -p54->p55 - - + + +p55->p57 + + +ipynb p56 - -map - - - -p55->p56 - - -data - - - -p57 - -harmony:HARMONY:harmony_base:BEC_HARMONY:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING + - + p56->p57 - - + + +reportTitle - + +p59 + +map + + + +p58->p59 + + +report_notebook + + + p60 - -harmony:HARMONY:harmony_base:BEC_HARMONY:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT + +harmony:HARMONY:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - - -p57->p60 - - -data + + +p59->p60 + + - + p61 - -harmony:HARMONY:harmony_base:BEC_HARMONY:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML + - + p60->p61 - - -cluster_report + + - - -p58 - - - - -p58->p60 - - -ipynb - - - -p59 - + + +p63 + +harmony:HARMONY:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES - + -p59->p60 - - -reportTitle +p62->p63 + + - - -p62 - + + +p100 + +join + + + +p63->p100 + + +A + + + +p124 + +map + + + +p63->p124 + + +A - - -p61->p62 - - + + +p104 + +harmony:HARMONY:BEC_HARMONY:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT + + + +p100->p104 + + +data p65 - -harmony:HARMONY:harmony_base:BEC_HARMONY:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES + +harmony:HARMONY:BEC_HARMONY:SC__HARMONY__HARMONY_MATRIX - + p64->p65 - - - - - -p63 - - - - -p63->p64 - - + + - + -p66 - -map - - - -p65->p66 - - -data - - - -p80 - -groupTuple +p67 + +join - - -p65->p80 - - -data + + +p65->p67 + + - + p68 - -harmony:HARMONY:harmony_base:BEC_HARMONY:SC__PUBLISH_H5AD + +harmony:HARMONY:BEC_HARMONY:SC__H5AD_UPDATE_X_PCA - + -p66->p68 - - - - - -p69 - +p67->p68 + + - - -p68->p69 - - -B + + +p66->p67 + + - - -p67 - + + +p70 + +join - + -p67->p68 - - -fOutSuffix +p68->p70 + + - 
- -p70 - + + +p71 + +harmony:HARMONY:BEC_HARMONY:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH - + p70->p71 - - - - - -p76 - -harmony:HARMONY:harmony_base:BEC_HARMONY:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML + + +data - - -p75->p76 - - + + +p69->p70 + + p72 - + +harmony:HARMONY:BEC_HARMONY:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE - + -p72->p75 - - -ipynb +p71->p72 + + +data p73 - + +harmony:HARMONY:BEC_HARMONY:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP - + -p73->p75 - - -reportTitle +p72->p73 + + p74 - + +map - + -p74->p75 - - -isBenchmarkMode +p73->p74 + + +data + + + +p82 + +map + + + +p73->p82 + + +data - + p77 - + +harmony:HARMONY:BEC_HARMONY:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - + + +p74->p77 + + +data + + + +p78 + +map + + + +p77->p78 + + + + + +p75 + + + +p75->p77 + + +ipynb + + + +p76 + + + + p76->p77 - - + + +reportTitle - + p79 - + +map - + p78->p79 - - -filteredloom + + +report_notebook - + -p81 - -branch +p80 + +harmony:HARMONY:BEC_HARMONY:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - + -p80->p81 - - -data +p79->p80 + + - + -p82 - -view +p81 + - + -p81->p82 - - - - - -p92 - -view - - - -p81->p92 - - - - - -p84 - -map - - - -p81->p84 - - +p80->p81 + + p83 - + +harmony:HARMONY:BEC_HARMONY:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING p82->p83 - - + + - - -p93 - + + +p86 + +harmony:HARMONY:BEC_HARMONY:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - -p92->p93 - - + + +p83->p86 + + +data - + + +p87 + +map + + + +p86->p87 + + + + -p85 - +p84 + - + -p84->p85 - - +p84->p86 + + +ipynb + + + +p85 + + + + +p85->p86 + + +reportTitle p88 - -ifEmpty + +map p87->p88 - - - - - -p86 - - - - -p86->p87 - - + + +cluster_report + + + +p127 + +map + + + +p87->p127 + + +cluster_report p89 - -harmony:HARMONY:harmony_base:FILE_CONVERTER:SC__H5AD_TO_LOOM + +harmony:HARMONY:BEC_HARMONY:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML p88->p89 
- - + + p90 - -harmony:HARMONY:harmony_base:FILE_CONVERTER:COMPRESS_HDF5 + p89->p90 - - + + + + + +p93 + +harmony:HARMONY:BEC_HARMONY:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES + + + +p92->p93 + + - + p91 - + - - -p90->p91 - - -scopeloom + + +p91->p92 + + - + -p98 - -combine +p94 + +map - + -p94->p98 - - -project +p93->p94 + + +data - - -p98->p99 - - + + +p110 + +groupTuple + + + +p93->p110 + + +data p95 - - - - -p96 - -harmony:HARMONY:harmony_base:UTILS__GENERATE_WORKFLOW_CONFIG_REPORT + +map - + -p95->p96 - - -ipynb - - - -p96->p98 - - +p94->p95 + + +marker_genes - + p97 - + +harmony:HARMONY:BEC_HARMONY:SC__PUBLISH_H5AD - - -p97->p98 - - + + +p95->p97 + + - - -p101 - -join + + +p98 + - - -p99->p101 - - + + +p97->p98 + + +B - - -p104 - -combine + + +p96 + - - -p101->p104 - - + + +p96->p97 + + +fOutSuffix - - -p100 - + + +p99 + - - -p100->p101 - - + + +p99->p100 + + p105 - -map + +map - + p104->p105 - - + + + + + +p128 + +map + + + +p104->p128 + + + + + +p101 + + + + +p101->p104 + + +ipynb p102 - + - + p102->p104 - - + + +reportTitle p103 - + - + p103->p104 - - + + +isParameterExplorationModeOn - + -p108 - -harmony:HARMONY:harmony_base:SC__SCANPY__MERGE_REPORTS +p106 + +harmony:HARMONY:BEC_HARMONY:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML + + + +p105->p106 + + + + + +p107 + - + -p105->p108 - - -ipynbs +p106->p107 + + p109 - -harmony:HARMONY:harmony_base:SC__SCANPY__REPORT_TO_HTML + - -p108->p109 - - - - - -p106 - - - - -p106->p108 - - -reportTitle - - - -p107 - - - -p107->p108 - - -isBenchmarkMode +p108->p109 + + - - -p110 - + + +p111 + +branch - + -p109->p110 - - +p110->p111 + + +data + + + +p112 + +view + + + +p111->p112 + + + + + +p122 + +view + + + +p111->p122 + + + + + +p114 + +map + + + +p111->p114 + + + + + +p113 + + + + +p112->p113 + + + + + +p123 + + + + +p122->p123 + + + + + +p115 + + + + +p114->p115 + + + + + +p116 + + + + +p116->p117 + + + + + +p118 + +ifEmpty + + + +p117->p118 + + + + + +p119 + 
+harmony:HARMONY:FILE_CONVERTER:SC__H5AD_TO_LOOM + + + +p118->p119 + + + + + +p120 + +harmony:HARMONY:FILE_CONVERTER:COMPRESS_HDF5 + + + +p119->p120 + + + + + +p121 + + + + +p120->p121 + + + + + +p135 + +combine + + + +p124->p135 + + +project + + + +p137 + +join + + + +p135->p137 + + + + + +p125 + + + + +p126 + +harmony:HARMONY:UTILS__GENERATE_WORKFLOW_CONFIG_REPORT + + + +p125->p126 + + +ipynb + + + +p126->p135 + + + + + +p132 + +combine + + + +p127->p132 + + + + + +p133 + + + + +p132->p133 + + +clusteringBECReports + + + +p129 + + + + +p128->p129 + + + + + +p130 + + + + +p130->p132 + + + + + +p131 + + + + +p131->p132 + + + + + +p134 + + + + +p134->p135 + + + + + +p140 + +combine + + + +p137->p140 + + + + + +p136->p137 + + + + + +p141 + +map + + + +p140->p141 + + + + + +p138 + + + + +p138->p140 + + + + + +p139 + + + + +p139->p140 + + + + + +p144 + +harmony:HARMONY:SC__SCANPY__MERGE_REPORTS + + + +p141->p144 + + +ipynbs + + + +p145 + +harmony:HARMONY:SC__SCANPY__REPORT_TO_HTML + + + +p144->p145 + + + + + +p142 + + + + +p142->p144 + + +reportTitle + + + +p143 + + + + +p143->p144 + + +isParameterExplorationModeOn + + + +p146 + + + + +p145->p146 + + \ No newline at end of file diff --git a/assets/images/mnncorrect.svg b/assets/images/mnncorrect.svg index b1182b72..c4923575 100644 --- a/assets/images/mnncorrect.svg +++ b/assets/images/mnncorrect.svg @@ -1,1541 +1,1754 @@ - + --> + pipeline_dag - + p0 - -Channel.from + +Channel.from p1 - -view + +view p0->p1 - - + + p2 - + p1->p2 - - + + p3 - -Channel.fromPath + +Channel.empty - + -p4 - -map +p7 + +concat - + -p3->p4 - - +p3->p7 + + +data - + + +p8 + +view + + + +p7->p8 + + + + +p4 + +Channel.fromPath + + + p5 - -view + +map p4->p5 - - -channel + + - + p6 - -mnncorrect:MNNCORRECT:QC_FILTER:SC__FILE_CONVERTER + +map p5->p6 - - -data - - - -p7 - -mnncorrect:MNNCORRECT:QC_FILTER:SC__SCANPY__COMPUTE_QC_STATS + + +channel p6->p7 - - - - - -p8 - -mnncorrect:MNNCORRECT:QC_FILTER:SC__SCANPY__GENE_FILTER - - - -p7->p8 - - - - - 
-p11 - -join - - - -p7->p11 - - + + p9 - -mnncorrect:MNNCORRECT:QC_FILTER:SC__SCANPY__CELL_FILTER + +ifEmpty p8->p9 - - + + +data - - -p18 - -map - - - -p9->p18 - - - - - -p19 - -collect + + +p11 + +mnncorrect:MNNCORRECT:QC_FILTER:SC__FILE_CONVERTER - - -p18->p19 - - + + +p8->p11 + + +data - + p10 - + - - -p10->p11 - - + + +p9->p10 + + - + + +p12 + +mnncorrect:MNNCORRECT:QC_FILTER:SC__SCANPY__COMPUTE_QC_STATS + + + +p11->p12 + + + + -p15 - -mnncorrect:MNNCORRECT:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT +p13 + +mnncorrect:MNNCORRECT:QC_FILTER:SC__SCANPY__GENE_FILTER - + -p11->p15 - - -data +p12->p13 + + - + p16 - -mnncorrect:MNNCORRECT:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML + +join - - -p15->p16 - - + + +p12->p16 + + - + -p12 - +p14 + +mnncorrect:MNNCORRECT:QC_FILTER:SC__SCANPY__CELL_FILTER - + -p12->p15 - - -ipynb +p13->p14 + + - + -p13 - +p25 + +map - + -p13->p15 - - -reportTitle +p14->p25 + + - - -p14 - + + +p26 + +toSortedList - - -p14->p15 - - -isBenchmarkMode + + +p25->p26 + + p17 - + +map p16->p17 - - + + - - -p20 - -mnncorrect:MNNCORRECT:SC__FILE_CONCATENATOR + + +p15 + - - -p19->p20 - - + + +p15->p16 + + - + p21 - -mnncorrect:MNNCORRECT:NORMALIZE_TRANSFORM:SC__SCANPY__NORMALIZATION + +mnncorrect:MNNCORRECT:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT - - -p20->p21 - - -rawFilteredData + + +p17->p21 + + +data - - -p94 - -mnncorrect:MNNCORRECT:SC__H5AD_TO_FILTERED_LOOM + + +p22 + +map - - -p20->p94 - - -rawFilteredData + + +p21->p22 + + - - -p103 - -combine + + +p18 + - - -p20->p103 - - -rawFilteredData + + +p18->p21 + + +ipynb - + + +p19 + + + + +p19->p21 + + +reportTitle + + -p22 - -mnncorrect:MNNCORRECT:NORMALIZE_TRANSFORM:SC__SCANPY__DATA_TRANSFORMATION +p20 + - + -p21->p22 - - +p20->p21 + + +isParameterExplorationModeOn - + p23 - -mnncorrect:MNNCORRECT:HVG_SELECTION:SC__SCANPY__FEATURE_SELECTION + 
+mnncorrect:MNNCORRECT:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML - + p22->p23 - - -normalizedTransformedData - - - -p50 - -join - - - -p22->p50 - - -normalizedTransformedData - - - -p64 - -join - - - -p22->p64 - - -normalizedTransformedData + + - + p24 - -mnncorrect:MNNCORRECT:HVG_SELECTION:SC__SCANPY__FEATURE_SCALING + - + p23->p24 - - + + - + p27 - -mnncorrect:MNNCORRECT:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - - -p24->p27 - - -data - - - -p30 - -map + +mnncorrect:MNNCORRECT:SC__FILE_CONCATENATOR - - -p24->p30 - - -data - - - -p52 - -map - - - -p24->p52 - - -data + + +p26->p27 + + p28 - -mnncorrect:MNNCORRECT:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML + +mnncorrect:MNNCORRECT:NORMALIZE_TRANSFORM:SC__SCANPY__NORMALIZATION p27->p28 - - - - - -p115 - -join - - - -p27->p115 - - + + +rawFilteredData - - -p25 - + + +p105 + +mnncorrect:MNNCORRECT:SC__H5AD_TO_FILTERED_LOOM - - -p25->p27 - - -ipynb + + +p27->p105 + + +rawFilteredData - - -p26 - + + +p114 + +combine - - -p26->p27 - - -reportTitle + + +p27->p114 + + +rawFilteredData p29 - + +mnncorrect:MNNCORRECT:NORMALIZE_TRANSFORM:SC__SCANPY__DATA_TRANSFORMATION p28->p29 - - + + + + + +p30 + +mnncorrect:MNNCORRECT:HVG_SELECTION:SC__SCANPY__FIND_HIGHLY_VARIABLE_GENES + + + +p29->p30 + + +normalizedTransformedData + + + +p62 + +join + + + +p29->p62 + + +normalizedTransformedData + + + +p89 + +join + + + +p29->p89 + + +normalizedTransformedData p31 - -mnncorrect:MNNCORRECT:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA + +mnncorrect:MNNCORRECT:HVG_SELECTION:SC__SCANPY__SUBSET_HIGHLY_VARIABLE_GENES p30->p31 - - -data + + +data + + + +p35 + +mnncorrect:MNNCORRECT:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT + + + +p30->p35 + + +data p32 - -mnncorrect:MNNCORRECT:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH + +mnncorrect:MNNCORRECT:HVG_SELECTION:SC__SCANPY__FEATURE_SCALING p31->p32 - - -data + + +data - + + +p64 + +map + + + +p31->p64 + + +data + + -p33 
- -map +p40 + +map - + -p32->p33 - - -data +p32->p40 + + +data - - -p34 - -mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE + + +p41 + +mnncorrect:MNNCORRECT:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA - - -p33->p34 - - + + +p40->p41 + + +data - - -p35 - -map + + +p33 + - - -p34->p35 - - + + +p33->p35 + + +ipynb - + p36 - -mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP + +map - -p35->p36 - - - - - -p37 - -map - - -p36->p37 - - -data +p35->p36 + + - - -p43 - -map + + +p34 + - - -p36->p43 - - -data + + +p34->p35 + + +reportTitle - + -p40 - -mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - - -p37->p40 - - -data - - - -p41 - -mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - - - -p40->p41 - - +p37 + +map + + + +p36->p37 + + +report_notebook + + + +p132 + +map + + + +p36->p132 + + +report_notebook p38 - + +mnncorrect:MNNCORRECT:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - + -p38->p40 - - -ipynb +p37->p38 + + p39 - + - + -p39->p40 - - -reportTitle +p38->p39 + + p42 - + +mnncorrect:MNNCORRECT:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH p41->p42 - - + + +data + + + +p43 + +mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE + + + +p42->p43 + + +data p44 - -mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING + +mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP p43->p44 - - + + - + -p47 - -mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT +p45 + +map - + -p44->p47 - - -data +p44->p45 + + +data - - -p44->p50 - - -data + + +p53 + +map + + + +p44->p53 + + +data - + p48 - -mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML + +mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - -p47->p48 - - + + +p45->p48 + + +data - - -p45 - + 
+ +p49 + +map - - -p45->p47 - - -ipynb + + +p48->p49 + + p46 - + - + -p46->p47 - - -reportTitle +p46->p48 + + +ipynb - - -p49 - + + +p47 + - - -p48->p49 - - + + +p47->p48 + + +reportTitle + + + +p50 + +map + + + +p49->p50 + + +report_notebook p51 - -mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES + +mnncorrect:MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML - + p50->p51 - - + + - + -p87 - -join - - - -p51->p87 - - -A - - - -p110 - -map +p52 + - - -p51->p110 - - -A + + +p51->p52 + + - - -p91 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT + + +p54 + +mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING - - -p87->p91 - - -data + + +p53->p54 + + - - -p53 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:SC__SCANPY__BATCH_EFFECT_CORRECTION + + +p57 + +mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - + -p52->p53 - - +p54->p57 + + +data - - -p54 - -map + + +p54->p62 + + +data - - -p53->p54 - - -data + + +p58 + +map + + + +p57->p58 + + p55 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA + - - -p54->p55 - - -data + + +p55->p57 + + +ipynb p56 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH - - - -p55->p56 - - -data - - - -p57 - -map + - + p56->p57 - - -data + + +reportTitle - + -p58 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING +p59 + +map - + + +p58->p59 + + +report_notebook + + + +p60 + +mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML + + -p57->p58 - - +p59->p60 + + - + p61 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT + - + -p58->p61 - - -data - - - -p58->p64 - - -data +p60->p61 + + - + -p62 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML 
+p63 + +mnncorrect:MNNCORRECT:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES - + -p61->p62 - - -cluster_report - - - -p116 - -join - - - -p61->p116 - - -cluster_report +p62->p63 + + - - -p59 - + + +p97 + +join - - -p59->p61 - - -ipynb + + +p63->p97 + + +A - - -p60 - + + +p121 + +map - - -p60->p61 - - -reportTitle + + +p63->p121 + + +A - - -p63 - + + +p101 + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT - - -p62->p63 - - + + +p97->p101 + + +data p65 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:SC__SCANPY__BATCH_EFFECT_CORRECTION - + p64->p65 - - + + p66 - -map + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:SC__SCANPY__FEATURE_SCALING - + p65->p66 - - - - - -p71 - -combine - - - -p66->p71 - - - - - -p72 - -map - - - -p71->p72 - - -data + + - + p67 - -Channel.from + +map + + + +p66->p67 + + +data - + p68 - + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA - + p67->p68 - - + + +data - + p69 - + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH - - -p69->p71 - - + + +p68->p69 + + +data - + p70 - + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE + + + +p69->p70 + + +data + + + +p71 + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP p70->p71 - - + + - - -p73 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE + + +p72 + +map - - -p72->p73 - - + + +p71->p72 + + +data - - -p74 - -map + + +p80 + +map - - -p73->p74 - - + + +p71->p80 + + +data - + p75 - -mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP + +mnncorrect:MNNCORRECT:BEC_MNNCORRECT:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT - - -p74->p75 - - + + +p72->p75 + + +data p76 - -map + +map p75->p76 - - -data + + - - 
[Remainder of the SVG diff for the mnncorrect workflow graph omitted: node and edge renumbering for the mnncorrect:MNNCORRECT:BEC_MNNCORRECT steps; among the visible label changes, isBenchmarkMode becomes isParameterExplorationModeOn.] \ No newline at end of file From ad6d89fa17acea013de552080a7e3e72cd21e339 Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 10:57:48 +0100 Subject: [PATCH 25/32] Add bbknn_scenic CI --- .github/workflows/bbknn_scenic.yml | 36 ++++++++++++++++++++++++ conf/test__bbknn_scenic.config | 44 ++++++++++++++++++++++++++++++ docs/pipelines.rst | 7 +++-- 3 files changed, 85 insertions(+), 2 deletions(-) create mode 100644 .github/workflows/bbknn_scenic.yml create mode 100644 conf/test__bbknn_scenic.config diff --git a/.github/workflows/bbknn_scenic.yml b/.github/workflows/bbknn_scenic.yml new file mode 100644 index 00000000..57b33bcb --- /dev/null +++ b/.github/workflows/bbknn_scenic.yml @@ -0,0 +1,36 @@ +name: bbknn_scenic + +on: + push: + branches: + - master + pull_request: + branches: + - master + +jobs: + build: + + runs-on: ubuntu-latest + + steps: + - uses: actions/checkout@v1 + with: + submodules: true + - name: Install Nextflow + run: | + export NXF_VER='19.12.0-edge' + wget -qO- get.nextflow.io | bash +
sudo mv nextflow /usr/local/bin/ + - name: Get sample data + run: | + mkdir testdata + wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/sample_data_tiny.tar.gz + tar xvf sample_data_tiny.tar.gz + cp -r sample_data testdata/sample1 + mv sample_data testdata/sample2 + - name: Run bbknn_scenic test + run: | + nextflow run ${GITHUB_WORKSPACE} -profile bbknn_scenic,test__bbknn_scenic,docker -entry bbknn_scenic -ansi-log false + cat .nextflow.log + diff --git a/conf/test__bbknn_scenic.config b/conf/test__bbknn_scenic.config new file mode 100644 index 00000000..566803d9 --- /dev/null +++ b/conf/test__bbknn_scenic.config @@ -0,0 +1,44 @@ + +params { + global { + project_name = 'bbknn_scenic_CI' + } + data { + tenx { + cellranger_mex = "testdata/*/outs/" + } + } + sc { + file_annotator { + metaDataFilePath = '' + } + scanpy { + filter { + cellFilterMinNGenes = 1 + } + neighborhood_graph { + nPcs = 2 + } + dim_reduction { + pca { + method = 'pca' + nComps = 2 + } + } + } + scenic { + filteredLoom = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/expr_mat_tiny.loom' + numWorkers = 2 + grn { + tfs = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/test_TFs_tiny.txt' + } + cistarget { + motifsDb = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/genome-ranking.feather' + motifsAnnotation = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/motifs.tbl' + tracksDb = '' + tracksAnnotation = '' + } + } + } +} + diff --git a/docs/pipelines.rst b/docs/pipelines.rst index 13ec1594..7f88c816 100644 --- a/docs/pipelines.rst +++ b/docs/pipelines.rst @@ -160,8 +160,11 @@ Source: https://github.com/Teichlab/bbknn/blob/master/examples/pancreas.ipynb .. 
|BBKNN Workflow| image:: https://raw.githubusercontent.com/vib-singlecell-nf/vsn-pipelines/master/assets/images/bbknn.svg?sanitize=true -**bbknn_scenic** ----------------- +**bbknn_scenic** |bbknn_scenic| +------------------------------- + +.. |bbknn_scenic| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/bbknn_scenic/badge.svg + Runs the ``bbknn`` workflow above, then runs the ``scenic`` workflow on the output, generating a comprehensive loom file with the combined results. This could be very resource intensive, depending on the dataset. From 81d00c9a21022b8d0bfc022d54054ac2599cc5dc Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 10:58:24 +0100 Subject: [PATCH 26/32] Update README Add link to quick start Add pipelines tags --- README.rst | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/README.rst b/README.rst index f9018f89..418adf9d 100644 --- a/README.rst +++ b/README.rst @@ -15,8 +15,48 @@ VSN-Pipelines :target: https://gitter.im/vib-singlecell-nf/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge :alt: Gitter +|single_sample| |single_sample_scenic| |scenic| |scenic_multiruns| |single_sample_scenic_multiruns| |bbknn| |bbknn_scenic| |harmony| |mnncorrect| + +.. |single_sample| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/single_sample/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#single-sample-single-sample + :alt: Single-sample Pipeline + +.. |single_sample_scenic| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/single_sample_scenic/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#single-sample-scenic-single-sample-scenic + :alt: Single-sample SCENIC Pipeline + +.. 
|scenic| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/scenic/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#scenic-scenic + :alt: SCENIC Pipeline + +.. |scenic_multiruns| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/scenic_multiruns/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#scenic-multiruns-scenic-multiruns-single-sample-scenic-multiruns + :alt: SCENIC Multi-runs Pipeline + +.. |single_sample_scenic_multiruns| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/single_sample_scenic_multiruns/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#scenic-multiruns-scenic-multiruns-single-sample-scenic-multiruns + :alt: Single-sample SCENIC Multi-runs Pipeline + +.. |bbknn| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/bbknn/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#bbknn-bbknn + :alt: BBKNN Pipeline + +.. |bbknn_scenic| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/bbknn_scenic/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#bbknn-scenic + :alt: BBKNN SCENIC Pipeline + +.. |harmony| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/harmony/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#harmony-harmony + :alt: Harmony Pipeline + +.. |mnncorrect| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/mnncorrect/badge.svg + :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#mnncorrect-mnncorrect + :alt: MNN-correct Pipeline + A repository of pipelines for single-cell data in Nextflow DSL2. +Do you want a quick tour of the VSN pipelines ? Please read `Quick Start `_. 
+ Full documentation available on `Read the Docs `_ This main repo contains multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools, which are organized into submodules within the VIB-Singlecell-NF_ organization. From 96a43136f05ae562cfe4b326290568fa06324fc8 Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 10:58:31 +0100 Subject: [PATCH 27/32] Update scanpy tool --- src/scanpy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/scanpy b/src/scanpy index 57e29c7b..b7424bc8 160000 --- a/src/scanpy +++ b/src/scanpy @@ -1 +1 @@ -Subproject commit 57e29c7bac5e1e1707dc58257d657611e1c57ca3 +Subproject commit b7424bc884a14d85b800571c535f701c8688dc6b From e365d4561638db653ce97feedd3569969f66c538 Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 11:01:19 +0100 Subject: [PATCH 28/32] Add missing test__bbknn_scenic profile --- nextflow.config | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/nextflow.config b/nextflow.config index 6dc4a16a..d4949bbc 100644 --- a/nextflow.config +++ b/nextflow.config @@ -233,6 +233,10 @@ profiles { includeConfig 'src/utils/conf/h5ad_concatenate.config' includeConfig 'conf/test__bbknn.config' } + test__bbknn_scenic { + includeConfig 'src/utils/conf/h5ad_concatenate.config' + includeConfig 'conf/test__bbknn_scenic.config' + } test__harmony { includeConfig 'src/utils/conf/h5ad_concatenate.config' includeConfig 'conf/test__harmony.config' From 468ad10f33a8c820b57805b732514f64f181f576 Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 11:03:20 +0100 Subject: [PATCH 29/32] Genome config is required for *scenic pipelines --- nextflow.config | 1 + 1 file changed, 1 insertion(+) diff --git a/nextflow.config b/nextflow.config index d4949bbc..60f6f391 100644 --- a/nextflow.config +++ b/nextflow.config @@ -234,6 +234,7 @@ profiles { includeConfig 'conf/test__bbknn.config' } test__bbknn_scenic { + includeConfig 'conf/genomes/hg38.config' includeConfig 
'src/utils/conf/h5ad_concatenate.config' includeConfig 'conf/test__bbknn_scenic.config' } From 7b3533346d4a5d5f3a9ca24e73f138befe2355ac Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 12:24:42 +0100 Subject: [PATCH 30/32] Use small sample data for bbknn_scenic CI --- .github/workflows/bbknn_scenic.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/bbknn_scenic.yml b/.github/workflows/bbknn_scenic.yml index 57b33bcb..05f0f824 100644 --- a/.github/workflows/bbknn_scenic.yml +++ b/.github/workflows/bbknn_scenic.yml @@ -25,8 +25,8 @@ jobs: - name: Get sample data run: | mkdir testdata - wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/sample_data_tiny.tar.gz - tar xvf sample_data_tiny.tar.gz + wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/sample_data_small.tar.gz + tar xvf sample_data_small.tar.gz cp -r sample_data testdata/sample1 mv sample_data testdata/sample2 - name: Run bbknn_scenic test From 8ec1851e813ad21be503165d593f62dc5dc175c1 Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 12:43:51 +0100 Subject: [PATCH 31/32] Update log trace in getting-started docs --- docs/getting-started.rst | 84 +++++++++++++++++++++++----------------- 1 file changed, 49 insertions(+), 35 deletions(-) diff --git a/docs/getting-started.rst b/docs/getting-started.rst index c3ca4ade..80795281 100644 --- a/docs/getting-started.rst +++ b/docs/getting-started.rst @@ -48,45 +48,59 @@ Example Output $ nextflow -C single_sample.config run vib-singlecell-nf/vsn-pipelines -entry single_sample N E X T F L O W ~ version 19.12.0-edge - Launching `vib-singlecell-nf/vsn-pipelines` [condescending_liskov] - revision: 92368248f3 [master] + Launching `/ddn1/vol1/staging/leuven/stg_00002/lcb/dwmax/documents/aertslab/GitHub/vib-singlecell-nf/main.nf` [nice_engelbart] - revision: 0096df9054 WARN: DSL 2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT -- SYNTAX MAY CHANGE IN FUTURE 
RELEASE - - [33/68d885] process > single_sample:SINGLE_SAMPLE:UTILS__GENERATE_WORKFLOW_CONFIG_REPORT [100%] 1 of 1 ✔ - [a2/dcf990] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__FILE_CONVERTER (1) [100%] 1 of 1 ✔ - [9c/dff236] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__SCANPY__COMPUTE_QC_STATS (1) [100%] 1 of 1 ✔ - [65/e1bf9f] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__SCANPY__GENE_FILTER (1) [100%] 1 of 1 ✔ - [92/faae99] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__SCANPY__CELL_FILTER (1) [100%] 1 of 1 ✔ - [52/c39d90] process > single_sample:SINGLE_SAMPLE:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT (1) [100%] 1 of 1 ✔ - [d2/b38e10] process > single_sample:SINGLE_SAMPLE:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML (1) [100%] 1 of 1 ✔ - [87/96ef4d] process > single_sample:SINGLE_SAMPLE:NORMALIZE_TRANSFORM:SC__SCANPY__NORMALIZATION (1) [100%] 1 of 1 ✔ - [b2/493705] process > single_sample:SINGLE_SAMPLE:NORMALIZE_TRANSFORM:SC__SCANPY__DATA_TRANSFORMATION (1) [100%] 1 of 1 ✔ - [69/a2a237] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:SC__SCANPY__FEATURE_SELECTION (1) [100%] 1 of 1 ✔ - [1d/0ec983] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:SC__SCANPY__FEATURE_SCALING (1) [100%] 1 of 1 ✔ - [91/11965d] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT (1) [100%] 1 of 1 ✔ - [4e/620e9e] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML (1) [100%] 1 of 1 ✔ - [fd/c6e8c5] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA (1) [100%] 1 of 1 ✔ - [32/548f80] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION:SC__SCANPY__DIM_REDUCTION__TSNE (1) [100%] 1 of 1 ✔ - [e0/9b68f3] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION:SC__SCANPY__DIM_REDUCTION__UMAP (1) [100%] 1 of 1 ✔ - [20/337908] process > 
single_sample:SINGLE_SAMPLE:DIM_REDUCTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT (1) [100%] 1 of 1 ✔ - [b9/dc2795] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML (1) [100%] 1 of 1 ✔ - [0b/42a0a3] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING (1) [100%] 1 of 1 ✔ - [3a/084e6f] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT (1) [100%] 1 of 1 ✔ - [06/6ea130] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML (1) [100%] 1 of 1 ✔ - [84/ca1672] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES (1) [100%] 1 of 1 ✔ - [db/d66797] process > single_sample:SINGLE_SAMPLE:SC__H5AD_TO_FILTERED_LOOM (1) [100%] 1 of 1 ✔ - [46/be45d7] process > single_sample:SINGLE_SAMPLE:FILE_CONVERTER:SC__H5AD_TO_LOOM (1) [100%] 1 of 1 ✔ - [78/3988ff] process > single_sample:SINGLE_SAMPLE:FILE_CONVERTER:COMPRESS_HDF5 (1) [100%] 1 of 1 ✔ - [4d/bfb133] process > single_sample:SINGLE_SAMPLE:SC__PUBLISH_H5AD (1) [100%] 1 of 1 ✔ - [9c/b5f299] process > single_sample:SINGLE_SAMPLE:SC__SCANPY__MERGE_REPORTS (1) [100%] 1 of 1 ✔ - [00/b15be5] process > single_sample:SINGLE_SAMPLE:SC__SCANPY__REPORT_TO_HTML (1) [100%] 1 of 1 ✔ - Converting 1k_pbmc_v2_chemistry.SC__SCANPY__MARKER_GENES.h5ad to 1k_pbmc_v2_chemistry.SC__SCANPY__MARKER_GENES.loom (w/ additional compression)... 
- Completed at: 22-Jan-2020 13:45:59 - Duration : 2m 38s + executor > local (59) + [0c/d33a4e] process > single_sample:SINGLE_SAMPLE:UTILS__GENERATE_WORKFLOW_CONFIG_REPORT [100%] 1 of 1 ✔ + [17/ab2b39] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__FILE_CONVERTER (1) [100%] 2 of 2 ✔ + [e4/84f688] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__SCANPY__COMPUTE_QC_STATS (2) [100%] 2 of 2 ✔ + [1b/daa1c3] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__SCANPY__GENE_FILTER (2) [100%] 2 of 2 ✔ + [fc/8653d0] process > single_sample:SINGLE_SAMPLE:QC_FILTER:SC__SCANPY__CELL_FILTER (2) [100%] 2 of 2 ✔ + [9d/ebeff9] process > single_sample:SINGLE_SAMPLE:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__GENERATE_DUAL_INPUT_REPORT (2) [100%] 2 of 2 ✔ + [87/e13dd0] process > single_sample:SINGLE_SAMPLE:QC_FILTER:GENERATE_DUAL_INPUT_REPORT:SC__SCANPY__REPORT_TO_HTML (2) [100%] 2 of 2 ✔ + [a6/867a4a] process > single_sample:SINGLE_SAMPLE:NORMALIZE_TRANSFORM:SC__SCANPY__NORMALIZATION (2) [100%] 2 of 2 ✔ + [07/8e63b1] process > single_sample:SINGLE_SAMPLE:NORMALIZE_TRANSFORM:SC__SCANPY__DATA_TRANSFORMATION (2) [100%] 2 of 2 ✔ + [c1/07c18c] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:SC__SCANPY__FIND_HIGHLY_VARIABLE_GENES (2) [100%] 2 of 2 ✔ + [e9/53e204] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:SC__SCANPY__SUBSET_HIGHLY_VARIABLE_GENES (2) [100%] 2 of 2 ✔ + [0b/e7ae8c] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:SC__SCANPY__FEATURE_SCALING (2) [100%] 2 of 2 ✔ + [5d/52236c] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT (2) [100%] 2 of 2 ✔ + [71/5d6559] process > single_sample:SINGLE_SAMPLE:HVG_SELECTION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML (2) [100%] 2 of 2 ✔ + [8c/1b4cc9] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION_PCA:SC__SCANPY__DIM_REDUCTION__PCA (2) [100%] 2 of 2 ✔ + [7b/d423f7] process > single_sample:SINGLE_SAMPLE:NEIGHBORHOOD_GRAPH:SC__SCANPY__NEIGHBORHOOD_GRAPH (2) [100%] 2 
of 2 ✔ + [9b/3a10d2] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__TSNE (2) [100%] 2 of 2 ✔ + [5f/2c6325] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION_TSNE_UMAP:SC__SCANPY__DIM_REDUCTION__UMAP (2) [100%] 2 of 2 ✔ + [ff/b5c6ef] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT (2) [100%] 2 of 2 ✔ + [b6/86bc36] process > single_sample:SINGLE_SAMPLE:DIM_REDUCTION_TSNE_UMAP:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML (2) [100%] 2 of 2 ✔ + [1a/2fec91] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:SC__SCANPY__CLUSTERING (2) [100%] 2 of 2 ✔ + [38/8a814b] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__GENERATE_REPORT (2) [100%] 2 of 2 ✔ + [35/530dcf] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:GENERATE_REPORT:SC__SCANPY__REPORT_TO_HTML (2) [100%] 2 of 2 ✔ + [05/3e201e] process > single_sample:SINGLE_SAMPLE:CLUSTER_IDENTIFICATION:SC__SCANPY__MARKER_GENES (2) [100%] 2 of 2 ✔ + [04/ad44c6] process > single_sample:SINGLE_SAMPLE:SC__H5AD_TO_FILTERED_LOOM (2) [100%] 2 of 2 ✔ + [46/47cac6] process > single_sample:SINGLE_SAMPLE:FILE_CONVERTER:SC__H5AD_TO_LOOM (2) [100%] 2 of 2 ✔ + [33/640ffa] process > single_sample:SINGLE_SAMPLE:FILE_CONVERTER:COMPRESS_HDF5 (2) [100%] 2 of 2 ✔ + [77/87b596] process > single_sample:SINGLE_SAMPLE:SC__PUBLISH_H5AD (2) [100%] 2 of 2 ✔ + [61/82bf98] process > single_sample:SINGLE_SAMPLE:SC__SCANPY__MERGE_REPORTS (1) [100%] 2 of 2 ✔ + [5a/26ce75] process > single_sample:SINGLE_SAMPLE:SC__SCANPY__REPORT_TO_HTML (2) [100%] 2 of 2 ✔ + + ------------------------------------------------------------------ + Converting 1k_pbmc_v2_chemistry.SC__SCANPY__MARKER_GENES.h5ad to 1k_pbmc_v2_chemistry.SC__SCANPY__MARKER_GENES.loom + (w/ additional compression)... 
+ ------------------------------------------------------------------ + + + ------------------------------------------------------------------ + Converting 1k_pbmc_v3_chemistry.SC__SCANPY__MARKER_GENES.h5ad to 1k_pbmc_v3_chemistry.SC__SCANPY__MARKER_GENES.loom + (w/ additional compression)... + ------------------------------------------------------------------ + + WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info. + Completed at: 25-Feb-2020 12:31:44 + Duration : 2m 15s CPU hours : 0.1 - Succeeded : 28 + Succeeded : 59 -The pipelines will generate 3 types of results in the output directory (`params.global.outdir`), by default `out/` +The pipelines will generate 3 types of results in the output directory (`params.global.outdir`), by default ``out/`` - ``data``: contains the workflow output file (in h5ad format), plus symlinks to all the intermediate files. - ``loom``: contains final loom files which can be imported inside SCope visualization tool for further visualization of the results. 
From 27cb08217794ccb90d2fb4ea19b575e2a33ec3af Mon Sep 17 00:00:00 2001 From: dweemx Date: Tue, 25 Feb 2020 12:46:08 +0100 Subject: [PATCH 32/32] Update test__bbknn_scenic config Use test_TFs_small instead of test_TFs_tiny --- conf/test__bbknn_scenic.config | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/conf/test__bbknn_scenic.config b/conf/test__bbknn_scenic.config index 566803d9..bae74fd4 100644 --- a/conf/test__bbknn_scenic.config +++ b/conf/test__bbknn_scenic.config @@ -27,10 +27,9 @@ params { } } scenic { - filteredLoom = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/expr_mat_tiny.loom' numWorkers = 2 grn { - tfs = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/test_TFs_tiny.txt' + tfs = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/test_TFs_small.txt' } cistarget { motifsDb = 'https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/genome-ranking.feather'