
Update nf-core modules with new nftests #420

Draft · wants to merge 4 commits into dev from update_modules_nftests

Conversation

nschcolnicov

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/differentialabundance branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test main.nf.test -profile test,docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.0.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.


github-actions bot commented Jan 15, 2025

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit ae94a44

| ✅ 300 tests passed       |
| ❔   6 tests were ignored |
| ❗   4 tests had warnings |

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2025-01-17 18:45:00

@nschcolnicov
Author

Need to update all of the modules from this closed PR once they are merged in the modules repo PR.

@nschcolnicov force-pushed the update_modules_nftests branch from 6c463c1 to ae94a44 on January 17, 2025 18:43
@nschcolnicov
Author

I'm looking into possible reasons why results differ between the CI tests and my local environment, specifically in the GSEA module. I saw that the default value for the random seed parameter "params.gsea_rnd_seed" is "timestamp". @pinin4fjords, why are we using this value instead of a fixed integer? Wouldn't it make results non-reproducible?

Comment on lines +79 to +115
# Rename files so that they can be properly referenced by the output channels
# Function to rename files based on the given pattern
rename_files() {
    local pattern=\$1
    local exclude_patterns=\$2
    local extension=\$3

    # Find files matching the pattern but not matching the exclusion patterns
    find . -type f -name "\$pattern" | while read -r file; do
        # Exclude files based on the provided exclusion patterns
        if ! echo "\$file" | grep -qE "\$exclude_patterns"; then
            # Rename the file by adding the prefix "gene_sets_"
            mv "\$file" "\$(dirname "\$file")/gene_sets_\$(basename "\$file")"
        fi
    done
}

# Pattern and exclusion for .tsv files
tsv_pattern="*.tsv"
tsv_exclude="gene_set_size|gsea_report|ranked_gene_list"

# Pattern and exclusion for .html files
html_pattern="*.html"
html_exclude="gsea_report|heat_map_corr_plot|index|pos_snapshot|neg_snapshot"

# Pattern and exclusion for .png files
png_pattern="*.png"
png_exclude="butterfly|enplot|global_es_histogram|gset_rnd_es_dist|heat_map|pvalues_vs_nes_plot|ranked_list_corr"

# Rename .tsv files
rename_files "\$tsv_pattern" "\$tsv_exclude" ".tsv"

# Rename .html files
rename_files "\$html_pattern" "\$html_exclude" ".html"

# Rename .png files
rename_files "\$png_pattern" "\$png_exclude" ".png"
Member

@pinin4fjords commented Jan 20, 2025


Suggested change
# Rename files so that they can be properly referenced by the output channels
# Function to rename files based on the given pattern
rename_files() {
    local pattern=\$1
    local exclude_patterns=\$2
    local extension=\$3
    # Find files matching the pattern but not matching the exclusion patterns
    find . -type f -name "\$pattern" | while read -r file; do
        # Exclude files based on the provided exclusion patterns
        if ! echo "\$file" | grep -qE "\$exclude_patterns"; then
            # Rename the file by adding the prefix "gene_sets_"
            mv "\$file" "\$(dirname "\$file")/gene_sets_\$(basename "\$file")"
        fi
    done
}
# Pattern and exclusion for .tsv files
tsv_pattern="*.tsv"
tsv_exclude="gene_set_size|gsea_report|ranked_gene_list"
# Pattern and exclusion for .html files
html_pattern="*.html"
html_exclude="gsea_report|heat_map_corr_plot|index|pos_snapshot|neg_snapshot"
# Pattern and exclusion for .png files
png_pattern="*.png"
png_exclude="butterfly|enplot|global_es_histogram|gset_rnd_es_dist|heat_map|pvalues_vs_nes_plot|ranked_list_corr"
# Rename .tsv files
rename_files "\$tsv_pattern" "\$tsv_exclude" ".tsv"
# Rename .html files
rename_files "\$html_pattern" "\$html_exclude" ".html"
# Rename .png files
rename_files "\$png_pattern" "\$png_exclude" ".png"
# Prefix gene_set files
for e in \
    tsv:gene_set_size\|gsea_report\|ranked_gene_list \
    html:gsea_report\|heat_map_corr_plot\|index\|pos_snapshot\|neg_snapshot \
    png:butterfly\|enplot\|global_es_histogram\|gset_rnd_es_dist\|heat_map\|pvalues_vs_nes_plot\|ranked_list_corr
do
    IFS=: read -r ext exclude <<<"\$e"
    find . -type f -name "*.\$ext" | grep -Ev "\$exclude" | while IFS= read -r f
    do
        mv "\$f" "\$(dirname "\$f")/gene_sets_\$(basename "\$f")"
    done
done

Could have been done a bit more concisely, along these lines? Not tested
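Stripped of the Nextflow `\$` escaping, the consolidated loop can be exercised as plain bash against a throwaway directory. This is just a sketch to check the exclusion logic; the file names below are invented for illustration and are not from the pipeline's actual output:

```shell
#!/usr/bin/env bash
# Sketch: standalone version of the consolidated rename loop, run against
# dummy files. Names like pathway_a.tsv are made up for illustration only.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"
touch pathway_a.tsv gsea_report_x.tsv summary_d.html index.html plot_b.png enplot_c.png

for e in \
    'tsv:gene_set_size|gsea_report|ranked_gene_list' \
    'html:gsea_report|heat_map_corr_plot|index|pos_snapshot|neg_snapshot' \
    'png:butterfly|enplot|global_es_histogram|gset_rnd_es_dist|heat_map|pvalues_vs_nes_plot|ranked_list_corr'
do
    IFS=: read -r ext exclude <<<"$e"
    # Prefix every matching file that is not on the exclusion list
    find . -type f -name "*.$ext" | grep -Ev "$exclude" | while IFS= read -r f; do
        mv "$f" "$(dirname "$f")/gene_sets_$(basename "$f")"
    done
done
```

Files on an exclusion list (e.g. `gsea_report_x.tsv`, `index.html`, `enplot_c.png`) keep their names; everything else gains the `gene_sets_` prefix.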

Member


Even though longer, I would argue that the previous version is a lot more readable. But not the hill I'm going to die on.

Member

@pinin4fjords left a comment


Seems to be a lot of work here, nice. But I have a couple of concerns that may need feeding back through nf-core/modules:

  • We shouldn't unconditionally set the random seed on tools that use it. That's changing the author's designed default behaviour. We can set seeds for reproducibility in testing, but that should be the limit.
  • We should only pin the main dependency. Anything else gives us headaches down the line. There is upcoming functionality in nf-core to use Conda lockfiles for reproducibility, but that is not what environment.ymls are for (unless you have a very good reason for those I'm not aware of).

@@ -2,4 +2,9 @@ channels:
   - conda-forge
   - bioconda
 dependencies:
-  - conda-forge::coreutils=8.30
+  - conda-forge::coreutils=9.5
Member


We should not be pinning this many dependencies in the environment.yml; this is not a lockfile.

@@ -2,4 +2,9 @@ channels:
   - conda-forge
   - bioconda
 dependencies:
-  - conda-forge::coreutils=8.30
+  - conda-forge::coreutils=9.5
Member


Again, too many deps pinned
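A minimal environment.yml along the lines being asked for here would pin only the main dependency and let everything else resolve freely. This is a sketch only; the bioconda package name and version shown are illustrative, not taken from this PR:

```yaml
channels:
  - conda-forge
  - bioconda
dependencies:
  # only the main tool is pinned; transitive deps resolve freely
  - bioconda::gsea=4.3.2
```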

def chip_command = chip ? "-chip $chip -collapse true" : ''
def VERSION = '4.3.2' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
if (!(args ==~ /.*-rnd_seed.*/)) {args += " -rnd_seed 10"}
Member


We should not change the default behaviour of underlying software unless we have to.

Set this for tests, but not here.
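One way to scope the seed to tests only would be an ext.args override in the test config, rather than in the module script. A sketch, assuming the process selector is `GSEA_GSEA` (the selector name is an assumption, not taken from this PR):

```groovy
// conf/test.config (sketch): fix the GSEA seed for tests only,
// leaving the tool's default behaviour untouched in normal runs
process {
    withName: 'GSEA_GSEA' {
        ext.args = '-rnd_seed 10'
    }
}
```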


cat <<-END_VERSIONS > versions.yml
"${task.process}":
r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//')
Member


We should actually change this, and the template, to not report the R version. This breaks every time Bioconda updates its R (which I only came to appreciate later on).

In general we should only pin the main dependency, and only report the version of that dependency.

@grst
Member

grst commented Jan 20, 2025

We shouldn't unconditionally set the random seed on tools that use it. That's changing the author's designed default behaviour. We can set seeds for reproducibility in testing, but that should be the limit.

I would argue that in most of the cases it's not "designed default behavior" but rather neglect. If we care about reproducibility of the pipeline we should set the seed if a tool doesn't provide stable output by itself.

Agree with the rest.

@pinin4fjords
Member

I would argue that in most of the cases it's not "designed default behavior" but rather neglect. If we care about reproducibility of the pipeline we should set the seed if a tool doesn't provide stable output by itself.

Sorry, I really strongly disagree with that. The lack of reproducibility is 'real' in tools like this. Setting the seed makes things reproducible, but it's only an illusion. The fact is that there's a simulation going on which means you should get a slightly different answer each time. For example, if a user gets a 'significant' result one time that goes away next time they run, that's a genuine reflection on the reliability of the result, not a consequence of neglect.

What if someone actually wanted to examine the impact of the simulations on reliability? It's not for us to fix the seed and make that harder.

In short, this is a hill I will die on ;-)

@nschcolnicov
Author

I would argue that in most of the cases it's not "designed default behavior" but rather neglect. If we care about reproducibility of the pipeline we should set the seed if a tool doesn't provide stable output by itself.

Sorry, I really strongly disagree with that. The lack of reproducibility is 'real' in tools like this. Setting the seed makes things reproducible, but it's only an illusion. The fact is that there's a simulation going on which means you should get a slightly different answer each time. For example, if a user gets a 'significant' result one time that goes away next time they run, that's a genuine reflection on the reliability of the result, not a consequence of neglect.

What if someone actually wanted to examine the impact of the simulations on reliability? It's not for us to fix the seed and make that harder.

In short, this is a hill I will die on ;-)

Thank you for your comments @grst @pinin4fjords. Regarding the seed debate, it is worth noting that the behavior of the module within the pipeline hasn't changed. While it is true that, in the context of the module, this line will set a default seed whenever the user doesn't set a seed value through ext.args, in the context of the pipeline nothing has changed, since the random seed value is already set by default in modules.config via gsea_rnd_seed to the value timestamp.
This issue should be created in the modules repository, and we can update the pipeline once that has been resolved.

Following up on this debate, I think that having a random seed set by default whenever the user disregards this argument makes sense and will ensure reproducibility, which is the behavior I think most people are looking for. If they are looking for more flexibility, the seed value can easily be modified by the user. In any case, I think it would be best to leave the module as is and update the documentation.

Let me know your thoughts!

@pinin4fjords
Member

Let me know your thoughts!

See above ;-). As I say, I strongly think we shouldn't be setting random seeds as default.

Setting the seed does not make this tool reproducible in a meaningful sense. To be a little bit reductive, it's like asking someone to throw a die but telling them to throw a 6. As such, it's appropriate to set in testing but not as a matter of course. It's actively misleading if people keep getting the same answer, even with small numbers of simulations, purely because they're setting a random seed. It's just wrong.

It's not about flexibility: if people are expecting consistent results from a method based on random simulations, then I'm afraid they don't understand how that method works. A user doesn't expect to have to do something to an nf-core module to restore what should be its default behaviour.

Could you just quickly take that out at the module level please? Then we can change it here before merge.

@grst
Member

grst commented Jan 21, 2025

This discussion probably doesn't lead anywhere, but I would still like to throw in that for me, this is about provenance and documentation of how one arrives at a certain result.

Say I publish a table with DE results and write that I used nextflow run nf-core/differentialabundance -r X.X.X with a bunch of parameters. If there's a fixed seed, someone can run this themselves and get exactly the same table. Without the seed, they will get a different result, so I could just as well have made the table up (or tweaked it a little bit to make my favorite gene significant).

EDIT: A solution that would satisfy both would be to choose a random seed and report it. But this would add additional complexity to the pipeline code.

@pinin4fjords
Member

pinin4fjords commented Jan 21, 2025

This discussion probably doesn't lead anywhere, but I would still like to throw in that for me, this is about provenance and documentation of how one arrives at a certain result.

Say I publish a table with DE results and write that I used nextflow run nf-core/differentialabundance -r X.X.X with a bunch of parameters. If there's a fixed seed, someone can run this themselves and get exactly the same table. Without the seed, they will get a different result, so I could just as well have made the table up (or tweaked it a little bit to make my favorite gene significant).

Yep, you absolutely should set the random seed in that situation (and state the seed set). But you should do it actively, and it shouldn't be the default.

@grst
Member

grst commented Jan 21, 2025

Yep, you absolutely should set the random seed in that situation (and state the seed set). But you should do it actively, and it shouldn't be the default.

So I propose that the pipeline gets a global seed parameter that, if enabled, sets a seed to all applicable methods. I can open a separate ticket for that.
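Such an opt-in global seed could look something like the following hypothetical sketch; the parameter name and process selector are invented, not part of this PR:

```groovy
// nextflow.config (sketch): an opt-in global seed, unset by default
params.seed = null

process {
    withName: 'GSEA_GSEA' {
        // only pass a seed when the user explicitly asked for one
        ext.args = { params.seed != null ? "-rnd_seed ${params.seed}" : '' }
    }
}
```

With `params.seed` left at null, the tool's own default behaviour is preserved; setting `--seed 10` on the command line would pin the seed everywhere the selector applies.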
