Add proteus module for maxquant data analysis #147

WackerO · 2023-07-13T10:39:51Z

This PR adds the proteus module to differentialabundance. It allows importing proteomics measurements from MaxQuant, which can then be analyzed with the limma module.

PR checklist

This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
If necessary, also make a PR on the nf-core/differentialabundance branch on the nf-core/test-datasets repository.
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
Usage Documentation in docs/usage.md is updated.
Output Documentation in docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).

github-actions · 2023-07-18T07:28:06Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit f26a224

+| ✅ 159 tests passed       |+
!| ❗   1 tests had warnings |!

❗ Test warnings:

pipeline_todos - TODO string in WorkflowDifferentialabundance.groovy: Optionally add in-text citation tools to this list.

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-differentialabundance_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-differentialabundance_logo_light.png
files_exist - File found: docs/images/nf-core-differentialabundance_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowDifferentialabundance.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-differentialabundance_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 1.3.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-differentialabundance_logo_light.png matches the template
files_unchanged - docs/images/nf-core-differentialabundance_logo_light.png matches the template
files_unchanged - docs/images/nf-core-differentialabundance_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreTemplate.groovy matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
files_unchanged - pyproject.toml matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
readme - README Zenodo placeholder was replaced with DOI.
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (122 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: release-announcments.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.10
Run at 2023-10-09 13:46:58

pinin4fjords

A couple of initial comments while I look at the rest.

pinin4fjords · 2023-08-25T13:51:43Z

assets/differentialabundance_report.Rmd

+  features_log2_assays <- ""
+}
+
+if (is.null(features_log2_assays)) {


This is repeated code from shinyngs; we should factor that out. I'll make a PR for it.

pinin4fjords · 2023-09-14T20:31:46Z

@WackerO the function for this is now available in the latest shinyngs, you just have to update the container for the notebook.

pinin4fjords · 2023-10-09T08:29:58Z

I still think we should use the factored-out conditional logging function here, so I've unresolved this.

WackerO · 2023-10-09

Ah of course, I just had to update shinyngs for the rmarkdown report for that function to be found. For now I have added the [] removal in the report; maybe in a future release that part can be integrated into shiny as well.

workflows/differentialabundance.nf

pinin4fjords

I'm having a bit of trouble with this, in that you're duplicating a lot of code from the wider pipeline. We should be able to process conditionally in certain places, e.g. in how the matrices are generated and which differential method is called, but basically follow the same structure.

e.g. CUSTOM_MATRIXFILTER should only really need to be called in one place. It should be possible to validate everything the same way, since we need to check the same things, i.e. that samples, matrices and metadata are compatible.

We should be able to handle this similarly to arrays. I'll take a look and see if I can make some more detailed suggestions, but maybe that is enough to start with.

pinin4fjords · 2023-09-14T21:27:49Z

workflows/differentialabundance.nf

+        ch_contrasts_split = ch_contrasts_file
+            .splitCsv ( header:true, sep:(params.contrasts.endsWith('tsv') ? '\t' : ','))
+            .map{ it.tail().first() }
+
+        // For proteus, extract only meta and contrast variable
+        ch_contrasts_proteus = ch_contrasts_split
+            .map{
+                tuple(
+                    exp_meta,       // meta map
+                    it.variable     // contrast variable
+                )
+            }


Suggested change

    ch_contrasts_split = ch_contrasts_file
        .splitCsv ( header:true, sep:(params.contrasts.endsWith('tsv') ? '\t' : ','))
        .map{ it.tail().first() }

    // For proteus, extract only meta and contrast variable
    ch_contrasts_proteus = ch_contrasts_split
        .map{
            tuple(
                exp_meta,       // meta map
                it.variable     // contrast variable
            )
        }

So, proteus will run once for every contrast variable? How does the contrast variable impact on the quantifications?

If it does, why does the proteus module return results keyed by the meta, rather than the meta2 corresponding to the contrast variable?

I think we can simplify this chunk a lot, but I need to understand what's going on better first.

WackerO · 2023-10-04

Ooop, good point, I did indeed have to change that part of the code. Proteus runs once for each contrast variable to generate the plots. To the best of my knowledge, the quant tables do not differ between contrasts (but just to make sure, I opened an issue and asked the developers). For that reason, I added a reduce so that the rest of the pipeline runs for only one of the tables.

I also changed the input channels to proteus (they were unnecessarily complicated, as I realized) and the meta accordingly after adding a second contrast to the test dataset. I had a look at the output and it looks good. The report is attached just in case you are curious, but it's nothing spectacular.

Btw, I found that, apparently, RMD parameters that contain capital letters cause a weird bug during knitting; I had to rename some params. Just as a hint for future coding.

PXD043349.html.zip
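
The contrast handling discussed in this thread can be illustrated outside Nextflow. The following is a minimal Python sketch, not the pipeline's code: it picks the separator from the file name the same way the quoted ternary does, then pairs a shared meta map with each row's `variable` column. The function names, the example table, and every column name other than `variable` are assumptions made for illustration.

```python
import csv
import io


def delimiter_for(filename):
    """Tab for names ending in 'tsv', comma otherwise (same rule as the quoted ternary)."""
    return "\t" if filename.endswith("tsv") else ","


def contrast_variables(filename, text, meta):
    """Pair a shared meta map with the 'variable' column of each contrast row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter_for(filename))
    return [(meta, row["variable"]) for row in reader]


# Hypothetical contrasts table; only the 'variable' column is taken from the thread.
example = "id,variable,reference,target\nc1,treatment,ctrl,drug\nc2,batch,b1,b2\n"
pairs = contrast_variables("contrasts.csv", example, {"id": "study"})
```

Each element of `pairs` corresponds to one Proteus invocation in the scheme described above, which is why a test dataset with two contrasts produces two sets of plots.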

pinin4fjords

Getting there, just a few more simplifications to facilitate future maintenance of these changes.

pinin4fjords · 2023-10-09T09:08:02Z

workflows/differentialabundance.nf

+    } else if (params.study_type == 'maxquant'){
+
+        // For maxquant, we will use the processed matrices from PROTEUS
+        ch_features = ch_in_norm


This is very similar to what happens in the final part of this conditional, so let's combine the two rather than duplicating what happens there. Right now the last bit is:

    // Otherwise we can just use the matrix input
    matrix_as_anno_filename = "matrix_as_anno.${matrix_file.getExtension()}"
    matrix_file.copyTo(matrix_as_anno_filename)
    ch_features = Channel.of([ exp_meta, file(matrix_as_anno_filename)])

But you could do something like:

    // Otherwise we can just use the matrix input
    matrix_as_anno_filename = "matrix_as_anno.${matrix_file.getExtension()}"
    if (params.study_type == 'maxquant'){
        ch_features_matrix = ch_in_norm
    } else{
        ch_features_matrix = ch_in_raw
    }
    ch_features = ch_features_matrix
        .map{ exp_meta, matrix_file ->
            matrix_file.copyTo(matrix_as_anno_filename)
            return [exp_meta, file(matrix_as_anno_filename)]
        }

TIL that the inside of the map is not separate from the rest of the code. I had to rename exp_meta, matrix_file and return [exp_meta to not receive this error:
- cause: The current scope already contains a variable of the name exp_meta

yep, my bad

Ah, and I also added the workdir to matrix_as_anno_filename = "${workflow.workDir}/matrix_as_anno.${matrix_file.getExtension()}"; otherwise the matrix_as_anno file is just created wherever the user happens to be while running the pipeline.
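
The point about the work directory is that a bare file name resolves against the current directory, while a path built from a fixed base does not. A small Python analogue, assuming a hypothetical helper `anno_path` and made-up paths (this is not the Nextflow code from the PR):

```python
from pathlib import Path


def anno_path(work_dir, matrix_file):
    """Build the copy's path under work_dir, keeping the matrix file's extension."""
    # Path.suffix includes the leading dot, so strip it before re-adding one.
    ext = Path(matrix_file).suffix.lstrip(".")
    return Path(work_dir) / f"matrix_as_anno.{ext}"


# Hypothetical inputs: the result no longer depends on where the command is launched.
target = anno_path("/tmp/work", "data/proteinGroups.tsv")
```

With the bare name `matrix_as_anno.tsv` the copy would land in whatever directory the process was started from, which is the behaviour the comment above describes.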

pinin4fjords · 2023-10-09T09:26:46Z

workflows/differentialabundance.nf

+        ch_in_raw = PROTEUS.out.raw_tab
+            .reduce{a, b -> a}
+            .map{tuple('id': exp_meta.id, it[1])}
+        ch_in_norm = PROTEUS.out.norm_tab
+            .reduce{a, b -> a}
+            .map{tuple('id': exp_meta.id, it[1])}


Suggested change

-        ch_in_raw = PROTEUS.out.raw_tab
-            .reduce{a, b -> a}
-            .map{tuple('id': exp_meta.id, it[1])}
-        ch_in_norm = PROTEUS.out.norm_tab
-            .reduce{a, b -> a}
-            .map{tuple('id': exp_meta.id, it[1])}
+        ch_in_raw = PROTEUS.out.raw_tab
+            .first()
+            .map{ meta, matrix -> tuple(exp_meta, matrix) }
+        ch_in_norm = PROTEUS.out.norm_tab
+            .first()
+            .map{ meta, matrix -> tuple(exp_meta, matrix) }

I think this is much simpler to understand.

Oop, I did not realize first() has the same effect as my reduce() snippet. Thanks!
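
As a rough illustration of why the two forms agree, here is a Python analogue (not Nextflow code): folding a sequence with a function that always keeps its left argument returns the first element, which is what taking the first item does directly. The names `first_via_reduce`, `first_directly`, and `tables` are hypothetical, and a Python list only approximates a Nextflow channel.

```python
from functools import reduce


def first_via_reduce(items):
    """Fold that always keeps the left argument, mirroring reduce{a, b -> a}."""
    return reduce(lambda a, b: a, items)


def first_directly(items):
    """Take the first element, mirroring first()."""
    return next(iter(items))


# Hypothetical stand-ins for the (meta, table) pairs emitted once per contrast variable.
tables = [("contrast_a", "norm.tsv"), ("contrast_b", "norm.tsv")]
assert first_via_reduce(tables) == first_directly(tables)
```

The fold still walks every element before returning, whereas taking the first item states the intent directly, which matches the reviewer's point about readability.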

pinin4fjords

OK, think we're good, much better integrated now, thanks for the work.

WackerO · 2023-10-10T05:40:25Z

Thanks for the review!

WackerO added 7 commits June 15, 2023 07:45

Saving some progress, REMOVE pxnotebook_env.yml AND Dockerfilegit add modules/nf-core/proteus/ pxnotebook_env.yml Dockerfile!

9c32190

progress save for Px

5356b0c

progress save

ca87472

Cleaning up some changes

7f9feb8

changed proteus configs

7757ba3

Installed and integrated proteus

2a7a5ee

Merge branch 'dev' of https://github.com/nf-core/differentialabundance into add_proteus

0cb2685

WackerO added 4 commits July 18, 2023 09:39

prettier, removed an unnecessary file

80f5ad0

Added missing config

e4395bd

linting

27a3123

Undid some accidental changes

7013352

WackerO marked this pull request as ready for review July 20, 2023 11:56

WackerO marked this pull request as draft July 20, 2023 13:43

WackerO added 6 commits August 10, 2023 11:28

Updated docs, fixed bugs with proteus integration, separated more clearly the workflow parts necessary for proteus from those not necessary

abce801

Merge branch 'dev' of https://github.com/nf-core/differentialabundance into add_proteus; checking pipeline functionality

9b981e2

Changed list format of --features_log2_assays

4b8d5bc

Some fixes of log2_assays

c844121

Removed process def from test_maxquant.config as it is not anymore necessary

41e8cda

More cleanup, comment/docu changes

b71d3ea

WackerO marked this pull request as ready for review August 24, 2023 12:57

WackerO added 4 commits August 24, 2023 15:22

Updated output doc

a6f723f

Fixed NULL being printed in the report.html

3078c8a

prettier

efe8561

Merge branch 'dev' of https://github.com/nf-core/differentialabundance into add_proteus

043e8e3

pinin4fjords requested changes Aug 25, 2023

WackerO added 2 commits August 28, 2023 12:43

made workflow more similar to previous version

cee6aba

Merge branch 'dev' of https://github.com/nf-core/differentialabundance into add_proteus

4d876af

pinin4fjords reviewed Sep 14, 2023

WackerO and others added 6 commits October 2, 2023 15:22

Pipeline finally runs again after changing proteus

6b7a019

Added proteus params table to report, renamed some params

70ad1c9

updated proteus module

b6ed728

Merge branch 'dev' of https://github.com/nf-core/differentialabundance into add_proteus

902b8ce

Merge branch 'dev' of https://github.com/nf-core/differentialabundance into add_proteus

c05917a

Merge branch 'dev' into add_proteus

f62cc62

pinin4fjords requested changes Oct 9, 2023

WackerO and others added 9 commits October 9, 2023 12:49

Update docs/usage.md

a73d256

Co-authored-by: Jonathan Manning

Update workflows/differentialabundance.nf

3e6c8fe

Co-authored-by: Jonathan Manning

Update workflows/differentialabundance.nf

d4751ce

Co-authored-by: Jonathan Manning

Working on review

42a4acc

Merge branch 'add_proteus' of https://github.com/WackerO/differentialabundance into add_proteus

8266f2b

Finished the review changes, pipeline runs locally

fa15274

Restored shinyngs module

b103a55

module updates

7007965

Cleanup, updated changelog, fixed output docs

f26a224

pinin4fjords approved these changes Oct 9, 2023

WackerO merged commit c9d5328 into nf-core:dev Oct 10, 2023
14 checks passed

WackerO deleted the add_proteus branch October 10, 2023 05:43

pinin4fjords added this to the 1.3.0 milestone Oct 10, 2023

WackerO mentioned this pull request Oct 10, 2023

Add module(s) for proteomics analysis #17

Closed
