diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 00000000..7fe55006 --- /dev/null +++ b/.gitattributes @@ -0,0 +1 @@ +*.config linguist-language=nextflow diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md new file mode 100644 index 00000000..482e87bf --- /dev/null +++ b/.github/CONTRIBUTING.md @@ -0,0 +1,47 @@ +# nf-core/bamtofastq: Contributing Guidelines + +Hi there! Many thanks for taking an interest in improving nf-core/bamtofastq. + +We try to manage the required tasks for nf-core/bamtofastq using GitHub issues; you probably came to this page when creating one. Please use the pre-filled template to save time. + +However, don't be put off by this template - other more general issues and suggestions are welcome! Contributions to the code are even more welcome ;) + +> If you need help using or modifying nf-core/bamtofastq then the best place to ask is on the pipeline channel on [Slack](https://nf-co.re/join/slack/). + + + +## Contribution workflow +If you'd like to write some code for nf-core/bamtofastq, the standard workflow +is as follows: + +1. Check that there isn't already an issue about your idea in the + [nf-core/bamtofastq issues](https://github.com/nf-core/bamtofastq/issues) to avoid + duplicating work. + * If there isn't one already, please create one so that others know you're working on this +2. Fork the [nf-core/bamtofastq repository](https://github.com/nf-core/bamtofastq) to your GitHub account +3. Make the necessary changes / additions within your forked repository +4. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged. + +If you're not used to this workflow with git, you can start with some [basic docs from GitHub](https://help.github.com/articles/fork-a-repo/) or even their [excellent interactive tutorial](https://try.github.io/). + + +## Tests +When you create a pull request with changes, [Travis CI](https://travis-ci.org/) will run automatic tests. 
+Typically, pull-requests are only fully reviewed when these tests are passing, though of course we can help out before then. + +There are typically two types of tests that run: + +### Lint Tests +nf-core has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to. +To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint ` command. + +If any failures or warnings are encountered, please follow the listed URL for more documentation. + +### Pipeline Tests +Each nf-core pipeline should be set up with a minimal set of test-data. +Travis CI then runs the pipeline on this data to ensure that it exits successfully. +If there are any failures then the automated tests fail. +These tests are run both with the latest available version of Nextflow and also the minimum required version that is stated in the pipeline code. + +## Getting help +For further information/help, please consult the [nf-core/bamtofastq documentation](https://github.com/nf-core/bamtofastq#documentation) and don't hesitate to get in touch on the [nf-core/bamtofastq pipeline channel](https://nfcore.slack.com/channels/nf-core/bamtofastq) on [Slack](https://nf-co.re/join/slack/). diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 00000000..6cab6224 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,31 @@ +Hi there! + +Thanks for telling us about a problem with the pipeline. Please delete this text and anything that's not relevant from the template below: + +#### Describe the bug +A clear and concise description of what the bug is. + +#### Steps to reproduce +Steps to reproduce the behaviour: +1. Command line: `nextflow run ...` +2. 
See error: _Please provide your error message_ + +#### Expected behaviour +A clear and concise description of what you expected to happen. + +#### System: + - Hardware: [e.g. HPC, Desktop, Cloud...] + - Executor: [e.g. slurm, local, awsbatch...] + - OS: [e.g. CentOS Linux, macOS, Linux Mint...] + - Version: [e.g. 7, 10.13.6, 18.3...] + +#### Nextflow Installation: + - Version: [e.g. 0.31.0] + +#### Container engine: + - Engine: [e.g. Conda, Docker or Singularity] + - version: [e.g. 1.0.0] + - Image tag: [e.g. nfcore/bamtofastq:1.0.0] + +#### Additional context +Add any other context about the problem here. diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md new file mode 100644 index 00000000..1f025b77 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -0,0 +1,16 @@ +Hi there! + +Thanks for suggesting a new feature for the pipeline! Please delete this text and anything that's not relevant from the template below: + +#### Is your feature request related to a problem? Please describe. +A clear and concise description of what the problem is. +Ex. I'm always frustrated when [...] + +#### Describe the solution you'd like +A clear and concise description of what you want to happen. + +#### Describe alternatives you've considered +A clear and concise description of any alternative solutions or features you've considered. + +#### Additional context +Add any other context about the feature request here. diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 00000000..4c439ee4 --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,15 @@ +Many thanks for contributing to nf-core/bamtofastq! + +Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested on pull requests (PRs). 
+ +## PR checklist + - [ ] This comment contains a description of changes (with reason) + - [ ] If you've fixed a bug or added code that should be tested, add tests! + - [ ] If necessary, also make a PR on the [nf-core/bamtofastq branch on the nf-core/test-datasets repo](https://github.com/nf-core/test-datasets/pull/new/nf-core/bamtofastq) + - [ ] Ensure the test suite passes (`nextflow run . -profile test,docker`). + - [ ] Make sure your code lints (`nf-core lint .`). + - [ ] Documentation in `docs` is updated + - [ ] `CHANGELOG.md` is updated + - [ ] `README.md` is updated + +**Learn more about contributing:** https://github.com/nf-core/bamtofastq/tree/master/.github/CONTRIBUTING.md diff --git a/.github/markdownlint.yml b/.github/markdownlint.yml new file mode 100644 index 00000000..e052a635 --- /dev/null +++ b/.github/markdownlint.yml @@ -0,0 +1,9 @@ +# Markdownlint configuration file +default: true +line-length: false +no-multiple-blanks: 0 +blanks-around-headers: false +blanks-around-lists: false +header-increment: false +no-duplicate-header: + siblings_only: true diff --git a/.gitignore b/.gitignore new file mode 100644 index 00000000..5b54e3e6 --- /dev/null +++ b/.gitignore @@ -0,0 +1,7 @@ +.nextflow* +work/ +data/ +results/ +.DS_Store +tests/test_data +*.pyc diff --git a/.travis.yml b/.travis.yml new file mode 100644 index 00000000..ca06d704 --- /dev/null +++ b/.travis.yml @@ -0,0 +1,42 @@ +sudo: required +language: python +jdk: openjdk8 +services: docker +python: '3.6' +cache: pip +matrix: + fast_finish: true + +before_install: + # PRs to master are only ok if coming from dev branch + - '[ $TRAVIS_PULL_REQUEST = "false" ] || [ $TRAVIS_BRANCH != "master" ] || ([ $TRAVIS_PULL_REQUEST_SLUG = $TRAVIS_REPO_SLUG ] && ([ $TRAVIS_PULL_REQUEST_BRANCH = "dev" ] || [ $TRAVIS_PULL_REQUEST_BRANCH = "patch" ]))' + # Pull the docker image first so the test doesn't wait for this + - docker pull nfcore/bamtofastq:dev + # Fake the tag locally so that the pipeline runs 
properly + # Looks weird when this is :dev to :dev, but makes sense when testing code for a release (:dev to :1.0.1) + - docker tag nfcore/bamtofastq:dev nfcore/bamtofastq:dev + +install: + # Install Nextflow + - mkdir /tmp/nextflow && cd /tmp/nextflow + - wget -qO- get.nextflow.io | bash + - sudo ln -s /tmp/nextflow/nextflow /usr/local/bin/nextflow + # Install nf-core/tools + - pip install --upgrade pip + - pip install nf-core + # Reset + - mkdir ${TRAVIS_BUILD_DIR}/tests && cd ${TRAVIS_BUILD_DIR}/tests + # Install markdownlint-cli + - sudo apt-get install npm && npm install -g markdownlint-cli + +env: + - NXF_VER='0.32.0' # Specify a minimum NF version that should be tested and work + - NXF_VER='' # Plus: get the latest NF version and check that it works + +script: + # Lint the pipeline code + - nf-core lint ${TRAVIS_BUILD_DIR} + # Lint the documentation + - markdownlint ${TRAVIS_BUILD_DIR} -c ${TRAVIS_BUILD_DIR}/.github/markdownlint.yml + # Run the pipeline with the test profile + - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 00000000..f425a3dc --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,4 @@ +# nf-core/bamtofastq: Changelog + +## v1.0dev - [date] +Initial release of nf-core/bamtofastq, created with the [nf-core](http://nf-co.re/) template. diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 00000000..1cda7600 --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,46 @@ +# Contributor Covenant Code of Conduct + +## Our Pledge + +In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation. 
+ +## Our Standards + +Examples of behavior that contributes to creating a positive environment include: + +* Using welcoming and inclusive language +* Being respectful of differing viewpoints and experiences +* Gracefully accepting constructive criticism +* Focusing on what is best for the community +* Showing empathy towards other community members + +Examples of unacceptable behavior by participants include: + +* The use of sexualized language or imagery and unwelcome sexual attention or advances +* Trolling, insulting/derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or electronic address, without explicit permission +* Other conduct which could reasonably be considered inappropriate in a professional setting + +## Our Responsibilities + +Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. + +Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. + +## Scope + +This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. 
+ +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team on [Slack](https://nf-co.re/join/slack/). The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. + +Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version] + +[homepage]: http://contributor-covenant.org +[version]: http://contributor-covenant.org/version/1/4/ diff --git a/Dockerfile b/Dockerfile new file mode 100644 index 00000000..2bae3218 --- /dev/null +++ b/Dockerfile @@ -0,0 +1,7 @@ +FROM nfcore/base:1.7 +LABEL authors="Friederike Hanssen" \ + description="Docker image containing all requirements for nf-core/bamtofastq pipeline" + +COPY environment.yml / +RUN conda env create -f /environment.yml && conda clean -a +ENV PATH /opt/conda/envs/nf-core-bamtofastq-1.0dev/bin:$PATH diff --git a/LICENSE b/LICENSE new file mode 100644 index 00000000..34fccfc9 --- /dev/null +++ b/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) Friederike Hanssen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following 
conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md new file mode 100644 index 00000000..fc830c7d --- /dev/null +++ b/README.md @@ -0,0 +1,67 @@ +# ![nf-core/bamtofastq](docs/images/nf-core-bamtofastq_logo.png) + +**Workflow converts one or multiple bam files back to the fastq format**. + +[![Build Status](https://travis-ci.com/nf-core/bamtofastq.svg?branch=master)](https://travis-ci.com/nf-core/bamtofastq) +[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A50.32.0-brightgreen.svg)](https://www.nextflow.io/) + +[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/) +[![Docker](https://img.shields.io/docker/automated/nfcore/bamtofastq.svg)](https://hub.docker.com/r/nfcore/bamtofastq) + +## Introduction + +The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with docker containers making installation trivial and results highly reproducible. + +## Quick Start + +i. Install [`nextflow`](https://nf-co.re/usage/installation) + +ii. Install one of [`docker`](https://docs.docker.com/engine/installation/), [`singularity`](https://www.sylabs.io/guides/3.0/user-guide/) or [`conda`](https://conda.io/miniconda.html) + +iii. 
Download the pipeline and test it on a minimal dataset with a single command + +```bash +nextflow run nf-core/bamtofastq -profile test,<docker/singularity/conda> +``` + +iv. Start running your own analysis! + + +```bash +nextflow run nf-core/bamtofastq -profile <docker/singularity/conda> --reads '*_R{1,2}.fastq.gz' --genome GRCh37 +``` + +See [usage docs](docs/usage.md) for all of the available options when running the pipeline. + +## Documentation + +The nf-core/bamtofastq pipeline comes with documentation about the pipeline, found in the `docs/` directory: + +1. [Installation](https://nf-co.re/usage/installation) +2. Pipeline configuration + * [Local installation](https://nf-co.re/usage/local_installation) + * [Adding your own system config](https://nf-co.re/usage/adding_own_config) + * [Reference genomes](https://nf-co.re/usage/reference_genomes) +3. [Running the pipeline](docs/usage.md) +4. [Output and how to interpret the results](docs/output.md) +5. [Troubleshooting](https://nf-co.re/usage/troubleshooting) + + + +## Credits + +nf-core/bamtofastq was originally written by Friederike Hanssen. + +## Contributions and Support + +If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md). + +For further information or help, don't hesitate to get in touch on [Slack](https://nfcore.slack.com/channels/nf-core/bamtofastq) (you can join with [this invite](https://nf-co.re/join/slack)). + +## Citation + + + + +You can cite the `nf-core` pre-print as follows: +Ewels PA, Peltzer A, Fillinger S, Alneberg JA, Patel H, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. **nf-core: Community curated bioinformatics pipelines**. *bioRxiv*. 2019. p. 610741. [doi: 10.1101/610741](https://www.biorxiv.org/content/10.1101/610741v1). diff --git a/assets/email_template.html b/assets/email_template.html new file mode 100644 index 00000000..1ae6e9a1 --- /dev/null +++ b/assets/email_template.html @@ -0,0 +1,54 @@ + + + + + + + + + nf-core/bamtofastq Pipeline Report + + +
+ + + +

nf-core/bamtofastq v${version}

+

Run Name: $runName

+ +<% if (!success){ + out << """ +
+

nf-core/bamtofastq execution completed unsuccessfully!

+

The exit status of the task that caused the workflow execution to fail was: $exitStatus.

+

The full error message was:

+
${errorReport}
+
+ """ +} else { + out << """ +
+ nf-core/bamtofastq execution completed successfully! +
+ """ +} +%> + +

The workflow was completed at $dateComplete (duration: $duration)

+

The command used to launch the workflow was as follows:

+
$commandLine
+ +

Pipeline Configuration:

+ + + <% out << summary.collect{ k,v -> "" }.join("\n") %> + +
$k
$v
+ +

nf-core/bamtofastq

+

https://github.com/nf-core/bamtofastq

+ +
+ + + diff --git a/assets/email_template.txt b/assets/email_template.txt new file mode 100644 index 00000000..f24e9d67 --- /dev/null +++ b/assets/email_template.txt @@ -0,0 +1,40 @@ +---------------------------------------------------- + ,--./,-. + ___ __ __ __ ___ /,-._.--~\\ + |\\ | |__ __ / ` / \\ |__) |__ } { + | \\| | \\__, \\__/ | \\ |___ \\`-._,-`-, + `._,._,' + nf-core/bamtofastq v${version} +---------------------------------------------------- + +Run Name: $runName + +<% if (success){ + out << "## nf-core/bamtofastq execution completed successfully! ##" +} else { + out << """#################################################### +## nf-core/bamtofastq execution completed unsuccessfully! ## +#################################################### +The exit status of the task that caused the workflow execution to fail was: $exitStatus. +The full error message was: + +${errorReport} +""" +} %> + + +The workflow was completed at $dateComplete (duration: $duration) + +The command used to launch the workflow was as follows: + + $commandLine + + + +Pipeline Configuration: +----------------------- +<% out << summary.collect{ k,v -> " - $k: $v" }.join("\n") %> + +-- +nf-core/bamtofastq +https://github.com/nf-core/bamtofastq diff --git a/assets/multiqc_config.yaml b/assets/multiqc_config.yaml new file mode 100644 index 00000000..abad4868 --- /dev/null +++ b/assets/multiqc_config.yaml @@ -0,0 +1,9 @@ +report_comment: > + This report has been generated by the nf-core/bamtofastq + analysis pipeline. For information about how to interpret these results, please see the + documentation. 
+report_section_order: + nf-core/bamtofastq-software-versions: + order: -1000 + +export_plots: true diff --git a/assets/nf-core-bamtofastq_logo.png b/assets/nf-core-bamtofastq_logo.png new file mode 100644 index 00000000..cb6b39f8 Binary files /dev/null and b/assets/nf-core-bamtofastq_logo.png differ diff --git a/assets/sendmail_template.txt b/assets/sendmail_template.txt new file mode 100644 index 00000000..d2bda0a4 --- /dev/null +++ b/assets/sendmail_template.txt @@ -0,0 +1,53 @@ +To: $email +Subject: $subject +Mime-Version: 1.0 +Content-Type: multipart/related;boundary="nfcoremimeboundary" + +--nfcoremimeboundary +Content-Type: text/html; charset=utf-8 + +$email_html + +--nfcoremimeboundary +Content-Type: image/png;name="nf-core-bamtofastq_logo.png" +Content-Transfer-Encoding: base64 +Content-ID: +Content-Disposition: inline; filename="nf-core-bamtofastq_logo.png" + +<% out << new File("$baseDir/assets/nf-core-bamtofastq_logo.png"). + bytes. + encodeBase64(). + toString(). + tokenize( '\n' )*. + toList()*. + collate( 76 )*. + collect { it.join() }. + flatten(). + join( '\n' ) %> + +<% +if (mqcFile){ +def mqcFileObj = new File("$mqcFile") +if (mqcFileObj.length() < mqcMaxSize){ +out << """ +--nfcoremimeboundary +Content-Type: text/html; name=\"multiqc_report\" +Content-Transfer-Encoding: base64 +Content-ID: +Content-Disposition: attachment; filename=\"${mqcFileObj.getName()}\" + +${mqcFileObj. + bytes. + encodeBase64(). + toString(). + tokenize( '\n' )*. + toList()*. + collate( 76 )*. + collect { it.join() }. + flatten(). 
+ join( '\n' )} +""" +}} +%> + +--nfcoremimeboundary-- diff --git a/bin/markdown_to_html.r b/bin/markdown_to_html.r new file mode 100755 index 00000000..abe13350 --- /dev/null +++ b/bin/markdown_to_html.r @@ -0,0 +1,51 @@ +#!/usr/bin/env Rscript + +# Command line argument processing +args = commandArgs(trailingOnly=TRUE) +if (length(args) < 2) { + stop("Usage: markdown_to_html.r ", call.=FALSE) +} +markdown_fn <- args[1] +output_fn <- args[2] + +# Load / install packages +if (!require("markdown")) { + install.packages("markdown", dependencies=TRUE, repos='http://cloud.r-project.org/') + library("markdown") +} + +base_css_fn <- getOption("markdown.HTML.stylesheet") +base_css <- readChar(base_css_fn, file.info(base_css_fn)$size) +custom_css <- paste(base_css, " +body { + padding: 3em; + margin-right: 350px; + max-width: 100%; +} +#toc { + position: fixed; + right: 20px; + width: 300px; + padding-top: 20px; + overflow: scroll; + height: calc(100% - 3em - 20px); +} +#toc_header { + font-size: 1.8em; + font-weight: bold; +} +#toc > ul { + padding-left: 0; + list-style-type: none; +} +#toc > ul ul { padding-left: 20px; } +#toc > ul > li > a { display: none; } +img { max-width: 800px; } +") + +markdownToHTML( + file = markdown_fn, + output = output_fn, + stylesheet = custom_css, + options = c('toc', 'base64_images', 'highlight_code') +) diff --git a/bin/scrape_software_versions.py b/bin/scrape_software_versions.py new file mode 100755 index 00000000..af40520b --- /dev/null +++ b/bin/scrape_software_versions.py @@ -0,0 +1,52 @@ +#!/usr/bin/env python +from __future__ import print_function +from collections import OrderedDict +import re + +# TODO nf-core: Add additional regexes for new tools in process get_software_versions +regexes = { + 'nf-core/bamtofastq': ['v_pipeline.txt', r"(\S+)"], + 'Nextflow': ['v_nextflow.txt', r"(\S+)"], + 'FastQC': ['v_fastqc.txt', r"FastQC v(\S+)"], + 'MultiQC': ['v_multiqc.txt', r"multiqc, version (\S+)"], +} +results = OrderedDict() 
+results['nf-core/bamtofastq'] = 'N/A' +results['Nextflow'] = 'N/A' +results['FastQC'] = 'N/A' +results['MultiQC'] = 'N/A' + +# Search each file using its regex +for k, v in regexes.items(): + try: + with open(v[0]) as x: + versions = x.read() + match = re.search(v[1], versions) + if match: + results[k] = "v{}".format(match.group(1)) + except IOError: + results[k] = False + +# Remove software set to false in results (iterate over a copy of the keys so entries can be deleted safely) +for k in list(results): + if not results[k]: + del results[k] + +# Dump to YAML +print (''' +id: 'software_versions' +section_name: 'nf-core/bamtofastq Software Versions' +section_href: 'https://github.com/nf-core/bamtofastq' +plot_type: 'html' +description: 'are collected at run time from the software output.' +data: | +
+''') +for k,v in results.items(): + print("
{}
{}
".format(k,v)) +print ("
") + +# Write out regexes as csv file: +with open('software_versions.csv', 'w') as f: + for k,v in results.items(): + f.write("{}\t{}\n".format(k,v)) diff --git a/conf/awsbatch.config b/conf/awsbatch.config new file mode 100644 index 00000000..14af5866 --- /dev/null +++ b/conf/awsbatch.config @@ -0,0 +1,18 @@ +/* + * ------------------------------------------------- + * Nextflow config file for running on AWS batch + * ------------------------------------------------- + * Base config needed for running with -profile awsbatch + */ +params { + config_profile_name = 'AWSBATCH' + config_profile_description = 'AWSBATCH Cloud Profile' + config_profile_contact = 'Alexander Peltzer (@apeltzer)' + config_profile_url = 'https://aws.amazon.com/de/batch/' +} + +aws.region = params.awsregion +process.executor = 'awsbatch' +process.queue = params.awsqueue +executor.awscli = '/home/ec2-user/miniconda/bin/aws' +params.tracedir = './' diff --git a/conf/base.config b/conf/base.config new file mode 100644 index 00000000..df6d4bc5 --- /dev/null +++ b/conf/base.config @@ -0,0 +1,58 @@ +/* + * ------------------------------------------------- + * nf-core/bamtofastq Nextflow base config file + * ------------------------------------------------- + * A 'blank slate' config file, appropriate for general + * use on most high performace compute environments. + * Assumes that all software is installed and available + * on the PATH. Runs in `local` mode - all jobs will be + * run on the logged in environment. + */ + +process { + + // TODO nf-core: Check the defaults for all processes + cpus = { check_max( 1 * task.attempt, 'cpus' ) } + memory = { check_max( 7.GB * task.attempt, 'memory' ) } + time = { check_max( 4.h * task.attempt, 'time' ) } + + errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 
'retry' : 'finish' } + maxRetries = 1 + maxErrors = '-1' + + // Process-specific resource requirements + // NOTE - Only one of the labels below is used in the fastqc process in the main script. + // If possible, it would be nice to keep the same label naming convention when + // adding in your processes. + // TODO nf-core: Customise requirements for specific processes. + // See https://www.nextflow.io/docs/latest/config.html#config-process-selectors + withLabel:process_low { + cpus = { check_max( 2 * task.attempt, 'cpus' ) } + memory = { check_max( 14.GB * task.attempt, 'memory' ) } + time = { check_max( 6.h * task.attempt, 'time' ) } + } + withLabel:process_medium { + cpus = { check_max( 6 * task.attempt, 'cpus' ) } + memory = { check_max( 42.GB * task.attempt, 'memory' ) } + time = { check_max( 8.h * task.attempt, 'time' ) } + } + withLabel:process_high { + cpus = { check_max( 12 * task.attempt, 'cpus' ) } + memory = { check_max( 84.GB * task.attempt, 'memory' ) } + time = { check_max( 10.h * task.attempt, 'time' ) } + } + withLabel:process_long { + time = { check_max( 20.h * task.attempt, 'time' ) } + } + withName:get_software_versions { + cache = false + } +} + +params { + // Defaults only, expecting to be overwritten + max_memory = 128.GB + max_cpus = 16 + max_time = 240.h + igenomes_base = 's3://ngi-igenomes/igenomes/' +} diff --git a/conf/igenomes.config b/conf/igenomes.config new file mode 100644 index 00000000..392f2507 --- /dev/null +++ b/conf/igenomes.config @@ -0,0 +1,192 @@ +/* + * ------------------------------------------------- + * Nextflow config file for iGenomes paths + * ------------------------------------------------- + * Defines reference genomes, using iGenomes paths + * Can be used by any config that customises the base + * path using $params.igenomes_base / --igenomes_base + */ + +params { + // Illumina iGenomes reference file paths + // TODO nf-core: Add new reference types and strip out those that are not needed + genomes { + 'GRCh37' { 
+ bed12 = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/" + } + 'GRCm38' { + bed12 = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/BWAIndex/" + } + 'TAIR10' { + bed12 = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/BWAIndex/" + } + 'EB2' { + bed12 = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Annotation/Genes/genes.gtf" + star 
= "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Bacillus_subtilis_168/Ensembl/EB2/Sequence/BWAIndex/" + } + 'UMD3.1' { + bed12 = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/BWAIndex/" + + } + 'WBcel235' { + bed12 = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/BWAIndex/" + } + 'CanFam3.1' { + bed12 = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/Bowtie2Index/" + bwa = 
"${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/BWAIndex/" + } + 'GRCz10' { + bed12 = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Danio_rerio/Ensembl/GRCz10/Sequence/BWAIndex/" + } + 'BDGP6' { + bed12 = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Drosophila_melanogaster/Ensembl/BDGP6/Sequence/BWAIndex/" + } + 'EquCab2' { + bed12 = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Equus_caballus/Ensembl/EquCab2/Sequence/BWAIndex/" + } + 'EB1' { + bed12 = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Annotation/Genes/genes.bed" + fasta = 
"${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Escherichia_coli_K_12_DH10B/Ensembl/EB1/Sequence/BWAIndex/" + } + 'Galgal4' { + bed12 = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Gallus_gallus/Ensembl/Galgal4/Sequence/BWAIndex/" + } + 'Gm01' { + bed12 = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/BWAIndex/" + } + 'Mmul_1' { + bed12 = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/STARIndex/" + bowtie2 = 
"${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/BWAIndex/" + } + 'IRGSP-1.0' { + bed12 = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Oryza_sativa_japonica/Ensembl/IRGSP-1.0/Sequence/BWAIndex/" + } + 'CHIMP2.1.4' { + bed12 = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Pan_troglodytes/Ensembl/CHIMP2.1.4/Sequence/BWAIndex/" + } + 'Rnor_6.0' { + bed12 = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Rattus_norvegicus/Ensembl/Rnor_6.0/Sequence/BWAIndex/" + } + 'R64-1-1' { + 
bed12 = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/BWAIndex/" + } + 'EF2' { + bed12 = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Schizosaccharomyces_pombe/Ensembl/EF2/Sequence/BWAIndex/" + } + 'Sbi1' { + bed12 = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Sorghum_bicolor/Ensembl/Sbi1/Sequence/BWAIndex/" + } + 'Sscrofa10.2' { + bed12 = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/WholeGenomeFasta/genome.fa" + gtf = 
"${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Sus_scrofa/Ensembl/Sscrofa10.2/Sequence/BWAIndex/" + } + 'AGPv3' { + bed12 = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Annotation/Genes/genes.bed" + fasta = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/WholeGenomeFasta/genome.fa" + gtf = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Annotation/Genes/genes.gtf" + star = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/STARIndex/" + bowtie2 = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/Bowtie2Index/" + bwa = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/BWAIndex/" + } + } +} diff --git a/conf/test.config b/conf/test.config new file mode 100644 index 00000000..cf9e7f01 --- /dev/null +++ b/conf/test.config @@ -0,0 +1,26 @@ +/* + * ------------------------------------------------- + * Nextflow config file for running tests + * ------------------------------------------------- + * Defines bundled input files and everything required + * to run a fast and simple test. 
Use as follows: + * nextflow run nf-core/bamtofastq -profile test + */ + +params { + config_profile_name = 'Test profile' + config_profile_description = 'Minimal test dataset to check pipeline function' + // Limit resources so that this can run on Travis + max_cpus = 2 + max_memory = 6.GB + max_time = 48.h + + // Input data + // TODO nf-core: Specify the paths to your test data on nf-core/test-datasets + // TODO nf-core: Give any required params for the test so that command line flags are not needed + singleEnd = false + readPaths = [ + ['Testdata', ['https://github.com/nf-core/test-datasets/raw/exoseq/testdata/Testdata_R1.tiny.fastq.gz', 'https://github.com/nf-core/test-datasets/raw/exoseq/testdata/Testdata_R2.tiny.fastq.gz']], + ['SRR389222', ['https://github.com/nf-core/test-datasets/raw/methylseq/testdata/SRR389222_sub1.fastq.gz', 'https://github.com/nf-core/test-datasets/raw/methylseq/testdata/SRR389222_sub2.fastq.gz']] + ] +} diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 00000000..7bfe2d69 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,12 @@ +# nf-core/bamtofastq: Documentation + +The nf-core/bamtofastq documentation is split into the following files: + +1. [Installation](https://nf-co.re/usage/installation) +2. Pipeline configuration + * [Local installation](https://nf-co.re/usage/local_installation) + * [Adding your own system config](https://nf-co.re/usage/adding_own_config) + * [Reference genomes](https://nf-co.re/usage/reference_genomes) +3. [Running the pipeline](usage.md) +4. [Output and how to interpret the results](output.md) +5. 
[Troubleshooting](https://nf-co.re/usage/troubleshooting) diff --git a/docs/images/nf-core-bamtofastq_logo.png b/docs/images/nf-core-bamtofastq_logo.png new file mode 100644 index 00000000..f8c2399c Binary files /dev/null and b/docs/images/nf-core-bamtofastq_logo.png differ diff --git a/docs/output.md b/docs/output.md new file mode 100644 index 00000000..367e2192 --- /dev/null +++ b/docs/output.md @@ -0,0 +1,41 @@ +# nf-core/bamtofastq: Output + +This document describes the output produced by the pipeline. Most of the plots are taken from the MultiQC report, which summarises results at the end of the pipeline. + + + +## Pipeline overview +The pipeline is built using [Nextflow](https://www.nextflow.io/) +and processes data using the following steps: + +* [FastQC](#fastqc) - read quality control +* [MultiQC](#multiqc) - aggregate report, describing results of the whole pipeline + +## FastQC +[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your reads. It provides information about the quality score distribution across your reads and the per-base sequence content (%T/A/G/C). You also get information about adapter contamination and other overrepresented sequences. + +For further reading and documentation see the [FastQC help](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/). + +> **NB:** The FastQC plots displayed in the MultiQC report show _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality. To see how your reads look after trimming, look at the FastQC reports in the `trim_galore` directory.
+ +**Output directory: `results/fastqc`** + +* `sample_fastqc.html` + * FastQC report, containing quality metrics for your untrimmed raw fastq files +* `zips/sample_fastqc.zip` + * zip file containing the FastQC report, tab-delimited data file and plot images + + +## MultiQC +[MultiQC](http://multiqc.info) is a visualisation tool that generates a single HTML report summarising all samples in your project. Most of the pipeline QC results are visualised in the report and further statistics are available within the report data directory. + +The pipeline has special steps which allow the software versions used to be reported in the MultiQC output for future traceability. + +**Output directory: `results/multiqc`** + +* `Project_multiqc_report.html` + * MultiQC report - a standalone HTML file that can be viewed in your web browser +* `Project_multiqc_data/` + * Directory containing parsed statistics from the different tools used in the pipeline + +For more information about how to use MultiQC reports, see [http://multiqc.info](http://multiqc.info) diff --git a/docs/usage.md b/docs/usage.md new file mode 100644 index 00000000..d4f9930e --- /dev/null +++ b/docs/usage.md @@ -0,0 +1,286 @@ +# nf-core/bamtofastq: Usage + +## Table of contents + + + +* [Table of contents](#table-of-contents) +* [Introduction](#introduction) +* [Running the pipeline](#running-the-pipeline) + * [Updating the pipeline](#updating-the-pipeline) + * [Reproducibility](#reproducibility) +* [Main arguments](#main-arguments) + * [`-profile`](#-profile) + * [`--reads`](#--reads) + * [`--singleEnd`](#--singleend) +* [Reference genomes](#reference-genomes) + * [`--genome` (using iGenomes)](#--genome-using-igenomes) + * [`--fasta`](#--fasta) + * [`--igenomesIgnore`](#--igenomesignore) +* [Job resources](#job-resources) + * [Automatic resubmission](#automatic-resubmission) + * [Custom resource requests](#custom-resource-requests) +* [AWS Batch specific parameters](#aws-batch-specific-parameters) + * 
[`--awsqueue`](#--awsqueue) + * [`--awsregion`](#--awsregion) +* [Other command line parameters](#other-command-line-parameters) + * [`--outdir`](#--outdir) + * [`--email`](#--email) + * [`--email_on_fail`](#--email_on_fail) + * [`-name`](#-name) + * [`-resume`](#-resume) + * [`-c`](#-c) + * [`--custom_config_version`](#--custom_config_version) + * [`--custom_config_base`](#--custom_config_base) + * [`--max_memory`](#--max_memory) + * [`--max_time`](#--max_time) + * [`--max_cpus`](#--max_cpus) + * [`--plaintext_email`](#--plaintext_email) + * [`--monochrome_logs`](#--monochrome_logs) + * [`--multiqc_config`](#--multiqc_config) + + + +## Introduction +Nextflow handles job submissions on SLURM or other environments, and supervises running the jobs. Thus the Nextflow process must run until the pipeline is finished. We recommend that you run the process in the background using `screen`, `tmux` or a similar tool. Alternatively, you can run Nextflow within a cluster job submitted to your job scheduler. + +It is recommended to limit the memory of the Nextflow Java virtual machine. We recommend adding the following line to your environment (typically in `~/.bashrc` or `~/.bash_profile`): + +```bash +NXF_OPTS='-Xms1g -Xmx4g' +``` + + + +## Running the pipeline +The typical command for running the pipeline is as follows: + +```bash +nextflow run nf-core/bamtofastq --reads '*_R{1,2}.fastq.gz' -profile docker +``` + +This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles. + +Note that the pipeline will create the following files in your working directory: + +```bash +work # Directory containing the nextflow working files +results # Finished results (configurable, see below) +.nextflow.log # Log file from Nextflow +# Other nextflow hidden files, eg. history of pipeline runs and old logs.
+``` + +### Updating the pipeline +When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, regularly update the cached version: + +```bash +nextflow pull nf-core/bamtofastq +``` + +### Reproducibility +It's a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since. + +First, go to the [nf-core/bamtofastq releases page](https://github.com/nf-core/bamtofastq/releases) and find the latest version number - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`. + +This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. + + +## Main arguments + +### `-profile` +Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments. Note that multiple profiles can be loaded as a comma-separated list, for example: `-profile test,docker` - the order of arguments is important! + +If `-profile` is not specified at all, the pipeline will be run locally and expects all software to be installed and available on the `PATH`. + +* `awsbatch` + * A generic configuration profile to be used with AWS Batch.
+* `conda` + * A generic configuration profile to be used with [conda](https://conda.io/docs/) + * Pulls most software from [Bioconda](https://bioconda.github.io/) +* `docker` + * A generic configuration profile to be used with [Docker](http://docker.com/) + * Pulls software from DockerHub: [`nfcore/bamtofastq`](http://hub.docker.com/r/nfcore/bamtofastq/) +* `singularity` + * A generic configuration profile to be used with [Singularity](http://singularity.lbl.gov/) + * Pulls software from DockerHub: [`nfcore/bamtofastq`](http://hub.docker.com/r/nfcore/bamtofastq/) +* `test` + * A profile with a complete configuration for automated testing + * Includes links to test data so needs no other parameters + + + +### `--reads` +Use this to specify the location of your input FastQ files. For example: + +```bash +--reads 'path/to/data/sample_*_{1,2}.fastq' +``` + +Please note the following requirements: + +1. The path must be enclosed in quotes +2. The path must have at least one `*` wildcard character +3. When using the pipeline with paired-end data, the path must use `{1,2}` notation to specify read pairs. + +If left unspecified, a default pattern is used: `data/*{1,2}.fastq.gz` + +### `--singleEnd` +By default, the pipeline expects paired-end data. If you have single-end data, you need to specify `--singleEnd` on the command line when you launch the pipeline. A normal glob pattern, enclosed in quotation marks, can then be used for `--reads`. For example: + +```bash +--singleEnd --reads '*.fastq' +``` + +It is not possible to run a mixture of single-end and paired-end files in one run. + + +## Reference genomes + +The pipeline config files come bundled with paths to the Illumina iGenomes reference index files. If running with Docker or AWS, the configuration is set up to use the [AWS-iGenomes](https://ewels.github.io/AWS-iGenomes/) resource. + +### `--genome` (using iGenomes) +There are 31 different species supported in the iGenomes references.
To run the pipeline, you must specify which to use with the `--genome` flag. + +You can find the keys to specify the genomes in the [iGenomes config file](../conf/igenomes.config). Common genomes that are supported are: + +* Human + * `--genome GRCh37` +* Mouse + * `--genome GRCm38` +* _Drosophila_ + * `--genome BDGP6` +* _S. cerevisiae_ + * `--genome 'R64-1-1'` + +> There are numerous others - check the config file for more. + +Note that you can use the same configuration setup to save sets of reference files for your own use, even if they are not part of the iGenomes resource. See the [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html) for instructions on where to save such a file. + +The syntax for this reference configuration is as follows: + + + +```nextflow +params { + genomes { + 'GRCh37' { + fasta = '' // Used if no star index given + } + // Any number of additional genomes, key is used with --genome + } +} +``` + + +### `--fasta` +If you prefer, you can specify the full path to your reference genome when you run the pipeline: + +```bash +--fasta '[path to Fasta reference]' +``` + +### `--igenomesIgnore` +Do not load `igenomes.config` when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`. + +## Job resources +### Automatic resubmission +Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with an error code of `143` (exceeded requested resources) it will automatically resubmit with higher requests (2 x original, then 3 x original). If it still fails after three times then the pipeline is stopped. + +### Custom resource requests +Wherever process-specific requirements are set in the pipeline, the default value can be changed by creating a custom config file. 
See the files hosted at [`nf-core/configs`](https://github.com/nf-core/configs/tree/master/conf) for examples. + +If you are likely to be running `nf-core` pipelines regularly it may be a good idea to request that your custom config file is uploaded to the `nf-core/configs` git repository. Before you do this, please test that the config file works with your pipeline of choice using the `-c` parameter (see definition below). You can then create a pull request to the `nf-core/configs` repository with the addition of your config file, associated documentation file (see examples in [`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs)), and amending [`nfcore_custom.config`](https://github.com/nf-core/configs/blob/master/nfcore_custom.config) to include your custom profile. + +If you have any questions or issues please send us a message on [Slack](https://nf-co.re/join/slack/). + +## AWS Batch specific parameters +Running the pipeline on AWS Batch requires a couple of specific parameters to be set according to your AWS Batch configuration. Please use the `awsbatch` profile (`-profile awsbatch`) and then specify all of the following parameters. +### `--awsqueue` +The JobQueue that you intend to use on AWS Batch. +### `--awsregion` +The AWS region to run your job in. Default is set to `eu-west-1` but can be adjusted to your needs. + +Please make sure to also set the `-w/-work-dir` and `--outdir` parameters to an S3 storage bucket of your choice - you'll get an error message notifying you if you don't. + +## Other command line parameters + + + +### `--outdir` +The output directory where the results will be saved. + +### `--email` +Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.
+ +### `--email_on_fail` +This works exactly as with `--email`, except emails are only sent if the workflow is not successful. + +### `-name` +Name for the pipeline run. If not specified, Nextflow will automatically generate a random mnemonic. + +This is used in the MultiQC report (if not default) and in the summary HTML / e-mail (always). + +**NB:** Single hyphen (core Nextflow option) + +### `-resume` +Specify this when restarting a pipeline. Nextflow will use cached results from any pipeline steps where the inputs are the same, continuing from where it got to previously. + +You can also supply a run name to resume a specific run: `-resume [run-name]`. Use the `nextflow log` command to show previous run names. + +**NB:** Single hyphen (core Nextflow option) + +### `-c` +Specify the path to a specific config file (this is a core Nextflow option). + +**NB:** Single hyphen (core Nextflow option) + +Note - you can use this to override pipeline defaults. + +### `--custom_config_version` +Provide the git commit id for custom Institutional configs hosted at `nf-core/configs`. This was implemented for reproducibility purposes. Default is set to `master`. + +```bash +## Download and use config file with following git commit id +--custom_config_version d52db660777c4bf36546ddb188ec530c3ada1b96 +``` + +### `--custom_config_base` +If you're running offline, nextflow will not be able to fetch the institutional config files +from the internet. If you don't need them, then this is not a problem. If you do need them, +you should download the files from the repo and tell nextflow where to find them with the +`custom_config_base` option.
For example: + +```bash +## Download and unzip the config files +cd /path/to/my/configs +wget https://github.com/nf-core/configs/archive/master.zip +unzip master.zip + +## Run the pipeline +cd /path/to/my/data +nextflow run /path/to/pipeline/ --custom_config_base /path/to/my/configs/configs-master/ +``` + +> Note that the nf-core/tools helper package has a `download` command to download all required pipeline +> files + singularity containers + institutional configs in one go for you, to make this process easier. + +### `--max_memory` +Use to set a top-limit for the default memory requirement for each process. +Should be a string in the format integer-unit. eg. `--max_memory '8.GB'` + +### `--max_time` +Use to set a top-limit for the default time requirement for each process. +Should be a string in the format integer-unit. eg. `--max_time '2.h'` + +### `--max_cpus` +Use to set a top-limit for the default CPU requirement for each process. +Should be an integer. eg. `--max_cpus 1` + +### `--plaintext_email` +Set to receive plain-text e-mails instead of HTML formatted. + +### `--monochrome_logs` +Set to disable colourful command line output and live life in monochrome. + +### `--multiqc_config` +Specify a path to a custom MultiQC configuration file.
diff --git a/environment.yml b/environment.yml new file mode 100644 index 00000000..ea77b3b8 --- /dev/null +++ b/environment.yml @@ -0,0 +1,13 @@ +# You can use this file to create a conda environment for this pipeline: +# conda env create -f environment.yml +name: nf-core-bamtofastq-1.0dev +channels: + - conda-forge + - bioconda + - defaults +dependencies: + # TODO nf-core: Add required software dependencies here + - bioconda::fastqc=0.11.8 + - bioconda::multiqc=1.7 + - conda-forge::r-markdown=1.1 + - conda-forge::r-base=3.6.1 diff --git a/main.nf b/main.nf new file mode 100644 index 00000000..ed3bf385 --- /dev/null +++ b/main.nf @@ -0,0 +1,421 @@ +#!/usr/bin/env nextflow +/* +======================================================================================== + nf-core/bamtofastq +======================================================================================== + nf-core/bamtofastq Analysis Pipeline. + #### Homepage / Documentation + https://github.com/nf-core/bamtofastq +---------------------------------------------------------------------------------------- +*/ + +def helpMessage() { + // TODO nf-core: Add to this help message with new command line parameters + log.info nfcoreHeader() + log.info""" + + Usage: + + The typical command for running the pipeline is as follows: + + nextflow run nf-core/bamtofastq --reads '*_R{1,2}.fastq.gz' -profile docker + + Mandatory arguments: + --reads Path to input data (must be surrounded with quotes) + -profile Configuration profile to use. Can use multiple (comma separated) + Available: conda, docker, singularity, awsbatch, test and more. + + Options: + --genome Name of iGenomes reference + --singleEnd Specifies that the input is single end reads + + References If not specified in the configuration file or you wish to overwrite any of the references. 
+ --fasta Path to Fasta reference + + Other options: + --outdir The output directory where the results will be saved + --email Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits + --email_on_fail Same as --email, except only send mail if the workflow is not successful + --maxMultiqcEmailFileSize Threshold size for MultiQC report to be attached in notification email. If file generated by pipeline exceeds the threshold, it will not be attached (Default: 25MB) + -name Name for the pipeline run. If not specified, Nextflow will automatically generate a random mnemonic. + + AWSBatch options: + --awsqueue The AWSBatch JobQueue that needs to be set when running on AWSBatch + --awsregion The AWS Region for your AWS Batch job to run on + """.stripIndent() +} + +// Show help message +if (params.help) { + helpMessage() + exit 0 +} + +/* + * SET UP CONFIGURATION VARIABLES + */ + +// Check if genome exists in the config file +if (params.genomes && params.genome && !params.genomes.containsKey(params.genome)) { + exit 1, "The provided genome '${params.genome}' is not available in the iGenomes file. Currently the available genomes are ${params.genomes.keySet().join(", ")}" +} + +// TODO nf-core: Add any reference files that are needed +// Configurable reference genomes +// +// NOTE - THIS IS NOT USED IN THIS PIPELINE, EXAMPLE ONLY +// If you want to use the channel below in a process, define the following: +// input: +// file fasta from ch_fasta +// +params.fasta = params.genome ? params.genomes[ params.genome ].fasta ?: false : false +if (params.fasta) { ch_fasta = file(params.fasta, checkIfExists: true) } + +// Has the run name been specified by the user?
+// this has the bonus effect of catching both -name and --name +custom_runName = params.name +if (!(workflow.runName ==~ /[a-z]+_[a-z]+/)) { + custom_runName = workflow.runName +} + +if ( workflow.profile == 'awsbatch') { + // AWSBatch sanity checking + if (!params.awsqueue || !params.awsregion) exit 1, "Specify correct --awsqueue and --awsregion parameters on AWSBatch!" + // Check outdir paths to be S3 buckets if running on AWSBatch + // related: https://github.com/nextflow-io/nextflow/issues/813 + if (!params.outdir.startsWith('s3:')) exit 1, "Outdir not on S3 - specify S3 Bucket to run on AWSBatch!" + // Prevent trace files to be stored on S3 since S3 does not support rolling files. + if (workflow.tracedir.startsWith('s3:')) exit 1, "Specify a local tracedir or run without trace! S3 cannot be used for tracefiles." +} + +// Stage config files +ch_multiqc_config = file(params.multiqc_config, checkIfExists: true) +ch_output_docs = file("$baseDir/docs/output.md", checkIfExists: true) + +/* + * Create a channel for input read files + */ +if (params.readPaths) { + if (params.singleEnd) { + Channel + .from(params.readPaths) + .map { row -> [ row[0], [ file(row[1][0], checkIfExists: true) ] ] } + .ifEmpty { exit 1, "params.readPaths was empty - no input files supplied" } + .into { read_files_fastqc; read_files_trimming } + } else { + Channel + .from(params.readPaths) + .map { row -> [ row[0], [ file(row[1][0], checkIfExists: true), file(row[1][1], checkIfExists: true) ] ] } + .ifEmpty { exit 1, "params.readPaths was empty - no input files supplied" } + .into { read_files_fastqc; read_files_trimming } + } +} else { + Channel + .fromFilePairs( params.reads, size: params.singleEnd ? 1 : 2 ) + .ifEmpty { exit 1, "Cannot find any reads matching: ${params.reads}\nNB: Path needs to be enclosed in quotes!\nIf this is single-end data, please specify --singleEnd on the command line." 
} + .into { read_files_fastqc; read_files_trimming } +} + +// Header log info +log.info nfcoreHeader() +def summary = [:] +if (workflow.revision) summary['Pipeline Release'] = workflow.revision +summary['Run Name'] = custom_runName ?: workflow.runName +// TODO nf-core: Report custom parameters here +summary['Reads'] = params.reads +summary['Fasta Ref'] = params.fasta +summary['Data Type'] = params.singleEnd ? 'Single-End' : 'Paired-End' +summary['Max Resources'] = "$params.max_memory memory, $params.max_cpus cpus, $params.max_time time per job" +if (workflow.containerEngine) summary['Container'] = "$workflow.containerEngine - $workflow.container" +summary['Output dir'] = params.outdir +summary['Launch dir'] = workflow.launchDir +summary['Working dir'] = workflow.workDir +summary['Script dir'] = workflow.projectDir +summary['User'] = workflow.userName +if (workflow.profile == 'awsbatch') { + summary['AWS Region'] = params.awsregion + summary['AWS Queue'] = params.awsqueue +} +summary['Config Profile'] = workflow.profile +if (params.config_profile_description) summary['Config Description'] = params.config_profile_description +if (params.config_profile_contact) summary['Config Contact'] = params.config_profile_contact +if (params.config_profile_url) summary['Config URL'] = params.config_profile_url +if (params.email || params.email_on_fail) { + summary['E-mail Address'] = params.email + summary['E-mail on failure'] = params.email_on_fail + summary['MultiQC maxsize'] = params.maxMultiqcEmailFileSize +} +log.info summary.collect { k,v -> "${k.padRight(18)}: $v" }.join("\n") +log.info "-\033[2m--------------------------------------------------\033[0m-" + +// Check the hostnames against configured profiles +checkHostname() + +def create_workflow_summary(summary) { + def yaml_file = workDir.resolve('workflow_summary_mqc.yaml') + yaml_file.text = """ + id: 'nf-core-bamtofastq-summary' + description: " - this information is collected when the pipeline is started." 
+ section_name: 'nf-core/bamtofastq Workflow Summary' + section_href: 'https://github.com/nf-core/bamtofastq' + plot_type: 'html' + data: | + <dl> +${summary.collect { k,v -> " <dt>$k</dt><dd>${v ?: 'N/A'}</dd>" }.join("\n")} + </dl>
+ """.stripIndent() + + return yaml_file +} + +/* + * Parse software version numbers + */ +process get_software_versions { + publishDir "${params.outdir}/pipeline_info", mode: 'copy', + saveAs: { filename -> + if (filename.indexOf(".csv") > 0) filename + else null + } + + output: + file 'software_versions_mqc.yaml' into software_versions_yaml + file "software_versions.csv" + + script: + // TODO nf-core: Get all tools to print their version number here + """ + echo $workflow.manifest.version > v_pipeline.txt + echo $workflow.nextflow.version > v_nextflow.txt + fastqc --version > v_fastqc.txt + multiqc --version > v_multiqc.txt + scrape_software_versions.py &> software_versions_mqc.yaml + """ +} + +/* + * STEP 1 - FastQC + */ +process fastqc { + tag "$name" + label 'process_medium' + publishDir "${params.outdir}/fastqc", mode: 'copy', + saveAs: { filename -> filename.indexOf(".zip") > 0 ? "zips/$filename" : "$filename" } + + input: + set val(name), file(reads) from read_files_fastqc + + output: + file "*_fastqc.{zip,html}" into fastqc_results + + script: + """ + fastqc --quiet --threads $task.cpus $reads + """ +} + +/* + * STEP 2 - MultiQC + */ +process multiqc { + publishDir "${params.outdir}/MultiQC", mode: 'copy' + + input: + file multiqc_config from ch_multiqc_config + // TODO nf-core: Add in log files from your new processes for MultiQC to find! + file ('fastqc/*') from fastqc_results.collect().ifEmpty([]) + file ('software_versions/*') from software_versions_yaml.collect() + file workflow_summary from create_workflow_summary(summary) + + output: + file "*multiqc_report.html" into multiqc_report + file "*_data" + file "multiqc_plots" + + script: + rtitle = custom_runName ? "--title \"$custom_runName\"" : '' + rfilename = custom_runName ? 
"--filename " + custom_runName.replaceAll('\\W','_').replaceAll('_+','_') + "_multiqc_report" : '' + // TODO nf-core: Specify which MultiQC modules to use with -m for a faster run time + """ + multiqc -f $rtitle $rfilename --config $multiqc_config . + """ +} + +/* + * STEP 3 - Output Description HTML + */ +process output_documentation { + publishDir "${params.outdir}/pipeline_info", mode: 'copy' + + input: + file output_docs from ch_output_docs + + output: + file "results_description.html" + + script: + """ + markdown_to_html.r $output_docs results_description.html + """ +} + +/* + * Completion e-mail notification + */ +workflow.onComplete { + + // Set up the e-mail variables + def subject = "[nf-core/bamtofastq] Successful: $workflow.runName" + if (!workflow.success) { + subject = "[nf-core/bamtofastq] FAILED: $workflow.runName" + } + def email_fields = [:] + email_fields['version'] = workflow.manifest.version + email_fields['runName'] = custom_runName ?: workflow.runName + email_fields['success'] = workflow.success + email_fields['dateComplete'] = workflow.complete + email_fields['duration'] = workflow.duration + email_fields['exitStatus'] = workflow.exitStatus + email_fields['errorMessage'] = (workflow.errorMessage ?: 'None') + email_fields['errorReport'] = (workflow.errorReport ?: 'None') + email_fields['commandLine'] = workflow.commandLine + email_fields['projectDir'] = workflow.projectDir + email_fields['summary'] = summary + email_fields['summary']['Date Started'] = workflow.start + email_fields['summary']['Date Completed'] = workflow.complete + email_fields['summary']['Pipeline script file path'] = workflow.scriptFile + email_fields['summary']['Pipeline script hash ID'] = workflow.scriptId + if (workflow.repository) email_fields['summary']['Pipeline repository Git URL'] = workflow.repository + if (workflow.commitId) email_fields['summary']['Pipeline repository Git Commit'] = workflow.commitId + if (workflow.revision) email_fields['summary']['Pipeline Git 
branch/tag'] = workflow.revision + if (workflow.container) email_fields['summary']['Docker image'] = workflow.container + email_fields['summary']['Nextflow Version'] = workflow.nextflow.version + email_fields['summary']['Nextflow Build'] = workflow.nextflow.build + email_fields['summary']['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp + + // TODO nf-core: If not using MultiQC, strip out this code (including params.maxMultiqcEmailFileSize) + // On success try attach the multiqc report + def mqc_report = null + try { + if (workflow.success) { + mqc_report = multiqc_report.getVal() + if (mqc_report.getClass() == ArrayList) { + log.warn "[nf-core/bamtofastq] Found multiple reports from process 'multiqc', will use only one" + mqc_report = mqc_report[0] + } + } + } catch (all) { + log.warn "[nf-core/bamtofastq] Could not attach MultiQC report to summary email" + } + + // Check if we are only sending emails on failure + email_address = params.email + if (!params.email && params.email_on_fail && !workflow.success) { + email_address = params.email_on_fail + } + + // Render the TXT template + def engine = new groovy.text.GStringTemplateEngine() + def tf = new File("$baseDir/assets/email_template.txt") + def txt_template = engine.createTemplate(tf).make(email_fields) + def email_txt = txt_template.toString() + + // Render the HTML template + def hf = new File("$baseDir/assets/email_template.html") + def html_template = engine.createTemplate(hf).make(email_fields) + def email_html = html_template.toString() + + // Render the sendmail template + def smail_fields = [ email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, baseDir: "$baseDir", mqcFile: mqc_report, mqcMaxSize: params.maxMultiqcEmailFileSize.toBytes() ] + def sf = new File("$baseDir/assets/sendmail_template.txt") + def sendmail_template = engine.createTemplate(sf).make(smail_fields) + def sendmail_html = sendmail_template.toString() + + // Send the HTML e-mail + if 
(email_address) { + try { + // Throwing here jumps straight to the plaintext fallback in the catch block + if ( params.plaintext_email ){ throw new RuntimeException('Send plaintext e-mail, not HTML') } + // Try to send HTML e-mail using sendmail + [ 'sendmail', '-t' ].execute() << sendmail_html + log.info "[nf-core/bamtofastq] Sent summary e-mail to $email_address (sendmail)" + } catch (all) { + // Catch failures and try with plaintext + [ 'mail', '-s', subject, email_address ].execute() << email_txt + log.info "[nf-core/bamtofastq] Sent summary e-mail to $email_address (mail)" + } + } + + // Write summary e-mail HTML to a file + def output_d = new File( "${params.outdir}/pipeline_info/" ) + if (!output_d.exists()) { + output_d.mkdirs() + } + def output_hf = new File( output_d, "pipeline_report.html" ) + output_hf.withWriter { w -> w << email_html } + def output_tf = new File( output_d, "pipeline_report.txt" ) + output_tf.withWriter { w -> w << email_txt } + + c_reset = params.monochrome_logs ? '' : "\033[0m"; + c_purple = params.monochrome_logs ? '' : "\033[0;35m"; + c_green = params.monochrome_logs ? '' : "\033[0;32m"; + c_red = params.monochrome_logs ? '' : "\033[0;31m"; + + if (workflow.stats.ignoredCount > 0 && workflow.success) { + log.info "${c_purple}Warning, pipeline completed, but with errored process(es) ${c_reset}" + log.info "${c_red}Number of ignored errored process(es) : ${workflow.stats.ignoredCount} ${c_reset}" + log.info "${c_green}Number of successfully run process(es) : ${workflow.stats.succeedCount} ${c_reset}" + } + + if (workflow.success) { + log.info "${c_purple}[nf-core/bamtofastq]${c_green} Pipeline completed successfully${c_reset}" + } else { + checkHostname() + log.info "${c_purple}[nf-core/bamtofastq]${c_red} Pipeline completed with errors${c_reset}" + } + +} + + +def nfcoreHeader(){ + // Log colors ANSI codes + c_reset = params.monochrome_logs ? '' : "\033[0m"; + c_dim = params.monochrome_logs ? '' : "\033[2m"; + c_black = params.monochrome_logs ? '' : "\033[0;30m"; + c_green = params.monochrome_logs ? 
'' : "\033[0;32m"; + c_yellow = params.monochrome_logs ? '' : "\033[0;33m"; + c_blue = params.monochrome_logs ? '' : "\033[0;34m"; + c_purple = params.monochrome_logs ? '' : "\033[0;35m"; + c_cyan = params.monochrome_logs ? '' : "\033[0;36m"; + c_white = params.monochrome_logs ? '' : "\033[0;37m"; + + return """ -${c_dim}--------------------------------------------------${c_reset}- + ${c_green},--.${c_black}/${c_green},-.${c_reset} + ${c_blue} ___ __ __ __ ___ ${c_green}/,-._.--~\'${c_reset} + ${c_blue} |\\ | |__ __ / ` / \\ |__) |__ ${c_yellow}} {${c_reset} + ${c_blue} | \\| | \\__, \\__/ | \\ |___ ${c_green}\\`-._,-`-,${c_reset} + ${c_green}`._,._,\'${c_reset} + ${c_purple} nf-core/bamtofastq v${workflow.manifest.version}${c_reset} + -${c_dim}--------------------------------------------------${c_reset}- + """.stripIndent() +} + +def checkHostname(){ + def c_reset = params.monochrome_logs ? '' : "\033[0m" + def c_white = params.monochrome_logs ? '' : "\033[0;37m" + def c_red = params.monochrome_logs ? '' : "\033[1;91m" + def c_yellow_bold = params.monochrome_logs ? 
'' : "\033[1;93m" + if (params.hostnames) { + def hostname = "hostname".execute().text.trim() + params.hostnames.each { prof, hnames -> + hnames.each { hname -> + if (hostname.contains(hname) && !workflow.profile.contains(prof)) { + log.error "====================================================\n" + + " ${c_red}WARNING!${c_reset} You are running with `-profile $workflow.profile`\n" + + " but your machine hostname is ${c_white}'$hostname'${c_reset}\n" + + " ${c_yellow_bold}It's highly recommended that you use `-profile $prof${c_reset}`\n" + + "============================================================" + } + } + } + } +} diff --git a/nextflow.config b/nextflow.config new file mode 100644 index 00000000..b05612dd --- /dev/null +++ b/nextflow.config @@ -0,0 +1,134 @@ +/* + * ------------------------------------------------- + * nf-core/bamtofastq Nextflow config file + * ------------------------------------------------- + * Default config options for all environments. + */ + +// Global default params, used in configs +params { + + // Workflow flags + // TODO nf-core: Specify your pipeline's command line flags + genome = false + reads = "data/*{1,2}.fastq.gz" + singleEnd = false + outdir = './results' + + // Boilerplate options + name = false + multiqc_config = "$baseDir/assets/multiqc_config.yaml" + email = false + email_on_fail = false + maxMultiqcEmailFileSize = 25.MB + plaintext_email = false + monochrome_logs = false + help = false + igenomes_base = "./iGenomes" + tracedir = "${params.outdir}/pipeline_info" + awsqueue = false + awsregion = 'eu-west-1' + igenomesIgnore = false + custom_config_version = 'master' + custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" + hostnames = false + config_profile_description = false + config_profile_contact = false + config_profile_url = false +} + +// Container slug. Stable releases should specify release tag! 
+// Developmental code should specify :dev +process.container = 'nfcore/bamtofastq:dev' + +// Load base.config by default for all pipelines +includeConfig 'conf/base.config' + +// Load nf-core custom profiles from different Institutions +try { + includeConfig "${params.custom_config_base}/nfcore_custom.config" +} catch (Exception e) { + System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config") +} + +profiles { + awsbatch { includeConfig 'conf/awsbatch.config' } + conda { process.conda = "$baseDir/environment.yml" } + debug { process.beforeScript = 'echo $HOSTNAME' } + docker { docker.enabled = true } + singularity { singularity.enabled = true } + test { includeConfig 'conf/test.config' } +} + +// Avoid this error: +// WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap. +// Testing this in nf-core after discussion here https://github.com/nf-core/tools/pull/351, once this is established and works well, nextflow might implement this behavior as new default. 
+docker.runOptions = '-u \$(id -u):\$(id -g)' + +// Load igenomes.config if required +if (!params.igenomesIgnore) { + includeConfig 'conf/igenomes.config' +} + +// Capture exit codes from upstream processes when piping +process.shell = ['/bin/bash', '-euo', 'pipefail'] + +timeline { + enabled = true + file = "${params.tracedir}/execution_timeline.html" +} +report { + enabled = true + file = "${params.tracedir}/execution_report.html" +} +trace { + enabled = true + file = "${params.tracedir}/execution_trace.txt" +} +dag { + enabled = true + file = "${params.tracedir}/pipeline_dag.svg" +} + +manifest { + name = 'nf-core/bamtofastq' + author = 'Friederike Hanssen' + homePage = 'https://github.com/nf-core/bamtofastq' + description = 'Workflow converts one or multiple bam files back to the fastq format' + mainScript = 'main.nf' + nextflowVersion = '>=0.32.0' + version = '1.0dev' +} + +// Function to ensure that resource requirements don't go beyond +// a maximum limit +def check_max(obj, type) { + if (type == 'memory') { + try { + if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1) + return params.max_memory as nextflow.util.MemoryUnit + else + return obj + } catch (all) { + println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj" + return obj + } + } else if (type == 'time') { + try { + if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1) + return params.max_time as nextflow.util.Duration + else + return obj + } catch (all) { + println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj" + return obj + } + } else if (type == 'cpus') { + try { + return Math.min( obj, params.max_cpus as int ) + } catch (all) { + println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj" + return obj + } + } +}
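The `check_max` function at the end of `nextflow.config` is only defined in this diff, never called. A minimal sketch of how it might be used from the included `conf/base.config` follows. That file is not part of this excerpt, so the `max_*` ceilings and the per-label request values are assumptions for illustration, not the pipeline's actual settings. Only the `process_medium` label, the `params.max_memory` / `params.max_cpus` / `params.max_time` names, and the `check_max(obj, type)` signature come from the code above.

```groovy
// Hypothetical excerpt of conf/base.config (illustrative values only).
params {
  // Assumed ceilings; check_max() compares every request against these.
  max_memory = 128.GB
  max_cpus   = 16
  max_time   = 240.h
}

process {
  // Applies to processes declaring: label 'process_medium' (e.g. fastqc in main.nf).
  // Each closure is evaluated per task, so requests can grow with task.attempt
  // on retries while check_max() caps them at the ceilings above.
  withLabel: process_medium {
    cpus   = { check_max( 6 * task.attempt, 'cpus' ) }
    memory = { check_max( 42.GB * task.attempt, 'memory' ) }
    time   = { check_max( 8.h * task.attempt, 'time' ) }
  }
}
```

With these assumed numbers, a third attempt would ask for 18 CPUs and 126 GB, and `check_max` would return 16 CPUs (the cap) and 126 GB (still under the 128 GB cap). Wrapping the requests in closures defers the call until task scheduling, which is why `check_max` can be defined after `includeConfig 'conf/base.config'` in `nextflow.config`.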