Fixed CHANGELOG conflict
willbradshaw committed Jan 6, 2025
2 parents 719f3b5 + 8d58823 commit 4d2bb4c
Showing 46 changed files with 804 additions and 126 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/end-to-end-se.yml
@@ -0,0 +1,32 @@
name: End-to-end MGS workflow test for single-end run

on: [pull_request]

jobs:
  test-run-dev-se:
    runs-on: ubuntu-latest
    timeout-minutes: 10

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up JDK 11
        uses: actions/setup-java@v4
        with:
          java-version: '11'
          distribution: 'adopt'

      - name: Setup Nextflow latest (stable)
        uses: nf-core/setup-nextflow@v1
        with:
          version: "latest"

      - name: Install nf-test
        run: |
          wget -qO- https://get.nf-test.com | bash
          sudo mv nf-test /usr/local/bin/
      - name: Run run_dev_se workflow
        run: nf-test test --tag run_dev_se --verbose
87 changes: 76 additions & 11 deletions .github/workflows/end-to-end.yml
@@ -3,8 +3,9 @@ name: End-to-end MGS workflow test
on: [pull_request]

jobs:
test:
test-index:
runs-on: ubuntu-latest
timeout-minutes: 10

steps:
- name: Checkout
@@ -16,10 +17,10 @@ jobs:
java-version: '11'
distribution: 'adopt'

- name: Setup Nextflow latest-edge
- name: Setup Nextflow latest (stable)
uses: nf-core/setup-nextflow@v1
with:
version: "latest-edge"
version: "latest"

- name: Install nf-test
run: |
@@ -28,19 +29,83 @@ jobs:
- name: Run index workflow
run: nf-test test --tag index --verbose
test-run:
runs-on: ubuntu-latest
timeout-minutes: 10

- name: Clean docker for more space
run: |
docker kill $(docker ps -q) 2>/dev/null || true
docker rm $(docker ps -a -q) 2>/dev/null || true
docker rmi $(docker images -q) -f 2>/dev/null || true
docker system prune -af --volumes
steps:
- name: Checkout
uses: actions/checkout@v4

- name: Clean up nf-test dir
run: sudo rm -rf .nf-test
- name: Set up JDK 11
uses: actions/setup-java@v4
with:
java-version: '11'
distribution: 'adopt'

- name: Setup Nextflow latest (stable)
uses: nf-core/setup-nextflow@v1
with:
version: "latest"

- name: Install nf-test
run: |
wget -qO- https://get.nf-test.com | bash
sudo mv nf-test /usr/local/bin/
- name: Run run workflow
run: nf-test test --tag run --verbose

test-run-output:
runs-on: ubuntu-latest
timeout-minutes: 10

steps:
- name: Checkout
uses: actions/checkout@v4

- name: Set up JDK 11
uses: actions/setup-java@v4
with:
java-version: '11'
distribution: 'adopt'

- name: Setup Nextflow latest (stable)
uses: nf-core/setup-nextflow@v1
with:
version: "latest"

- name: Install nf-test
run: |
wget -qO- https://get.nf-test.com | bash
sudo mv nf-test /usr/local/bin/
- name: Run run workflow
run: nf-test test --tag run_output --verbose

test-validation:
runs-on: ubuntu-latest
timeout-minutes: 5

steps:
- name: Checkout
uses: actions/checkout@v4

- name: Set up JDK 11
uses: actions/setup-java@v4
with:
java-version: '11'
distribution: 'adopt'

- name: Setup Nextflow latest (stable)
uses: nf-core/setup-nextflow@v1
with:
version: "latest"

- name: Install nf-test
run: |
wget -qO- https://get.nf-test.com | bash
sudo mv nf-test /usr/local/bin/
- name: Run run_validation workflow
run: nf-test test --tag validation --verbose
2 changes: 1 addition & 1 deletion .gitignore
@@ -9,4 +9,4 @@ test/.nextflow*
pipeline_report.txt

.nf-test/
.nf-test.log
.nf-test.log
15 changes: 14 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,18 @@
# v2.5.3 (under development)
# v2.5.3
- Added new LOAD_SAMPLESHEET subworkflow to centralize samplesheet processing
- Updated tags to prevent inappropriate S3 auto-cleanup
- Testing infrastructure
- Split up the tests in `End-to-end MGS workflow test` so that they can be run in parallel on Github Actions.
- Implemented an end-to-end test that checks if the RUN workflow produces the correct output. The correct output for the test has been saved in `test-data/gold-standard-results` so that the user can diff the output of their test with the correct output to check where their pipeline might be failing.
- Began development of single-end read processing (still in progress)
- Restructured RAW, CLEAN, QC, TAXONOMY, and PROFILE workflows to handle both single-end and paired-end reads
- Added new FASTP_SINGLE, TRUNCATE_CONCAT_SINGLE, BBDUK_SINGLE, CONCAT_GROUP_SINGLE, SUBSET_READS_SINGLE and SUBSET_READS_SINGLE_TARGET processes to handle single-end reads
- Created separate end-to-end test workflow for single-end processing (which will be removed once single-end processing is fully integrated)
- Modified samplesheet handling to support both single-end and paired-end data
- Updated generate_samplesheet.sh to handle single-end data with --single_end flag
- Added read_type.config to handle single-end vs paired-end settings (set automatically based on samplesheet format)
- Created run_dev_se.config and run_dev_se.nf for single-end development testing (which will be removed once single-end processing is fully integrated)
- Added single-end samplesheet to test-data

# v2.5.2
- Changes to default read filtering:
41 changes: 36 additions & 5 deletions README.md
@@ -2,6 +2,34 @@

This Nextflow pipeline is designed to process metagenomic sequencing data, characterize overall taxonomic composition, and identify and quantify reads mapping to viruses infecting certain host taxa of interest. It was developed as part of the [Nucleic Acid Observatory](https://naobservatory.org/) project.

<!-- TOC start (generated with https://github.com/derlin/bitdowntoc) -->

- [Pipeline description](#pipeline-description)
* [Overview](#overview)
* [Index workflow](#index-workflow)
* [Run workflow](#run-workflow)
+ [Preprocessing phase](#preprocessing-phase)
+ [Viral identification phase](#viral-identification-phase)
+ [Taxonomic profiling phase](#taxonomic-profiling-phase)
+ [BLAST validation phase](#blast-validation-phase)
+ [QC and output phase](#qc-and-output-phase)
* [Pipeline outputs](#pipeline-outputs)
+ [Index workflow](#index-workflow-1)
+ [Run workflow](#run-workflow-1)
- [Using the workflow](#using-the-workflow)
* [Profiles and modes](#profiles-and-modes)
* [Installation & setup](#installation--setup)
+ [1. Install dependencies](#1-install-dependencies)
+ [2. Configure AWS & Docker](#2-configure-aws--docker)
+ [3. Clone this repository](#3-clone-this-repository)
+ [4. Run index/reference workflow](#4-run-indexreference-workflow)
* [Testing & validation](#testing--validation)
* [Running on new data](#running-on-new-data)
- [Run tests using `nf-test` before making pull requests](#run-tests-using-nf-test-before-making-pull-requests)
- [Troubleshooting](#troubleshooting)

<!-- TOC end -->

## Pipeline description

### Overview
@@ -179,7 +207,7 @@ To run this workflow with full functionality, you need access to the following d
2. **Docker:** To install Docker Engine for command-line use, follow the installation instructions available [here](https://docs.docker.com/engine/install/) (or [here](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-docker.html) for installation on an AWS EC2 instance).
3. **AWS CLI:** If not already installed, install the AWS CLI by following the instructions available [here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html).
4. **Git:** To install the Git version control tool, follow the installation instructions available [here](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).
5. **nf-test**: To install nf-test, follow the install instructions available [here](https://www.nf-test.com/docs/getting-started/).
5. **nf-test**: To install nf-test, follow the install instructions available [here](https://www.nf-test.com/installation/).

#### 2. Configure AWS & Docker

@@ -318,11 +346,14 @@ If running on Batch, a good process for starting the pipeline on a new dataset i
During the development process, we now request that users run the pipeline using `nf-test` locally before making pull requests (a test will be run automatically on the PR, but it's often useful to run it locally first). To do this, make sure that you have a large enough EC2 instance. We recommend an `m5.xlarge` with at least 32 GB of EBS storage, as this machine closely reflects the VMs on GitHub Actions. Once you have an instance, run `nf-test test tests/main.test.nf`, which will run all workflows of the pipeline and check that they run to completion. To run a specific workflow, use one of the following commands:
```
nf-test run --tag index # Runs the index workflow
nf-test run --tag run # Runs the run workflow
nf-test run --tag validation # Runs the validation workflow
nf-test test --tag index # Runs the index workflow
nf-test test --tag run # Runs the run workflow
nf-test test --tag validation # Runs the validation workflow
nf-test test --tag run_output # Runs the run workflow with the test that verifies that the output files are correct
```
The expected results for the run workflow are in `test-data/gold-standard-results`. If the `run_output` test fails, diff the files it produced against the files in this directory to find the differences.
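The comparison against the gold-standard directory could look roughly like the sketch below. It assumes the files produced by the test have been copied into a local directory; the `compare_to_gold` helper and the `my-test-output` path are hypothetical names used for illustration, and only `test-data/gold-standard-results` comes from the repository. If the pipeline's outputs are compressed, they may need to be decompressed before a line-level diff is useful.

```shell
# Hypothetical helper (not part of the pipeline): recursively compare a
# results directory against the gold-standard directory.
compare_to_gold() {
    results_dir="$1"
    # Default to the repository's gold-standard directory if none is given.
    gold_dir="${2:-test-data/gold-standard-results}"
    # -r: recurse into subdirectories; -q: only report which files differ.
    if diff -rq "$gold_dir" "$results_dir"; then
        echo "No differences found"
    else
        echo "Differences found (see list above)"
        return 1
    fi
}

# Example usage (the results path is an assumption, not a pipeline default):
# compare_to_gold my-test-output
```

Once a differing file has been identified, running plain `diff` on that pair of files shows the individual lines that changed.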
Importantly, make sure to periodically delete docker images to free up space on your instance. You can do this by running the following command, although note that this will delete all docker images:
```
@@ -332,7 +363,7 @@ docker rmi $(docker images -q) -f 2>/dev/null || true
docker system prune -af --volumes
```
# Troubleshooting
## Troubleshooting
When attempting to run a released version of the pipeline, the most common sources of errors are AWS permission issues. Before debugging a persistent error in-depth, make sure that you have all the permissions specified in Step 0 of [our Batch workflow guide](https://data.securebio.org/wills-public-notebook/notebooks/2024-06-11_batch.html). Next, make sure Nextflow has access to your AWS credentials, such as by running `eval "$(aws configure export-credentials --format env)"`.
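One quick way to confirm that the credentials were exported into the current shell is sketched below. The `check_aws_env` helper is a hypothetical name for illustration, and the sketch assumes credentials are passed through the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables; it only checks that they are set, not that they are valid or carry the required permissions.

```shell
# Hypothetical helper (not part of the pipeline): report whether the standard
# AWS credential environment variables are set in the current shell.
check_aws_env() {
    missing=0
    for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # Indirect lookup of the variable named in $var (POSIX-compatible).
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "Missing: $var"
            missing=1
        fi
    done
    # Returns 0 if both variables are set, 1 otherwise.
    return "$missing"
}

# Example usage:
# check_aws_env && aws sts get-caller-identity   # second command asks AWS who you are
```

Environment variables are only one of the places the AWS tooling looks for credentials, so a "Missing" result here does not by itself mean that Nextflow cannot authenticate.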