Nextflow pipeline for functional profiling (gene families and pathways) of metagenomic short reads.
- Read cleaning (fastp)
- Summary of species relative abundance (metaphlan3)
- Summary of gene family and pathway abundance (humann3)
- Visualization of the top N species, gene families and pathways.
This is a pipeline for functional as well as species abundance profiling of shotgun metagenomic reads. It estimates the abundance of gene families and pathways within the samples.
The pipeline is written in Nextflow, and its dependencies are available in a Singularity container.
A YAML file defining a conda environment with the required tools is provided in envs/env.yaml.
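As a minimal sketch of how the conda route might be used instead of the container: `conda env create -f` builds an environment from a YAML file and takes the environment name from the `name:` field inside that file. The guard around the call is an illustrative addition, and the name to activate afterwards is not stated here, so check envs/env.yaml for it.

```shell
#!/usr/bin/env bash
# Sketch: create the conda environment described in envs/env.yaml.
# The guard is an illustrative assumption so the script fails softly
# when conda or the YAML file is missing.
ENV_FILE="envs/env.yaml"

if [ -f "$ENV_FILE" ] && command -v conda >/dev/null 2>&1; then
  # The environment name is read from the "name:" field of the YAML file.
  conda env create -f "$ENV_FILE"
else
  echo "Skipping: conda must be on PATH and $ENV_FILE must exist" >&2
fi
```

Run it from the repository root so the relative path resolves.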
nextflow run humann.nf --reads 'data/*_R{1,2}.fastq.gz' \
--outdir "nf-humann" -profile nbi
Notable options:
--uniref [path]
: Path to the uniref database
--chocophlan [path]
: Path to the chocophlan database
--metaphlandb [path]
: Path to metaphlandb database
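The options above could be combined as in the following dry-run sketch, which assembles the command with explicit database paths and prints it instead of launching it. `DB_ROOT` and the sub-directory names are hypothetical placeholders, not locations defined by the pipeline. Whether a `-profile` is still needed when all three paths are given is not confirmed here.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build the nextflow command with explicit database paths.
# DB_ROOT and its sub-directories are hypothetical; replace them with the
# real locations of your databases.
set -euo pipefail

DB_ROOT="${DB_ROOT:-/path/to/databases}"

cmd=(
  nextflow run humann.nf
  --reads 'data/*_R{1,2}.fastq.gz'
  --outdir nf-humann
  --uniref "${DB_ROOT}/uniref"
  --chocophlan "${DB_ROOT}/chocophlan"
  --metaphlandb "${DB_ROOT}/metaphlan"
)

# Print the command shell-quoted so it can be inspected or copy-pasted.
printf '%q ' "${cmd[@]}"
printf '\n'

# Uncomment to actually launch the pipeline:
# "${cmd[@]}"
```

Keeping the arguments in an array keeps the read glob quoted, so Nextflow rather than the shell expands it.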
Profiles:
-profile nbi
: Use the default database locations on the NBI cluster with the SLURM scheduler.
-profile vmqib
: Use the default database locations on the NBI cluster with the local scheduler.
Note: The default locations of these databases on the NBI cluster can be found in nextflow.config.