Merge pull request #56 from PlantandFoodResearch/tests/public
Added a test profile based on public data
GallVp authored Aug 19, 2024
2 parents 230448f + 3e1e801 commit 89f53e8
Showing 16 changed files with 117 additions and 81 deletions.
15 changes: 9 additions & 6 deletions .github/workflows/test.yml
@@ -1,11 +1,11 @@
# Adopted from https://github.com/nf-core/modules/blob/master/.github/workflows/test.yml

name: Lint and -stub on Linux/Docker
name: CI tests
on:
push:
branches: [main]
branches:
- dev
pull_request:
branches: [main]

# Cancel if a newer run is started
concurrency:
@@ -30,7 +30,7 @@ jobs:
- name: Run pre-commit
run: pre-commit run --all-files

stub-test:
test:
runs-on: ubuntu-latest
name: Run stub test with docker
env:
@@ -44,17 +44,20 @@ jobs:
with:
version: "23.04.4"

- name: Disk space cleanup
uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1

- name: Run stub-test
run: |
nextflow run \
main.nf \
-profile local,docker \
-profile docker \
-stub \
-params-file tests/stub/params.json
confirm-pass:
runs-on: ubuntu-latest
needs: [pre-commit, stub-test]
needs: [pre-commit, test]
if: always()
steps:
- name: All tests ok
9 changes: 7 additions & 2 deletions CHANGELOG.md
@@ -3,7 +3,7 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 0.4.0 - [07-Aug-2024]
## 0.4.0+dev - [19-Aug-2024]

### `Added`

@@ -24,6 +24,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
15. Reduced `BRAKER3` threads to 8 [#55](https://github.com/PlantandFoodResearch/pangene/issues/55)
16. Now the final annotations are stored in the `annotations` folder [#53](https://github.com/PlantandFoodResearch/pangene/issues/53)
17. Added `-gff` flag to `REPEATMASKER` to save the gff file [#54](https://github.com/PlantandFoodResearch/pangene/issues/54)
18. Now a single `fasta` file can be directly specified for `protein_evidence`
19. `eggnogmapper_db_dir` is not a required parameter anymore
20. `eggnogmapper_tax_scope` is now set to 1 (root div) by default
21. Added a `test` profile based on public data

### `Fixed`

@@ -46,7 +50,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
6. Removed dependency on <https://github.com/kherronism/nf-modules.git> for `BRAKER3` and `REPEATMASKER` modules which are now installed from <https://github.com/GallVp/nxf-components.git>
7. Removed dependency on <https://github.com/PlantandFoodResearch/nxf-modules.git>
8. Now the final annotations are not stored in the `final` folder
9. Now BRAKER3 outputs are not saved by default [#53](https://github.com/PlantandFoodResearch/pangene/issues/53)
9. Now BRAKER3 outputs are not saved by default [#53](https://github.com/PlantandFoodResearch/pangene/issues/53) and saved under `etc` folder when enabled
10. Removed `local` profile. Local executor is the default when no executor is specified. Therefore, the `local` profile was not needed.

## 0.3.3 - [18-Jun-2024]

31 changes: 0 additions & 31 deletions conf/base.config
@@ -1,34 +1,3 @@
profiles {
pfr {
process {
executor = 'slurm'
}

apptainer {
envWhitelist = 'APPTAINER_BINDPATH,APPTAINER_BIND'
cacheDir = "/workspace/pangene/singularity"
}
}

local {
process {
executor = 'local'
}
}

apptainer {
apptainer.enabled = true
apptainer.autoMounts= true
apptainer.registry = 'quay.io'
}

docker {
docker.enabled = true
docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64'
docker.registry = 'quay.io'
}
}

process {

cpus = { check_max( 1 * task.attempt, 'cpus' ) }
4 changes: 2 additions & 2 deletions conf/modules.config
@@ -146,7 +146,7 @@ process { // SUBWORKFLOW: FASTA_BRAKER3
].flatten().unique(false).join(' ').trim()
ext.prefix = { "${meta.id}" }
publishDir = [
path: { "${params.outdir}/braker/" },
path: { "${params.outdir}/etc/braker/" },
mode: "copy",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.braker_save_outputs
@@ -335,7 +335,7 @@ process { // Universal

withName: SAVE_MARKED_GFF3 {
publishDir = [
path: { "${params.outdir}/splicing_marked" },
path: { "${params.outdir}/etc/splicing_marked" },
mode: "copy",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
7 changes: 7 additions & 0 deletions conf/test.config
@@ -0,0 +1,7 @@
params {
input = "${projectDir}/tests/minimal/assemblysheet.csv"
protein_evidence = 'https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/proteins.fa'

braker_extra_args = '--gm_max_intergenic 10000 --skipOptimize' // Added for faster test execution! Do not use with actual data!
busco_lineage_datasets = 'eudicots_odb10'
}
6 changes: 3 additions & 3 deletions docs/parameters.md
@@ -7,9 +7,9 @@ A NextFlow pipeline for pan-genome annotation
| Parameter | Description | Type | Default | Required | Hidden |
| ------------------------- | -------------------------------------------------------------------------------------------------------- | --------- | --------- | -------- | ------ |
| `input` | Target assemblies listed in a CSV sheet | `string` | | True | |
| `protein_evidence` | Protein evidence provided as fasta files listed in a text sheet | `string` | | True | |
| `eggnogmapper_db_dir` | Eggnogmapper database directory | `string` | | True | |
| `eggnogmapper_tax_scope` | Eggnogmapper taxonomy scopre. Eukaryota: 2759, Viridiplantae: 33090, Archaea: 2157, Bacteria: 2, root: 1 | `integer` | | True | |
| `protein_evidence` | Protein evidence provided as a fasta file or multiple fasta files listed in a plain txt file | `string` | | True | |
| `eggnogmapper_db_dir` | Eggnogmapper database directory | `string` | | | |
| `eggnogmapper_tax_scope` | Eggnogmapper taxonomy scopre. Eukaryota: 2759, Viridiplantae: 33090, Archaea: 2157, Bacteria: 2, root: 1 | `integer` | 1 | | |
| `rna_evidence` | FASTQ/BAM samples listed in a CSV sheet | `string` | | | |
| `liftoff_annotations` | Reference annotations listed in a CSV sheet | `string` | | | |
| `orthofinder_annotations` | Additional annotations for orthology listed in a CSV sheet | `string` | | | |
6 changes: 4 additions & 2 deletions local_pangene
@@ -14,8 +14,10 @@ F_BOLD="\033[1m"

nextflow run \
main.nf \
-profile local,docker \
-profile docker,test \
-resume \
$stub \
-params-file pangene-test/params.json \
--max_cpus 8 \
--max_memory '32.GB' \
--eggnogmapper_tax_scope 33090 \
--eggnogmapper_db_dir ../dbs/emapperdb/5.0.2
2 changes: 1 addition & 1 deletion modules/local/utils.nf
@@ -4,7 +4,7 @@ def idFromFileName(fileName) {
).replaceFirst(
/\.f(ast)?q$/, ''
).replaceFirst(
/\.f(asta|sa|a|as|aa)?$/, ''
/\.f(asta|sa|a|as|aa|na)?$/, ''
).replaceFirst(
/\.gff(3)?$/, ''
).replaceFirst(
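
The effect of adding `na` to the fasta alternation can be checked in isolation. What follows is a minimal Python sketch, not the pipeline's Groovy code: Python's `re` module stands in for Groovy's `replaceFirst`, and `strip_fasta_ext` is a hypothetical helper that reproduces only the fasta step of the chain shown in this hunk.

```python
import re

# Patterns copied from the hunk: before and after the change.
OLD_FASTA_EXT = re.compile(r"\.f(asta|sa|a|as|aa)?$")
NEW_FASTA_EXT = re.compile(r"\.f(asta|sa|a|as|aa|na)?$")


def strip_fasta_ext(file_name, pattern=NEW_FASTA_EXT):
    """Remove a trailing fasta-style extension, if the pattern matches one.

    Hypothetical helper for illustration; it mirrors a single
    replaceFirst(...) call, not the whole idFromFileName chain.
    """
    return pattern.sub("", file_name, count=1)


if __name__ == "__main__":
    for name in ["genome.fna", "proteins.fa", "genome.fasta", "notes.txt"]:
        old = strip_fasta_ext(name, OLD_FASTA_EXT)
        new = strip_fasta_ext(name, NEW_FASTA_EXT)
        print(f"{name}: old={old} new={new}")
```

Only `.fna` names behave differently between the two patterns; the other extensions are stripped (or left alone) identically. Because the group is optional, a bare `.f` suffix is also removed by both versions.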
50 changes: 33 additions & 17 deletions nextflow.config
@@ -1,11 +1,9 @@
includeConfig './conf/base.config'

params {
// Input/output options
input = null
protein_evidence = null
eggnogmapper_db_dir = null
eggnogmapper_tax_scope = null
eggnogmapper_tax_scope = 1
rna_evidence = null
liftoff_annotations = null
orthofinder_annotations = null
@@ -21,20 +19,20 @@ params {
skip_fastqc = false
skip_fastp = false
min_trimmed_reads = 10000
extra_fastp_args = ""
extra_fastp_args = null
save_trimmed = false
remove_ribo_rna = false
save_non_ribo_reads = false
ribo_database_manifest = "${projectDir}/assets/rrna-db-defaults.txt"

// RNAseq alignment options
star_max_intron_length = 16000
star_align_extra_args = ""
star_align_extra_args = null
star_save_outputs = false
save_cat_bam = false

// Annotation options
braker_extra_args = ""
braker_extra_args = null
braker_save_outputs = false
liftoff_coverage = 0.9
liftoff_identity = 0.9
@@ -59,15 +57,26 @@ params {
validationS3PathCheck = true
}

manifest {
name = 'pangene'
author = """Usman Rashid, Jason Shiller"""
homePage = 'https://github.com/PlantandFoodResearch/pangene'
description = """A NextFlow pipeline for pan-genome annotation"""
mainScript = 'main.nf'
nextflowVersion = '!>=23.04.4'
version = '0.4.0'
doi = ''
includeConfig './conf/base.config'

profiles {
apptainer {
apptainer.enabled = true
apptainer.autoMounts = true
apptainer.registry = 'quay.io'
}

docker {
docker.enabled = true
docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64'
docker.registry = 'quay.io'
}

test { includeConfig 'conf/test.config' }
}

plugins {
id '[email protected]'
}

def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')
@@ -84,8 +93,15 @@ trace {
file = "${params.outdir}/pipeline_info/execution_trace_${trace_timestamp}.txt"
}

plugins {
id '[email protected]'
manifest {
name = 'pangene'
author = """Usman Rashid, Jason Shiller"""
homePage = 'https://github.com/PlantandFoodResearch/pangene'
description = """A NextFlow pipeline for pan-genome annotation"""
mainScript = 'main.nf'
nextflowVersion = '!>=23.04.4'
version = '0.4.0+dev'
doi = ''
}

includeConfig './conf/modules.config'
9 changes: 5 additions & 4 deletions nextflow_schema.json
@@ -10,7 +10,7 @@
"type": "object",
"fa_icon": "fas fa-terminal",
"description": "",
"required": ["input", "protein_evidence", "eggnogmapper_db_dir", "eggnogmapper_tax_scope", "outdir"],
"required": ["input", "protein_evidence", "outdir"],
"properties": {
"input": {
"type": "string",
@@ -23,9 +23,9 @@
},
"protein_evidence": {
"type": "string",
"description": "Protein evidence provided as fasta files listed in a text sheet",
"description": "Protein evidence provided as a fasta file or multiple fasta files listed in a plain txt file",
"format": "file-path",
"mimetype": "text/txt",
"pattern": "^\\S+\\.(txt|fa|faa|fna|fsa|fas|fasta)(\\.gz)?$",
"fa_icon": "far fa-file-alt"
},
"eggnogmapper_db_dir": {
@@ -36,7 +36,8 @@
"eggnogmapper_tax_scope": {
"type": "integer",
"description": "Eggnogmapper taxonomy scopre. Eukaryota: 2759, Viridiplantae: 33090, Archaea: 2157, Bacteria: 2, root: 1",
"minimum": 0
"minimum": 1,
"default": 1
},
"rna_evidence": {
"type": "string",
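
The new `pattern` for `protein_evidence` replaces the old `mimetype` check, so it is worth seeing which values it accepts. The sketch below is an illustration in Python, under the assumption that the schema validator applies the regex the way `re.search` does; `is_valid_protein_evidence` is a hypothetical name, and the pattern is the schema's string with the JSON escaping (`\\S`, `\\.`) removed.

```python
import re

# Schema pattern after JSON unescaping.
PROTEIN_EVIDENCE = re.compile(r"^\S+\.(txt|fa|faa|fna|fsa|fas|fasta)(\.gz)?$")


def is_valid_protein_evidence(value):
    """Return True if the value would pass the schema's pattern check.

    Hypothetical helper; the real check is done by the schema validator.
    """
    return PROTEIN_EVIDENCE.search(value) is not None


if __name__ == "__main__":
    candidates = [
        "proteins.fa",         # single fasta file, newly allowed
        "proteins.fasta.gz",   # gzipped fasta
        "evidence_list.txt",   # plain txt file listing several fasta files
        "proteins.fastq",      # wrong extension
        "my proteins.fa",      # whitespace is rejected by \S+
    ]
    for candidate in candidates:
        print(candidate, is_valid_protein_evidence(candidate))
```

The URL used by the `test` profile ends in `proteins.fa` and contains no whitespace, so it passes the same check.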
4 changes: 3 additions & 1 deletion subworkflows/local/gff_eggnogmapper.nf
@@ -24,7 +24,9 @@ workflow GFF_EGGNOGMAPPER {
ch_versions = ch_versions.mix(GFF2FASTA_FOR_EGGNOGMAPPER.out.versions.first())


ch_eggnogmapper_inputs = ch_gffread_fasta
ch_eggnogmapper_inputs = ! db_folder
? Channel.empty()
: ch_gffread_fasta
| combine(Channel.fromPath(db_folder))

EGGNOGMAPPER(
11 changes: 9 additions & 2 deletions subworkflows/local/gff_store.nf
@@ -8,12 +8,15 @@ workflow GFF_STORE {
ch_target_gff // [ meta, gff ]
ch_eggnogmapper_annotations // [ meta, annotations ]
ch_fasta // [ meta, fasta ]
val_describe_gff // val(true|false)

main:
ch_versions = Channel.empty()

// COLLECTFILE: Add eggnogmapper hits to gff
ch_described_gff = ch_target_gff
ch_described_gff = ! val_describe_gff
? Channel.empty()
: ch_target_gff
| join(ch_eggnogmapper_annotations)
| map { meta, gff, annotations ->
def tx_annotations = annotations.readLines()
@@ -109,7 +112,11 @@ workflow GFF_STORE {
}

// MODULE: GT_GFF3 as FINAL_GFF_CHECK
FINAL_GFF_CHECK ( ch_described_gff )
ch_final_check_input = val_describe_gff
? ch_described_gff
: ch_target_gff

FINAL_GFF_CHECK ( ch_final_check_input )

ch_final_gff = FINAL_GFF_CHECK.out.gt_gff3
ch_versions = ch_versions.mix(FINAL_GFF_CHECK.out.versions.first())
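
The reason for the `val_describe_gff` switch is easier to see with a toy model. The Python sketch below is an analogy only: channels are modelled as dicts keyed by sample id, `join_by_key` loosely imitates Nextflow's `join`, and every name in it is hypothetical. The diff does not show the caller, so the idea that the flag is false when eggnogmapper is skipped is an assumption.

```python
def join_by_key(left, right):
    """Inner join of two {sample_id: value} mappings (rough stand-in for join)."""
    return {key: (left[key], right[key]) for key in left if key in right}


def final_check_input(target_gff, annotations, describe_gff):
    """Pick what reaches the final GFF check.

    describe_gff=False: pass the target GFFs through unchanged.
    describe_gff=True:  only samples with eggnogmapper annotations survive
                        the join and get a described GFF.
    """
    if not describe_gff:
        return dict(target_gff)
    joined = join_by_key(target_gff, annotations)
    return {key: f"{gff}+{ann}" for key, (gff, ann) in joined.items()}


if __name__ == "__main__":
    gffs = {"a_thaliana": "a_thaliana.gff3"}
    # With no annotations, an unconditional join would leave nothing to check.
    print(final_check_input(gffs, {}, describe_gff=True))
    # With the switch off, the target GFF still reaches the final check.
    print(final_check_input(gffs, {}, describe_gff=False))
```

In this model an empty annotations input combined with an unconditional join yields no output at all, which is the situation the fallback to `ch_target_gff` avoids.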
7 changes: 6 additions & 1 deletion subworkflows/local/purge_nohit_models.nf
@@ -60,6 +60,11 @@ workflow PURGE_NOHIT_MODELS {
ch_versions = ch_versions.mix(AGAT_SPFILTERFEATUREFROMKILLLIST.out.versions.first())

emit:
purged_gff = ch_target_purged_gff.mix(val_purge_nohits ? Channel.empty() : ch_target_gff)
purged_gff = ch_target_purged_gff
| mix(
val_purge_nohits
? Channel.empty()
: ch_target_gff
)
versions = ch_versions // [ versions.yml ]
}
2 changes: 2 additions & 0 deletions tests/minimal/assemblysheet.csv
@@ -0,0 +1,2 @@
tag,fasta,is_masked
a_thaliana,https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/genome.fa,yes
6 changes: 6 additions & 0 deletions tests/minimal/params.json
@@ -0,0 +1,6 @@
{
"input": "tests/minimal/assemblysheet.csv",
"protein_evidence": "https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/proteins.fa",
"braker_extra_args": "--gm_max_intergenic 10000 --skipOptimize",
"busco_lineage_datasets": "eudicots_odb10"
}
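
`tests/minimal/params.json` and `conf/test.config` carry the same four settings, which makes them easy to drift apart. As a rough sketch of a consistency check, assuming the config values are transcribed by hand into a Python dict (nothing here parses Nextflow config syntax, and `differing_keys` is a hypothetical helper):

```python
import json

# tests/minimal/params.json, copied from this commit.
PARAMS_JSON = """
{
    "input": "tests/minimal/assemblysheet.csv",
    "protein_evidence": "https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/proteins.fa",
    "braker_extra_args": "--gm_max_intergenic 10000 --skipOptimize",
    "busco_lineage_datasets": "eudicots_odb10"
}
"""

# conf/test.config values, transcribed by hand (assumption: no other params).
TEST_CONFIG = {
    "input": "${projectDir}/tests/minimal/assemblysheet.csv",
    "protein_evidence": "https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/proteins.fa",
    "braker_extra_args": "--gm_max_intergenic 10000 --skipOptimize",
    "busco_lineage_datasets": "eudicots_odb10",
}


def differing_keys(params, config, project_prefix="${projectDir}/"):
    """Return the sorted keys whose values differ, ignoring the projectDir prefix."""

    def normalise(value):
        if isinstance(value, str) and value.startswith(project_prefix):
            return value[len(project_prefix):]
        return value

    keys = set(params) | set(config)
    return sorted(k for k in keys if normalise(params.get(k)) != normalise(config.get(k)))


if __name__ == "__main__":
    print(differing_keys(json.loads(PARAMS_JSON), TEST_CONFIG))
```

The only textual difference between the two files is the `${projectDir}/` prefix on `input`, which the helper strips before comparing.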