QAPA is a Python- and R-based tool for quantifying alternative polyadenylation (APA) from RNA-seq data.
The paper is titled "QAPA: a new method for the systematic analysis of alternative polyadenylation from RNA-seq data".
The application is free to download, and its documentation was used as a reference to create the Nextflow pipeline for this module.
An example sample sheet is available at samplesheet_example.csv. Each row in the sample sheet has the following columns:
- sample: name of the sample for logs (e.g. control_replicate1)
- fastq1: FASTQ1 for both single-end and paired-end RNA-seq
- fastq2: FASTQ2 for paired-end RNA-seq. If single-end, please leave this empty
When using your own data and sample sheet instead of the provided test data, make sure the sample sheet contains absolute paths to the files and uses the column names listed above.
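As a minimal sketch of the point above, the following helper writes a sample sheet with the three column names listed and converts every FASTQ path to an absolute path. The function name write_samplesheet and the example file names are hypothetical, and a comma-separated file with a header row is assumed from the .csv extension.

```python
import csv
from pathlib import Path

# Column names as listed in this README.
COLUMNS = ["sample", "fastq1", "fastq2"]


def write_samplesheet(rows, out_path):
    """Write a sample sheet, converting FASTQ paths to absolute paths.

    rows: iterable of (sample, fastq1, fastq2) tuples; pass "" or None
    as fastq2 for single-end samples.
    (Hypothetical helper, not part of the workflow itself.)
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        for sample, fastq1, fastq2 in rows:
            writer.writerow({
                "sample": sample,
                "fastq1": str(Path(fastq1).resolve()),
                # Leave fastq2 empty for single-end data.
                "fastq2": str(Path(fastq2).resolve()) if fastq2 else "",
            })


# Example call (file names are placeholders):
# write_samplesheet(
#     [("control_replicate1", "data/ctrl_R1.fastq", "data/ctrl_R2.fastq")],
#     "my_samplesheet.csv",
# )
```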
Parameters used to run QAPA are specified in the conf/modules.config file. Parameters relevant to the workflow itself are:
- input: path to the samplesheet.csv
- outdir: name of the folder that the final output files are going to be in, located under QAPA/
- gtf: GTF annotation file (has to be uncompressed, not zip or gz)
- fasta: FASTA reference sequence file
- polyabed: BED 3'UTR library (has to be uncompressed, not zip or gz)
- run_qapa_build: run qapa build (default: false)
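Because the GTF and BED inputs have to be uncompressed, a quick pre-flight check can help. The sketch below is a hypothetical helper (not part of the workflow) that inspects the first bytes of a file for the standard gzip and zip signatures.

```python
# Leading bytes of gzip files and of non-empty zip archives.
GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK\x03\x04"


def is_compressed(path):
    """Return True if the file starts with a gzip or zip signature."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    return head.startswith(GZIP_MAGIC) or head.startswith(ZIP_MAGIC)


# Example call (path is a placeholder):
# if is_compressed("annotation.gtf"):
#     raise SystemExit("Please decompress the GTF before running the workflow")
```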
Notes on --polyabed and --run_qapa_build:
WARNING: APAeval's implementation of the QAPA build mode is still in beta and might create suboptimal annotations. We strongly recommend using pre-generated annotations!
- If building annotations from scratch with qapa build, please pass the --run_qapa_build flag and provide both the GTF (--gtf) and the poly(A) BED file (--polyabed).
- If providing pre-generated QAPA annotations, please pass both the GTF (--gtf) and the poly(A) BED file (--polyabed) only (DON'T pass the --run_qapa_build flag).
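The two cases above differ only in whether --run_qapa_build is passed. A minimal sketch of that logic, using only the flags named in this README, might look as follows. The function build_command and its argument names are hypothetical, and passing --fasta in both modes is an assumption based on the example commands in this README.

```python
def build_command(samplesheet, gtf, polyabed, fasta, profile,
                  build_annotation=False):
    """Assemble the nextflow command line as a list of arguments.

    build_annotation=True adds --run_qapa_build (annotations built from
    scratch); the default leaves it out (pre-generated annotations).
    """
    if profile not in ("docker", "singularity"):
        raise ValueError("profile must be 'docker' or 'singularity'")
    cmd = [
        "nextflow", "main.nf",
        "--input", samplesheet,
        "--gtf", gtf,
        "--polyabed", polyabed,
        "--fasta", fasta,  # assumption: passed in both modes
    ]
    if build_annotation:
        cmd.append("--run_qapa_build")
    cmd += ["-profile", profile]
    return cmd
```

The returned list can be handed to subprocess.run(cmd, check=True) or joined with spaces to print the command for inspection.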
The QAPA workflow only does quantification. Once parameters have been set in the conf/modules.config file, run the Nextflow pipeline with:
nextflow main.nf --input samplesheet_example_files.csv --gtf <path_to_gtf> --polyabed <path_to_poly(A)_bed> --fasta <path_to_fasta> --run_qapa_build -profile [docker/singularity]
The corresponding GTF and FASTQ files can be found in ../../test/test_data.
This workflow uses Docker containers. To run with Docker, make sure that Docker is installed and running (e.g. run the command docker --help; a help message should be printed).
To run with Docker, please indicate -profile docker:
nextflow main.nf --input samplesheet_example.csv --gtf <path_to_gtf> --polyabed <path_to_poly(A)_bed> --fasta <path_to_fasta> --run_qapa_build -profile docker
To run with Singularity, please indicate -profile singularity:
nextflow main.nf --input samplesheet_example.csv --gtf <path_to_gtf> --polyabed <path_to_poly(A)_bed> --fasta <path_to_fasta> --run_qapa_build -profile singularity
When using the default output_dir parameter value in conf/modules.config, QAPA stores results under the results/qapa folder. The quantification output BED files (one with PPAU fractions, the other with per-PAS TPM) are stored in results/qapa/<sample_name>/qapa_quant_ppau.bed and results/qapa/<sample_name>/qapa_quant_tpm.bed.
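To check that a run produced both files for every sample, the output layout described above can be sketched as below. The helper names expected_outputs and missing_outputs are hypothetical, and the default results/qapa location is assumed.

```python
from pathlib import Path


def expected_outputs(sample_name, results_dir="results/qapa"):
    """Return the two quantification BED paths expected for one sample."""
    sample_dir = Path(results_dir) / sample_name
    return {
        "ppau": sample_dir / "qapa_quant_ppau.bed",
        "tpm": sample_dir / "qapa_quant_tpm.bed",
    }


def missing_outputs(sample_names, results_dir="results/qapa"):
    """List expected output files that do not exist on disk."""
    missing = []
    for name in sample_names:
        for path in expected_outputs(name, results_dir).values():
            if not path.is_file():
                missing.append(path)
    return missing
```

Calling missing_outputs with the sample names from the sample sheet returns an empty list when every expected file is present.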
If you have any questions or comments about QAPA, please submit an issue on QAPA's GitHub repository.