Core Pipeline
Isaac Overcast edited this page Nov 21, 2015 · 6 revisions
This page lists the important files generated, and the specific config used, by each step.
In - Raw reads from 'raw_fastq_path', must be in fastq format (can be gzip compressed).
Out - Demultiplexed individuals to <work>/fastq/*.gz & info to <work>/fastq/s1_demultiplex_stats.txt
Out - <work>/edits/*.fasta & info to <work>/edits/s2_rawedit_stats.txt
Out
- Dereplicated and sorted reads: <work>/edits/*.derep
- <work>/clust_<tolerance>/
- *.htemp - FASTA file of unmatched searches (vsearch)
- *.utemp - user defined output stats from vsearch
- *.clust.gz - unaligned clusters
- *.clustS.gz - Aligned clusters (post-muscle)
- *s3_cluster_stats.txt
- If (and only if) reference sequence mapping is used:
- <work>/refmapping/
- *.sam - Raw output of smalt mapping
- *.<mapped/unmapped>.bam - bam files for mapped and unmapped reads
- *.sorted-<mapped/unmapped>.bam - sorted bam files for mapped and unmapped reads
- <work>/edits/*.fastq - Updated fastq files in the edits dir to contain only unmapped reads
- <work>/clust_<tolerance>/
- *.clustS.gz - Merged denovo clusters (post-muscle) and reference sequence aligned pileups.
- Info to <work>/edits/clust_<tolerance>/s3_cluster_stats.txt
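The clustering outputs listed above differ only by suffix, so a directory listing can be sorted into those groups mechanically. The following is a minimal sketch, not part of the pipeline itself: the helper name `classify_cluster_outputs` and the `"other"` bucket are hypothetical, and the suffix table is copied from the list above.

```python
from collections import defaultdict

# Suffixes copied from the clustering output list above;
# the descriptions paraphrase that list.
CLUSTER_SUFFIXES = {
    ".htemp": "FASTA file of unmatched searches (vsearch)",
    ".utemp": "user defined output stats from vsearch",
    ".clust.gz": "unaligned clusters",
    ".clustS.gz": "aligned clusters (post-muscle)",
    "s3_cluster_stats.txt": "cluster stats",
}


def classify_cluster_outputs(filenames):
    """Group file names by the listed suffix they end with.

    Hypothetical helper: names matching no listed suffix go to "other".
    """
    groups = defaultdict(list)
    for name in filenames:
        for suffix in CLUSTER_SUFFIXES:
            if name.endswith(suffix):
                groups[suffix].append(name)
                break
        else:
            # No listed suffix matched this file name.
            groups["other"].append(name)
    return dict(groups)
```

Passing the result of `os.listdir()` on a `clust_<tolerance>` directory to this function would show at a glance which of the expected file types are present.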
Out
- Info to <work>/edits/clust_<tolerance>/s4_Pi_E_estimate.txt
Out
- consensus reads: <work>/edits/clust_<tolerance>/consens_<outprefix>/
- *.consens - FASTA file of consensus reads
- *...hd5f... - database storage (maybe) of read depths
Out
- ordered consensus reads: <work>/edits/clust_<tolerance>/consens_<outprefix>/cat...
- vsearch matching output: <work>/edits/clust_<tolerance>/consens_<outprefix>/cat.utemp
- database of all clusters containing depth data: ...
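The info files named on this page can be collected in one place, which helps when checking how far a run has progressed. The sketch below is illustrative only and rests on several assumptions. The function name `stats_files` is hypothetical. The step numbers are inferred from the `s1` to `s4` prefixes of the file names. The first file is assumed to live under `<work>` like the others. The tolerance value is substituted into the directory name as given.

```python
from pathlib import PurePosixPath


def stats_files(work, tolerance):
    """Return the info file paths named on this page, keyed by step number.

    Hypothetical helper: step numbers are inferred from the s1..s4 file
    name prefixes, and step 1 is assumed to sit under <work>.
    """
    work = PurePosixPath(work)
    # Directory written on this page as <work>/edits/clust_<tolerance>/
    clust = work / "edits" / f"clust_{tolerance}"
    return {
        1: work / "fastq" / "s1_demultiplex_stats.txt",
        2: work / "edits" / "s2_rawedit_stats.txt",
        3: clust / "s3_cluster_stats.txt",
        4: clust / "s4_Pi_E_estimate.txt",
    }
```

For example, `stats_files("work", "0.85")[3]` yields `work/edits/clust_0.85/s3_cluster_stats.txt`; the tolerance string `"0.85"` is a made-up value for illustration.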