Commands
Running picky.pl without any parameter
./picky.pl
lists the commands that Picky supports:
Please specify the command.
./picky.pl <command> -h
<command> [hashFq, selectRep, callSV]
hashFq : hash read uuids to friendly ids
lastParam : Last parameters for alignment
selectRep : select representative alignments for read
callSV : call structural variants
xls2vcf : convert Picky sv xls file to vcf
sam2align : convert sam to align format
preparepbs: chunk last fastq file and write pbs script for cluster submission
script : write a bash-script for single fastq processing
OPTIONAL step: hash read UUIDs to human-friendly IDs.
./picky.pl hashFq --pfile <passFQFile> --ffile <failFQFile> --oprefix <outputPrefix>
--pfile STR pass .fastq file
--ffile STR fail .fastq file
--oprefix STR prefix to output filename
lastParam returns the suggested alignment parameters to be used with lastal.
It is advisable to use ./picky.pl script to set up the pipeline. See Quick start for an example.
Select representative alignments from lastal's maf output.
./picky.pl selectRep [--thread <numberOfThreads>] [--preload <preloadFold>]
--thread INT number of threads
--preload INT Fold of thread count to preload maf records
Read from STDIN
Writes the selected representative alignments for each read directly to the console (stdout) in .align format.
parameter | description |
---|---|
--thread | number of threads to be used for alignment. For faster turnaround, use more threads, but do not exceed the number of cores available on your machine. |
--preload | fold of preloading for read alignments. Uses more memory but shortens turnaround time by allowing the alignment and selectRep steps to run concurrently. |
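As a rough sketch of the --thread advice, the requested thread count can be capped at the number of online cores before it is passed to Picky. The variable names (`wanted`, `cores`, `threads`) are illustrative and not part of Picky, and the sketch assumes `getconf _NPROCESSORS_ONLN` is available, falling back to 1 otherwise.

```shell
# Requested thread count (hypothetical value for illustration).
wanted=8

# Number of online cores; fall back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
case $cores in
  ''|*[!0-9]*) cores=1 ;;   # guard against empty or non-numeric output
esac

# Never ask for more threads than there are cores.
if [ "$wanted" -gt "$cores" ]; then
  threads=$cores
else
  threads=$wanted
fi

echo "--thread $threads"
```

The resulting value can then be supplied to `./picky.pl selectRep --thread`.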
Perform SV calling on the .align file generated by picky.pl selectRep.
./picky.pl callSV --in <alignFile> --fastq <fqFile> --lastpara <last parameters> [--genome <genomeFastaFile> --removehomopolymerdeletion] [--sam] [--exclude <chromosomeToExclude> [--exclude <anotherChromosomeToExclude>]]
--oprefix STR prefix for output files
--fastq STR .fastq file
--lastpara STR lastal parameters used
--removehomopolymerdeletion
exclude DEL and INDEL possibly affected by homopolymer
--genome STR genome sequence in .fasta file
--sam flag to output .sam file
--exclude STR exclude SV involving specified chromosome
(specify each chromosome with --exclude individually)
--multiloci report SVs on best alignment of multi-loci alignments
Provide the .align file from picky.pl selectRep via STDIN or "--in".
Outputs a set of SV .xls files along with auxiliary files. See Output Format's Set 2 : SVs Calling.
parameter | description |
---|---|
--oprefix | prefix for output files |
--fastq | fastq file containing reads analyzed |
--lastpara | the lastal parameters used, which will be recorded in the .sam file |
--sam | flag indicating that a .sam file should be generated |
--exclude | exclude SVs involving the specified chromosome; specify each chromosome with --exclude individually |
--multiloci | report SVs on the best alignment of multi-loci alignments |
--removehomopolymerdeletion | OPTIONAL: exclude DEL and INDEL calls possibly affected by homopolymers. ONLY necessary if you are using fastq files from earlier base-calling |
--genome | OPTIONAL: genome sequence in .fasta file; ONLY necessary if you are using "--removehomopolymerdeletion" |
Convert the .xls SV files generated by picky.pl callSV to a .vcf file.
./picky.pl xls2vcf --xls <picky_xls_file> [--chrom <chromosome>] [--re <minReadsSupport>]
--xls STR picky SV xls file
--chrom STR restrict output to specified chromosomes [e.g. chr20]
--re INT min number of read evidence [default:2]
--merge window to merge SV [default: 1000 bp]
--converge window within which SVs are considered converged concordantly [default: 20 bp]
parameter | description |
---|---|
--xls | SV .xls file generated by picky.pl callSV. Specify multiple .xls files either separated by commas or each prefixed with --xls, i.e. "--xls sv.del.xls,sv.indel.xls" and "--xls sv.del.xls --xls sv.indel.xls" are equivalent |
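When the list of .xls files is built up in a script, the comma-separated form can be assembled with a small helper. This is only a sketch: `join_by_comma` is a hypothetical helper, not part of Picky, and the file names are the ones used in the table row above.

```shell
# Join all arguments with commas (POSIX sh; IFS is changed only inside the subshell).
join_by_comma() {
  ( IFS=,; printf '%s\n' "$*" )
}

# File names taken from the --xls example; substitute your own callSV outputs.
xls_arg=$(join_by_comma sv.del.xls sv.indel.xls)

echo "--xls $xls_arg"   # prints: --xls sv.del.xls,sv.indel.xls
```

The printed argument is the first of the two equivalent forms accepted by `./picky.pl xls2vcf`.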
Output directly to console (stdout) in .vcf format.
parameter | description |
---|---|
--chrom | report SVs found on specified chromosomes |
--re | report SVs that have at least this number of supporting reads [default:2] |
--merge | window to merge SV [default: 1000 bp] |
--converge | window within which SVs are considered converged concordantly [default: 20 bp] |
sam2align converts .sam content to the .align format for picky.pl callSV.
Input from console (stdin) in .sam format.
NOTE: The sam records should be read-blocked, i.e. alignment records from the same read should be contiguous. In the '@HD' header line, the 'SO:' tag must either have the value "queryname" or be omitted.
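The read-blocked requirement can be checked before conversion. The following is a minimal sketch, not part of Picky: `check_read_blocked` is a hypothetical helper that scans the first column (the read name) with awk and fails if a name reappears after a different read has been seen. It keeps every read name in memory, so it may be slow on very large files.

```shell
# check_read_blocked [sam_file ...]   (reads stdin when no file is given)
# Exit status 0: records of each read are contiguous; 1: some read is split up.
check_read_blocked() {
  awk -F '\t' '
    /^@/ { next }                          # skip header lines
    $1 != prev {
      if ($1 in seen) { bad = 1; exit }    # read name reappears after a gap
      seen[$1] = 1
      prev = $1
    }
    END { exit bad }
  ' "$@"
}
```

For example, with a hypothetical file `reads.sam`, one could run `check_read_blocked reads.sam && ./picky.pl sam2align < reads.sam > reads.align`. If the check fails, sorting by read name (for instance with `samtools sort -n`) is one way to make the records contiguous again.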
Output directly to console (stdout) in .align format.
NOTE: Many of the .align output columns are specific to LAST output and exist for traceability. sam2align outputs only the minimum columns needed for callSV: qStrand, qStart and qEnd for the read/query, and refId, refStrand, refStart and refEnd for the reference/subject.
Chunk the specified .fastq file and write PBS scripts instantiated from the template "template.pbs" for all chunks. This prepares the files for cluster jobs to be submitted.
./picky.pl preparePBS --fastq <fastq_file> [--chunksize <numberOfReadsPerChunk>] [--template <template_file>]
--fastq STR fastq file
--chunksize INT number of fastq record per chunk file [default: 1000]
--template STR template file for PBS script [default: template.pbs]
--init STR write a copy of the template to specific file
See cluster support for a detailed example.
parameter | description |
---|---|
--fastq | fastq file to be analyzed |
Writes one chunk .fastq file for every <chunksize> fastq records in the specified fastq file, along with the corresponding PBS script.
For a fastq file "SCP20.fastq" with, say, 277,054 reads and the default chunk size of 1000, Picky preparepbs will generate 278 chunk .fastq files (SCP20-c000001.fastq, SCP20-c000002.fastq, ..., SCP20-c000278.fastq) and the corresponding 278 PBS scripts (SCP20-c000001.pbs, SCP20-c000002.pbs, ..., SCP20-c000278.pbs).
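The chunk count in this example follows from a ceiling division of the read count by the chunk size. The sketch below reproduces that arithmetic and the file-name pattern; the variable names are illustrative, and the numbering format is inferred from the file names listed in the example rather than from Picky's source.

```shell
reads=277054      # number of reads in SCP20.fastq (from the example)
chunksize=1000    # default --chunksize

# Ceiling division: a final partial chunk still needs its own file.
chunks=$(( (reads + chunksize - 1) / chunksize ))   # 278

# First and last chunk names, using the 6-digit zero-padded pattern.
printf 'SCP20-c%06d.fastq\n' 1 "$chunks"
# prints:
#   SCP20-c000001.fastq
#   SCP20-c000278.fastq
```

The same calculation helps when choosing --chunksize: doubling it to 2000 would give 139 chunks for this file.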
parameter | description |
---|---|
--chunksize | number of fastq records per chunk file [default: 1000]. A larger chunksize means a longer run time per chunk but fewer chunk files to manage. Adjust this value according to your needs and your available cluster resources and configuration. |
--template | template file for PBS script [default: template.pbs]. Omit to use the default template, or specify your project-specific template. |
--init | write a copy of the template to the specified file. Can be used to create "template.pbs" or an initial project-specific template. |
Write a bash script for the Picky pipeline that strings together lastal alignment, picky selectRep, picky callSV and picky xls2vcf.
./picky.pl script --fastq <fastq_file> [--thread <numberOfThreads>] [--preload <preload_fold_of_reads_alignments>]
--fastq STR fastq file
--thread INT number of threads to use [default: 4]
--preload INT fold of preloading for read alignments [default: 6]
See Quick start's Picky processes and The Picky Script for an example.
parameter | description |
---|---|
--fastq | fastq file to be analyzed |
The parameterized bash script is written directly to the console (stdout); it can be redirected to a file or stream-edited.
parameter | description |
---|---|
--thread | number of threads to be used for alignment. For faster turnaround, use more threads, but do not exceed the number of cores available on your machine. |
--preload | fold of preloading for read alignments. Uses more memory but shortens turnaround time by allowing the alignment and selectRep steps to run concurrently. |
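Because the script command writes to stdout, capturing its output in a file is a one-line redirection. The sketch below wraps that in a small helper; `write_pipeline_script` and the `PICKY` variable are hypothetical names introduced here, and the --thread and --preload values simply spell out the documented defaults.

```shell
# Path to picky.pl; override PICKY if it lives elsewhere (assumption).
PICKY=${PICKY:-./picky.pl}

# write_pipeline_script <fastq_file> <output_script>   (hypothetical helper)
write_pipeline_script() {
  # --thread 4 and --preload 6 are the documented defaults, made explicit here.
  "$PICKY" script --fastq "$1" --thread 4 --preload 6 > "$2" || return 1
  chmod +x "$2"   # make the generated bash script executable
}
```

For instance, `write_pipeline_script SCP20.fastq SCP20.sh` would leave the generated pipeline in `SCP20.sh`, where it can be reviewed or edited before it is run.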