This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

Initial commit version 1.11.0
armintoepfer committed Aug 17, 2018
0 parents commit de471a6
Showing 28 changed files with 715 additions and 0 deletions.
34 changes: 34 additions & 0 deletions LICENSE
@@ -0,0 +1,34 @@
Copyright (c) 2016-2018, Pacific Biosciences of California, Inc.

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted (subject to the limitations in the
disclaimer below) provided that the following conditions are met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

* Neither the name of Pacific Biosciences nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE
GRANTED BY THIS LICENSE. THIS SOFTWARE IS PROVIDED BY PACIFIC
BIOSCIENCES AND ITS CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL PACIFIC BIOSCIENCES OR ITS
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
51 changes: 51 additions & 0 deletions README.md
@@ -0,0 +1,51 @@
<p align="center">
<img src="doc/img/minorseq.png" alt="minorseq logos" width="400px"/>
</p>
<h1 align="center">MinorSeq</h1>
<p align="center">Minor Variant Calling and Phasing Tools</p>

***
## Availability
The latest pre-release, developers-only Linux binaries can be installed via [bioconda](https://bioconda.github.io/):

    conda install minorseq

These binaries are not ISO compliant.
For research only.
Not for use in diagnostic procedures.

Official support is only provided for official and stable [SMRT Analysis builds](http://www.pacb.com/products-and-services/analytical-software/)
provided by PacBio.

Unofficial support for binary pre-releases is provided via GitHub issues,
not via mail to developers.

## Quick Tools Overview

### [End-to-end workflow](doc/INTRODUCTION.md)

Overview of how to run your sample.

### [Minor variant caller](doc/JULIET.md)

`juliet` identifies minor variants from aligned CCS reads.

### [Reduce alignment](doc/FUSE.md)

`fuse` reduces an alignment into its closest representative sequence.

### [Swap BAM reference](doc/CLERIC.md)

`cleric` swaps the reference of an alignment by transitive alignment.

### [Minor variant pipeline](doc/JULIETFLOW.md)

`julietflow` automates the minor variant pipeline.

### [Mix Data _In-Silico_](doc/MIXDATA.md)

`mixdata` helps to mix clonal strains _in-silico_ for benchmarking studies.

## Disclaimer

THIS WEBSITE AND CONTENT AND ALL SITE-RELATED SERVICES, INCLUDING ANY DATA, ARE PROVIDED "AS IS," WITH ALL FAULTS, WITH NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE. YOU ASSUME TOTAL RESPONSIBILITY AND RISK FOR YOUR USE OF THIS SITE, ALL SITE-RELATED SERVICES, AND ANY THIRD PARTY WEBSITES OR APPLICATIONS. NO ORAL OR WRITTEN INFORMATION OR ADVICE SHALL CREATE A WARRANTY OF ANY KIND. ANY REFERENCES TO SPECIFIC PRODUCTS OR SERVICES ON THE WEBSITES DO NOT CONSTITUTE OR IMPLY A RECOMMENDATION OR ENDORSEMENT BY PACIFIC BIOSCIENCES.
44 changes: 44 additions & 0 deletions doc/CLERIC.md
@@ -0,0 +1,44 @@
<h1 align="center">
Cleric - Swap BAM alignment reference
</h1>

<p align="center">
<img src="img/cleric.png" alt="Logo of Cleric" width="200px"/>
</p>

## Install
Install the minorseq suite using bioconda; more info is available [here](../README.md).
One of the binaries is called `cleric`.

## Input data
*Cleric* operates on aligned records in the BAM format, together with the
original reference and the target reference in FASTA format.
BAM files have to be PacBio-compliant, meaning the CIGAR operation `M` is forbidden.
Two reference sequences have to be provided, either in individual files or combined in one file.
The header of the original reference must match the reference name in the BAM file.
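
Both input constraints can be checked before running Cleric. The following is a minimal sketch and not part of the minorseq suite: the helper names `count_cigar_m`, `sam_ref_names`, and `fasta_names` are hypothetical, and the functions read plain SAM or FASTA text on standard input (for example, SAM text produced by `samtools view -h`).

```sh
# Hypothetical helper: count SAM records whose CIGAR string contains 'M'.
# Header lines (starting with '@') are skipped; column 6 is the CIGAR.
count_cigar_m() {
  awk -F'\t' '!/^@/ && $6 ~ /M/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: print the reference names (SN: fields of @SQ lines).
sam_ref_names() {
  awk -F'\t' '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }'
}

# Hypothetical helper: print the sequence names of a FASTA stream
# (the first word after '>').
fasta_names() {
  awk '/^>/ { sub(/^>/, ""); print $1 }'
}

# Usage sketch, assuming samtools is installed:
#   samtools view -h m530526.align.bam | count_cigar_m   # should print 0
#   samtools view -H m530526.align.bam | sam_ref_names   # compare with:
#   fasta_names < reference.fasta
```

A count of zero from `count_cigar_m` and a shared name between the last two commands suggest the inputs satisfy the requirements above.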

## Scope
The current scope of *Cleric* is converting a given alignment to a different
reference. To do this, the original and target reference sequences are aligned
to each other, and that reference-to-reference alignment is then applied
transitively to generate the new alignment.
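
The idea behind a transitive alignment can be illustrated with coordinates: if a read base is aligned to position p of the original reference, and position p is aligned to position q of the target reference, the read base can be placed at q. The sketch below shows only this coordinate projection under simplifying assumptions; it is not Cleric's implementation. The function name `map_position` is hypothetical, and its inputs are the two rows of a gapped pairwise alignment, with `-` as the gap character.

```sh
# Hypothetical helper: project a 1-based position of the original reference
# onto the target reference, given the two rows of their gapped alignment.
# Prints the target position, "-" if the base faces a gap, or "NA" if the
# position lies beyond the original reference.
map_position() {
  awk -v a="$1" -v b="$2" -v pos="$3" 'BEGIN {
    i = 0; j = 0                      # bases consumed in original / target
    for (k = 1; k <= length(a); k++) {
      ca = substr(a, k, 1); cb = substr(b, k, 1)
      if (ca != "-") i++
      if (cb != "-") j++
      if (ca != "-" && i == pos) { print (cb == "-" ? "-" : j); exit }
    }
    print "NA"
  }'
}

# Example: original row "ACG-T", target row "A-GCT".
map_position 'ACG-T' 'A-GCT' 3   # prints 2
```

In this example the third base of the original (`G`) lands on the second base of the target, because the target row has a gap opposite the original `C`.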

## Output
*Cleric* writes a BAM file whose name is given by the last argument.

## Example
Simple example:
```
cleric m530526.align.bam reference.fasta new_ref.fasta cleric_output.bam
```

Or:
```
cat reference.fasta new_ref.fasta > combined.fasta
cleric m530526.align.bam combined.fasta cleric_output.bam
```

## FAQ
### Cleric does not finish.
Runtime is linear in the number of reads provided. The reference-to-reference
alignment step, however, runs Needleman-Wunsch, whose runtime grows with N×M
for reference lengths N and M. Please do not provide references with
lengths of human chromosomes, but concentrate on your actual amplicon target.
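
To get a feeling for the N×M cost before starting a run, the size of the dynamic-programming matrix can be estimated from the reference lengths. This is a rough sketch with hypothetical helper names (`fasta_length`, `nw_cells`); it counts matrix cells only and says nothing about Cleric's actual memory use or speed.

```sh
# Hypothetical helper: total number of sequence characters in a FASTA stream.
fasta_length() {
  awk '!/^>/ { gsub(/[ \t\r]/, ""); n += length($0) } END { print n + 0 }'
}

# Hypothetical helper: number of cells in an N x M alignment matrix.
nw_cells() {
  echo $(( $1 * $2 ))
}

# Usage sketch with the file names from the example above:
#   nw_cells "$(fasta_length < reference.fasta)" "$(fasta_length < new_ref.fasta)"
nw_cells 10000 10000   # prints 100000000
```

Two amplicon-sized references of 10,000 bases each give 10^8 cells, whereas two sequences of 10^8 bases each would give 10^16 cells, which is why chromosome-scale references are impractical.
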
32 changes: 32 additions & 0 deletions doc/FUSE.md
@@ -0,0 +1,32 @@
<h1 align="center">
fuse - Reduce alignment into its representative sequence
</h1>

<p align="center">
<img src="img/fuse.png" alt="Logo of Fuse" width="100px"/>
</p>

## Install
Install the minorseq suite using bioconda; more info is available [here](../README.md).
One of the binaries is called `fuse`.

## Input data
*Fuse* operates on aligned records in the BAM format.
BAM files have to be PacBio-compliant, meaning the CIGAR operation `M` is forbidden.

## Scope
The current scope of *Fuse* is the creation of a high-quality consensus sequence.
Fuse includes in-frame insertions with a certain distance to each other.
Major deletions are removed.
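
As a rough intuition for what a representative sequence is, the sketch below computes a plain per-column majority vote over pre-aligned rows and drops columns where a gap wins. This is a deliberately simplified stand-in, not Fuse's algorithm: it ignores the in-frame insertion handling described above, and the function name `majority_consensus` and the input format (one gapped row per line, `-` as gap) are assumptions made for illustration.

```sh
# Hypothetical helper: majority-vote consensus over gapped, equal-length rows
# read from standard input. Columns whose most frequent symbol is a gap are
# omitted from the output. Ties go to the first symbol in the order A C G T -.
majority_consensus() {
  awk '{
    if (length($0) > width) width = length($0)
    for (k = 1; k <= length($0); k++) count[k, substr($0, k, 1)]++
  }
  END {
    split("A C G T -", alphabet, " ")
    for (k = 1; k <= width; k++) {
      best = ""; bestn = 0
      for (a = 1; a <= 5; a++) {
        c = alphabet[a]
        if (count[k, c] > bestn) { best = c; bestn = count[k, c] }
      }
      if (best != "-") out = out best
    }
    print out
  }'
}

# Example: the gap wins in column 2, so that column is dropped.
printf 'A-GT\nA-GT\nACGT\n' | majority_consensus   # prints AGT
```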

## Output
*Fuse* writes one FASTA file per input. The output file name is given by the
second argument.

## Example
Simple example:
```
fuse m530526.align.bam m530526.fasta
```

Output: `m530526.fasta`
59 changes: 59 additions & 0 deletions doc/INTRODUCTION.md
@@ -0,0 +1,59 @@
## How to run your sample 101

### Step 1
A simple bioconda installation:

```sh
conda install minorseq
```

--------
### Step 2
Create CCS2 reads from your Sequel chip

> Juliet currently uses PacBio CCS reads as input. The use of CCS rich QVs allows sensitive minor variant calling.
```
ccs --richQVs m54000_170101_050702_3545456.subreadset.xml yourdata.ccs.bam
```

--------
### Step 3
Filter CCS2 reads as described here: [JULIETFLOW.md#filtering](JULIETFLOW.md#filtering)

> To ensure a uniform noise profile, we filter to 99% predicted
> accuracy. Barcode demultiplexing can also be done at this step.
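
The accuracy threshold can be sketched as a filter over SAM text. This is only an illustration of the 99% cutoff, assuming the predicted accuracy is stored in the `rq` float tag as in PacBio CCS BAM files; the recommended procedure is the one described in JULIETFLOW.md. The function name `filter_by_rq` is hypothetical.

```sh
# Hypothetical helper: keep SAM header lines and records whose rq tag
# (predicted accuracy) is at least the given minimum (default 0.99).
# Optional tags start at column 12 of a SAM record.
filter_by_rq() {
  awk -F'\t' -v min="${1:-0.99}" '
    /^@/ { print; next }
    {
      for (i = 12; i <= NF; i++)
        if ($i ~ /^rq:f:/ && substr($i, 6) + 0 >= min + 0) { print; break }
    }'
}

# Usage sketch, assuming samtools is installed:
#   samtools view -h yourdata.ccs.bam | filter_by_rq 0.99 |
#     samtools view -b -o yourdata.filtered.ccs.bam -
```

Records without an `rq` tag are dropped by this sketch, since their predicted accuracy is unknown.
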
--------
### Step 4
Download the reference sequence of interest as `ref.fasta`

> Juliet currently calls amino acid variants relative to a given reference
> sequence so that they can be easily related to known variants.
--------
### Step 5
Create a target-config for your gene as described here: [JULIET.md#target-configuration](JULIET.md#target-configuration)

> The target-config specifies the open reading frames and how specific amino
> acids should be labeled in the output results (e.g., Disease Resistant
> Mutation variants).

--------
### Step 6
Run *julietflow*

> The invocation is simple: it takes the sequencing reads, the
> reference, and the reference annotation config (the target-config).
```
julietflow -i yourdata.filtered.ccs.bam -r ref.fasta -c targetconfig.json
```
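
Because the run depends on three input files, a small pre-flight check can catch a missing or empty file before the pipeline starts. The following is a minimal sketch; the function name `require_files` is hypothetical and not part of the minorseq suite.

```sh
# Hypothetical helper: succeed only if every named file exists and is non-empty.
require_files() {
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty input: $f" >&2
      return 1
    fi
  done
}

# Usage sketch with the file names from the command above:
#   require_files yourdata.filtered.ccs.bam ref.fasta targetconfig.json &&
#     julietflow -i yourdata.filtered.ccs.bam -r ref.fasta -c targetconfig.json
```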

--------
### Step 7
Interpret results in `yourdata.json` or `yourdata.html`

> `yourdata.html` is easily viewed in a web browser and reflects the
> underlying results stored in `yourdata.json`.