(updated on 2/9/2017)
The Horizomer pipeline relies on multiple external applications. We are working to simplify the installation process; until then, please follow the instructions in this document.
A Unix-like operating system (e.g., Linux or Mac OS) is required to run Horizomer.
Note: The pipeline has been tested on CentOS 6.6, Ubuntu 16.04, and Mac OS 10.11.
Install Miniconda or Anaconda on the system. The Python 3 version is preferred.
Create a conda environment and install required libraries:
conda create -n horizomer python=3.5
source activate horizomer
conda install click biopython
conda install -c biocore scikit-bio
Install third-party applications using conda:
conda install -c bioconda blast diamond fasttree mafft mcl muscle prodigal=2.6.2 raxml t_coffee trimal
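While the environment is still active, it can be useful to confirm that the applications are actually reachable. The following is a minimal sketch, not part of Horizomer: `check_tools` is a hypothetical helper, and the executable names passed to it (for example `blastp`, `FastTree`, `raxmlHPC`) are assumptions about what the conda packages install and may differ on your system.

```shell
# Report which of the given executables can be found on the current PATH.
# Returns 0 if all are found, 1 if any is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to match your installation.
check_tools blastp diamond FastTree mafft mcl muscle prodigal raxmlHPC t_coffee trimal \
    || echo "Some tools were not found on PATH."
```

Any line starting with `missing:` points to a package that was not installed or whose executable has a different name.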
Exit the conda environment when done.
source deactivate
Some applications require Python 2 (specifically, PhyloPhlAn and OrthoFinder). Create a second conda environment for them. The command below installs OrthoFinder; PhyloPhlAn has to be obtained separately (see the table below):
conda create -n horizomer_py2 -c bioconda python=2.7 biopython orthofinder
The remaining applications have to be installed manually. Please refer to the table below for details.
(Note: applications whose "How to install" column reads "conda" are already installed if you have run the commands above.)
Name | Tested Version | Purpose | How to install | PMID |
---|---|---|---|---|
T-REX | 3.22 | HGT detection | download & compile | 20525630 |
RANGER-DTL | 2.0 | HGT detection | download binary | 22689773 |
PhyloNet | 3.6.1 | HGT detection | download binary | 18662388 |
Jane | 4.01 | HGT detection | download binary (!license!) | 20181081 |
TREE-PUZZLE | 5.3.rc16 | HGT detection | download & compile | 11934758 |
CONSEL | 0.20 | HGT detection | download | 11751242 |
DarkHorse | 1.5 rev170 | HGT detection | download & install | 17274820 |
HGTector | 0.2.1 | HGT detection | git clone | 25159222 |
EGID | 1.0 | HGT detection | download | 22355228 |
GeneMarkS | 4.30 | HGT detection | download binary (!license!) | 9461475 |
OrthoFinder | 1.1.4 | orthology identification | conda | 26243257 |
Phylomizer | 9/12/2016 | gene family tree building | git clone | NA |
PhyloPhlAn | 1/25/2017 | genome tree building | hg clone | 23942190 |
BLAST+ | 2.5.0 | sequence similarity search | conda | 2231712 |
DIAMOND | 0.8.28 | sequence similarity search | conda | 25402007 |
USEARCH | 9.1.13 | sequence similarity search | download binary (!license!) | 20709691 |
MAFFT | 7.305 | sequence alignment | conda | 12136088 |
MUSCLE | 3.8.31 | sequence alignment | conda | 15034147 |
KAlign | 2.03 | sequence alignment | download & compile | 16343337 |
T-Coffee | 11.00.8cbe486 | alignment combining | conda | 10964570 |
trimAl | 1.4.1 | alignment trimming | conda | 19505945 |
RAxML | 8.2.9 | tree building | conda | 24451623 |
PhyML | 3.2.20160530 | tree building | download & compile | 20525638 |
FastTree | 2.1.9 | tree building | conda | 20224823 |
MCL | 14-137 | clustering | conda | 22144159 |
Prodigal | 2.6.3 | gene calling | conda | 20211023 |
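Manually downloaded or compiled applications usually need to be discoverable by the shell. The sketch below assumes that the tools are found through `PATH`, which this document does not state explicitly. `add_to_path` is a hypothetical helper and `$HOME/horizomer_apps/bin` is a placeholder directory, not a location Horizomer prescribes.

```shell
# Prepend a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present; nothing to do
        *) PATH="$1:$PATH" ;;     # otherwise put it in front
    esac
    export PATH
}

# Hypothetical install location; replace with wherever you placed the binaries.
add_to_path "$HOME/horizomer_apps/bin"
```

Calling the function twice with the same directory leaves `PATH` unchanged the second time, so it is safe to put in a shell startup file.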
Many of the applications can be installed directly from conda channels. This not only simplifies the installation process but also keeps the pipeline self-contained: the installed applications are available only from within the conda environment.
Available in bioconda:
- mcl=14.137
- t_coffee=11.0.8
- muscle=3.8.1551
- raxml=8.2.9
- trimal=1.4.1
- diamond=0.8.24 (Note: DIAMOND in bioconda is newer than that in biocore)
- blast-legacy=2.2.22
- blast=2.2.31
- mafft=7.305
- orthofinder=1.1.2 (Note: OrthoFinder is only compatible with Python 2)
- prodigal=2.6.2 (Note: the default version is 2.60, which is older)
- fasttree=2.1.9 (Note: this install has double precision support, which is non-default but critical)
Available in biocore:
- prodigal=2.6.2
- blast-legacy=2.2.22
- blast-plus=2.2.31
- diamond=0.7.10
Ubuntu users can also install some applications from the system package repositories. For example, PhyML and Kalign, which are not available from conda, can be installed by:
sudo apt-get install phyml kalign
Please note that the installation takes effect system-wide.
Available from the default Ubuntu 16.04 LTS repository (universe):
- fasttree (2.1.8)
- kalign (2.03+20110620)
- mafft (7.271)
- mcl (14-137)
- muscle (3.8.31)
- mysql-server (5.7.17)
- ncbi-blast+ (2.2.31)
- phyml (3.2.0)
- prodigal (2.6.2)
- raxml (8.2.4)
- t-coffee (11.00.8cbe486)
Dependencies among the applications:
- OrthoFinder requires MCL, FastTree, BLAST+, and MAFFT
- Phylomizer requires BLAST+, KAlign, MAFFT, MUSCLE, T-Coffee, and PhyML
- DarkHorse requires MySQL
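The dependencies listed above can be kept in a small lookup so that you can see what must be in place before installing a given application. This is only an illustrative sketch: `deps_for` is a hypothetical helper, and it covers nothing beyond the three applications named above.

```shell
# Print the prerequisites of an application, as listed in this document.
# Prints an empty line for applications with no listed prerequisites.
deps_for() {
    case "$1" in
        OrthoFinder) echo "MCL FastTree BLAST+ MAFFT" ;;
        Phylomizer)  echo "BLAST+ KAlign MAFFT MUSCLE T-Coffee PhyML" ;;
        DarkHorse)   echo "MySQL" ;;
        *)           echo "" ;;
    esac
}

# Show the prerequisites of each application that has any.
for app in OrthoFinder Phylomizer DarkHorse; do
    echo "$app: $(deps_for "$app")"
done
```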
The ETE team has released a conda package, ete3_external_apps, which bundles multiple popular phylogenetics applications (including PhyML and Kalign). It is not required for Horizomer, but if you wish to install it, run:
conda install -c etetoolkit ete3_external_apps
(WIP: A script will be created to automate the installation of most of these applications. However, a few of them cannot be automated because of license restrictions.)