Installation

(updated on 2/9/2017)

The Horizomer pipeline relies on multiple external applications. We are working to simplify the installation process. Please follow the instructions in this document.

Operating system

A Unix-like operating system (e.g., Linux or Mac OS) is required to run Horizomer.

Note: The pipeline has been tested on CentOS 6.6, Ubuntu 16.04, and Mac OS 10.11.

Recommended installation procedures

Install Miniconda or Anaconda on your system. The Python 3 version is preferred.

Create a conda environment and install required libraries:

conda create -n horizomer python=3.5
source activate horizomer
conda install click biopython
conda install -c biocore scikit-bio

Install third-party applications using conda:

conda install -c bioconda blast diamond fasttree mafft mcl muscle prodigal=2.6.2 raxml t_coffee trimal
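Before leaving the environment, it can help to confirm that the installed programs are actually on the PATH. The following is a minimal sketch, not part of Horizomer: `check_tools` is a hypothetical helper, and the executable names passed to it (for example `blastp`, `FastTree`, `raxmlHPC`) are assumptions about what the conda packages install, so adjust them to match your setup.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given executables are not on PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero
# if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names below are assumptions; edit them to match your installation.
check_tools blastp diamond FastTree mafft mcl muscle prodigal raxmlHPC \
    t_coffee trimal || echo "Some tools were not found; see the list above."
```

Run it while the `horizomer` environment is still active, because the tools are visible only from within it.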

Exit the conda environment when done.

source deactivate

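For applications that require Python 2 (specifically PhyloPhlAn and OrthoFinder), create a second conda environment. The command below installs OrthoFinder and Biopython into it; PhyloPhlAn itself is obtained separately (see the table below).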

conda create -n horizomer_py2 -c bioconda python=2.7 biopython orthofinder

The remaining applications have to be installed manually. Please refer to the table below for details.

List of required external applications

(Note: applications whose "How to install" column reads "conda" are already installed if you have run the commands above.)

| Name | Tested version | Purpose | How to install | PMID |
| --- | --- | --- | --- | --- |
| T-REX | 3.22 | HGT detection | download & compile | 20525630 |
| RANGER-DTL | 2.0 | HGT detection | download binary | 22689773 |
| PhyloNet | 3.6.1 | HGT detection | download binary | 18662388 |
| Jane | 4.01 | HGT detection | download binary (!license!) | 20181081 |
| TREE-PUZZLE | 5.3.rc16 | HGT detection | download & compile | 11934758 |
| CONSEL | 0.20 | HGT detection | download | 11751242 |
| DarkHorse | 1.5 rev170 | HGT detection | download & install | 17274820 |
| HGTector | 0.2.1 | HGT detection | git clone | 25159222 |
| EGID | 1.0 | HGT detection | download | 22355228 |
| GeneMarkS | 4.30 | HGT detection | download binary (!license!) | 9461475 |
| OrthoFinder | 1.1.4 | orthology identification | conda | 26243257 |
| Phylomizer | 9/12/2016 | gene family tree building | git clone | NA |
| PhyloPhlAn | 1/25/2017 | genome tree building | hg clone | 23942190 |
| BLAST+ | 2.5.0 | sequence similarity search | conda | 2231712 |
| DIAMOND | 0.8.28 | sequence similarity search | conda | 25402007 |
| USEARCH | 9.1.13 | sequence similarity search | download binary (!license!) | 20709691 |
| MAFFT | 7.305 | sequence alignment | conda | 12136088 |
| MUSCLE | 3.8.31 | sequence alignment | conda | 15034147 |
| KAlign | 2.03 | sequence alignment | download & compile | 16343337 |
| T-Coffee | 11.00.8cbe486 | alignment combining | conda | 10964570 |
| trimAl | 1.4.1 | alignment trimming | conda | 19505945 |
| RAxML | 8.2.9 | tree building | conda | 24451623 |
| PhyML | 3.2.20160530 | tree building | download & compile | 20525638 |
| FastTree | 2.1.9 | tree building | conda | 20224823 |
| MCL | 14-137 | clustering | conda | 22144159 |
| Prodigal | 2.6.3 | gene calling | conda | 20211023 |

Appendix: Availability of applications

Availability from conda channels

Many of the applications can be installed directly from conda channels. This simplifies installation and keeps the pipeline modular: the installed applications are available only from within the conda environment.
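Because the applications exist only inside the environment, a wrapper script may want to check that the environment is active before calling them. The sketch below is an illustration under assumptions, not part of Horizomer: `require_env` is a hypothetical helper, and it relies on conda setting the `CONDA_DEFAULT_ENV` variable to the environment name on activation (environments activated by path may report a path instead).

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the named conda environment is active.
# Assumes conda exports CONDA_DEFAULT_ENV when an environment is activated.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
        echo "Please activate the '$1' conda environment first." >&2
        return 1
    fi
}

# Example: warn if the 'horizomer' environment is not the active one.
require_env horizomer || echo "horizomer environment is not active"
```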

Available in bioconda:

  • mcl=14.137
  • t_coffee=11.0.8
  • muscle=3.8.1551
  • raxml=8.2.9
  • trimal=1.4.1
  • diamond=0.8.24 (Note: DIAMOND in bioconda is newer than that in biocore)
  • blast-legacy=2.2.22
  • blast=2.2.31
  • mafft=7.305
  • orthofinder=1.1.2 (Note: OrthoFinder is only compatible with Python 2)
  • prodigal=2.6.2 (Note: the default version is 2.60, which is older)
  • fasttree=2.1.9 (Note: this build has double-precision support, which is not the default but is critical)

Available in biocore:

  • prodigal=2.6.2
  • blast-legacy=2.2.22
  • blast-plus=2.2.31
  • diamond=0.7.10

Availability from Ubuntu repositories

Ubuntu users can also install some applications from the Ubuntu repositories. For example, PhyML and Kalign, which are not available from conda, can be installed with:

sudo apt-get install phyml kalign

Please note that the installation takes effect system-wide.

Available from the default Ubuntu 16.04 LTS repository (universe):

  • fasttree (2.1.8)
  • kalign (2.03+20110620)
  • mafft (7.271)
  • mcl (14-137)
  • muscle (3.8.31)
  • mysql-server (5.7.17)
  • ncbi-blast+ (2.2.31)
  • phyml (3.2.0)
  • prodigal (2.6.2)
  • raxml (8.2.4)
  • t-coffee (11.00.8cbe486)
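Some repository versions are older than the tested versions in the table of required applications (for example, raxml 8.2.4 versus the tested 8.2.9). A rough way to compare two version strings is sketched below. `version_ge` is a hypothetical helper, and it assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Version numbers below are taken from the lists in this document.
if version_ge 8.2.4 8.2.9; then
    echo "raxml from the Ubuntu repository meets the tested version"
else
    echo "raxml from the Ubuntu repository (8.2.4) is older than the tested version (8.2.9)"
fi
```

Date-like or hash-suffixed versions (such as 3.2.20160530 or 11.00.8cbe486) may not order meaningfully this way, so treat the result as a hint only.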

Cross-dependencies

  • OrthoFinder requires MCL, FastTree, BLAST+, and MAFFT
  • Phylomizer requires BLAST+, KAlign, MAFFT, MUSCLE, T-Coffee, and PhyML
  • DarkHorse requires MySQL
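The cross-dependencies listed above can be kept in a small lookup so that an installation script can print them before installing a given application. This is only a sketch: `deps_for` is a hypothetical helper, and it returns the application names as written in this document, not executable or package names.

```shell
#!/bin/sh
# Hypothetical helper: print the applications a given tool depends on,
# exactly as listed under "Cross-dependencies".
deps_for() {
    case "$1" in
        OrthoFinder) echo "MCL FastTree BLAST+ MAFFT" ;;
        Phylomizer)  echo "BLAST+ KAlign MAFFT MUSCLE T-Coffee PhyML" ;;
        DarkHorse)   echo "MySQL" ;;
        *)           return 1 ;;  # no cross-dependencies recorded for this name
    esac
}

# Print the recorded dependencies for each application.
for app in OrthoFinder Phylomizer DarkHorse; do
    echo "$app requires: $(deps_for "$app")"
done
```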

Additional notes

The ETE team has released a conda package, ete3_external_apps, which bundles several popular phylogenetics applications (including PhyML and Kalign). It is not required for Horizomer, but if you wish to install it, run:

conda install -c etetoolkit ete3_external_apps

(WIP: A script will be created to automate the installation of most of these applications. However, a few of them cannot be automated because of license restrictions.)