diff --git a/README.md b/README.md
index 319eedf..5fa367b 100644
--- a/README.md
+++ b/README.md
@@ -31,11 +31,11 @@ A **gene call** is an optimal set of exons predicted based on similarity to a sp
 ## Running MetaEuk
 ### Main Modules:
 
+      easy-predict      Predict proteins from contigs (fasta/db) based on similarities to targets (fasta/db) and return a fasta
       predictexons      Call optimal exon sets based on protein similarity
       reduceredundancy  Cluster metaeuk calls which share an exon and select representative
       unitesetstofasta  Create a fasta output from optimal exon sets
       groupstoacc       Create a TSV output from representative to calls
-      easy-predict      Predict proteins from contigs (fasta/db) based on similarities to targets (fasta/db) and return a fasta
 
 ### Important parameters:
 
@@ -47,6 +47,13 @@ A **gene call** is an optimal set of exons predicted based on similarity to a sp
     --slice-search    if refernceDB is a profile database, should be added
 
 
+### easy-predict workflow:
+
+This workflow combines the following MetaEuk modules into a single step: predictexons, reduceredundancy and unitesetstofasta (each of which is detailed below). Its inputs are contigs (either as a Fasta file or a previously created database) and targets (either as a Fasta file of protein sequences or a previously created database of proteins or protein profiles). It will run the modules and output the predictions in Fasta format.
+
+    metaeuk easy-predict contigsFasta/contigsDB proteinsFasta/referenceDB predsResultProteins.fas tempFolder
+
+
 ### Calling optimal exons sets:
 
 This module will extract all putative protein fragments from each contig and strand, query them against the reference targets and use dynamic programming to retain for each **T** the optimal compatible exon set from each **C** & **S** (thus creating **TCS** calls).
@@ -102,12 +109,6 @@ can help mapping from each representative prediction after the redundancy reduct
     metaeuk groupstoacc contigsDB referenceDB predGroupsDB predGroups.tsv
 
 
-### easy-predict workflow:
-
-This workflow combines the following MetaEuk modules into a single step: predictexons, reduceredundancy and unitesetstofasta. Its input are contigs (either as a Fasta file or a previously created database) and targets (either as a Fasta file of protein sequences or a previously created database of proteins or protein profiles). It will run the modules and output the predictions in Fasta foramt.
-
-    metaeuk easy-predict contigsFasta/contigsDB proteinsFasta/referenceDB predsResultProteins.fas tempFolder
-
 ## Compile from source
 
 Compiling MetaEuk from source has the advantage that it will be optimized to the specific system, which should improve its performance. To compile MetaEuk `git`, `g++` (4.6 or higher) and `cmake` (3.0 or higher) are required. Afterwards, the MetaEuk binary will be located in the `build/bin` directory.