A Python implementation of the OrthoANI algorithm for nucleotide identity measurement.

althonos/pyorthoani

PyOrthoANI

🗺️ Overview

OrthoANI is a metric proposed by Lee et al.[1] in 2016 to improve the computation of Average Nucleotide Identity (ANI). It uses BLASTn to find orthologous blocks in a pair of sequences, and then computes the average identity considering only the alignments of reciprocal orthologs.

Algorithm
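To make the "reciprocal orthologs" idea concrete, here is a toy sketch in plain Python. It is not the PyOrthoANI implementation: the block names, the dictionary layout, and the choice to average the two directions of each reciprocal pair are assumptions made purely for illustration.

```python
from typing import Dict, Tuple

# (name of the best-matching block, percent identity of that alignment)
Hit = Tuple[str, float]


def reciprocal_average_identity(
    forward: Dict[str, Hit], backward: Dict[str, Hit]
) -> float:
    """Average identity over block pairs that are each other's best hit.

    `forward` maps each query block to its best hit in the reference;
    `backward` maps each reference block to its best hit in the query.
    Both layouts are hypothetical, chosen only for this sketch.
    """
    identities = []
    for query_block, (ref_block, identity_qr) in forward.items():
        back = backward.get(ref_block)
        # Keep the pair only if the reference block points back to the query block.
        if back is not None and back[0] == query_block:
            identities.append((identity_qr + back[1]) / 2)
    if not identities:
        return 0.0
    return sum(identities) / len(identities)


# Made-up hits: q1 <-> r1 is reciprocal, q2 -> r2 is not (r2 points to q3).
forward = {"q1": ("r1", 90.0), "q2": ("r2", 80.0)}
backward = {"r1": ("q1", 92.0), "r2": ("q3", 85.0)}
print(reciprocal_average_identity(forward, backward))  # 91.0
```

Only the `q1`/`r1` pair contributes here, because the hit from `q2` is not reciprocated.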

PyOrthoANI is a reimplementation of the closed-source Java implementation provided by the authors on ezbiocloud.net. It relies on Biopython to handle the I/O, and calls the BLAST+ binaries using the subprocess module of the Python standard library.
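As a rough illustration of what calling a BLAST+ binary through subprocess can look like, the sketch below builds and runs a `blastn` command. The helper names are hypothetical, and the exact arguments PyOrthoANI passes are not documented here; `-query`, `-subject` and `-outfmt 6` (tabular output) are standard `blastn` options used only as an example.

```python
import subprocess
from typing import List


def blastn_command(query: str, subject: str) -> List[str]:
    """Build a blastn command line producing tabular output (hypothetical helper)."""
    return ["blastn", "-query", query, "-subject", subject, "-outfmt", "6"]


def run_blastn(query: str, subject: str) -> str:
    """Run blastn and return its standard output as text.

    Raises subprocess.CalledProcessError if blastn exits with an error,
    and FileNotFoundError if the binary is not in the PATH.
    """
    result = subprocess.run(
        blastn_command(query, subject),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with file names that contain spaces.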

🔧 Installing

Installing with pip is the easiest:

$ pip install pyorthoani

PyOrthoANI also requires the BLAST+ binaries to be installed on your machine and available somewhere in your $PATH.
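A quick way to check this before running anything is to look the binaries up with `shutil.which`. This is a minimal sketch: the helper is hypothetical, and the default names `blastn` and `makeblastdb` are an assumption about which BLAST+ programs are needed.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str] = ("blastn", "makeblastdb")) -> List[str]:
    """Return the names that cannot be found in the current PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found in PATH:", ", ".join(missing))
```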

💡 Example

Use Biopython to load two FASTA files, and then pyorthoani.orthoani to compute the OrthoANI metric between them:

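import pyorthoani
from Bio.SeqIO import read

# Bio.SeqIO.read expects exactly one record per file
genome_1 = read("sequence1.fa", "fasta")
genome_2 = read("sequence2.fa", "fasta")

# compute the OrthoANI metric between the two genomes
ani = pyorthoani.orthoani(genome_1, genome_2)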
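When more than two genomes are involved, the same call can be applied to every pair. The helper below is a hypothetical sketch, not part of the library: it takes the metric as a parameter (for instance `pyorthoani.orthoani`) and assumes the metric is symmetric, so each unordered pair is computed once.

```python
from itertools import combinations
from typing import Callable, Dict, Tuple


def pairwise_scores(
    genomes: Dict[str, object],
    metric: Callable[[object, object], float],
) -> Dict[Tuple[str, str], float]:
    """Apply `metric` to every unordered pair of named genomes."""
    return {
        (a, b): metric(genomes[a], genomes[b])
        for a, b in combinations(sorted(genomes), 2)
    }
```

With records loaded as in the example above, this could be called as `pairwise_scores({"g1": genome_1, "g2": genome_2}, pyorthoani.orthoani)`.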

pyorthoani also provides a simple command-line interface mimicking the original Java tool:

$ pyorthoani -q sequence1.fa -r sequence2.fa
57.25

🐏 Memory

PyOrthoANI uses the machine's temporary folder to store BLAST+ input and output files; its location is configurable through tempfile.tempdir. On some systems (like Arch Linux), this folder can reside on an in-memory filesystem, which means that your computer could have trouble processing very large files. If this happens, try setting tempfile.tempdir to a directory that is actually located on physical storage.
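One way to do this is a small context manager that points `tempfile.tempdir` at a chosen directory and restores the previous value afterwards. This is a sketch: the helper name and the example path are hypothetical, while `tempfile.tempdir` and `tempfile.gettempdir()` belong to the Python standard library.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_tempdir(path):
    """Temporarily make `path` the default directory for temporary files."""
    os.makedirs(path, exist_ok=True)
    previous = tempfile.tempdir
    tempfile.tempdir = path
    try:
        yield path
    finally:
        # Restore whatever was configured before, including None.
        tempfile.tempdir = previous
```

Usage would look like `with temporary_tempdir("/data/scratch"): ...`, with the computation placed inside the `with` block so that temporary files are created under the chosen directory.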

📏 Precision

Values computed by this package and the original Java implementation may differ slightly because in Java the authors perform rounding of floating-point values at the sub-percent level, while this library uses the full values.
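When comparing results from the two implementations, an exact equality check is therefore too strict. A minimal sketch of a tolerant comparison follows; the helper name, the default tolerance of 0.01, and the sample values are all made up for illustration and are not measured outputs of either tool.

```python
import math


def same_within(a: float, b: float, tolerance: float = 0.01) -> bool:
    """Return True if two values differ by at most `tolerance` (absolute)."""
    return math.isclose(a, b, rel_tol=0.0, abs_tol=tolerance)


# Hypothetical values: one unrounded, one rounded to two decimals.
print(same_within(57.2481, 57.25))  # True
```

The tolerance should be chosen to match the precision at which the values being compared were rounded.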

🔖 Citation

PyOrthoANI is scientific software; it has been submitted for publication and is currently available as a pre-print on bioRxiv. Please cite both PyOrthoANI and OrthoANI if you use it in academic work, for instance as:

PyOrthoANI (Larralde et al., 2024), a Python implementation of OrthoANI (Lee et al., 2016).

📜 About

This library is provided under the open-source MIT license.

This project is in no way affiliated with, sponsored, or otherwise endorsed by the original OrthoANI authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

📚 References

  • [1] Imchang Lee, Yeong Ouk Kim, Sang-Cheol Park and Jongsik Chun. OrthoANI: An improved algorithm and software for calculating average nucleotide identity (2016). International Journal of Systematic and Evolutionary Microbiology. doi:10.1099/ijsem.0.000760. PMID:26585518.
