This repository implements the Bayesian test of introgression developed by Ji et al. (2023). The program calculates the Bayes factor via the Savage-Dickey density ratio, using an MCMC sample obtained under the MSci or MSC-M model, to evaluate the evidence for a proposed gene flow event. Note that the MSci model treats gene flow as a discrete event that occurred at a fixed time, with its magnitude measured by the introgression probability $\varphi$, whereas the MSC-M model treats gene flow as continuous migration at rate $M$.
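The Bayes factor representing the evidence for gene flow ($H_1$) against no gene flow ($H_0$) is approximated by

$$B_{10} \approx \frac{P(\varphi < \epsilon)}{P(\varphi < \epsilon \mid X)},$$

where $\varphi < \epsilon$ is the null region of negligible gene flow, $P(\varphi < \epsilon)$ is its prior probability, and $P(\varphi < \epsilon \mid X)$ is its posterior probability given the data $X$, estimated from the MCMC sample. Under the MSC-M model the same ratio is formed with $M$ in place of $\varphi$.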
The program is currently distributed as an R script. To install and run it, do the following:
- Clone the repository or download the source files.
- Navigate to the directory containing the script.
- Make sure you have R installed on your machine. If not, download and install R from https://www.r-project.org/.
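The stats package (>= 4.2.2) is required. Because stats is a base package that ships with R itself, it cannot be installed separately with install.packages(); installing R 4.2.2 or later provides it. The installed version can be checked from an R session with the following command:
packageVersion("stats")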
The program accepts as input a BPP MCMC sample file.
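As a rough illustration of what such an input looks like once loaded, the sketch below writes a tiny stand-in file and reads it back. It assumes the sample file is a tab-delimited table with one header row and one row per retained MCMC iteration; the column names `phi_a` and `phi_b` and all values are invented for the example and are not taken from the repository.

```r
# Write a tiny stand-in for an MCMC sample file (names and values are invented).
tmp <- tempfile(fileext = ".txt")
writeLines(c("Gen\tphi_a\tphi_b",
             "2\t0.012\t0.31",
             "4\t0.004\t0.28",
             "6\t0.020\t0.35"), tmp)

# Assumption: tab-delimited text with a header row.
mcmc <- read.table(tmp, header = TRUE, sep = "\t")

# Columns are later addressed by position, e.g. column 2 here.
phi_a <- mcmc[[2]]
print(phi_a)
```

Addressing columns by position rather than by name mirrors the column_indices argument of the script.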
From the command line, the script can be run as follows:
Rscript BF-script.R [function] [alpha] [beta] [epsilon] [file] [column_indices]
- function: Either BF_Gamma or BF_Beta.
- alpha: Numeric value for the alpha parameter of the prior distribution on M or varphi.
- beta: Numeric value for the beta parameter of the prior distribution on M or varphi.
- epsilon: Numeric value for the epsilon parameter.
- file: Path to the MCMC sample file.
- column_indices: Column indices, separated by spaces. Ranges can be specified using ':'.
If the BF_Gamma option is selected then 'alpha' is the shape and 'beta' is the rate of the prior distribution.
If the BF_Beta option is selected then both 'alpha' and 'beta' are shape parameters of the prior distribution.
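To make the role of alpha, beta, and epsilon concrete, here is a minimal sketch of a Savage-Dickey style calculation for a single parameter. It is not the code of BF-script.R: the function name `savage_dickey_bf` is hypothetical, and it assumes the Bayes factor is the prior probability of the null region (parameter below epsilon) divided by the posterior probability of that region, with the latter estimated as the fraction of MCMC draws below epsilon.

```r
# Hypothetical helper, not part of BF-script.R.
savage_dickey_bf <- function(draws, alpha, beta, epsilon,
                             prior = c("gamma", "beta")) {
  prior <- match.arg(prior)
  # Prior probability that the parameter lies below epsilon.
  prior_null <- if (prior == "gamma") {
    pgamma(epsilon, shape = alpha, rate = beta)    # alpha = shape, beta = rate
  } else {
    pbeta(epsilon, shape1 = alpha, shape2 = beta)  # both are shape parameters
  }
  # Posterior probability of the same region, estimated from the draws.
  # If no draw falls below epsilon this is 0 and the ratio is Inf.
  post_null <- mean(draws < epsilon)
  prior_null / post_null
}

# Beta(1, 1) prior: prior mass below 0.01 is 0.01; 1 of 4 draws is below 0.01.
bf_beta <- savage_dickey_bf(c(0.005, 0.5, 0.5, 0.5), 1, 1, 0.01, prior = "beta")
print(bf_beta)  # 0.04

# Gamma(2, 10) prior with draws simulated from the prior itself,
# so the ratio should be close to 1.
set.seed(1)
draws <- rgamma(10000, shape = 2, rate = 10)
bf_gamma <- savage_dickey_bf(draws, 2, 10, 0.01, prior = "gamma")
print(bf_gamma)
```

With a finite MCMC sample the posterior fraction can be exactly zero, so a real implementation would need to decide how to report that case.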
For example:
Rscript BF-script.R BF_Gamma 2 10 0.01 /path/to/sample-mcmc.txt 25:30
would use the BF_Gamma function with alpha = 2 (shape), beta = 10 (rate), epsilon = 0.01, an MCMC sample file located at /path/to/sample-mcmc.txt, and columns 25 through 30. The MCMC file used in this example, along with the output file, can be found in the 'test' folder.
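The column specification can be expanded into explicit indices along the following lines. This is only a sketch of one way to do it; `parse_columns` is a hypothetical helper, and the script's own parsing may differ.

```r
# Hypothetical helper: expand arguments such as c("3", "25:30") into integers.
parse_columns <- function(args) {
  unlist(lapply(args, function(a) {
    parts <- as.integer(strsplit(a, ":", fixed = TRUE)[[1]])
    # "a:b" becomes the full range a..b; a single number is kept as is.
    if (length(parts) == 2) seq(parts[1], parts[2]) else parts
  }))
}

print(parse_columns("25:30"))               # 25 26 27 28 29 30
print(parse_columns(c("3", "7", "10:12")))  # 3 7 10 11 12
```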
The output of the program is a text file containing the calculated Bayes factor for each proposed gene flow event. With a threshold of 100, a Bayes factor of 100 or more indicates strong support for the proposed gene flow event.
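A reported Bayes factor could be compared against the threshold as in the sketch below. The function `interpret_bf` is hypothetical, and treating the reciprocal of the threshold (0.01 for a threshold of 100) as strong support for the absence of gene flow is an assumption made here for symmetry.

```r
# Hypothetical helper: label Bayes factors relative to a chosen threshold.
interpret_bf <- function(bf, threshold = 100) {
  ifelse(bf >= threshold, "strong support for gene flow",
         ifelse(bf <= 1 / threshold, "strong support for no gene flow",
                "inconclusive at this threshold"))
}

labels <- interpret_bf(c(250, 1.3, 0.002))
print(labels)
```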
The usage instructions can be viewed by typing Rscript BF-script.R --help.
Flouri, T., Jiao, X., Rannala, B., and Yang, Z. 2018. Species tree inference with BPP using genomic sequences and the multispecies coalescent. Mol. Biol. Evol., 35(10): 2585–2593.
Flouri, T., Jiao, X., Rannala, B., and Yang, Z. 2020. A Bayesian implementation of the multispecies coalescent model with introgression for phylogenomic analysis. Mol. Biol. Evol., 37(4): 1211–1223.
Ji, J., Jackson, D. J., Leaché, A. D., and Yang, Z. 2023. Power of Bayesian and heuristic tests to detect cross-species introgression with reference to gene flow in the Tamias quadrivittatus group of North American chipmunks. Syst. Biol., doi:10.1093/sysbio/syac077.