A two-phase approach for detecting recombination in nucleotide sequences

Chan, Cheong Xin, Beiko, Robert G. and Ragan, Mark A. (2007). A two-phase approach for detecting recombination in nucleotide sequences. In: Proceedings of the First Southern African Bioinformatics Workshop (SABioinf 2007). The First Southern African Bioinformatics Workshop, Johannesburg, South Africa, (9-16). 28-30 January 2007.

Author Chan, Cheong Xin
Beiko, Robert G.
Ragan, Mark A.
Title of paper A two-phase approach for detecting recombination in nucleotide sequences
Conference name The First Southern African Bioinformatics Workshop
Conference location Johannesburg, South Africa
Conference dates 28-30 January 2007
Proceedings title Proceedings of the First Southern African Bioinformatics Workshop (SABioinf 2007)
Place of Publication Johannesburg, South Africa
Publisher University of the Witwatersrand
Publication Year 2007
Sub-type Fully published paper
ISBN 978-0-620-38113-0
Start page 9
End page 16
Total pages 8
Language eng
Abstract/Summary Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on the evolutionary events that take place after recombination. We recently evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy that first applies three statistical measures to detect the occurrence of recombination events, and then applies a Bayesian phylogenetic approach to delineate the breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
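The control flow of the two-phase strategy described in the abstract can be illustrated with a minimal Python sketch. This record does not state which three statistical measures are used, how their results are combined, or which significance threshold applies, so everything named here (`two_phase_scan`, `ScanResult`, `alpha`, `min_significant`, and the callables passed in as `tests` and `find_breakpoints`) is a hypothetical placeholder rather than the authors' implementation. The sketch only shows the general idea: inexpensive detection tests screen every alignment, and the more costly breakpoint delineation runs only on alignments that the screen flags.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Mapping, Sequence

# Hypothetical type aliases used only for this illustration.
Alignment = Sequence[str]                            # aligned nucleotide sequences
DetectionTest = Callable[[Alignment], float]         # returns a p-value
BreakpointFinder = Callable[[Alignment], List[int]]  # returns breakpoint columns


@dataclass
class ScanResult:
    p_values: Dict[str, float]   # phase 1: p-value per detection test
    flagged: bool                # True if phase 1 suggests recombination
    breakpoints: List[int] = field(default_factory=list)  # phase 2 output


def two_phase_scan(
    alignments: Mapping[str, Alignment],
    tests: Mapping[str, DetectionTest],
    find_breakpoints: BreakpointFinder,
    alpha: float = 0.05,        # assumed threshold, not taken from the paper
    min_significant: int = 1,   # assumed combining rule, not taken from the paper
) -> Dict[str, ScanResult]:
    """Screen each alignment with fast tests, then delineate breakpoints
    only for the alignments that pass the screen."""
    results: Dict[str, ScanResult] = {}
    for name, alignment in alignments.items():
        # Phase 1: run every detection test on the alignment.
        p_values = {label: test(alignment) for label, test in tests.items()}
        n_significant = sum(1 for p in p_values.values() if p < alpha)
        flagged = n_significant >= min_significant

        # Phase 2: run the expensive breakpoint search only when flagged.
        breakpoints = find_breakpoints(alignment) if flagged else []
        results[name] = ScanResult(p_values, flagged, breakpoints)
    return results
```

In use, `tests` would map three labels to functions wrapping the chosen detection statistics, and `find_breakpoints` would wrap the phylogenetic breakpoint analysis. Skipping phase 2 for unflagged alignments is what makes a design like this attractive for large datasets.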
Subjects 239901 Biological Mathematics
08 Information and Computing Sciences
0803 Computer Software
Keyword Comparative genomics
Recombination detection
Sequence analysis
Evolution and phylogenetics
Q-Index Code EX
Additional Notes Also published as an article in South African Computer Journal, ISSN: 1015-7999, Issue 38, pp. 20-27, June 2007

Created: Fri, 16 Feb 2007, 12:17:28 EST by Cheong Xin Chan on behalf of Faculty Of Engineering, Architecture & Info Tech