A Word-Oriented Approach to Alignment Validation

Beiko, Robert G., Chan, Cheong Xin and Ragan, Mark A. (2005) A Word-Oriented Approach to Alignment Validation. Bioinformatics, 21(10): 2230-2239. doi:10.1093/bioinformatics/bti335


Author Beiko, Robert G.
Chan, Cheong Xin
Ragan, Mark A.
Title A Word-Oriented Approach to Alignment Validation
Journal name Bioinformatics
ISSN 1367-4803
Publication date 2005-02-01
Sub-type Article (original research)
DOI 10.1093/bioinformatics/bti335
Open Access Status File (Author Post-print)
Volume 21
Issue 10
Start page 2230
End page 2239
Total pages 10
Place of publication Oxford
Publisher Oxford University Press
Language eng
Subject 239901 Biological Mathematics
279999 Biological Sciences not elsewhere classified
270208 Molecular Evolution
780105 Biological sciences
270199 Biochemistry and Cell Biology not elsewhere classified
C1
Abstract Motivation: Multiple sequence alignment at the level of whole proteomes requires a high degree of automation, precluding the use of traditional validation methods such as manual curation. Since evolutionary models are too general to describe the history of each residue in a protein family, there is no single algorithm/model combination that can yield a biologically or evolutionarily optimal alignment. We propose a 'shotgun' strategy where many different algorithms are used to align the same family, and the best of these alignments is then chosen with a reliable objective function. We present WOOF, a novel 'word-oriented' objective function that relies on the identification and scoring of conserved amino acid patterns (words) between pairs of sequences.
Results: Tests on a subset of reference protein alignments from BAliBASE showed that WOOF tended to rank the (manually curated) reference alignment highest among 1060 alternative (automatically generated) alignments for a majority of protein families. Among the automated alignments, there was a strong positive relationship between the WOOF score and similarity to the reference alignment. The speed of WOOF and its independence from explicit considerations of three-dimensional structure make it an excellent tool for analyzing large numbers of protein families.
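The 'shotgun' idea in the abstract (align one family with several methods, then keep the alignment that an objective function ranks highest) can be illustrated with a minimal sketch. This record does not describe how WOOF actually identifies or scores words, so the scoring below is a deliberately simple stand-in, not the authors' method: it assumes a "word" is a gap-free run of k identical residues occupying the same columns in two aligned sequences. The names shared_words, toy_word_score, pick_best, the aligner labels, and the example sequences are all hypothetical.

```python
from itertools import combinations

GAP = "-"  # assumed gap character


def shared_words(row_a, row_b, k=3):
    """Count length-k column windows where both aligned rows carry the same gap-free word."""
    count = 0
    for i in range(len(row_a) - k + 1):
        word_a = row_a[i:i + k]
        word_b = row_b[i:i + k]
        # Equality plus a gap-free word_a implies word_b is gap-free too.
        if GAP not in word_a and word_a == word_b:
            count += 1
    return count


def toy_word_score(alignment, k=3):
    """Sum shared-word counts over all pairs of rows (toy stand-in, NOT WOOF's scoring)."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have equal length")
    return sum(shared_words(a, b, k) for a, b in combinations(alignment, 2))


def pick_best(candidates, k=3):
    """Return the (name, alignment) pair with the highest toy score."""
    return max(candidates.items(), key=lambda item: toy_word_score(item[1], k))


# Two hypothetical alignments of the same three made-up sequences.
candidates = {
    "aligner_A": ["MKVLA-GT", "MKVLAWGT", "MKV-AWGT"],
    "aligner_B": ["MKVLAGT-", "MKVLAWGT", "MKVAWGT-"],
}
best_name, best_alignment = pick_best(candidates)
```

In this toy case the first candidate places its gaps so that more three-residue words line up between pairs of sequences, so it is the one selected. A realistic objective function would also need to weight words by how informative they are and handle near-matches, which this sketch ignores.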
Keyword multiple sequence alignment
objective function
sequence analysis
Computer Science, Interdisciplinary Applications
Biotechnology & Applied Microbiology
Biochemical Research Methods
Mathematics, Interdisciplinary Applications
Q-Index Code C1
Additional Notes This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version of Robert G. Beiko, Cheong Xin Chan and Mark A. Ragan, A word-oriented approach to alignment validation, Bioinformatics 2005 21(10): 2230-2239; doi:10.1093/bioinformatics/bti335 is available online at: http://dx.doi.org/doi:10.1093/bioinformatics/bti335. Copyright 2005 Oxford Journals. All rights reserved.

Citation counts Thomson Reuters Web of Science: cited 12 times
Scopus: cited 10 times
Created: Mon, 04 Sep 2006, 10:00:00 EST by Cheong Xin Chan on behalf of Institute for Molecular Bioscience