Prediction of protein solvent accessibility using support vector machines

Yuan, Zheng, Burrage, Kevin and Mattick, John S. (2002) Prediction of protein solvent accessibility using support vector machines. Proteins: Structure, Function, and Genetics, 48 3: 566-570. doi:10.1002/prot.10176

Author Yuan, Zheng
Burrage, Kevin
Mattick, John S.
Title Prediction of protein solvent accessibility using support vector machines
Journal name Proteins: Structure, Function, and Genetics
ISSN 0887-3585
Publication date 2002-08-15
Sub-type Article (original research)
DOI 10.1002/prot.10176
Open Access Status Not Open Access
Volume 48
Issue 3
Start page 566
End page 570
Total pages 5
Editor E.E. Lattman
Place of publication New York, N.Y., U. S. A.
Publisher Wiley-Liss
Language eng
Subject C1
230116 Numerical Analysis
780101 Mathematical sciences
0304 Medicinal and Biomolecular Chemistry
Formatted abstract
A Support Vector Machine learning system has been trained to predict protein solvent accessibility from the primary structure. Different kernel functions and sliding window sizes have been explored to find how they affect the prediction performance. Using a cut-off threshold of 15% that splits the dataset evenly (an equal number of exposed and buried residues), this method was able to achieve a prediction accuracy of 70.1% for single sequence input and 73.9% for multiple alignment sequence input. The prediction of three and more states of solvent accessibility was also studied and compared with other methods. The prediction accuracies are better than, or comparable to, those obtained by other methods such as neural networks, Bayesian classification, multiple linear regression, and information theory. In addition, our results further suggest that this system may be combined with other prediction methods to achieve more reliable results, and that the Support Vector Machine method is a very useful tool for biological sequence analysis.
© 2002 Wiley-Liss, Inc.
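Illustrative note (not part of the published record): the abstract mentions sliding windows over the primary structure and a 15% cut-off separating exposed from buried residues. The Python sketch below shows one common way such inputs and labels could be prepared before training a classifier. The one-hot encoding, the padding symbol, the default window of 15, the treatment of values exactly at the cut-off, and the function names `encode_windows` and `label_exposure` are all assumptions for illustration; the record does not state the authors' actual encoding or parameters.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical padding symbol for window positions past the chain ends


def encode_windows(sequence: str, window: int = 15) -> List[List[int]]:
    """One-hot encode a sliding window centred on each residue.

    Each window position becomes a 21-dimensional indicator vector
    (20 residues plus padding), so every residue yields window * 21 features.
    Non-standard letters (e.g. 'X') are not handled and raise ValueError.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it has a central residue")
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    alphabet = AMINO_ACIDS + PAD
    features = []
    for i in range(len(sequence)):
        vec: List[int] = []
        for ch in padded[i:i + window]:
            one_hot = [0] * len(alphabet)
            one_hot[alphabet.index(ch)] = 1
            vec.extend(one_hot)
        features.append(vec)
    return features


def label_exposure(relative_accessibility: List[float],
                   cutoff: float = 15.0) -> List[int]:
    """Two-state labels: 1 = exposed (above the cutoff, in percent), 0 = buried.

    Assigning values exactly at the cutoff to "buried" is an assumption.
    """
    return [1 if ra > cutoff else 0 for ra in relative_accessibility]
```

The resulting feature vectors and labels could then be passed to any support vector machine implementation with the kernel of choice; that training step is deliberately left out here because the record gives no details about it.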
Keyword Biochemistry & Molecular Biology
Protein structure prediction
Machine learning
Solvent accessibility
Computer simulation
Secondary structure prediction
Fold recognition
Q-Index Code C1

Document type: Journal Article
Sub-type: Article (original research)
Collections: Excellence in Research Australia (ERA) - Collection
Institute for Molecular Bioscience - Publications
Citation counts: Cited 83 times in Thomson Reuters Web of Science
Cited 95 times in Scopus
Created: Wed, 15 Aug 2007, 04:35:11 EST