Prediction of protein solvent accessibility using PSO-SVR with multiple sequence-derived features and weighted sliding window scheme

Zhang, Jian, Chen, Wenhan, Sun, Pingping, Zhao, Xiaowei and Ma, Zhiqiang (2015) Prediction of protein solvent accessibility using PSO-SVR with multiple sequence-derived features and weighted sliding window scheme. BioData Mining, 8 3.1: 3.16. doi:10.1186/s13040-014-0031-3


Author Zhang, Jian
Chen, Wenhan
Sun, Pingping
Zhao, Xiaowei
Ma, Zhiqiang
Title Prediction of protein solvent accessibility using PSO-SVR with multiple sequence-derived features and weighted sliding window scheme
Journal name BioData Mining
ISSN 1756-0381
Publication date 2015-01-31
Sub-type Article (original research)
DOI 10.1186/s13040-014-0031-3
Open Access Status DOI
Volume 8
Issue 3.1
Start page 3.16
Total pages 16
Place of publication London, United Kingdom
Publisher BioMed Central
Collection year 2016
Language eng
Formatted abstract
Background: Predicting solvent accessibility provides valuable clues for analyzing protein structure and function, for example in three-dimensional structure prediction and B-cell epitope prediction. To fully decipher the protein-protein interaction process, an initial but crucial step is to calculate the protein solvent accessibility, especially when the tertiary structure of the protein is unknown. Although some effort has been put into protein solvent accessibility prediction, the performance of existing methods is far from satisfactory.

Methods: To develop a high-accuracy model, we focus on several aspects that may affect prediction performance: the sequence-derived features, a weighted sliding window scheme, and parameter optimization of the machine learning approach. We address these issues with the following strategies. Firstly, we explore various features that have been observed to be associated with residue solvent accessibility. These discriminative features include protein evolutionary information, predicted protein secondary structure, native disorder, physicochemical propensities and several sequence-based structural descriptors of residues. Secondly, because adjacent residues in a sliding window are observed to contribute differently, a weighted sliding window scheme is proposed to differentiate the contributions of adjacent residues to the central residue. Thirdly, particle swarm optimization (PSO) is employed to search for the globally best parameters of the proposed predictor.
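The weighted sliding window idea described in the Methods paragraph can be illustrated with a minimal sketch. The abstract does not give the actual weighting function, window size, or feature layout, so the linear-decay weights, the zero padding at the sequence termini, and the function names `weighted_window` and `linear_decay_weights` are all assumptions made for illustration only.

```python
from typing import List, Sequence


def linear_decay_weights(half_width: int) -> List[float]:
    """Hypothetical weights: 1.0 at the central residue, decreasing
    linearly with distance. The paper's real weights may differ."""
    return [1.0 - abs(k) / (half_width + 1)
            for k in range(-half_width, half_width + 1)]


def weighted_window(features: Sequence[Sequence[float]],
                    center: int,
                    half_width: int,
                    weights: Sequence[float]) -> List[float]:
    """Build the input vector for residue `center` by concatenating the
    feature vectors of all residues in the window, each scaled by the
    weight for its offset. Positions beyond the termini are zero-padded
    (an assumption, not something the abstract specifies)."""
    if len(weights) != 2 * half_width + 1:
        raise ValueError("need one weight per window position")
    dim = len(features[0])
    vector: List[float] = []
    for offset, weight in zip(range(-half_width, half_width + 1), weights):
        pos = center + offset
        if 0 <= pos < len(features):
            vector.extend(weight * value for value in features[pos])
        else:
            vector.extend([0.0] * dim)
    return vector


if __name__ == "__main__":
    # Three residues with a single toy feature each.
    toy_features = [[1.0], [2.0], [3.0]]
    w = linear_decay_weights(1)
    print(w)
    print(weighted_window(toy_features, 0, 1, w))
```

With a half-width of 1, the assumed weights are 0.5, 1.0 and 0.5, so the neighbors of the central residue contribute half as much as the central residue itself.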

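The parameter search by particle swarm optimization can be sketched in a similar way. This is a generic textbook PSO, not the authors' implementation: the swarm size, inertia and acceleration coefficients are assumed values, and `toy_cv_error` is a hypothetical stand-in for the cross-validated error of a support vector regressor as a function of its (log-scaled) hyperparameters.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(objective: Callable[[Sequence[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_particles: int = 20,
                 n_iters: int = 100,
                 inertia: float = 0.7,
                 c_personal: float = 1.5,
                 c_global: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Minimize `objective` inside box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_cost = [objective(p) for p in pos]   # personal best costs
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            cost = objective(pos[i])
            if cost < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], cost
                if cost < g_cost:
                    g_pos, g_cost = pos[i][:], cost
    return g_pos, g_cost


def toy_cv_error(params: Sequence[float]) -> float:
    """Hypothetical error surface with its minimum at (1.0, -2.0)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


if __name__ == "__main__":
    best, cost = pso_minimize(toy_cv_error, [(-5.0, 5.0), (-5.0, 5.0)])
    print(best, cost)
```

In a real setting the objective would train and cross-validate the regressor for each candidate parameter set, which is far more expensive than this toy function, so the swarm size and iteration count would need to be chosen with that cost in mind.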
Results: Evaluated by 3-fold cross-validation, our method achieves a mean absolute error (MAE) of 14.1% and a Pearson correlation coefficient (PCC) of 0.75 on our newly compiled dataset. Compared with state-of-the-art prediction models on two benchmark datasets, our method demonstrates better performance. Experimental results demonstrate that our predictor, PSAP, achieves high performance and outperforms many existing predictors. The PSAP web server is freely available at http://59.73.198.144:8088/SolventAccessibility/.
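The two evaluation measures reported above follow their standard definitions and can be computed as in the sketch below. The function names and the toy values are illustrative assumptions; the numbers are not taken from the paper.

```python
import math
from typing import Sequence


def mean_absolute_error(observed: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy relative solvent accessibility values in percent (made up).
    observed = [10.0, 20.0, 30.0]
    predicted = [12.0, 18.0, 33.0]
    print(mean_absolute_error(observed, predicted))
    print(pearson_correlation(observed, predicted))
```

When accessibility is expressed as a percentage, the MAE is in percentage points, which is how a figure such as 14.1% should be read.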
Keyword Particle swarm optimization
Protein sequence
Solvent accessibility
Support vector regression
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2016 Collection
School of Chemistry and Molecular Biosciences
 
Citation counts: Cited 2 times in Thomson Reuters Web of Science
Cited 2 times in Scopus
Created: Tue, 03 Nov 2015, 02:07:31 EST by System User on behalf of Scholarly Communication and Digitisation Service