SVM-based prediction of propeptide cleavage sites in spider toxins identifies toxin innovation in an Australian tarantula

Wong, Emily S. W., Hardy, Margaret C., Wood, David, Bailey, Timothy and King, Glenn F. (2013) SVM-based prediction of propeptide cleavage sites in spider toxins identifies toxin innovation in an Australian tarantula. PLoS ONE, 8(7): e66279.1-e66279.11. doi:10.1371/journal.pone.0066279


Author Wong, Emily S. W.
Hardy, Margaret C.
Wood, David
Bailey, Timothy
King, Glenn F.
Title SVM-based prediction of propeptide cleavage sites in spider toxins identifies toxin innovation in an Australian tarantula
Journal name PLoS ONE
ISSN 1932-6203
Publication date 2013-07-22
Sub-type Article (original research)
DOI 10.1371/journal.pone.0066279
Open Access Status DOI
Volume 8
Issue 7
Start page e66279.1
End page e66279.11
Total pages 11
Place of publication San Francisco, CA, United States
Publisher Public Library of Science
Collection year 2014
Language eng
Formatted abstract
Spider neurotoxins are commonly used as pharmacological tools and are a popular source of novel compounds with therapeutic and agrochemical potential. Since venom peptides are inherently toxic, the host spider must employ strategies to avoid adverse effects prior to venom use. It is partly for this reason that most spider toxins encode a protective proregion that upon enzymatic cleavage is excised from the mature peptide. In order to identify the mature toxin sequence directly from toxin transcripts, without resorting to protein sequencing, the propeptide cleavage site in the toxin precursor must be predicted bioinformatically. We evaluated different machine learning strategies (support vector machines, hidden Markov model and decision tree) and developed an algorithm (SpiderP) for prediction of propeptide cleavage sites in spider toxins. Our strategy uses a support vector machine (SVM) framework that combines both local and global sequence information. Our method is superior or comparable to current tools for prediction of propeptide sequences in spider toxins. Evaluation of the SVM method on an independent test set of known toxin sequences yielded 96% sensitivity and 100% specificity. Furthermore, we sequenced five novel peptides (not used to train the final predictor) from the venom of the Australian tarantula Selenotypus plumipes to test the accuracy of the predictor and found 80% sensitivity and 99.6% 8-mer specificity. Finally, we used the predictor together with homology information to predict and characterize seven groups of novel toxins from the deeply sequenced venom gland transcriptome of S. plumipes, which revealed structural complexity and innovations in the evolution of the toxins. The precursor prediction tool (SpiderP) is freely available on ArachnoServer (http://www.arachnoserver.org/spiderP.html), a web portal to a comprehensive relational database of spider toxins.
All training data, test data, and scripts used are available from the SpiderP website.
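The "8-mer specificity" reported in the abstract can be illustrated with a minimal sketch, assuming the predictor scores every 8-residue window of a precursor and that sensitivity and specificity are then counted over those windows. The function names, the window indexing, and the toy sequence below are hypothetical and are not taken from SpiderP itself; the classifier is left out entirely, and only the window enumeration and the metric arithmetic are shown.

```python
from typing import List, Set, Tuple

WINDOW = 8  # assumed window length, matching the "8-mer" wording in the abstract


def windows(seq: str, k: int = WINDOW) -> List[Tuple[int, str]]:
    """Return (start index, k-mer) for every k-residue window in seq."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


def sensitivity_specificity(
    n_windows: int, true_sites: Set[int], predicted_sites: Set[int]
) -> Tuple[float, float]:
    """Compute sensitivity and specificity over window start positions.

    true_sites and predicted_sites hold the start indices of windows that
    are labelled / predicted as containing a cleavage site.
    """
    tp = len(true_sites & predicted_sites)   # cleavage windows found
    fn = len(true_sites - predicted_sites)   # cleavage windows missed
    fp = len(predicted_sites - true_sites)   # non-cleavage windows flagged
    tn = n_windows - tp - fn - fp            # non-cleavage windows left alone
    return tp / (tp + fn), tn / (tn + fp)


# Toy input: an invented 20-residue string, not a real toxin precursor.
toy_seq = "ACDEFGHIKLMNPQRSTVWY"
toy_windows = windows(toy_seq)
sens, spec = sensitivity_specificity(
    len(toy_windows), true_sites={5}, predicted_sites={5, 9}
)
print(f"{len(toy_windows)} windows, sensitivity={sens:.2f}, specificity={spec:.3f}")
```

Because nearly all windows in a precursor are non-cleavage windows, a window-level specificity close to 100% can coexist with a lower per-site sensitivity, which is consistent with the two figures being reported separately.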
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2014 Collection
Institute for Molecular Bioscience - Publications
 
Citation counts: Thomson Reuters Web of Science: cited 5 times
Scopus: cited 10 times
Created: Tue, 17 Sep 2013, 14:46:56 EST by Susan Allen on behalf of Institute for Molecular Bioscience