Semi-supervised learning for cyberbullying detection in social networks

Nahar, Vinita, Al-Maskari, Sanad, Li, Xue and Pang, Chaoyi (2014). Semi-supervised learning for cyberbullying detection in social networks. In: Hua Wang and Mohamed A. Sharaf, Databases Theory and Applications - 25th Australasian Database Conference, ADC 2014, Proceedings. 25th Australasian Database Conference, ADC 2014, Brisbane, QLD, (160-171). 14 - 16 July 2014. doi:10.1007/978-3-319-08608-8_14


Author Nahar, Vinita
Al-Maskari, Sanad
Li, Xue
Pang, Chaoyi
Title of paper Semi-supervised learning for cyberbullying detection in social networks
Conference name 25th Australasian Database Conference, ADC 2014
Conference location Brisbane, QLD
Conference dates 14 - 16 July 2014
Proceedings title Databases Theory and Applications - 25th Australasian Database Conference, ADC 2014, Proceedings
Journal name Lecture Notes in Computer Science
Series Lecture Notes in Computer Science
Place of Publication Heidelberg, Germany
Publisher Springer
Publication Year 2014
Year available 2014
Sub-type Fully published paper
DOI 10.1007/978-3-319-08608-8_14
ISBN 9783319086071
9783319086088
ISSN 0302-9743
1611-3349
Editor Hua Wang
Mohamed A. Sharaf
Volume 8506 LNCS
Start page 160
End page 171
Total pages 12
Chapter number 14
Total chapters 21
Collection year 2015
Language eng
Abstract/Summary Current approaches to cyberbullying detection are mostly static: they are unable to handle noisy, imbalanced or streaming data efficiently. Existing studies on cyberbullying detection are mainly supervised learning approaches, which assume that the data is sufficiently pre-labelled. However, this is impractical in real-world situations, where only a small number of labels are available in streaming data. In this paper, we propose a semi-supervised learning approach that augments the training data samples and applies a fuzzy SVM algorithm. The augmented training technique automatically extracts and enlarges the training set from the unlabelled streaming text, while learning is conducted using a very small training set provided as an initial input. The experimental results indicate that the proposed augmented approach outperformed all other methods and is suitable for real-world situations, where sufficiently labelled instances are not available for training. With the proposed fuzzy SVM approach we handle the complex and multidimensional data generated by streaming text, where the importance of features is discriminated for the decision function. The evaluation conducted on different experimental scenarios indicates the superiority of the proposed fuzzy SVM over all other methods.
Keyword Cyberbullying Detection
Semi-supervised learning
Social Networks
Text stream classification
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

 
Citation counts: Cited 5 times in Scopus
Created: Tue, 29 Jul 2014, 12:33:21 EST by System User on behalf of School of Information Technol and Elec Engineering