Accent classification using support vector machines

Pedersen, Carol and Diederich, Joachim (2007). Accent classification using support vector machines. In: R. Lee, M. V. Chowdhury, S. Ray and T. Lee, Proceedings 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007). 6th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), Melbourne, Australia, (444-449). 11-13 July 2007.


Author Pedersen, Carol
Diederich, Joachim
Title of paper Accent classification using support vector machines
Conference Paper Type Fully Published Paper
Conference name 6th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007) (ERA 2010 Rank A)
DOI 10.1109/ICIS.2007.47
Conference location Melbourne, Australia
Conference dates 11-13 July 2007
Proceedings title Proceedings 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007)
Journal name 6th IEEE/ACIS International Conference on Computer and Information Science, Proceedings
Editor R. Lee
M. V. Chowdhury
S. Ray
T. Lee
Place published Los Alamitos, CA, United States
Publisher IEEE
Publication date 2007
ISBN 9780769528410
0769528414
Start page 444
End page 449
Total pages 6
Collection year 2008
Language eng
Abstract/Summary Accent is the pattern of pronunciation and acoustic features in speech that can identify a person's linguistic, social or cultural background. It is an important source of inter-speaker variability and a particular problem for automated speech recognition. Current approaches to identifying speaker accent may require specialised linguistic knowledge or analysis of particular speech contrasts, and often extensive pre-processing of large amounts of data. This paper studies an accent classification system that uses time-based segments of Mel Frequency Cepstral Coefficients as features and Support Vector Machines as the classifier, applied to a small corpus of two accents of English. On one- to four-second audio samples from three topics, accuracy in the binary classification task ranges from 75% to 97.5%, with very high recall and precision. With mismatched content, accuracy is at best 85%, with a tendency towards majority-class classification if the accent groups are significantly imbalanced.
Subjects 280206 Speech Recognition
780101 Mathematical sciences
E1
Keyword Cepstral analysis
Signal classification
Speaker recognition
Support vector machines
Q-Index Code E1
Q-Index Status Confirmed Code
Institutional Status UQ
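
The abstract's remark about majority-class classification under imbalance can be illustrated with a small sketch. The counts below (16 samples of one accent, 4 of the other), the labels "A" and "B", and the helper name binary_metrics are all hypothetical and are not taken from the paper. The sketch only shows that a classifier which always predicts the majority accent can still reach a high accuracy while its precision and recall for the minority accent are zero, which is why per-class figures matter alongside accuracy.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall), treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # Guard against division by zero when the class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical imbalanced test set: 16 samples of accent "A", 4 of accent "B".
y_true = ["A"] * 16 + ["B"] * 4

# A degenerate classifier that always predicts the majority class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

acc, prec_b, rec_b = binary_metrics(y_true, y_pred, positive="B")
print(f"accuracy={acc:.2f}, precision(B)={prec_b:.2f}, recall(B)={rec_b:.2f}")
```

Here accuracy equals the share of the majority class in the test set, so accuracy on an imbalanced accent corpus is best read against that baseline.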

 
Citation counts: Thomson Reuters Web of Science: cited 0 times
Scopus: cited 4 times
Created: Tue, 06 May 2008, 10:04:51 EST by Donna Clark on behalf of School of Information Technol and Elec Engineering