Accent classification using support vector machines

Pedersen, Carol and Diederich, Joachim (2007). Accent classification using support vector machines. In: R. Lee, M. V. Chowdhury, S. Ray and T. Lee, Proceedings 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007). 6th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), Melbourne, Australia, (444-449). 11-13 July 2007.


Author Pedersen, Carol
Diederich, Joachim
Title of paper Accent classification using support vector machines
Conference name 6th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007)
Conference location Melbourne, Australia
Conference dates 11-13 July 2007
Proceedings title Proceedings 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007)
Journal name 6th IEEE/ACIS International Conference on Computer and Information Science, Proceedings
Place of Publication Los Alamitos, CA, United States
Publisher IEEE
Publication Year 2007
Sub-type Fully published paper
DOI 10.1109/ICIS.2007.47
ISBN 9780769528410
0769528414
Editor R. Lee
M. V. Chowdhury
S. Ray
T. Lee
Start page 444
End page 449
Total pages 6
Collection year 2008
Language eng
Abstract/Summary Accent is the pattern of pronunciation and acoustic features in speech that can identify a person's linguistic, social or cultural background. It is an important source of inter-speaker variability and a particular problem for automated speech recognition. Current approaches to identifying speaker accent may require specialised linguistic knowledge or analysis of particular speech contrasts, and often extensive pre-processing of large amounts of data. An accent classification system that uses time-based segments of Mel Frequency Cepstral Coefficients as features and employs Support Vector Machines is studied on a small corpus of two accents of English. On one- to four-second audio samples from three topics, accuracy in the binary classification task ranges from 75% to 97.5%, with very high recall and precision. With mismatched content, accuracy is at best 85%, with a tendency towards majority-class classification if the accent groups are significantly imbalanced.
Subjects 280206 Speech Recognition
780101 Mathematical sciences
E1
Keyword Cepstral analysis
Signal classification
Speaker recognition
Support vector machines
Q-Index Code E1
Q-Index Status Confirmed Code
Institutional Status UQ
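
The approach summarised in the abstract (fixed-length, time-based segments of cepstral frames fed to a binary SVM and scored by accuracy, precision and recall) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the frames are synthetic Gaussian vectors standing in for real MFCCs, the segment length and coefficient count are arbitrary, and the classifier is a linear SVM trained by Pegasos-style sub-gradient descent, since the abstract does not state the kernel or toolkit used. All function names are hypothetical.

```python
import random

def make_segments(frames, frames_per_segment):
    """Concatenate consecutive frames into fixed-length segment vectors."""
    segments = []
    last_start = len(frames) - frames_per_segment
    for start in range(0, last_start + 1, frames_per_segment):
        window = frames[start:start + frames_per_segment]
        segments.append([v for frame in window for v in frame])
    return segments  # trailing frames that do not fill a segment are dropped

def synthetic_frames(rng, n_frames, n_coeffs, mean):
    """Assumption: Gaussian vectors around a class mean stand in for MFCC frames."""
    return [[rng.gauss(mean, 1.0) for _ in range(n_coeffs)] for _ in range(n_frames)]

def score(w, x):
    # The last weight acts as a bias through a constant feature of 1.0.
    return sum(wj * xj for wj, xj in zip(w, x + [1.0]))

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style sub-gradient descent on the hinge loss; labels are +1 / -1."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            violated = y[i] * score(w, X[i]) < 1.0
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularisation shrink
            if violated:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i] + [1.0])]
    return w

def predict(w, x):
    return 1 if score(w, x) >= 0.0 else -1

def evaluate(w, X, y):
    """Return (accuracy, precision, recall), treating +1 as the positive class."""
    pairs = [(predict(w, x), label) for x, label in zip(X, y)]
    tp = sum(1 for p, l in pairs if p == 1 and l == 1)
    fp = sum(1 for p, l in pairs if p == 1 and l == -1)
    fn = sum(1 for p, l in pairs if p == -1 and l == 1)
    accuracy = sum(1 for p, l in pairs if p == l) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Build a toy two-class data set: one synthetic "recording" per accent group.
rng = random.Random(42)
X, y = [], []
for label, mean in ((1, 2.0), (-1, -2.0)):
    frames = synthetic_frames(rng, n_frames=200, n_coeffs=3, mean=mean)
    for segment in make_segments(frames, frames_per_segment=4):
        X.append(segment)
        y.append(label)

# Shuffle, then hold out 20% of the segments for testing.
indices = list(range(len(X)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
X_train = [X[i] for i in indices[:split]]
y_train = [y[i] for i in indices[:split]]
X_test = [X[i] for i in indices[split:]]
y_test = [y[i] for i in indices[split:]]

w = train_linear_svm(X_train, y_train)
accuracy, precision, recall = evaluate(w, X_test, y_test)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

The toy classes are deliberately well separated, so the scores printed here say nothing about the results reported in the abstract. With real recordings, the synthetic frames would be replaced by MFCC frames computed from audio, and a strongly imbalanced split between the two groups would be expected to push predictions towards the majority class, as the abstract notes.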

 
Citation counts: Thomson Reuters Web of Science: cited 1 time
Scopus: cited 6 times
Created: Tue, 06 May 2008, 10:04:51 EST by Donna Clark on behalf of School of Information Technol and Elec Engineering