Perception of synthesized voice quality in connected speech by Cantonese speakers

Yiu, Edwin M-L, Murdoch, Bruce, Hird, Kathryn and Lau, Polly (2002) Perception of synthesized voice quality in connected speech by Cantonese speakers. Journal of The Acoustical Society of America, 112 3: 1091-1101. doi:10.1121/1.1500753

Attached Files
Name Description MIMEType Size Downloads
UQ62615_OA.pdf Full text (open access) application/pdf 170.68KB 0

Author Yiu, Edwin M-L
Murdoch, Bruce
Hird, Kathryn
Lau, Polly
Title Perception of synthesized voice quality in connected speech by Cantonese speakers
Journal name Journal of The Acoustical Society of America
ISSN 0001-4966
Publication date 2002-09
Year available 2002
Sub-type Article (original research)
DOI 10.1121/1.1500753
Open Access Status File (Publisher version)
Volume 112
Issue 3
Start page 1091
End page 1101
Total pages 11
Place of publication Melville, NY, United States
Publisher AIP Publishing LLC
Collection year 2002
Language eng
Subject C1
321025 Rehabilitation and Therapy - Hearing and Speech
730303 Occupational, speech and physiotherapy
Abstract Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One of the ways to improve the reliability of this procedure is to provide judges with signals as external standards so that comparison can be made in relation to these anchor signals. The present study used a Klatt speech synthesizer to create a set of speech signals with varying degree of three different voice qualities based on a Cantonese sentence. The primary objective of the study was to determine whether different abnormal voice qualities could be synthesized using the built-in synthesis parameters using a perceptual study. The second objective was to determine the relationship between acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathiness while signals with a small degree of flutter or double pulsing were perceived as roughness. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required for male voice stimuli was different from that required for the female voice stimuli with a similar level of perceptual breathiness and roughness. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. 
This opens up the possibility of using synthesized voice signals as external standards or anchors to improve the reliability of clinical perceptual voice evaluation. © 2002 Acoustical Society of America.
Keyword Acoustics
Vocal Quality
Q-Index Code C1
Institutional Status UQ

Document type: Journal Article
Collections: Excellence in Research Australia (ERA) - Collection
School of Health and Rehabilitation Sciences Publications
Citation counts: Thomson Reuters Web of Science: cited 13 times
Scopus: cited 14 times
Created: Tue, 14 Aug 2007, 17:59:30 EST