Accessing phonetic variation in spoken language corpora through non-standard orthography

Schalley, Andrea C., Musgrave, Simon and Haugh, Michael (2014) Accessing phonetic variation in spoken language corpora through non-standard orthography. Australian Journal of Linguistics, 34(1): 139-170. doi:10.1080/07268602.2014.875459


Author Schalley, Andrea C.
Musgrave, Simon
Haugh, Michael
Title Accessing phonetic variation in spoken language corpora through non-standard orthography
Journal name Australian Journal of Linguistics
ISSN 1469-2996
0726-8602
Publication date 2014-02-20
Year available 2014
Sub-type Article (original research)
DOI 10.1080/07268602.2014.875459
Open Access Status Not Open Access
Volume 34
Issue 1
Start page 139
End page 170
Total pages 32
Place of publication Melbourne, VIC, Australia
Publisher Routledge
Language eng
Abstract Much of the sociolinguistic and stylistic variation which is of interest to linguists is phonetic in nature, but the access route to corpus data is typically via a textual transcription. This poses a significant problem for a researcher who wishes to access the original recordings of speech in order to analyse variation: how can they search for relevant data? Many transcription traditions allow for the representation of such variation through non-standard orthography, and such conventions should therefore allow access to data relevant to the study of variation. However, the specific conventions used vary between traditions (and indeed may not be applied consistently by individual transcribers). This then creates another problem where the researcher wishes to access data across an aggregated collection, which is a practical necessity given the relatively limited size of most corpora of spoken language. In this paper, we analyse the conventions used in two of the component collections in the Australian National Corpus, the Australian Radio Talkback Corpus and the Monash Corpus of Spoken English. On the basis of this analysis, we develop a fragment of an ontology which gives an explicit account of the phenomena related to non-standard pronunciation represented in the transcripts and which can therefore act as the basis for better searching of the collections and better access to relevant data for analysing sociolinguistic and stylistic variation.
Keyword Transcription
Orthography
Phonetic variation
Ontology
Corpus
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ
Additional Notes Link to publication: http://www.tandfonline.com/doi/full/10.1080/07268602.2014.875459

Document type: Journal Article
Collection: School of Languages and Cultures Publications
 
Created: Tue, 07 Jun 2016, 15:32:44 EST by Ms Katrina Hume on behalf of School of Languages and Cultures