Tuesday, August 7, 2012

What the WVDP did all summer (and last year)

The West Virginia Dialect Project spent 14 months creating “sound slides” from interviews with the 67 speakers in the West Virginia Corpus of English in Appalachia (WVCEA). These sound slides are typed transcripts of audio recordings, manually time-aligned so that the text flows along with the audio as it plays. The transcripts are stored in Praat TextGrids with boundaries marked between utterances. An utterance is defined as a section of speech surrounded by silences of 0.06 seconds or longer. Manual time-alignment was a painstaking process that required careful attention to details such as appropriate spacing, correct spelling, background noise, and precise boundaries. The sound slides are stored in a searchable database, which will be the basis for many future research projects.
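To make the utterance definition concrete, here is a minimal sketch in Python. It is not the project's workflow (the alignment was done by hand in Praat); it only illustrates the 0.06-second rule. The function name, the `(start, end)` tuple format, and the sample times are assumptions made up for this example.

```python
# Illustrative only: the WVDP marked boundaries manually in Praat.
# This sketch just applies the stated rule that an utterance is speech
# bounded by silences of 0.06 seconds or longer.

MIN_SILENCE = 0.06  # seconds, the threshold given in the post


def group_into_utterances(speech_intervals, min_silence=MIN_SILENCE):
    """Merge (start, end) speech intervals whose gap is shorter than min_silence.

    speech_intervals is a hypothetical list of (start, end) times in seconds.
    """
    utterances = []
    for start, end in sorted(speech_intervals):
        if utterances and start - utterances[-1][1] < min_silence:
            # The pause is too short to count as a boundary,
            # so extend the current utterance.
            prev_start, prev_end = utterances[-1]
            utterances[-1] = (prev_start, max(prev_end, end))
        else:
            # The pause is long enough (or this is the first interval),
            # so start a new utterance.
            utterances.append((start, end))
    return utterances


# Made-up times: a 0.02 s pause is merged, a 0.5 s pause is a boundary.
sample = [(0.0, 1.0), (1.02, 2.0), (2.5, 3.0)]
utterances = group_into_utterances(sample)
```

With these made-up times, the first two stretches of speech fall into one utterance and the third becomes its own.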

 
The process was time-consuming. It took approximately an hour to manually time-align two and a half minutes of audio in Praat. Most of the interviews in the WVCEA involve one or two speakers and last about an hour. Of course, there are exceptions. A few of the interviews have up to seven speakers and can last more than two hours, which greatly increased the amount of work required to time-align them. Because of the need for accuracy, a second research assistant reviewed the TextGrids for errors after the initial time-alignment of each interview. Altogether, it took between 25 and 40 hours to align each interview and over 2,000 hours to manually time-align the entire corpus. This work was supported by two National Science Foundation grants, “A Sociolinguistic Baseline for English in Appalachia” (BCS-0743489) and “Phonetic Variation in Appalachia” (BCS-1120156).
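The rate quoted above (about one hour of work per two and a half minutes of audio) can be turned into a rough back-of-the-envelope estimate. The sketch below is only that: the function name is made up, and it covers the first alignment pass alone, not the second assistant's review.

```python
# Rough estimate based on the rate stated in the post:
# about 1 hour of work per 2.5 minutes of audio (first pass only).

AUDIO_MINUTES_PER_WORK_HOUR = 2.5


def first_pass_hours(audio_minutes):
    """Estimate hours of manual alignment for a recording of the given length."""
    return audio_minutes / AUDIO_MINUTES_PER_WORK_HOUR


one_hour_interview = first_pass_hours(60)    # a typical hour-long interview
long_interview = first_pass_hours(150)       # a 2.5-hour interview
```

By this estimate an hour-long interview needs about 24 hours for the first pass, which sits just under the reported 25 to 40 hours per interview once review time is added.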

A Sample of TextGrids

Speaker(s)               # of Speakers (including interviewer)   # of Utterances   Length of Audio (h:mm)
Gbr 3                    2                                       2545              0:43
Barb 1                   2                                       5296              1:00
Mon 7 / Mon 8            3                                       7326              1:20
Boon 3 / Boon 4          4                                       9309              1:32
Log 2 / Log 3 / Log 4    7                                       12359             2:30
