What the WVDP did all summer (and last year)
The West Virginia Dialect Project (WVDP) spent 14 months creating “sound slides” from
interviews with the 67 speakers in the West Virginia Corpus of English in Appalachia
(WVCEA). Sound slides are typed transcripts of audio recordings, manually time-aligned
so that the text flows along with the audio as it plays. The transcripts are stored in
Praat TextGrids, with boundaries marked between utterances. An utterance is defined as a
section of speech surrounded by silences of 0.06 seconds or longer. Manual
time-alignment was a painstaking process that demanded careful attention to
details such as appropriate spacing, correct spelling, background noise, and
precise boundaries. The sound slides are stored in a searchable database, which
will be the basis for many future research projects.
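To make the utterance definition concrete, here is a minimal Python sketch that groups stretches of speech into utterances whenever the silence between them is shorter than 0.06 seconds. It is only an illustration of the definition: the WVDP alignment was done by hand in Praat, and the function name, the `(start, end, text)` tuple layout, and the sample timings are all hypothetical.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, text); a hypothetical layout for this sketch
Interval = Tuple[float, float, str]

MIN_SILENCE = 0.06  # silence threshold stated in the post, in seconds


def group_into_utterances(
    pieces: List[Interval], min_silence: float = MIN_SILENCE
) -> List[Interval]:
    """Merge neighbouring stretches of speech separated by less than min_silence."""
    utterances: List[Interval] = []
    for start, end, text in sorted(pieces):
        if utterances and start - utterances[-1][1] < min_silence:
            # Gap is too short to count as a boundary: extend the current utterance.
            prev_start, prev_end, prev_text = utterances[-1]
            utterances[-1] = (prev_start, max(prev_end, end), prev_text + " " + text)
        else:
            # Gap is at least min_silence (or this is the first piece): new utterance.
            utterances.append((start, end, text))
    return utterances


# Made-up timings: a 0.02 s gap is merged, a 0.6 s gap starts a new utterance.
sample = [(0.0, 0.4, "well"), (0.42, 0.9, "I"), (1.5, 2.0, "reckon")]
print(group_into_utterances(sample))
```

In practice the hard part is deciding where speech actually starts and stops amid background noise, which is why the boundaries were set manually.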
The process was time-consuming. It took approximately an hour to manually time-align two and a half
minutes of audio in Praat. Most of the interviews in the WVCEA consist of one
or two speakers and last about an hour. Of course, there are exceptions. A few
of the interviews have up to seven speakers and can last more than two hours, greatly
increasing the amount of work required to manually time-align those interviews.
Because of the need for accuracy, after the initial time-aligning of an
interview, a second research assistant reviewed the TextGrids for errors.
Altogether, it took between 25 and 40 hours to align each
interview and over 2,000 hours to manually time-align the entire corpus. This
work was supported by two National Science Foundation grants, “A
Sociolinguistic Baseline for English in Appalachia” (BCS-0743489) and “Phonetic Variation in Appalachia” (BCS-1120156).
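The per-interview estimate can be checked against the stated alignment rate of about two and a half minutes of audio per hour of work. The short Python sketch below does that arithmetic for the first pass only; the function and constant names are made up for illustration, and the time spent on second-reader review is not reported separately in this post, so it is left out.

```python
MINUTES_ALIGNED_PER_HOUR = 2.5  # rate stated in the post


def first_pass_hours(audio_minutes: float) -> float:
    """Hours of manual alignment needed for a recording, before any review."""
    return audio_minutes / MINUTES_ALIGNED_PER_HOUR


# A typical interview lasts about an hour.
print(first_pass_hours(60))  # 24.0
```

Roughly 24 hours of first-pass work for a one-hour interview, plus review by a second research assistant, is consistent with the lower end of the 25 to 40 hour range; longer interviews with more speakers push the figure up.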
A Sample of TextGrids
| Speaker(s) | # of Speakers (including interviewer) | # of Utterances | Length of Audio (h:mm) |
|---|---|---|---|
| Gbr 3 | 2 | 2545 | 0:43 |
| Barb 1 | 2 | 5296 | 1:00 |
| Mon 7/ Mon 8 | 3 | 7326 | 1:20 |
| Boon 3/ Boon 4 | 4 | 9309 | 1:32 |
| Log 2/ Log 3/ Log 4 | 7 | 12359 | 2:30 |