The University of Washington/Northwestern University (UW/NU) Corpus 1.0

This page contains information about the UW/NU corpus. All information found here is also contained in the README file included with the corpus. The entire corpus can be downloaded as a compressed .tar.gz archive (436 MB).

Citation

Please cite as:
McCloy, D. R., Souza, P. E., Wright, R. A., Haywood, J., Gehani, N., & Rudolph, S. (2013). The UW/NU corpus. Version 1.0. http://depts.washington.edu/phonlab/resources/uwnu/uwnu1

Audio files

The corpus includes 3600 audio files in WAV format, sampled at 44.1 kHz with 16-bit depth. Files are readings of 180 sentences by 20 different talkers (5 males and 5 females from each of two dialect regions of American English: the Pacific Northwest and the Northern Cities). The set of audio files has been RMS-normalized to equate intensity across all recordings in the corpus.
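As an illustration of what equating RMS intensity means, the following is a minimal sketch in Python. It is not the procedure used to build the corpus, which is not described here; the function names, the sample values, and the target level are all assumptions made for the example.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def scale_to_rms(samples, target_rms):
    """Return a copy of samples scaled so that its RMS equals target_rms."""
    current = rms(samples)
    if current == 0:
        return list(samples)  # a silent signal cannot be rescaled
    gain = target_rms / current
    return [s * gain for s in samples]

# Two made-up "recordings" with different intensities (hypothetical data).
quiet = [0.01, -0.02, 0.015, -0.005]
loud = [0.4, -0.5, 0.3, -0.2]

# Assumed common target level; any positive value works for the sketch.
target = 0.1
normalized = [scale_to_rms(x, target) for x in (quiet, loud)]
```

After scaling, both lists have the same RMS amplitude, which is the sense in which intensity is equated across recordings.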

TextGrids

A set of 3600 time-aligned transcriptions is included in the corpus. These are TextGrids for use with the Praat software[1] that have been automatically generated by the Penn Phonetics Lab forced aligner software[2] and are known to contain misalignments. They have NOT been checked or corrected by humans (much less by well-trained phoneticians or speech scientists). Use at your own risk.

Sentences

The sentence texts are drawn from the IEEE “Harvard” set.[3] Transcripts of the 180 sentences (along with their identification numbers) are included in the corpus in tab-delimited format. Individual transcript files for each sentence are also included. Sentence identification numbers are derived from the “list-sentence” notation of the original IEEE sentence lists: for example, sentence 01-07 corresponds to sentence #7 from list #1 of the original numbering scheme.
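The "list-sentence" notation can be converted to and from a pair of integers with a short Python sketch like the one below. The function names are hypothetical, and the two-digit zero padding is an assumption inferred from the single example 01-07.

```python
def parse_sentence_id(sentence_id):
    """Split an ID such as '01-07' into (list_number, sentence_number)."""
    list_part, sentence_part = sentence_id.split("-")
    return int(list_part), int(sentence_part)

def format_sentence_id(list_number, sentence_number):
    """Build an ID from its parts, assuming two-digit zero padding."""
    return f"{list_number:02d}-{sentence_number:02d}"

# Sentence 01-07 is sentence #7 from list #1 of the original numbering.
list_number, sentence_number = parse_sentence_id("01-07")
```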

Filename conventions

The first two characters in the filenames reflect the dialect region of the talker (PN = Pacific Northwest, NC = Northern Cities). The third character indicates talker gender, and the fourth and fifth characters are meaningless digits, serially assigned to talkers during corpus creation. After an underscore, the sentence identification number makes up the remainder of the filename before the extension. For example, file PNM02_01-07.wav is a recording of Pacific Northwest Male #02 reading sentence number 01-07.
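The filename convention above can be unpacked with a regular expression. The following Python sketch is one possible way to do it; the names `TalkerFile` and `parse_filename` are hypothetical, the gender letter is returned as-is rather than interpreted, and the pattern assumes a two-digit list and sentence number as in the example.

```python
import re
from typing import NamedTuple

# Region codes as given in the filename conventions.
REGIONS = {"PN": "Pacific Northwest", "NC": "Northern Cities"}

# region (2 letters), gender (1 letter), talker (2 digits), "_", sentence ID,
# then an optional file extension such as .wav or .TextGrid.
FILENAME_PATTERN = re.compile(
    r"^(PN|NC)([A-Za-z])(\d{2})_(\d{2}-\d{2})(?:\.\w+)?$"
)

class TalkerFile(NamedTuple):
    region: str       # full name of the dialect region
    gender: str       # raw gender letter from the filename
    talker: str       # two-digit talker number, kept as a string
    sentence_id: str  # "list-sentence" ID, e.g. "01-07"

def parse_filename(filename):
    """Parse a corpus filename; raise ValueError if it does not match."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    region_code, gender, talker, sentence_id = match.groups()
    return TalkerFile(REGIONS[region_code], gender, talker, sentence_id)

info = parse_filename("PNM02_01-07.wav")
```

Keeping the talker number as a string preserves the leading zero, which matters if the parts are later reassembled into a filename.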

References

[1] Boersma, P., & Weenink, D. (2013). Praat: Doing phonetics by computer. http://www.praat.org/

[2] Yuan, J., & Liberman, M. (2008). The Penn Phonetics Lab forced aligner. http://www.ling.upenn.edu/phonetics/p2fa/

[3] Rothauser, E. H., Chapman, W. D., Guttman, N., Hecker, M. H. L., Nordby, K. S., Silbiger, H. R., Urbanek, G. E., & Weinstock, M. (1969). IEEE recommended practice for speech quality measurements. IEEE Transactions on Audio and Electroacoustics, 17, 225–246. DOI: 10.1109/TAU.1969.1162058