Current Projects

ATAROS

The ATAROS (Automatic Tagging and Recognition of Stance) project aims to identify acoustic signals of stance-taking (opinions, evaluations, judgments, etc.) in order to inform the development of automatic stance recognition in natural speech. Because existing corpora generally contain little stance-taking in conversation, we are creating an audio corpus of dyads completing collaborative tasks designed to elicit a high density of stance-taking at increasing levels of involvement. Funded by NSF grant IIS-1351034, awarded to PIs Gina-Anne Levow, Richard Wright, and Mari Ostendorf.
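
As a minimal illustration of the kind of acoustic analysis this work informs, the sketch below extracts simple pitch and intensity summaries per utterance and fits a toy stance classifier. It is not the ATAROS pipeline; the file names, labels, and feature choices are hypothetical, and it assumes the librosa and scikit-learn libraries.

    # Illustrative only, not the ATAROS pipeline: prosodic features for
    # a toy stance classifier. File names and labels are hypothetical.
    import librosa
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def prosodic_features(wav_path):
        """Summarize pitch and intensity over one utterance."""
        y, sr = librosa.load(wav_path, sr=16000)
        f0, voiced, _ = librosa.pyin(y, fmin=75, fmax=400, sr=sr)
        f0 = f0[voiced]                    # keep voiced frames only
        rms = librosa.feature.rms(y=y)[0]  # frame-level intensity
        return np.array([f0.mean(), f0.std(),     # pitch level and range
                         rms.mean(), rms.std()])  # loudness level and range

    # Hypothetical labeled utterances: 1 = stance-taking, 0 = neutral.
    data = [("dyad01_u001.wav", 1), ("dyad01_u002.wav", 0)]
    X = np.array([prosodic_features(path) for path, _ in data])
    y = np.array([label for _, label in data])
    clf = LogisticRegression().fit(X, y)  # baseline stance recognizer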

For access to the corpus for research purposes, email ataros@uw.edu with your institutional affiliation and a short description of the data you want and how you plan to use it.

Luan, Y., Wright, R.A., Ostendorf, M., & Levow, G.-A. (to appear). Relating automatic vowel space estimates to talker intelligibility. Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), Singapore, Sept. 14-18, 2014. [manuscript]

Technical Report 1 (May 2014): Corpus collection and initial task validation [manuscript]

Freeman, V., Levow, G.-A., & Wright, R.A. (2014). Phonetic marking of stance in a collaborative-task spontaneous-speech corpus. Presented at the 167th Meeting of the Acoustical Society of America (ASA), Providence, RI, May 5-9, 2014. [handout]

Freeman, V., Chan, J., Levow, G.-A., Wright, R.A., Ostendorf, M., & Zayats, V. (to appear). Manipulating stance and involvement using collaborative tasks: An exploratory comparison. Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), Singapore, Sept. 14-18, 2014. [manuscript]

Perceptual Adaptation to Distortions

This project examines several issues related to speech perception under conditions of distortion (e.g., hearing loss, hearing-aid amplification, dialect differences, foreign accent). Current work focuses on cross-dialect perception and talker familiarity, and on how they interact with different types of hearing loss. This research is funded by NIDCD flow-through funds to Pamela Souza at Northwestern University (NIH grant R01DC006014) and is a collaboration with Northwestern's Hearing Aid Lab.

McCloy, D.R., Wright, R.A., & Souza, P.E. (submitted). A symmetrical cross-dialect study of acoustic predictors of speech intelligibility.

McCloy, D.R., Wright, R.A., & Souza, P.E. (2014). Modeling intrinsic intelligibility variation: Vowel-space size and structure. Proceedings of Meetings on Acoustics, 18, 060007. doi:10.1121/1.4870070.

Souza, P.E., Gehani, N., Wright, R.A., & McCloy, D.R. (2013). The advantage of knowing the talker. Journal of the American Academy of Audiology, 24, 689–700. doi:10.3766/jaaa.24.8.6. [manuscript]

McCloy, D.R. (2013). Separating segmental and prosodic contributions to intelligibility. Poster presented at the 4th International Summer School on Speech Production and Perception: Speaker-Specific Behavior, Aix-en-Provence, France.

McCloy, D.R., Wright, R.A., & McGrath, A. (2012). Modelling talker intelligibility variation in a dialect-controlled corpus. Poster presented at the 164th Meeting of the Acoustical Society of America, Kansas City, MO. doi:10.1121/1.4755663.

Wright, R.A., & Souza, P.E. (2012). Comparing identification of standardized and regionally-valid vowels. Journal of Speech, Language, and Hearing Research.

Souza, P.E., Wright, R.A., & Bor, S. (2012). Consequences of broad auditory filters for identification of multichannel-compressed vowels. Journal of Speech, Language, and Hearing Research, 55(2), 474–486.

Bor, S., Souza, P.E., & Wright, R.A. (2008). Multichannel compression: Effects of reduced spectral contrast on vowel identification. Journal of Speech, Language, and Hearing Research, 51(5), 1315–1327. doi:10.1044/1092-4388(2008/07-0009).

PHOIBLE

PHOIBLE (Phonetics Information Base and Lexicon) is a knowledge base of phonological inventories, structured as a queryable and extensible mathematical graph. The knowledge base includes allophonic detail for many languages, and every phone is encoded both as IPA Unicode and as a vector of distinctive features, allowing "fuzzy" queries for natural classes of sounds rather than searches over individual glyphs. As of January 2012, the knowledge base included over 1,500 languages.
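
The "fuzzy" query idea can be shown with a toy example: instead of searching for glyphs, you match phones against a partial feature specification. The Python sketch below uses a small hand-made, in-memory inventory; PHOIBLE's actual data model and feature set are far richer.

    # Toy feature-based query over a hand-made inventory; PHOIBLE's real
    # data model is a graph with much richer structure and feature detail.
    inventory = {
        "s": {"continuant": "+", "voice": "-", "sonorant": "-"},
        "z": {"continuant": "+", "voice": "+", "sonorant": "-"},
        "t": {"continuant": "-", "voice": "-", "sonorant": "-"},
        "n": {"continuant": "-", "voice": "+", "sonorant": "+"},
    }

    def natural_class(inventory, **spec):
        """Return phones whose features match every value in spec."""
        return [seg for seg, feats in inventory.items()
                if all(feats.get(f) == v for f, v in spec.items())]

    # All voiced continuant obstruents (here, just /z/):
    print(natural_class(inventory, continuant="+", voice="+", sonorant="-"))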

Moran, S., McCloy, D.R., & Wright, R.A. (eds.) (2014). PHOIBLE Online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at phoible.org)

McCloy, D.R., Moran, S., & Wright, R.A. (2013). Revisiting ‘The role of features in phonological inventories’. Paper presented at the CUNY Conference on the Feature in Phonology and Phonetics, New York, NY.

Moran, S., McCloy, D.R., & Wright, R.A. (2012). Revisiting population size vs. phoneme inventory size. Language, 88(4), 877–893. doi:10.1353/lan.2012.0087.

Moran, S., McCloy, D.R., & Wright, R.A. (2012). Revisiting the population vs. phoneme-inventory correlation. Presented at the 86th Meeting of the Linguistic Society of America, Portland, OR. [slides] [extended abstract]

Moran, S. (2011). Technological infrastructure for comparative linguistics and linguistic fieldwork: Some case studies. Presented at Max-Planck-Institut für evolutionäre Anthropologie, Leipzig, 27 January 2011.


Past Projects

Vocal Joystick

The Vocal Joystick Project sought to develop a control device driven by both continuous and discrete dimensions of human vocalization. Four continuous dimensions can be extracted from vowel-like vocalizations: vowel height, vowel backness, pitch, and intensity. In addition, discrete control can be achieved through brief obstruent-like sounds and through words. The extracted dimensions can serve as control parameters for a variety of devices; the most obvious application is computer control in GUI environments, but others include robotic arms, thermostats, and lighting levels.
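
To make the control mapping concrete, the sketch below maps estimated vowel backness and height to cursor direction and uses intensity as a speed control. The function name, value ranges, and gain are illustrative assumptions, not the Vocal Joystick's actual parameters.

    # Illustrative mapping from extracted vocal dimensions to 2-D cursor
    # motion; ranges, names, and gain are assumptions, not project values.
    def cursor_velocity(backness, height, intensity, gain=100.0):
        """Map vocal dimensions to (dx, dy) in pixels per frame.

        backness, height: vowel-space estimates in [0, 1]
        intensity: normalized loudness in [0, 1], scaling the speed
        """
        dx = (backness - 0.5) * 2  # front <-> back steers left/right
        dy = (height - 0.5) * 2    # low <-> high steers down/up
        return dx * intensity * gain, dy * intensity * gain

    # A loud, high front vowel like [i] moves the cursor up and to the left:
    print(cursor_velocity(backness=0.1, height=0.9, intensity=0.8))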

Bilmes, J., Malkin, J., Li, X., Harada, S., Kilanski, K., Kirchhoff, K., Wright, R.A., Subramanya, A., Landay, J., Dowden, P., & Chizeck, H. (2006). The Vocal Joystick. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, May 2006. [pdf] [bibtex]

Kilanski, K., Malkin, J., Li, X., Wright, R.A., & Bilmes, J. (2006). The Vocal Joystick Data Collection Effort and Vowel Corpus. Interspeech, Pittsburgh, Sep. 2006. [pdf] [bibtex]

Li, X., Malkin, J., Harada, S., Bilmes, J., Wright, R.A., & Landay, J. (2006). An Online Adaptive Filtering Algorithm for the Vocal Joystick. Interspeech, Pittsburgh, Sep. 2006. [pdf] [bibtex] [video]

Bilmes, J., Li, X., Malkin, J., Kilanski, K., Wright, R.A., Kirchhoff, K., Subramanya, A., Harada, S., Landay, J., Dowden, P., & Chizeck, H. (2005). The Vocal Joystick: A Voice-Based Human-Computer Interface for Individuals with Motor Impairments. Human Language Technology Conf. and Conf. on Empirical Methods in Natural Language Processing, Vancouver, Canada, Oct. 2005. [pdf] [bibtex]