Karen Livescu

Research Assistant Professor
Toyota Technological Institute at Chicago
email: klivescu at uchicago.edu

My main research interests are in speech and language processing, with a slant toward combining statistical modeling techniques with knowledge from linguistics and speech science.

I completed my PhD in 2005 in the Spoken Language Systems group of the MIT Computer Science and Artificial Intelligence Laboratory, advised by Jim Glass, on modeling spoken word pronunciations using a dynamic Bayesian network model of sub-phonetic features. From 2005 to 2007 I was a post-doctoral lecturer in the EECS department at MIT. In the summer of 2006, I led a team project at the Johns Hopkins summer workshop on speech and language engineering, on articulatory feature-based methods for speech recognition.



Refereed publications:

O. Cetin, M. Magimai-Doss, K. Livescu, A. Kantor, S. King, C. Bartels, and J. Frankel, "Monolingual and crosslingual comparison of tandem features derived from articulatory and phone MLPs," in Proc. ASRU, Kyoto, Japan, December 2007.

J. Frankel, M. Magimai-Doss, S. King, K. Livescu, and O. Cetin, "Articulatory feature classifiers trained on 2000 hours of telephone speech," in Proc. Interspeech, Antwerp, Belgium, August 2007.

M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko, "Audiovisual speech recognition with articulator positions as hidden variables," in Proc. ICPhS, Saarbruecken, Germany, August 2007.

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko, "Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop," in Proc. ICASSP, Honolulu, April 2007.

O. Cetin, A. Kantor, S. King, C. Bartels, M. Magimai-Doss, J. Frankel, and K. Livescu, "An articulatory feature-based tandem approach and factored observation modeling," in Proc. ICASSP, Honolulu, April 2007.

K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie, "Manual transcription of conversational speech at the articulatory feature level," in Proc. ICASSP, Honolulu, Hawaii, April 2007.

S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester, "Speech production knowledge in automatic speech recognition," Journal of the Acoustical Society of America 121(2):723-742, February 2007.

K. Saenko and K. Livescu, "An Asynchronous DBN for Audio-Visual Speech Recognition," in Proc. IEEE Workshop on Spoken Language Technologies, December 2006.

K. Saenko, K. Livescu, M. Siracusa, K. Wilson, J. Glass, and T. Darrell, "Visual speech recognition with loosely synchronized feature streams," in Proc. ICCV, Beijing, October 2005.

T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," Speech Communication 46(2):189-203, June 2005.

M. Hasegawa-Johnson, J. Baker, S. Borys, K. Chen, E. Coogan, S. Greenberg, A. Juneja, K. Kirchhoff, K. Livescu, K. Sonmez, S. Mohan, J. Muller, and T. Wang, "Landmark-based speech recognition: Report of the 2004 Johns Hopkins Summer Workshop," in Proc. ICASSP, Philadelphia, March 2005.

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Production domain modeling of pronunciation for visual speech recognition," in Proc. ICASSP, Philadelphia, March 2005.

K. Livescu and J. Glass, "Feature-based pronunciation modeling with trainable asynchrony probabilities," in Proc. ICSLP, Jeju, South Korea, October 2004.

K. Livescu and J. Glass, "Feature-based pronunciation modeling for speech recognition," in Proc. HLT/NAACL, Boston, May 2004.

K. Livescu, J. Glass, and J. Bilmes, "Hidden feature modeling for speech recognition using dynamic Bayesian networks," in Proc. EUROSPEECH, Geneva, Switzerland, August-September 2003.

T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," in Proc. ISCA Tutorial and Research Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language, Estes Park, Colorado, September 2002.

G. Zweig, J. Bilmes, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Structurally discriminative graphical models for automatic speech recognition -- results from the 2001 Johns Hopkins Summer Workshop," in Proc. ICASSP, Orlando, Florida, May 2002.

K. Livescu and J. Glass, "Segment-based recognition on the PhoneBook task: Initial results and observations on duration modeling," in Proc. EUROSPEECH, Aalborg, Denmark, September 2001.

K. Livescu and J. Glass, "Lexical modeling of non-native speech for automatic speech recognition," in Proc. ICASSP, Istanbul, Turkey, June 2000.


Technical reports and theses:

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, and B. Woods, "Articulatory Feature-based Methods for Acoustic and Audio-Visual Speech Recognition: 2006 JHU Summer Workshop Final Report." Center for Language and Speech Processing, Johns Hopkins University.

K. Livescu, "Feature-Based Pronunciation Modeling for Automatic Speech Recognition." Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science, September 2005.

M. Hasegawa-Johnson, J. Baker, S. Greenberg, K. Kirchhoff, J. Muller, K. Sonmez, S. Borys, K. Chen, A. Juneja, K. Livescu, S. Mohan, E. Coogan, and T. Wang, "Landmark-based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop," Johns Hopkins University 2004 Summer Workshop final report.

J. Bilmes, G. Zweig, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Discriminatively Structured Graphical Models for Speech Recognition." Johns Hopkins University 2001 Summer Workshop final report.

K. Livescu, "Analysis and Modeling of Non-Native Speech for Automatic Speech Recognition." S.M. Thesis, MIT Department of Electrical Engineering and Computer Science, August 1999.

K. Livescu, "Analysis of Human and Parrot Phonation Using an Energy Operator and Energy Separation Algorithm." A.B. Thesis, Princeton Department of Physics, April 1996.


Other presentations, abstracts:

"Phonological models in automatic speech recognition," invited talk at ACL SIGMORPHON workshop, 2008.

       Note that references in this talk are not exhaustive.

       Link to X-ray video in talk.

K. Livescu, X. Chi, L. Lavoie, A. Bezman, N. Borges, and L. Yung, "A study of manual articulatory feature-based transcription of conversational speech" (abstract and poster), Acoustical Society of America meeting, Nov.-Dec. 2006.

K. Livescu and J. Glass, "Feature-based pronunciation modeling for automatic speech recognition" (abstract and poster), presented at "From Sound to Sense: 50+ Years of Discoveries in Speech Communication", Cambridge, MA, June 2004.



Some neat speech links:

Listen to the sounds of the IPA chart

Why is it hard to understand the lyrics in high soprano singing? (It is not because they are singing in Middle High German)

An interactive vocal tract demo

A formant synthesis demo


Personal