Karen Livescu

Assistant Professor
Toyota Technological Institute at Chicago

Assistant Professor (part time)
University of Chicago Department of Computer Science

email: klivescu at ttic.edu

NEW: Post-doc opportunity!

NEW: Student opportunities

NEW: CMSC 35900: Topics in Artificial Intelligence: Speech Technologies (Autumn 2009)

My main research interests are in speech and language processing, with a slant toward combining statistical modeling techniques with knowledge from linguistics and speech science.

As of September 2008, I am an Assistant Professor at TTI-Chicago, a philanthropically endowed academic computer science institute located on the University of Chicago campus. We are recruiting students to our PhD program and intern program, as well as additional faculty, including in speech and language-related areas; please email me for details.

I completed my PhD in 2005 at MIT in the Spoken Language Systems group of the Computer Science and Artificial Intelligence Laboratory. In 2005-2007 I was a post-doctoral lecturer in the MIT EECS department. In the summer of 2006, I led a team project at the Johns Hopkins summer workshop on speech and language engineering, on the topic of articulatory feature-based methods for speech recognition. In Feb.-Aug. 2008 I was a Research Assistant Professor at TTI-Chicago.



OLD NEWS:

Illinois Speech Day, May 5, 2009



Refereed publications:

A. Margolis, M. Ostendorf, and K. Livescu, "Cross-genre training for automatic prosody classification," Speech Prosody, May 2010, to appear.

K. Livescu and M. Stoehr, "Multi-view learning of acoustic features for speaker recognition," ASRU, Dec. 2009.

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Multistream articulatory feature-based models for visual speech recognition," IEEE Trans. Pattern Analysis and Machine Intelligence 31(9):1700-1707, September 2009.

K. Chaudhuri, S. Kakade, K. Livescu, and K. Sridharan, "Multi-view clustering via canonical correlation analysis," ICML, June 2009.

K. Livescu, B. Zhu, and J. Glass, "On the phonetic information in ultrasonic microphone signals," ICASSP, April 2009.

O. Cetin, M. Magimai-Doss, K. Livescu, A. Kantor, S. King, C. Bartels, and J. Frankel, "Monolingual and crosslingual comparison of tandem features derived from articulatory and phone MLPs," in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), December 2007.

J. Frankel, M. Magimai-Doss, S. King, K. Livescu, and O. Cetin, "Articulatory feature classifiers trained on 2000 hours of telephone speech," in Proc. Interspeech, August 2007.

M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko, "Audiovisual speech recognition with articulator positions as hidden variables," in Proc. ICPhS, August 2007.

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko, "Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop," in Proc. ICASSP, April 2007.

O. Cetin, A. Kantor, S. King, C. Bartels, M. Magimai-Doss, J. Frankel, and K. Livescu, "An articulatory feature-based tandem approach and factored observation modeling," in Proc. ICASSP, April 2007.

K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie, "Manual transcription of conversational speech at the articulatory feature level," in Proc. ICASSP, April 2007.

S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester, "Speech production knowledge in automatic speech recognition," Journal of the Acoustical Society of America 121(2):723-742, February 2007.

K. Saenko and K. Livescu, "An Asynchronous DBN for Audio-Visual Speech Recognition," in Proc. IEEE Workshop on Spoken Language Technologies (SLT), December 2006.

K. Saenko, K. Livescu, M. Siracusa, K. Wilson, J. Glass, and T. Darrell, "Visual speech recognition with loosely synchronized feature streams," in Proc. ICCV, October 2005.

T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," Speech Communication 46(2):189-203, June 2005.

M. Hasegawa-Johnson, J. Baker, S. Borys, K. Chen, E. Coogan, S. Greenberg, A. Juneja, K. Kirchhoff, K. Livescu, K. Sonmez, S. Mohan, J. Muller, and T. Wang, "Landmark-based speech recognition: Report of the 2004 Johns Hopkins Summer Workshop," in Proc. ICASSP, March 2005.

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Production domain modeling of pronunciation for visual speech recognition," in Proc. ICASSP, March 2005.

K. Livescu and J. Glass, "Feature-based pronunciation modeling with trainable asynchrony probabilities," in Proc. ICSLP, October 2004.

K. Livescu and J. Glass, "Feature-based pronunciation modeling for speech recognition," in Proc. HLT/NAACL, May 2004.

K. Livescu, J. Glass, and J. Bilmes, "Hidden feature modeling for speech recognition using dynamic Bayesian networks," in Proc. EUROSPEECH, August-September 2003.

T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," in Proc. ISCA Tutorial and Research Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language (PMLA), September 2002.

G. Zweig, J. Bilmes, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Structurally discriminative graphical models for automatic speech recognition -- results from the 2001 Johns Hopkins Summer Workshop," in Proc. ICASSP, May 2002.

K. Livescu and J. Glass, "Segment-based recognition on the PhoneBook task: Initial results and observations on duration modeling," in Proc. EUROSPEECH, September 2001.

K. Livescu and J. Glass, "Lexical modeling of non-native speech for automatic speech recognition," in Proc. ICASSP, June 2000.


Technical reports and theses:

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, and B. Woods, "Articulatory Feature-based Methods for Acoustic and Audio-Visual Speech Recognition: 2006 JHU Summer Workshop Final Report." Center for Language and Speech Processing, Johns Hopkins University.

K. Livescu, "Feature-Based Pronunciation Modeling for Automatic Speech Recognition." Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science, September 2005.

M. Hasegawa-Johnson, J. Baker, S. Greenberg, K. Kirchhoff, J. Muller, K. Sonmez, S. Borys, K. Chen, A. Juneja, K. Livescu, S. Mohan, E. Coogan, and T. Wang, "Landmark-based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop," Johns Hopkins University 2004 Summer Workshop final report.

J. Bilmes, G. Zweig, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Discriminatively Structured Graphical Models for Speech Recognition." Johns Hopkins University 2001 Summer Workshop final report.

K. Livescu, "Analysis and Modeling of Non-Native Speech for Automatic Speech Recognition." S.M. Thesis, MIT Department of Electrical Engineering and Computer Science, August 1999.

K. Livescu, "Analysis of Human and Parrot Phonation Using an Energy Operator and Energy Separation Algorithm." A.B. Thesis, Princeton Department of Physics, April 1996.


Other presentations, abstracts:

"Phonological models in automatic speech recognition," invited talk at ACL SIGMORPHON workshop, 2008.

       Note that references in this talk are not exhaustive.

       Link to X-ray video in talk.

K. Livescu, X. Chi, L. Lavoie, A. Bezman, N. Borges, and L. Yung, "A study of manual articulatory feature-based transcription of conversational speech" (abstract and poster), Acoustical Society of America meeting, Nov.-Dec. 2006.

K. Livescu and J. Glass, "Feature-based pronunciation modeling for automatic speech recognition" (abstract and poster), presented at "From Sound to Sense: 50+ Years of Discoveries in Speech Communication," June 2004.



Some neat speech links:

Listen to the sounds of the IPA chart

Why is it hard to understand the lyrics in high soprano singing? (It is not because they are singing in Middle High German)

An interactive vocal tract demo

A formant synthesis demo


Personal