Karen Livescu

Assistant Professor
Toyota Technological Institute at Chicago

Assistant Professor (part time)
University of Chicago Department of Computer Science

email: klivescu at ttic.edu

My main research interests are in speech and language processing, with a slant toward combining statistical modeling techniques with knowledge from linguistics and speech science.

I am an Assistant Professor at TTI-Chicago, a philanthropically endowed academic computer science institute located on the University of Chicago campus. We are recruiting students to our PhD program and intern program, as well as additional faculty, including in speech- and language-related areas.

I completed my PhD in 2005 at MIT in the Spoken Language Systems group of the Computer Science and Artificial Intelligence Laboratory. In 2005-2007 I was a post-doctoral lecturer in the MIT EECS department. In the summer of 2006, I led a team project at the Johns Hopkins summer workshop on speech and language engineering, on the topic of articulatory feature-based methods for speech recognition. In Feb.-Aug. 2008 I was a Research Assistant Professor at TTI-Chicago.





News:

Undergraduate research opportunities

Old news:

Joint ACL/ISCA/ICML Symposium on Learning in Speech and Language Processing
Post-doc opportunity
Illinois Speech Day, May 10, 2010
Illinois Speech Day, May 5, 2009



Teaching:

Spring 2012 ... TTIC 31110: Speech Technologies (TTIC)
Spring 2011 ... TTIC 31090: Signals, Systems, and Random Processes (TTIC and U. Chicago)
Winter 2011 ... 20114231: Introduction to Speech Recognition (Weizmann Institute)
Autumn 2009 ... CMSC 35900: Topics in Artificial Intelligence: Speech Technologies (TTIC and U. Chicago)
Autumn 2007, Autumn & Spring 2006, Autumn 2005 ... 6.003: Signals and Systems (MIT)
Spring 2007 ... 6.345: Automatic Speech Recognition (MIT)



Students and post-docs:

Raman Arora (TTIC post-doc)
Hao Tang (TTIC PhD, 2010-)
Arild Brandrud Næss (NTNU PhD, co-advised with Torbjørn Svendsen)

* Past students/post-docs:

Louis Terry (Northwestern CSE PhD 2011, co-advised with Aggelos Katsaggelos)
John Labiak (U. Chicago Statistics MS 2010, co-advised with Yali Amit and Partha Niyogi)
Bo Zhu (MIT EECS MEng 2006-2007, co-advised with Jim Glass)
Heejin Kim (UIUC post-doc, Jan. - May 2010, co-supervised with Mark Hasegawa-Johnson)

* Past interns/visiting students/undergrad research assistants:

Sujeeth Bharadwaj (UIUC ECE, intern summer 2011)
Sam Bowman (U. Chicago Linguistics BA 2011)
Matt Faytak (U. Chicago Linguistics BA)
Preethi Jyothi (Ohio State CSE PhD, visiting summer 2010)
Gabrielle Knight (Northwestern Integrated Sciences BS 2011, intern summer 2010)
Anna Margolis (intern/visiting student 2009-2011, U. Washington CS PhD)
Katie Mock (U. Chicago Linguistics BA)
Mindi Porebsky (UIUC Linguistics BA 2011)
Rohit Prabhavalkar (Ohio State CSE PhD, visiting summer 2010)
Mark Stoehr (U. Chicago Math BS, intern and undergrad research assistant 2009-2010)



Refereed publications:

R. Prabhavalkar, E. Fosler-Lussier, and K. Livescu, "A factored conditional random field model for articulatory feature forced transcription", ASRU, December 2011.

J. Labiak and K. Livescu, "Nearest neighbors with learned distances for phonetic frame classification", Interspeech, August 2011.

A. B. Næss, K. Livescu, and R. Prabhavalkar, "Articulatory feature classification using nearest neighbors", Interspeech, August 2011.

P. Jyothi, K. Livescu, and E. Fosler-Lussier, "Lexical access experiments with context-dependent articulatory feature-based models", ICASSP, May 2011.

S. Bowman and K. Livescu, "Modeling pronunciation variation with context-dependent articulatory feature decision trees", Interspeech, September 2010.

L. Terry, K. Livescu, J. Pierrehumbert, and A. Katsaggelos, "Audio-visual anticipatory coarticulation modeling by human and machine", Interspeech, September 2010.

A. Margolis, K. Livescu, and M. Ostendorf, "Semi-supervised domain adaptation for automatic dialog act tagging", DANLP, July 2010.

A. Margolis, M. Ostendorf, and K. Livescu, "Cross-genre training for automatic prosody classification", Speech Prosody, May 2010.

K. Livescu and M. Stoehr, "Multi-view learning of acoustic features for speaker recognition", ASRU, December 2009.

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Multistream articulatory feature-based models for visual speech recognition," IEEE Trans. Pattern Analysis and Machine Intelligence 31(9):1700-1707, September 2009.

K. Chaudhuri, S. Kakade, K. Livescu, and K. Sridharan, "Multi-view clustering via canonical correlation analysis", ICML, June 2009.

K. Livescu, B. Zhu, and J. Glass, "On the phonetic information in ultrasonic microphone signals", ICASSP, April 2009.

O. Cetin, M. Magimai-Doss, K. Livescu, A. Kantor, S. King, C. Bartels, and J. Frankel, "Monolingual and crosslingual comparison of tandem features derived from articulatory and phone MLPs," in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), December 2007.

J. Frankel, M. Magimai-Doss, S. King, K. Livescu, and O. Cetin, "Articulatory feature classifiers trained on 2000 hours of telephone speech," in Proc. Interspeech, August 2007.

M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko, "Audiovisual speech recognition with articulator positions as hidden variables," in Proc. ICPhS, August 2007.

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko, "Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop," in Proc. ICASSP, April 2007.

O. Cetin, A. Kantor, S. King, C. Bartels, M. Magimai-Doss, J. Frankel, and K. Livescu, "An articulatory feature-based tandem approach and factored observation modeling," in Proc. ICASSP, April 2007.

K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie, "Manual transcription of conversational speech at the articulatory feature level," in Proc. ICASSP, April 2007.

S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester, "Speech production knowledge in automatic speech recognition," Journal of the Acoustical Society of America 121(2):723-742, February 2007.

K. Saenko and K. Livescu, "An Asynchronous DBN for Audio-Visual Speech Recognition," in Proc. IEEE Workshop on Spoken Language Technologies (SLT), December 2006.

K. Saenko, K. Livescu, M. Siracusa, K. Wilson, J. Glass, and T. Darrell, "Visual speech recognition with loosely synchronized feature streams," in Proc. ICCV, October 2005.

T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," Speech Communication 46(2):189-203, June 2005.

M. Hasegawa-Johnson, J. Baker, S. Borys, K. Chen, E. Coogan, S. Greenberg, A. Juneja, K. Kirchhoff, K. Livescu, K. Sonmez, S. Mohan, J. Muller, and T. Wang, "Landmark-based speech recognition: Report of the 2004 Johns Hopkins Summer Workshop," in Proc. ICASSP, March 2005.

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Production domain modeling of pronunciation for visual speech recognition," in Proc. ICASSP, March 2005.

K. Livescu and J. Glass, "Feature-based pronunciation modeling with trainable asynchrony probabilities," in Proc. ICSLP, October 2004.

K. Livescu and J. Glass, "Feature-based pronunciation modeling for speech recognition," in Proc. HLT/NAACL, May 2004.

K. Livescu, J. Glass, and J. Bilmes, "Hidden feature modeling for speech recognition using dynamic Bayesian networks," in Proc. EUROSPEECH, August-September 2003.

T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," in Proc. ISCA Tutorial and Research Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language (PMLA), September 2002.

G. Zweig, J. Bilmes, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Structurally discriminative graphical models for automatic speech recognition -- results from the 2001 Johns Hopkins Summer Workshop," in Proc. ICASSP, May 2002.

K. Livescu and J. Glass, "Segment-based recognition on the PhoneBook task: Initial results and observations on duration modeling," in Proc. EUROSPEECH, September 2001.

K. Livescu and J. Glass, "Lexical modeling of non-native speech for automatic speech recognition," in Proc. ICASSP, June 2000.


Technical reports and theses:

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, and B. Woods, "Articulatory Feature-based Methods for Acoustic and Audio-Visual Speech Recognition: 2006 JHU Summer Workshop Final Report." Center for Language and Speech Processing, Johns Hopkins University.

K. Livescu, "Feature-Based Pronunciation Modeling for Automatic Speech Recognition." Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science, September 2005.

M. Hasegawa-Johnson, J. Baker, S. Greenberg, K. Kirchhoff, J. Muller, K. Sonmez, S. Borys, K. Chen, A. Juneja, K. Livescu, S. Mohan, E. Coogan, and T. Wang, "Landmark-based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop," Johns Hopkins University 2004 Summer Workshop final report.

J. Bilmes, G. Zweig, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Discriminatively Structured Graphical Models for Speech Recognition." Johns Hopkins University 2001 Summer Workshop final report.

K. Livescu, "Analysis and Modeling of Non-Native Speech for Automatic Speech Recognition." S.M. Thesis, MIT Department of Electrical Engineering and Computer Science, August 1999.

K. Livescu, "Analysis of Human and Parrot Phonation Using an Energy Operator and Energy Separation Algorithm." A.B. Thesis, Princeton Department of Physics, April 1996.


Other presentations, abstracts:

"Phonological models in automatic speech recognition," invited talk at ACL SIGMORPHON workshop, 2008.

       Note that references in this talk are not exhaustive.

       Link to X-ray video in talk.

K. Livescu, X. Chi, L. Lavoie, A. Bezman, N. Borges, and L. Yung, "A study of manual articulatory feature-based transcription of conversational speech" (abstract and poster), Acoustical Society of America meeting, Nov.-Dec. 2006.

K. Livescu and J. Glass, "Feature-based pronunciation modeling for automatic speech recognition" (abstract and poster), presented at "From Sound to Sense: 50+ Years of Discoveries in Speech Communication", June 2004.



Some neat speech links:

Listen to the sounds of the IPA chart

Why is it hard to understand the lyrics in high soprano singing? (It is not because they are singing in Middle High German)

An interactive vocal tract demo

A formant synthesis demo


Personal