Karen Livescu

Assistant Professor
Toyota Technological Institute at Chicago

Assistant Professor (part time)
University of Chicago Department of Computer Science

email: klivescu at ttic.edu

My main research interests are in speech and language processing, with a slant toward combining machine learning techniques with knowledge from linguistics and speech science.

I am an Assistant Professor at TTI-Chicago, a philanthropically endowed academic computer science institute located on the University of Chicago campus. We are recruiting students to our PhD program and intern program, as well as additional faculty, including in speech and language-related areas (more on Speech and Language at TTIC).

I completed my PhD in 2005 at MIT in the Spoken Language Systems group of the Computer Science and Artificial Intelligence Laboratory. In 2005-2007 I was a post-doctoral lecturer in the MIT EECS department. In Feb.-Aug. 2008 I was a Research Assistant Professor at TTI-Chicago.





News:
CSL Special Issue on Speech Production in Speech Technologies (submission deadline June 1, 2014)
Undergraduate research opportunities

Old news:
Workshop on Speech Production in Automatic Speech Recognition, August 30, 2013
Midwest Speech and Language Days, May 2-3, 2013
Symposium on Machine Learning in Speech and Language Processing 2012


Teaching:

Winter 2014 ... TTIC 31110 (CMSC 35900): Speech Technologies (TTIC and U. Chicago)
Spring 2013 ... TTIC 31090: Signals, Systems, and Random Processes (TTIC and U. Chicago)
Spring 2012 ... TTIC 31110: Speech Technologies (TTIC)
Spring 2011 ... TTIC 31090: Signals, Systems, and Random Processes (TTIC and U. Chicago)
Winter 2011 ... 20114231: Introduction to Speech Recognition (Weizmann Institute)
Autumn 2009 ... CMSC 35900: Topics in Artificial Intelligence: Speech Technologies (TTIC and U. Chicago)
Autumn 2007, Autumn & Spring 2006, Autumn 2005 ... 6.003: Signals and Systems (MIT)
Spring 2007 ... 6.345: Automatic Speech Recognition (MIT)



Grad students and post-docs:

Weiran Wang (TTIC post-doc, 2014-)
Taehwan Kim (TTIC PhD, 2009-)
Arild Brandrud Næss (NTNU PhD, co-advised with Torbjørn Svendsen)
Bahador Nooraei (TTIC PhD, 2012-)
Hao Tang (TTIC PhD, 2010-)

* Past grad students/post-docs:

Raman Arora (post-doc 2011-2013)
Louis Terry (Northwestern CSE PhD 2011, co-advised with Aggelos Katsaggelos)
John Labiak (U. Chicago Statistics MS 2010, co-advised with Yali Amit and Partha Niyogi)
Bo Zhu (MIT EECS MEng 2006-2007, co-advised with Jim Glass)
Heejin Kim (UIUC post-doc, Jan. - May 2010, co-supervised with Mark Hasegawa-Johnson)

* Past interns/visiting students/undergrad research assistants:

Hadas Benisty (Technion EE, intern summer 2012)
Sujeeth Bharadwaj (UIUC ECE, intern summer 2011)
Sam Bowman (U. Chicago Linguistics BA 2011)
Soham De (Jadavpur University CSE, intern summer 2012)
Victoria Evelkin (Technion EE, intern summer 2012)
Matt Faytak (U. Chicago Linguistics BA)
Katie Henry (U. Chicago Computer Science BA)
Preethi Jyothi (Ohio State CSE PhD, visiting summer 2010)
Gabrielle Knight (Northwestern Integrated Sciences BS 2011, intern summer 2010)
Anna Margolis (intern/visiting student 2009-2011, U. Washington CS PhD)
Katie Mock (U. Chicago Linguistics BA)
Mindi Porebsky (UIUC Linguistics BA 2011)
Rohit Prabhavalkar (Ohio State CSE PhD, visiting summer 2010)
Mark Stoehr (U. Chicago Math BS, intern and undergrad research assistant 2009-2010)



Publications:

R. Arora and K. Livescu
"Multi-view learning with supervision for transformed bottleneck features"
ICASSP 2014 (to appear).

K. Levin, K. Henry, A. Jansen, and K. Livescu
"Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings"
ASRU 2013. (Best Student Paper 2nd place)

T. Kim, G. Shakhnarovich, and K. Livescu
"Fingerspelling recognition with semi-Markov conditional random fields"
ICCV 2013.

P. Jyothi, E. Fosler-Lussier, and K. Livescu
"Discriminative training of WFST factors with application to pronunciation modeling"
Interspeech 2013.

G. Andrew, R. Arora, J. Bilmes, and K. Livescu
"Deep canonical correlation analysis"
ICML 2013.

R. Prabhavalkar, K. Livescu, E. Fosler-Lussier, and J. Keshet
"Discriminative articulatory models for spoken term detection in low-resource conversational settings"
ICASSP 2013.

R. Arora and K. Livescu
"Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains"
ICASSP 2013.

T. Kim, K. Livescu, and G. Shakhnarovich
"American Sign Language fingerspelling recognition with phonological feature-based tandem models"
SLT 2012.

K. Livescu, E. Fosler-Lussier, and F. Metze
"Subword modeling for automatic speech recognition: Past, present, and emerging approaches"
(preprint -- differs slightly from published version)
Signal Processing Magazine 29(6):44-57, November 2012.

R. Arora, A. Cotter, K. Livescu, and N. Srebro
"Stochastic optimization for PCA and PLS"
50th Annual Allerton Conference on Communication, Control, and Computing, 2012.

R. Arora and K. Livescu
"Kernel CCA for multi-view learning of acoustic features using articulatory measurements"
Symposium on Machine Learning in Speech and Language Processing (MLSLP) 2012.

R. Prabhavalkar, J. Keshet, K. Livescu, and E. Fosler-Lussier
"Discriminative spoken term detection with limited data"
Symposium on Machine Learning in Speech and Language Processing (MLSLP) 2012.

P. Jyothi, E. Fosler-Lussier, and K. Livescu
"Discriminatively learning factorized finite state pronunciation models from dynamic Bayesian networks"
Interspeech 2012. (Best Student Paper award)

H. Tang, J. Keshet, and K. Livescu
"Discriminative pronunciation modeling: A large-margin, feature-rich approach"
ACL 2012.

S. Bharadwaj, R. Arora, K. Livescu, and M. Hasegawa-Johnson
"Multi-view acoustic feature learning using articulatory measurements"
IEEE International Workshop on Statistical Machine Learning for Speech Processing (IWSML) 2012.

R. Prabhavalkar, E. Fosler-Lussier, and K. Livescu
"A factored conditional random field model for articulatory feature forced transcription"
ASRU 2011.

J. Labiak and K. Livescu
"Nearest neighbors with learned distances for phonetic frame classification"
Interspeech 2011.

A. B. Næss, K. Livescu, and R. Prabhavalkar
"Articulatory feature classification using nearest neighbors"
Interspeech 2011.

P. Jyothi, K. Livescu, and E. Fosler-Lussier
"Lexical access experiments with context-dependent articulatory feature-based models"
ICASSP 2011.


S. Bowman and K. Livescu
"Modeling pronunciation variation with context-dependent articulatory feature decision trees"
Interspeech 2010.


L. Terry, K. Livescu, J. Pierrehumbert, and A. Katsaggelos
"Audio-visual anticipatory coarticulation modeling by human and machine"
Interspeech 2010.


A. Margolis, K. Livescu, and M. Ostendorf
"Semi-supervised domain adaptation for automatic dialog act tagging"
DANLP 2010.


A. Margolis, M. Ostendorf, and K. Livescu
"Cross-genre training for automatic prosody classification"
Speech Prosody 2010.


K. Livescu and M. Stoehr
"Multi-view learning of acoustic features for speaker recognition"
ASRU 2009.


K. Saenko, K. Livescu, J. Glass, and T. Darrell
"Multistream articulatory feature-based models for visual speech recognition"
IEEE Trans. Pattern Analysis and Machine Intelligence 31(9):1700-1707, September 2009.


K. Chaudhuri, S. Kakade, K. Livescu, and K. Sridharan
"Multi-view clustering via canonical correlation analysis"
ICML 2009.


K. Livescu, B. Zhu, and J. Glass
"On the phonetic information in ultrasonic microphone signals"
ICASSP 2009.


O. Cetin, M. Magimai-Doss, K. Livescu, A. Kantor, S. King, C. Bartels, and J. Frankel
"Monolingual and crosslingual comparison of tandem features derived from articulatory and phone MLPs"
ASRU 2007.


J. Frankel, M. Magimai-Doss, S. King, K. Livescu, and O. Cetin
"Articulatory feature classifiers trained on 2000 hours of telephone speech"
Interspeech 2007.


M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko
"Audiovisual speech recognition with articulator positions as hidden variables"
ICPhS 2007.


K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko
"Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop"
ICASSP 2007.


O. Cetin, A. Kantor, S. King, C. Bartels, M. Magimai-Doss, J. Frankel, and K. Livescu
"An articulatory feature-based tandem approach and factored observation modeling"
ICASSP 2007.


K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie
"Manual transcription of conversational speech at the articulatory feature level"
ICASSP 2007.


S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester
"Speech production knowledge in automatic speech recognition"
Journal of the Acoustical Society of America 121(2):723-742, February 2007.


K. Saenko and K. Livescu
"An asynchronous DBN for audio-visual speech recognition"
SLT 2006.


K. Saenko, K. Livescu, M. Siracusa, K. Wilson, J. Glass, and T. Darrell
"Visual speech recognition with loosely synchronized feature streams"
ICCV 2005.


T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu
"Pronunciation modeling using a finite-state transducer representation"
Speech Communication 46(2):189-203, June 2005.


M. Hasegawa-Johnson, J. Baker, S. Borys, K. Chen, E. Coogan, S. Greenberg, A. Juneja, K. Kirchhoff, K. Livescu, K. Sonmez, S. Mohan, J. Muller, and T. Wang
"Landmark-based speech recognition: Report of the 2004 Johns Hopkins Summer Workshop"
ICASSP 2005.


K. Saenko, K. Livescu, J. Glass, and T. Darrell
"Production domain modeling of pronunciation for visual speech recognition"
ICASSP 2005.

K. Livescu and J. Glass
"Feature-based pronunciation modeling with trainable asynchrony probabilities"
ICSLP 2004.


K. Livescu and J. Glass
"Feature-based pronunciation modeling for speech recognition"
HLT/NAACL 2004.


K. Livescu, J. Glass, and J. Bilmes
"Hidden feature modeling for speech recognition using dynamic Bayesian networks"
Eurospeech 2003.


T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu
"Pronunciation modeling using a finite-state transducer representation"
ISCA Tutorial and Research Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language (PMLA) 2002.


G. Zweig, J. Bilmes, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne
"Structurally discriminative graphical models for automatic speech recognition -- results from the 2001 Johns Hopkins Summer Workshop"
ICASSP 2002.


K. Livescu and J. Glass
"Segment-based recognition on the PhoneBook task: Initial results and observations on duration modeling"
Eurospeech 2001.


K. Livescu and J. Glass
"Lexical modeling of non-native speech for automatic speech recognition"
ICASSP 2000.



Technical reports, theses:

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, and B. Woods, "Articulatory Feature-based Methods for Acoustic and Audio-Visual Speech Recognition: 2006 JHU Summer Workshop Final Report." Center for Language and Speech Processing, Johns Hopkins University.

K. Livescu, "Feature-Based Pronunciation Modeling for Automatic Speech Recognition." Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science, September 2005.

M. Hasegawa-Johnson, J. Baker, S. Greenberg, K. Kirchhoff, J. Muller, K. Sonmez, S. Borys, K. Chen, A. Juneja, K. Livescu, S. Mohan, E. Coogan, and T. Wang, "Landmark-based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop," Johns Hopkins University 2004 Summer Workshop final report.

J. Bilmes, G. Zweig, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, and B. Byrne, "Discriminatively Structured Graphical Models for Speech Recognition." Johns Hopkins University 2001 Summer Workshop final report.

K. Livescu, "Analysis and Modeling of Non-Native Speech for Automatic Speech Recognition." S.M. Thesis, MIT Department of Electrical Engineering and Computer Science, August 1999.

K. Livescu, "Analysis of Human and Parrot Phonation Using an Energy Operator and Energy Separation Algorithm." A.B. Thesis, Princeton Department of Physics, April 1996.


2006 JHU Summer Workshop:

If you are looking for information on, or data resulting from, the 2006 JHU summer workshop project on articulatory feature-based methods in speech recognition, it can be found here.



Some neat speech links:

Listen to the sounds of the IPA chart

Why is it hard to understand the lyrics in high soprano singing? (It is not because they are singing in Middle High German)

An interactive vocal tract demo

A formant synthesis demo


