TTIC
Toyota Technological Institute at Chicago  

Hal Daumé III

University of Southern California

Flexible Machine Learning for Hard Language Problems

March 23, 2006 10:00am

Abstract:

Natural language processing abounds with hard prediction problems for which complex outputs are sought. In machine translation, one aims to produce a coherent translated sentence; in document summarization, an entire short document is required. Effective models for these and other problems rely heavily on approximate search methods in order to find the best possible output. Unfortunately, the fact that search is used in the final output production is rarely taken into account when machine learning methods are conceived and employed. This leads to complex algorithms with few theoretical guarantees about performance on unseen test data.

I present a machine learning approach that directly solves "structured prediction" problems by considering formal techniques that reduce structured prediction to simple binary classification, within the context of search. This reduction is error-limiting: it provides theoretical guarantees about the performance of the structured prediction model on unseen test data. It also lends itself to novel training methods for structured prediction models, yielding efficient learning algorithms that perform well in practice. I empirically evaluate this approach on two tasks: entity detection and tracking, and automatic document summarization.

If you have questions, or would like to meet the speaker, please contact Ponda at 4-1994 or pondabarnes@tti-c.org. For information on future TTI-C talks or events, please go to the TTI-C Events page.


