Workshop on Machine Learning in Speech and Language Processing

September 13, 2016
San Francisco, CA, USA
Speaker: Fei Sha (UCLA)

Title: Being Shallow and Random is (almost) as Good as Being Deep and Thoughtful

Abstract:
Learning the right representation of data is a crucial component of applying machine learning techniques. In this talk, I will describe two extremes at opposite ends of the spectrum. At one end, we study deep neural networks, which learn hierarchical representations through layered architectures. At the other end, we study kernel methods, which can be seen as shallow but wide networks. While theoretically appealing, kernel methods are empirically challenging, and their results have been lackluster since the arrival of deep learning architectures. Deep architectures, in contrast, are empirically very successful, although theoretical understanding of them has been lacking.

I will describe our efforts and findings in investigating these two paradigms on real-world applications such as automatic speech recognition. Specifically, our extensive empirical studies, enabled by large-scale computing, highlight the similarities and differences between the two. In particular, they suggest that shallow, random representations are almost as powerful as deep, learned ones.
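The abstract does not spell out what "shallow and random" means concretely; a standard instance of this idea is random Fourier features, which approximate a kernel machine with a single wide layer of random, untrained projections followed by a linear model. The sketch below (my illustration, not necessarily the speaker's exact method; the function name and parameters are hypothetical) shows such a random feature map whose inner products approximate the RBF kernel:

```python
import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=0):
    """Map X to a random feature space whose inner products approximate
    the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).

    This is one shallow-and-random layer: the weights W and offsets b
    are sampled once and never trained.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Projection directions sampled from the kernel's Fourier transform:
    # for the RBF kernel above, w ~ N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Toy check: the random-feature inner products track the exact kernel.
X = np.random.default_rng(1).normal(size=(5, 10))
Z = random_fourier_features(X, n_features=10000, gamma=0.5)
approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
exact = np.exp(-0.5 * sq_dists)
print(np.abs(approx - exact).max())  # small approximation error
```

A linear classifier trained on such features behaves like a kernel machine, which is what makes head-to-head comparisons with deep networks on tasks like speech recognition feasible at scale.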

I will conclude by relating this line of work to related efforts in our lab and in other research groups, and by reflecting on possible directions for future research.