Workshop on Machine Learning in Speech and Language Processing

August 11, 2017
Sydney, Australia
Speaker: Tasha Nagamine (Columbia)

Title: Feature Representation and Transformation in Multilayer Perceptron Acoustic Models

While deep learning has shown great success in recent years, how the nodes in different layers of a neural network represent the input and contribute to the network's function remains poorly understood. We present a joint empirical framework to study the encoding properties of node activations in the hidden and output layers of a network, and to construct the equivalent linear transformation applied to each data point. These methods are used to discern and quantify the properties of feed-forward neural networks trained to map acoustic features to phoneme labels. We show a selective, progressively nonlinear warping of the feature space in which the most discriminant dimensions of the input samples are emphasized. Analyzing the sample-dependent linear transforms applied to each data sample shows that categorization is achieved by forming prototypical templates that explicitly model all the variations of each class. This work offers a comprehensive analysis of the representations and computations of neural networks, and provides an intuitive account of how deep neural networks perform classification.
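The "equivalent linear transformation applied to each data point" has a concrete form for ReLU networks: the network is piecewise linear, so each input selects an activation pattern, and within that region the network acts as a single matrix-vector map A·x + b. The sketch below (a minimal illustration assuming a plain ReLU multilayer perceptron with a linear output layer; the talk's actual architecture and procedure may differ) extracts that sample-dependent transform:

```python
import numpy as np

def forward(weights, biases, x):
    """Standard forward pass: ReLU on hidden layers, linear output layer."""
    a = x
    for W, c in zip(weights[:-1], biases[:-1]):
        a = np.maximum(W @ a + c, 0.0)
    return weights[-1] @ a + biases[-1]

def effective_linear_map(weights, biases, x):
    """Return (A, b) such that the network output for this particular x
    equals A @ x + b. The ReLU gating pattern induced by x determines
    which rows of each weight matrix survive, so A and b are
    sample-dependent."""
    A = np.eye(x.size)       # running linear map, shape (current_dim, input_dim)
    b = np.zeros(x.size)     # running offset in the current layer's space
    a = x
    for W, c in zip(weights[:-1], biases[:-1]):
        pre = W @ a + c
        D = np.diag((pre > 0).astype(float))  # 0/1 mask from ReLU gating
        A = D @ W @ A
        b = D @ (W @ b + c)
        a = np.maximum(pre, 0.0)
    W, c = weights[-1], biases[-1]            # linear output layer: no mask
    return W @ A, W @ b + c

# Usage: on a small random network, the per-sample linear map reproduces
# the nonlinear forward pass exactly at that point.
rng = np.random.default_rng(0)
dims = [8, 16, 16, 4]  # hypothetical layer sizes for illustration
weights = [rng.standard_normal((dims[i + 1], dims[i])) for i in range(len(dims) - 1)]
biases = [rng.standard_normal(dims[i + 1]) for i in range(len(dims) - 1)]
x = rng.standard_normal(dims[0])
A, b = effective_linear_map(weights, biases, x)
assert np.allclose(A @ x + b, forward(weights, biases, x))
```

Inspecting A across samples of the same phoneme class is one way to see how the network warps the feature space differently for different inputs.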