Face image distribution in training and test sets of images
Top: images of an individual's face included in a set available for training. Bottom: images of an individual's face in a set presented for testing. We know that the images all belong to the same individual; the idea is to match distributions rather than individual images, or a single image to a set.

In most paradigms of face recognition it is assumed that while a set of training images is available for each individual in the database, the input (test data) consists of a single shot. However, in many scenarios the recognition system has access to a set of face images of the person to be recognized. We want to exploit this additional data to improve recognition accuracy.

This can be viewed in the framework of a general problem in classification: if you have sets of observations from each class, and a set of observations from an unknown class, what is the best way to label this new set? In other words, how does one compare sets of observations?

We propose to address this problem from a probabilistic point of view. Assuming that the observations in each class $c$ are drawn from some density $p_c$, and that the observations $x^0_1, \dots, x^0_N$ in the set to be classified are drawn from $p_0$, the classification objective can be framed as hypothesis testing: we want to select one of the hypotheses of the form $H_i: p_0 = p_i$. It is known that the optimal test for this task is the likelihood ratio test; we should select the class $c$ for which the product $p_c(x^0_1) \cdots p_c(x^0_N)$ is largest. However, things are complicated by the fact that we don't know the class-conditional densities $p_i$, and can only estimate them.
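
As an illustration of the selection rule above, here is a minimal Python sketch. It assumes, purely for the example, that each class density is a one-dimensional Gaussian with known parameters, standing in for an estimated density. The names `gaussian_logpdf` and `classify_set` and all numeric values are hypothetical. The sketch sums log-densities instead of multiplying densities, which selects the same class but avoids numerical underflow for large sets.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1-D Gaussian (a stand-in for an estimated class density)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def classify_set(observations, class_params):
    """Pick the class under which the whole observed set is most likely.

    class_params maps a class label to the (mean, std) of its density.
    Maximizing the sum of log-densities is equivalent to maximizing the
    product of densities over the set.
    """
    scores = {
        label: sum(gaussian_logpdf(x, mean, std) for x in observations)
        for label, (mean, std) in class_params.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical class densities and a hypothetical set to be classified.
classes = {"A": (0.0, 1.0), "B": (3.0, 1.0)}
test_set = [2.6, 3.4, 2.9, 3.1]
best, scores = classify_set(test_set, classes)
```

In practice the densities are not known and must be estimated from the training sets, which is the complication the rest of this description addresses.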

Under these conditions, we propose to use an approximation to the expected log-likelihood ratio, the Kullback-Leibler (KL) divergence between the density estimates, as the measure of similarity between the sets in question. The way to calculate the KL divergence depends on the density estimation model. In our previous work, we proposed to use a particularly simple estimate: a single Gaussian of low intrinsic dimension (a technique known as Probabilistic PCA). In this case the KL divergence can be computed in closed form.
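
For reference, the standard closed-form KL divergence between two $d$-dimensional Gaussians is given below. The symbols $\mu_0, \Sigma_0$ (mean and covariance of the first density) and $\mu_1, \Sigma_1$ (those of the second) are notation assumed here for illustration; they do not appear in the original description.

```latex
D_{\mathrm{KL}}\bigl(\mathcal{N}(\mu_0, \Sigma_0) \,\|\, \mathcal{N}(\mu_1, \Sigma_1)\bigr)
= \frac{1}{2}\Bigl[
    \operatorname{tr}\bigl(\Sigma_1^{-1}\Sigma_0\bigr)
    + (\mu_1 - \mu_0)^{\top}\Sigma_1^{-1}(\mu_1 - \mu_0)
    - d
    + \ln\frac{\det\Sigma_1}{\det\Sigma_0}
\Bigr]
```

In the usual Probabilistic PCA model each covariance has the low-rank-plus-isotropic form $\Sigma = W W^{\top} + \sigma^2 I$, so the inverse and determinant above can be evaluated cheaply.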

In our current work, we explore more expressive density estimation models, such as mixtures of Gaussians. Ultimately we would like to try our approach with non-parametric density estimates. With these models the KL divergence can no longer be computed in closed form, but it can be efficiently estimated using a Monte Carlo method (essentially, by drawing samples from one of the two distributions, evaluating the log-likelihood ratio at each sample, and averaging over a number of trials).
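
The Monte Carlo estimate described above can be sketched as follows. This is a minimal illustration, not the method used in the publications listed below: it assumes one-dimensional Gaussian mixtures represented as lists of `(weight, mean, std)` tuples, and the function names, sample count, and seed are hypothetical choices.

```python
import math
import random


def mixture_logpdf(x, mixture):
    """Log-density of a 1-D Gaussian mixture given as [(weight, mean, std), ...]."""
    density = sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in mixture
    )
    return math.log(density)


def sample_mixture(mixture, rng):
    """Draw one sample: choose a component by its weight, then sample from it."""
    weights = [w for w, _, _ in mixture]
    _, mean, std = rng.choices(mixture, weights=weights, k=1)[0]
    return rng.gauss(mean, std)


def monte_carlo_kl(p, q, n_samples=20000, seed=0):
    """Estimate KL(p || q) as the average of log p(x) - log q(x) for x drawn from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = sample_mixture(p, rng)
        total += mixture_logpdf(x, p) - mixture_logpdf(x, q)
    return total / n_samples


# Hypothetical two-component mixtures standing in for two estimated densities.
p = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]
q = [(0.5, -1.0, 1.0), (0.5, 3.0, 1.0)]
kl_estimate = monte_carlo_kl(p, q)
```

The estimate is noisy, with an error that shrinks roughly as the inverse square root of the number of samples. A useful sanity check is to compare it against the closed-form value when both densities are single Gaussians.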

G. Shakhnarovich, J. Fisher, T. Darrell. Face recognition from long-term observations -- presented at ECCV 2002. Introduced our approach based on KL-divergence between face image densities.
O. Arandjelovic, G. Shakhnarovich, J. Fisher, R. Cipolla, T. Darrell. Face Recognition with Image Sets Using Manifold Density Divergence -- a semi-parametric method for computing KL divergences between densities modeled with Gaussian Mixture Models. (This work was done in collaboration with our colleagues from the University of Cambridge.)