Research

My research interests are broadly in Machine Learning, Statistical Learning Theory, Big Data, Matrix Factorization, Online Learning, and Algorithms. I am particularly interested in theoretically rigorous and practically scalable models for real world applications in Healthcare, Energy, and Education. Here's a glimpse of some of my recent research projects:

Multiresolution Matrix Factorization

The interplay between the geometry of a space X and the structure of function spaces on X is a classical theme in harmonic analysis. As an instance of this connection, in this work, we developed the matrix factorization analog of multiresolution analysis on finite sets. The resulting factorizations, on the one hand, provide a natural way to define multiresolution on graphs, correlated sets of random variables, and so on. On the other hand, they lead to new classes of structured matrices and new matrix compression algorithms.

Adaptivity in Kernel Regression

We present the first result for kernel regression where the procedure adapts locally at a point x to both the unknown local dimension of the metric space X and the unknown Holder-continuity of the regression function at x. The result holds with high probability simultaneously at all points x in a general metric space X of unknown structure. An important implication of this work is that under mild assumptions on the data, instead of having to resort to dimensionality reduction techniques, we can work in the original space and still achieve nearly optimal rates of convergence.

MDPs with Dynamic Temporal Uncertainty

To compensate for the increased variability in supply and demand (using renewable sources), we need algorithms for online energy resource allocation under temporal uncertainty of future consumption and availability. Predictive information is revealed incrementally in an online manner, leading to what we call dynamic temporal uncertainty. We demonstrate the non-triviality of this problem and provide nearly optimal online algorithms, both randomized and deterministic, to handle time varying uncertainty in future rewards for non-stationary MDPs in general and for energy resource allocation in particular. We also derive theoretical upper and lower bounds that hold even for a finite horizon, and es- tablish that, in the deterministic case, discounting future rewards can be used as a strategy to maximize the total (undiscounted) reward.

Biobjective Game Theoretic Clustering

We propose a new approach to clustering. Our idea is to map cluster formation to coalition formation in cooperative games, and to use the Shapley value of the patterns to identify clusters and cluster representatives. We show that the underlying game is convex and this leads to an efficient biobjective clustering algorithm that we call BiGC. The algorithm yields high-quality clustering with respect to average point-to-center distance (potential) as well as average intracluster point-to-point distance (scatter). We demonstrate the superiority of BiGC over state-of-the-art clustering algorithms (including the center based and the multiobjective techniques) through a detailed experimentation using standard cluster validity criteria on several benchmark data sets. We also show that BiGC satisfies key clustering properties such as order independence, scale invariance, and richness.