I am now an assistant professor of Computer Science at the
Toyota Technological Institute.
My main research areas are machine learning and randomized/online algorithms.
This followed my graduate work at CMU with Avrim Blum and an NSF postdoc at MIT under the guidance of Santosh
Vempala.
Here is my CV.
Teaching and students
Autumn 2004:
Online Algorithms. MWF 4:30-5:20. Ryerson 276 (University of Chicago)
Current student: Duru Turkoglu (University of Chicago)
Previous summer students: Nina Balcan (CMU), Abie Flaxman (CMU), Brendan McMahan (CMU), Claire Monteleoni (MIT)
Publications
-
Ivona Bezakova, Adam Kalai, and Rahul Santhanam. Graph Model Selection using Maximum Likelihood. To appear in Proceedings of the 23rd International Conference on Machine Learning, 2006.
In recent years, a number of random graph models, such as preferential attachment, have been proposed as probabilistic models of large graphs. We suggest an objective method for ranking their performance on actual graphs. In particular, we look at the probability that a model assigns to a given graph, and design efficient MCMC algorithms for estimating these quantities.
- Elad Hazan, Adam Kalai, Satyen Kale, and Amit Agarwal. Logarithmic Regret Algorithms for Online Convex Optimization.
To appear in Proceedings of 19th Annual
Conference on Learning Theory, 2006.
We present improved algorithms for the problem of online optimization, that use second derivatives, analogous to Newtons method.
- Adam Kalai and Santosh Vempala. Simulated Annealing for Convex Optimization.
To appear in Math of OR, 2006. (15-min ppt 50-min ppt)
Simulated annealing is a general-purpose optimization technique that
has proven useful in practice, but is notoriously difficult to
analyze. We show that it can be used for the problem of minimizing
a linear function over a convex set, where the set is specified only
by a membership oracle. Moreover, its guaranteed convergence rate is
faster than that of any known alternative algorithm.
- Abie Flaxman, Adam Tauman Kalai, and Brendan McMahan. Online Convex Optimization in the Bandit Setting: Gradient
Descent Without a Gradient. In Proceedings of the Sixteenth
Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 385-394, 2005. (20-min ppt, 50-min ppt)
In many problems, a decision-maker has to make a sequence of
decisions online, with limited feedback. For example, consider a
company has to choose how much to spend in advertising through a
number of channels each month. The feedback the company receives
may be only the company's total profit that period. Such feedback
is indirect and noisy -- it may be more influenced by the economy
than the company's choices during that period.
To model such problems, we consider a fixed convex feasible set. On the ith period, the decision maker
chooses a feasible point x_i, and receives some reward, f_i(x_i),
where f_i is any bounded concave function.
The only feedback the decision maker receives is this reward
f_i(x_i) -- no other information is revealed about f_i.
We give an efficient algorithm that guarantees an average
reward that approaches the best average reward achievable by a
single feasible point, where the best is chosen with the benefit of
hindsight. These guarantees hold for an arbitrary sequence of
bounded concave functions.
-
Sham Kakade and Adam Tauman Kalai. From Batch to Transductive Online Learning. In Advances in Neural Information Processing Systems 18, 2006 (NIPS 05). (20-min NIPS ppt)
We consider the online transductive learning (Ben-David, Kushilevitz and Mansour, 1995) model where an
arbitrary sequence of examples are presented to a learner, one
at a time. The learner must predict each label, online, given only
the labels of the previous examples and the future unlabeled
examples. In contrast, more standard i.i.d. models of
learning assume that examples are drawn independently and
identically distributed from a fixed distribution.
We give an algorithm that employs unlabeled data to
convert any efficient learning algorithm in an i.i.d. setting to an
efficient learning algorithm in the more difficult online
transductive model.
-
Adam Tauman Kalai, Adam Klivans, Yishay Mansour, and Rocco
Servedio. Agnostically Learning Halfspaces. In Proceedings of
the 46th Annual Symposium on the Foundations of Computer Science, pp. 11-20,
2005. (long version) (15-min FOCS ppt)
The agnostic model of learning (Kearns, Schapire, and Sellie, 1992) is a
natural model of learning where one aims to do as well as the best
function in a given class of functions. Unfortunately, there are
many previous negative and impossibility results for agnostic
learning, due to computational intractability. We give a new
algorithm for agnostic learning, based on the celebrated learning
algorithm of Linial, Mansour, and Nisan. We show that, under mild
distributional assumptions, it learns to predict as well as the best
halfspace, in n dimensions. The algorithm does not suffer from the
``curse of dimensionality'' -- it runs in time polynomial rather
than exponential in the dimension n.
- Sanjoy
Dasgupta, Adam Tauman Kalai, and Claire Monteleoni. Analysis of
Perceptron-Based Active Learning. In Proceedings of 18th Annual
Conference on Learning Theory, 2005.
In active learning, the set of unlabeled examples is known in
advance, and the learner can choose which examples are labeled as
training data. We give a simple and efficient active learning
procedure based on the classic Perceptron algorithm. Under
distributional assumptions, the number of points that must be
labeled to achieve any error rate is logarithmic in the desired error rate.
- Adam Kalai and Rocco Servedio. Boosting in the Presence of Noise.
Journal of Computer and System Sciences 71(3): 266-290, 2005. (STOC version)
Boosting is a technique that attempts to improve the accuracy of any
learning algorithm. It has both strong theoretical guarantees as
well as overwhelming empirical success. One often-cited weakness of
boosting is its inability to deal with noisy data, even when the
noise is chosen completely randomly and independently. Following
work of Kearns, Mansour, and McAllester, we show that a simple
divide-and-conquer boosting algorithm can boost optimally in the
presence of noise.
- Adam Kalai and Santosh Vempala. Efficient Algorithms for On-line Optimization. Journal of
Computer and System Sciences 71(3): 291-307, 2005. (COLT
version)
In online optimization, one wants to solve a sequence of
combinatorial optimization problems in the face of uncertainty. For
example, each day one wants to choose a path to drive to work, and
only afterwards, one finds out the times on the various roads.
Suppose that one can efficiently solve the one-shot version of the
optimization problem, e.g., one can easily find the shortest path in
a graph with known times. We give an efficient procedure for
adaptively choosing solutions to a sequence of such problems, where
the average quality of the solution approaches that of the best
single solution. This holds for an arbitrary sequence of problems,
yet guarantees are given with respect to the best solution, chosen
with the benefit of hindsight. Our algorithm is as efficient as the best
offline solution, i.e., the fastest algorithm to solve the problem if the
entire sequence were known in advance.
- Adam Tauman Kalai. Learning Monotonic Linear Functions. In Proceedings of 17th Annual
Conference on Learning Theory, pp. 487-501, 2004. (long version: Efficient estimators for Generalized Additive Models, submitted) (50-minute ppt)
Generalized Additive Models (Hastie and Tibshirani) are a broad
generalization of both linear and logistic regression. We give the
first computationally efficient algorithm that provably learns this
class of models. It is efficient in two sense: its runtime is
polynomial in the size of the training data, and its error
approaches optimal at a rate that is inverse polynomial.
- Avrim Blum, Suchi Chawla, and Adam Kalai. Static Optimality and Dynamic Search Optimality in Lists and
Trees. Algorithmica 36:3, pp. 249-260, 2003. (SODA version)
Splay Trees are an efficient algorithm for maintaining a dynamic
binary search tree, popular both in theory and practice. They are
guaranteed to be only a constant factor worse than the best possible
static binary search tree. We give algorithms that are only
(1+eps) worse than the best fixed binary search tree, and
similar results for lists.
- Avrim Blum, Adam Kalai, and Hal Wasserman. Noise-Tolerant Learning, the Parity Problem, and the Statistical
Query Model. JACM 50(4):506--519, 2003. (STOC version)
The parity with noise learning problem is one of the most beautiful
problems in computer science that is believed to be hard. It is not
only believed to be hard on worst-case instances but also on random
instances. An efficient algorithm for this problem would find
applications in many fields, such as error-correcting codes. We
give the fastest-known algorithm. While it is not truly efficient,
it is the first solution that is faster than brute force.
- Adam Kalai. Generating Random Factored Numbers, Easily. Journal of
Cryptology 16(4):287-289, 2003.
(SODA version)
Our goal is to generate a uniformly random number between 1 and N,
along with its prime factorization. Since there are no known
efficient factoring algorithms, one cannot simply generate a random
number and then factor it. Eric Bach's award-winning dissertation
introduces and solves this problem. We present a four-line solution
that is still efficient.
- Adam Kalai. Efficient Pattern-Matching with Don't Cares. In Proceedings of the
Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 655-656, 2002.
We give a randomized algorithm for the string matching with don't
cares problem, the standard string matching problem with wildcards.
We present a four-line solution that is slightly faster and much
simpler than previous algorithms.
- Adam Kalai. Better computers for better people. Ph.D. thesis, technical report CMU-CS-01-132, 2001.
- Adam Kalai and Santosh Vempala. Efficient Algorithms for Universal Portfolios.
Journal of Machine Learning Research 3(3):423--440,
2002. (FOCS version)
The Universal Portfolio is an investment strategy introduced by
Thomas Cover. Previously, all known implementations required
computation exponential in the number of stocks. In this paper, we
use random walks for an implementation that is polynomial in the
number of stocks.
- Adam and Ehud Kalai. Strategic Polarization. Journal of Mathematical
Psychology, 45:4, pp. 656-663, 2001.
An example of polarization is a marriage in which one spouse gives
generously to charity while the other donates nothing. This may
misrepresent what is, in actuality, a small discrepancy in
preferences. It may be that the donating spouse would like to see
10\% of their combined income go to charity each year, while the apparently frugal spouse would like to see 9\% donated. A simple game-theoretic analysis suggests that the spouses will end up donating 10\% and 0\%, respectively. This paper gives a strategic (game-theoretic) analysis of
polarization.
- Avrim Blum, Carl Burch, and Adam Kalai. Finely Competitive Paging. In
Proceedings of the 40th Annual Symposium on the Foundations of
Computer Science, pp. 450-458, 1999.
In the on-line paging problem, one has a fixed-size cache. Each
time a page is requested that is not in the cache, it has to be
brought in at a cost of 1 and another page has to be evicted from
the cache. We present an algorithm that has the same worst-case
performance as the previous algorithms, but for a large and
important class of request sequences, is superior.
- Heung-Yeung Shum, Adam Kalai, and Steve Seitz. Omnivergent Stereo. Journal of
Computer Vision 48:3, pp 159-172, 2002. (also at ICCV 1999)
We describe the design of an optimal camera for 3D reconstruction.
In our model, a camera may capture a fixed number of pixels (light
rays) from anywhere inside the circular frame of the camera.
Afterwards, points outside the camera must be reconstructed as
accurately as possible. We show that capturing tangent rays (in
both directions) gives optimal uniform reconstructions.
- Avrim Blum, Adam Kalai, and John Langford. Beating the Holdout: Bounds for K-Fold and Progressive
Cross-Validation. In Proceedings of the 10th Annual Conference
on Computational Learning Theory, 1999.
K-Fold cross-validation is a popular technique in machine learning
for estimating the performance of a learned hypothesis on a data
set. We provide the first theoretical justification for this method
that shows that it is, on average, more accurate than a held-out
test set of comparable size.
- Stan
Chen, Adam Kalai, Avrim Blum, and Roni Rosenfeld. On-line
Algorithms for Combining Language Models. In Proceedings of
the International Conference on Accoustics, Speech, and Signal
Processing, 1999.
We show how Cover's Universal Portfolios can be applied to the field
of language modeling. The goal of a language model is to accurately
predict the next word in a sequence of words, with applications to
speech recognition and data compression. In this setting, the
universal guarantees are very striking. We support this with
experiments on several different corpora.
- Avrim Blum and Adam Kalai. Universal Portfolios With and Without Transaction Costs. Machine Learning 35:3, pp 193-205, 1999. (15-min
COLT slides).
We present a much simpler analysis of Thomas Cover's Universal Portfolios. This analysis easily extends to the case of transaction costs.
- Adam Kalai and Mel Siegel. Improved Rendering of Parallax Panoramagrams for a
Time-Multiplexed Autostereoscopic Display. In Stereoscopic
Displays and Applications IX Proceedings of SPIE 3295, 1998.
We describe a new kind of 3D display that is viewable without glasses.
- Avrim Blum and Adam Kalai. A Note on Learning from Multiple-Instance Examples. Machine Learning 30:1, pp. 23-30, 1998.
In the multiple-instance learning setting introduced by Dietterich
et al, the example data consist of n-tuples of points. An n-tuple
is labeled positive if any of the n constituent points are positive.
We show that this problem can be reduced to that of learning in the
presence of noise.