From macglashan at tti-c.org Wed Oct 1 11:52:29 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Wed Oct 1 11:52:17 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: Shai Ben-David, University of Waterloo
In-Reply-To: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: Monday, October 6 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Shai Ben-David, University of Waterloo
Topic: Axiomatic view of clustering - overcoming Kleinberg's impossibility result.

Clustering is a basic and vastly applicable task. Yet, there exists distressingly meager theoretical understanding of clustering. Can there be a general theory of clustering that is independent of any particular clustering algorithm or cost function or data-generating model? In a highly influential NIPS'02 paper, Jon Kleinberg considered an axiomatic framework for clustering and proved an impossibility result. That result is often interpreted as stating the impossibility of building a satisfactory axiomatic foundation for clustering. We take a second look at that result and show that Kleinberg's pessimistic conclusion can be overcome by a relatively small change to the formalism underlying the analysis of clustering. I'll conclude with a high level discussion of the possible goals for a general theory of clustering. Part of this work is joint with Margarita Ackerman.

Contact: Nati Srebro, TTI-C nati@tti-c.org 834-7493

From macglashan at tti-c.org Mon Oct 6 08:54:52 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Mon Oct 6 08:54:28 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: Shai Ben-David, University of Waterloo
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID: <445A028BF92C496CB8D92A8551B0F65D@jmacglDPLFYD1>

When: TODAY: Monday, October 6 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Shai Ben-David, University of Waterloo
Topic: Axiomatic view of clustering - overcoming Kleinberg's impossibility result.

Clustering is a basic and vastly applicable task. Yet, there exists distressingly meager theoretical understanding of clustering. Can there be a general theory of clustering that is independent of any particular clustering algorithm or cost function or data-generating model? In a highly influential NIPS'02 paper, Jon Kleinberg considered an axiomatic framework for clustering and proved an impossibility result. That result is often interpreted as stating the impossibility of building a satisfactory axiomatic foundation for clustering. We take a second look at that result and show that Kleinberg's pessimistic conclusion can be overcome by a relatively small change to the formalism underlying the analysis of clustering. I'll conclude with a high level discussion of the possible goals for a general theory of clustering. Part of this work is joint with Margarita Ackerman.

Contact: Nati Srebro, TTI-C nati@tti-c.org 834-7493

From macglashan at tti-c.org Tue Oct 7 08:14:29 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 7 08:13:56 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: Partha Niyogi, University of Chicago
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: Monday, October 13 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Partha Niyogi, University of Chicago
Topic: A Geometric Perspective on Learning Theory and Algorithms

Increasingly, we face machine learning problems in very high dimensional spaces. We proceed with the intuition that although natural data lives in very high dimensions, it has relatively few degrees of freedom. One way to formalize this intuition is to model the data as lying on or near a low dimensional manifold embedded in the high dimensional space. This point of view leads to a new class of algorithms that are "manifold motivated" and a new set of theoretical questions that surround their analysis. A central construction in these algorithms is a graph or simplicial complex that is data-derived and we will relate the geometry of these to the geometry of the underlying manifold. The implications of this for machine learning and numerical analysis will be considered.

Contact: Karen Livescu, TTI-C klivescu@tti-c.org 834-2549

From macglashan at tti-c.org Thu Oct 9 09:20:42 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Thu Oct 9 09:19:58 2008
Subject: [TTIC Colloquium] TTI-C Talk: Ranjit Jhala, UC San Diego
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: Thursday, October 16 @ 10:30am
Where: TTI-C Conference Room: 1427 E.
60th St, 2nd Floor
Who: Ranjit Jhala, University of California, San Diego
Topic: Liquid Types

ABSTRACT: We present Logically Qualified Data Types, abbreviated to Liquid Types, a new static verification technique which combines the complementary strengths of automated deduction (SMT solvers), model checking (Predicate Abstraction), and type systems (Hindley-Milner inference). Liquid Types automate static verification of deep invariants by combining local implication checks over simple refinement predicates with global subtyping checks. The former are discharged using SMT solvers, and the latter using standard type-based mechanisms. We have implemented Liquid Types in a tool Dsolve, which takes as input an OCaml program and a set of logical qualifiers and infers liquid types for the expressions in the program. To demonstrate the utility of our approach, we describe experiments using Dsolve to statically verify, with minimal annotations, the safety of array accesses on a diverse set of benchmarks, and the key invariants of a variety of data structure libraries including several sorting implementations, AVL trees, red-black trees, finite, balanced binary search maps, and an extensible vector library. (Joint work with Patrick Rondon and Ming Kawaguchi)

BIO: Ranjit Jhala is an Assistant Professor in the Department of Computer Science and Engineering at UC San Diego. Before joining UCSD, he was a graduate student at UC Berkeley. Ranjit is interested in Programming Languages and Software Engineering, more specifically, in techniques for building reliable computer systems. His work draws from, combines and contributes to methods in the areas of Model Checking, Program Analysis and Automated Deduction.

Contact: Amal Ahmed, TTI-C amal@tti-c.org 834-6832

From macglashan at tti-c.org Mon Oct 13 08:11:24 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Mon Oct 13 08:10:30 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: Partha Niyogi, University of Chicago
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID: <7EB87161ECE14096BE567747402DE331@jmacglDPLFYD1>

When: TODAY: Monday, October 13 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Partha Niyogi, University of Chicago
Topic: A Geometric Perspective on Learning Theory and Algorithms

Increasingly, we face machine learning problems in very high dimensional spaces. We proceed with the intuition that although natural data lives in very high dimensions, it has relatively few degrees of freedom. One way to formalize this intuition is to model the data as lying on or near a low dimensional manifold embedded in the high dimensional space. This point of view leads to a new class of algorithms that are "manifold motivated" and a new set of theoretical questions that surround their analysis. A central construction in these algorithms is a graph or simplicial complex that is data-derived and we will relate the geometry of these to the geometry of the underlying manifold. The implications of this for machine learning and numerical analysis will be considered.

Contact: Karen Livescu, TTI-C klivescu@tti-c.org 834-2549

From macglashan at tti-c.org Mon Oct 13 14:14:10 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Mon Oct 13 14:13:11 2008
Subject: [TTIC Colloquium] UoC Seminar Announcement
Message-ID:

Petascale Active Data Store Seminar Series

Speaker: Ian Foster
Director, Computation Institute
Host: Mike Papka
Date: October 15, 2008
Time: 3:30 - 4:30pm
Location: University of Chicago, RI, room 405
Title: Taming the Data Deluge: Building an Open Analytics Environment

The Petascale Active Data Store (PADS) Fall Seminar Series is a forum for discussions of data intensive computing. This first series will introduce the National Science Foundation funded PADS system, its hardware and software infrastructure, and highlight some of the scientific domain partners and how they plan to use the environment. Ian Foster, director of Computation Institute, is the guest speaker.

Related website: http://pads.ci.uchicago.edu/

From macglashan at tti-c.org Tue Oct 14 09:10:34 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 14 09:09:32 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: Jeff Erickson, UIUC
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID: <12F4C71AD79048078383D767A574D732@jmacglDPLFYD1>

When: Monday, October 20 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Jeff Erickson, University of Illinois at Urbana-Champaign
Title: Homology Flows

I will describe the first algorithms to compute maximum flows in surface-embedded graphs in near-linear time. Our results generalize an O(n log n)-time max-flow algorithm for undirected planar graphs, published by Hassin and Johnson in 1985. Except for this special case, the only previous time bounds for our problem follow from algorithms for general sparse graphs. Our key insight is to optimize the homology class of the flow, rather than directly optimizing the flow itself.
This is joint work with Erin Chambers, Amir Nayyeri, and Aparna Sundar.

Contact: Benoît Hudson, TTI-C bhudson@tti-c.org 834-2623

From macglashan at tti-c.org Thu Oct 16 08:41:04 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Thu Oct 16 08:39:56 2008
Subject: [TTIC Colloquium] TTI-C Talk: Ranjit Jhala, UC San Diego
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: TODAY: Thursday, October 16 @ 10:30am
Where: ROOM CHANGE: TTI-C Lobby Conference Room #201
Who: Ranjit Jhala, University of California, San Diego
Topic: Liquid Types

ABSTRACT: We present Logically Qualified Data Types, abbreviated to Liquid Types, a new static verification technique which combines the complementary strengths of automated deduction (SMT solvers), model checking (Predicate Abstraction), and type systems (Hindley-Milner inference). Liquid Types automate static verification of deep invariants by combining local implication checks over simple refinement predicates with global subtyping checks. The former are discharged using SMT solvers, and the latter using standard type-based mechanisms. We have implemented Liquid Types in a tool Dsolve, which takes as input an OCaml program and a set of logical qualifiers and infers liquid types for the expressions in the program. To demonstrate the utility of our approach, we describe experiments using Dsolve to statically verify, with minimal annotations, the safety of array accesses on a diverse set of benchmarks, and the key invariants of a variety of data structure libraries including several sorting implementations, AVL trees, red-black trees, finite, balanced binary search maps, and an extensible vector library.
(Joint work with Patrick Rondon and Ming Kawaguchi)

BIO: Ranjit Jhala is an Assistant Professor in the Department of Computer Science and Engineering at UC San Diego. Before joining UCSD, he was a graduate student at UC Berkeley. Ranjit is interested in Programming Languages and Software Engineering, more specifically, in techniques for building reliable computer systems. His work draws from, combines and contributes to methods in the areas of Model Checking, Program Analysis and Automated Deduction.

Contact: Amal Ahmed, TTI-C amal@tti-c.org 834-6832

From macglashan at tti-c.org Thu Oct 16 08:48:12 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Thu Oct 16 08:47:05 2008
Subject: [TTIC Colloquium] TTI-C Talk: Kamalika Chaudhuri, UCSD
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID: <20838B18F1A84BC2A564C424CDA3618B@jmacglDPLFYD1>

When: TODAY: Thursday, October 16 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Kamalika Chaudhuri, University of California, San Diego
Title: Learning Mixtures of Distributions through Spectral Algorithms

Clustering, a method of finding structure in unlabelled data by grouping the data points into a few groups based on a similarity measure, has many applications in AI, Physics and Biology. A simple theoretical model that captures clustering is the problem of learning mixtures of distributions. In this setting, one is given sample points generated from a mixture of T distributions of a certain type, and the goal is to recover these distributions and classify the points correctly. In this talk, motivated by applications in biology, I will focus on learning mixtures of product distributions.
The most common method in practice uses principal component analysis (PCA) as a preprocessing step to find the T-dimensional subspace that contains the T centers. While this has been analysed theoretically, it is known to be ineffective in certain situations -- namely, when the proportion of different distributions in the mixture is too skewed, or when the variance in irrelevant directions is too high.

In the first part of the talk, we present a simple method which simultaneously exploits the correlation between the signal coordinates and independence between the noise coordinates to effectively separate the centers of the distributions. Our method performs better than PCA-based algorithms when learning mixtures of binary product distributions and axis-aligned Gaussians.

In the second part of the talk, motivated again by our application in biology, we address the sample complexity of learning mixtures of distributions. We present a simple and efficient algorithm that learns mixtures of two binary product distributions with low sample complexity.

[Based on joint work with Satish Rao, Eran Halperin and Shuheng Zhou.]

From macglashan at tti-c.org Mon Oct 20 09:08:14 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Mon Oct 20 09:06:59 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: Jeff Erickson, UIUC
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: TODAY: Monday, October 20 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Jeff Erickson, University of Illinois at Urbana-Champaign
Title: Homology Flows

I will describe the first algorithms to compute maximum flows in surface-embedded graphs in near-linear time.
Our results generalize an O(n log n)-time max-flow algorithm for undirected planar graphs, published by Hassin and Johnson in 1985. Except for this special case, the only previous time bounds for our problem follow from algorithms for general sparse graphs. Our key insight is to optimize the homology class of the flow, rather than directly optimizing the flow itself. This is joint work with Erin Chambers, Amir Nayyeri, and Aparna Sundar.

Contact: Benoît Hudson, TTI-C bhudson@tti-c.org 834-2623

From macglashan at tti-c.org Tue Oct 21 08:42:14 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 21 08:40:51 2008
Subject: [TTIC Colloquium] ML Seminar Talk: Nati Srebro, TTI-C
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID: <5B3719F8C63C4200998461BC6A14DB82@jmacglDPLFYD1>

When: Wednesday, October 22 @ 11:00am
Where: Machine Learning Seminar TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Nati Srebro, Toyota Technological Institute at Chicago
Title: Stochastic Convex Optimization

Recently regret bounds for online convex optimization have been derived under very general conditions. These results can be used also in the stochastic batch setting by applying online-to-batch conversions. It is interesting to study whether stochastic guarantees can be obtained more directly, for example using uniform convergence guarantees. We discover a surprising and complex situation: although the stochastic convex optimization problem is solvable (e.g. using online-to-batch conversions), no uniform convergence holds in the general case, and empirical minimization might fail. This is unlike the familiar case of supervised learning, where learning is possible if and only if the empirical errors of all hypotheses converge uniformly and so empirical minimization can be used.
The situation also seems to contradict more general results by Vapnik on the equivalence of stochastic optimization and uniform convergence. Rather than being a difference between online methods and a global minimization approach, we show that the key ingredient is strong convexity and regularization. In doing so we provide new understanding of the role of regularization: even when regularization does not ensure uniform convergence (which is the standard understanding of regularization), it can ensure stability, and so ensure generalization even without uniform convergence.

Joint work with Karthik Sridharan and Shai Shalev-Shwartz

Contact: Nati Srebro, TTI-C nati@tti-c.org 834-7493

From macglashan at tti-c.org Tue Oct 21 12:53:55 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 21 12:52:33 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: David Israel, SRI International
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: Monday, October 27 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: David Israel, SRI International
Title: There Aren't Any Real Belief-Desire Intention Theories Out There ... But Mine!! (And really, I'm not too sure about mine.)

Over the last 20+ years, a number of researchers in AI have presented accounts of agents and architectures for agents that those researchers have called "Belief-Desire-Intention" (BDI) models. I think very few, if any, of those accounts deserve the name. While a dispute over names is unlikely to be either illuminating or profitable, I will argue that there are quite important issues at stake -- not with respect to naming conventions, but with respect to how one thinks about agents.
This will involve my sketching an account that really is a BDI account, and if time allows, relating some of the issues raised to issues that once precipitated great debates among statisticians of different stripes.

Contact: David McAllester, TTI-C mcallester@tti-c.org 702-5562

From macglashan at tti-c.org Tue Oct 21 14:55:57 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 21 14:54:29 2008
Subject: [TTIC Colloquium] UoC Seminar Announcement
Message-ID: <95EA76291E7D4790B8EC588C5C186975@jmacglDPLFYD1>

Petascale Active Data Store Seminar

Speaker: Mike Wilde
MCS, Argonne National Laboratory
Host: Mike Papka
Date: October 22, 2008
Time: 3:30 pm
Location: Argonne National Laboratory, A134; University of Chicago, Research Institute, rm. 405
Title: Swift: Parallel Scripting for Petascale Data Analysis

Swift is a system for the rapid and reliable specification, execution, and management of large-scale science and engineering problems. It supports applications that execute large numbers of tasks that pass data via files - as is common, for example, when analyzing large quantities of data or performing parameter studies and ensemble simulations. The open source Swift software combines a simple scripting language to enable the concise, high-level specifications of complex parallel computations, mappers for accessing diverse data formats in a convenient manner, and an execution engine that can manage the dispatch of tasks to environments ranging from clusters to grids to petascale systems such as the Blue Gene/P. This talk will describe the Swift programming model and show several examples of applying it on large-scale clusters to problems in diverse scientific disciplines.
Related website: http://www.ci.uchicago.edu/swift/

From macglashan at tti-c.org Mon Oct 27 06:57:38 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Mon Oct 27 06:56:04 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: David Israel, SRI International
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: TODAY: Monday, October 27 @ 2:00pm
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: David Israel, SRI International
Title: There Aren't Any Real Belief-Desire Intention Theories Out There ... But Mine!! (And really, I'm not too sure about mine.)

Over the last 20+ years, a number of researchers in AI have presented accounts of agents and architectures for agents that those researchers have called "Belief-Desire-Intention" (BDI) models. I think very few, if any, of those accounts deserve the name. While a dispute over names is unlikely to be either illuminating or profitable, I will argue that there are quite important issues at stake -- not with respect to naming conventions, but with respect to how one thinks about agents. This will involve my sketching an account that really is a BDI account, and if time allows, relating some of the issues raised to issues that once precipitated great debates among statisticians of different stripes.

Contact: David McAllester, TTI-C mcallester@tti-c.org 702-5562

From macglashan at tti-c.org Tue Oct 28 09:43:00 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 28 09:41:10 2008
Subject: [TTIC Colloquium] TTI-C Colloquium: David Forsyth, UIUC
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: Monday, November 3 @ 2:00pm
Where: TTI-C Conference Room: 1427 E.
60th St, 2nd Floor
Who: David Forsyth, University of Illinois at Urbana-Champaign
Title: Looking at People

There is a great need for programs that can describe what people are doing from video. This is difficult to do, because it is hard to identify and track people in video sequences, because we have no canonical vocabulary for describing what people are doing, and because phenomena such as aspect and individual variation greatly affect the appearance of what people are doing.

Recent work in kinematic tracking has produced methods that can report the kinematic configuration of the body fairly accurately and fully automatically. The problem of vocabulary is more difficult. I will discuss a generative activity model that allows activities to be assembled from a set of distinct spatial and temporal components. The models themselves are learned from labelled motion capture data and are assembled in a way that makes it possible to learn very complex finite automata without estimating large numbers of parameters. The advantage of such a model is that one can search videos for examples of activities specified with a simple query language, without possessing any example of the activity sought. In this case, aspect is dealt with by explicit 3D reasoning.

An alternative strategy for dealing with aspect and individual variation is to build discriminative methods applied to appearance features. The difficulty here is that activities look different when seen from different directions. I will describe recent methods that make it possible to transfer models --- that is, to learn a model of an activity from one view, then recognize it in a completely different view.

Contact: Greg Shakhnarovich, TTI-C greg@tti-c.org 834-2572

From macglashan at tti-c.org Tue Oct 28 11:17:13 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 28 11:15:28 2008
Subject: [TTIC Colloquium] ML Seminar: Daniel Hsu, UCSD
References: <9BFA4FFE1ACE407581765CA2326E1547@jmacglDPLFYD1>
Message-ID:

When: Wednesday, October 29 @ 11:00am
Where: TTI-C Conference Room: 1427 E. 60th St, 2nd Floor
Who: Daniel Hsu, University of California, San Diego
Title: Consistent sampling strategies for active learning

In many applications, labeled data typically comes at a higher cost than unlabeled data (e.g. in time, effort). An active learner is given unlabeled data and must pay to view any label. The hope is that significantly fewer labeled examples are used than in the supervised (non-active) learning model. A typical strategy for active learning starts by querying a few randomly-chosen points to get a very rough idea of the decision boundary, and then queries points that are increasingly closer to its current estimate of the boundary. Such selective sampling methods immediately bring to the forefront the unique difficulty of active learning: sampling bias. In this talk, I'll describe active learning strategies that properly manage the sampling bias of selective sampling and contrast them with popular heuristics that are provably inconsistent.

Based on joint work with Sanjoy Dasgupta and Claire Monteleoni.

Contact: Shai Shalev-Shwartz, TTI-C shai@tti-c.org 834-6850

From macglashan at tti-c.org Tue Oct 28 15:02:44 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Tue Oct 28 15:00:53 2008
Subject: [TTIC Colloquium] UC Seminar Announcement
Message-ID:

Petascale Active Data Store Seminar

Speaker: Jesse Shapiro
Graduate School of Business, University of Chicago
Host: Mike Papka
Date: October 29, 2008
Time: 3:30 pm
Location: University of Chicago, Research Institute, rm. 405
Title: What Drives Media Slant? Evidence from U.S. Daily Newspapers

The Petascale Active Data Store (PADS) Fall Seminar Series is a forum for discussions of data intensive computing. This first series will introduce the National Science Foundation funded PADS system, its hardware and software infrastructure, and highlight some of the scientific domain partners and how they plan to use the environment.

Information: PADS seminar will be available via the Access Grid at Argonne National Laboratory, Building 221, rm. A134.

Related website: http://pads.ci.uchicago.edu/

From macglashan at tti-c.org Wed Oct 29 09:52:09 2008
From: macglashan at tti-c.org (Julia MacGlashan)
Date: Wed Oct 29 09:50:19 2008
Subject: [TTIC Colloquium] UC Seminar Series by Ketan Mulmuley: Nov 5, 12, & 19
Message-ID:

DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF CHICAGO

Date: Wednesday, November 5, 12 and 19, 2008
Time: 2:30 p.m.
Place: RY 251

----------------------------------------------------------

Speaker: Ketan Mulmuley
From: University of Chicago
Web page: http://www.cs.uchicago.edu/people/mulmuley
Title: On P vs NP, Geometric Complexity theory, and the Riemann Hypothesis

Abstract: This series of three colloquium talks on November 5, 12 and 19 (2.30 p.m.) will give a nontechnical, high level overview of geometric complexity theory (GCT), which is an approach to the P vs.
NP problem via algebraic geometry, representation theory, and the theory of a new class of quantum groups, called nonstandard quantum groups, that arise in this approach. In particular, GCT says that the P vs. NP problem in characteristic zero is intimately linked to the Riemann Hypothesis over finite fields. A high level view of potential implications in mathematics, physics and quantum computation will also be given. No background in algebraic geometry, representation theory or quantum groups will be assumed.

Complementary talks in the logic and theory seminars on November 10 (at 2.30 p.m. and 3.45 p.m.) will elaborate on the basic notion of obstructions in GCT.

References for GCT:

The basic plan of GCT is given in:

GCTflip: "On P vs. NP, Geometric Complexity Theory and the Flip I: high level view".

It has been partially implemented in a series of papers: GCT1 to GCT11.

GCT1 to 4: Joint with Milind Sohoni
GCT5: Joint with Hari Narayanan

GCTflip, its abstract (GCTabs), and GCT1-8 are available on the speaker's personal home page. GCT8-11 are under preparation.