TechTalks from event: ICML 2011
Multiple Instance Learning with Manifold Bags
In many machine learning applications, labeling every instance of data is burdensome. Multiple Instance Learning (MIL), in which training data is provided in the form of labeled bags rather than labeled instances, is one approach for a more relaxed form of supervised learning. Though much progress has been made in analyzing MIL problems, existing work considers bags that have a finite number of instances. In this paper we argue that in many applications of MIL (e.g. image, audio, etc.) the bags are better modeled as low-dimensional manifolds in a high-dimensional feature space. We show that the geometric structure of such manifold bags affects PAC-learnability. We discuss how a learning algorithm that is designed for finite-sized bags can be adapted to learn from manifold bags. Furthermore, we propose a simple heuristic that reduces the memory requirements of such algorithms. Our experiments on real-world data validate our analysis and show that our approach works well.
Minimax Learning Rates for Bipartite Ranking and Plug-in Rules
While it is now well known that, in the standard binary classification setup, plug-in classifiers can achieve fast or even super-fast rates (i.e. rates faster than n^(-1/2) or even faster than n^(-1)) under suitable margin assumptions and complexity conditions on the regression function, no result of this nature has yet been proved for bipartite ranking, although the problem is akin to classification. The main purpose of the present paper is to investigate this issue. Viewing bipartite ranking as a nested continuous collection of cost-sensitive classification problems, we exhibit a global low noise condition under which certain plug-in ranking rules can be shown to achieve fast (but not super-fast) rates, thus establishing minimax upper bounds for the excess ranking risk.
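
As a rough illustration of the quantity a plug-in ranking rule is judged by, the sketch below computes an empirical ranking error: the fraction of (positive, negative) pairs that a scoring function orders incorrectly, with ties counted as half an error. It is not taken from the paper. The function name `empirical_ranking_error`, the 0/1 label encoding, and the idea of passing in precomputed scores (standing in for an estimate of the regression function) are all assumptions made for this example.

```python
import numpy as np


def empirical_ranking_error(scores, labels):
    """Fraction of (positive, negative) pairs ranked in the wrong order.

    scores: estimated values of the regression function (hypothetical
            plug-in estimate), one per example.
    labels: 0/1 class labels (assumed encoding for this sketch).
    Ties between a positive and a negative score count as half an error.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative example")
    # Pairwise score differences: rows index positives, columns index negatives.
    diff = pos[:, None] - neg[None, :]
    errors = (diff < 0) + 0.5 * (diff == 0)
    return float(errors.mean())
```

This value equals one minus the empirical AUC of the scoring function, so a plug-in rule that orders examples well has an error close to zero.
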
From PAC-Bayes Bounds to Quadratic Programs for Majority Votes
We propose to construct a weighted majority vote on a set of basis functions by minimizing a risk bound (called the C-bound) that depends on the first two moments of the margin of the Q-convex combination realized on the training data. This bound minimization algorithm turns out to be a quadratic program that can be efficiently solved. A first version of the algorithm is designed for the supervised inductive setting and turns out to be competitive with AdaBoost, MDBoost and the SVM. The second version of the algorithm, designed for the transductive setting, competes well with TSVM. We also propose a new PAC-Bayes theorem that bounds the difference between the "true" value of the C-bound and its empirical estimate and that, unexpectedly, contains no KL-divergence.
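
To make the dependence on the first two moments of the margin concrete, here is a minimal sketch that evaluates an empirical C-bound for a fixed weighting Q. The abstract does not state the formula; the sketch assumes the commonly cited form 1 - mu1^2 / mu2, where mu1 and mu2 are the first and second moments of the margin and mu1 is positive. The function name `empirical_c_bound`, the {-1, +1} encoding of votes and labels, and the array layout are choices made for this example, and the code does not perform the quadratic-program minimization described above.

```python
import numpy as np


def empirical_c_bound(votes, y, q):
    """Empirical C-bound of a Q-weighted majority vote (assumed form).

    votes: array of shape (n_samples, n_voters) with entries in {-1, +1}.
    y:     array of shape (n_samples,) with labels in {-1, +1}.
    q:     array of shape (n_voters,), non-negative weights summing to 1.
    """
    votes = np.asarray(votes, dtype=float)
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    # Margin of the Q-convex combination on each training example.
    margins = y * (votes @ q)
    mu1 = margins.mean()          # first moment of the margin
    mu2 = (margins ** 2).mean()   # second moment of the margin
    if mu1 <= 0:
        raise ValueError("this form of the bound needs a positive first moment")
    return 1.0 - mu1 ** 2 / mu2
```

For example, with three voters weighted uniformly and four examples on which exactly one voter errs for three of them, the margins are 1/3, 1/3, 1/3 and 1, giving mu1 = 1/2, mu2 = 1/3 and a bound of 0.25, while the majority vote itself makes no error on those examples.
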