TechTalks from event: ICML 2011

Learning Theory

  • Multiple Instance Learning with Manifold Bags Authors: Boris Babenko; Nakul Verma; Piotr Dollar; Serge Belongie
    In many machine learning applications, labeling every instance of data is burdensome. Multiple Instance Learning (MIL), in which training data is provided in the form of labeled bags rather than labeled instances, is one approach for a more relaxed form of supervised learning. Though much progress has been made in analyzing MIL problems, existing work considers bags that have a finite number of instances. In this paper we argue that in many applications of MIL (e.g. image, audio, etc.) the bags are better modeled as low-dimensional manifolds in a high-dimensional feature space. We show that the geometric structure of such manifold bags affects PAC-learnability. We discuss how a learning algorithm that is designed for finite-sized bags can be adapted to learn from manifold bags. Furthermore, we propose a simple heuristic that reduces the memory requirements of such algorithms. Our experiments on real-world data validate our analysis and show that our approach works well.
  • Minimax Learning Rates for Bipartite Ranking and Plug-in Rules Authors: Sylvain Robbiano; Stéphan Clémençon
    While it is now well-known in the standard binary classification setup, that, under suitable margin assumptions and complexity conditions on the regression function, fast or even super-fast rates (i.e. rates faster than n^(-1/2) or even faster than n^(-1)) can be achieved by plug-in classifiers, no result of this nature has been proved yet in the context of bipartite ranking, though akin to that of classification. It is the main purpose of the present paper to investigate this issue. Viewing bipartite ranking as a nested continuous collection of cost-sensitive classification problems, we exhibit a global low noise condition under which certain plug-in ranking rules can be shown to achieve fast (but not super-fast) rates, establishing thus minimax upper bounds for the excess of ranking risk.
  • From PAC-Bayes Bounds to Quadratic Programs for Majority Votes Authors: Jean-Francis Roy; Francois Laviolette; Mario Marchand
    We propose to construct a weighted majority vote on a set of basis functions by minimizing a risk bound (called the C-bound) that depends on the first two moments of the margin of the Q-convex combination realized on the training data. This bound minimization algorithm turns out to be a quadratic program that can be efficiently solved. A first version of the algorithm is designed for the supervised inductive setting and turns out to be competitive with AdaBoost, MDBoost and the SVM. The second version of the algorithm, designed for the transductive setting, competes well with TSVM. We also propose a new PAC-Bayes theorem that bounds the difference between the "true" value of the C-bound and its empirical estimate and that, unexpectedly, contains no KL-divergence.
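
    As an illustration of the last abstract, the quantity it describes can be sketched in a few lines. The abstract says only that the C-bound depends on the first two moments of the margin; the explicit form used here, 1 - (first moment)^2 / (second moment), is the one commonly given in the PAC-Bayes literature and is an assumption about what the talk uses. The function names and the toy data are hypothetical, and the sketch only evaluates the empirical quantity on training data; it does not implement the quadratic program that minimizes it.

```python
def vote_margins(weights, votes, labels):
    """Margin of a weighted majority vote on each example.

    weights: non-negative voter weights summing to 1 (the distribution Q).
    votes:   votes[i][j] in {-1, +1}, the output of voter j on example i.
    labels:  labels[i] in {-1, +1}.
    """
    return [
        y * sum(w * h for w, h in zip(weights, row))
        for row, y in zip(votes, labels)
    ]


def empirical_c_bound(margins):
    """Assumed form of the C-bound: 1 - (mean margin)^2 / (mean squared margin)."""
    n = len(margins)
    first_moment = sum(margins) / n
    second_moment = sum(m * m for m in margins) / n
    if first_moment <= 0:
        # The bound is only meaningful when the vote is right on average.
        raise ValueError("the C-bound requires a positive mean margin")
    return 1.0 - first_moment ** 2 / second_moment


# Toy data (hypothetical): three voters, two examples.
weights = [0.5, 0.25, 0.25]
votes = [[+1, +1, +1],   # all voters agree on example 0
         [+1, -1, -1]]   # voters are split on example 1
labels = [+1, +1]

margins = vote_margins(weights, votes, labels)
bound = empirical_c_bound(margins)
```

    A large mean margin with a small second moment drives the value toward zero, which is why minimizing it favors votes whose margins are both positive and concentrated.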