TechTalks from event: ICML 2011

Supervised Learning

  • Multi-Label Classification on Tree- and DAG-Structured Hierarchies Authors: Wei Bi; James Kwok
    Many real-world applications involve multi-label classification, in which the labels are organized in the form of a tree or directed acyclic graph (DAG). However, current research efforts typically ignore the label dependencies or can only exploit the dependencies in tree-structured hierarchies. In this paper, we present a novel hierarchical multi-label classification algorithm that can be used on both tree- and DAG-structured hierarchies. The key idea is to formulate the search for the optimal consistent multi-label as finding the best subgraph in a tree/DAG. Using a simple greedy strategy, the proposed algorithm is computationally efficient, easy to implement, does not suffer from the problem of insufficient/skewed training data in classifier training, and can be readily used on large hierarchies. Theoretical results guarantee the optimality of the obtained solution. Experiments are performed on a large number of functional genomics data sets. The proposed method consistently outperforms the state-of-the-art method on both tree- and DAG-structured hierarchies. (A toy sketch of such a greedy selection appears after this list.)
  • Surrogate losses and regret bounds for cost-sensitive classification with example-dependent costs Authors: Clayton Scott
    We study surrogate losses in the context of cost-sensitive classification with example-dependent costs, a problem also known as regression level set estimation. We give sufficient conditions on the surrogate loss for the existence of a surrogate regret bound. Such bounds imply that as the surrogate risk tends to its optimal value, so too does the expected misclassification cost. Our sufficient conditions encompass example-dependent versions of the hinge, exponential, and other common losses. These results provide theoretical justification for some previously proposed surrogate-based algorithms and suggest others that have not yet been developed. (A schematic of these quantities appears after this list.)
  • Support Vector Machines as Probabilistic Models Authors: Vojtech Franc; Alexander Zien; Bernhard Schölkopf
    We show how the SVM can be viewed as a maximum likelihood estimate of a class of probabilistic models. This model class can be viewed as a reparametrization of the SVM, in a similar vein to the $\nu$-SVM reparametrizing the original SVM. It is not discriminative, but has a non-uniform marginal. We illustrate the benefits of this new view by re-deriving and re-investigating two established SVM-related algorithms. (A back-of-the-envelope version of the maximum-likelihood reading appears after this list.)
  • Locally Linear Support Vector Machines Authors: Lubor Ladicky; Philip Torr
    Linear support vector machines (SVMs) have become popular for solving classification tasks due to their fast and simple online application to large-scale data sets. However, many problems are not linearly separable. For these problems kernel-based SVMs are often used, but unlike their linear variant they suffer from various drawbacks in terms of computational and memory efficiency. Their response can be represented only as a function of the set of support vectors, which has been experimentally shown to grow linearly with the size of the training set. In this paper we propose a novel locally linear SVM classifier with smooth decision boundary and bounded curvature. We show how the functions defining the classifier can be approximated using local codings, and show how this model can be optimized in an online fashion by performing stochastic gradient descent with the same convergence guarantees as the standard gradient descent method for linear SVMs. Our method achieves performance comparable to the state-of-the-art whilst being significantly faster than competing kernel SVMs. We generalise this model to locally finite-dimensional kernel SVMs. (A minimal stochastic-gradient sketch of a locally linear classifier appears after this list.)
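
As a rough illustration of the Bi and Kwok abstract, the sketch below greedily grows a root-connected set of labels on a tree-structured hierarchy, always adding the highest-scoring label whose parent is already selected, so the prediction stays consistent with the hierarchy. This is a toy version under my own assumptions (per-label scores given up front, a fixed budget of labels, a tree rather than a DAG); the paper's actual algorithm and its optimality guarantees are more involved.

    def greedy_consistent_labels(scores, parent, budget):
        """Greedily pick up to `budget` labels forming a root-connected subtree.

        scores: {label: classifier score}, higher means more likely positive
        parent: {label: parent label}, with None for top-level labels
        """
        selected = {None}  # sentinel standing in for the (virtual) root
        for _ in range(budget):
            # frontier: unselected labels whose parent is already selected,
            # so the chosen set always respects the hierarchy constraint
            frontier = [v for v in scores
                        if v not in selected and parent[v] in selected]
            if not frontier:
                break
            selected.add(max(frontier, key=scores.get))
        selected.discard(None)
        return selected

For a DAG, `parent[v]` would be a set, and the frontier test would require all of a label's parents (or, under a weaker consistency notion, at least one) to be selected already.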
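For the Scott abstract, the display below spells out the quantities involved in my own assumed notation: a cost-sensitive risk in which each example carries its own nonnegative cost, a surrogate risk built from a margin loss weighted by that cost, and the generic shape of a surrogate regret bound. The sufficient conditions on $\phi$ and the form of $\psi$ are the paper's contribution and are not reproduced here.

    % Notation assumed for illustration: (X, Y, C) with Y in {-1, +1}, cost C >= 0.
    \begin{align*}
    R(f)        &= \mathbb{E}\big[\, C \,\mathbf{1}\{\operatorname{sign} f(X) \neq Y\} \,\big]
                   && \text{cost-sensitive risk} \\
    R_\phi(f)   &= \mathbb{E}\big[\, C \,\phi\big(Y f(X)\big) \,\big]
                   && \text{surrogate risk, e.g. } \phi(t) = \max(0, 1 - t) \\
    R(f) - R^*  &\le \psi\big( R_\phi(f) - R_\phi^* \big)
                   && \text{surrogate regret bound}
    \end{align*}

A bound of this shape, with $\psi$ vanishing at zero, makes the abstract's claim precise: driving the surrogate risk toward its optimum $R_\phi^*$ forces the expected misclassification cost toward its optimum $R^*$.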
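For the Franc, Zien and Schölkopf abstract, the identity below shows the generic sense in which a regularized loss minimizer is a maximum-likelihood/MAP estimate: exponentiating the negated objective turns a sum of losses plus a regularizer into a product of unnormalized likelihood terms times a prior. This is only the standard starting point, not the paper's actual model class; turning $e^{-\text{hinge}}$ into a properly normalized probabilistic model is precisely where the paper's construction goes beyond this sketch.

    \begin{align*}
    \hat{w} &= \operatorname*{arg\,min}_{w} \; \sum_{i=1}^{n} \max\big(0,\, 1 - y_i w^\top x_i\big) + \lambda \lVert w \rVert^2 \\
            &= \operatorname*{arg\,max}_{w} \; \underbrace{e^{-\lambda \lVert w \rVert^2}}_{\text{prior}} \; \prod_{i=1}^{n} \underbrace{e^{-\max(0,\, 1 - y_i w^\top x_i)}}_{\text{unnormalized likelihood term}}
    \end{align*}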
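Finally, for the Ladicky and Torr abstract, here is a minimal sketch of a locally linear classifier trained by stochastic gradient descent on the hinge loss, with the decision function $f(x) = \sum_k \gamma_k(x)\,(w_k^\top x + b_k)$. The particular local coding (inverse-distance weights over the $k$ nearest anchor points) and the Pegasos-style step size are my own simplifying assumptions; the paper's choice of coding and its finite-dimensional kernel generalization differ.

    import numpy as np

    def local_coding(x, anchors, k=3):
        """Sparse coordinates of x w.r.t. anchor points: inverse-distance
        weights over the k nearest anchors, normalized to sum to one."""
        d = np.linalg.norm(anchors - x, axis=1)
        idx = np.argsort(d)[:k]
        w = 1.0 / (d[idx] + 1e-8)
        gamma = np.zeros(len(anchors))
        gamma[idx] = w / w.sum()
        return gamma

    def train_llsvm(X, y, anchors, lam=1e-4, epochs=5):
        """SGD on the hinge loss of f(x) = sum_k gamma_k(x) (w_k . x + b_k).

        X: (n, d) inputs; y: labels in {-1, +1}; anchors: (m, d) anchor points.
        """
        m, d = anchors.shape
        W, b = np.zeros((m, d)), np.zeros(m)
        t = 0
        rng = np.random.default_rng(0)
        for _ in range(epochs):
            for i in rng.permutation(len(X)):
                t += 1
                lr = 1.0 / (lam * t)        # Pegasos-style step size
                g = local_coding(X[i], anchors)
                score = g @ (W @ X[i] + b)
                W *= 1.0 - lr * lam         # shrink from the L2 regularizer
                if y[i] * score < 1.0:      # hinge active: take a gradient step
                    W += lr * y[i] * np.outer(g, X[i])
                    b += lr * y[i] * g
        return W, b

Prediction is then the sign of `local_coding(x, anchors) @ (W @ x + b)`; the anchor points could, for instance, come from running k-means on the training inputs.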