TechTalks from event: ICML 2011

Statistical Relational Learning

  • Relational Active Learning for Joint Collective Classification Models Authors: Ankit Kuwadekar; Jennifer Neville
    In many network domains, labeled data may be costly to acquire, indicating a need for relational active learning methods. Recent work has demonstrated that relational model performance can be improved by taking network structure into account when choosing instances to label. However, in collective inference settings, both model estimation and prediction can be improved by acquiring a node's label, since relational models estimate a joint distribution over labels in the network and collective classification methods propagate information from labeled training data during prediction. This conflates improvement in learning with improvement in inference, since labeling nodes can reduce inference error without improving the overall quality of the learned model. Here, we use across-network classification to separate the effects on learning and prediction, and focus on reduction of learning error. When label propagation is used for learning, we find that labeling based on prediction certainty is more effective than labeling based on uncertainty. As such, we propose a novel active learning method that combines a network-based certainty metric with semi-supervised learning and relational resampling. We evaluate our approach on synthetic and real-world networks and show faster learning compared to several baselines, including the network-based method of Bilgic et al. (2010).
  • A Three-Way Model for Collective Learning on Multi-Relational Data Authors: Maximilian Nickel; Volker Tresp; Hans-Peter Kriegel
    Relational learning is becoming increasingly important in many areas of application. Here, we present a novel approach to relational learning based on the factorization of a three-way tensor. We show that, unlike other tensor approaches, our method is able to perform collective learning via the latent components of the model, and we provide an efficient algorithm to compute the factorization. We substantiate our theoretical considerations regarding the collective learning capabilities of our model by means of experiments on both a new dataset and a dataset commonly used in entity resolution. Furthermore, we show on common benchmark datasets that our approach achieves results better than or on par with current state-of-the-art relational learning solutions, while being significantly faster to compute.
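
To make the three-way factorization idea from the talk above concrete, here is a minimal sketch. It assumes one particular bilinear form, X_k ≈ A R_k Aᵀ, in which every entity has a single latent vector (a row of A) shared across all relations and across subject and object roles. The abstract does not spell out the model, so the form, the names `score` and `reconstruct_slice`, and the toy numbers are illustrative assumptions rather than the authors' implementation.

```python
def score(A, R, i, k, j):
    """Bilinear score a_i^T R_k a_j for the triple (entity i, relation k, entity j).

    A: list of latent vectors, one per entity (assumed shared by all relations).
    R: list of square interaction matrices, one per relation.
    """
    a_i, a_j, R_k = A[i], A[j], R[k]
    return sum(
        a_i[p] * R_k[p][q] * a_j[q]
        for p in range(len(a_i))
        for q in range(len(a_j))
    )


def reconstruct_slice(A, R, k):
    """Rebuild the k-th frontal slice of the tensor as A R_k A^T."""
    n = len(A)
    return [[score(A, R, i, k, j) for j in range(n)] for i in range(n)]


# Toy data (hypothetical): entities 0 and 2 share the same latent vector.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
# One relation that links latent component 0 (subject) to component 1 (object).
R = [[[0.0, 1.0], [0.0, 0.0]]]

for row in reconstruct_slice(A, R, 0):
    print(row)
```

Because entities 0 and 2 have identical latent vectors in this toy setup, they receive identical scores for every relation, which is one way information can propagate between related entities through shared latent components.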

Outlier Detection

  • Learning Multi-View Neighborhood Preserving Projections Authors: Novi Quadrianto; Christoph Lampert
    We address the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity. Our motivation stems from the problem of cross-media retrieval tasks, where the availability of a joint Euclidean distance function is a prerequisite to allow fast, in particular hashing-based, nearest neighbor queries. We formulate an objective function that expresses the intuitive concept that matching samples are mapped closely together in the output space, whereas non-matching samples are pushed apart, no matter in which view they are available. The resulting optimization problem is not convex, but it can be decomposed explicitly into a convex and a concave part, thereby allowing efficient optimization using the convex-concave procedure. Experiments on an image retrieval task show that nearest-neighbor based cross-view retrieval is indeed possible, and the proposed technique improves the retrieval accuracy over baseline techniques.
  • On the Robustness of Kernel Density M-Estimators Authors: JooSeuk Kim; Clayton Scott
    We analyze a method for nonparametric density estimation that exhibits robustness to contamination of the training sample. This method achieves robustness by combining a traditional kernel density estimator (KDE) with ideas from classical M-estimation. The KDE based on a Gaussian kernel is interpreted as a sample mean in the associated reproducing kernel Hilbert space (RKHS). This mean is estimated robustly through the use of a robust loss, yielding the so-called robust kernel density estimator (RKDE). This robust sample mean can be found via a kernelized iteratively re-weighted least squares (IRWLS) algorithm. Our contributions are summarized as follows. First, we present a representer theorem for the RKDE, which gives insight into the robustness of the RKDE. Second, we provide necessary and sufficient conditions for kernel IRWLS to converge to the global minimizer, in the Gaussian RKHS, of the objective function defining the RKDE. Third, we characterize and provide a method for computing the influence function associated with the RKDE. Fourth, we illustrate the robustness of the RKDE through experiments on several data sets.
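
The kernelized IRWLS idea in the robust KDE talk above can be sketched in a few lines. The sketch below treats the density estimate as a weighted mean of kernel functions, measures each sample's RKHS distance to that mean using only kernel evaluations, and reweights samples with a Huber-type loss. The choice of Huber loss, the 1-D setting, the fixed iteration count, and the names `rkde_weights`, `huber_a`, and `density` are assumptions made for illustration, not details taken from the abstract.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Normalized 1-D Gaussian kernel with bandwidth sigma."""
    norm = math.sqrt(2.0 * math.pi) * sigma
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2)) / norm


def rkde_weights(xs, sigma, huber_a, n_iter=100):
    """Kernelized IRWLS sketch: returns sample weights that sum to 1."""
    n = len(xs)
    K = [[gaussian_kernel(a, b, sigma) for b in xs] for a in xs]
    w = [1.0 / n] * n  # uniform weights reproduce the ordinary KDE
    for _ in range(n_iter):
        Kw = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        wKw = sum(w[i] * Kw[i] for i in range(n))
        new_w = []
        for i in range(n):
            # Squared RKHS distance between sample i and the weighted mean.
            sq = max(K[i][i] - 2.0 * Kw[i] + wKw, 0.0)
            r = math.sqrt(sq)
            # Huber reweighting: psi(r) / r, i.e. 1 near the mean, a / r far away.
            new_w.append(1.0 if r <= huber_a else huber_a / r)
        total = sum(new_w)
        w = [v / total for v in new_w]
    return w


def density(x, xs, w, sigma):
    """Weighted kernel density estimate evaluated at x."""
    return sum(wi * gaussian_kernel(x, xi, sigma) for wi, xi in zip(w, xs))


# Hypothetical sample: six points near 0 plus one contaminating point at 5.
xs = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, 5.0]
w = rkde_weights(xs, sigma=0.5, huber_a=0.5)
print(w)
```

In this toy sample the point at 5.0 lies far from the weighted mean in the RKHS, so it ends up with a smaller weight than the points near 0, and the estimated density around the contaminating point shrinks relative to the plain KDE.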

Time Series

  • Learning Discriminative Fisher Kernels Authors: Laurens Van der Maaten
    Fisher kernels provide a commonly used vectorial representation of structured objects. The paper presents a technique that exploits label information to improve the object representation of Fisher kernels by employing ideas from metric learning. In particular, the new technique trains a generative model in such a way that the distance between the log-likelihood gradients induced by two objects with the same label is as small as possible, and the distance between the gradients induced by two objects with different labels is as large as possible. We illustrate the strong performance of classifiers trained on the resulting object representations on problems in handwriting recognition, speech recognition, facial expression analysis, and bio-informatics.
  • Time Series Clustering: Complex is Simpler! Authors: Lei Li; B. Aditya Prakash
    Given a motion capture sequence, how can we identify the category of the motion? Classifying human motions is a critical task in motion editing and synthesis, for which manual labeling is clearly inefficient for large databases. Here we study the general problem of time series clustering. We propose a novel method of clustering time series that can (a) learn joint temporal dynamics in the data; (b) handle time lags; and (c) produce interpretable features. We achieve this by developing complex-valued linear dynamical systems (CLDS), which include real-valued Kalman filters as a special case; our advantage is that the transition matrix is simpler (just diagonal), and the transmission matrix is easier to interpret. We then present Complex-Fit, a novel EM algorithm to learn the parameters for the general model and its special case for clustering. Our approach produces a significant improvement in clustering quality, 1.5 to 5 times better than well-known competitors on real motion capture sequences.
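
A small simulation may help show why a diagonal complex transition matrix, as in the time series clustering talk above, is convenient. In the noise-free sketch below, each hidden component is multiplied by one complex number per step, so it traces a damped or sustained oscillation, and a time lag only changes the phase of the hidden state. Dropping the noise terms, taking the real part of the output, and the name `simulate_diagonal_clds` are simplifying assumptions for this demo, not the model as specified by the authors.

```python
import cmath
import math


def simulate_diagonal_clds(eigs, z0, c, n_steps):
    """Noise-free rollout of z_{t+1} = diag(eigs) z_t, x_t = Re(sum_k c[k] * z_t[k])."""
    z = list(z0)
    xs = []
    for _ in range(n_steps):
        xs.append(sum(ck * zk for ck, zk in zip(c, z)).real)
        # A diagonal transition updates every component independently.
        z = [a * zk for a, zk in zip(eigs, z)]
    return xs


# One component with magnitude 0.95 (slow decay) and a period of 20 steps.
eig = cmath.rect(0.95, 2.0 * math.pi / 20.0)
series = simulate_diagonal_clds([eig], [1 + 0j], [2 + 0j], 30)

# Starting three steps later is the same model with a rotated initial state.
lagged = simulate_diagonal_clds([eig], [eig ** 3], [2 + 0j], 27)
print(series[3:6])
print(lagged[0:3])
```

The lagged series is produced by the same transition and output weights with only the initial state rotated and scaled, which illustrates how features built from the magnitudes of the weights can be insensitive to time lags.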