Tutorial: Learning Kernels
Kernel methods are widely used in statistical learning. A positive definite symmetric (PDS) kernel implicitly specifies an inner product in a Hilbert space in which large-margin techniques are used for learning and estimation. PDS kernels can be combined with support vector machines (SVMs) and other kernel-based algorithms to form powerful learning techniques.
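To make the "implicit inner product" statement above concrete, here is a minimal sketch. It assumes a homogeneous degree-2 polynomial kernel on 2-dimensional inputs, chosen only because its feature map can be written out by hand; the names `poly2_kernel` and `phi` are illustrative and do not come from any library discussed in the tutorial.

```python
import numpy as np

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return float(np.dot(x, z)) ** 2

def phi(x):
    """Explicit feature map for poly2_kernel on 2-dimensional inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# The kernel value equals the inner product of the mapped points,
# although the kernel itself never builds phi(x) or phi(z).
assert np.isclose(poly2_kernel(x, z), phi(x) @ phi(z))

# The Gram matrix of a PDS kernel on any sample is symmetric and
# positive semidefinite (up to floating-point error).
rng = np.random.default_rng(0)
sample = rng.normal(size=(10, 2))
gram = (sample @ sample.T) ** 2
assert np.allclose(gram, gram.T)
assert np.linalg.eigvalsh(gram).min() > -1e-8
```

For kernels such as the Gaussian kernel the corresponding feature space is infinite-dimensional, so only the kernel evaluation is available in practice; the polynomial case is used here purely because both sides of the identity can be computed.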
However, the choice of the kernel, which is critical to the success of these algorithms, is typically left to the user. To limit the risk of a poor choice, a number of publications over the last decade or so have investigated the idea of learning the kernel from data. Rather than asking the user to commit to a specific kernel, which may not be optimal, particularly when the user's prior knowledge about the task is poor, learning kernel methods require the user only to supply a family of kernels. The task of selecting (or learning) a kernel from that family is then left to the learning algorithm which, as with standard kernel-based methods, must also use the data to choose a hypothesis in the reproducing kernel Hilbert space (RKHS) associated with the selected kernel.
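As a rough illustration of what "supplying a family of kernels" can look like, the sketch below takes the family to be convex combinations of a few Gaussian base kernels and picks the combination weights with a simple alignment heuristic: each weight is proportional to the centered alignment between the base kernel matrix and the label matrix yy^T. This is only one possible selection rule, used here as an assumption for illustration; it is not presented as the method this tutorial advocates. All function names, the bandwidth values, and the toy data are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, gamma):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K, y):
    """Centered alignment between K and the label kernel y y^T."""
    Kc = center(K)
    Yc = center(np.outer(y, y))
    denom = np.linalg.norm(Kc) * np.linalg.norm(Yc)
    return 0.0 if denom == 0.0 else float(np.sum(Kc * Yc) / denom)

def learn_kernel_weights(kernels, y):
    """Weights proportional to (non-negative) alignment, normalized to sum to 1."""
    scores = np.array([max(alignment(K, y), 0.0) for K in kernels])
    if scores.sum() == 0.0:
        # Fall back to the uniform combination if no base kernel aligns.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / scores.sum()

# Toy data (hypothetical): labels depend only on the sign of the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)

# The user supplies a family: convex combinations of these base kernels.
gammas = [0.01, 0.1, 1.0, 10.0]
base_kernels = [gaussian_kernel(X, g) for g in gammas]

# The algorithm selects a member of the family from the data.
mu = learn_kernel_weights(base_kernels, y)
K_learned = sum(m * K for m, K in zip(mu, base_kernels))
print(dict(zip(gammas, np.round(mu, 3))))
```

The matrix `K_learned` is again PDS, since a non-negative combination of PDS kernels is PDS, so it can be passed to any kernel-based learner that accepts a precomputed kernel matrix in order to choose the hypothesis.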
This tutorial describes the main theoretical, algorithmic, and empirical results on learning kernels obtained in the last decade, including recent progress on all of these aspects in the last few years. It will also introduce the audience to software libraries and packages that implement several of the most effective learning kernel algorithms, and show how to use these algorithms in applications to improve performance.
Learning kernels is a fundamental topic for kernel methods and for machine learning in general. The question of how to select an appropriate kernel has been raised since the beginning of kernel methods, in particular for SVMs. Significant improvements in this area will both reduce the demands placed on users when applying machine learning techniques and help achieve better performance. Additionally, the methods used for learning kernels, including the formulation and solution of the optimization problems, the algorithms, and the theoretical insights, can be useful in other areas of machine learning, such as learning problems with data-dependent hypotheses, feature selection or feature reweighting, distance learning, transfer learning, and many others. Finally, there are many interesting research questions in this area that have not yet been sufficiently explored. This tutorial will provide a convenient introduction to both standard and advanced material in this area, which will help interested researchers investigate these questions.