
Other ICML 2012 Tutorials
Probabilistic Topic Models; Representation Learning; Mirror Descent and Saddle Point First Order Algorithms

Performance Evaluation for Learning Algorithms: Techniques, Application and Issues
The purpose of the tutorial is to promote an appreciation of the need for rigorous and objective evaluation and an understanding of the available alternatives along with their assumptions, constraints and context of application. Machine learning researchers and practitioners alike will benefit from the tutorial, which discusses the need for sound evaluation strategies and presents practical approaches and tools for evaluation that go well beyond those described in existing machine learning and data mining textbooks.

Spectral Approaches to Learning Latent Variable Models
Examples of popular latent variable models include latent tree graphical models and dynamical system models, both of which occupy a fundamental place in engineering, control theory, economics as well as the physical, biological, and social sciences. Unfortunately, to discover the right latent state representation and model parameters, we must solve difficult structural and temporal credit assignment problems. Work on learning latent variable structure has predominantly relied on likelihood maximization and local search heuristics such as expectation maximization (EM); these heuristics often lead to a search space with a host of bad local optima, and may therefore require impractically many restarts to reach a prescribed training precision.
This tutorial will focus on a recently discovered class of spectral learning algorithms. These algorithms hold the promise of overcoming these problems and enabling learning of latent structure in tree and dynamical system models. Unlike the EM algorithm, spectral methods are computationally efficient, statistically consistent, and have no local optima; in addition, they can be simple to implement, and have state-of-the-art practical performance for many interesting learning problems.
We will describe the main theoretical, algorithmic, and empirical results related to spectral learning algorithms, starting with an overview of linear system identification results obtained in the last two decades, and then focusing on the remarkable recent progress in learning nonlinear dynamical systems, latent tree graphical models, and kernel-based nonparametric models.

PAC-Bayesian Analysis in Supervised, Unsupervised, and Reinforcement Learning
PAC-Bayesian analysis is a basic and very general tool for data-dependent analysis in machine learning. By now, it has been applied in such diverse areas as supervised learning, unsupervised learning, and reinforcement learning, leading to state-of-the-art algorithms and accompanying generalization bounds. PAC-Bayesian analysis, in a sense, takes the best out of Bayesian methods and PAC learning and puts it together: (1) it provides an easy way to exploit prior knowledge (like Bayesian methods); (2) it provides strict and explicit generalization guarantees (like VC theory); and (3) it is data-dependent and provides an easy and strict way of exploiting benign conditions (like Rademacher complexities). In addition, PAC-Bayesian bounds directly lead to efficient learning algorithms.
We will start with a general introduction to PAC-Bayesian analysis, which should be accessible to an average student who is familiar with machine learning at a basic level. Then, we will survey multiple forms of PAC-Bayesian bounds and their numerous applications in different fields (including supervised and unsupervised learning, finite and continuous domains, and the very recent extension to martingales and reinforcement learning). Some of these applications will be explained in more detail, while others will be surveyed at a high level. We will also describe the relations and distinctions between PAC-Bayesian analysis, Bayesian learning, VC theory, and Rademacher complexities. We will discuss the role, value, and shortcomings of frequentist bounds that are inspired by Bayesian analysis.
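To make the shape of such a bound concrete, here is a minimal sketch that evaluates one commonly cited McAllester-style form for losses in [0, 1]: with probability at least 1 - delta, the expected loss of a posterior is at most its empirical loss plus sqrt((KL(posterior || prior) + ln(2 sqrt(n) / delta)) / (2n)). This particular form and its constants are an assumption chosen for illustration and are not taken from the tutorial; constants differ between papers, and the helper names `kl_divergence` and `pac_bayes_bound` are hypothetical.

```python
import math


def kl_divergence(posterior, prior):
    """KL(posterior || prior) for discrete distributions over the same
    finite hypothesis set. Assumes the prior is positive wherever the
    posterior is positive."""
    return sum(p * math.log(p / q) for p, q in zip(posterior, prior) if p > 0)


def pac_bayes_bound(emp_loss, kl, n, delta=0.05):
    """Upper bound on the expected loss of the randomized classifier.

    emp_loss: posterior-averaged empirical loss, in [0, 1]
    kl:       KL(posterior || prior)
    n:        number of training samples
    delta:    allowed failure probability
    (Illustrative form only; see the lead-in for the assumptions.)
    """
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return emp_loss + math.sqrt(complexity)


# Hypothetical example: four hypotheses, uniform prior, and a posterior
# that concentrates on the first hypothesis.
prior = [0.25, 0.25, 0.25, 0.25]
posterior = [0.7, 0.1, 0.1, 0.1]
kl = kl_divergence(posterior, prior)
for n in (100, 1000, 10000):
    print(n, pac_bayes_bound(emp_loss=0.1, kl=kl, n=n))
```

The sketch shows the trade-off the abstract alludes to: a posterior that moves far from the prior pays a larger KL term, and that penalty shrinks as the sample size grows.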

ICML 2012 Tutorial on Prediction, Belief, and Markets
Prediction markets are financial markets designed to aggregate opinions across large populations of traders. A typical prediction market offers a set of securities with payoffs determined by the future state of the world. For example, a market might offer a security worth $1 if Barack Obama is reelected in 2012 and $0 otherwise. Roughly speaking, a trader who believes the probability of Obama's reelection is p should be willing to buy this security at any price less than $p and (short) sell this security at any price greater than $p. For this reason, the going price of this security could be interpreted as traders' collective belief about the likelihood of Obama's reelection. Prediction markets have been used to generate accurate forecasts in a variety of domains including politics, disease surveillance, business, and entertainment, and are cited in the media increasingly often.
This tutorial will cover some of the basic mathematical ideas used in the design of prediction markets, and illustrate several fundamental connections between these ideas and techniques used in machine learning. We will begin with an overview of proper scoring rules, which can be used to measure the accuracy of a single entity's prediction, and are closely related to proper loss functions. We will then discuss market scoring rules, automated market makers based on proper scoring rules which can be used to aggregate the predictions of many forecasters, and describe how market scoring rules can be implemented as inventory-based markets in which securities are bought and sold. We will describe recent research exploring a duality-based interpretation of market scoring rules which can be exploited to design new markets that can be run efficiently over very large state spaces. Finally, we will explore the fundamental mathematical connections between market scoring rules and two areas of machine learning: online "no-regret" learning and variational inference with exponential families.
This tutorial will be self-contained. No background on markets or specific areas of machine learning is required.
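As an illustration of an inventory-based market maker, the sketch below implements the logarithmic market scoring rule (LMSR), a well-known example of a market scoring rule. The abstract does not name a specific rule, so the choice of LMSR, the liquidity parameter `b`, and the function names are assumptions made for this example. The market maker tracks the number of outstanding shares `q` of each security, charges traders the change in the cost function C(q) = b log(sum_i exp(q_i / b)), and quotes prices that can be read as probabilities.

```python
import math


def lmsr_cost(q, b=100.0):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)).

    q: outstanding shares per outcome; b: liquidity parameter (assumed).
    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = max(q)
    return m + b * math.log(sum(math.exp((qi - m) / b) for qi in q))


def lmsr_prices(q, b=100.0):
    """Instantaneous prices: the gradient of the cost function.
    They are positive and sum to 1."""
    m = max(q)
    weights = [math.exp((qi - m) / b) for qi in q]
    total = sum(weights)
    return [w / total for w in weights]


def trade(q, delta, b=100.0):
    """Return (amount charged, new share vector) for buying `delta` shares."""
    new_q = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(new_q, b) - lmsr_cost(q, b), new_q


# Two-outcome market, e.g. "candidate is reelected" versus "is not".
q = [0.0, 0.0]
print("initial prices:", lmsr_prices(q))
charge, q = trade(q, [50.0, 0.0])  # a trader buys 50 shares of outcome 0
print("charged:", charge)
print("new prices:", lmsr_prices(q))
```

Buying shares of an outcome pushes its price up, so the quoted price moves toward the beliefs of the traders who are willing to pay for it. A smaller `b` makes prices react more strongly to each trade.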

Tutorial on Causal inference - conditional independences and beyond
Machine learning has traditionally been focused on prediction. Given observations that have been generated by an unknown stochastic dependency, the goal is to infer a law that will be able to correctly predict future observations generated by the same dependency. In contrast to this goal, causal inference tries to infer the causal structure underlying the observed dependencies. More precisely, one tries to infer the behavior of a system under interventions without performing them, which does not fit into any traditional prediction scenario. Apart from the fact that it is still debated whether this is possible at all, it is a priori not clear, given that it is, why machine learning tools should be helpful for this task.
Since the Eighties there has been a community of researchers, mostly from statistics, philosophy, and computer science who have developed methods aiming at inferring causal relationships from observational data. The pioneering work of Glymour, Scheines, Spirtes, and Pearl describes assumptions that link conditional statistical dependences to causality, which then renders many causal inference problems solvable. The typical task, which is solved by the corresponding algorithms, reads: given observations from the joint distribution on the variables X_{1},...,X_{n} with n ≥ 3, infer the causal directed acyclic graph (or parts of it).
Recently, this work has been complemented by several researchers from machine learning who described methods that do not rely on conditional independences alone, but employ other properties of joint probability distributions. These methods use established tools of machine learning like, for instance, regression and reproducing kernel Hilbert spaces. In contrast to the above approaches, the causal direction can sometimes be inferred when only two variables are observed. Remarkably, this can be helpful for more traditional machine learning tasks like prediction under changing background conditions, because the task has different solutions depending on whether the predicted variable is the cause or the effect.

Outline
 Introductory remarks: causal dependences versus statistical dependences
 Independence based causal inference: assumptions/algorithms/limitations
 New inference principles via restricting model classes
 Foundation of new inference rules by algorithmic information theory
 How machine learning can benefit from causal inference

Tutorial on Statistical Learning Theory in Reinforcement Learning and Approximate Dynamic Programming
Many interesting sequential decision-making tasks can be formulated as reinforcement learning (RL) problems. In an RL problem, an agent interacts with a dynamic, stochastic, and incompletely known environment, with the goal of finding an action-selection strategy, or policy, to maximize some measure of its long-term performance. Dynamic programming (DP) algorithms are the most powerful tools to solve an RL problem, i.e., to find an optimal policy. However, these algorithms are guaranteed to find an optimal policy only if the environment (i.e., the dynamics and the rewards) is completely known and the state and action spaces are not too large. When one of these conditions is violated (e.g., the only information about the environment is in the form of samples of transitions and rewards), approximate algorithms are needed, and thus DP methods turn into approximate dynamic programming (ADP) and RL algorithms (while the term RL is more often used in the AI and machine learning community, ADP is more common in the field of operations research). In this case, the convergence and performance guarantees of the standard DP algorithms are no longer valid, and the main theoretical challenge is to study the performance of ADP and RL algorithms.
Statistical learning theory (SLT) has been fundamental in understanding the statistical properties of the algorithms developed in machine learning. In particular, SLT has explained the interaction between the process generating the samples and the hypothesis space used by the learning algorithm, and shown when and how well classification and regression problems can be solved. These results also proved to be particularly useful in dimensioning the machine learning problems (i.e., number of samples, complexity of the hypothesis space) and tuning the parameters of the algorithms (e.g., the regularizer in regularized methods).
In recent years, SLT tools have been used to study the performance of batch versions of RL and ADP algorithms (in the batch version of RL and ADP, as opposed to incremental or online versions of these algorithms, a sampling policy is used to build a training set for the learning algorithm), with the objective of deriving finite-sample bounds on the performance loss (w.r.t. the optimal policy) of the policy learned by these methods. Such an objective requires effectively combining SLT tools with the ADP algorithms and showing how the error propagates through the iterations of these iterative algorithms.
The main objective of this tutorial is to strengthen the links between ADP and SLT. More specifically, our goal is twofold: 1) to raise the awareness of the RL and ADP communities with regard to the potential benefits of the theoretical analysis of their algorithms for the advancement of these fields, and 2) to introduce the potential theoretical problems in sequential decision-making to the researchers in the field of SLT. An introduction to ADP methods will be given, tools from SLT needed in the analysis of these methods will be introduced, followed by an overview of the existing results. The properties of these results and how they may help us in tuning the algorithms will be discussed, the results will be compared with those in supervised learning, and the potential open problems in this area will be highlighted.
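For readers who have not seen exact dynamic programming, the following minimal sketch runs value iteration on a tiny, fully known Markov decision process. It illustrates the setting in which DP is guaranteed to find an optimal policy, before any approximation is introduced. The two-state environment, the discount factor, and all names here are hypothetical and chosen only for illustration; they do not come from the tutorial.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Exact value iteration for a small, fully known MDP.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for taking action a in state s.
    Returns the value function and a greedy policy with respect to it.
    """
    n_states = len(P)
    V = [0.0] * n_states

    def q_value(s, a, values):
        # One-step lookahead: reward plus discounted expected next value.
        return R[s][a] + gamma * sum(p * values[s2] for p, s2 in P[s][a])

    while True:
        new_V = [max(q_value(s, a, V) for a in range(len(P[s])))
                 for s in range(n_states)]
        change = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if change < tol:
            break

    policy = [max(range(len(P[s])), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 stays in place, action 1 moves to the
# other state. Only staying in state 1 earns a reward.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # transitions from state 0
    [[(1.0, 1)], [(1.0, 0)]],  # transitions from state 1
]
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]
V, policy = value_iteration(P, R)
print("values:", V)
print("greedy policy:", policy)
```

When the transition probabilities and rewards are available only through samples, the maximization and expectation above must be estimated from data, which is where the finite-sample analysis discussed in this tutorial comes in.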

Object, functional and structured data: towards next generation kernel-based methods - ICML 2012 Workshop
This workshop concerns analysis and prediction of complex data such as objects, functions and structures. It aims to discuss various ways to extend machine learning and statistical inference to these data and especially to the prediction of complex outputs. Special attention will be paid to operator-valued kernels and tools for prediction in infinite-dimensional spaces.

Inferning 2012: ICML Workshop on interaction between Inference and Learning
This workshop studies the interactions between algorithms that learn a model, and algorithms that use the resulting model parameters for inference. These interactions are studied from two perspectives.
The first perspective studies how the choice of an inference algorithm influences the parameters the model ultimately learns. For example, many parameter estimation algorithms require inference as a subroutine. Consequently, when we are faced with models for which exact inference is expensive, we must use an approximation instead: MCMC sampling, belief propagation, beamsearch, etc. On some problems these approximations yield superior models, yet on others, they fail catastrophically. We invite studies that analyze (both empirically and theoretically) the impact of approximate inference on model learning. How does approximate inference alter the learning objective? Affect generalization? Influence convergence properties? Further, does the behavior of inference change as learning continues to improve the quality of the model?
A second perspective from which we study these interactions is by considering how the learning objective and model parameters can impact both the quality and performance of inference during “test time.” These unconventional approaches to learning combine generalization to unseen data with other desiderata such as fast inference. For example, work in structured cascades learns models for which greedy, efficient inference can be performed at test time while still maintaining accuracy guarantees. Similarly, there has been work that learns operators for efficient search-based inference. There has also been work that incorporates resource constraints on running time and memory into the learning objective.
This workshop brings together practitioners from different fields (information extraction, machine vision, natural language processing, computational biology, etc.) in order to study a unified framework for understanding and formalizing the interactions between learning and inference. The following is a partial list of relevant keywords for the workshop:
 learning with approximate inference
 cost-aware learning
 learning sparse structures
 pseudolikelihood training
 contrastive divergence
 piecewise training
 coarse to fine learning and inference
 score matching
 stochastic approximation
 incremental gradient methods and more ...