Tutorial on Statistical Learning Theory in Reinforcement Learning and Approximate Dynamic Programming
Many interesting sequential decision-making tasks can be formulated as reinforcement learning (RL) problems. In an RL problem, an agent interacts with a dynamic, stochastic, and incompletely known environment, with the goal of finding an action-selection strategy, or policy, that maximizes some measure of its long-term performance. Dynamic programming (DP) algorithms are the most powerful tools for solving an RL problem, i.e., for finding an optimal policy. However, these algorithms are guaranteed to find an optimal policy only if the environment (i.e., the dynamics and the rewards) is completely known and the state and action spaces are not too large. When one of these conditions is violated (e.g., the only information about the environment takes the form of samples of transitions and rewards), approximate algorithms are needed, and DP methods turn into approximate dynamic programming (ADP) and RL algorithms (the term RL is more common in the AI and machine learning communities, while ADP is more common in operations research). In this case, the convergence and performance guarantees of the standard DP algorithms are no longer valid, and the main theoretical challenge is to study the performance of ADP and RL algorithms.
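To make the exact-DP setting concrete, here is a minimal sketch of value iteration on a toy problem where the dynamics and rewards are fully known. The function name `value_iteration`, the data layout of `P` and `R`, and the two-state example are illustrative assumptions, not material from the tutorial itself.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Exact value iteration for a small, fully known MDP.

    P[s][a] is a list of (probability, next_state) pairs and
    R[s][a] is the expected immediate reward (assumed layout).
    """
    n_states = len(P)

    def q_value(V, s, a):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    V = [0.0] * n_states
    for _ in range(max_iter):
        new_V = [
            max(q_value(V, s, a) for a in range(len(P[s])))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [
        max(range(len(P[s])), key=lambda a: q_value(V, s, a))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical two-state, two-action MDP used only for illustration.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0: action 0 stays, action 1 moves to state 1
    [[(1.0, 1)], [(1.0, 0)]],  # state 1: action 0 stays, action 1 moves to state 0
]
R = [
    [0.0, 1.0],
    [2.0, 0.0],
]

if __name__ == "__main__":
    V, policy = value_iteration(P, R)
    print(V, policy)
```

The sketch needs the full transition table `P` and reward table `R` and sweeps over every state, which is exactly what becomes infeasible when the environment is only accessible through samples or the state space is large.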
Statistical learning theory (SLT) has been fundamental in understanding the statistical properties of the algorithms developed in machine learning. In particular, SLT has explained the interaction between the process generating the samples and the hypothesis space used by the learning algorithm, and has shown when and how well classification and regression problems can be solved. These results have also proved particularly useful in dimensioning machine learning problems (i.e., number of samples, complexity of the hypothesis space) and in tuning the parameters of the algorithms (e.g., the regularizer in regularized methods).
In recent years, SLT tools have been used to study the performance of batch versions of RL and ADP algorithms (in the batch versions, as opposed to the incremental or online versions, a sampling policy is used to build a training set for the learning algorithm), with the objective of deriving finite-sample bounds on the performance loss (w.r.t. the optimal policy) of the policy learned by these methods. Such an objective requires effectively combining SLT tools with the ADP algorithms and showing how the error propagates through the iterations of these iterative algorithms.
The main objective of this tutorial is to strengthen the links between ADP and SLT. More specifically, our goal is two-fold: 1) to raise the awareness of the RL and ADP communities with regard to the potential benefits of the theoretical analysis of their algorithms for the advancement of these fields, and 2) to introduce the potential theoretical problems in sequential decision-making to the researchers in the field of SLT. An introduction to ADP methods will be given, tools from SLT needed in the analysis of these methods will be introduced, followed by an overview of the existing results. The properties of these results and how they may help us in tuning the algorithms will be discussed, the results will be compared with those in supervised learning, and the potential open problems in this area will be highlighted.
Object, functional and structured data: towards next generation kernel-based methods - ICML 2012 Workshop
This workshop concerns the analysis and prediction of complex data such as objects, functions and structures. It aims to discuss various ways to extend machine learning and statistical inference to these data, and especially to the prediction of complex outputs. Special attention will be paid to operator-valued kernels and tools for prediction in infinite-dimensional spaces.
Inferning 2012: ICML Workshop on interaction between Inference and Learning
This workshop studies the interactions between algorithms that learn a model, and algorithms that use the resulting model parameters for inference. These interactions are studied from two perspectives.
The first perspective studies how the choice of an inference algorithm influences the parameters the model ultimately learns. For example, many parameter estimation algorithms require inference as a subroutine. Consequently, when we are faced with models for which exact inference is expensive, we must use an approximation instead: MCMC sampling, belief propagation, beam-search, etc. On some problems these approximations yield superior models, yet on others, they fail catastrophically. We invite studies that analyze (both empirically and theoretically) the impact of approximate inference on model learning. How does approximate inference alter the learning objective? Affect generalization? Influence convergence properties? Further, does the behavior of inference change as learning continues to improve the quality of the model?
A second perspective from which we study these interactions considers how the learning objective and model parameters can impact both the quality and performance of inference at “test time.” These unconventional approaches to learning combine generalization to unseen data with other desiderata such as fast inference. For example, work in structured cascades learns models for which greedy, efficient inference can be performed at test time while still maintaining accuracy guarantees. Similarly, there has been work that learns operators for efficient search-based inference. There has also been work that incorporates resource constraints on running time and memory into the learning objective.
This workshop brings together practitioners from different fields (information extraction, machine vision, natural language processing, computational biology, etc.) in order to study a unified framework for understanding and formalizing the interactions between learning and inference. The following is a partial list of relevant keywords for the workshop:
- learning with approximate inference
- cost-aware learning
- learning sparse structures
- pseudo-likelihood training
- contrastive divergence
- piecewise training
- coarse to fine learning and inference
- score matching
- stochastic approximation
- incremental gradient methods and more ...
ICML 2012 Workshop on Representation Learning
In this workshop we consider the question of how we can learn meaningful and useful representations of the data. There has been a great deal of recent work on this topic, much of it emerging from researchers interested in training deep architectures. Deep learning methods such as deep belief networks, sparse coding-based methods, convolutional networks, and deep Boltzmann machines have shown promise as a means of learning invariant representations of data and have already been successfully applied to a variety of tasks in computer vision, audio processing, natural language processing, information retrieval, and robotics. Bayesian nonparametric methods and other hierarchical graphical model-based approaches have also recently been shown to learn rich representations of data.
By bringing together researchers with diverse expertise and perspectives but who are all interested in the question of how to learn data representations, we will explore the challenges and promising directions for future research in this area.
ICML 2012 Workshop on New Challenges for Exploration & Exploitation 3
The goal of this challenge is to build an algorithm that efficiently learns a policy for serving news articles on a web site. At each iteration of the evaluation process, you will be asked to pick an article from a list, given a visitor (136 binary features + a timestamp). To build a smart algorithm, you might want to carefully balance exploration and exploitation and pay close attention to the “age” of the news articles (among other things, of course). A quick look at the leaderboard is enough to figure out why that last point matters. Algorithms are ranked on the leaderboard by their overall CTR (click-through rate).
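As a rough illustration of the exploration/exploitation trade-off in this setting, the sketch below implements a plain epsilon-greedy article picker driven by per-article CTR estimates. It is not the challenge's API or a recommended entry: the class name `EpsilonGreedyCTR` and its methods are hypothetical, and the sketch deliberately ignores the visitor features and the age of the articles.

```python
import random


class EpsilonGreedyCTR:
    """Epsilon-greedy selection over articles (hypothetical interface)."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.clicks = {}  # article id -> number of clicks observed
        self.shows = {}   # article id -> number of times displayed

    def estimate(self, article):
        # Empirical click-through rate; 0.0 for articles never displayed.
        shows = self.shows.get(article, 0)
        return self.clicks.get(article, 0) / shows if shows else 0.0

    def select(self, candidates):
        # Show every article at least once before trusting the estimates.
        for article in candidates:
            if self.shows.get(article, 0) == 0:
                return article
        # Explore: pick uniformly at random with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)
        # Exploit: pick the article with the best estimated CTR.
        return max(candidates, key=self.estimate)

    def update(self, article, clicked):
        self.shows[article] = self.shows.get(article, 0) + 1
        self.clicks[article] = self.clicks.get(article, 0) + int(clicked)


if __name__ == "__main__":
    picker = EpsilonGreedyCTR(epsilon=0.1)
    article = picker.select(["a", "b", "c"])
    picker.update(article, clicked=True)
```

A stronger entry would presumably condition on the 136 visitor features and discount stale articles; the sketch only shows the basic loop of selecting, observing a click, and updating.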
Semantic Perception and Mapping for Knowledge-enabled Service Robotics
Consider a robot that is to act as a household assistant in an unknown kitchen environment. This robot has to acquire and use knowledge about where the task-relevant objects, such as the dishwasher and the oven, are and how the robot can act on them. The recent advent of smart devices (e.g. smartphones) and high-quality, low-cost sensors (e.g. Kinect) provides us with affordable resources for the robot, which link sensory information to the robot's knowledge base and high-level deliberative components. Resources like this allow general-purpose service robots to, e.g., query information from the World Wide Web, seek help from remote experts through shared autonomy interfaces, and act independently and safely in human living environments.
In this hands-on workshop we will identify key problems and solutions by narrowing down the definition of semantics. We will discuss what the representative end world model resulting from semantic mapping is, single out the optimal sensors, consider static vs. dynamic aspects of environment modeling, and finally address lifelong learning in order to leverage not only the sensor data but also human living patterns and behaviors. The workshop will feature excellent talks from researchers from academia as well as industry, live demonstrations, a poster session, and a working session with the aim of standardizing some fundamental concepts in semantic mapping. We plan to build upon the series of related events at previous IROS, ICRA and RSS conferences.