TechTalks from event: ICML 2011

Tutorial: Introduction to Bandits: Algorithms and Theory

  • Tutorial: Introduction to Bandits: Algorithms and Theory Authors: Jean-Yves Audibert, Rémi Munos
    This tutorial is intended as an introduction to stochastic and adversarial multi-armed bandit algorithms and a survey of some of the recent advances. In the multi-armed bandit problem, at each stage an agent (or decision maker) chooses one action (or arm) and receives a reward from it. The agent aims at maximizing its rewards. Since it does not know the process generating the rewards, it needs to explore (try) the different actions and yet exploit (concentrate its draws on) the seemingly most rewarding arms (a minimal sketch of one such strategy appears after this list). The bandit problem has become increasingly popular in the machine learning community. It is the simplest setting in which one encounters the exploration-exploitation dilemma. It has a wide range of applications, including advertisement [1, 6], economics [2, 12], games [7], optimization [10, 5, 9, 3], model selection, and machine learning algorithms themselves [13, 4]. It can be a central building block of larger systems, as in evolutionary programming [8] and reinforcement learning [14], in particular in large state space Markov decision problems [11].
  • Tutorial: Introduction to Bandits: Algorithms and Theory - Part 2 Authors: Jean-Yves Audibert, Rémi Munos
    Continuation of Part 1
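
The exploration-exploitation trade-off described in the abstract is made concrete by upper-confidence-bound (UCB) strategies, a standard family of stochastic bandit algorithms of the kind this tutorial surveys. Below is a minimal Python sketch of UCB1; the Bernoulli reward probabilities, horizon, and names used here are illustrative assumptions, not material taken from the tutorial.

  # Minimal sketch of UCB1 on Bernoulli arms (illustrative, not the tutorial's code).
  import math
  import random

  def ucb1(arm_means, horizon, seed=0):
      """Run UCB1 against Bernoulli arms; return total reward and pull counts."""
      rng = random.Random(seed)
      n_arms = len(arm_means)
      counts = [0] * n_arms   # number of times each arm was pulled
      sums = [0.0] * n_arms   # cumulative reward of each arm
      total = 0.0
      for t in range(1, horizon + 1):
          if t <= n_arms:
              arm = t - 1  # pull every arm once to initialize the estimates
          else:
              # Exploit the empirical mean, plus an exploration bonus that
              # shrinks as an arm is pulled more often.
              arm = max(
                  range(n_arms),
                  key=lambda a: sums[a] / counts[a]
                  + math.sqrt(2.0 * math.log(t) / counts[a]),
              )
          reward = 1.0 if rng.random() < arm_means[arm] else 0.0
          counts[arm] += 1
          sums[arm] += reward
          total += reward
      return total, counts

  if __name__ == "__main__":
      # Hypothetical arm success probabilities; the best arm is the last one.
      total, counts = ucb1([0.2, 0.5, 0.7], horizon=10000)
      print("total reward:", total)
      print("pull counts:", counts)  # most pulls should concentrate on the 0.7 arm

Running the sketch, the pull counts concentrate on the arm with the highest mean while the other arms are still sampled occasionally, which is exactly the explore-yet-exploit behavior the abstract describes.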