TechTalks from event: Conference on Learning Theory

  • The Best of Both Worlds: Stochastic and Adversarial Bandits Authors: Sébastien Bubeck and Aleksandrs Slivkins
    We present a new bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is (essentially) optimal both for adversarial rewards and for stochastic rewards. Specifically, SAO combines the O(√n) worst-case regret of Exp3 (Auer et al., 2002b) and the (poly)logarithmic regret of UCB1 (Auer et al., 2002a) for stochastic rewards. Adversarial rewards and stochastic rewards are the two main settings in the literature on multi-armed bandits (MAB). Prior work on MAB treats them separately and does not attempt to jointly optimize for both. This result falls into the general agenda of designing algorithms that combine optimal worst-case performance with improved guarantees for "nice" problem instances.
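    To make the stochastic side of this setting concrete, here is a minimal sketch of the UCB1 baseline mentioned in the abstract (the function name and the Bernoulli test instance are illustrative, not from the paper): each arm is played once, then the arm maximizing empirical mean plus a confidence radius is pulled.

    ```python
    import math
    import random

    def ucb1(reward_fns, horizon, seed=0):
        """Minimal UCB1 sketch: play each arm once, then pull the arm
        maximizing empirical mean + sqrt(2 ln t / pulls)."""
        rng = random.Random(seed)
        k = len(reward_fns)
        counts = [0] * k     # pulls per arm
        sums = [0.0] * k     # cumulative reward per arm
        total = 0.0
        for t in range(1, horizon + 1):
            if t <= k:
                arm = t - 1  # initialization round: each arm once
            else:
                arm = max(range(k),
                          key=lambda i: sums[i] / counts[i]
                          + math.sqrt(2 * math.log(t) / counts[i]))
            r = reward_fns[arm](rng)
            counts[arm] += 1
            sums[arm] += r
            total += r
        return total, counts

    # hypothetical stochastic instance: Bernoulli arms with means 0.9 and 0.5
    arms = [lambda rng: float(rng.random() < 0.9),
            lambda rng: float(rng.random() < 0.5)]
    total, counts = ucb1(arms, horizon=2000)
    ```

    On a stochastic instance like this, UCB1 concentrates its pulls on the better arm, which is the logarithmic-regret behavior SAO aims to preserve while also guarding against adversarial rewards.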
  • Open Problem: Regret Bounds for Thompson Sampling Authors: Lihong Li and Olivier Chapelle
  • Open Problem: Better Bounds for Online Logistic Regression Authors: H. Brendan McMahan and Matthew Streeter
    Known algorithms applied to online logistic regression on a feasible set of L2 diameter D achieve regret bounds like O(e^D log T) in one dimension, but we show a bound of O(√D + log T) is possible in a binary 1-dimensional problem. Thus, we pose the following question: Is it possible to achieve a regret bound for online logistic regression that is O(poly(D) log(T))? Even if this is not possible in general, it would be interesting to have a bound that reduces to our bound in the one-dimensional case.
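    For readers unfamiliar with the setting, here is a minimal sketch of the one-dimensional problem: online gradient descent on the logistic loss with projection onto the diameter-D interval. The function name, step size, and toy data stream are illustrative assumptions; this is the kind of baseline the open problem asks to beat, not the conjectured improvement.

    ```python
    import math

    def online_logistic_1d(examples, D=1.0, eta=0.1):
        """1-d online gradient descent on logistic loss
        loss(w; x, y) = log(1 + exp(-y*w*x)), projected onto [-D, D]."""
        w = 0.0
        losses = []
        for x, y in examples:                      # labels y in {-1, +1}
            losses.append(math.log1p(math.exp(-y * w * x)))
            grad = -y * x / (1.0 + math.exp(y * w * x))
            w -= eta * grad
            w = max(-D, min(D, w))                 # projection step
        return w, sum(losses)

    # hypothetical binary 1-d stream: label agrees with the sign of x
    data = [(1.0, 1), (-1.0, -1)] * 50
    w, cum_loss = online_logistic_1d(data)
    ```

    The regret in question is this cumulative loss minus the loss of the best fixed w in [-D, D] in hindsight.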
  • Open Problem: Learning Dynamic Network Models from a Static Snapshot Authors: Jan Ramon and Constantin Comendant
    In this paper we consider the problem of learning a graph generating process given the evolving graph at a single point in time. Given a graph of sufficient size, can we learn the (repeatable) process that generated it? We formalize the generic problem and then consider two simple instances which are variations on the well-known graph generation models by Erdős-Rényi and Albert-Barabási.
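    As a point of reference for the first of those models, here is a minimal sketch of the classic G(n, p) Erdős-Rényi generator (the function name is mine; the abstract's variations on the model are not reproduced here): each of the n-choose-2 possible edges is included independently with probability p.

    ```python
    import random
    from itertools import combinations

    def erdos_renyi(n, p, seed=0):
        """G(n, p) sketch: include each possible edge independently
        with probability p."""
        rng = random.Random(seed)
        return [(u, v) for u, v in combinations(range(n), 2)
                if rng.random() < p]

    edges = erdos_renyi(100, 0.1)   # expected edge count ~ 0.1 * 4950 = 495
    ```

    The open problem asks the inverse question: given one snapshot such as this edge list, recover the parameters of the process that produced it.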
  • Open Problem: Does AdaBoost Always Cycle? Authors: Cynthia Rudin, Robert E. Schapire, and Ingrid Daubechies
    We pose the question of whether the distributions computed by AdaBoost always converge to a cycle.
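    To make the object of the question concrete, here is a minimal sketch of the distribution dynamics being asked about, under simplifying assumptions of my own (a fixed finite pool of weak hypotheses encoded by their correct/incorrect signs, greedy selection of the hypothesis with the largest edge): each round reweights the examples and renormalizes, and the question is whether the sequence of distributions eventually cycles.

    ```python
    import math

    def adaboost_distribution_updates(margins, rounds):
        """Track AdaBoost's distribution over m examples.
        margins[h][i] is +1 if weak hypothesis h classifies example i
        correctly, -1 otherwise."""
        m = len(margins[0])
        d = [1.0 / m] * m                          # start uniform
        history = []
        for _ in range(rounds):
            # greedily pick the weak hypothesis with the largest edge
            h = max(range(len(margins)),
                    key=lambda j: sum(di * mi
                                      for di, mi in zip(d, margins[j])))
            edge = sum(di * mi for di, mi in zip(d, margins[h]))
            alpha = 0.5 * math.log((1 + edge) / (1 - edge))
            # exponential reweighting, then renormalize
            d = [di * math.exp(-alpha * mi)
                 for di, mi in zip(d, margins[h])]
            z = sum(d)
            d = [di / z for di in d]
            history.append(tuple(round(di, 6) for di in d))
        return history

    # tiny illustrative instance: 3 examples, 2 weak hypotheses
    H = [[1, 1, -1], [-1, 1, 1]]
    hist = adaboost_distribution_updates(H, rounds=20)
    ```

    Inspecting `hist` on small instances is one way to probe the conjecture empirically, though of course no finite run settles it.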
  • Open Problem: Is Averaging Needed for Strongly Convex Stochastic Gradient Descent? Authors:
    Stochastic gradient descent (SGD) is a simple and very popular iterative method for solving stochastic optimization problems that arise in machine learning. A common practice is to return the average of the SGD iterates. While the utility of this is well understood for general convex problems, the situation is much less clear for strongly convex problems (such as SVM training). Although the standard analysis in the strongly convex case requires averaging, it was recently shown that this actually degrades the convergence rate, and a better rate is obtainable by averaging just a suffix of the iterates. The question we pose is whether averaging is needed at all to get optimal rates.
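    The three candidate outputs discussed above (last iterate, full average, suffix average) can be sketched as follows on a toy strongly convex objective; the function name, the 1/(λt) step size, and the half-length suffix are illustrative assumptions, not the paper's exact setup.

    ```python
    import random

    def sgd_strongly_convex(grad, T, lam=1.0, seed=0):
        """SGD with step size 1/(lam*t) on a lam-strongly-convex objective.
        Returns the last iterate, the full average of iterates, and the
        average of the last half of the iterates (a 'suffix average')."""
        rng = random.Random(seed)
        w = 0.0
        iterates = []
        for t in range(1, T + 1):
            w -= grad(w, rng) / (lam * t)
            iterates.append(w)
        suffix = iterates[T // 2:]
        return w, sum(iterates) / T, sum(suffix) / len(suffix)

    # noisy gradients of f(w) = 0.5 * (w - 3)**2, minimizer w* = 3
    g = lambda w, rng: (w - 3.0) + rng.gauss(0.0, 1.0)
    last, avg, suffix_avg = sgd_strongly_convex(g, T=10000)
    ```

    All three estimates converge to the minimizer here; the open problem concerns which of them attains the optimal rate in general.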