IEEE FOCS 2010
TechTalks from event: IEEE FOCS 2010
51st Annual IEEE Symposium on Foundations of Computer Science
-
Geometric complexity theory (GCT)Geometric complexity theory (GCT) is an approach towards the fundamental lower bound problems of complexity theory through algebraic geometry and representation theory. This talk will give a brief introduction to GCT without assuming any background in algebraic geometry or representation theory.
-
How to Grow Your Lower BoundsI will survey the state of the art in proving hardness for data structure problems in the cell-probe model of computation.
-
How To Think About Mechanism DesignMechanism design studies optimization problems where the underlying data --- such as the value of a good or the cost of performing a task --- is initially unknown to the algorithm designer. Auction settings are canonical examples, where the private data is the willingness to pay the bidders for the goods on sale, and the optimization problem is to allocate the goods to maximize some objective, such as revenue or overall value to society. A "mechanism" is a protocol that interacts with participants and determines a solution to the underlying optimization problem. We first explain why designing mechanisms with good game-theoretic properties boils down to algorithm design in a certain "constrained computational model." We differentiate between single-parameter problems, where this computational model is well understood, and multi-parameter problems, where it is not. We then study two specific problem domains: revenue-maximizing auctions, and the recent idea of analyzing auctions via a "Bayesian thought experiment"; and welfare-maximizing mechanisms, with an emphasis on black-box reductions that transmute approximation algorithms into truthful approximation mechanisms.
-
Constructive Algorithms for Discrepancy MinimizationGiven a set system (V,S), V=[n] and S={S_1,ldots,S_m}, the minimum discrepancy problem is to find a 2-coloring X:V -> {-1,+1}, such that each set is colored as evenly as possible, i.e. find X to minimize max_{j in [m]} |sum_{i in S_j} X(i)|. In this paper we give the first polynomial time algorithms for discrepancy minimization, that achieve bounds similar to those known existentially using the so-called Entropy Method. We also give a first approximation-like result for discrepancy. Specifically we give efficient randomized algorithms to: 1. Construct an $O(sqrt{n})$ discrepancy coloring for general sets systems when $m=O(n)$, matching the celebrated result of Spencer up to constant factors. Previously, no algorithmic guarantee better than the random coloring bound, i.e. O((n log n)^{1/2}), was known. More generally, for $mgeq n$, we obtain a discrepancy bound of O(n^{1/2} log (2m/n)). 2. Construct a coloring with discrepancy $O(t^{1/2} log n)$, if each element lies in at most $t$ sets. This matches the (non-constructive) result of Srinivasan cite{Sr}. 3. Construct a coloring with discrepancy O(lambdalog(nm)), where lambda is the hereditary discrepancy of the set system. In particular, this implies a logarithmic approximation for hereditary discrepancy. The main idea in our algorithms is to gradually produce a coloring by solving a sequence of semidefinite programs, while using the entropy method to guide the choice of the semidefinite program at each stage.
-
Bounded Independence Fools Degree-2 Threshold FunctionsLet x be a random vector coming from any k-wise independent distribution over {-1,1}^n. For an n-variate degree-2 polynomial p, we prove that E[sgn(p(x))] is determined up to an additive eps for k = poly(1/eps). This gives a large class of explicit pseudo-random generators against such functions and answers an open question of Diakonikolas et al. (FOCS 2009).
In the process, we develop a novel analytic technique we dub multivariate FT-mollification. This provides a generic tool to approximate bounded (multivariate) functions by low-degree polynomials (with respect to several different notions of approximation). A univariate version of the method was introduced by Kane et al. (SODA 2010) in the context of streaming algorithms. In this work, we refine it and generalize it to the multivariate setting. We believe that our technique is of independent mathematical interest. To illustrate its generality, we note that it implies a multidimensional generalization of Jackson's classical result in approximation theory due to (Ganzburg 1979).
To obtain our main result, we combine the FT-mollification technique with several linear algebraic and probabilistic tools. These include the invariance principle of of Mossell, O'Donnell and Oleszkiewicz, anti-concentration bounds for low-degree polynomials, an appropriate decomposition of degree-2 polynomials, and a generalized hyper-contractive inequality for quadratic forms which takes the operator norm of the associated matrix into account. Our analysis is quite modular; it readily adapts to show that intersections of halfspaces and degree-2 threshold functions are fooled by bounded independence. From this it follows that Omega(1/eps^2)-wise independence derandomizes the Goemans-Williamson hyperplane rounding scheme.
Our techniques unify, simplify, and in some cases improve several recent results in the literature concerning threshold functions. For the case of ``regular'' halfspaces we give a simple proof of an optimal independence bound of Th
-
The Coin Problem, and Pseudorandomness for Branching Programs / Pseudorandom Generators for Regular Branching ProgramsThe emph{Coin Problem} is the following problem: a coin is given, which lands on head with probability either $1/2 + eta$ or $1/2 - eta$. We are given the outcome of $n$ independent tosses of this coin, and the goal is to guess which way the coin is biased, and to be correct with probability $ge 2/3$. When our computational model is unrestricted, the majority function is optimal, and succeeds when $eta ge c /sqrt{n}$ for a large enough constant $c$. The coin problem is open and interesting in models that cannot compute the majority function.
In this paper we study the coin problem in the model of emph{read-once width-$w$ branching programs}. We prove that in order to succeed in this model, $eta$ must be at least $1/ (log n)^{Theta(w)}$. For constant $w$ this is tight by considering the recursive tribes function.
We generalize this to a emph{Dice Problem}, where instead of independent tosses of a coin we are given independent tosses of one of two $m$-sided dice. We prove that if the distributions are too close, then the dice cannot be distinguished by a small-width read-once branching program.
We suggest one application for this kind of theorems: we prove that Nisan's Generator fools width-$w$ read-once emph{permutation} branching programs, using seed length $O(w^4 log n log log n + log n log (1/eps))$. For $w=eps=Theta(1)$, this seedlength is $O(log n log log n)$. The coin theorem and its relatives might have other connections to PRGs. This application is related to the independent, but chronologically-earlier, work of Braverman, Rao, Raz and Yehudayoff (which might be submitted to this FOCS).
-
Settling the Polynomial Learnability of Mixtures of GaussiansGiven data drawn from a mixture of multivariate Gaussians, a basic problem is to accurately estimate the mixture parameters. We give an algorithm for this problem that has a running time, and data requirement polynomial in the dimension and the inverse of the desired accuracy, with provably minimal assumptions on the Gaussians. As simple consequences of our learning algorithm, we can perform near-optimal clustering of the sample points and density estimation for mixtures of $k$ Gaussians, efficiently.
The building blocks of our algorithm are based on the work (Kalai emph{et al}, STOC 2010)~cite{2Gs} that gives an efficient algorithm for learning mixtures of two Gaussians by considering a series of projections down to one dimension, and applying the emph{method of moments} to each univariate projection. A major technical hurdle in~cite{2Gs} is showing that one can efficiently learn emph{univariate} mixtures of two Gaussians. In contrast, because pathological scenarios can arise when considering univariate projections of mixtures of more than two Gaussians, the bulk of the work in this paper concerns how to leverage an algorithm for learning univariate mixtures (of many Gaussians) to yield an efficient algorithm for learning in high dimensions. Our algorithm employs emph{hierarchical clustering} and rescaling, together with delicate methods for backtracking and recovering from failures that can occur in our univariate algorithm.
Finally, while the running time and data requirements of our algorithm depend exponentially on the number of Gaussians in the mixture, we prove that such a dependence in necessary.
-
Polynomial Learning of Distribution FamiliesThe question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in theoretical computer science and machine learning. However, despite major progress, the general question of polynomial learnability of Gaussian mixture distributions still remained open. The current work resolves the question of polynomial learnability for Gaussian mixtures in high dimension with an arbitrary but fixed number of components.
The result for Gaussian distributions relies on a very general result of independent interest on learning parameters of distributions belonging to what we call {it polynomial families}. These families are characterized by their moments being polynomial of parameters and, perhaps surprisingly, include almost all common probability distributions as well as their mixtures and products. Using tools from real algebraic geometry, we show that parameters of any distribution belonging to such a family can be learned in polynomial time.
To estimate parameters of a Gaussian mixture distribution the general results on polynomial families are combined with a certain deterministic dimensionality reduction allowing learning a high-dimensional mixture to be reduced to a polynomial number of parameter estimation problems in low dimension.
-
Agnostically learning under permutation invariant distributionsWe generalize algorithms from computational learning theory that are successful under the uniform distribution on the Boolean hypercube ${0,1}^n$ to algorithms successful on permutation invariant distributions, distributions where the probability mass remains constant upon permutations in the instances. While the tools in our generalization mimic those used for the Boolean hypercube, the fact that permutation invariant distributions are not product distributions presents a significant obstacle.
Under the uniform distribution, halfspaces can be agnostically learned in polynomial time for constant $eps$. The main tools used are a theorem of Peres~cite{Peres04} bounding the {it noise sensitivity} of a halfspace, a result of~cite{KOS04} that this theorem this implies Fourier concentration, and a modification of the Low-Degree algorithm of Linial, Mansour, Nisan~cite{LMN:93} made by Kalai et. al.~cite{KKMS08}. These results are extended to arbitrary product distributions in~cite{BOWi08}.
We prove analogous results for permutation invariant distributions; more generally, we work in the domain of the symmetric group. We define noise sensitivity in this setting, and show that noise sensitivity has a nice combinatorial interpretation in terms of Young tableaux. The main technical innovations involve techniques from the representation theory of the symmetric group, especially the combinatorics of Young tableaux. We show that low noise sensitivity implies concentration on ``simple'' components of the Fourier spectrum, and that this fact will allow us to agnostically learn halfspaces under permutation invariant distributions to constant accuracy in roughly the same time as in the uniform distribution over the Boolean hypercube case.
-
Learning Convex Concepts from Gaussian Distributions with PCAWe present a new algorithm for learning a convex set in $R^n$ given labeled examples drawn from any Gaussian distribution. The efficiency of the algorithm depends on the dimension of the {em normal subspace}, the span of vectors orthogonal to supporting hyperplanes of the convex set. The key step of the algorithm uses a Singular Value Decomposition (SVD) to approximate the relevant normal subspace. The complexity of the resulting algorithm is $poly(n)2^{ ilde{O}(k)}$ for an arbitrary convex set with normal subspace of dimension $k$. For the important special case of the intersection of $k$ halfspaces, the complexity is [ poly(n,k,1/eps) + n cdot min , k^{O(log k/eps^4)}, (k/eps)^{O(klog (1/eps))} ] to learn a hypothesis that correctly classifies $1-eps$ of the unknown Gaussian distribution. This improves substantially on existing algorithms and is the first algorithm to achieve a fixed polynomial dependence on $n$. The proof of our main result is based on a monotonicity property of Gaussian space.
- All Talks
- Geometric complexity theory (GCT)
- How to Grow Your Lower Bounds
- How To Think About Mechanism Design
- Constructive Algorithms for Discrepancy Minimization
- Bounded Independence Fools Degree-2 Threshold Functions
- The Coin Problem, and Pseudorandomness for Branching Programs / Pseudorandom Generators for Regular Branching Programs
- Settling the Polynomial Learnability of Mixtures of Gaussians
- Polynomial Learning of Distribution Families
- Agnostically learning under permutation invariant distributions
- Learning Convex Concepts from Gaussian Distributions with PCA
- Determinant Sums for Undirected Hamiltonicity
- Fighting Perebor: New and Improved Algorithms for Formula and QBF Satisfiability
- The Monotone Complexity of k-CLIQUE on Random Graphs
- The Complexity of Distributions
- Hardness of Finding Independent Sets in Almost 3-Colorable Graphs
- Approximation Algorithms for the Edge-Disjoint Paths Problem via Raecke Decompositions
- Computational Transition at the Uniqueness Threshold
- Avner Magen Memorial Talk
- Clustering with Spectral Norm and the k-means Algorithm
- Stability yields a PTAS for k-Median and k-Means Clustering
- The Geometry of Manipulation -- a Quantitative Proof of the Gibbard-Satterthwaite Theorem
- Efficient volume sampling for row/column subset selection
- Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity
- New Constructive Aspects of the Lovasz Local Lemma
- The Geometry of Scheduling
- Sublinear Time Optimization for Machine Learning
- Estimating the longest increasing sequence in sublinear time
- Testing Properties of Sparse Images
- A Unified Framework for Testing Linear-Invariant Properties
- Optimal Testing of Reed-Muller Codes
- Subexponential Algorithms for Unique Games and Related problems
- Nevanlinna Prize Talk. Laplacian Gems
- Dependent Randomized Rounding via Exchange Properties of Combinatorial Structures
- Minimum-Cost Network Design with (Dis)economies of Scale
- One Tree Suffices: A Simultaneous O(1)-Approximation for Single-Sink Buy-at-Bulk
- Min st-Cut Oracle for Planar Graphs with Near-Linear Preprocessing Time
- Subcubic Equivalences Between Path, Matrix, and Triangle Problems
- Replacement Paths via Fast Matrix Multiplication
- All-Pairs Shortest Paths in O(n2) Time With High Probability
- Approximating Maximum Weight Matching in Near-linear Time
- Pure and Bayes-Nash Price of Anarchy for Generalized Second Price Auction
- Frugal and Truthful Auctions for Vertex Covers, Flows, and Cuts / Frugal Mechanism Design via Spectral Techniques
- Budget Feasible Mechanisms
- Black-Box Randomized Reductions in Algorithmic Mechanism Design
- Boosting and Differential Privacy
- A Multiplicative Weights Mechanism for Interactive Privacy-Preserving Data Analysis
- Impossibility of Differentially Private Universally Optimal Mechanisms
- The Limits of Two-Party Differential Privacy
- Deciding first-order properties for sparse graphs
- Logspace Versions of the Theorems of Bodlaender and Courcelle
- A separator theorem in minor-closed classes
- Optimal stochastic planarization
- Solving linear systems through nested dissection
- Approaching optimality for solving SDD linear systems
- Fast approximation algorithms for cut-based graph problems
- Metric Extension Operators, Vertex Sparsifiers and Lipschitz Extendability
- Vertex Sparsifiers and Abstract Rounding Algorithms
- A non-linear lower bound for planar epsilon-nets
- The sub-exponential upper bound for on-line chain partitioning
- Improved Bounds for Geometric Permutations
- On the Queue Number of Planar Graphs
- Holographic Algorithms with Matchgates Capture Precisely Tractable Planar #CSP
- A Decidable Dichotomy Theorem on Directed Graph Homomorphisms with Non-negative Weights
- Overcoming the Hole In The Bucket: Public-Key Cryptography Resilient to Continual Memory Leakage
- Cryptography Against Continuous Memory Attacks
- On the Insecurity of Parallel Repetition for Leakage Resilience
- Black-Box, Round-Efficient Secure Computation via Non-Malleability Assumption
- From Standard to Adaptive Hardness And Composable Security With No Set-Up
- On the Computational Complexity of Coin Flipping
- Sequential Rationality in Cryptographic Protocols
- A Fourier-analytic approach to Reed-Muller decoding
- Pseudorandom generators for CC0[p] and the Fourier spectrum of low-degree polynomials over finite fields
- Matching Vector Codes / Local list decoding with a constant number of queries
- Codes for Computationally Simple Channels: Explicit Constructions with Optimal Rate
- A lower bound for dynamic approximate membership data structures
- Lower Bounds for Near Neighbor Search via Metric Expansion
- Distance Oracles Beyond the Thorup?Zwick Bound