IEEE FOCS 2010
TechTalks from event: IEEE FOCS 2010
51st Annual IEEE Symposium on Foundations of Computer Science

The Geometry of Manipulation  a Quantitative Proof of the GibbardSatterthwaite TheoremWe prove a quantitative version of the GibbardSatterthwaite theorem. We show that a uniformly chosen voter for a neutral social choice function f of q>3 alternatives and n voters will be manipulable with probability at least $10^{4}epsilon^2n^{3}q^{30}$, where $epsilon$ is the minimal statistical distance between f and the family of dictator functions.
Our results extend those of Friedgut Kalai and Naor [FKN09], which were obtained for the case of 3 alternatives, and imply that the approach of masking manipulations behind computational hardness (as considered in [BO91, CS03, EL05, PR06, CS06]) cannot hide manipulations completely.

Efficient volume sampling for row/column subset selectionWe give efficient algorithms for volume sampling, i.e., for picking $k$subsets of the rows of any given matrix with probabilities proportional to the squared volumes of the simplices defined by them and the origin (or the squared volumes of the parallelepipeds defined by these subsets of rows). This solves an open problem from the monograph on spectral algorithms by Kannan and Vempala (see Section $7.4$ of cite{KV}, also implicit in cite{BDM, DRVW}).
Our first algorithm for volume sampling $k$subsets of rows from an $m$by$n$ matrix runs in $O(kmn^omega log n)$ arithmetic operations and a second variant of it for $(1+epsilon)$approximate volume sampling runs in $O(mn log m cdot k^{2}/epsilon^{2} + m log^{omega} m cdot k^{2omega+1}/epsilon^{2omega} cdot log(k epsilon^{1} log m))$ arithmetic operations, which is almost linear in the size of the input (i.e., the number of entries) for small $k$.
Our efficient volume sampling algorithms imply the following results for lowrank matrix approximation: egin{enumerate} item Given $A in R^{m imes n}$, in $O(kmn^{omega} log n)$ arithmetic operations we can find $k$ of its rows such that projecting onto their span gives a $sqrt{k+1}$approximation to the matrix of rank $k$ closest to $A$ under the Frobenius norm. This improves the $O(k sqrt{log k})$approximation of Boutsidis, Drineas and Mahoney cite{BDM} and matches the lower bound shown in cite{DRVW}. The method of conditional expectations gives a emph{deterministic} algorithm with the same complexity. The running time can be improved to $O(mn log m cdot k^{2}/epsilon^{2} + m log^{omega} m cdot k^{2omega+1}/epsilon^{2omega} cdot log(k epsilon^{1} log m))$ at the cost of losing an extra $(1+epsilon)$ in the approximation factor. item The same rows and projection as in the previous point give a $sqrt{(k+1)(nk)}$approximation to the matrix of rank $k$ closest to $A$ under the spectral norm. In this paper, we show an almost matching lower bound of

Polylogarithmic Approximation for Edit Distance and the Asymmetric Query ComplexityWe present a nearlinear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor; specifically, for strings of length $n$ and every fixed $eps>0$, it can compute a $(log n)^{O(1/eps)}$approximation in $n^{1+eps}$ time. This is an {em exponential} improvement over the previously known factor, $2^{ ilde O(sqrt{log n})}$, with a comparable running time~cite{ORedit,AOedit}. Previously, no efficient polylogarithmic approximation algorithm was known for any computational task involving edit distance (e.g., nearest neighbor search or sketching).
This result arises naturally in the study of a new emph{asymmetric query} model. In this model, the input consists of two strings $x$ and $y$, and an algorithm can access $y$ in an unrestricted manner, while being charged for querying every symbol of $x$. Indeed, we obtain our main result by designing an algorithm that makes a small number of queries in this model. We also provide a nearlymatching lower bound on the number of queries.
Our lower bound is the first to expose we hardness of edit distance stemming from the input strings being ``repetitive'', which means that many of their substrings are approximately identical. Consequently, our lower bound provides the first rigorous separation between edit distance and Ulam distance, which is edit distance on nonrepetitive strings, i.e., permutations.

New Constructive Aspects of the Lovasz Local LemmaThe Lov'{a}sz Local Lemma (LLL) is a powerful tool that gives sufficient conditions for avoiding all of a given set of ``bad'' events, with positive probability. A series of results have provided algorithms to efficiently construct structures whose existence is nonconstructively guaranteed by the LLL, culminating in the recent breakthrough of Moser & Tardos for the full asymmetric LLL. We show that the output distribution of the MoserTardos algorithm wellapproximates the emph{conditional LLLdistribution}  the distribution obtained by conditioning on all bad events being avoided. We show how a known bound on the probabilities of events in this distribution can be used for further probabilistic analysis and give new constructive and nonconstructive results.
We also show that when an LLL application provides a small amount of slack, the number of resamplings of the MoserTardos algorithm is nearly linear in the number of underlying independent variables (not events!), and can thus be used to give efficient constructions in cases where the underlying proof applies the LLL to superpolynomially many events. Even in cases where finding a bad event that holds is computationally hard, we show that applying the algorithm to avoid a polynomialsized ``core'' subset of bad events leads to a desired outcome with high probability. This is shown via a simple union bound over the probabilities of noncore events in the conditional LLLdistribution, and automatically leads to simple and efficient MonteCarlo (and in most cases $RNC$) algorithms.
We demonstrate this idea on several applications. We give the first constantfactor approximation algorithm for the Santa Claus problem by making an LLLbased proof of Feige constructive. We provide Monte Carlo algorithms for acyclic edge coloring, nonrepetitive graph colorings, and Ramseytype graphs. In all these applications the algorithm falls directly out of the nonconstructive LLLbased proof. Our algorithms are very simple, often provide bet

The Geometry of SchedulingWe consider the following general scheduling problem: There are $n$ jobs with arbitrary release times and sizes. In addition, each job has an associated arbitrary monotone function specifying the cost incurred when the job is completed at a particular time. This problem formulation is general enough to include many natural scheduling objectives, such as weighted flow time, weighted tardiness, and sum of flow time squared. The main contribution of this paper is a randomized polynomialtime algorithm with approximation ratio $O(log log (nP) )$, where $P$ is the maximum job size. This general result improves the best known approximation ratios by at least an exponential factor (and much more in some cases) for essentially {em all} of the nontrivial common special cases of this problem.
Our result is based on a novel connection between scheduling and geometry. We start with a certain strong linear programming relaxation for this problem, based on exponentially many knapsack cover inequalities. We show that the problem of rounding this linear program can be reduced to geometric weighted set multicover problems, where the sets/objects are have near linear union complexity. We then show how to apply Varadarajan's quasiuniform sampling technique to obtain solutions for these geometric set multicover problems. We believe that this geometric interpretation of scheduling is of independent interest, and will likely find numerous future applications.

Sublinear Time Optimization for Machine LearningWe give sublineartime approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and $L_2$SVM, for which sublineartime algorithms were not known before. These new algorithms use a combination of novel sampling techniques and the randomized implementation of online learning algorithms. We give lower bounds which show the running times of our algorithms to be nearly best possible in the unitcost RAM model. We also give implementations of our algorithms in the semistreaming setting, obtaining the first low pass sublinear time algorithms with polylogarithmic space and arbitrary approximation factor. Finally we show that our algorithms, which are Monte Carlo, can be made Las Vegas with an additional lineartime step, and we show that such linear work is necessary for Las Vegas results.

Estimating the longest increasing sequence in sublinear timeFinding the longest increasing subsequence (LIS) is a classic algorithmic problem. Let $n$ denote the size of the array. Simple $O(nlog n)$ algorithms are known for this problem. What can a sublinear time algorithm achieve? We develop a polylogarithmic time randomized algorithm that for any constant $delta > 0$, given $f$ outputs an estimate of the LIS that, with high probability, is accurate to within an additive $delta n$. More precisely, the running time of the algorithm is $(log n)^c (1/delta)^{O(1/delta)}$ where the exponent $c$ is independent of $delta$. Previously, the best known polylogarithmic time algorithms could only achieve an additive $n/2$ approximation.
The LIS problem can be formulated as a dynamic program. Our overall approach seems very general, and might be applicable to approximating other dynamic programs.

Testing Properties of Sparse ImagesWe initiate the study of testing properties of images that correspond to {em sparse/} $0/1$valued matrices of size $n imes n$. Our study is related to but different from the study initiated by Raskhodnikova ({em Proceedings of RANDOM, 2003/}), where the images correspond to {em dense/} $0/1$valued matrices. Specifically, while distance between images in the model studied by Raskhodnikova is the fraction of entries on which the images differ taken with respect to all $n^2$ entries, the distance measure in our model is defined by the fraction of such entries taken with respect to the actual number of $1$'s in the matrix. We study several natural properties: connectivity, convexity, monotonicity, and being a line. In all cases we give testing algorithms with sublinear complexity, and in some of the cases we also provide corresponding lower bounds.

A Unified Framework for Testing LinearInvariant PropertiesIn a sequence of recent papers, Sudan and coauthors have investigated the relation between testability of properties of Boolean functions and the invariance of the properties with respect to transformations of the domain. Linearinvariance is arguably the most common such symmetry for natural properties of Boolean functions on the hypercube. Hence, it is an important goal to find necessary and sufficient conditions for testability of linearinvariant properties. This is explicitly posed as an open problem in a recent survey of Sudan. We obtain the following results:
1. We show that every linearinvariant property that can be characterized by forbidding induced solutions to a (possibly infinite) set of linear equations can be tested with onesided error.
2. We show that every linearinvariant property that can be tested with onesided error can be characterized by forbidding induced solutions to a (possibly infinite) set of systems of linear equations.
We conjecture that our result from item (1) can be extended to cover systems of linear equations. We further show that the validity of this conjecture would have the following implications:
1. It would imply that every linearinvariant property that is closed under restrictions to linear subspaces is testable with onesided error. Such a result would unify several previous results on testing Boolean functions, such as the results on testing lowdegree polynomials and results on testing Fourier dimensionality.
2. It would imply that a linearinvariant property P is testable with onesided error *if and only if* P is closed under restrictions to linear subspaces, thus resolving Sudan's problem.

Optimal Testing of ReedMuller CodesWe consider the problem of testing if a given function $f : F_2^n ightarrow F_2$ is close to any degree $d$ polynomial in $n$ variables, also known as the ReedMuller testing problem. The Gowers norm is based on a natural $2^{d+1}$query test for this property. Alon et al.~cite{AKKLR} rediscovered this test and showed that it accepts every degree $d$ polynomial with probability $1$, while it rejects functions that are $Omega(1)$far with probability $Omega(1/(d 2^{d}))$. We give an asymptotically optimal analysis of this test, and show that it rejects functions that are (even only) $Omega(2^{d})$far with $Omega(1)$probability (so the rejection probability is a universal constant independent of $d$ and $n$). This implies a tight relationship between the $(d+1)^{ m{st}}$Gowers norm of a function and its maximal correlation with degree $d$ polynomials, when the correlation is close to 1.
Our proof works by induction on $n$ and yields a new analysis of even the classical BlumLubyRubinfeld~cite{BLR} linearity test, for the setting of functions mapping $F_2^n$ to $F_2$. The optimality follows from a tighter analysis of counterexamples to the ``inverse conjecture for the Gowers norm'' constructed by cite{GT07,LMS}.
Our result has several implications. First, it shows that the Gowers norm test is tolerant, in that it also accepts close codewords. Second, it improves the parameters of an XOR lemma for polynomials given by Viola and Wigderson~cite{VW}. Third, it implies a ``query hierarchy'' result for property testing of affineinvariant properties. That is, for every function $q(n)$, it gives an affineinvariant property that is testable with $O(q(n))$queries, but not with $o(q(n))$queries, complementing an analogous result of cite{GKNR08} for graph properties.
 All Talks
 Geometric complexity theory (GCT)
 How to Grow Your Lower Bounds
 How To Think About Mechanism Design
 Constructive Algorithms for Discrepancy Minimization
 Bounded Independence Fools Degree2 Threshold Functions
 The Coin Problem, and Pseudorandomness for Branching Programs / Pseudorandom Generators for Regular Branching Programs
 Settling the Polynomial Learnability of Mixtures of Gaussians
 Polynomial Learning of Distribution Families
 Agnostically learning under permutation invariant distributions
 Learning Convex Concepts from Gaussian Distributions with PCA
 Determinant Sums for Undirected Hamiltonicity
 Fighting Perebor: New and Improved Algorithms for Formula and QBF Satisfiability
 The Monotone Complexity of kCLIQUE on Random Graphs
 The Complexity of Distributions
 Hardness of Finding Independent Sets in Almost 3Colorable Graphs
 Approximation Algorithms for the EdgeDisjoint Paths Problem via Raecke Decompositions
 Computational Transition at the Uniqueness Threshold
 Avner Magen Memorial Talk
 Clustering with Spectral Norm and the kmeans Algorithm
 Stability yields a PTAS for kMedian and kMeans Clustering
 The Geometry of Manipulation  a Quantitative Proof of the GibbardSatterthwaite Theorem
 Efficient volume sampling for row/column subset selection
 Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity
 New Constructive Aspects of the Lovasz Local Lemma
 The Geometry of Scheduling
 Sublinear Time Optimization for Machine Learning
 Estimating the longest increasing sequence in sublinear time
 Testing Properties of Sparse Images
 A Unified Framework for Testing LinearInvariant Properties
 Optimal Testing of ReedMuller Codes
 Subexponential Algorithms for Unique Games and Related problems
 Nevanlinna Prize Talk. Laplacian Gems
 Dependent Randomized Rounding via Exchange Properties of Combinatorial Structures
 MinimumCost Network Design with (Dis)economies of Scale
 One Tree Suffices: A Simultaneous O(1)Approximation for SingleSink BuyatBulk
 Min stCut Oracle for Planar Graphs with NearLinear Preprocessing Time
 Subcubic Equivalences Between Path, Matrix, and Triangle Problems
 Replacement Paths via Fast Matrix Multiplication
 AllPairs Shortest Paths in O(n2) Time With High Probability
 Approximating Maximum Weight Matching in Nearlinear Time
 Pure and BayesNash Price of Anarchy for Generalized Second Price Auction
 Frugal and Truthful Auctions for Vertex Covers, Flows, and Cuts / Frugal Mechanism Design via Spectral Techniques
 Budget Feasible Mechanisms
 BlackBox Randomized Reductions in Algorithmic Mechanism Design
 Boosting and Differential Privacy
 A Multiplicative Weights Mechanism for Interactive PrivacyPreserving Data Analysis
 Impossibility of Differentially Private Universally Optimal Mechanisms
 The Limits of TwoParty Differential Privacy
 Deciding firstorder properties for sparse graphs
 Logspace Versions of the Theorems of Bodlaender and Courcelle
 A separator theorem in minorclosed classes
 Optimal stochastic planarization
 Solving linear systems through nested dissection
 Approaching optimality for solving SDD linear systems
 Fast approximation algorithms for cutbased graph problems
 Metric Extension Operators, Vertex Sparsifiers and Lipschitz Extendability
 Vertex Sparsifiers and Abstract Rounding Algorithms
 A nonlinear lower bound for planar epsilonnets
 The subexponential upper bound for online chain partitioning
 Improved Bounds for Geometric Permutations
 On the Queue Number of Planar Graphs
 Holographic Algorithms with Matchgates Capture Precisely Tractable Planar #CSP
 A Decidable Dichotomy Theorem on Directed Graph Homomorphisms with Nonnegative Weights
 Overcoming the Hole In The Bucket: PublicKey Cryptography Resilient to Continual Memory Leakage
 Cryptography Against Continuous Memory Attacks
 On the Insecurity of Parallel Repetition for Leakage Resilience
 BlackBox, RoundEfficient Secure Computation via NonMalleability Assumption
 From Standard to Adaptive Hardness And Composable Security With No SetUp
 On the Computational Complexity of Coin Flipping
 Sequential Rationality in Cryptographic Protocols
 A Fourieranalytic approach to ReedMuller decoding
 Pseudorandom generators for CC0[p] and the Fourier spectrum of lowdegree polynomials over finite fields
 Matching Vector Codes / Local list decoding with a constant number of queries
 Codes for Computationally Simple Channels: Explicit Constructions with Optimal Rate
 A lower bound for dynamic approximate membership data structures
 Lower Bounds for Near Neighbor Search via Metric Expansion
 Distance Oracles Beyond the Thorup?Zwick Bound