FOCS 2011
TechTalks from event: FOCS 2011
6A

Streaming Algorithms via Precision Sampling
A technique introduced by Indyk and Woodruff [STOC 2005] has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called {\em Precision Sampling}. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector $x=(x_1,\ldots,x_n)$, which is useful for the following applications:
\begin{itemize}
\item Estimating the $F_k$-moment of $x$, for $k>2$.
\item Estimating the $\ell_p$-norm of $x$, for $p\in[1,2]$, with small update time.
\item Estimating cascaded norms $\ell_p(\ell_q)$ for all $p,q>0$.
\item $\ell_1$ sampling, where the goal is to produce an element $i$ with probability (approximately) $|x_i|/\|x\|_1$. It extends to similarly defined $\ell_p$-sampling, for $p\in [1,2]$.
\end{itemize}
For all these applications the algorithm is essentially the same: premultiply the vector $x$ entrywise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of $x$, thereby allowing general updates to the vector $x$. Precision Sampling itself addresses the problem of estimating a sum $\sum_{i=1}^n a_i$ from weak estimates of each real $a_i\in[0,1]$. More precisely, the estimator first chooses a desired precision $u_i\in(0,1]$ for each $i\in[n]$, and then it receives an estimate of every $a_i$ within additive $u_i$. Its goal is to provide a good approximation to $\sum a_i$ while keeping tabs on the cost $\sum_i (1/u_i)$. Here we refine previous work [Andoni, Krauthgamer, and Onak, FOCS 2010], which shows that as long as $\sum a_i=\Omega(1)$, a good multiplicative approximation can be achieved using total precision of only $O(n\log n)$.
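The estimation problem described in the abstract above can be illustrated with a small simulation. The following is a minimal sketch, not the estimator from the paper: it assumes each precision $u_i$ is drawn uniformly from $(0,1]$, models the weak estimates as $a_i$ plus noise of magnitude at most $u_i$, and counts how often the estimate exceeds a fixed multiple of the precision. The function name, the threshold value 4, the noise model, and the averaging over repetitions are all illustrative assumptions. The only fact relied on is that for uniform $u$ and $a\in[0,1]$, $\Pr[a/u \ge t] = a/t$ whenever $t \ge 1$.

```python
import random


def precision_sampling_sum(a, threshold=4.0, reps=200, seed=0):
    """Estimate sum(a) from weak estimates; all parameters are hypothetical.

    For each a_i in [0, 1] we pick a precision u_i uniformly in (0, 1],
    receive an estimate a_hat with |a_hat - a_i| <= u_i, and count the
    indices where a_hat >= threshold * u_i. Returns (estimate, average cost),
    where cost is sum(1 / u_i) for one repetition.
    """
    rng = random.Random(seed)
    total_hits = 0
    total_cost = 0.0
    for _ in range(reps):
        for a_i in a:
            u = 1.0 - rng.random()            # uniform in (0, 1]
            noise = rng.uniform(-1.0, 1.0)    # assumed noise model
            a_hat = a_i + noise * u           # estimate within additive u
            if a_hat >= threshold * u:
                total_hits += 1
            total_cost += 1.0 / u
    # Each hit occurs with probability between a_i/(t+1) and a_i/(t-1),
    # so threshold * (average hit count) approximates sum(a) up to a
    # constant factor.
    estimate = threshold * total_hits / reps
    return estimate, total_cost / reps


values = [0.5] * 200
estimate, avg_cost = precision_sampling_sum(values)
print("true sum:", sum(values), "estimate:", estimate, "avg cost:", avg_cost)
```

With threshold $t$, the additive error of $u_i$ can shift each comparison by at most one unit of $u_i$, so in expectation the estimate lies between $t/(t+1)$ and $t/(t-1)$ times the true sum. The cost $\sum_i 1/u_i$ is heavy-tailed under uniform precisions, so the printed average can fluctuate noticeably between seeds.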

Steiner Shallow-Light Trees are Exponentially Lighter than Spanning Ones
For a pair of parameters alpha \ge 1, beta \ge 1, a spanning tree T of a weighted undirected n-vertex graph G = (V,E,w) is called an (alpha,beta)-shallow-light tree (shortly, (alpha,beta)-SLT) of G with respect to a designated vertex rt in V if (1) it approximates all distances from rt to other vertices up to a factor of alpha, and (2) its weight is at most beta times the weight of the minimum spanning tree MST(G) of G. The parameter alpha (respectively, beta) is called the root-distortion (resp., lightness) of the tree T. Shallow-light trees (SLTs) constitute a fundamental graph structure, with numerous theoretical and practical applications. In particular, they were used for constructing spanners, in network design, for VLSI-circuit design, for various data gathering and dissemination tasks in wireless and sensor networks, in overlay networks, and in the message-passing model of distributed computing. Tight tradeoffs between the parameters of SLTs were established by Awerbuch, Baratz and Peleg [PODC'90] and Khuller, Raghavachari and Young [SODA'93]. They showed that for any eps > 0 there always exist (1+eps,O(1/eps))-SLTs, and that the upper bound beta = O(1/eps) on the lightness of SLTs cannot be improved. In this paper we show that using Steiner points one can build SLTs with logarithmic lightness, i.e., beta = O(log 1/eps). This establishes an \emph{exponential separation} between spanning SLTs and Steiner ones. One particularly remarkable point on our tradeoff curve is eps = 0. In this regime our construction provides a \emph{shortest-path tree} with weight at most O(log n) * w(MST(G)). Moreover, we prove matching lower bounds that show that all our results are tight up to constant factors. Finally, on our way to these results we settle (up to constant factors) a number of open questions that were raised by Khuller et al. in SODA'93.
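The two parameters in the definition above can be computed directly for a given spanning tree. The sketch below is only an illustration of the definition for spanning (not Steiner) trees; the function names and the small example graph are assumptions made for this example, and the graph is assumed to be connected with positive edge weights.

```python
import heapq


def adjacency(n, edges):
    """Build an adjacency list from (u, v, weight) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def dijkstra(n, adj, src):
    """Single-source shortest-path distances."""
    dist = [float("inf")] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def mst_weight(n, edges):
    """Weight of a minimum spanning tree (Kruskal with union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0.0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total


def slt_parameters(n, graph_edges, tree_edges, root):
    """Return (root-distortion alpha, lightness beta) of a spanning tree."""
    dist_g = dijkstra(n, adjacency(n, graph_edges), root)
    dist_t = dijkstra(n, adjacency(n, tree_edges), root)
    alpha = max(dist_t[v] / dist_g[v] for v in range(n) if v != root)
    beta = sum(w for _, _, w in tree_edges) / mst_weight(n, graph_edges)
    return alpha, beta


# Hypothetical example: a unit-weight path 0-1-2-3 plus two shortcuts from 0.
G = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 2, 1.5), (0, 3, 2.0)]
mst_tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
sp_tree = [(0, 1, 1.0), (0, 2, 1.5), (0, 3, 2.0)]
print(slt_parameters(4, G, mst_tree, 0))
print(slt_parameters(4, G, sp_tree, 0))
```

In this example the minimum spanning tree is as light as possible but stretches root distances, while the shortest-path tree preserves them at the price of extra weight, which is the tradeoff the abstract quantifies.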

Fully dynamic maximal matching in O(log n) update time
We present an algorithm for maintaining a maximal matching in a graph under addition and deletion of edges. Our data structure is randomized and takes $O(\log n)$ expected amortized time for each edge update, where $n$ is the number of vertices in the graph. While there is a trivial $O(n)$ algorithm for edge update, the previous best known result for this problem for a graph with $n$ vertices and $m$ edges is $O((n+m)^{0.7072})$, which is sublinear only for a sparse graph. To the best of our knowledge this is the first polylogarithmic update time for maximal matching, an exponential improvement over the previous results. For the related problem of maximum matching, Onak and Rubinfeld \cite{onak} designed a randomized data structure that achieves $O(\log^2 n)$ amortized time for each update for maintaining a $c$-approximate maximum matching for some large constant $c$. In contrast, we can maintain a factor-two approximate maximum matching in $O(\log n)$ expected time per update as a direct corollary of the maximal matching scheme. This in turn also implies a 2-approximate vertex cover maintenance scheme that takes $O(\log n)$ expected time per update.
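For contrast with the $O(\log n)$ bound, the trivial $O(n)$-per-update baseline mentioned in the abstract can be sketched as follows. This is an illustration of that baseline only, not the paper's data structure; the class and method names are hypothetical. When a matched edge is deleted, each freed endpoint scans its neighbours for a free partner, which costs time proportional to its degree.

```python
class NaiveDynamicMatching:
    """Maintain a maximal matching with O(degree) work per update."""

    def __init__(self):
        self.adj = {}    # vertex -> set of neighbours
        self.mate = {}   # matched vertex -> its partner

    def _try_match(self, u):
        """Match u to any free neighbour, if u itself is free."""
        if u in self.mate:
            return
        for v in self.adj.get(u, ()):
            if v not in self.mate:
                self.mate[u] = v
                self.mate[v] = u
                return

    def insert(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)
        # A new edge only matters if both endpoints are currently free.
        if u not in self.mate and v not in self.mate:
            self.mate[u] = v
            self.mate[v] = u

    def delete(self, u, v):
        """Delete an existing edge (u, v)."""
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        if self.mate.get(u) == v:
            del self.mate[u]
            del self.mate[v]
            # Restore maximality by scanning both freed endpoints.
            self._try_match(u)
            self._try_match(v)

    def matching(self):
        return {tuple(sorted(pair)) for pair in self.mate.items()}

    def vertex_cover(self):
        # Endpoints of a maximal matching cover every edge and have
        # at most twice the size of a minimum vertex cover.
        return set(self.mate)

    def is_maximal(self):
        return all(u in self.mate or v in self.mate
                   for u in self.adj for v in self.adj[u])


m = NaiveDynamicMatching()
for edge in [(0, 1), (1, 2), (2, 3)]:
    m.insert(*edge)
m.delete(2, 3)
print(m.matching(), m.vertex_cover(), m.is_maximal())
```

The matched vertices double as the 2-approximate vertex cover mentioned at the end of the abstract, since every edge has at least one matched endpoint whenever the matching is maximal.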

Which Networks Are Least Susceptible to Cascading Failures?
The resilience of networks to various types of failures is an undercurrent in many parts of graph theory and network algorithms. In this paper we study the resilience of networks in the presence of {\em cascading failures}: failures that spread from one node to another across the network structure. One finds such cascading processes at work in the kind of contagious failures that spread among financial institutions during a financial crisis, through nodes of a power grid or communication network during a widespread outage, or through a human population during the outbreak of an epidemic disease. A widely studied model of cascades in networks assumes that each node $v$ of the network has a threshold $\ell(v)$, and fails if it has at least $\ell(v)$ failed neighbors. We assume that each node selects a threshold $\ell(v)$ independently using a probability distribution $\mu$. Our work centers on a parameter that we call the $\mu$-risk of a graph: the maximum failure probability of any node in the graph, in this threshold cascade model parameterized by threshold distribution $\mu$. This defines a very broad class of models; for example, the large literature on edge percolation, in which propagation happens along edges that are included independently at random with some probability $p$, takes place in a small part of the parameter space of threshold cascade models, and one where the distribution $\mu$ is monotonically decreasing with the threshold. In contrast we want to study the whole space, including threshold distributions with qualitatively different behavior, such as those that are sharply increasing. We develop techniques for relating differences in $\mu$-risk to the structures of the underlying graphs.
This is challenging in large part because, despite the simplicity of its formulation, the threshold cascade model has been very hard to analyze for arbitrary graphs $G$ and arbitrary threshold distributions $\mu$. It turns out that when selecting among a set of graphs to minimize the $\mu$-risk, the result depends quite intricately on $\mu$. We develop several techniques for evaluating the $\mu$-risk of $d$-regular graphs. For $d=2$ we are able to solve the problem completely: the optimal graph is always a clique (i.e.\ triangle) or tree (i.e.\ infinite path), although which graph is better exhibits a surprising non-monotonicity as the threshold parameters vary. When $d>2$ we present a technique based on power-series expansions of the failure probability that allows us to compare graphs in certain parts of the parameter space, deriving conclusions including the fact that as $\mu$ varies, at least three different graphs are optimal among $3$-regular graphs. In particular, the set of optimal 3-regular graphs includes one which is neither a clique nor a tree.
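The threshold cascade model and the $\mu$-risk can be explored numerically on small graphs. The following is a minimal Monte Carlo sketch, not a technique from the paper: the function names, the example distribution, the trial count, and the use of a finite 30-cycle as a stand-in for the infinite path are all assumptions made for illustration. A node with threshold 0 fails on its own, consistent with the rule that a node fails once it has at least $\ell(v)$ failed neighbors.

```python
import random


def run_cascade(adj, thresholds):
    """Return the set of failed nodes for fixed thresholds."""
    failed = {v for v, t in thresholds.items() if t == 0}
    frontier = list(failed)
    failed_nbrs = {v: 0 for v in adj}
    while frontier:
        v = frontier.pop()
        for w in adj[v]:
            if w in failed:
                continue
            failed_nbrs[w] += 1
            if failed_nbrs[w] >= thresholds[w]:
                failed.add(w)
                frontier.append(w)
    return failed


def estimate_mu_risk(adj, mu, trials=2000, seed=0):
    """Monte Carlo estimate of the maximum node failure probability.

    mu maps each threshold value to its probability.
    """
    rng = random.Random(seed)
    values = list(mu)
    weights = [mu[t] for t in values]
    fail_count = {v: 0 for v in adj}
    for _ in range(trials):
        thresholds = {v: rng.choices(values, weights)[0] for v in adj}
        for v in run_cascade(adj, thresholds):
            fail_count[v] += 1
    return max(fail_count.values()) / trials


# Two 2-regular graphs: the triangle, and a long cycle used here as a
# finite approximation of the infinite path (an assumption of this sketch).
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
cycle = {i: [(i - 1) % 30, (i + 1) % 30] for i in range(30)}

mu = {0: 0.1, 1: 0.3, 2: 0.6}   # hypothetical threshold distribution
print("triangle:", estimate_mu_risk(triangle, mu))
print("cycle:   ", estimate_mu_risk(cycle, mu))
```

Varying the entries of `mu` and comparing the two estimates gives a rough empirical view of the $d=2$ comparison described in the abstract, subject to sampling noise and the finite-cycle approximation.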
6B

The Power of Linear Estimators
For a broad class of practically relevant distribution properties, which includes entropy and support size, nearly all of the proposed estimators have an especially simple form. Given a set of independent samples from a discrete distribution, these estimators tally the vector of summary statistics (the number of species seen once, twice, etc. in the sample) and output the dot product between these summary statistics and a fixed vector of coefficients. We term such estimators \emph{linear}. This historical proclivity towards linear estimators is slightly perplexing, since, despite many efforts over nearly 60 years, all such proposed estimators have significantly suboptimal convergence. Our main result, in some sense vindicating this insistence on linear estimators, is that for any property in this broad class, there exists a near-optimal linear estimator. Additionally, we give a practical and polynomial-time algorithm for constructing such estimators for any given parameters. While this result does not yield explicit bounds on the sample complexities of these estimation tasks, we leverage the insights provided by this result to give explicit constructions of linear estimators for three properties: entropy, $L_1$ distance to uniformity, and, for pairs of distributions, $L_1$ distance. Our entropy estimator, when given $O(\frac{n}{\epsilon \log n})$ independent samples from a distribution of support at most $n$, will estimate the entropy of the distribution to within accuracy $\epsilon$, with probability of failure $o(1/\mathrm{poly}(n))$. From recent lower bounds, this estimator is optimal, to constant factor, both in its dependence on $n$ and in its dependence on $\epsilon$. In particular, the inverse-linear convergence rate of this estimator resolves the main open question of [VV11], which left open the possibility that the error decreased only with the square root of the number of samples.
Our distance to uniformity estimator, given $O(\frac{m}{\epsilon^2\log m})$ independent samples from any distribution, returns an $\epsilon$-accurate estimate of the $L_1$ distance to the uniform distribution of support $m$. This is the first sublinear-sample estimator for this problem, and is constant-factor optimal, for constant $\epsilon$. Finally, our framework extends naturally to properties of pairs of distributions, including estimating the $L_1$ distance and KL-divergence between pairs of distributions. We give an explicit linear estimator for estimating $L_1$ distance to accuracy $\epsilon$ using $O(\frac{n}{\epsilon^2\log n})$ samples from each distribution, which is constant-factor optimal, for constant $\epsilon$.
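The notion of a linear estimator in the abstract above can be made concrete with a familiar example. The sketch below uses the plug-in (empirical) entropy estimator, which is linear in the summary statistics: with $k$ samples, a species seen $j$ times contributes $-(j/k)\log(j/k)$. This is only an illustration of the linear form and is not the near-optimal estimator constructed in the paper; the function names are hypothetical.

```python
import math
from collections import Counter


def fingerprint(samples):
    """F[j] = number of distinct species seen exactly j times."""
    counts = Counter(samples)
    return Counter(counts.values())


def linear_estimate(samples, coeff):
    """Dot product of the fingerprint with a coefficient function coeff(j)."""
    F = fingerprint(samples)
    return sum(F_j * coeff(j) for j, F_j in F.items())


def plugin_entropy_coeff(k):
    """Coefficients that turn linear_estimate into plug-in entropy (nats)."""
    return lambda j: -(j / k) * math.log(j / k)


def plugin_entropy_direct(samples):
    """Plug-in entropy computed species by species, for comparison."""
    k = len(samples)
    return -sum((c / k) * math.log(c / k)
                for c in Counter(samples).values())


samples = list("aabbbcdddd")
coeff = plugin_entropy_coeff(len(samples))
print(dict(fingerprint(samples)))
print(linear_estimate(samples, coeff), plugin_entropy_direct(samples))
```

Any other linear estimator in this sense differs only in the coefficient function passed to `linear_estimate`; the fingerprint computation stays the same.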

An algebraic proof of a robust social choice impossibility theorem
Impossibility theorems, such as Arrow's theorem and the Gibbard-Satterthwaite theorem, are an important element of social choice theory. They state that under certain natural constraints, social choice mechanisms are impossible to construct. In recent years, beginning with Kalai, much work has been done in finding \textit{robust} versions of these theorems, showing that impossibility remains even when the constraints are \textit{almost} always satisfied. In this work we present an algebraic approach for producing such results. We demonstrate it for a lesser-known variant of Arrow's theorem, found in Dokow and Holzman.

Planar Graphs: Random Walks and Bipartiteness Testing
We initiate the study of the testability of properties in arbitrary planar graphs. We prove that bipartiteness can be tested in constant time. The previous bound for this class of graphs was Õ(sqrt(n)), and constant-time testability was only known for planar graphs with bounded degree. Previously used transformations of unbounded-degree sparse graphs into bounded-degree sparse graphs cannot be used to reduce the problem to the testability of bounded-degree planar graphs. Our approach extends to arbitrary minor-free graphs. Our algorithm is based on random walks. The challenge here is to analyze random walks for a class of graphs that has good separators, i.e., bad expansion. Standard techniques that use a fast convergence to a uniform distribution do not work in this case. Roughly speaking, our analysis technique self-reduces the problem of finding an odd-length cycle in a multigraph G induced by a collection of cycles to another multigraph G′ induced by a set of shorter odd-length cycles, in such a way that when a random walk finds a cycle in G′ with probability p>0, then it does so with probability lambda(p)>0 in G. This reduction is applied until the cycles collapse to self-loops that can be easily detected.
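To give a feel for how random walks can witness non-bipartiteness, here is a minimal sketch of a generic parity-based check. It is not the algorithm or the analysis from the paper: the function names, the number of walks, and the walk length are arbitrary assumptions. The idea used is standard: if a single walk reaches the same vertex after both an even and an odd number of steps, it contains a closed walk of odd length, and hence the graph contains an odd cycle.

```python
import random


def walk_finds_odd_cycle(adj, start, length, rng):
    """Run one random walk; report whether it reveals an odd closed walk."""
    parity = {start: 0}
    v = start
    for step in range(1, length + 1):
        if not adj[v]:
            return False                      # isolated vertex, walk is stuck
        v = rng.choice(adj[v])
        p = step % 2
        # setdefault records the first parity at which v was reached.
        if parity.setdefault(v, p) != p:
            return True
    return False


def looks_bipartite(adj, walks=50, length=10, seed=0):
    """Accept unless some random walk exhibits an odd cycle.

    One-sided error: a bipartite graph is always accepted.
    """
    rng = random.Random(seed)
    vertices = list(adj)
    for _ in range(walks):
        start = rng.choice(vertices)
        if walk_finds_odd_cycle(adj, start, length, rng):
            return False
    return True


square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(looks_bipartite(square), looks_bipartite(triangle))
```

How many walks of what length suffice for graphs that are far from bipartite is exactly the kind of question the paper's analysis addresses; the constants above carry no such guarantee.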

Testing and Reconstruction of Lipschitz Functions with Applications to Data Privacy
A function f : D -> R has Lipschitz constant c if d_R(f(x), f(y)) <= c d_D(x, y) for all x, y in D, where d_R and d_D denote the distance functions on the range and domain of f, respectively. We say a function is Lipschitz if it has Lipschitz constant 1. (Note that rescaling by a factor of 1/c converts a function with a Lipschitz constant c into a Lipschitz function.) In other words, Lipschitz functions are not very sensitive to small changes in the input. We initiate the study of testing and local reconstruction of the Lipschitz property of functions. A property tester has to distinguish functions with the property (in this case, Lipschitz) from functions that are \epsilon-far from having the property, that is, differ from every function with the property on at least an \epsilon fraction of the domain. A local filter reconstructs an arbitrary function f to ensure that the reconstructed function g has the desired property (in this case, is Lipschitz), changing f only when necessary. A local filter is given a function f and a query x and, after looking up the value of f on a small number of points, it has to output g(x) for some function g, which has the desired property and does not depend on x. If f has the property, g must be equal to f. We consider functions over domains {0,1}^d, {1,...,n} and {1,...,n}^d, equipped with l1 distance. We design efficient testers of the Lipschitz property for functions of the form f: {0,1}^d -> \delta Z, where \delta \in (0,1] and \delta Z is the set of multiples of \delta, and of the form f: {1,...,n} -> R, where R is (discretely) metrically convex. In the first case, the tester runs in time O(d min{d,r}/(\delta\epsilon)), where r is the diameter of the image of f; in the second, in time O((\log n)/\epsilon). We give corresponding lower bounds of Omega(d) and Omega(log n) on the query complexity (in the second case, only for nonadaptive 1-sided error testers).
Our lower bound for functions over {0,1}^d is tight for the case of the {0,1,2} range and constant \epsilon. The first tester implies an algorithm for functions of the form f: {0,1}^d -> R that distinguishes Lipschitz functions from functions that are \epsilon-far from (1+\delta)-Lipschitz. We also present a local filter of the Lipschitz property for functions of the form f: {1,...,n}^d -> R with lookup complexity O((log n+1)^d). For functions over {0,1}^d, we show that every nonadaptive local filter has lookup complexity exponential in d. The testers that we developed have applications to program analysis. The reconstructors have applications to data privacy. For the first application, the Lipschitz property of the function computed by a program corresponds to a notion of robustness to noise in the data. The application to privacy is based on the fact that a function f of entries in a database of sensitive information can be released with noise of magnitude proportional to a Lipschitz constant of f, while preserving the privacy of individuals whose data is stored in the database (Dwork, McSherry, Nissim and Smith, TCC 2006). We give a differentially private mechanism, based on local filters, for releasing a function f when a Lipschitz constant of f is provided by a distrusted client. We show that when no reliable Lipschitz constant of f is given, previously known differentially private mechanisms either have a substantially higher running time or have a higher expected error for a large class of symmetric functions f.
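For functions on the hypercube {0,1}^d with l1 distance, the Lipschitz property defined above reduces to a condition on hypercube edges: by the triangle inequality along a shortest path, f is Lipschitz exactly when |f(x) - f(y)| <= 1 for every pair x, y differing in one coordinate. The sketch below shows an exhaustive check and a simple random-edge sampler built on that observation. It is an illustrative baseline under these assumptions, not the tester from the paper; the function names and the query budget are hypothetical.

```python
import itertools
import random


def is_lipschitz_hypercube(f, d):
    """Exhaustively check |f(x) - f(y)| <= 1 on every hypercube edge."""
    for x in itertools.product((0, 1), repeat=d):
        for i in range(d):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if abs(f(x) - f(y)) > 1:
                    return False
    return True


def edge_tester(f, d, queries=100, seed=0):
    """Sample random hypercube edges; reject on any violated edge.

    One-sided error: a Lipschitz function is never rejected.
    """
    rng = random.Random(seed)
    for _ in range(queries):
        x = tuple(rng.randint(0, 1) for _ in range(d))
        i = rng.randrange(d)
        y = x[:i] + (1 - x[i],) + x[i + 1:]   # flip coordinate i
        if abs(f(x) - f(y)) > 1:
            return False
    return True


def hamming_weight(x):
    return sum(x)


def doubled_weight(x):
    return 2 * sum(x)


print(is_lipschitz_hypercube(hamming_weight, 4), edge_tester(hamming_weight, 4))
print(is_lipschitz_hypercube(doubled_weight, 4), edge_tester(doubled_weight, 4))
```

The exhaustive check takes time exponential in d, while the sampler looks at only a fixed number of edges; how many samples are needed to catch every \epsilon-far function is what the running-time bounds in the abstract address.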