TechTalks from event: IEEE FOCS 2010

51st Annual IEEE Symposium on Foundations of Computer Science

  • Determinant Sums for Undirected Hamiltonicity Authors: Andreas Björklund
    We present a Monte Carlo algorithm for Hamiltonicity detection in an $n$-vertex undirected graph running in $O^*(2^{\frac{3}{4}n})$ time. To the best of our knowledge, this is the first superpolynomial improvement on the worst case runtime for the problem since the $O^*(2^n)$ bound established for TSP almost fifty years ago (Bellman 1962, Held and Karp 1962). It answers in part the first open problem in Woeginger's 2003 survey on exact algorithms for NP-hard problems. For bipartite graphs, we improve the bound to $O^*(2^{\frac{1}{2}n})$ time. Both the bipartite and the general algorithm can be implemented to use space polynomial in $n$. We combine several recently resurrected ideas to get the result. Our main technical contribution is a new reduction inspired by the algebraic sieving method for $k$-Path (Koutis ICALP 2008, Williams IPL 2009). We transform the Hamiltonicity instance into many smaller Cycle Cover instances in which we are left to count weighted cycle covers over a finite field of characteristic two. We next adopt and apply the determinant summation technique for Exact Set Covers (Björklund STACS 2010).
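    The $O^*(2^n)$ Bellman/Held–Karp baseline mentioned above can be sketched as a bitmask dynamic program. This is the classical algorithm the paper improves on, not the paper's determinant-sum method; a minimal Python sketch:

```python
def hamiltonian_cycle_exists(adj):
    """Held-Karp style O*(2^n) bitmask DP for Hamiltonicity.

    adj: adjacency matrix (list of lists of 0/1) of an undirected graph.
    Returns True iff the graph has a Hamiltonian cycle.
    """
    n = len(adj)
    if n == 1:
        return True
    # dp[mask][v] = True if some path starts at vertex 0, visits exactly
    # the vertices in mask, and ends at v.
    dp = [[False] * n for _ in range(1 << n)]
    dp[1][0] = True
    for mask in range(1 << n):
        if not (mask & 1):
            continue  # every partial path starts at vertex 0
        for v in range(n):
            if not dp[mask][v]:
                continue
            for w in range(n):
                if adj[v][w] and not (mask >> w) & 1:
                    dp[mask | (1 << w)][w] = True
    full = (1 << n) - 1
    # close the cycle: the end vertex must be adjacent to the start vertex 0
    return any(dp[full][v] and adj[v][0] for v in range(1, n))
```

The table has $2^n \cdot n$ entries, which is exactly the exponential space the abstract's polynomial-space algorithms avoid.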
  • Fighting Perebor: New and Improved Algorithms for Formula and QBF Satisfiability Authors: Rahul Santhanam
    We investigate the possibility of finding satisfying assignments to Boolean formulae and testing validity of quantified Boolean formulae (QBF) asymptotically faster than a brute force search.

    Our first main result is a simple deterministic algorithm running in time $2^{n - \Omega(n)}$ for satisfiability of formulae of linear size in $n$, where $n$ is the number of variables in the formula. This algorithm extends to exactly counting the number of satisfying assignments, within the same time bound.

    Our second main result is a deterministic algorithm running in time $2^{n - \Omega(n/\log n)}$ for solving QBFs in which the number of occurrences of any variable is bounded by a constant. For instances which are ``structured'', in a certain precise sense, the algorithm can be modified to run in time $2^{n - \Omega(n)}$.

    To the best of our knowledge, no non-trivial algorithms were known for these problems before.

    As a byproduct of the technique used to establish our first main result, we show that every function computable by linear-size formulae can be represented by decision trees of size $2^{n - \Omega(n)}$. As a consequence, we get strong superlinear {\it average-case} formula size lower bounds for the Parity function.
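    For scale, the brute-force search that these bounds improve on enumerates all $2^n$ assignments. A minimal Python sketch of that baseline (not the paper's algorithm), here in the exact-counting form:

```python
from itertools import product

def count_sat(formula, n):
    """Exhaustive 2^n model counting: the brute-force baseline that
    2^{n - Omega(n)}-time algorithms beat.

    formula: callable taking a tuple of n booleans, returning True/False.
    """
    return sum(1 for x in product((False, True), repeat=n) if formula(x))

# Example: the parity of 3 variables has exactly 4 satisfying assignments.
assert count_sat(lambda x: x[0] ^ x[1] ^ x[2], 3) == 4
```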

  • The Monotone Complexity of k-CLIQUE on Random Graphs Authors: Benjamin Rossman
    Understanding the average-case complexity of natural problems on natural distributions is an important challenge for complexity theory. In this paper we consider the average-case complexity of the $k$-Clique problem (for fixed $k$) on monotone circuits. A natural class of distributions in this context is Erdős–Rényi random graphs $G(n,p)$ at threshold functions $p(n) \in \Theta(n^{-2/(k-1)})$, for which $\Pr(G(n,p)$ contains a $k$-clique$)$ is bounded away from $0$ and $1$. Our main result is a lower bound of $\omega(n^{k/4})$ on the size of monotone circuits which solve $k$-Clique (asymptotically almost surely) on $G(n,p)$ for two sufficiently far-apart threshold functions $p(n)$, such as $n^{-2/(k-1)}$ and $2n^{-2/(k-1)}$. This result complements a previous lower bound \cite{STOC08} of $\omega(n^{k/4})$ on the size of $AC^0$ circuits which solve $k$-Clique on $G(n,p)$ for a single threshold function $p(n)$. These two lower bounds, obtained by different techniques in the different settings of $AC^0$ and monotone circuits, support an intuition that {\em random graphs at the threshold may be a source of hard instances for $k$-Clique in general}. (Note that similar beliefs about random \textsc{SAT} are common in statistical physics.)

    In addition, we show that $k/4$ in this lower bound is tight up to $o(k)$ by constructing monotone circuits of size $n^{k/4 + O(1)}$ which solve $k$-Clique on $G(n,p)$ for all functions $p : \mathbb{N} \to [0,1]$ (monotonizing a construction of $AC^0$ circuits due to Amano \cite{Amano09}). This moreover demonstrates a gap between the worst-case and average-case complexity of $k$-Clique on monotone circuits, in light of a $\widetilde{\Omega}(n^k)$ worst-case lower bound due to Razborov \cite{Razborov85}.

    One technical contribution of this paper is the introduction of a new variant of sunflowers called {\em $(p,q)$-sunflowers}, in which petals may overlap (but not too much on average). We prove a combinatorial theorem (along the lines of the Erdős–Rado sunflower lemma) …
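    The threshold distributions in the abstract are easy to experiment with. The following Python sketch (brute-force clique test, illustration only; the parameter choices are ours) samples $G(n,p)$ at $p(n) = n^{-2/(k-1)}$ and records how often a $k$-clique appears:

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Sample an Erdos-Renyi graph G(n, p) as a set of edges (i, j), i < j."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}

def has_k_clique(n, edges, k):
    """Brute-force k-clique test (exponential in k; illustration only)."""
    return any(all((min(u, v), max(u, v)) in edges
                   for u, v in combinations(c, 2))
               for c in combinations(range(n), k))

# At the threshold p(n) = n^{-2/(k-1)}, Pr[G(n,p) contains a k-clique] is
# bounded away from 0 and 1, so over repeated trials the empirical
# frequency should sit strictly between the extremes.
rng = random.Random(0)
k, n = 3, 50
p = n ** (-2 / (k - 1))  # equals 1/n for triangles (k = 3)
hits = sum(has_k_clique(n, sample_gnp(n, p, rng), k) for _ in range(20))
```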

  • The Complexity of Distributions Authors: Emanuele Viola
    Complexity theory typically studies the complexity of computing a function $h(x) : \{0,1\}^m \to \{0,1\}^n$ of a given input $x$. We advocate the study of the complexity of generating the distribution of $h(x)$ for uniform $x$, given random bits. Our main results are:

    \begin{enumerate} \item Any function $f : \{0,1\}^\ell \to \{0,1\}^n$ such that (i) each output bit $f_i$ depends on $o(\log n)$ input bits, and (ii) $\ell \le \log_2 \binom{n}{\alpha n} + n^{0.99}$, has output distribution $f(U)$ at statistical distance $\ge 1 - 1/n^{0.49}$ from the uniform distribution over $n$-bit strings of Hamming weight $\alpha n$.

    We also prove lower bounds for generating $(X,b(X))$ for boolean $b$, and in the case in which each bit $f_i$ is a small-depth decision tree.

    These lower bounds seem to be the first of their kind; the proofs use anti-concentration results for the sum of random variables.

    \item Lower bounds for generating distributions imply succinct data structures lower bounds. As a corollary of (1), we obtain the first lower bound for the membership problem of representing a set $S \subseteq [n]$ of size $\alpha n$, in the case where $1/\alpha$ is a power of $2$: If queries ``$i \in S$?'' are answered by non-adaptively probing $o(\log n)$ bits, then the representation uses $\ge \log_2 \binom{n}{\alpha n} + \Omega(\log n)$ bits.

    \item Upper bounds complementing the bounds in (1) for various settings of parameters.

    \item Uniform randomized AC$^0$ circuits of $poly(n)$ size and depth $d = O(1)$ with error $\epsilon$ can be simulated by uniform randomized AC$^0$ circuits of $poly(n)$ size and depth $d+1$ with error $\epsilon + o(1)$ using $\le (\log n)^{O(\log\log n)}$ random bits.

    Previous derandomizations [Ajtai and Wigderson '85; Nisan '91] increase the depth by a constant factor, or else have poor seed length. \end{enumerate}
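    The statistical-distance statement in item (1) can be made concrete on a toy instance. The following Python sketch (illustrative parameters of our choosing, not from the paper) computes the total variation distance between a tiny function's output distribution $f(U)$ and the uniform distribution over strings of Hamming weight $\alpha n$:

```python
from itertools import product
from math import comb

def tv_distance(p, q):
    """Total variation (statistical) distance between two distributions
    given as dicts mapping outcomes to probabilities."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def output_distribution(f, ell):
    """Distribution of f(U) for U uniform over {0,1}^ell."""
    dist = {}
    for x in product((0, 1), repeat=ell):
        y = f(x)
        dist[y] = dist.get(y, 0.0) + 2.0 ** -ell
    return dist

# Toy instance: compare f(U) for the identity map on n = 4 bits against the
# uniform distribution over 4-bit strings of Hamming weight 2 (alpha = 1/2).
n = 4
target = {x: 1 / comb(n, n // 2)
          for x in product((0, 1), repeat=n) if sum(x) == n // 2}
d = tv_distance(output_distribution(lambda x: x, n), target)
```

Here $d = 0.625$: the identity map spreads mass over all $16$ strings while the target concentrates on the $6$ strings of weight $2$.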

  • Hardness of Finding Independent Sets in Almost 3-Colorable Graphs Authors: Irit Dinur and Subhash Khot and Will Perkins and Muli Safra
    For every $\varepsilon > 0$ and integer $q \geq 3$, we show that given an $N$-vertex graph that has an induced $q$-colorable subgraph of size $(1-\varepsilon)N$, it is NP-hard to find an independent set of size $\frac{N}{q^2}$.
  • Approximation Algorithms for the Edge-Disjoint Paths Problem via Raecke Decompositions Authors: Matthew Andrews
    We study the Edge-Disjoint Paths with Congestion (EDPwC) problem in undirected networks, in which we must integrally route a set of demands without causing large congestion on any edge. We present a $(polylog(n), poly(\log\log n))$-approximation, which means that if there exists a solution that routes $X$ demands integrally on edge-disjoint paths (i.e., with congestion $1$), then the approximation algorithm can route $X/polylog(n)$ demands with congestion $poly(\log\log n)$.

    The best previous result for this problem was an $(n^{1/\eta}, \eta)$-approximation for $\eta < \log n$.
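    To make the EDPwC objective concrete, here is a naive greedy baseline (purely illustrative; it carries none of the paper's guarantees) that routes each demand by BFS over the least-loaded edges and reports the worst edge congestion incurred:

```python
from collections import deque

def greedy_route(n, edges, demands):
    """Greedy illustration of the EDPwC objective: route every demand,
    raising the per-edge congestion cap only when forced to.

    edges: list of undirected pairs (u, v); demands: list of pairs (s, t).
    Returns the maximum edge congestion used.
    """
    load = {frozenset(e): 0 for e in edges}
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def bfs(s, t, cap):
        # shortest s-t path using only edges whose current load is < cap
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            if u == t:
                break
            for w in adj[u]:
                if w not in parent and load[frozenset((u, w))] < cap:
                    parent[w] = u
                    q.append(w)
        if t not in parent:
            return None
        path, v = [], t
        while v is not None:
            path.append(v)
            v = parent[v]
        return path[::-1]

    congestion = 0
    for s, t in demands:
        cap, path = 1, bfs(s, t, 1)
        while path is None:
            cap += 1
            path = bfs(s, t, cap)
        for u, w in zip(path, path[1:]):
            load[frozenset((u, w))] += 1
        congestion = max(congestion, max(load.values()))
    return congestion
```

On a 4-cycle, two copies of the demand $(0,2)$ still route with congestion $1$ (the two sides of the cycle), while a third copy forces congestion $2$.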

  • Computational Transition at the Uniqueness Threshold Authors: Allan Sly
    The hardcore model is a model of lattice gas systems which has received much attention in statistical physics, probability theory and theoretical computer science. It is the probability distribution over independent sets $I$ of a graph, weighted proportionally to $\lambda^{|I|}$ with fugacity parameter $\lambda$. We prove that at the uniqueness threshold of the hardcore model on the $d$-regular tree, approximating the partition function becomes computationally hard.

    Specifically, we show that unless NP$=$RP there is no polynomial time approximation scheme for the partition function (the sum of such weighted independent sets) on graphs of maximum degree $d$ for fugacity $\lambda_c(d) < \lambda < \lambda_c(d) + \varepsilon(d)$, where $$\lambda_c = \frac{(d-1)^{d-1}}{(d-2)^d}$$ is the uniqueness threshold on the $d$-regular tree and $\varepsilon(d) > 0$ is a positive constant. Weitz produced an FPTAS for approximating the partition function when $0 < \lambda < \lambda_c(d)$, so this result demonstrates that the computational threshold exactly coincides with the statistical physics phase transition, thus confirming the main conjecture of [MWW, '09]. We further analyze the special case of $\lambda = 1$, $d = 6$, and show there is no polynomial time approximation scheme for approximately counting independent sets on graphs of maximum degree $d = 6$, which is optimal, improving the previous bound of $d = 24$.

    Our proof is based on specially constructed random bipartite graphs which act as gadgets in a reduction to MAX-CUT. Building on the involved second moment method analysis of \cite{MWW:09}, combined with an analysis of the reconstruction problem on the tree, our proof establishes a strong version of the ``replica'' method heuristics developed by theoretical physicists. The result establishes the first rigorous correspondence between the hardness of approximate counting and sampling and statistical physics phase transitions.
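    The partition function and the threshold $\lambda_c(d)$ from the abstract can be checked directly on small graphs. A brute-force Python sketch (exponential in the number of vertices; illustration only):

```python
from itertools import combinations

def hardcore_partition(n, edges, lam):
    """Brute-force hardcore-model partition function: the sum of
    lam^{|I|} over all independent sets I of the graph."""
    edge_set = {frozenset(e) for e in edges}
    z = 0.0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if all(frozenset(p) not in edge_set
                   for p in combinations(subset, 2)):
                z += lam ** r
    return z

def lambda_c(d):
    """Uniqueness threshold on the d-regular tree: (d-1)^(d-1) / (d-2)^d."""
    return (d - 1) ** (d - 1) / (d - 2) ** d

# For the triangle K3 at lambda = 1 the independent sets are
# {}, {0}, {1}, {2}, so Z = 4; and lambda_c(6) = 5^5 / 4^6 ~ 0.763,
# the fugacity relevant to the lambda = 1, d = 6 case discussed above.
```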

  • Avner Magen Memorial Talk Authors: Toni Pitassi
    Prof. Avner Magen was a brilliant scholar, making fundamental contributions to a number of areas of theoretical computer science that include metric embeddings, sublinear algorithms, convex programming, computational geometry and approximation algorithms.

    Avner completed his undergraduate and graduate studies at the Hebrew University of Jerusalem, and received his Ph.D. in Computer Science in 2002, under the supervision of Prof. Nati Linial. He held a postdoctoral fellowship at NEC Research in Princeton, New Jersey, from 2000 until 2002. He joined the University of Toronto in 2002, first as a postdoctoral fellow, and then as an Assistant Professor in 2004. He was promoted to Associate Professor in 2009.

    Avner was a wonderful colleague with a terrific sense of humor, great energy, and tremendous warmth. He was a dedicated research supervisor and a superb teacher whose mentorship inspired his students and those around him. He was one of theoretical computer science's most creative researchers. The loss of such mathematical talent will have a profound impact on his colleagues, students, and the entire research community.
  • Clustering with Spectral Norm and the k-means Algorithm Authors: Amit Kumar and Ravindran Kannan
    There has been much progress on efficient algorithms for clustering data points generated by a mixture of $k$ probability distributions under the assumption that the means of the distributions are well-separated, i.e., the distance between the means of any two distributions is at least $\Omega(k)$ standard deviations. These results generally make heavy use of the generative model and particular properties of the distributions. In this paper, we show that a simple clustering algorithm works without assuming any generative (probabilistic) model. Our only assumption is what we call a ``proximity condition'': the projection of any data point onto the line joining its cluster center to any other cluster center is $\Omega(k)$ standard deviations closer to its own center than to the other center. Here the notion of standard deviations is based on the spectral norm of the matrix whose rows represent the difference between a point and the mean of the cluster to which it belongs. We show that in the generative models studied, our proximity condition is satisfied, and so we are able to derive most known results for generative models as corollaries of our main result. We also prove some new results for generative models: e.g., we can cluster all but a small fraction of points assuming only a bound on the variance.

    Our algorithm relies on the well known $k$-means algorithm, and along the way we prove a result of independent interest: the $k$-means algorithm converges to the ``true centers'' even in the presence of spurious points, provided the initial (estimated) centers are close enough to the corresponding actual centers and all but a small fraction of the points satisfy the proximity condition. Finally, we present a new technique for boosting the ratio of inter-center separation to standard deviation. This allows us to prove results for learning mixtures of a class of distributions under weaker separation conditions.
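    The classical Lloyd's $k$-means iteration that the abstract builds on can be sketched as follows; this is only the standard algorithm, with none of the paper's proximity-condition analysis:

```python
def lloyd_kmeans(points, centers, iters=20):
    """Plain Lloyd's k-means iterations from given initial centers.

    points: list of numeric tuples; centers: initial (estimated) centers.
    Returns the final centers after `iters` assignment/update rounds.
    """
    k, dim = len(centers), len(points[0])
    centers = [list(c) for c in centers]
    for _ in range(iters):
        # assignment step: each point joins its nearest current center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # update step: move each center to the mean of its cluster
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = [sum(q[i] for q in cl) / len(cl)
                              for i in range(dim)]
    return [tuple(c) for c in centers]
```

Starting from rough initial centers near two well-separated groups, the iteration snaps onto the group means, which is the convergence behavior the abstract's independent-interest result makes precise.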
  • Stability yields a PTAS for k-Median and k-Means Clustering Authors: Pranjal Awasthi and Avrim Blum and Or Sheffet
    We consider $k$-median clustering in finite metric spaces and $k$-means clustering in Euclidean spaces, in the setting where $k$ is part of the input (not a constant). For the $k$-means problem, Ostrovsky et al.~\cite{Ostrovsky06} show that if the input satisfies the condition that the optimal $(k-1)$-means clustering is more expensive than the optimal $k$-means clustering by a factor of $\max\{100, 1/\alpha^2\}$, then one can achieve a $(1+f(\alpha))$-approximation to the $k$-means optimal in time polynomial in $n$ and $k$ by using a variant of Lloyd's algorithm. In this work we substantially improve this approximation guarantee. We show that given only the condition that the $(k-1)$-means optimal is more expensive than the $k$-means optimal by a factor $1+\alpha$ for {\em some} constant $\alpha > 0$, we can obtain a PTAS. In particular, under this assumption, for any $\epsilon > 0$ we can achieve a $(1+\epsilon)$-approximation to the $k$-means optimal in time polynomial in $n$ and $k$, and exponential in $1/\epsilon$ and $1/\alpha$. We thus decouple the strength of the assumption from the quality of the approximation ratio. We also give a PTAS for the $k$-median problem in finite metrics under the analogous assumption. For $k$-means, we in addition give a randomized algorithm with improved running time of $n^{O(1)}(k \log n)^{poly(1/\epsilon, 1/\alpha)}$.

    We also use our technique to obtain a PTAS under the assumption considered by Balcan et al.~\cite{Balcan09} that all $(1+\alpha)$-approximations are $\delta$-close to a desired target clustering, when all target clusters have size greater than $2\delta n$. Both results are based on a new notion of clustering stability that extends both the notion of~\cite{Ostrovsky06} and that of~\cite{Balcan09}. No FPTAS for the $k$-median problem over such stable instances exists unless P$=$NP. Thus our algorithm is in a sense best possible for such instances.
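    The stability condition above, that the optimal $(k-1)$-clustering costs at least $(1+\alpha)$ times the optimal $k$-clustering, can be checked exactly on small one-dimensional instances, where optimal $k$-means clusters are contiguous intervals of the sorted points. A brute-force Python sketch (the data set is illustrative, not from the paper):

```python
from itertools import combinations

def kmeans_cost_1d(xs, k):
    """Optimal 1-D k-means cost by enumerating contiguous splits of the
    sorted points (optimal 1-D clusters are intervals). Small inputs only.
    """
    xs = sorted(xs)
    n = len(xs)

    def sse(seg):
        m = sum(seg) / len(seg)
        return sum((x - m) ** 2 for x in seg)

    best = float("inf")
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        best = min(best, sum(sse(xs[a:b])
                             for a, b in zip(bounds, bounds[1:])))
    return best

# Three tight groups: the instance is (1 + alpha)-stable for k = 3 with a
# huge alpha, since merging any two groups blows up the cost.
xs = [0.0, 0.1, 5.0, 5.1, 10.0, 10.1]
ratio = kmeans_cost_1d(xs, 2) / kmeans_cost_1d(xs, 3)
```

On this instance the ratio is over 1000, so the instance easily satisfies the factor-$(1+\alpha)$ condition under which the PTAS applies.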