List of all recorded talks

Polynomial Learning of Distribution Families
The question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in theoretical computer science and machine learning. However, despite major progress, the general question of polynomial learnability of Gaussian mixture distributions remained open. The current work resolves the question of polynomial learnability for Gaussian mixtures in high dimension with an arbitrary but fixed number of components.
The result for Gaussian distributions relies on a very general result of independent interest on learning parameters of distributions belonging to what we call {\it polynomial families}. These families are characterized by their moments being polynomials in the parameters and, perhaps surprisingly, include almost all common probability distributions as well as their mixtures and products. Using tools from real algebraic geometry, we show that the parameters of any distribution belonging to such a family can be learned in polynomial time.
To estimate the parameters of a Gaussian mixture distribution, the general results on polynomial families are combined with a certain deterministic dimensionality reduction that allows learning a high-dimensional mixture to be reduced to a polynomial number of parameter estimation problems in low dimension.
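The defining property of a polynomial family, that moments are polynomials in the parameters, is easiest to see in one dimension. Below is a minimal method-of-moments sketch for a single univariate Gaussian $N(\mu, \sigma^2)$, where $E[X] = \mu$ and $E[X^2] = \mu^2 + \sigma^2$; this is purely illustrative and is not the paper's algorithm, which handles high-dimensional mixtures.

```python
import math
import random

def moment_estimate(samples):
    """Recover (mu, sigma) of a univariate Gaussian by matching the
    first two empirical moments, which are polynomials in (mu, sigma)."""
    m1 = sum(samples) / len(samples)                  # estimates mu
    m2 = sum(x * x for x in samples) / len(samples)   # estimates mu^2 + sigma^2
    mu = m1
    sigma = math.sqrt(max(m2 - m1 * m1, 0.0))
    return mu, sigma

random.seed(0)
data = [random.gauss(2.0, 3.0) for _ in range(200_000)]
mu_hat, sigma_hat = moment_estimate(data)
```

With enough samples the empirical moments concentrate, so the recovered parameters are close to the true $(\mu, \sigma) = (2, 3)$.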

Agnostically learning under permutation invariant distributions
We generalize algorithms from computational learning theory that are successful under the uniform distribution on the Boolean hypercube $\{0,1\}^n$ to algorithms successful on permutation invariant distributions, distributions in which the probability mass remains constant under permutations of the instances. While the tools in our generalization mimic those used for the Boolean hypercube, the fact that permutation invariant distributions are not product distributions presents a significant obstacle.
Under the uniform distribution, halfspaces can be agnostically learned in polynomial time for constant $\epsilon$. The main tools used are a theorem of Peres~\cite{Peres04} bounding the {\it noise sensitivity} of a halfspace, a result of~\cite{KOS04} showing that this theorem implies Fourier concentration, and a modification of the Low-Degree algorithm of Linial, Mansour, and Nisan~\cite{LMN:93} made by Kalai et al.~\cite{KKMS08}. These results are extended to arbitrary product distributions in~\cite{BOWi08}.
We prove analogous results for permutation invariant distributions; more generally, we work in the domain of the symmetric group. We define noise sensitivity in this setting, and show that noise sensitivity has a nice combinatorial interpretation in terms of Young tableaux. The main technical innovations involve techniques from the representation theory of the symmetric group, especially the combinatorics of Young tableaux. We show that low noise sensitivity implies concentration on ``simple'' components of the Fourier spectrum, and that this allows us to agnostically learn halfspaces under permutation invariant distributions to constant accuracy in roughly the same time as in the uniform distribution over the Boolean hypercube case.
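In the classical hypercube setting that this abstract generalizes, noise sensitivity $NS_\delta(f) = \Pr[f(x) \neq f(y)]$, where $x$ is uniform and $y$ flips each bit of $x$ independently with probability $\delta$, can be estimated by straightforward Monte Carlo. A sketch for the majority halfspace (parameters here are illustrative; Peres's theorem bounds this quantity by $O(\sqrt{\delta})$ for any halfspace):

```python
import random

def majority(x):
    """A simple halfspace on the Boolean hypercube."""
    return 1 if sum(x) * 2 > len(x) else 0

def noise_sensitivity(f, n, delta, trials, rng):
    """Monte Carlo estimate of NS_delta(f) under the uniform distribution."""
    disagree = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [1 - b if rng.random() < delta else b for b in x]  # delta-noisy copy
        disagree += f(x) != f(y)
    return disagree / trials

rng = random.Random(1)
ns = noise_sensitivity(majority, 21, 0.01, 20_000, rng)
```

For $\delta = 0.01$ the estimate lands well below the $O(\sqrt{\delta})$ scale, consistent with Peres's bound.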

Learning Convex Concepts from Gaussian Distributions with PCA
We present a new algorithm for learning a convex set in $\mathbb{R}^n$ given labeled examples drawn from any Gaussian distribution. The efficiency of the algorithm depends on the dimension of the {\em normal subspace}, the span of vectors orthogonal to supporting hyperplanes of the convex set. The key step of the algorithm uses a Singular Value Decomposition (SVD) to approximate the relevant normal subspace. The complexity of the resulting algorithm is $poly(n)2^{\tilde{O}(k)}$ for an arbitrary convex set with normal subspace of dimension $k$. For the important special case of the intersection of $k$ halfspaces, the complexity is $$poly(n,k,1/\epsilon) + n \cdot \min\left\{k^{O(\log k/\epsilon^4)}, (k/\epsilon)^{O(k\log(1/\epsilon))}\right\}$$ to learn a hypothesis that correctly classifies $1-\epsilon$ of the unknown Gaussian distribution. This improves substantially on existing algorithms and is the first algorithm to achieve a fixed polynomial dependence on $n$. The proof of our main result is based on a monotonicity property of Gaussian space.
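The role of the normal subspace can be seen in a simplified sketch (not the paper's full SVD algorithm): for a single halfspace $\{x : w \cdot x \le t\}$ under a standard Gaussian, by symmetry the class-conditional means differ only along the normal vector $w$, which is the $k = 1$ case of what the SVD step recovers from labeled examples.

```python
import math
import random

def estimate_normal(w, t, n_samples, rng):
    """Estimate the (unit) normal of the halfspace {x : w.x <= t} from
    labeled standard-Gaussian samples via the difference of class means."""
    dim = len(w)
    pos, neg, npos, nneg = [0.0] * dim, [0.0] * dim, 0, 0
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(dim)]
        if sum(wi * xi for wi, xi in zip(w, x)) <= t:
            npos += 1
            pos = [p + xi for p, xi in zip(pos, x)]
        else:
            nneg += 1
            neg = [m + xi for m, xi in zip(neg, x)]
    d = [neg[i] / nneg - pos[i] / npos for i in range(dim)]  # points along +w
    norm = math.sqrt(sum(di * di for di in d))
    return [di / norm for di in d]

rng = random.Random(2)
w_hat = estimate_normal([1.0, 0.0, 0.0], 0.0, 100_000, rng)
```

The estimate concentrates on the true normal direction $e_1$; for an intersection of $k$ halfspaces, top singular directions of the labeled data play the analogous role.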

Determinant Sums for Undirected Hamiltonicity
We present a Monte Carlo algorithm for Hamiltonicity detection in an $n$-vertex undirected graph running in $O^*(2^{\frac{3}{4}n})$ time. To the best of our knowledge, this is the first superpolynomial improvement on the worst case runtime for the problem since the $O^*(2^n)$ bound established for TSP almost fifty years ago (Bellman 1962, Held and Karp 1962). It answers in part the first open problem in Woeginger's 2003 survey on exact algorithms for NP-hard problems. For bipartite graphs, we improve the bound to $O^*(2^{\frac{1}{2}n})$ time. Both the bipartite and the general algorithm can be implemented to use space polynomial in $n$. We combine several recently resurrected ideas to get the result. Our main technical contribution is a new reduction inspired by the algebraic sieving method for $k$-Path (Koutis ICALP 2008, Williams IPL 2009). We transform the Hamiltonicity instance into many smaller Cycle Cover instances in which we are left to count weighted cycle covers over a finite field of characteristic two. We next adopt and apply the determinant summation technique for Exact Set Covers (Björklund STACS 2010).
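The algebraic core can be illustrated in miniature: over a field of characteristic two all determinant signs vanish, so $\det(A) \equiv \mathrm{perm}(A) \pmod 2$, and for a 0/1 adjacency matrix the permanent counts cycle covers. The sketch below (illustrative only; the paper works with weighted covers over larger characteristic-two fields) checks this identity against brute force.

```python
from itertools import permutations

def det_mod2(a):
    """Determinant over GF(2) by Gaussian elimination (signs are irrelevant
    in characteristic two, so row swaps need no bookkeeping)."""
    a = [row[:] for row in a]
    n = len(a)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0  # singular over GF(2)
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            if a[r][col]:
                a[r] = [x ^ y for x, y in zip(a[r], a[col])]
    return 1

def perm_mod2_bruteforce(a):
    """Parity of the permanent, i.e. the number of cycle covers mod 2."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod &= a[i][p[i]]
        total ^= prod
    return total
```

For any 0/1 matrix the two agree, which is what lets determinant computations stand in for cycle cover counting modulo two.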

Fighting Perebor: New and Improved Algorithms for Formula and QBF Satisfiability
We investigate the possibility of finding satisfying assignments to Boolean formulae and testing validity of quantified Boolean formulae (QBF) asymptotically faster than a brute force search.
Our first main result is a simple deterministic algorithm running in time $2^{n - \Omega(n)}$ for satisfiability of formulae of linear size in $n$, where $n$ is the number of variables in the formula. This algorithm extends to exactly counting the number of satisfying assignments, within the same time bound.
Our second main result is a deterministic algorithm running in time $2^{n - \Omega(n/\log n)}$ for solving QBFs in which the number of occurrences of any variable is bounded by a constant. For instances which are ``structured'', in a certain precise sense, the algorithm can be modified to run in time $2^{n - \Omega(n)}$.
To the best of our knowledge, no nontrivial algorithms were known for these problems before.
As a byproduct of the technique used to establish our first main result, we show that every function computable by linear-size formulae can be represented by decision trees of size $2^{n - \Omega(n)}$. As a consequence, we get strong superlinear {\it average-case} formula size lower bounds for the Parity function.
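For contrast, the trivial perebor (brute-force) baseline that these bounds improve on can be stated in a few lines: enumerate all $2^n$ assignments and count the satisfying ones. The example formula below is hypothetical, standing in for an arbitrary linear-size formula.

```python
from itertools import product

def count_sat(formula, n):
    """Exhaustive 2^n count of satisfying assignments of an n-variable
    Boolean predicate; the baseline the 2^{n - Omega(n)} algorithms beat."""
    return sum(1 for bits in product([False, True], repeat=n) if formula(bits))

# Hypothetical example formula on 3 variables: (x0 OR x1) AND NOT x2.
n_sat = count_sat(lambda b: (b[0] or b[1]) and not b[2], 3)
```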

The Monotone Complexity of k-CLIQUE on Random Graphs
Understanding the average-case complexity of natural problems on natural distributions is an important challenge for complexity theory. In this paper we consider the average-case complexity of the $k$-Clique problem (for fixed $k$) on monotone circuits. A natural class of distributions in this context are Erdős–Rényi random graphs $G(n,p)$ at threshold functions $p(n) \in \Theta(n^{-2/(k-1)})$, for which $\Pr(G(n,p)$ contains a $k$-clique$)$ is bounded away from $0$ and $1$. Our main result is a lower bound of $\omega(n^{k/4})$ on the size of monotone circuits which solve $k$-Clique (asymptotically almost surely) on $G(n,p)$ for two sufficiently far-apart threshold functions $p(n)$, such as $n^{-2/(k-1)}$ and $2n^{-2/(k-1)}$. This result complements a previous lower bound \cite{STOC08} of $\omega(n^{k/4})$ on the size of $AC^0$ circuits which solve $k$-Clique on $G(n,p)$ for a single threshold function $p(n)$. These two lower bounds, obtained by different techniques in the different settings of $AC^0$ and monotone circuits, support an intuition that {\em random graphs at the threshold may be a source of hard instances for $k$-Clique in general}. (Note that similar beliefs about random \textsc{SAT} are common in statistical physics.)
In addition, we show that $k/4$ in this lower bound is tight up to $o(k)$ by constructing monotone circuits of size $n^{k/4 + O(1)}$ which solve $k$-Clique on $G(n,p)$ for all functions $p : \mathbb{N} \to [0,1]$ (monotonizing a construction of $AC^0$ circuits due to Amano \cite{Amano09}). This moreover demonstrates a gap between the worst-case and average-case complexity of $k$-Clique on monotone circuits, in light of an $\widetilde{\Omega}(n^k)$ worst-case lower bound due to Razborov \cite{Razborov85}.
One technical contribution of this paper is the introduction of a new variant of sunflowers called {\em $(p,q)$-sunflowers}, in which petals may overlap (but not too much on average). We prove a combinatorial theorem, along the lines of the Erdős–Rado sunflower lemma, on the existence of $(p,q)$-sunflowers in sufficiently large set systems.
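The distributional setting can be explored empirically. A small sketch for $k = 3$, where the threshold scale $n^{-2/(k-1)}$ becomes $p = c/n$; all parameters below are illustrative.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Sample the edge set of an Erdos-Renyi random graph G(n, p)."""
    return {frozenset(e) for e in combinations(range(n), 2) if rng.random() < p}

def has_k_clique(n, edges, k):
    """Brute-force check for a k-clique (fine for tiny n and k)."""
    return any(all(frozenset(pair) in edges for pair in combinations(c, 2))
               for c in combinations(range(n), k))

rng = random.Random(3)
n, k, trials = 25, 3, 100
# At p = c/n the probability of containing a triangle is bounded away
# from 0 and 1, which is what makes this the threshold regime.
freq = sum(has_k_clique(n, sample_gnp(n, 2.0 / n, rng), k)
           for _ in range(trials)) / trials
```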

The Complexity of Distributions
Complexity theory typically studies the complexity of computing a function $h : \{0,1\}^m \to \{0,1\}^n$ of a given input $x$. We advocate the study of the complexity of generating the distribution $h(x)$ for uniform $x$, given random bits. Our main results are:
1. Any function $f : \{0,1\}^\ell \to \{0,1\}^n$ such that (i) each output bit $f_i$ depends on $o(\log n)$ input bits, and (ii) $\ell \le \log_2 \binom{n}{\alpha n} + n^{0.99}$, has output distribution $f(U)$ at statistical distance $\ge 1 - 1/n^{0.49}$ from the uniform distribution over $n$-bit strings of Hamming weight $\alpha n$.
We also prove lower bounds for generating $(X,b(X))$ for Boolean $b$, and in the case in which each bit $f_i$ is computed by a small-depth decision tree.
These lower bounds seem to be the first of their kind; the proofs use anti-concentration results for the sum of random variables.
2. Lower bounds for generating distributions imply succinct data structure lower bounds. As a corollary of (1), we obtain the first lower bound for the membership problem of representing a set $S \subseteq [n]$ of size $\alpha n$, in the case where $1/\alpha$ is a power of $2$: if queries ``$i \in S$?'' are answered by nonadaptively probing $o(\log n)$ bits, then the representation uses $\ge \log_2 \binom{n}{\alpha n} + \Omega(\log n)$ bits.
3. Upper bounds complementing the bounds in (1) for various settings of parameters.
4. Uniform randomized AC$^0$ circuits of $poly(n)$ size and depth $d = O(1)$ with error $e$ can be simulated by uniform randomized AC$^0$ circuits of $poly(n)$ size and depth $d+1$ with error $e + o(1)$ using $\le (\log n)^{O(\log\log n)}$ random bits.
Previous derandomizations [Ajtai and Wigderson '85; Nisan '91] increase the depth by a constant factor, or else have poor seed length.
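The benchmark $\log_2 \binom{n}{\alpha n}$ appearing in the membership bound is the information-theoretic minimum for storing an $(\alpha n)$-subset of $[n]$. The classical combinatorial number system (a standard construction, not from the paper) meets it exactly by bijecting $k$-subsets of $\{0,\dots,n-1\}$ with ranks in $[0, \binom{n}{k})$:

```python
from math import comb  # exact binomial coefficients (Python 3.8+)

def rank_subset(s):
    """Colex rank of a subset: for sorted c_1 < ... < c_k the rank is
    sum_j C(c_j, j), a bijection onto [0, C(n, k))."""
    return sum(comb(c, i + 1) for i, c in enumerate(sorted(s)))

def unrank_subset(r, n, k):
    """Inverse of rank_subset for k-subsets of {0, ..., n-1}: greedily
    peel off the largest element whose binomial fits under the rank."""
    s = []
    for i in range(k, 0, -1):
        c = next(c for c in range(n - 1, -1, -1) if comb(c, i) <= r)
        s.append(c)
        r -= comb(c, i)
    return sorted(s)
```

Storing the rank takes $\lceil \log_2 \binom{n}{k} \rceil$ bits, matching the entropy bound; the point of result (2) is that answering queries with few nonadaptive bit probes forces strictly more than this.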

Hardness of Finding Independent Sets in Almost 3-Colorable Graphs
For every $\epsilon > 0$ and integer $q \geq 3$, we show that given an $N$-vertex graph that has an induced $q$-colorable subgraph of size $(1-\epsilon)N$, it is NP-hard to find an independent set of size $\frac{N}{q^2}$.

Approximation Algorithms for the Edge-Disjoint Paths Problem via Raecke Decompositions
We study the Edge-Disjoint Paths with Congestion (EDPwC) problem in undirected networks, in which we must integrally route a set of demands without causing large congestion on any edge. We present a $(polylog(n), poly(\log\log n))$-approximation, which means that if there exists a solution that routes $X$ demands integrally on edge-disjoint paths (i.e., with congestion $1$), then the approximation algorithm can route $X/polylog(n)$ demands with congestion $poly(\log\log n)$.
The best previous result for this problem was an $(n^{1/\eta}, \eta)$-approximation for $\eta < \log n$.
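A small helper (illustrative, not from the paper) makes the bicriteria guarantee concrete: the congestion of an integral routing is simply the maximum number of chosen paths crossing any single edge, with congestion $1$ meaning fully edge-disjoint.

```python
from collections import Counter

def congestion(paths):
    """Maximum edge load of a set of routed paths, each given as a
    vertex sequence; undirected edges are keyed as frozensets."""
    load = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            load[frozenset((u, v))] += 1
    return max(load.values(), default=0)
```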

Computational Transition at the Uniqueness Threshold
The hard-core model is a model of lattice gas systems which has received much attention in statistical physics, probability theory and theoretical computer science. It is the probability distribution over independent sets $I$ of a graph weighted proportionally to $\lambda^{|I|}$ with fugacity parameter $\lambda$. We prove that at the uniqueness threshold of the hard-core model on the $d$-regular tree, approximating the partition function becomes computationally hard.
Specifically, we show that unless NP$=$RP there is no polynomial time approximation scheme for the partition function (the sum of such weighted independent sets) on graphs of maximum degree $d$ for fugacity $\lambda_c(d) < \lambda < \lambda_c(d) + \varepsilon(d)$, where $$\lambda_c(d) = \frac{(d-1)^{d-1}}{(d-2)^d}$$ is the uniqueness threshold on the $d$-regular tree and $\varepsilon(d) > 0$ is a positive constant. Weitz produced an FPTAS for approximating the partition function when $0 < \lambda < \lambda_c(d)$, so this result demonstrates that the computational threshold exactly coincides with the statistical physics phase transition, thus confirming the main conjecture of [MWW, '09]. We further analyze the special case of $\lambda = 1$, $d = 6$, and show there is no polynomial time approximation scheme for approximately counting independent sets on graphs of maximum degree $d = 6$, which is optimal, improving the previous bound of $d = 24$.
Our proof is based on specially constructed random bipartite graphs which act as gadgets in a reduction to MAX-CUT. Building on the involved second moment method analysis of \cite{MWW:09}, combined with an analysis of the reconstruction problem on the tree, our proof establishes a strong version of the ``replica'' method heuristics developed by theoretical physicists. The result establishes the first rigorous correspondence between statistical physics phase transitions and the hardness of approximate counting and sampling.
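For intuition, the quantity whose approximation is shown hard, the partition function $Z(\lambda) = \sum_{I \text{ independent}} \lambda^{|I|}$, can be computed exactly by brute force on tiny graphs (exponential time, so purely illustrative):

```python
from itertools import combinations

def partition_function(n, edges, lam):
    """Hard-core partition function: sum of lam^|I| over all independent
    sets I of the graph on vertices {0, ..., n-1}."""
    z = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            if all(not (u in s and v in s) for u, v in edges):
                z += lam ** size
    return z

# Path on 3 vertices has independent sets {}, {0}, {1}, {2}, {0,2},
# so at lam = 1 the partition function just counts them.
z = partition_function(3, [(0, 1), (1, 2)], 1.0)
```

At $\lambda = 1$ this is exactly the number of independent sets, connecting the general hardness statement to the $\lambda = 1$, $d = 6$ special case above.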