FOCS 2011
TechTalks from event: FOCS 2011
5A

On the Power of Adaptivity in Sparse Recovery
The goal of (stable) sparse recovery is to recover a $k$-sparse approximation $x^*$ of a vector $x$ from linear measurements of $x$. Specifically, the goal is to recover $x^*$ such that \[ \|x - x^*\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_q \] for some constant $C$ and norm parameters $p$ and $q$. It is known that, for $p=q=1$ or $p=q=2$, this task can be accomplished using $m=O(k \log (n/k))$ {\em non-adaptive} measurements~\cite{CRT06:StableSignal} and that this bound is tight~\cite{DIPW,FPRU}. In this paper we show that if one is allowed to perform measurements that are {\em adaptive}, then the number of measurements can be considerably reduced. Specifically, for $C=1+\epsilon$ and $p=q=2$ we show:
* A scheme with $m=O(\frac{1}{\epsilon}k \log \log (n\epsilon/k))$ measurements that uses $O(\sqrt{\log k} \cdot \log \log (n\epsilon/k))$ rounds. This is a significant improvement over the {\em best possible} non-adaptive bound.
* A scheme with $m=O(\frac{1}{\epsilon}k \log (k/\epsilon) + k \log (n/k))$ measurements that uses {\em two} rounds. This improves over the {\em best known} non-adaptive bound.
To the best of our knowledge, these are the first results of this type.
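To make the recovery guarantee concrete, the following minimal Python sketch checks whether a candidate output $x^*$ satisfies $\|x - x^*\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_q$ for a given vector. It only illustrates the success criterion, not any measurement scheme from the talk. The function names (`best_k_sparse`, `lp_norm`, `satisfies_guarantee`) and the small numerical tolerance are assumptions introduced for illustration.

```python
def best_k_sparse(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest.

    This minimizes the l_q norm of the residual over all k-sparse vectors.
    """
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def lp_norm(v, p):
    """Plain l_p norm of a list of numbers."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def satisfies_guarantee(x, x_star, k, C, p=2, q=2, tol=1e-12):
    """Check ||x - x_star||_p <= C * min over k-sparse x' of ||x - x'||_q."""
    best = best_k_sparse(x, k)
    tail = lp_norm([a - b for a, b in zip(x, best)], q)
    err = lp_norm([a - b for a, b in zip(x, x_star)], p)
    return err <= C * tail + tol  # tol is a hypothetical floating-point slack


if __name__ == "__main__":
    x = [10.0, -7.0, 0.5, 0.2, -0.1]
    x_star = [9.9, -7.0, 0.0, 0.0, 0.0]
    print(satisfies_guarantee(x, x_star, k=2, C=1.1))
    print(satisfies_guarantee(x, x_star, k=2, C=1.0))
```

In this toy instance the candidate is slightly worse than the best 2-sparse approximation, so it passes for $C = 1.1$ but not for $C = 1$. This is the gap that the $C = 1+\epsilon$ regime quantifies.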

(1+eps)-Approximate Sparse Recovery
The problem central to sparse recovery and compressive sensing is that of \emph{stable sparse recovery}: we want a distribution $\mathcal{A}$ of matrices $A \in \mathbb{R}^{m \times n}$ such that, for any $x \in \mathbb{R}^n$ and with probability $1 - \delta > 2/3$ over $A \in \mathcal{A}$, there is an algorithm to recover $\hat{x}$ from $Ax$ with \begin{align} \|\hat{x} - x\|_p \leq C \min_{k\text{-sparse } x'} \|x - x'\|_p \end{align} for some constant $C > 1$ and norm $p$. The measurement complexity of this problem is well understood for constant $C > 1$. However, in a variety of applications it is important to obtain $C = 1+\epsilon$ for a small $\epsilon > 0$, and this complexity is not well understood. We resolve the dependence on $\epsilon$ in the number of measurements required of a $k$-sparse recovery algorithm, up to polylogarithmic factors, for the central cases of $p=1$ and $p=2$. Namely, we give new algorithms and lower bounds that show the number of measurements required is $k/\epsilon^{p/2} \textrm{polylog}(1/\epsilon)$. We also give matching bounds when the output is required to be $k$-sparse, in which case we achieve $k/\epsilon^p \textrm{polylog}(1/\epsilon)$. This shows the distinction between the complexity of sparse and non-sparse outputs is fundamental.

Near-Optimal Column-Based Matrix Reconstruction
We consider low-rank reconstruction of a matrix using its columns and we present asymptotically optimal algorithms for both spectral norm and Frobenius norm reconstruction. The main tools we introduce to obtain our results are: (i) the use of fast approximate SVD-like decompositions for column reconstruction, and (ii) two deterministic algorithms for selecting rows from matrices with orthonormal columns, building upon the sparse representation theorem for decompositions of the identity that appeared in~\cite{BSS09}.

Near Linear Lower Bound for Dimension Reduction in L1
Given a set of $n$ points in $\ell_{1}$, how many dimensions are needed to represent all pairwise distances within a specific distortion? This dimension-distortion tradeoff question is well understood for the $\ell_{2}$ norm, where $O((\log n)/\epsilon^{2})$ dimensions suffice to achieve $1+\epsilon$ distortion. In sharp contrast, there is a significant gap between upper and lower bounds for dimension reduction in $\ell_{1}$. A recent result shows that distortion $1+\epsilon$ can be achieved with $n/\epsilon^{2}$ dimensions. On the other hand, the only lower bounds known are that distortion $\delta$ requires $n^{\Omega(1/\delta^2)}$ dimensions and that distortion $1+\epsilon$ requires $n^{1/2 - O(\epsilon \log(1/\epsilon))}$ dimensions. In this work, we show the first near linear lower bounds for dimension reduction in $\ell_{1}$. In particular, we show that $1+\epsilon$ distortion requires at least $n^{1 - O(1/\log(1/\epsilon))}$ dimensions. Our proofs are combinatorial, but inspired by linear programming. In fact, our techniques lead to a simple combinatorial argument that is equivalent to the LP-based proof of Brinkman-Charikar for lower bounds on dimension reduction in $\ell_{1}$.
5B

The Complexity of Quantum States - a combinatorial approach
The classical description of quantum states is in general exponential in the number of qubits. Can we get polynomial descriptions for more restricted sets of states such as ground states of interesting subclasses of local Hamiltonians? This is the basic problem in the study of the complexity of ground states, and requires an understanding of multi-particle entanglement and quantum correlations in such states. We propose a combinatorial approach to this question, based on a reformulation of the detectability lemma introduced by us in the context of quantum gap amplification \cite{ref:Aha09b}. We give an alternative proof of the detectability lemma which is not only simple and intuitive, but also removes a key restriction in the original statement, making it more suitable for this new context. We then provide a one-page proof of Hastings' result that the correlations in the ground states of gapped Hamiltonians decay exponentially with the distance, demonstrating the simplicity of the combinatorial approach for those problems. As our main application, we provide a combinatorial proof of Hastings' seminal 1D area law \cite{ref:Has07} for the special case of frustration-free systems. Area laws provide a fundamental ingredient in the study of the complexity of ground states, since they offer a way to bound quantitatively the entanglement in such states. An intricate combinatorial analysis allows us to significantly improve the bounds achieved in Hastings' proof, and derive an exponentially better scaling in terms of the inverse spectral gap and the dimensionality of the particles. This holds out hope that the new approach might be a promising route towards resolving the 2D case and higher dimensions, which is one of the most important open questions in Hamiltonian complexity.

On the complexity of Commuting Local Hamiltonians, and tight conditions for Topological Order in such systems
The local Hamiltonian problem plays the equivalent role of SAT in quantum complexity theory. Understanding the complexity of the intermediate case, in which the constraints are quantum but all local terms in the Hamiltonian commute, is of importance for conceptual, physical and computational complexity reasons. Bravyi and Vyalyi showed in 2003 \cite{BV}, using a clever application of the representation theory of C*-algebras, that if the terms in the Hamiltonian are all two-local, the problem is in NP, and the entanglement in the ground states is local. The general case has remained open since then. In this paper we extend this result beyond the two-local case, to the case of three-qubit interactions. We then extend our results even further, and show that NP verification is possible for three-wise interaction between qutrits as well, as long as the interaction graph is planar and also "nearly Euclidean" in some well-defined sense. The proofs imply that in all such systems, the entanglement in the ground states is local. These extensions imply an intriguing sharp transition phenomenon in commuting Hamiltonian systems: the ground spaces of 3-local "physical" systems based on qubits and qutrits are diagonalizable by a basis whose entanglement is highly local, while more involved interactions (where the particle dimensionality or the locality of the interaction is larger) can already exhibit topological order. In particular, for those parameters, there exist Hamiltonians all of whose ground states have entanglement which spreads over scales proportional to the size of the system, such as Kitaev's Toric Code system. This has a direct implication for the developing theory of Topological Order, since it shows that one cannot improve on the parameters to construct topological order systems based on commuting Hamiltonians.
This is of particular interest in light of the recent proofs by Bravyi, Hastings and Michalakis

Quantum query complexity of state conversion
State conversion generalizes query complexity to the problem of converting between two input-dependent quantum states by making queries to the input. We characterize the complexity of this problem by introducing a natural information-theoretic norm that extends the Schur product operator norm. The complexity of converting between two systems of states is given by the distance between them, as measured by this norm. In the special case of function evaluation, the norm is closely related to the general adversary bound, a semidefinite program that lower-bounds the number of input queries needed by a quantum algorithm to evaluate a function. We thus obtain that the general adversary bound characterizes the quantum query complexity of any function whatsoever. This generalizes and simplifies the proof of the same result in the case of boolean input and output. Also in the case of function evaluation, we show that our norm satisfies a remarkable composition property, implying that the quantum query complexity of the composition of two functions is at most the product of the query complexities of the functions, up to a constant. Finally, our result implies that discrete and continuous-time query models are equivalent in the bounded-error setting, even for the general state-conversion problem.

Optimal bounds for quantum bit commitment
Bit commitment is a fundamental cryptographic primitive with numerous applications. Quantum information allows for bit commitment schemes in the information-theoretic setting where no dishonest party can perfectly cheat. The previously best-known quantum protocol by Ambainis achieved a cheating probability of at most 3/4. On the other hand, Kitaev showed that no quantum protocol can have cheating probability less than $1/\sqrt{2}$ (his lower bound on coin flipping can be easily extended to bit commitment). Closing this gap has since been an important open question. In this paper, we provide the optimal bound for quantum bit commitment. First, we show a lower bound of approximately 0.739, improving Kitaev's lower bound. For this, we present some generic cheating strategies for Alice and Bob and conclude by proving a new relation between the trace distance and fidelity of two quantum states. Second, we present an optimal quantum bit commitment protocol which has cheating probability arbitrarily close to 0.739. More precisely, we show how to use any weak coin flipping protocol with cheating probability 1/2 + eps in order to achieve a quantum bit commitment protocol with cheating probability 0.739 + O(eps). We then use the optimal quantum weak coin flipping protocol described by Mochon. Last, in order to stress the fact that our protocol uses quantum effects beyond the weak coin flip, we show that any classical bit commitment protocol with access to perfect weak (or strong) coin flipping has cheating probability at least 3/4.
6A

Streaming Algorithms via Precision Sampling
A technique introduced by Indyk and Woodruff [STOC 2005] has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called {\em Precision Sampling}. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector $x=(x_1,\ldots,x_n)$, which is useful for the following applications:
* Estimating the $F_k$-moment of $x$, for $k>2$.
* Estimating the $\ell_p$-norm of $x$, for $p\in[1,2]$, with small update time.
* Estimating cascaded norms $\ell_p(\ell_q)$ for all $p,q>0$.
* $\ell_1$ sampling, where the goal is to produce an element $i$ with probability (approximately) $|x_i|/\|x\|_1$. It extends to similarly defined $\ell_p$-sampling, for $p\in [1,2]$.
For all these applications the algorithm is essentially the same: pre-multiply the vector $x$ entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of $x$, thereby allowing general updates to the vector $x$. Precision Sampling itself addresses the problem of estimating a sum $\sum_{i=1}^n a_i$ from weak estimates of each real $a_i\in[0,1]$. More precisely, the estimator first chooses a desired precision $u_i\in(0,1]$ for each $i\in[n]$, and then it receives an estimate of every $a_i$ within additive $u_i$. Its goal is to provide a good approximation to $\sum a_i$ while keeping a tab on the cost $\sum_i (1/u_i)$. Here we refine previous work [Andoni, Krauthgamer, and Onak, FOCS 2010] which shows that as long as $\sum a_i=\Omega(1)$, a good multiplicative approximation can be achieved using total precision of only $O(n\log n)$.

Steiner Shallow-Light Trees are Exponentially Lighter than Spanning Ones
For a pair of parameters alpha >= 1, beta >= 1, a spanning tree T of a weighted undirected n-vertex graph G = (V,E,w) is called an (alpha,beta)-shallow-light tree (shortly, (alpha,beta)-SLT) of G with respect to a designated vertex rt in V if (1) it approximates all distances from rt to other vertices up to a factor of alpha, and (2) its weight is at most beta times the weight of the minimum spanning tree MST(G) of G. The parameter alpha (respectively, beta) is called the root-distortion (resp., lightness) of the tree T. Shallow-light trees (SLTs) constitute a fundamental graph structure, with numerous theoretical and practical applications. In particular, they were used for constructing spanners, in network design, for VLSI-circuit design, for various data gathering and dissemination tasks in wireless and sensor networks, in overlay networks, and in the message-passing model of distributed computing. Tight tradeoffs between the parameters of SLTs were established by Awerbuch, Baratz and Peleg, PODC'90 and Khuller, Raghavachari and Young, SODA'93. They showed that for any eps > 0 there always exist (1+eps,O(1/eps))-SLTs, and that the upper bound beta = O(1/eps) on the lightness of SLTs cannot be improved. In this paper we show that using Steiner points one can build SLTs with logarithmic lightness, i.e., beta = O(log 1/eps). This establishes an \emph{exponential separation} between spanning SLTs and Steiner ones. One particularly remarkable point on our tradeoff curve is eps = 0. In this regime our construction provides a \emph{shortest-path tree} with weight at most O(log n) * w(MST(G)). Moreover, we prove matching lower bounds that show that all our results are tight up to constant factors. Finally, on our way to these results we settle (up to constant factors) a number of open questions that were raised by Khuller et al. in SODA'93.
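The two SLT parameters can be measured directly for a given spanning tree. The sketch below is a minimal illustration of the definitions only (root-distortion alpha and lightness beta), not of the Steiner construction from the talk. It assumes a connected graph with positive edge weights and vertices labelled 0..n-1. All function names are hypothetical.

```python
import heapq


def adjacency(n, edges):
    """Build an undirected adjacency list from (u, v, weight) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def dijkstra(n, adj, src):
    """Shortest-path distances from src (standard lazy-deletion Dijkstra)."""
    dist = [float("inf")] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def mst_weight(n, edges):
    """Weight of a minimum spanning tree via Kruskal with union-find."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    total = 0.0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total


def slt_parameters(n, graph_edges, tree_edges, rt):
    """Return (alpha, beta): root-distortion and lightness of the tree."""
    dg = dijkstra(n, adjacency(n, graph_edges), rt)
    dt = dijkstra(n, adjacency(n, tree_edges), rt)
    alpha = max(dt[v] / dg[v] for v in range(n) if v != rt)
    beta = sum(w for _, _, w in tree_edges) / mst_weight(n, graph_edges)
    return alpha, beta


if __name__ == "__main__":
    g = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 2.0)]
    mst_tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
    spt_tree = [(0, 1, 1.0), (1, 2, 1.0), (0, 3, 2.0)]
    print(slt_parameters(4, g, mst_tree, 0))
    print(slt_parameters(4, g, spt_tree, 0))
```

On this four-vertex example the MST has lightness 1 but root-distortion 1.5, while the shortest-path tree has root-distortion 1 but lightness 4/3. This shows the tension between the two parameters in miniature.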

Fully dynamic maximal matching in O(log n) update time
We present an algorithm for maintaining a maximal matching in a graph under addition and deletion of edges. Our data structure is randomized and takes $O(\log n)$ expected amortized time for each edge update, where $n$ is the number of vertices in the graph. While there is a trivial $O(n)$ algorithm for edge update, the previous best known result for this problem for a graph with $n$ vertices and $m$ edges is $O((n+m)^{0.7072})$, which is sublinear only for a sparse graph. To the best of our knowledge this is the first polylog update time for maximal matching, which implies an exponential improvement over the previous results. For the related problem of maximum matching, Onak and Rubinfeld \cite{onak} designed a randomized data structure that achieves $O(\log^2 n)$ amortized time for each update for maintaining a $c$-approximate maximum matching for some large constant $c$. In contrast, we can maintain a factor-two approximate maximum matching in $O(\log n)$ expected time per update as a direct corollary of the maximal matching scheme. This in turn also implies a 2-approximate vertex cover maintenance scheme that takes $O(\log n)$ expected time per update.
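The corollaries rest on two standard facts: any maximal matching has at least half the size of a maximum matching, and the endpoints of a maximal matching form a vertex cover of at most twice the minimum size. The sketch below illustrates these facts with a simple static greedy pass. It is not the dynamic data structure from the talk, and the function names are hypothetical.

```python
def greedy_maximal_matching(edges):
    """Scan edges once, adding an edge whenever both endpoints are free.

    The result is maximal: every remaining edge touches a matched vertex.
    """
    matched = set()
    matching = []
    for u, v in edges:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def vertex_cover_from_matching(matching):
    """Endpoints of a maximal matching cover every edge of the graph."""
    return {v for edge in matching for v in edge}


def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


if __name__ == "__main__":
    # Path 0-1-2-3, with the middle edge scanned first.
    edges = [(1, 2), (0, 1), (2, 3)]
    m = greedy_maximal_matching(edges)
    cover = vertex_cover_from_matching(m)
    print(m, cover, is_vertex_cover(edges, cover))
```

On the path 0-1-2-3 with the middle edge scanned first, the greedy matching has one edge while the maximum matching has two, so the factor of two is attained.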

Which Networks Are Least Susceptible to Cascading Failures?
The resilience of networks to various types of failures is an undercurrent in many parts of graph theory and network algorithms. In this paper we study the resilience of networks in the presence of {\em cascading failures}: failures that spread from one node to another across the network structure. One finds such cascading processes at work in the kind of contagious failures that spread among financial institutions during a financial crisis, through nodes of a power grid or communication network during a widespread outage, or through a human population during the outbreak of an epidemic disease. A widely studied model of cascades in networks assumes that each node $v$ of the network has a threshold $\ell(v)$, and fails if it has at least $\ell(v)$ failed neighbors. We assume that each node selects a threshold $\ell(v)$ independently using a probability distribution $\mu$. Our work centers on a parameter that we call the $\mu$-risk of a graph: the maximum failure probability of any node in the graph, in this threshold cascade model parameterized by threshold distribution $\mu$. This defines a very broad class of models; for example, the large literature on edge percolation, in which propagation happens along edges that are included independently at random with some probability $p$, takes place in a small part of the parameter space of threshold cascade models, and one where the distribution $\mu$ is monotonically decreasing with the threshold. In contrast we want to study the whole space, including threshold distributions with qualitatively different behavior, such as those that are sharply increasing. We develop techniques for relating differences in $\mu$-risk to the structures of the underlying graphs.
This is challenging in large part because, despite the simplicity of its formulation, the threshold cascade model has been very hard to analyze for arbitrary graphs $G$ and arbitrary threshold distributions $\mu$. It turns out that when selecting among a set of graphs to minimize the $\mu$-risk, the result depends quite intricately on $\mu$. We develop several techniques for evaluating the $\mu$-risk of $d$-regular graphs. For $d=2$ we are able to solve the problem completely: the optimal graph is always a clique (i.e., a triangle) or a tree (i.e., an infinite path), although which graph is better exhibits a surprising non-monotonicity as the threshold parameters vary. When $d>2$ we present a technique based on power-series expansions of the failure probability that allows us to compare graphs in certain parts of the parameter space, deriving conclusions including the fact that as $\mu$ varies, at least three different graphs are optimal among $3$-regular graphs. In particular, the set of optimal 3-regular graphs includes one which is neither a clique nor a tree.
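The threshold cascade model described above is easy to simulate on a small finite graph. The sketch below runs the failure rule to a fixed point and estimates the $\mu$-risk by Monte Carlo sampling. The sampling-based estimate is an assumption made here for illustration (the talk analyzes these probabilities exactly), and the names `run_cascade`, `estimate_mu_risk` and `sample_threshold` are hypothetical.

```python
import random


def run_cascade(adj, thresholds):
    """Return the set of failed nodes once the cascade stabilizes.

    adj maps each node to a list of neighbors; thresholds maps each node to
    its threshold l(v). A node fails once it has at least l(v) failed
    neighbors, so a node with threshold 0 fails immediately.
    """
    failed = set()
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v in failed:
                continue
            failed_neighbors = sum(1 for u in adj[v] if u in failed)
            if failed_neighbors >= thresholds[v]:
                failed.add(v)
                changed = True
    return failed


def estimate_mu_risk(adj, sample_threshold, trials, rng):
    """Monte Carlo estimate of the maximum failure probability over nodes.

    sample_threshold(rng) draws one threshold from the distribution mu.
    """
    counts = {v: 0 for v in adj}
    for _ in range(trials):
        thresholds = {v: sample_threshold(rng) for v in adj}
        for v in run_cascade(adj, thresholds):
            counts[v] += 1
    return max(counts.values()) / trials


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    rng = random.Random(0)
    # Hypothetical mu: threshold 0 w.p. 0.1, 1 w.p. 0.3, 2 w.p. 0.6.
    mu = lambda r: r.choices([0, 1, 2], weights=[0.1, 0.3, 0.6])[0]
    print(estimate_mu_risk(triangle, mu, trials=10000, rng=rng))
```

Because failures only accumulate, the order in which nodes are examined does not change the final failed set. Only the thresholds and the graph structure matter.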