TechTalks from event: IEEE FOCS 2010

51st Annual IEEE Symposium on Foundations of Computer Science

  • A Fourier-analytic approach to Reed-Muller decoding Authors: Parikshit Gopalan
    We present a Fourier-analytic approach to list-decoding Reed-Muller codes over arbitrary finite fields. We use this to show that quadratic forms over any field are locally list-decodable up to their minimum distance. The analogous statement for linear polynomials was proved in the celebrated works of Goldreich-Levin [GL89] and Goldreich-Rubinfeld-Sudan [GRS00]. Previously, tight bounds for quadratic polynomials were known only for q = 2, 3 [GKZ08]; the best bound known for other fields was the Johnson radius.

    Our approach departs from previous work on Reed-Muller decoding which relies on some form of self-correction [GRS00, AS03, STV01, GKZ08]. We seek to explain the good list-decoding properties of RM codes by using the rich structure in the weight distribution of these codes. We observe that the crux of the problem is to bound the number of low-weight codewords near a received word. We do this by applying ideas from Fourier analysis of Boolean functions to low-degree polynomials over finite fields, in conjunction with classical results about the structure of low-weight codewords.
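    To make the Fourier-analytic viewpoint concrete, here is a small self-contained sketch (a toy, not the paper's algorithm; all function names are illustrative): over F_2, the Fourier coefficient of a polynomial at alpha measures its correlation with the linear function <alpha, x>, and Parseval's identity limits how many such correlations can be large at once.

```python
from itertools import product

def fourier_coefficients(f, n):
    """Fourier (Walsh-Hadamard) coefficients of f: {0,1}^n -> {0,1},
    viewed as the +/-1 function (-1)^f.  By definition,
    hat_f(alpha) = E_x[(-1)^(f(x) + <alpha, x>)],
    the correlation of f with the linear function <alpha, x>."""
    coeffs = {}
    for alpha in product([0, 1], repeat=n):
        total = 0
        for x in product([0, 1], repeat=n):
            inner = sum(a * b for a, b in zip(alpha, x)) % 2
            total += (-1) ** ((f(x) + inner) % 2)
        coeffs[alpha] = total / 2 ** n
    return coeffs

# A quadratic form over F_2: f(x) = x0*x1 + x2 (mod 2).
f = lambda x: (x[0] * x[1] + x[2]) % 2
coeffs = fourier_coefficients(f, 3)

# Parseval: the squared coefficients always sum to 1, so a low-degree f
# can correlate strongly with only a few linear functions at once.
assert abs(sum(c * c for c in coeffs.values()) - 1) < 1e-9
```

    The agreement of f with the linear codeword indexed by alpha is (1 + hat_f(alpha))/2, so bounding Fourier coefficients bounds the number of low-weight codewords near a received word, which is the crux identified above.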

  • Pseudorandom generators for CC0[p] and the Fourier spectrum of low-degree polynomials over finite fields Authors: Shachar Lovett and Partha Mukhopadhyay and Amir Shpilka
    In this paper we give the first construction of a pseudorandom generator, with seed length $O(\log n)$, for $\mathrm{CC}_0[p]$, the class of constant-depth circuits with unbounded fan-in $\mathrm{MOD}_p$ gates, for some prime $p$. More accurately, the seed length of our generator is $O(\log n)$ for any constant error $\epsilon>0$. In fact, we obtain our generator by fooling distributions generated by low-degree polynomials, over $\mathbb{F}_p$, {\em when evaluated on the Boolean cube}. This result significantly extends previous constructions that either required a long seed~\cite{LubyVeWi93} or that could only fool the distribution generated by linear functions over $\mathbb{F}_p$, when evaluated on the Boolean cube~\cite{LovettReTrVa09,MekaZu09:small_bias}.

    En route to constructing our PRG, we prove two structural results for low-degree polynomials over finite fields that may be of independent interest.

    1. Let $f$ be an $n$-variate degree-$d$ polynomial over $\mathbb{F}_p$. Then, for every $\epsilon>0$ there exists a subset $S \subset [n]$, whose size depends only on $d$ and $\epsilon$, such that $\sum_{\alpha \in \mathbb{F}_p^n : \alpha \neq 0, \alpha_S = 0} |\hat{f}(\alpha)|^2 \leq \epsilon$. Namely, there is a constant-size subset $S$ such that the total weight of the nonzero Fourier coefficients that do not involve any variable from $S$ is small.

    2. Let $f$ be an $n$-variate degree-$d$ polynomial over $\mathbb{F}_p$. If the distribution of $f$ when applied to uniform zero-one bits is $\epsilon$-far (in statistical distance) from its distribution when applied to biased bits, then for every $\delta>0$, $f$ can be approximated over zero-one bits, up to error $\delta$, by a function of a small number (depending only on $\epsilon$, $\delta$ and $d$) of lower-degree polynomials.

  • Matching Vector Codes / Local list decoding with a constant number of queries Authors: Zeev Dvir and Parikshit Gopalan and Sergey Yekhanin / Avraham Ben-Aroya and Klim Efremenko and Amnon Ta-Shma
    An (r,delta,epsilon)-locally decodable code encodes a k-bit message x to an N-bit codeword C(x), such that for every i in [k], the i-th message bit can be recovered with probability 1-epsilon, by a randomized decoding procedure that queries only r bits, even if the codeword C(x) is corrupted in up to delta*N locations.
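    The parameters in this definition can be seen in the textbook 2-query local decoder for the Hadamard code (not an MV code; just the simplest example of the r, delta, epsilon trade-off). A sketch, with illustrative names:

```python
import random

def hadamard_encode(x):
    """Hadamard code: the codeword lists <x, a> mod 2 for every a in {0,1}^k,
    so k message bits blow up to N = 2^k codeword bits."""
    k = len(x)
    return [sum(xi * ((a >> j) & 1) for j, xi in enumerate(x)) % 2
            for a in range(2 ** k)]

def local_decode(codeword, i, k):
    """Recover x_i with r = 2 queries: C(a) + C(a XOR e_i) = x_i whenever
    neither queried position is corrupted, so corrupting a delta fraction
    gives failure probability epsilon <= 2*delta per invocation."""
    a = random.randrange(2 ** k)
    return (codeword[a] + codeword[a ^ (1 << i)]) % 2

x = [1, 0, 1]
c = hadamard_encode(x)          # N = 8 codeword bits
c_corrupt = c[:]
c_corrupt[3] ^= 1               # corrupt a delta = 1/8 fraction
# Each call local_decode(c_corrupt, i, 3) returns x_i w.p. >= 1 - 2/8 = 3/4.
```

    The Hadamard code has exponential length, which is exactly the inefficiency that MV codes and Reed Muller LDCs aim to improve.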

    Recently a new class of locally decodable codes, based on families of vectors with restricted dot products, has been discovered. We refer to these codes as Matching Vector (MV) codes. Several families of (r,delta,r*delta)-locally decodable MV codes have been obtained. While codes in those families were shorter than codes of earlier generations, they suffered from having large values of epsilon=r*delta, which meant that r-query MV codes could only handle error rates below 1/r. Thus larger query complexity gave shorter codes, but at the price of less error tolerance. No MV codes with a super-constant number of queries capable of tolerating a constant fraction of errors were known to exist.

    In this paper we present a new view of matching vector codes and uncover certain similarities between MV codes and classical Reed Muller codes. Our view allows us to obtain deeper insights into the power and limitations of MV codes. Specifically,

    1. We show that existing families of MV codes can be enhanced to tolerate a large constant fraction of errors, independent of the number of queries. Such enhancement comes at the price of a moderate increase in the number of queries;

    2. Our construction yields the first families of matching vector codes of super-constant query complexity that can tolerate a constant fraction of errors. Our codes are shorter than Reed Muller LDCs for all values of r < log k / (log log k)^c, for some constant c;

    3. We show that any MV code encodes messages of length k to codewords of length at least k*2^{sqrt{log k}}. Therefore MV codes do not improve upon Reed Muller LDCs for r > (log k)^{sqrt{log k}}.

  • Codes for Computationally Simple Channels: Explicit Constructions with Optimal Rate Authors: Venkatesan Guruswami and Adam Smith
    In this paper, we consider coding schemes for \emph{computationally bounded} channels, which can introduce an arbitrary set of errors as long as (a) the fraction of errors is bounded by $p$ w.h.p. and (b) the process which adds the errors can be described by a sufficiently ``simple'' circuit. Codes for such channel models are attractive since, like codes for traditional adversarial errors, they can handle channels whose true behavior is \emph{unknown} or \emph{varying} over time.

    For three classes of channels, we provide explicit, efficiently encodable/decodable codes of optimal rate where only \emph{in}efficiently decodable codes were previously known. In each case, we provide one encoder/decoder that works for \emph{every} channel in the class. The encoders are randomized, and probabilities are taken over the (local, unknown to the decoder) coins of the encoder and those of the channel.

    Unique decoding for additive errors: We give the first construction of polytime encodable/decodable codes for \emph{additive} (a.k.a. \emph{oblivious}) channels that achieve the Shannon capacity $1-H(p)$. These are channels which add an arbitrary error vector $e \in \{0,1\}^n$ of weight at most $pn$ to the transmitted word; the vector $e$ can depend on the code but not on the particular transmitted word. Such channels capture binary symmetric errors and burst errors as special cases.
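    As a toy illustration of the additive channel model (a sketch only: the repetition code used here has rate 1/5, far from the capacity 1-H(p) achieved in the paper, and all names are illustrative):

```python
def repetition_encode(bits, r=5):
    """Encode each bit r times; majority decoding corrects up to
    floor(r/2) flips per block."""
    return [b for b in bits for _ in range(r)]

def repetition_decode(word, r=5):
    """Majority vote within each block of r symbols."""
    return [int(sum(word[j*r:(j+1)*r]) > r // 2) for j in range(len(word) // r)]

def additive_channel(codeword, error_positions):
    """Oblivious adversary: XORs in a fixed error vector e of weight <= p*n.
    e may depend on the code, but not on the particular transmitted word."""
    received = list(codeword)
    for pos in error_positions:
        received[pos] ^= 1
    return received

msg = [1, 0, 1, 1]
c = repetition_encode(msg)                        # n = 20
received = additive_channel(c, [0, 6, 12, 18])    # weight 4 = 0.2 * n
assert repetition_decode(received) == msg         # one flip per block: corrected
```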

    List-decoding for log-space channels: A \emph{space-$S(n)$ bounded} channel reads and modifies the transmitted codeword as a stream, using at most $S(n)$ bits of workspace on transmissions of $n$ bits. For constant $S$, this captures many models from the literature, including \emph{discrete channels with finite memory} and \emph{arbitrarily varying channels}. We give an efficient code with optimal rate (up to $1-H(p)$) that recovers a short list containing the correct message with high probability for channels limited to \emph{logarithmic} space.

    List-decoding for poly-time channels: For any const

  • A lower bound for dynamic approximate membership data structures Authors: Shachar Lovett and Ely Porat
    An approximate membership data structure is a randomized data structure for representing a set which supports membership queries. It allows for a small false positive error rate but has no false negative errors. Such data structures were first introduced by Bloom~\cite{Bloom70} in the 1970s, and have since had numerous applications, mainly in distributed systems, database systems, and networks.

    The algorithm of Bloom is quite effective: it can store a set $S$ of size $n$ by using only $\approx 1.44 n \log_2(1/\epsilon)$ bits while having false positive error $\epsilon$. This is within a constant factor of the entropy lower bound of $n \log_2(1/\epsilon)$ for storing such sets~\cite{CarterFlGiMaWe78}. Closing this gap is an important open problem, as Bloom filters are widely used in situations where storage is at a premium.

    Bloom filters have another property: they are dynamic. That is, they support the iterative insertion of up to $n$ elements. In fact, if one removes this requirement, there exist static data structures which receive the entire set at once and can almost achieve the entropy lower bound~\cite{DietzfelbingerPa08,Porat09:bloom_static}; they require only $n \log_2(1/\epsilon)(1+o(1))$ bits.
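    A minimal Bloom filter sketch using the standard parameters mentioned above (roughly 1.44 n log2(1/epsilon) bits and ceil(log2(1/epsilon)) hash functions); the SHA-256-based hashing is an illustrative choice, not part of any cited construction:

```python
import hashlib
import math

class BloomFilter:
    """Minimal Bloom filter: m ~ 1.44 * n * log2(1/eps) bits and
    k = ceil(log2(1/eps)) hash functions give false-positive rate
    roughly eps after n insertions, with no false negatives."""
    def __init__(self, n, eps):
        self.m = max(1, math.ceil(1.44 * n * math.log2(1 / eps)))
        self.k = max(1, math.ceil(math.log2(1 / eps)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # k "independent" hashes, simulated by salting SHA-256 (illustrative).
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter(n=1000, eps=0.01)
for w in ("foo", "bar", "baz"):
    bf.add(w)
assert "foo" in bf and "bar" in bf    # inserted items: never a false negative
```

    The filter is dynamic in exactly the sense used above: elements are inserted one by one, and the lower bound in this paper applies to any structure with that interface.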

    Our main result is a new lower bound for the memory requirements of any dynamic approximate membership data structure. We show that for any constant $\epsilon>0$, any such data structure which achieves a false positive error rate of $\epsilon$ must use at least $C(\epsilon) \cdot n \log_2(1/\epsilon)$ memory bits, where $C(\epsilon)>1$ depends only on $\epsilon$. This shows that the entropy lower bound cannot be achieved by dynamic data structures for any constant error rate.

  • Lower Bounds for Near Neighbor Search via Metric Expansion Authors: Rina Panigrahy and Kunal Talwar and Udi Wieder
    In this paper we show how the complexity of performing near neighbor search (NNS) on a metric space is related to the expansion of the metric space. Given a metric space, we look at the graph obtained by connecting every pair of points within a certain distance $r$. We then look at various notions of expansion in this graph, relating them to the cell-probe complexity of NNS for randomized and deterministic, exact and approximate algorithms. For example, if the graph has node expansion $\Phi$, then we show that any deterministic $t$-probe data structure for $n$ points must use space $S$ where $(St/n)^t > \Phi$. We show similar results for randomized algorithms as well. These relationships can be used to derive most of the known lower bounds in well-known metric spaces such as $l_1$, $l_2$, $l_\infty$ by simply computing their expansion. In the process, we strengthen and generalize our previous results~\cite{PTW08}. Additionally, we unify the approach in~\cite{PTW08} and the communication complexity based approach. Our work reduces the problem of proving cell-probe lower bounds for near neighbor search to computing the appropriate expansion parameter.
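    As a quick numeric illustration of the deterministic bound (a sketch with made-up parameter values): rearranging (St/n)^t > Phi gives S > (n/t) * Phi^(1/t), so few probes force large space whenever the expansion Phi is large.

```python
def space_lower_bound(n, t, phi):
    """Space S that any deterministic t-probe data structure must exceed,
    obtained by rearranging (S*t/n)^t > Phi into S > (n/t) * Phi^(1/t)."""
    return (n / t) * phi ** (1 / t)

# With n = 2^20 points and node expansion Phi = 2^10:
# one probe forces S > n * Phi = 2^30 cells, while ten probes force only
# S > (n/10) * 2 -- the bound decays quickly as t grows.
one_probe = space_lower_bound(2**20, 1, 2**10)
ten_probes = space_lower_bound(2**20, 10, 2**10)
assert one_probe > ten_probes
```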
  • Distance Oracles Beyond the Thorup–Zwick Bound Authors: Mihai Patrascu and Liam Roditty
    We give the first improvement to the space/approximation trade-off of distance oracles since the seminal result of Thorup and Zwick [STOC'01].

    For unweighted graphs, our distance oracle has size O(n^{5/3}) = O(n^{1.66...}) and, when queried about vertices at distance d, returns a path of length 2d+1.

    For weighted graphs with m=n^2/alpha edges, our distance oracle has size O(n^2 / sqrt[3]{alpha}) and returns a factor 2 approximation.

    Based on a widely believed conjecture about the hardness of set intersection queries, we show that a 2-approximate distance oracle requires space Omega~(n^2 / sqrt{alpha}). For unweighted graphs, this implies an Omega~(n^{1.5}) space lower bound to achieve approximation 2d+1.