
Upload Video
videos in mp4/mov/flv
close
Upload video
Note: publisher must agree to add uploaded document 
Upload Slides
slides or other attachment
close
Upload Slides
Note: publisher must agree to add uploaded document 
Feedback
help us improve
close
Feedback
Please help us improve your experience by sending us a comment, question or concern
Please help transcribe this video using our simple transcription tool. You need to be logged in to do so.
Description
For a broad class of practically relevant distribution properties, which includes entropy and support size, nearly all of the proposed estimators have an especially simple form. Given a set of independent samples from a discrete distribution, these estimators tally the vector of summary statisticsthe number of species seen once, twice, etc. in the sampleand output the dot product between these summary statistics, and a fixed vector of coefficients. We term such estimators \emph{linear}.
This historical proclivity towards linear estimators is slightly perplexing, since, despite many efforts over nearly 60 years, all proposed such estimators have significantly suboptimal convergence.
Our main result, in some sense vindicating this insistence on linear estimators, is that for any property in this broad class, there exists a nearoptimal linear estimator. Additionally, we give a practical and polynomialtime algorithm for constructing such estimators for any given parameters.
While this result does not yield explicit bounds on the sample complexities of these estimation tasks, we leverage the insights provided by this result, to give explicit constructions of a linear estimators for three properties: entropy, $L_1$ distance to uniformity, and for pairs of distributions, $L_1$ distance.Our entropy estimator, when given $O(\frac{n}{\eps \log n})$ independent samples from a distribution of support at most $n,$ will estimate the entropy of the distribution to within accuracy $\epsilon$, with probability of failure $o(1/poly(n)).$ From recent lower bounds, this estimator is optimal, to constant factor, both in its dependence on $n$, and its dependence on $\epsilon.$ In particular, the inverselinear convergence rate of this estimator resolves the main open question of [VV11], which left open the possibility that the error decreased only with the square root of the number of samples.
Our distance to uniformity estimator, on given $O(\frac{m}{\eps^2\log m})$ independent samples from any distribution, returns an $\eps$accurate estimate of the $L_1$ distance to the uniform distribution of support $m$. This is the first sublinearsample estimator for this problem, and is constantfactor optimal, for constant $\epsilon$.
Finally, our framework extends naturally to properties of pairs of distributions, including estimating the $L_1$ distance and KLdivergence between pairs of distributions. We give an explicit linear estimator for estimating $L_1$ distance to accuracy $\epsilon$ using $O(\frac{n}{\eps^2\log n})$ samples from each distribution, which is constantfactor optimal, for constant $\epsilon$.