We study learning of initial intervals in the prediction model. We show that for each distribution D over the domain there is an algorithm A_D whose probability of a mistake in round m is at most ( + o(1))/m. We also show that the best possible bound achievable when the same algorithm A must be applied for all distributions D is at least (1/√e − o(1))·(1/m) > (3/5 − o(1))·(1/m). Informally, "knowing" the distribution D enables an algorithm to reduce its error rate by a constant factor strictly greater than 1. As advocated by Ben-David et al. (2008), knowledge of D can be viewed as an idealized proxy for a large number of unlabeled examples.
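The prediction-model setting above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions and is not the algorithm analyzed in the paper: the target is an initial interval [0, a] over [0, 1], examples are drawn from a uniform distribution (a stand-in for D), and the learner predicts with the midpoint of its version space. The names `simulate`, `threshold`, and the midpoint rule are all illustrative choices, not notation from the source. The point of the simulation is that the per-round mistake probability decays on the order of 1/m.

```python
import random

def simulate(threshold=0.6, rounds=300, trials=500, seed=0):
    """Estimate the probability of a mistake in each round for a
    midpoint version-space learner of initial intervals on [0, 1].

    The target concept labels x positively iff x <= threshold.
    All parameters are illustrative assumptions, not the paper's setup.
    """
    rng = random.Random(seed)
    mistakes = [0] * rounds
    for _ in range(trials):
        # Version space: every threshold in [lo, hi] is still consistent.
        lo, hi = 0.0, 1.0
        for m in range(rounds):
            x = rng.random()  # fresh example from the (uniform) distribution
            # Predict using the midpoint hypothesis of the version space.
            pred = 1 if x <= (lo + hi) / 2 else 0
            true = 1 if x <= threshold else 0
            if pred != true:
                mistakes[m] += 1
            # Shrink the version space to stay consistent with the label.
            if true == 1:
                lo = max(lo, x)   # threshold must be at least x
            else:
                hi = min(hi, x)   # threshold must be below x
    # Empirical mistake probability in each round m.
    return [count / trials for count in mistakes]
```

Running `simulate()` and comparing early rounds to late rounds shows the empirical mistake rate shrinking roughly like a constant over m, which is the regime in which the distribution-specific versus distribution-free constants discussed in the abstract become meaningful.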