Please help transcribe this video using our simple transcription tool. You need to be logged in to do so.


Many machine learning tasks can be ex- pressed as the transformationor transduc- tion of input sequences into output se- quences: speech recognition, machine trans- lation, protein secondary structure prediction and text-to-speech to name but a few. One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential distortions such as shrinking, stretching and translating. Recurrent neu- ral networks (RNNs) are a powerful sequence learning architecture that has proven capa- ble of learning such representations. How- ever RNNs traditionally require a pre-defined alignment between the input and output se- quences to perform transduction. This is a severe limitation since finding the alignment is the most difficult aspect of many sequence transduction problems. Indeed, even deter- mining the length of the output sequence is often challenging. This paper introduces an end-to-end, probabilistic sequence transduc- tion system, based entirely on RNNs, that re- turns a distribution over output sequences of all possible lengths and alignments for any in- put sequence. Experimental results are pro- vided on the TIMIT speech corpus.

Questions and Answers

You need to be logged in to be able to post here.