Description
To explore or exploit? In this paper, we discuss the long-standing exploration-exploitation dilemma in the context of designing a learning controller for stunt-style driving with scarce samples. By making efficient use of a single expert demonstration, our algorithm leverages our intuitive understanding of driving to extract a coarse dynamics model from the collected driving data, then formulates the policy search as a gradient update with a specially designed cost function. Both theoretical and empirical results are detailed and discussed.