In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free (HF) optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems: first, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, three natural and highly complex real-world sequence datasets, where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models, the Long Short-Term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002), which is used within the HF approach of Martens.
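To make the core idea concrete, here is a minimal sketch of a single Hessian-free style update on a toy least-squares problem, not the authors' implementation. It assumes a squared-error loss so that the generalized Gauss-Newton matrix reduces to J^T J, and it uses plain Tikhonov damping (adding `damping * I`) as a stand-in for the novel damping scheme, which the abstract does not describe. The function names `conjugate_gradient` and `hf_step` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def conjugate_gradient(matvec, b, iters=50, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v."""
    x = np.zeros_like(b)
    r = b.copy()              # residual b - A x, starting from x = 0
    p = r.copy()              # initial search direction
    rs = r @ r
    for _ in range(iters):
        if rs < tol:          # squared residual norm is small enough
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def hf_step(J, residual, damping):
    """One damped Gauss-Newton step for the loss 0.5 * ||residual||^2.

    Assumption: Tikhonov damping replaces the paper's damping scheme.
    """
    grad = J.T @ residual
    # Curvature-vector product (J^T J + damping * I) v; J^T J is never formed.
    gn_matvec = lambda v: J.T @ (J @ v) + damping * v
    return conjugate_gradient(gn_matvec, -grad)

# Toy linear model y = A w, so the Jacobian of the residual is A itself.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
w_true = rng.normal(size=5)
y = A @ w_true

w = np.zeros(5)
for _ in range(20):
    w = w + hf_step(A, A @ w - y, damping=1e-3)
```

The point of the design is that the curvature matrix only ever appears through matrix-vector products, which is what makes the approach feasible for networks with many parameters. In an actual RNN the products would be computed by forward and backward passes through the unrolled network rather than with an explicit Jacobian.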
