Vanishing Gradients, Xavier Initialization, Batch Normalization and Highway Architectures (ResNets, LSTMs and GRUs)

Slides

Slides from Kaiming He's tutorial at ICML 2016.

Christopher Olah's blog post on RNNs

The original ResNet paper.

The highway networks paper by Srivastava, Greff and Schmidhuber.