Deep Residual Learning for Image Recognition
Summary
Residual Neural Networks (ResNets) are the current state-of-the-art convolutional neural network architecture. The key difference between ResNets and other popular architectures is the use of "shortcut connections". The motivation is that "very deep" neural networks, i.e., networks with a large number of stacked layers, exhibit a degradation in accuracy that is not caused by overfitting: the training error itself increases as depth grows. A residual layer therefore learns a "residual" mapping F(x) = H(x) - x, where H(x) is the desired underlying mapping and x is the input to the layer. The input is then added back to the layer's output, so the block computes F(x) + x; hence the name "shortcut connection". One motivation for this formulation is that it is easy for the network to drive the residual toward 0 (and thereby realize an identity mapping), which is useful when an identity mapping is close to optimal for the situation; learning an identity mapping directly is difficult for a stack of nonlinear layers. Identity shortcut connections add no extra parameters.
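To make the shortcut connection concrete, here is a minimal sketch of a basic residual block. The PyTorch framework and the ResidualBlock class name are my own choices for illustration, not code from the paper, though the structure (two 3x3 convolutions with batch normalization, plus an identity shortcut) follows the paper's "basic" block design:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x), where F is two conv layers."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                              # shortcut: keep the input around
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))           # F(x), the learned residual mapping
        out = out + identity                      # add the input back in; no extra parameters
        return self.relu(out)

# Usage: the block maps a tensor to one of the same shape.
block = ResidualBlock(64)
x = torch.randn(1, 64, 32, 32)
y = block(x)  # shape (1, 64, 32, 32)
```

Because the shortcut is a plain addition of the input, it contributes no parameters. When the dimensions of F(x) and x differ (e.g., when the number of channels changes), the paper either zero-pads the shortcut or applies a 1x1 projection to match dimensions.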
Evidence
The authors evaluated ResNets on ImageNet and CIFAR-10. An ensemble of ResNets achieved 3.57% top-5 error on the ImageNet test set, winning first place in the ILSVRC 2015 classification task.
Strengths
The idea is simple and effective, and the paper presents it clearly, backed by extensive experimental results on ImageNet and CIFAR-10.
Interesting related works
Batch Norm, Highway Networks