Convolutional networks


AlexNet

Performed considerably better than the state of the art at the time of publication. Has 60 million parameters and 650,000 neurons, and consists of five convolutional layers followed by three fully connected layers.

The two ‘streams’ in the architecture exist only so that the network can be trained across two GPUs.

ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky et al. (2012)


GoogLeNet

The CNN that won the ILSVRC 2014 classification challenge. Composed of 9 Inception modules.

Going Deeper with Convolutions, Szegedy et al. (2014)


LeNet

A basic convolutional network, historically used for the MNIST dataset.

Gradient-Based Learning Applied to Document Recognition, LeCun et al. (1998)

Residual network

An architecture that uses skip connections to make very deep networks trainable. The original paper trained networks up to 152 layers deep, 8 times deeper than VGG nets. Used for image recognition, it won first place in the ILSVRC 2015 classification task. Residual connections can also be used to create deeper RNNs, such as Google’s 16-layer RNN encoder-decoder (Wu et al., 2016).

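Uses shortcut connections that perform the identity mapping and are added to the outputs of the stacked layers. Based on the hypothesis that, if $H(x)$ is the mapping the layers should learn, the residual $F(x) = H(x) - x$ is easier to optimise than $H(x)$ itself. Each block computes $y = F(x, \{W_i\}) + x$.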

Similar to highway networks, but with the advantage that the identity shortcuts do not introduce any extra parameters, whereas the gates in highway networks are learned.
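The shortcut idea can be sketched in a few lines of plain Python. This is only an illustration, not the paper's implementation: it assumes a toy residual function F(x) = W2 · relu(W1 · x) built from dense layers rather than convolutions, it omits the ReLU that the paper applies after the addition, and the helper names (`matvec`, `residual_block`, `zeros`) are made up for this example.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    """Return y = F(x) + x with the toy choice F(x) = W2 . relu(W1 . x)."""
    f = matvec(W2, relu(matvec(W1, x)))
    # Identity shortcut: the input is added back, with no extra parameters.
    return [fi + xi for fi, xi in zip(f, x)]

def zeros(n):
    """An n x n matrix of zeros."""
    return [[0.0] * n for _ in range(n)]

x = [1.0, -2.0, 3.0]

# With all-zero weights F(x) = 0, so the block reduces to the identity.
y_identity = residual_block(x, zeros(3), zeros(3))

# Stacking many such blocks still passes the input through unchanged.
y_deep = x
for _ in range(50):
    y_deep = residual_block(y_deep, zeros(3), zeros(3))
```

Because a block only needs to drive F(x) towards zero to behave like the identity, adding further blocks should not, in principle, make the mapping harder to represent.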

Deep Residual Learning for Image Recognition, He et al. (2015)


VGG

A CNN that secured first and second place in the 2014 ImageNet localization and classification tracks, respectively. VGG stands for the team that submitted the model, Oxford’s Visual Geometry Group. The VGG model consists of 16–19 weight layers and uses small convolutional filters of size 3×3 and 1×1.
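One way to see why small filters are attractive is to compare a stack of 3×3 convolutions with a single larger filter covering the same receptive field. The arithmetic below is a rough sketch under stated assumptions: stride 1, no biases, and the same number of channels C in and out of every layer. The function names and the choice C = 64 are illustrative only.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field (one side) of num_layers stacked stride-1 convolutions."""
    return num_layers * (kernel - 1) + 1

def conv_weights(kernel, channels):
    """Weights in one kernel x kernel conv layer with `channels` in and out (no bias)."""
    return kernel * kernel * channels * channels

C = 64  # assumed channel count, for illustration only

rf = stacked_receptive_field(3)            # three 3x3 layers cover a 7x7 window
stacked_params = 3 * conv_weights(3, C)    # 27 * C^2 weights
single_params = conv_weights(7, C)         # 49 * C^2 weights

print(rf, stacked_params, single_params)
```

Under these assumptions the three stacked 3×3 layers cover the same 7×7 window with roughly 45% fewer weights, while also interleaving more non-linearities.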

Very Deep Convolutional Networks for Large-Scale Image Recognition, Simonyan and Zisserman (2015)