We are in the middle of the third wave of interest in artificial neural networks as the leading paradigm for machine learning. The first wave dates back to the 1950s, the second to the 1980s, and the third to the 2010s. The following paper by Krizhevsky, Sutskever, and Hinton (henceforth KSH) is the paper most responsible for this third wave. Here, I sketch the intellectual history surrounding this work.
The current wave has been called "deep learning" because of the emphasis on having multiple layers of neurons between the input and the output of the neural network; the main architectural design features, however, remain the same as in the second wave of the 1980s. Central to that era was the publication of the back-propagation algorithm for training multilayer perceptrons by Rumelhart, Hinton, and Williams.7 This algorithm, a consequence of the chain rule of calculus, had been noted before, for example, by Werbos.8 However, the Rumelhart et al. version was significantly more impactful, as it was accompanied by interest in distributed representations of knowledge in cognitive science and artificial intelligence, in contrast to the symbolic representations favored by mainstream researchers.
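To make the chain-rule remark concrete (a minimal sketch added here, using notation that does not appear in the original): writing a network's forward pass as a composition of layer maps $h_l = f_l(h_{l-1})$ with input $h_0 = x$ and a scalar loss $\mathcal{L}(h_L)$, the chain rule gives the recurrence

$$\frac{\partial \mathcal{L}}{\partial h_{l-1}} \;=\; \frac{\partial \mathcal{L}}{\partial h_{l}}\,\frac{\partial f_l(h_{l-1})}{\partial h_{l-1}},$$

and back-propagation simply evaluates this recurrence from the output layer $l = L$ down to $l = 1$, reusing each intermediate gradient rather than recomputing it from scratch.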