• 0 Posts
  • 6 Comments
Joined 1 year ago
cake
Cake day: July 8th, 2023

help-circle
  • The specifics are a bit different, but the main ideas are much older than this, I’ll leave here the Wikipedia

    "Frank Rosenblatt, who published the Perceptron in 1958,[10] also introduced an MLP with 3 layers: an input layer, a hidden layer with randomized weights that did not learn, and an output layer.[11][12] Since only the output layer had learning connections, this was not yet deep learning. It was what later was called an extreme learning machine.[13][12]

    The first deep learning MLP was published by Alexey Grigorevich Ivakhnenko and Valentin Lapa in 1965, as the Group Method of Data Handling.[14][15][12]

    The first deep learning MLP trained by stochastic gradient descent[16] was published in 1967 by Shun’ichi Amari.[17][12] In computer experiments conducted by Amari’s student Saito, a five layer MLP with two modifiable layers learned internal representations required to classify non-linearily separable pattern classes.[12]

    In 1970, Seppo Linnainmaa published the general method for automatic differentiation of discrete connected networks of nested differentiable functions.[3][18] This became known as backpropagation or reverse mode of automatic differentiation. It is an efficient application of the chain rule derived by Gottfried Wilhelm Leibniz in 1673[2][19] to networks of differentiable nodes.[12] The terminology “back-propagating errors” was actually introduced in 1962 by Rosenblatt himself,[11] but he did not know how to implement this,[12] although Henry J. Kelley had a continuous precursor of backpropagation[4] already in 1960 in the context of control theory.[12] In 1982, Paul Werbos applied backpropagation to MLPs in the way that has become standard.[6][12] In 1985, David E. Rumelhart et al. published an experimental analysis of the technique.[7] Many improvements have been implemented in subsequent decades.[12]"