The gradient descent algorithm for training the multilayer
perceptron is known to be slow, especially as it approaches
a minimum. One reason is that it uses a fixed-size
step. In order to take into account the changing curvature of
the error surface
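The effect of a fixed-size step can be seen in a minimal sketch (not from this text): plain gradient descent on a one-dimensional quadratic error surface E(w) = ½aw², where the constant a plays the role of curvature. The function name and parameters are illustrative choices.

```python
def gradient_descent(a, w0, lr, steps):
    """Minimize E(w) = 0.5 * a * w**2 with a fixed learning rate `lr`."""
    w = w0
    for _ in range(steps):
        grad = a * w          # dE/dw on the quadratic surface
        w = w - lr * grad     # fixed-size step, blind to the curvature `a`
    return w

# The same fixed learning rate behaves very differently depending on
# curvature: on a flat surface (small a) progress toward the minimum at
# w = 0 is slow, while a steep surface (large a) shrinks w much faster.
w_flat = gradient_descent(a=0.1, w0=1.0, lr=0.1, steps=100)
w_steep = gradient_descent(a=1.0, w0=1.0, lr=0.1, steps=100)
```

Because the update multiplies w by (1 - lr·a) at every step, a rate tuned for one curvature is too timid for flatter regions and can oscillate or diverge where the surface is steeper, which is exactly the weakness a curvature-aware step would address.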