this technique typically achieves much faster learning compared to
common random weight initialization [45]. Since this algorithm
chooses initial values so that the active region of each neuron in
the layer is distributed approximately evenly across the layer's
input space, it has advantages over purely random initial values.
First, few units are wasted, since the active regions of all the units
lie within the input space. Second, training proceeds faster, since
every area of the input space contains active regions [52].
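A well-known scheme of this kind is the Nguyen–Widrow initialization, which rescales each hidden unit's weight vector to a common length β = 0.7·h^(1/n) (for h hidden units and n inputs) and draws biases uniformly from [−β, β]. The sketch below, in NumPy, is illustrative: the function name, the uniform draw in [−0.5, 0.5], and the particular constant are assumptions about one common variant, not necessarily the exact procedure of [45].

```python
import numpy as np

def nguyen_widrow_init(n_inputs, n_hidden, seed=None):
    """Sketch of a Nguyen-Widrow-style input-to-hidden initialization.

    Each hidden unit's weight vector is rescaled to length
    beta = 0.7 * n_hidden**(1/n_inputs), and its bias is drawn
    uniformly from [-beta, beta], spreading the units' active
    regions across the layer's input space.
    """
    rng = np.random.default_rng(seed)
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)
    # Random directions: one weight row per hidden unit.
    w = rng.uniform(-0.5, 0.5, size=(n_hidden, n_inputs))
    # Rescale each row so its Euclidean norm equals beta.
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    w = beta * w / norms
    # Biases uniform in [-beta, beta].
    b = rng.uniform(-beta, beta, size=n_hidden)
    return w, b
```

With inputs normalized to a bounded range, this places each unit's steep (most trainable) sigmoid region at a different location in the input space rather than clustering all units near the origin.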
The weights from the input units to the hidden units are
initialized by distributing the initial weights and biases so that,
for each input pattern, the net input to one of the hidden units is
likely to fall in the range in which that hidden neuron learns most
readily. The procedure consists of the following steps [45]: