2. HISTORY
Computers are extremely fast at numerical computations, far exceeding human capabilities. However, the human brain has many abilities that would be desirable in a computer. These include: the ability to quickly identify features, even in the presence of noise; to understand, interpret, and act on probabilistic or fuzzy notions (such as "Maybe it will rain tomorrow"); to make inferences and judgments based on past experiences and relate them to situations that have never been encountered before; and to suffer localized damage without a complete loss of functionality (fault tolerance). So even though the computer is faster than the human brain in numeric computations, the brain far outperforms the computer in other tasks. This is the underlying motivation for trying to understand and model the human brain.
The neuron is the basic computational unit of the brain (see Fig. 1). A human brain has approximately $10^{11}$ neurons acting in parallel. The neurons are highly interconnected, with a typical neuron being connected to several thousand other neurons. [For more details of the biology of neurons see Thompson (1985).] Early work on modeling the brain started with models of the neuron. The McCulloch-Pitts model (McCulloch and Pitts 1943) of the neuron was one of the first attempts in this area. The McCulloch-Pitts model is a simple binary threshold unit (Fig. 2). The neuron receives a weighted sum of inputs from connected units, and outputs a value of one (fires) if this sum is greater than a threshold. If the sum is less than the threshold, the model neuron outputs a zero value. Mathematically we can represent this
model as
$$ y_i = \Theta\Bigl(\sum_j w_{ij} x_j - \mu_i\Bigr), $$
where $y_i$ is the output of neuron $i$, $w_{ij}$ is the weight from neuron $j$ to neuron $i$, $x_j$ is the output of neuron $j$, $\mu_i$ is the threshold for neuron $i$, and $\Theta$ is the activation function, defined as
$$ \Theta(\text{net input}) = \begin{cases} 1 & \text{if net input} > 0 \\ 0 & \text{otherwise.} \end{cases} $$
Although this model is simple, it has been demonstrated that computationally it is equivalent to a digital computer. This means that any of the computations carried out on conventional digital computers can be accomplished with a set of interconnected McCulloch-Pitts neurons (Abu-Mostafa 1986).
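To make the threshold unit and the computational-equivalence argument concrete, the following short Python sketch implements the McCulloch-Pitts neuron defined above and wires single units into AND, OR, and NOT gates, the building blocks of digital computation. The function names and the particular weight and threshold values are illustrative assumptions, not part of the original model.

# Minimal sketch of the McCulloch-Pitts threshold unit described above.
# Function names and the specific weights/thresholds are illustrative choices.

def mcp_neuron(inputs, weights, threshold):
    """Binary threshold unit: fires (1) if the weighted input sum exceeds the threshold."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return 1 if net_input > 0 else 0

# Basic logic gates built from single threshold units, hinting at why networks
# of such units can, in principle, carry out any digital computation.
def AND(a, b):
    return mcp_neuron([a, b], weights=[1, 1], threshold=1.5)

def OR(a, b):
    return mcp_neuron([a, b], weights=[1, 1], threshold=0.5)

def NOT(a):
    return mcp_neuron([a], weights=[-1], threshold=-0.5)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"a={a} b={b}  AND={AND(a, b)}  OR={OR(a, b)}")
    print(f"NOT(0)={NOT(0)}  NOT(1)={NOT(1)}")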
In the early 1960s Rosenblatt developed a learning algorithm for a model he called the simple perceptron (Rosenblatt 1962). The simple perceptron consists of McCulloch-Pitts model neurons that form two layers, input and output. The input neurons receive data from the external world, and the output neurons send information from the network to the outside world. Each input neuron is unidirectionally connected to all the output neurons. The model uses binary (-1 or 1) input and output units. Rosenblatt was able to demonstrate that if a solution to a classification problem "existed," his model would converge to, or learn, the solution in a finite number of steps. For the problems in which he was interested, a solution existed if the problem was linearly separable. (Linear separability means that a hyperplane, which is simply a line in two dimensions, exists that can completely delineate the classes that the classifier attempts to identify. Problems that are linearly separable are only a special case of all possible classification problems.) A major blow to the early development of neural networks occurred when Minsky and Papert picked up on the linear separability limitation of the simple perceptron and published results demonstrating this limitation (Minsky and Papert 1969). Although Rosenblatt knew of these limitations, he had not yet found a way to train other models to overcome this problem. As a result, interest and funding in neural networks waned. [It is interesting to note that while Rosenblatt, a psychologist, was interested in modeling the brain, Widrow, an engineer, was developing a similar model for signal processing applications called the Adaline (Widrow 1962).]
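As a concrete illustration of this kind of learning procedure, here is a minimal Python sketch of perceptron training on a linearly separable toy problem with binary (-1 or 1) targets. The data, learning rate, and stopping rule are illustrative assumptions rather than Rosenblatt's original formulation.

# Minimal sketch of simple-perceptron training on a linearly separable toy
# problem. Data, learning rate, and stopping rule are illustrative assumptions.

def train_perceptron(samples, targets, lr=0.1, max_epochs=100):
    """Learn weights and a bias so that sign(w . x + b) matches the -1/+1 targets."""
    n = len(samples[0])
    w = [0.0] * n
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(samples, targets):
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if y != t:                      # update only on a misclassification
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
                errors += 1
        if errors == 0:                     # converged: every sample classified
            break
    return w, b

# Logical OR with -1/+1 coding: linearly separable, so training converges.
samples = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
targets = [-1, 1, 1, 1]
w, b = train_perceptron(samples, targets)
print("weights:", w, "bias:", b)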
In the 1970s there was still a limited amount of research activity in the area of neural networks. Modeling the memory was the common thread of most of this work. (Anderson (1970) and Willshaw, Buneman, and Longuet-Higgins (1969) discuss some of this work.) Grossberg (1976) and von der Malsburg (1973) were developing ideas on competitive learning, while Kohonen (1982) was developing feature maps. Grossberg (1983) was also developing his Adaptive Resonance Theory. Obviously, there was a great deal of work done during this period, with many important papers and ideas that are not presented in this paper. [For a more detailed description of the history see Cowan and Sharp (1988).]
Interest in neural networks was renewed with the Hopfield model (Hopfield 1982) of a content-addressable memory. In contrast to the human brain, a computer stores data in a look-up table. Access to this memory is made using addresses. The human brain does not go through this look-up process; it "settles" to the closest match based on the content of the information presented to it.