In this paper, a connectionist memory model is presented that is capable of effective
sequential learning, exhibiting gradual forgetting where a standard backpropagation
architecture forgets catastrophically. The proposed architecture relies on separating the
previously learned representations from those that are currently being learned. Further,
and crucially, a method is described by which an approximation of the previously learned
data (not the original patterns themselves, which the network will not see again) is
extracted from the network and mixed in with the new patterns to be learned.
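The mixing step can be illustrated with a minimal sketch, assuming a simple one-hidden-layer feedforward network; the names used here (forward, make_pseudopatterns) and the use of random binary probe inputs are illustrative assumptions, not the paper's exact procedure. The idea is that input-output pairs generated by the trained network itself stand in for the old data when the new patterns are learned.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(weights, x):
    # One hidden layer with sigmoid units; stands in for the already-trained network.
    w1, w2 = weights
    h = 1.0 / (1.0 + np.exp(-x @ w1))
    return 1.0 / (1.0 + np.exp(-h @ w2))

def make_pseudopatterns(weights, n_items, n_inputs):
    # Approximate the previously learned data: random binary inputs paired with
    # whatever outputs the current network produces for them.
    x = rng.integers(0, 2, size=(n_items, n_inputs)).astype(float)
    y = forward(weights, x)
    return x, y

# Hypothetical weights that are assumed to already encode previously learned patterns.
weights = (rng.normal(size=(8, 16)), rng.normal(size=(16, 4)))

# New patterns to be learned next (placeholder data for illustration).
new_x = rng.integers(0, 2, size=(5, 8)).astype(float)
new_y = rng.integers(0, 2, size=(5, 4)).astype(float)

# Extract an approximation of the old knowledge from the network itself.
pseudo_x, pseudo_y = make_pseudopatterns(weights, n_items=20, n_inputs=8)

# The training set for the next learning phase mixes the new patterns with the
# pseudopatterns, so old knowledge is rehearsed without access to the original data.
train_x = np.vstack([new_x, pseudo_x])
train_y = np.vstack([new_y, pseudo_y])
```

Training on this mixed set with ordinary backpropagation is what allows the old mapping to be preserved approximately while the new patterns are acquired.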