During training, the input data (a P×N matrix comprising N clinical instances and P attributes of the mammographic data, where P = 5: BIRADS, age, shape, margin, and density) was fed into the input vector processor. The input vector processor first normalized the five input attributes and then generated two additional attributes, the products “Age*BIRADS” and “Shape*BIRADS.” The output data matrix of the input vector processor, with N clinical instances and 7 attributes, then served as the input to the two-stage neural network classification unit. The classification unit propagated all input patterns forward to compute all outputs. Comparing the model outputs with the target output class yielded an error, which was multiplied by a scaling parameter adjusted by the learning rate controller. The weight update unit then updated the weights at each stage so as to minimize this error. The process was repeated until the sum of squared errors (SSE) or MMSE fell below a pre-defined error value or until the maximum number of training epochs was reached. The resulting weights were then stored in the training weights unit.
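The data flow described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work, and it rests on several assumptions: instances are stored as rows (N×5 rather than P×N), normalization is min-max scaling, the two product attributes are computed from the normalized values, and a single sigmoid unit with a fixed learning rate stands in for the two-stage network and the learning rate controller. The names `process_inputs`, `train`, `sse_goal`, and the column order are hypothetical.

```python
import numpy as np

# Assumed column order; the text lists the attributes but not their layout.
BIRADS, AGE, SHAPE, MARGIN, DENSITY = range(5)


def process_inputs(X):
    """Normalize the 5 attributes, then append Age*BIRADS and Shape*BIRADS."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Assumption: min-max scaling; constant columns are left at zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    Z = (X - lo) / span
    age_birads = Z[:, AGE] * Z[:, BIRADS]
    shape_birads = Z[:, SHAPE] * Z[:, BIRADS]
    return np.column_stack([Z, age_birads, shape_birads])  # shape (N, 7)


def train(Z, y, lr=0.5, max_epochs=5000, sse_goal=0.5, seed=0):
    """Batch gradient descent on one sigmoid unit (a stand-in for the
    two-stage network), stopping on an SSE threshold or the epoch budget."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=Z.shape[1])
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        out = 1.0 / (1.0 + np.exp(-(Z @ w + b)))  # propagate all patterns
        err = y - out                             # compare with target class
        sse = float(err @ err)
        if sse < sse_goal:                        # pre-defined error value
            break
        # Scaled error term: negative gradient of SSE/2 w.r.t. pre-activation.
        delta = err * out * (1.0 - out)
        w += lr * (Z.T @ delta) / len(y)          # weight update
        b += lr * delta.sum() / len(y)
    # sse is the value measured at the start of the last epoch run.
    return w, b, sse, epoch
```

Here `train` returns the final weights together with the last measured SSE and the number of epochs used, which corresponds loosely to storing the weights in the training weights unit once either stopping condition is met.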