On the other hand, “bottom–up” selection depends only on the visual input provided instantaneously or in the very recent past (as in the immediately preceding frames of a movie). As such, it is not only much easier to control, but the correlation between input and resulting behavior is also easier to quantify. For this reason, Koch and Ullman [23] proposed that bottom–up attention is a suitable candidate for detailed computational models of selective attention. Specifically, they proposed that bottom–up attention is directed to salient parts of the visual scene, and they introduced the concept of a saliency map: a topographic map of the visual field whose scalar value at each location is the saliency of that location. Saliency is computed at multiple scales from local differences in visual submodalities (color, orientation, etc.). If both the basic premise that bottom–up attention is attracted by salience and their concept of how salience is computed are correct, attentional control reduces to finding the local maxima of the saliency map and assigning the successively visited foci of attention to those maxima in order of decreasing peak value. This results in a “covert attentional scan path,” in analogy to the sequence of eye movements in overt attention; see Fig. 1 for an illustrative example.
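The reduction of attentional control to visiting the maxima of the saliency map in order of decreasing peak value can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the mechanism proposed in [23]: the saliency map is taken as a given 2-D array, a local maximum is assumed to be a location strictly greater than all of its 8 neighbors (so plateaus are ignored), and the names `covert_scan_path` and `max_foci` are hypothetical.

```python
import numpy as np


def covert_scan_path(saliency, max_foci=None):
    """Return (row, col) local maxima of a 2-D saliency map,
    ordered by decreasing peak value.

    Assumption: a local maximum is strictly greater than all of its
    8 neighbors; locations outside the map count as -infinity.
    """
    s = np.asarray(saliency, dtype=float)
    if s.ndim != 2:
        raise ValueError("saliency map must be 2-D")

    rows, cols = s.shape
    # Pad with -inf so border locations can still qualify as maxima.
    padded = np.pad(s, 1, mode="constant", constant_values=-np.inf)

    is_peak = np.ones(s.shape, dtype=bool)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            # neighbor[r, c] equals s[r + dr, c + dc] (or -inf off the map).
            neighbor = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            is_peak &= s > neighbor

    peak_rows, peak_cols = np.nonzero(is_peak)
    # Sort peaks from most to least salient.
    order = np.argsort(-s[peak_rows, peak_cols], kind="stable")
    path = [(int(peak_rows[i]), int(peak_cols[i])) for i in order]
    return path if max_foci is None else path[:max_foci]


# Toy saliency map with three isolated peaks (values are made up).
demo = np.array([
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.6],
    [0.4, 0.1, 0.2, 0.1],
])

print(covert_scan_path(demo))  # [(1, 1), (2, 3), (3, 0)]
```

In this toy map the foci of attention are visited at the locations with values 0.9, 0.6, and 0.4, in that order, which corresponds to the sequence of fixations in a covert attentional scan path.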