Here, P denotes the patch model, I denotes the i'th training image, and I(a:b, c:d) denotes the rectangular region whose top-left and bottom-right corners are located at (a, c) and (b, d), respectively. The period symbol denotes the inner product operation and R denotes the ideal response map. The solution to this equation is a patch model that generates response maps that are, on average, closest to the ideal response map as measured by the least-squares criterion. An obvious choice for the ideal response map, R, is a matrix with zeros everywhere except at the center (assuming the training image patches are centered at the facial feature of interest). In practice, since the images are hand-labeled, there will always be some annotation error. To account for this, it is common to define R as a decaying function of distance from the center.
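To make the criterion concrete, the following is a minimal pure-Python sketch of how the cost could be evaluated for a candidate patch model. It is an illustration under stated assumptions rather than the implementation used here: the function names (`ideal_response`, `response_map`, `least_squares_cost`) are hypothetical, a Gaussian is assumed as one possible decaying function with an arbitrary `sigma`, windows are indexed by their top-left corner rather than their center, and the squared errors are summed rather than averaged.

```python
import math

def ideal_response(height, width, sigma=1.0):
    """Ideal response map R: value 1 at the center, decaying with distance.
    A Gaussian decay is assumed here; sigma is an illustrative parameter."""
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    return [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)] for y in range(height)]

def response_map(patch, image):
    """Inner product of the patch with every patch-sized window of the image.
    Entry (y, x) corresponds to the window whose top-left corner is (y, x)."""
    ph, pw = len(patch), len(patch[0])
    rows = len(image) - ph + 1
    cols = len(image[0]) - pw + 1
    return [[sum(patch[a][b] * image[y + a][x + b]
                 for a in range(ph) for b in range(pw))
             for x in range(cols)] for y in range(rows)]

def least_squares_cost(patch, images, ideal):
    """Sum over all training images of the squared difference between the
    response map produced by the patch and the ideal response map."""
    cost = 0.0
    for image in images:
        resp = response_map(patch, image)
        cost += sum((ideal[y][x] - resp[y][x]) ** 2
                    for y in range(len(ideal)) for x in range(len(ideal[0])))
    return cost

# Example: one 5x5 training image and a 3x3 patch give a 3x3 response map.
image = [[float(y * 5 + x) for x in range(5)] for y in range(5)]
patch = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
ideal = ideal_response(3, 3, sigma=1.0)
cost = least_squares_cost(patch, [image], ideal)
```

The patch model sought is the one that minimizes this cost over the training set. The sketch only evaluates the cost for a given patch and does not perform the minimization.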
A good choice is the 2D Gaussian distribution, which is equivalent to assuming that the annotation error is Gaussian-distributed. A visualization of this setup is shown in the following figure for the left outer eye corner: