The problem of face tracking can be posed as that of finding an efficient and robust way to combine the independent detections of various facial features with the geometrical dependencies they exhibit in order to arrive at an accurate estimate of facial feature locations in each image of a sequence. With this in mind, it is perhaps worth considering whether geometrical dependencies are at all necessary. In the following figure, the results of detecting the facial features with and without
geometrical constraints are shown. These results clearly highlight the benefit of capturing the spatial interdependencies between facial features. This relative performance is typical: relying strictly on the detections leads to overly noisy solutions, because the response map for each facial feature cannot be expected to always peak at the correct location. Whether the cause is image noise, lighting changes, or expression variation, the
only way to overcome the limitations of facial feature detectors is by leveraging the geometrical relationship that the features share with each other.
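
To make this argument concrete, the following minimal sketch compares raw, independent detections against detections constrained by a shared geometry. It is an illustration only, not the tracker described in this chapter. It assumes the simplest possible geometric model, a rigid five-point template that may only translate, and it uses hand-picked detection errors in which one feature's response map has peaked at a false location. The names `template`, `fit_translation`, and `constrain`, and all numeric values, are hypothetical.

```python
def fit_translation(template, detections):
    """Least-squares translation that best aligns the template to the detections."""
    n = len(template)
    dx = sum(d[0] - t[0] for t, d in zip(template, detections)) / n
    dy = sum(d[1] - t[1] for t, d in zip(template, detections)) / n
    return dx, dy


def constrain(template, detections):
    """Replace the raw detections with the translated template (the geometric constraint)."""
    dx, dy = fit_translation(template, detections)
    return [(tx + dx, ty + dy) for tx, ty in template]


def mean_error(estimate, truth):
    """Average Euclidean distance between estimated and true feature locations."""
    total = sum(((ex - tx) ** 2 + (ey - ty) ** 2) ** 0.5
                for (ex, ey), (tx, ty) in zip(estimate, truth))
    return total / len(truth)


# Hypothetical template: two eyes, nose tip, two mouth corners (pixels, face-centred).
template = [(-30.0, -20.0), (30.0, -20.0), (0.0, 0.0), (-20.0, 30.0), (20.0, 30.0)]

# Assumed true face position in the image: the template shifted by (100, 80).
truth = [(x + 100.0, y + 80.0) for x, y in template]

# Assumed per-feature detection errors. The third entry mimics a response map
# that peaked at the wrong location; the others are small jitter.
errors = [(2.0, -1.0), (-3.0, 2.0), (25.0, -15.0), (1.0, 3.0), (-2.0, -2.0)]
detections = [(x + ex, y + ey) for (x, y), (ex, ey) in zip(truth, errors)]

constrained = constrain(template, detections)

raw_error = mean_error(detections, truth)
constrained_error = mean_error(constrained, truth)

print("mean error, detections only:     %.2f px" % raw_error)
print("mean error, with shape constraint: %.2f px" % constrained_error)
```

Under this translation-only assumption, every constrained point carries the average of the individual detection errors, so the false peak is diluted across all features rather than corrupting one of them outright. A realistic face model must also account for scale, rotation, and non-rigid deformation, which this sketch deliberately ignores.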