Many works address multi-view face detection [10, 17, 27, 7]. They adopt a similar
divide-and-conquer strategy: different face detectors are trained separately
under different viewpoints or head poses, which are roughly quantized and estimated
simultaneously. Because viewpoint estimation is itself a difficult problem and
quantization introduces additional inaccuracy, such training is more difficult,
and the resulting detectors are usually slower or not accurate enough.