•
Our deep model keeps the score map output by the current classifier and it serves as contextual information to support the decision at the next stage
•
Cascaded classifiers are jointly optimized instead of being trained sequentially
•
To avoid overfitting, a stage-wise pre-training scheme is proposed to regularize optimization
•
Simulate the cascaded classifiers by mining hard samples to train the network stage-by-stage