Mapping facial tracking data to avatars is challenging and time-consuming, so a simple yet efficient approach is strongly needed. State-of-the-art methods are either vulnerable to noise or heavily reliant on complicated sensor devices.
To cope with the noisy data without requiring a motion capture device at runtime, we present a novel vision-based facial expression animation framework that applies a facial hierarchical model to pre-processed Motion Capture (MoCap) data. Our approach uses a facial tracking algorithm to extract the rigid head pose and a set of expression motion parameters from each video frame.
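The abstract does not name the tracking algorithm; a common way to separate the rigid head pose from non-rigid expression motion is a Procrustes (Kabsch) fit of the tracked landmarks against a neutral face. The sketch below illustrates that idea under this assumption; the names (`rigid_pose_and_expression`, `neutral`) are ours, not the paper's.

```python
import numpy as np

def rigid_pose_and_expression(landmarks, neutral):
    """Split tracked landmarks into a rigid head pose and residual
    expression offsets via a Kabsch fit to a neutral face.

    landmarks, neutral: (N, 3) arrays of corresponding points.
    Returns (R, t, expression) with landmarks ~= neutral @ R.T + t + expression.
    """
    mu_l, mu_n = landmarks.mean(0), neutral.mean(0)
    # Cross-covariance between the centred point sets.
    H = (neutral - mu_n).T @ (landmarks - mu_l)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_l - R @ mu_n
    # Whatever the rigid fit cannot explain is treated as expression motion.
    expression = landmarks - (neutral @ R.T + t)
    return R, t, expression
```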
We factorize the MoCap data as prior knowledge to filter the low-quality 2D signals.
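How the MoCap data is factorized is not spelled out in the abstract; one plausible reading, sketched below, learns a low-rank subspace from the MoCap frames via SVD and projects each noisy tracked parameter vector onto it, discarding components the prior cannot explain. All names here (`mocap_subspace`, `filter_signal`, the rank `k`) are illustrative assumptions.

```python
import numpy as np

def mocap_subspace(mocap_frames, k=12):
    """Factorize a MoCap matrix (frames x parameters) into a rank-k
    basis via SVD; this plays the role of the expression prior."""
    mean = mocap_frames.mean(0)
    _, _, Vt = np.linalg.svd(mocap_frames - mean, full_matrices=False)
    return mean, Vt[:k]            # (D,) mean, (k, D) basis

def filter_signal(noisy_params, mean, basis):
    """Denoise tracked parameters by projecting them onto the MoCap
    subspace; motion outside the prior's span is dropped as noise."""
    coeffs = (noisy_params - mean) @ basis.T
    return mean + coeffs @ basis
```

In such a scheme, `filter_signal` would be applied to the tracked parameter vector of every video frame before retargeting.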
In addition, a facial hierarchical model is established with the Hierarchical Gaussian Process Latent Variable Model (HGPLVM) to synthesize the holistic facial expression.
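As a rough illustration of such a hierarchy, the sketch below stacks per-region GPLVMs under a root GPLVM using the GPy library (our choice, not the paper's); a true HGPLVM optimizes all layers jointly, whereas this greedy layer-wise fit is only an approximation. The region names and latent dimensions are placeholders.

```python
import numpy as np
import GPy

def fit_hierarchy(regions, part_dim=2, root_dim=2):
    """regions: dict mapping a facial part name (e.g. 'eyes', 'mouth')
    to an (N, D_part) array of motion parameters over N frames."""
    # Leaf layer: an independent GPLVM per facial region, each
    # learning a low-dimensional latent space for that region.
    leaves = {name: GPy.models.GPLVM(Y, input_dim=part_dim)
              for name, Y in regions.items()}
    for m in leaves.values():
        m.optimize(messages=False)
    # Root layer: a GPLVM over the concatenated leaf latents, so a
    # single root coordinate drives the whole face.
    Z = np.hstack([np.asarray(m.X) for m in leaves.values()])
    root = GPy.models.GPLVM(Z, input_dim=root_dim)
    root.optimize(messages=False)
    return root, leaves

def synthesize(root, leaves, x_root, part_dim=2):
    """Propagate a root latent point down the hierarchy:
    root -> leaf latents -> per-region motion parameters."""
    z, _ = root.predict(np.atleast_2d(x_root))
    motions, offset = {}, 0
    for name, m in leaves.items():
        motions[name], _ = m.predict(z[:, offset:offset + part_dim])
        offset += part_dim
    return motions
```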
Experimental results demonstrate the effectiveness of our system.