The objective of video processing for AV ASR is to pre-process
images and extract features suitable for viseme recognition. The
form of the images and the features to be extracted can vary
substantially. Images can be in colour, greyscale or binarised,
and can show the full face, the mouth region or only the lips.
The face can be viewed frontally or in profile [6].
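
As a rough illustration of the image forms listed above, the
following minimal sketch converts a colour frame to greyscale,
binarises it, and crops a rectangular region such as the mouth
area. It is not taken from any system cited here. The function
names, the nested-list image representation, the BT.601 luma
weights, the fixed threshold of 128 and the crop coordinates are
all assumptions chosen for the example.

```python
# Illustrative sketch only: images are nested lists of pixels,
# which is an assumption made to keep the example dependency-free.

def to_greyscale(rgb_image):
    """Map each (r, g, b) pixel to a single luma value (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def binarise(grey_image, threshold=128):
    """Set pixels at or above the (assumed) threshold to 1, all others to 0."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in grey_image]


def crop_region(image, top, left, height, width):
    """Return the sub-image covering rows top..top+height-1 and
    columns left..left+width-1, e.g. a hypothetical mouth region."""
    return [row[left:left + width] for row in image[top:top + height]]


# Synthetic 4 x 6 colour frame: left half black, right half white.
frame = [[(0, 0, 0)] * 3 + [(255, 255, 255)] * 3 for _ in range(4)]

grey = to_greyscale(frame)
binary = binarise(grey)
mouth = crop_region(binary, top=1, left=2, height=2, width=3)
```

In a real system the crop coordinates would come from a face or
lip tracker rather than being fixed by hand, and the threshold
would normally be adapted to the lighting conditions.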