extraction algorithm from highly imperfect ASR results
(WER 75 percent) has been proposed; it uses course-related
sources such as textbooks and slide files [4].
Overall, most of these lecture speech recognition systems
have low recognition rates: the WERs for audio lectures are
approximately 40-85 percent. The poor recognition results
limit the efficiency of subsequent indexing. Therefore, how to
continuously improve ASR accuracy for lecture videos remains
an unsolved problem.
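
To make the WER figures quoted above concrete, the following is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR hypothesis, divided by the number of reference words. The function name word_error_rate and the plain whitespace tokenization are assumptions made for illustration; the cited systems may normalize text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is a plain
    whitespace split, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (minus the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Three of four reference words are misrecognized, giving a WER of 0.75.
print(word_error_rate("speech recognition for lectures",
                      "beach recognition four lecture"))  # 0.75
```

Because insertions are counted as errors, the WER can exceed 100 percent when the hypothesis is much longer than the reference.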