Fig. 2. Word error rate histogram of BN utterances and videos.

We built a Good-Turing smoothed 4-gram LM on the News data and measured a perplexity of 509 on the election test set, as opposed to the 174 we obtained with the baseline language model. This clearly shows that the transcripts of the speeches differ quite significantly in style from the political news content. Computing the perplexity-minimizing mixing weight, we interpolated the News and baseline LMs with a weight of 0.11, marginally improving the perplexity.
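As a minimal sketch of this interpolation step, assuming hypothetical scoring functions p_news and p_base that return smoothed (nonzero) n-gram probabilities, the perplexity-minimizing weight can be found by a simple grid search over a held-out set; EM on the mixture weight is the usual alternative:

```python
import math

def interpolated_perplexity(lam, heldout, p_news, p_base):
    """Perplexity of a held-out stream of (history, word) pairs under
    the mixture lam * p_news + (1 - lam) * p_base.
    Both component LMs are assumed smoothed, so p > 0 throughout."""
    log_prob = 0.0
    for history, word in heldout:
        p = lam * p_news(word, history) + (1 - lam) * p_base(word, history)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(heldout))

def best_mixing_weight(heldout, p_news, p_base, grid=101):
    """Grid search for the weight that minimizes held-out perplexity."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return min(candidates,
               key=lambda lam: interpolated_perplexity(lam, heldout,
                                                       p_news, p_base))
```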
In addition, we added all lexical items seen in the News sample but not present in our BN baseline vocabulary. This expanded our vocabulary from 71k to 88k words. Pronunciations for the new lexical items were generated by Pronunciation By Analogy [7], which was trained on the base Pronlex-derived vocabulary. Although this performed well on important novel lexical items like “superdelegate”, it did poorly on some of the names. For example, “Barack” was initially “/b/ /ae/ /r/ /ae/ /k/” as opposed to “/b/ /aa/ /r/ /aa/ /k/”, and “Putin” was “/p/ /ah/ /t/ /ih/ /n/” as opposed to “/p/ /uw/ /t/ /ih/ /n/”. We manually checked and corrected the most frequent items from the test set. The resulting adapted system obtained a 36.4% WER and an OOV rate of 0.5%.
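The vocabulary-expansion step amounts to the following sketch, under stated assumptions: baseline_lexicon maps the 71k BN words to pronunciations, news_words are the lexical items seen in the News sample, and g2p is a hypothetical stand-in for a grapheme-to-phoneme model such as Pronunciation By Analogy, not its real API:

```python
def expand_lexicon(baseline_lexicon, news_words, g2p):
    """Add News-only words to the lexicon with automatically generated
    pronunciations; returns the new entries for manual review."""
    new_entries = {}
    for word in news_words:
        if word not in baseline_lexicon:
            # g2p may err on names, e.g. /b ae r ae k/ for "Barack";
            # frequent test-set items were then corrected by hand.
            new_entries[word] = g2p(word)
    baseline_lexicon.update(new_entries)
    return new_entries
```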