As Table 5 shows, emotional speech classification was most accurate for utterances containing a single emotional segment. This may imply that almost all of these utterances were annotated cleanly with only one emotional state. When an utterance contained more than one emotional state, accuracy dropped, and it decreased further as the number of emotional segments increased.
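A breakdown of accuracy by the number of emotional segments, as discussed above, could be computed along the following lines. This is only a minimal sketch: the function name `accuracy_by_segment_count`, the record layout, and the toy records are assumptions made for illustration, and none of the values come from Table 5.

```python
from collections import defaultdict


def accuracy_by_segment_count(records):
    """Return {number of emotional segments: classification accuracy}.

    `records` is an iterable of (n_segments, true_label, predicted_label)
    tuples. This layout is a hypothetical one chosen for the sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for n_segments, true_label, predicted_label in records:
        total[n_segments] += 1
        if true_label == predicted_label:
            correct[n_segments] += 1
    # Accuracy per group = correctly classified utterances / all utterances.
    return {n: correct[n] / total[n] for n in sorted(total)}


# Toy records invented purely for illustration (not the data behind Table 5).
toy_records = [
    (1, "angry", "angry"),
    (1, "happy", "happy"),
    (2, "sad", "sad"),
    (2, "angry", "neutral"),
    (3, "happy", "sad"),
    (3, "neutral", "angry"),
]

per_group_accuracy = accuracy_by_segment_count(toy_records)
for n_segments, accuracy in per_group_accuracy.items():
    print(f"{n_segments} segment(s): accuracy = {accuracy:.2f}")
```

Grouping by segment count before computing accuracy keeps the single-segment utterances from masking the lower scores of the multi-segment ones, which is the comparison the table is used to make.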