Conclusion
This paper has reported research on emotional speech classification using speech from a Thai drama corpus. Three methods were implemented in this work: a baseline system, emotional segmentation, and binary classification. Our experiments showed that binary classification without emotional segmentation provided the best accuracy, at 53 percent. In addition, another test set, consisting of non-professional voice actors, was used to evaluate the binary classification in the last experiment. For these additional speakers, whose utterances were not included in the training set, the accuracy did not exceed 25 percent. This suggests that the amount of training data was insufficient for training speaker-independent models. Furthermore, this research did not consider the meaning of the words in the transcripts. Incorporating word meaning could improve the accuracy of emotion classification.