The combination module acts as a word summarizer: it takes the outputs of the phoneme recognition module and the tone recognition module as input and combines them into a candidate word. The candidate is then verified against the monosyllable pronunciation database, which is generated from a syllable rule of the form /C(C)V(ː)(C)T/, where C denotes a consonant (the optional second C in initial position marks a consonant cluster, and the optional C after the vowel is a final consonant), V a vowel, ː the length mark that distinguishes long vowels from short ones, and T the lexical tone.
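The combination and verification step can be sketched as follows. This is a minimal illustration and not the system's actual implementation: the phoneme inventories, the tone labels, the function names `shape` and `combine`, and the toy database are all hypothetical. Only the /C(C)V(ː)(C)T/ template comes from the text.

```python
import re

# Hypothetical, deliberately tiny inventories (assumptions for illustration only).
CONSONANTS = {"k", "p", "t", "r", "l", "m", "n"}
VOWELS = {"a", "i", "u", "e", "o"}
LENGTH_MARK = "ː"
TONES = {"0", "1", "2", "3", "4"}  # assumed tone labels

# /C(C)V(ː)(C)T/ written over class symbols; "L" stands for the length mark.
SYLLABLE_RULE = re.compile(r"CC?VL?C?T")


def shape(phonemes, tone):
    """Map a phoneme sequence plus a tone label to a class string such as 'CVLCT'."""
    classes = []
    for p in phonemes:
        if p in CONSONANTS:
            classes.append("C")
        elif p in VOWELS:
            classes.append("V")
        elif p == LENGTH_MARK:
            classes.append("L")
        else:
            return None  # unknown symbol
    if tone not in TONES:
        return None
    classes.append("T")
    return "".join(classes)


def combine(phonemes, tone, database):
    """Join phoneme and tone outputs, then verify the result.

    Returns the candidate syllable if it fits the syllable rule and is
    listed in the pronunciation database, otherwise None.
    """
    s = shape(phonemes, tone)
    if s is None or SYLLABLE_RULE.fullmatch(s) is None:
        return None
    candidate = "".join(phonemes) + tone
    return candidate if candidate in database else None


# Toy database standing in for the monosyllable pronunciation database.
toy_database = {"kaːn1", "pla0"}
print(combine(["k", "a", "ː", "n"], "1", toy_database))
print(combine(["a", "k"], "0", toy_database))
```

In this sketch the rule check and the database lookup are kept separate, so a candidate that is well formed but not listed in the database is rejected just like one that violates the template.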