is when sentences contain unknown
words, such as named entities and loan words written in
Thai. Techniques using machine learning algorithms, e.g.,
Winnow (Charoenpornsawat et al., 1998) and decision
trees (Theeramunkong and Usanavasin, 2000), have been
effective in overcoming the unknown-word problem. Work
by Aroonmanakun suggests that segmenting text into a
sequence of syllable-like units and then combining adjacent units that
collocate strongly can also help in these cases (Aroonmanakun,
2002; Aroonmanakun, 2005).
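
As a rough illustration of the syllable-merging idea, the following minimal sketch joins adjacent syllable-like units whose collocation score exceeds a threshold. It is not Aroonmanakun's implementation: the collocation measure (pointwise mutual information), the threshold value, the function names, and the romanized toy syllables are all assumptions made for illustration, and the input is assumed to be already split into syllable-like units.

```python
import math
from collections import Counter


def train_counts(corpus):
    """Count syllables and adjacent syllable pairs in a syllabified corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams


def pmi(a, b, unigrams, bigrams):
    """Pointwise mutual information of the adjacent pair (a, b).

    PMI is an assumed stand-in for "collocation strength"; unseen pairs
    score negative infinity so they are never merged.
    """
    joint = bigrams[(a, b)]
    if joint == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_ab = joint / n_bi
    p_a = unigrams[a] / n_uni
    p_b = unigrams[b] / n_uni
    return math.log2(p_ab / (p_a * p_b))


def merge_syllables(syllables, unigrams, bigrams, threshold):
    """Greedily join neighbouring syllables whose PMI reaches the threshold."""
    if not syllables:
        return []
    words = [[syllables[0]]]
    for prev, cur in zip(syllables, syllables[1:]):
        if pmi(prev, cur, unigrams, bigrams) >= threshold:
            words[-1].append(cur)   # strong collocation: extend current word
        else:
            words.append([cur])     # weak collocation: start a new word
    return ["".join(word) for word in words]


# Hypothetical toy corpus of romanized syllables (not real Thai data).
corpus = [
    ["to", "kyo", "is", "big"],
    ["to", "kyo", "is", "far"],
    ["it", "is", "big"],
    ["it", "is", "far"],
]
unigrams, bigrams = train_counts(corpus)
print(merge_syllables(["to", "kyo", "is", "far"], unigrams, bigrams, threshold=3.0))
# → ['tokyo', 'is', 'far']
```

In this toy setting the pair ("to", "kyo") always occurs together and so scores above the threshold, while pairs involving the frequent syllable "is" score lower and are left unmerged. Because merging is greedy and pairwise, a run of strongly collocated syllables is joined into a single unit without consulting a dictionary, which is why an approach of this kind can recover words that are absent from the lexicon.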