due to being language-dependent. While the
n-gram inverted index is language-independent, it still requires
indexing terms to be extracted with the n-gram term extraction
method before the inverted index can be constructed. Although
the n-gram inverted index can be applied to many Asian
languages and to other sequence patterns because it is
language-independent, determining an appropriate gram
length (the value of n) is problematic. This method also
requires more space for storing indexing terms when
compared to the word inverted index. The suffix array
approach is a language-independent technique that can be
applied to any language and to other sequence patterns.
However, it requires a large amount of storage space for
indexing because it generates and keeps all suffixes of the
text documents during the indexing process. Although the
suffix array approach does not require text pre-processing
to extract indexing terms before the suffix array can be
constructed, its index size appears to be a critical
drawback. This makes the suffix array approach
impractical at times for use in the Thai