A. Improving Dictionary-based Cross-lingual IR
Dictionary-based translation is a traditional approach in
cross-lingual IR systems, but performance degrades
significantly when queries contain words or phrases that do
not appear in the dictionary. This is called the
Out-of-Vocabulary (OOV) problem [4], and it is to be expected
even with the best dictionaries. User queries are usually
short, so even query expansion cannot recover the missing
words, because the query carries too little information.
Generally, OOV terms are proper names or newly created
words. For example, a user who wants to search for
information about the Influenza A (H1N1) disease in Malaysia
might enter “H1N1 Malaysia” as the query. H1N1 is a newly
created term and may not be included in a dictionary that was
published only a few years earlier.
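
The “H1N1 Malaysia” case can be illustrated with a minimal sketch of lookup-based query translation. The dictionary contents, the English-to-Malay direction, and the names TOY_DICTIONARY and translate_query are assumptions made for illustration; they are not taken from any particular system described here.

```python
# Toy bilingual dictionary (English -> Malay). The entries are illustrative
# assumptions, standing in for a real machine-readable dictionary.
TOY_DICTIONARY = {
    "malaysia": ["malaysia"],
    "disease": ["penyakit"],
    "information": ["maklumat"],
}


def translate_query(query, dictionary):
    """Translate a query term by term; return (translations, oov_terms).

    A term with no dictionary entry is reported as out-of-vocabulary (OOV)
    instead of being translated.
    """
    translated = []
    oov_terms = []
    for term in query.lower().split():
        candidates = dictionary.get(term)
        if candidates:
            translated.extend(candidates)
        else:
            oov_terms.append(term)
    return translated, oov_terms


translated, oov = translate_query("H1N1 Malaysia", TOY_DICTIONARY)
print(translated)  # ['malaysia']
print(oov)         # ['h1n1']
```

In this sketch only “Malaysia” survives translation, so the target-language query loses the one term that identifies the topic, which is the degradation described above.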