Abstract— A natural language information retrieval system
ranks related documents according to criteria based on user
query keywords and document similarities. Users, however, tend
to supply only a few keywords in natural language search
queries on the Web, so considerable effort has gone into
constructing more useful query keywords. Because a few keywords
carry little information, relevance feedback is generally used
to compensate for this weakness of general retrieval methods.
This paper
proposes a term cluster query expansion model based on
classification information of retrieved documents. This model
generates classification information from the top-ranked n
documents returned by the retrieval system. On the basis of the
extracted classification information, a term cluster (m) that
represents each group is generated, and the model then lets the
user select the term cluster that matches the user's information
need. The query keywords are expanded by using a relevance
feedback algorithm based on the selected classification
information. In experiments with a test collection, retrieval
effectiveness improved by 13.2% over the initial query when the
Rocchio method was used.
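
The expansion step described above can be illustrated with a minimal sketch of textbook Rocchio relevance feedback. It is not the paper's implementation. It assumes that documents are plain token lists, that the documents in the user-selected term cluster are treated as relevant and the remaining top-ranked documents as non-relevant, and that the weights alpha, beta, gamma and the cutoff top_k take illustrative values. The function name rocchio_expand and the sample data are hypothetical.

```python
from collections import Counter


def rocchio_expand(query_terms, relevant_docs, nonrelevant_docs,
                   alpha=1.0, beta=0.75, gamma=0.15, top_k=5):
    """Return {term: weight} for the original query plus up to top_k new terms.

    Applies q' = alpha*q + beta*centroid(relevant) - gamma*centroid(nonrelevant)
    over raw term frequencies. The parameter defaults are assumptions.
    """
    def centroid(docs):
        # Average term frequency over a set of tokenized documents.
        total = Counter()
        for doc in docs:
            total.update(doc)
        return {t: c / len(docs) for t, c in total.items()} if docs else {}

    q = Counter(query_terms)
    rel = centroid(relevant_docs)
    non = centroid(nonrelevant_docs)

    weights = {}
    for term in set(q) | set(rel) | set(non):
        w = (alpha * q.get(term, 0)
             + beta * rel.get(term, 0.0)
             - gamma * non.get(term, 0.0))
        if w > 0:  # drop terms pushed to zero or below
            weights[term] = w

    # Keep the original query terms, then add the best-scoring new terms.
    expanded = {t: weights[t] for t in q if t in weights}
    new_terms = sorted((t for t in weights if t not in q),
                       key=lambda t: (-weights[t], t))[:top_k]
    expanded.update({t: weights[t] for t in new_terms})
    return expanded


# Hypothetical data: the user picked the cluster about cars.
query = ["jaguar"]
selected_cluster_docs = [["jaguar", "car", "engine"],
                         ["jaguar", "car", "dealer"]]
other_top_docs = [["jaguar", "cat", "jungle"]]

expanded = rocchio_expand(query, selected_cluster_docs, other_top_docs, top_k=2)
print(expanded)
```

In this sketch, terms that occur only in the unselected documents receive a negative weight and are dropped, so the expanded query moves toward the vocabulary of the chosen cluster.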