Abstract- As the volume of clinical notes written in natural
language increases rapidly, physicians need a tool that
automatically extracts information about diseases and treatments.
The main problem in extracting medical information is that
physicians use different terms to describe the same
disease or treatment. To help physicians interpret and
share disease/treatment information in clinical notes, we need to
detect and normalize medical terms reliably and effectively.
In this study, we detect and normalize medical
terms using the UMLS Metathesaurus combined with a
document retrieval technique. We regard a medical sentence as
a query and a UMLS ontology entry as a document, and
apply a language-modeling-based information retrieval method
of the kind used in document retrieval. Because the
term frequency in the UMLS dictionary is uniform, we employ a
domain-specific term frequency instead of the traditional term
frequency. To retrieve only the relevant terms from among 900,000 UMLS
entries, we also propose an adaptive ranking method that
dynamically determines the relevant documents for each query
without using a static cut-off threshold.
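
The retrieval idea summarized above can be illustrated with a minimal sketch. It is not the method of this study: the abstract does not state the smoothing scheme or the cut-off rule, so the sketch assumes Jelinek-Mercer smoothing for the query-likelihood score and a "largest score gap" rule for the adaptive cut-off. All function names, the weight `lam`, and the toy counts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real notes need more care)."""
    return text.lower().split()


def score(query_tokens, entry_tokens, domain_tf, domain_total, lam=0.5):
    """Query log-likelihood of a sentence given one dictionary entry.

    Assumed form: Jelinek-Mercer smoothing, with the background model
    estimated from domain-specific term counts (domain_tf) rather than
    from the dictionary itself.
    """
    entry_counts = Counter(entry_tokens)
    entry_len = len(entry_tokens)
    vocab_size = len(domain_tf)
    log_p = 0.0
    for term in query_tokens:
        p_entry = entry_counts[term] / entry_len if entry_len else 0.0
        # Add-one smoothing keeps the background probability above zero.
        p_domain = (domain_tf.get(term, 0) + 1) / (domain_total + vocab_size + 1)
        log_p += math.log(lam * p_entry + (1 - lam) * p_domain)
    return log_p


def adaptive_cutoff(ranked):
    """Keep the entries ranked above the largest gap between adjacent scores.

    This gap rule is an assumed stand-in for the adaptive ranking method.
    """
    if len(ranked) < 2:
        return ranked
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    return ranked[: gaps.index(max(gaps)) + 1]


def retrieve(sentence, entries, domain_tf):
    """Rank dictionary entries for one sentence and apply the adaptive cut-off."""
    query_tokens = tokenize(sentence)
    domain_total = sum(domain_tf.values())
    ranked = sorted(
        ((entry, score(query_tokens, tokenize(entry), domain_tf, domain_total))
         for entry in entries),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return adaptive_cutoff(ranked)


# Toy data, invented for illustration only.
domain_tf = Counter({"patient": 50, "had": 30, "myocardial": 5, "infarction": 5,
                     "renal": 4, "failure": 6, "hip": 3, "fracture": 3})
entries = ["myocardial infarction", "renal failure", "hip fracture"]
result = retrieve("patient had a myocardial infarction last year", entries, domain_tf)
```

With this toy data, only the entry that shares words with the sentence lies above the largest score gap, so the number of returned entries is set per query rather than by a fixed threshold.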