MapReduce has been applied to data-intensive
applications in different domains because of its simplicity,
scalability and fault tolerance. However, its use in biomedical
association mining remains very limited. In this paper, we
investigate using MapReduce to efficiently mine the
associations between biomedical terms extracted from a set of
biomedical articles. First, biomedical terms were obtained by
matching text to the Unified Medical Language System (UMLS)
Metathesaurus, a database of biomedical vocabularies and
standards. Then we developed a MapReduce algorithm that
can be used to calculate a class of interestingness
measures defined on a 2x2 contingency table. This
algorithm consists of two MapReduce jobs and takes a stripes
approach to reduce the number of intermediate results.
Experiments were conducted using Amazon Elastic
MapReduce (EMR) with an input of 3610 articles retrieved
from two biomedical journals. The test results indicate that our
algorithm scales linearly.
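
The stripes approach and the 2x2 contingency table mentioned above can be illustrated with a minimal in-memory sketch. It is not the implementation evaluated in this paper: the function names, the reserved key "*" used to carry a term's document frequency, and the choice of lift as the example measure are all assumptions made for illustration. The first stage plays the role of a stripes job (one associative array per term rather than one record per term pair); the second derives the four contingency cells for a pair from the merged stripes.

```python
from collections import Counter, defaultdict

DOC_FREQ = "*"  # assumed reserved key; must not collide with a real term


def map_stripes(doc_terms):
    """Mapper sketch: emit one (term, stripe) pair per distinct term in a document."""
    terms = sorted(set(doc_terms))
    for t in terms:
        # The stripe records every term that co-occurs with t in this document.
        stripe = Counter({u: 1 for u in terms if u != t})
        stripe[DOC_FREQ] = 1  # count this document toward t's own frequency
        yield t, stripe


def reduce_stripes(pairs):
    """Reducer sketch: sum the stripes of each term element-wise."""
    merged = defaultdict(Counter)
    for term, stripe in pairs:
        merged[term].update(stripe)  # Counter.update adds counts
    return merged


def contingency(merged, a, b, n_docs):
    """Return (n11, n10, n01, n00) for terms a and b over n_docs documents."""
    n11 = merged[a][b]                # documents containing both a and b
    n10 = merged[a][DOC_FREQ] - n11   # a without b
    n01 = merged[b][DOC_FREQ] - n11   # b without a
    n00 = n_docs - n11 - n10 - n01    # neither term
    return n11, n10, n01, n00


def lift(n11, n10, n01, n00):
    """One example of a measure defined on the 2x2 table."""
    n = n11 + n10 + n01 + n00
    return n11 * n / ((n11 + n10) * (n11 + n01))
```

Because each mapper emits one stripe per term instead of one record per co-occurring pair, the number of intermediate records grows with the number of distinct terms in a document rather than with the number of term pairs, which is the reduction in intermediate results that the stripes approach aims for.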