Recall is supposed to measure the retrievability of an IR system, that is, how much of the relevant material it manages to retrieve, whereas precision should assess the system's ability to separate the relevant from the nonrelevant. Two major stumbling blocks, however, stand in the way of calculating these measures. First, how can relevance be defined and measured? Second, how can the total number of relevant documents in a system be known?
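
As an illustration, the sketch below computes both measures using the conventional set-based definitions: precision as the share of retrieved documents that are relevant, and recall as the share of relevant documents that are retrieved. The function name `precision_recall` and the document identifiers are hypothetical. The sketch also assumes that relevance is binary and that the full set of relevant documents is already known, which are precisely the two assumptions the questions above call into doubt.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the documents the system returned.
    relevant:  identifiers of ALL relevant documents in the collection
               (assumed known here; in practice this is rarely the case).
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 5 relevant overall, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"],
                        ["d1", "d3", "d5", "d6", "d7"])
print(p, r)  # 0.5 0.4
```

Note that the denominator of recall depends on documents the system never returned, so it cannot be computed from the search output alone.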