where the symbol $\stackrel{rank}{=}$, as we mentioned previously, means that the right-hand
side is rank equivalent to the left-hand side (i.e., we can ignore the normalizing
constant P(Q)), P(D) is the prior probability of a document, and P(Q|D) is
the query likelihood given the document. In most cases, P(D) is assumed to be
uniform (the same for all documents), and so will not affect the ranking. Models
that assign non-uniform prior probabilities based on, for example, document
date or document length can be useful in some applications, but we will make
the simpler uniform assumption here. Given that assumption, the retrieval model
specifies ranking documents by P(Q|D), which we calculate using the unigram
language model for the document