How do we interpret this formula? The prior probability $P(D)$ is usually assumed
to be uniform and can be ignored. The expression $\prod_{i=1}^{n} P(q_i|D)$ is, in fact,
the query likelihood score for the document $D$. This means that the estimate for
$P(w, q_1 \ldots q_n)$ is simply a weighted average of the language model probabilities
for $w$ in a set of documents, where the weights are the query likelihood scores for
those documents.
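This weighted average can be sketched in Python. Everything here is illustrative scaffolding, not the book's implementation: `docs` and `collection` are term-frequency counters, `mu` is an assumed Dirichlet smoothing parameter, and only terms observed in some retrieved document are kept in the estimate.

```python
from collections import Counter

def query_likelihood(query, doc, collection, mu=2000):
    # P(q|D) = prod_i P(q_i|D), with Dirichlet-smoothed term probabilities
    doc_len, coll_len = sum(doc.values()), sum(collection.values())
    score = 1.0
    for term in query:
        score *= (doc[term] + mu * collection[term] / coll_len) / (doc_len + mu)
    return score

def relevance_model(query, docs, collection, mu=2000):
    # Estimate of P(w, q_1 ... q_n): document language model probabilities
    # P(w|D) averaged with the query likelihood scores as the weights.
    coll_len = sum(collection.values())
    rm = Counter()
    for doc in docs.values():
        weight = query_likelihood(query, doc, collection, mu)
        doc_len = sum(doc.values())
        for term in doc:  # sketch: only terms observed in this document
            p_w_d = (doc[term] + mu * collection[term] / coll_len) / (doc_len + mu)
            rm[term] += weight * p_w_d
    total = sum(rm.values())
    return {t: p / total for t, p in rm.items()}  # normalize to a distribution
```

In practice the sum would run only over the top-ranked documents from the first pass rather than the whole collection, since the query likelihood weights of low-ranked documents contribute little.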
Ranking based on relevance models actually requires two passes. The first pass
ranks documents using query likelihood to obtain the weights that are needed
for relevance model estimation. In the second pass, we use KL-divergence to rank
documents by comparing the relevance model and the document model. Note
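The second pass can be sketched as a cross-entropy scorer: ranking by $\sum_w P(w|R) \log P(w|D)$ is rank-equivalent to KL-divergence, since the relevance model's own entropy term is constant across documents. The names `rm`, `docs`, and the Dirichlet parameter `mu` are assumptions for illustration; `rm` is a term-probability dictionary whose terms are assumed to occur in the collection.

```python
import math
from collections import Counter

def kl_rank(rm, docs, collection, mu=2000):
    # score(D) = sum_w P(w|R) log P(w|D); dropping the document-independent
    # term sum_w P(w|R) log P(w|R) leaves the KL-divergence ranking unchanged
    coll_len = sum(collection.values())
    scores = {}
    for name, doc in docs.items():
        doc_len = sum(doc.values())
        scores[name] = sum(
            p_r * math.log((doc[term] + mu * collection[term] / coll_len)
                           / (doc_len + mu))
            for term, p_r in rm.items())
    return sorted(scores, key=scores.get, reverse=True)
```

Smoothing matters twice here: it keeps $\log P(w|D)$ finite for relevance model terms the document does not contain, in addition to its usual role in the first-pass scores.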
also that we are in effect adding words to the query by smoothing the relevance
model using documents that are similar to the query. Many words that had zero
probabilities in the relevance model based on query frequency estimates will now
have non-zero values. What we are describing here is exactly the pseudo-relevance
feedback process described in section 6.2.4. In other words, relevance models provide
a formal retrieval model for pseudo-relevance feedback and query expansion.
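Under that view, query expansion falls out directly. A minimal sketch, assuming `rm` is a relevance model distribution estimated as above: the highest-probability terms outside the original query are exactly the expansion terms that pseudo-relevance feedback would add.

```python
def expansion_terms(rm, query, k=10):
    # Top-k relevance model terms not already in the query; these are the
    # words whose probabilities became non-zero through document smoothing.
    ranked = sorted(rm, key=rm.get, reverse=True)
    return [t for t in ranked if t not in set(query)][:k]
```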