In the main loop, the function loops once for each document in the collection.
At each document, all of the inverted lists are checked. If the document appears
in one of the inverted lists, the feature function f_i is evaluated, and the
document score s_D is computed by adding up the weighted function values. Then, the
inverted list pointer is moved to point at the next posting. At the end of each document loop, a new document score has been computed and added to the priority
queue R.
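The loop described above can be sketched in Python. This is a minimal illustration, not the book's pseudocode: the function name `document_at_a_time`, the representation of each inverted list as a list of `(doc_id, feature_value)` pairs sorted by document id, and the `weights` parameter are all assumptions made for the example. The stored feature value stands in for the evaluated feature function.

```python
import heapq

def document_at_a_time(query_lists, num_docs, weights):
    """Score every document in the collection, one document at a time.

    query_lists: one inverted list per query term (assumed layout: a list of
                 (doc_id, feature_value) pairs sorted by doc_id).
    num_docs:    number of documents; doc ids are assumed to be 0..num_docs-1.
    weights:     one weight per inverted list (hypothetical parameter).
    """
    pointers = [0] * len(query_lists)  # current posting in each inverted list
    results = []                       # priority queue R (heap on negated score)

    for doc in range(num_docs):
        score = 0.0
        for i, postings in enumerate(query_lists):
            p = pointers[i]
            # Check whether this list's current posting is for this document.
            if p < len(postings) and postings[p][0] == doc:
                score += weights[i] * postings[p][1]
                pointers[i] = p + 1    # move to the next posting
        # Every document is added, including those with a score of zero.
        heapq.heappush(results, (-score, doc))

    # Drain the queue from highest score to lowest.
    ranked = []
    while results:
        neg_score, doc = heapq.heappop(results)
        ranked.append((doc, -neg_score))
    return ranked
```

Note that this version keeps one queue entry per document in the collection, which is the memory cost the optimizations discussed next are meant to avoid.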
For clarity, this pseudocode is free of even simple performance-enhancing
changes. Realistically, however, the priority queue R only needs to hold the top
k results at any one time. If the priority queue ever contains more than k results,
the lowest-scoring documents can be removed until only k remain, in order to
save memory. Also, looping over all documents in the collection is unnecessary;
we can change the algorithm to score only documents that appear in at least one
of the inverted lists.
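Both changes can be sketched together. As before, this is an illustrative sketch under assumptions: the name `document_at_a_time_top_k` and the `(doc_id, feature_value)` posting layout are hypothetical. A min-heap keeps the lowest-scoring retained document at the front so it can be dropped as soon as the queue exceeds k entries, and the next document to score is taken from the inverted list pointers rather than from a loop over the whole collection.

```python
import heapq

def document_at_a_time_top_k(query_lists, weights, k):
    """Score only documents that appear in at least one inverted list,
    keeping at most k results in memory.

    query_lists: assumed layout, one list of (doc_id, feature_value) pairs
                 per query term, sorted by doc_id.
    """
    pointers = [0] * len(query_lists)
    top_k = []  # min-heap of (score, doc_id); the lowest score is at top_k[0]

    while True:
        # The next candidate is the smallest doc id under any list pointer.
        current = [postings[p][0]
                   for postings, p in zip(query_lists, pointers)
                   if p < len(postings)]
        if not current:
            break  # every inverted list is exhausted
        doc = min(current)

        score = 0.0
        for i, postings in enumerate(query_lists):
            p = pointers[i]
            if p < len(postings) and postings[p][0] == doc:
                score += weights[i] * postings[p][1]
                pointers[i] = p + 1

        heapq.heappush(top_k, (score, doc))
        if len(top_k) > k:
            heapq.heappop(top_k)  # discard the lowest-scoring document

    # Report the retained documents from highest score to lowest.
    return [(doc, score) for score, doc in sorted(top_k, reverse=True)]
```

With two short lists and k = 2, only the three documents that occur in some list are scored, and the queue never holds more than three entries at once.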
The primary benefit of this method is its frugal use of memory. The only major
use of memory comes from the priority queue, which only needs to store k entries
at a time. However, in a realistic implementation, large portions of the inverted
lists would also be buffered in memory during evaluation.