Key idea 1: Do the same for queries, i.e., represent them as vectors in the same space as the documents
Key idea 2: Rank documents according to their proximity to the query in this space
proximity = similarity of vectors
proximity ≈ inverse of distance (the smaller the distance, the greater the proximity)
Recall: We do this because we want to get away from the you’re-either-in-or-out Boolean model.
Instead: rank more relevant documents higher than less relevant documents
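
The two key ideas can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes cosine similarity as the proximity measure (the text above only says "similarity of vectors"), and the function names, the three-term vocabulary, and the weights are all hypothetical toy values.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scores = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: term weights over a three-term vocabulary.
docs = {
    "d1": [1.0, 0.0, 2.0],
    "d2": [0.0, 3.0, 0.0],
    "d3": [2.0, 1.0, 2.0],
}
# Key idea 1: the query is a vector in the same space as the documents.
query = [1.0, 0.0, 1.0]
# Key idea 2: rank documents by their proximity to the query.
ranking = rank_documents(query, docs)
```

Every document receives a graded score rather than an in-or-out decision: here d1 and d3 both score close to 1 and d2 scores 0, so d2 is ranked last instead of being filtered out.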