vocabulary of 1M for the Oxford Buildings dataset, (i) only
5K visual words have S_i > 0, and hence have a "positive"
impact on the retrieval process; (ii) 76K visual words have
S_i < 0, and have a "negative" impact; (iii) the remaining 919K
visual words have S_i = 0, and do not affect the retrieval process.
We use the 5K "positive" visual words for our retrieval.
The mAP increases by 4% (see Figure 3(a)) when evaluated
using a different test dataset. Hence, with 5K visual
words, the vocabulary shrinks by a factor of 200 and yet
retrieval performance improves. This approach requires
extensive ground truth, which is available to us in the form
of an annotated database. We expect our approach to work well
on images for which annotations are available. It cannot be
applied to images without annotations.
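
The selection step described above can be illustrated with a minimal sketch. It assumes the per-word scores S_i have already been computed from the annotated database and are given as a plain array; the function names `partition_by_score` and `restrict_histogram`, and the toy scores and histogram, are hypothetical and serve only as illustration. The sketch splits the vocabulary by the sign of S_i and keeps only the "positive" bins of a bag-of-words histogram.

```python
import numpy as np

def partition_by_score(scores):
    """Split visual-word indices by the sign of their score S_i.

    `scores` is assumed to be precomputed; how S_i is obtained
    is not part of this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    positive = np.flatnonzero(scores > 0)   # words that help retrieval
    negative = np.flatnonzero(scores < 0)   # words that hurt retrieval
    neutral = np.flatnonzero(scores == 0)   # words with no effect
    return positive, negative, neutral

def restrict_histogram(hist, keep):
    """Keep only the histogram bins whose word indices are in `keep`."""
    return np.asarray(hist)[keep]

# Toy vocabulary of 8 words (hypothetical scores, not from the paper).
scores = np.array([0.3, 0.0, -0.1, 0.0, 0.7, 0.0, -0.4, 0.0])
positive, negative, neutral = partition_by_score(scores)

# Toy bag-of-words histogram for one image over the full vocabulary.
hist = np.array([2, 5, 1, 0, 3, 4, 1, 0])
reduced = restrict_histogram(hist, positive)

# Ratio of the full vocabulary size to the retained vocabulary size.
reduction_factor = len(scores) / len(positive)
```

With the figures reported in the text, the same ratio is 1M / 5K = 200, and the three groups add up to the full vocabulary (5K + 76K + 919K = 1M).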