In order to accomplish this on a typical mobile phone, we
need to store data on the phone itself. Extracting SIFT descriptors
and performing spatial verification are two computationally
expensive steps on a mobile processor. The practical
bottlenecks are the storage and RAM requirements in
the entire process. Let us look at the storage requirement
for a typical image retrieval application in this setting.
1. 10K images of size 480 × 320 typically require 1.5 GB of
storage, even if stored in a compressed JPEG format.
2. Such an image typically has around 500 interest points,
each represented using 128-dimensional SIFT vectors
and their keypoint locations. This comes to around
1.5 GB of storage.
3. Each image is typically represented using a histogram
of about 5 KB. This leads to 50 MB for the 10K images.
4. Each annotation, of around 100 B, is stored in text
format for all the images in the database. This needs
around 1 MB of storage.
Our immediate challenge is to do this on a mobile phone
with a 600 MHz processor, using a maximum of 15 MB of RAM,
and to return results in close to a second. We bound our storage
requirements to 60 MB, which can be available in the
internal memory or on the SD card of the mobile phone.
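
The storage estimates listed above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: it assumes decimal units (1 MB = 10^6 bytes), reads "5K bytes" as 5,000 bytes, and takes the per-image sizes for the JPEG file and the SIFT data (about 150 KB each, or roughly 300 bytes per keypoint) as values implied by the 1.5 GB totals rather than figures stated in the text. All names in the code are hypothetical.

```python
# Back-of-the-envelope storage estimate for a 10K-image database.
# Assumption: decimal units, i.e. 1 MB = 10**6 bytes.
NUM_IMAGES = 10_000
MB = 10**6
STORAGE_BUDGET_MB = 60  # storage bound stated in the text


def total_mb(bytes_per_image, num_images=NUM_IMAGES):
    """Total storage in MB for one per-image item across the database."""
    return bytes_per_image * num_images / MB


# Per-image sizes in bytes. The first two are assumptions implied by the
# 1.5 GB totals in the text; the last two are the stated figures.
per_image_bytes = {
    "jpeg_image": 150_000,        # implied: 1.5 GB / 10K images
    "sift_keypoints": 500 * 300,  # 500 keypoints, ~300 B each (implied)
    "histogram": 5_000,           # "5K bytes" per image
    "annotation": 100,            # ~100 B of text per image
}

totals_mb = {name: total_mb(size) for name, size in per_image_bytes.items()}

# Only the compact items (histograms and annotations) come close to the bound.
compact_mb = totals_mb["histogram"] + totals_mb["annotation"]
fits_budget = compact_mb <= STORAGE_BUDGET_MB

if __name__ == "__main__":
    for name, size_mb in totals_mb.items():
        print(f"{name:>15}: {size_mb:8.1f} MB")
    print(f"histogram + annotation = {compact_mb:.1f} MB, "
          f"within {STORAGE_BUDGET_MB} MB budget: {fits_budget}")
```

Under these assumptions, the images and raw SIFT data each exceed the 60 MB bound by more than an order of magnitude, while the histograms and annotations together need about 51 MB. This is one illustrative reading of the numbers, not a statement of which items the system actually stores.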