same category. Existing deep learning models for image
similarity also focus on learning category-level image similarity [22]. Category-level image similarity mainly corresponds to semantic similarity. [6] studies the relationship
between visual similarity and semantic similarity. It shows
that although visual and semantic similarities are generally consistent with each other across different categories,
there still exists considerable visual variability within a category, especially when the category’s semantic scope is
large. Thus, it is worthwhile to learn a fine-grained model
that is capable of characterizing the fine-grained visual similarity for the images within the same category.
The following works are close to our work in the spirit
of learning fine-grained image similarity. Relative attribute [19] learns image attribute ranking among the images with the same attributes. OASIS [3] and local distance learning [10] learn fine-grained image similarity ranking models on top of the hand-crafted features. These above
works are not deep learning based. [25] employs deep learning architecture to learn ranking model, but it learns deep
network from the “hand-crafted features” rather than directly from the pixels. In this paper, we propose a Deep
Ranking model, which integrates the deep learning techniques and fine-grained ranking model to learn fine-grained
image similarity ranking model directly from images. The
Deep Ranking models perform much better than categorylevel image similarity models in image retrieval applications.
Pairwise ranking model is a widely used learning-to-rank
formulation. It is used to learn image ranking models in
[3, 19, 10]. Generating good triplet samples is a crucial
aspect of learning pairwise ranking model. In [3] and [19],
the triplet sampling algorithms assume that we can load the
whole dataset into memory, which is impractical for a large
dataset. We design a computationally efficient online triplet
sampling algorithm that does not require loading the whole
dataset into memory, which makes it possible to learn deep
ranking models with very large amount of training data