Due to the variations among birds, bird breed classification remains a challenging task. In this paper, we
propose a saliency-based graphical model (GMS), which can precisely annotate the object at the pixel
level. In the proposed method, we first over-segment the image into several regions. Then, GMS extracts
the object and classifies the image based on the local context, global context, and saliency of each region.
To achieve high classification precision, we use SVM to classify the image.

1. Introduction
With the widespread use of cameras, the number of images on the internet
is growing quickly. Because manual work is time-consuming,
automatic image classification and annotation are becoming increasingly
important and necessary to support scene understanding and
image retrieval. A variety of image classification methods have
been developed [1–3]. In general, traditional image classification
methods can be divided into three steps. The first step is to
extract features [4–6] from the images. Then, a bag-of-words
(BOW) [7] representation of the image is built using a clustering
algorithm. Finally, the category of the image is obtained by using
a classifier such as LDA [8] or SVM [9]. In recent years, fine-grained
image classification has attracted considerable attention, and it poses a
challenging task for traditional methods.
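As an illustration of the three-step pipeline described above, the following is a minimal Python sketch of the bag-of-words step only. It is not the implementation used in the cited works: the toy 2-D descriptors, the deterministic centroid initialisation, and the function names (nearest, kmeans, bow_histogram) are all assumptions made for this example. Feature extraction and the final classifier are left out.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the k centroids play the role of visual words."""
    step = len(points) // k
    # Assumption: simple deterministic initialisation (evenly spaced samples).
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


def bow_histogram(descriptors, codebook):
    """L1-normalised histogram of visual-word assignments for one image."""
    counts = Counter(nearest(d, codebook) for d in descriptors)
    return [counts[i] / len(descriptors) for i in range(len(codebook))]


# Step 1 (assumed done elsewhere): local descriptors per image; toy 2-D values.
image_a = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0)]
image_b = [(1.0, 0.9), (1.0, 1.0), (0.1, 0.1)]

# Step 2: cluster all descriptors into a codebook, then encode each image.
codebook = kmeans(image_a + image_b, k=2)
hist_a = bow_histogram(image_a, codebook)
hist_b = bow_histogram(image_b, codebook)
# Step 3 (not shown): feed the histograms to a classifier such as an SVM.
```

In this sketch each image is reduced to a fixed-length histogram regardless of how many descriptors it contains, which is what allows a standard classifier to be applied in the third step.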
In fine-grained images, the objects of different categories are similar
to each other. To classify a fine-grained image, we need the
details of the objects, which are easily confused with the noise
induced by the background. As shown in the first row of Fig. 1, the
background induces noise for classification. To solve this
problem, we argue and demonstrate that using features extracted
from the object will enhance the performance of fine-grained image
classification. In the second row of Fig. 1, we show the annotated