SURF stands for Speeded Up Robust Features. It is an algorithm that extracts distinctive keypoints and descriptors from an image. More details on the algorithm can be found here and a note on its implementation in OpenCV can be found here. A set of SURF keypoints and descriptors can be extracted from an image and then used later to detect the same image. SURF uses an intermediate image representation called the integral image, which is computed from the input image and is used to speed up calculations over any rectangular area. Each entry of the integral image at (x, y) is the sum of all pixel values in the rectangle between the origin and (x, y). The sum over any rectangle can then be read off from just four entries, so the computation time does not depend on the size of the rectangle, which is particularly useful when large regions or large images are involved. The SURF detector is based on the determinant of the Hessian matrix. The SURF descriptor describes how pixel intensities are distributed within a scale-dependent neighborhood of each interest point detected by the Fast Hessian detector.
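To make the integral image idea concrete, here is a minimal sketch in plain C++ with no OpenCV dependency. The `IntegralImage` struct and its `boxSum` method are illustrative names made up for this post, not part of OpenCV or of the SURF implementation. The table is padded with one extra row and column of zeros, a common convention that avoids special cases at the image border. OpenCV offers `cv::integral` for the same purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not an OpenCV type).
// The table has one extra row and column of zeros, so at(r, c) holds the
// sum of all pixels in rows [0, r) and columns [0, c).
struct IntegralImage {
    int rows;
    int cols;
    std::vector<std::int64_t> sum;  // (rows + 1) x (cols + 1), row-major

    explicit IntegralImage(const std::vector<std::vector<int>>& img)
        : rows(static_cast<int>(img.size())),
          cols(img.empty() ? 0 : static_cast<int>(img[0].size())),
          sum(static_cast<std::size_t>(rows + 1) * (cols + 1), 0) {
        // One pass over the image: each entry reuses its upper and left
        // neighbours and subtracts the doubly counted upper-left one.
        for (int r = 0; r < rows; ++r) {
            for (int c = 0; c < cols; ++c) {
                at(r + 1, c + 1) =
                    img[r][c] + at(r, c + 1) + at(r + 1, c) - at(r, c);
            }
        }
    }

    std::int64_t& at(int r, int c) {
        return sum[static_cast<std::size_t>(r) * (cols + 1) + c];
    }
    std::int64_t at(int r, int c) const {
        return sum[static_cast<std::size_t>(r) * (cols + 1) + c];
    }

    // Sum of the pixels in rows [r0, r1) and columns [c0, c1).
    // Always four lookups, whatever the size of the rectangle.
    std::int64_t boxSum(int r0, int c0, int r1, int c1) const {
        assert(0 <= r0 && r0 <= r1 && r1 <= rows);
        assert(0 <= c0 && c0 <= c1 && c1 <= cols);
        return at(r1, c1) - at(r0, c1) - at(r1, c0) + at(r0, c0);
    }
};
```

For a 3x3 image holding the values 1 to 9 row by row, `boxSum(0, 0, 3, 3)` returns 45 (the whole image) and `boxSum(1, 1, 3, 3)` returns 28 (5 + 6 + 8 + 9). Both calls cost the same four lookups, which is why box filters of any size cost the same to evaluate.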
Object detection using SURF is scale and rotation invariant, which makes it very powerful. It also doesn’t require long and tedious training, as cascaded Haar classifier based detection does. Detection with SURF takes a little longer than with Haar, but in most situations it is not a problem if the robot spends some tens of milliseconds more on detection. Since this method is rotation invariant, it is possible to detect objects in any orientation. This is particularly useful for mobile robots, which may have to recognize objects at orientations different from the trained image. For example, the robot may have been trained with an upright image of an object and then have to detect the same object after it has fallen over. Detection using Haar features fails miserably in this case. OK, let’s now move from theory to practice, the way things actually work.
The OpenCV library provides a detection example called find_obj.cpp. It can be found in the OpenCV-x.x.x/samples/c/ folder of the source tar file, where x.x.x stands for the version number. It loads two images, finds their SURF keypoints and descriptors, compares them and finds a match if there is one. But this sample code can be a bit tough for beginners, so let us move slowly, step by step. As the first step, we can find the SURF keypoints and descriptors in a frame captured from the webcam. The code for this is given below: