Mohammadreza Mostajabi
Scene classification using Bags of Visual words + SVM classifier


The bags of visual words model is derived from bags of words, a well-known algorithm in document classification. It plays the role of a dictionary: keypoints collected during the training phase make up the bag of words. Building it is as easy as performing vector quantization on the feature space, and the number of centroids is the number of words in the dictionary. When a new keypoint is extracted from an input image, it is assigned to the nearest word in the dictionary, so the output of this stage is a histogram that counts, for each dictionary word, how many keypoints of the input image were assigned to it.
Any type of classifier, such as an SVM or a naive Bayes classifier, can be trained on the histograms obtained from the previous stage.
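
The assignment step described above can be sketched in plain C++ without OpenCV. This is only an illustration under stated assumptions, not the code used in this project: the names Descriptor, nearestWord and bowHistogram are hypothetical, squared Euclidean distance is assumed as the similarity measure, and dividing the histogram by the number of keypoints is a choice made here so that images with different keypoint counts stay comparable.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical type for this sketch: one feature descriptor (e.g. a SURF vector).
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Index of the dictionary word (centroid) closest to the given descriptor.
std::size_t nearestWord(const Descriptor& descriptor,
                        const std::vector<Descriptor>& dictionary) {
    assert(!dictionary.empty());
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t w = 0; w < dictionary.size(); ++w) {
        const float dist = squaredDistance(descriptor, dictionary[w]);
        if (dist < bestDist) {
            bestDist = dist;
            best = w;
        }
    }
    return best;
}

// Histogram of visual words for one image, normalized by the keypoint count
// (the normalization is an assumption of this sketch).
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& dictionary) {
    std::vector<float> histogram(dictionary.size(), 0.0f);
    for (const Descriptor& d : descriptors) {
        histogram[nearestWord(d, dictionary)] += 1.0f;
    }
    if (!descriptors.empty()) {
        for (float& bin : histogram) {
            bin /= static_cast<float>(descriptors.size());
        }
    }
    return histogram;
}
```

With a two-word dictionary, an image whose four keypoints fall three times near the first word and once near the second would be described by a histogram with one bin per word, and that vector is what the classifier sees.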

The whole procedure is as follows:

Training phase

Vector quantization
Dictionary of visual codewords
I provide simple code to explain how to use OpenCV's BOW functions. Four classes of the Caltech-101 dataset are used. SURF is used as the feature detector and descriptor extractor in the feature extraction phase. In this implementation, a support vector machine with a radial basis function (RBF) kernel is used. After optimization, the C and G parameters for this kernel are as follows:
C = 312.5 
G = 0.50625
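
To make the role of G concrete, here is a minimal sketch of the RBF kernel, assuming G is the gamma of the standard form K(x, y) = exp(-gamma * ||x - y||^2). The function name rbfKernel is hypothetical and the snippet is not part of the project code. C is the soft-margin penalty of the SVM and does not appear in the kernel itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// RBF kernel value for two feature vectors (here: two BoW histograms).
// gamma corresponds to the G parameter above, under the assumption stated in the text.
double rbfKernel(const std::vector<double>& x,
                 const std::vector<double>& y,
                 double gamma) {
    assert(x.size() == y.size());
    double squaredNorm = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - y[i];
        squaredNorm += d * d;
    }
    return std::exp(-gamma * squaredNorm);
}
```

Identical histograms give a kernel value of 1, and the value decays toward 0 as the histograms move apart; a larger gamma makes that decay faster, so the decision boundary follows the training data more closely.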
I put comments in the code to make it self-explanatory. The code is composed of the following main parts:

void collectclasscentroids(); // function that extracts features from the training images
svm.train(trainingData,labels,cv::Mat(),cv::Mat(),params); // trains the SVM
svm.predict(bowDescriptor); // predicts the class of a new input image

The above method gives 85% accuracy when evaluated on 120 images from 4 different classes: Tiger, Airplane, Bike and side-view car.
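
The 85% figure is a plain classification accuracy: the fraction of test images whose predicted class equals the true class. A minimal sketch of that computation follows, assuming class labels are stored as integers; the function name accuracy is hypothetical and the snippet is not taken from the project code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of predictions that match the ground-truth labels.
// Returns 0.0 for an empty test set.
double accuracy(const std::vector<int>& predicted,
                const std::vector<int>& actual) {
    assert(predicted.size() == actual.size());
    if (predicted.empty()) {
        return 0.0;
    }
    std::size_t correct = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        if (predicted[i] == actual[i]) {
            ++correct;
        }
    }
    return static_cast<double>(correct) / static_cast<double>(predicted.size());
}
```

If the reported figure is exact, 85% of 120 images would correspond to 102 correct predictions.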
Copyright 2011 Mohammadreza Mostajabi

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In ECCV Workshop on Statistical Learning in Computer Vision, 2004.