Pyramid Bag of Words with SVM Classification
Resource Overview
This implementation represents images with Dense SIFT features encoded by a Bag of Words (BoW) model, then trains an SVM classifier on the BoW histograms for categorization. Two kernels are used: the RBF kernel and a custom histogram intersection kernel. The experiments use action images in 6 categories with 60 images per class, split into 40 training and 20 testing samples per class. The code covers feature extraction, vocabulary construction, and the kernel functions.
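To make the BoW encoding step concrete, here is a minimal sketch in plain Python. It is not the packaged code: the function names `squared_distance` and `bow_histogram` are hypothetical, the 2-D "descriptors" and 3-word vocabulary are toy values (real SIFT descriptors are 128-dimensional), and L1 normalization is an assumption, since the description does not say which normalization is used.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word and return
    an L1-normalized histogram of word counts (assumed normalization)."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: squared_distance(d, vocabulary[k]),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors: return an all-zero histogram.
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Toy data: 3 visual words (e.g. K-means centroids) and 4 local descriptors.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [3.9, 4.2]]

hist = bow_histogram(descriptors, vocabulary)
print(hist)  # [0.25, 0.5, 0.25]
```

In a full pipeline the vocabulary would come from K-means clustering over descriptors sampled from the training images, and one such histogram would be computed per image.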
Detailed Documentation
Feature extraction uses Dense SIFT descriptors, which are encoded with the Bag of Words (BoW) model to represent each image. A Support Vector Machine (SVM) classifier is then trained on the BoW representations for image categorization. In addition to the standard RBF kernel, a custom histogram intersection kernel is implemented with the aim of improving classification performance.

The experimental dataset consists of action images in 6 categories. Each category contains 60 images, split into 40 samples for training and 20 for testing.

Key implementation steps are: 1) Dense SIFT feature extraction using a sliding window, 2) K-means clustering to generate the visual vocabulary, 3) histogram normalization for the BoW representation, and 4) SVM training with either of the two kernels.

The histogram intersection kernel measures the similarity of two histograms as the sum of their bin-wise minima: K(h1,h2) = Σ_i min(h1(i), h2(i)), where h1 and h2 are normalized histogram vectors and i runs over the histogram bins.
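The kernel formula above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the packaged implementation: the names `histogram_intersection` and `gram_matrix` are hypothetical, and the histograms are toy values that each sum to 1.

```python
def histogram_intersection(h1, h2):
    """K(h1, h2) = sum over bins i of min(h1[i], h2[i])."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(rows, cols):
    """Kernel matrix whose entry [i][j] is K(rows[i], cols[j])."""
    return [[histogram_intersection(r, c) for c in cols] for r in rows]


# Toy normalized BoW histograms (illustrative values only).
train = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
test = [[0.25, 0.5, 0.25]]

K_train = gram_matrix(train, train)  # shape: n_train x n_train
K_test = gram_matrix(test, train)    # shape: n_test x n_train

print(K_train)  # [[1.0, 0.5], [0.5, 1.0]]
print(K_test)   # [[0.75, 0.75]]
```

For L1-normalized histograms the kernel value lies in [0, 1] and equals 1 when the two histograms are identical. Matrices of this form can be handed to an SVM library that accepts precomputed kernels; scikit-learn's `SVC(kernel="precomputed")` is one such option, named here as an assumption rather than as the tool used in this resource.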