In heterogeneous databases, images often come from different sources and belong to different topics, so a large description is needed to represent their content efficiently. However, the extracted features are not always suited to the image database under consideration. In this paper, we propose a new image recognition approach based on two innovations: adaptive feature selection and a Multi-Model Classification Method (MC-MM). The adaptive selection retains only the features best suited to the content of the image database in use. The MC-MM method performs image recognition by using the selected features hierarchically. Experimental results confirm the effectiveness and robustness of the proposed approach.
Extended Abstract
In this paper, we are interested in Content Based Image Recognition (CBIR) in heterogeneous databases. Unlike text-based approaches, CBIR systems provide access to images according to their visual characteristics. Feature extraction is the process of describing image content with information that can be derived from the image itself, such as color, texture and shape. Images in heterogeneous databases often belong to different topics, so a large description is generally required. In this work, average color vectors, histograms and correlograms are used as color features. First-order statistics, co-occurrence matrix coefficients and gradient norm vectors are used as texture features. The GIST descriptor is also employed as a feature covering both color and texture, and the invariant moments of Hu are used as shape features. The difficulty, however, is choosing the features that are relevant to the image database under consideration, since extracted features are not always suited to the content of the images. Consequently, relevant feature selection is strongly needed.
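To make two of the color features named above concrete, here is a minimal sketch of an average color vector and a quantized color histogram. It assumes an image is given as a list of 8-bit `(r, g, b)` tuples; the function names and the `bins_per_channel` parameter are illustrative choices, not taken from the paper.

```python
def average_color(pixels):
    """Mean (R, G, B) vector over a list of 8-bit RGB tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram with bins_per_channel**3 bins.

    The quantization level is an assumption made for this sketch.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    return [h / len(pixels) for h in hist]


# Toy image: two red pixels, one blue, one green.
toy_pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
toy_mean = average_color(toy_pixels)
toy_hist = color_histogram(toy_pixels, bins_per_channel=2)
```

Each descriptor yields a fixed-length vector, which is what lets a separate classifier model be trained per feature later on.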
Several feature selection techniques are available in the literature. Two main families are distinguished: wrapper and filter approaches. Because they rely only on theoretical considerations, filter methods are very fast but not always efficient. Wrapper methods, by contrast, use the classifier in the selection process, so they achieve high recognition rates but are slower, especially with a large number of features. Methods that combine the two techniques have recently been proposed. In this context, we propose a new adaptive feature selection method. We use a Support Vector Machine (SVM) classifier to evaluate the extracted features: an SVM is trained separately on each feature. We then apply the Fisher Linear Discriminant (FLD) to select the most relevant features based on these SVM evaluations. Specifically, we compute an FLD threshold that best separates relevant from irrelevant features according to the training performance obtained. The proposed method thus selects relevant features automatically, according to the content of the image database at hand. However, the selected features are not all equally relevant.
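The thresholding step can be illustrated with a small sketch. The abstract does not give the exact FLD formulation, so the code below assumes a one-dimensional Fisher criterion (squared distance between group means divided by the within-group scatter) evaluated at every possible cut of the sorted per-feature training rates. The function names and the example rates are hypothetical.

```python
from statistics import mean


def fisher_threshold(rates):
    """Return the cut on 1-D scores that maximizes a Fisher-style criterion.

    Assumed criterion: (mean_high - mean_low)**2 / within-group scatter.
    Returns None when no cut is possible (fewer than two distinct scores).
    """
    ordered = sorted(rates)
    best_j, best_t = -1.0, None
    for k in range(1, len(ordered)):
        low, high = ordered[:k], ordered[k:]
        if low[-1] == high[0]:
            continue  # identical scores cannot be separated by a threshold
        m_low, m_high = mean(low), mean(high)
        scatter = (sum((x - m_low) ** 2 for x in low)
                   + sum((x - m_high) ** 2 for x in high))
        j = (m_high - m_low) ** 2 / (scatter + 1e-12)  # epsilon avoids 0-division
        if j > best_j:
            best_j, best_t = j, (low[-1] + high[0]) / 2
    return best_t


def select_features(training_rates):
    """Keep features whose SVM training rate lies above the Fisher cut.

    Returns the kept names ordered from lowest to highest training rate.
    """
    threshold = fisher_threshold(list(training_rates.values()))
    if threshold is None:
        kept = list(training_rates)  # nothing to separate: keep everything
    else:
        kept = [f for f, r in training_rates.items() if r >= threshold]
    return sorted(kept, key=training_rates.get), threshold


# Made-up per-feature SVM training rates, for illustration only.
example_rates = {"hu_moments": 0.42, "avg_color": 0.45,
                 "histogram": 0.78, "correlogram": 0.81, "gist": 0.86}
selected, cut = select_features(example_rates)
```

With these made-up rates the cut falls between 0.45 and 0.78, so the two weakest features are discarded. The kept features are returned in ascending order of training rate, which is the order a hierarchical classifier starting from the least relevant model would consume them in.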
Because the least efficient of them have a negative effect, simply concatenating the selected features does not lead to optimal recognition results. We therefore propose to recognize images by means of a hierarchical classification technique that we call MC-MM (méthode de classification multi-modèle, i.e. multi-model classification method). It derives directly from the adaptive feature selection described above. The SVM classifier is used, and the hierarchical order is determined essentially by the training rate of each selected feature. Images are first classified according to the feature model with the lowest training performance among the selected ones. The classification is then progressively refined through successive hierarchical levels: at each level, images are classified according to the next model, until the most relevant one is reached at the last level. The classification obtained at each level is compared with that obtained at the previous level. When the two disagree, the Nearest Cluster Center (NCC) classifier is employed. The NCC classifier is a simple process in which the image is assigned, between the two candidate clusters, to the one whose center is closest in a given feature space.
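The hierarchical refinement described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each level is represented by a hypothetical `(feature_name, predict, centers)` triple, the per-level SVMs are replaced by simple stand-in functions, and the NCC arbitration is assumed to use the Euclidean distance in the feature space of the current level.

```python
import math


def nearest_cluster_center(vector, centers, candidates):
    """Assign vector to the candidate cluster with the closest center."""
    return min(candidates, key=lambda c: math.dist(vector, centers[c]))


def mcmm_classify(image_features, levels):
    """Refine a label through levels ordered from least to most relevant.

    levels: list of (feature_name, predict, centers), where predict maps a
    feature vector to a cluster label and centers maps labels to centers.
    """
    label = None
    for feature_name, predict, centers in levels:
        vector = image_features[feature_name]
        proposed = predict(vector)
        if label is None or proposed == label:
            label = proposed
        else:
            # Levels disagree: arbitrate between the two candidate clusters.
            label = nearest_cluster_center(vector, centers, (label, proposed))
    return label


# Stand-in predictors replace trained SVM models in this toy example.
toy_levels = [
    ("color", lambda v: "beach" if v[0] > 0.5 else "forest",
     {"beach": (0.8,), "forest": (0.2,)}),
    ("texture", lambda v: "forest" if v[0] > 0.5 else "beach",
     {"beach": (0.1,), "forest": (0.9,)}),
]
disputed = mcmm_classify({"color": (0.7,), "texture": (0.6,)}, toy_levels)
agreed = mcmm_classify({"color": (0.7,), "texture": (0.2,)}, toy_levels)
```

In the first call the two levels disagree, so the nearest center in the texture space decides; in the second call both levels agree and no arbitration takes place.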
The selection and classification results reported in this paper are obtained from experiments on different COREL database subsets. For each COREL subset, we use a "3/4 - 1/4" split between learning and testing. Thus, from the 100 images of each cluster in the COREL database, 75 are randomly sampled for learning and the remaining 25 are used for testing. To assess the individual relevance of the extracted features, we carried out SVM evaluations of the corresponding models. The evaluations on the different COREL subsets show that the training rate of a given model depends mainly on the image content. Accordingly, the features selected by the proposed adaptive selection always vary with the image subset processed. The proposed generalization procedure is also assessed by comparison with two other generalization methods. The first proceeds in the opposite direction, i.e. from the most relevant models to the least relevant ones. The second assigns each image to the cluster chosen by the majority of the selected models. For the different COREL subsets, the results show that the proposed method always provides better classification rates.
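The evaluation protocol can be sketched in a few lines. The code below shows a random 75/25 split of one cluster and the majority-vote baseline used for comparison. The function names, the fixed seed and the integer image identifiers are assumptions made for the illustration.

```python
import random
from collections import Counter


def split_cluster(images, train_fraction=0.75, seed=0):
    """Randomly split one cluster into learning and test sets."""
    rng = random.Random(seed)  # fixed seed only for reproducibility here
    shuffled = list(images)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def majority_vote(labels):
    """Baseline: the cluster predicted by most of the selected models."""
    return Counter(labels).most_common(1)[0][0]


# One cluster of 100 images, identified here by integers.
cluster = list(range(100))
learning, testing = split_cluster(cluster)
vote = majority_vote(["beach", "forest", "beach"])
```

Applying `split_cluster` to each cluster separately keeps the 75/25 proportion within every class rather than only on average over the whole subset.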
Furthermore, the proposed hierarchical way of combining features is assessed. For comparison, we ran the classical SVM-based classification method. In addition, we compare the recognition results of the proposed method with those of methods from the literature, such as k-means-SVM, DD-SVM, MILES, MI-SVM and the SIFT-based Bag of Features. This comparative analysis shows that the proposed method outperforms the other evaluated methods: it achieves an average classification accuracy of 83.7%, while the best performance among all the other methods discussed does not exceed 82.6%.
ABSTRACT
In heterogeneous databases, images often belong to different thematic classes and require a large description to allow their recognition. However, the features used are not always suited to the content of the image database under consideration. In this article, we propose a new approach based on two original contributions: adaptive feature selection and a multi-model classification method named MC-MM. Adaptive selection retains only the features best suited to the content of the image database in use. The MC-MM method recognizes images by using the selected features hierarchically. The experimental results obtained confirm the effectiveness and robustness of our approach.
KEYWORDS
feature extraction, adaptive relevant feature selection, multi-model classification, image recognition, heterogeneous image databases.