OPEN ACCESS
Ensemble methods have been successfully used as a classification scheme. The reduction of the complexity of this popular learning paradigm motivated the appearance of ensemble pruning algorithms. This paper presents a new efficient ensemble pruning method which not only highly reduces the complexity of ensemble methods but also performs better than the non-pruned version in terms of classification accuracy. This algorithm consists in ordering all the base classifiers with respect to their entropy which exploits a new version of the margin of ensemble methods. Confrontation with both the naive approach of randomly pruning base classifiers and another ordered-based pruning algorithm turned out convincing in an extensive empirical analysis.
RÉSUMÉ
Les méthodes d’ensemble ont été utilisées avec succès comme schéma de classification. Les algorithmes d’élagage d’ensembles de classifieurs sont apparus afin de réduire la complexité de ce paradigme populaire d’apprentissage. Cet article présente une nouvelle méthode efficace d’élagage d’ensembles qui, non seulement réduit de manière significative la complexité des méthodes d’ensemble, mais permet aussi une meilleure précision de classification que la version sans élagage. Cet algorithme consiste à ordonner tous les classifieurs de base par rapport à leur entropie qui exploite une nouvelle version de la marge des méthodes d’ensemble. La confrontation de cette méthode avec l’approche naïve d’élagage aléatoire des classifieurs de base et avec un autre algorithme d’élagage par ordonnancement a permis de montrer sa supériorité à travers une analyse empirique conséquente.
multiple classification, ensemble method, ensemble pruning, margin, decision tree
MOTS-CLÉS
classification multiple, méthode d’ensemble, élagage d’ensemble, marge, arbre de décision
Asuncion A., Newman D. (2007), UCI Machine Learning Repository.
Bakker B., Heskes T. (2003), Clustering ensembles of neural network models, Neural Networks, vol. 16, n°2, p. 261-269.
Banfield R. E., Hall L. O., Bowyer K. W., Kegelmeyer W. P. (2005), Ensemble diversity measures and their application to thinning, Information Fusion, vol. 6, n°1, p. 49-62,.
Bartlett P. J., Schölkopff B., Schuurmans D., Smola, A. J. (eds) (2000), Advances in Large Margin Classifiers, Neural Information Processing, 1 edn, The MIT Press.
Bauer E., Kohavi R. (1999), An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants, Machine Learning, vol. 36, n°1, p. 105-139.
Breiman L. (1996), Bagging predictors, Machine Learning, vol. 24, n°2, p. 123-140.
Breiman L., Friedman J., Olshen R., Stone C. (1984), Classification and Regression Trees, Publisher : Wadsworth.
Brown G., Wyatt J., Harris R., Yao X. (2005), Diversity Creation Methods: A Survey and Categorisation, Journal of Information Fusion, vol. 6, n°1, p. 5-20.
Chehata N., Guo L., Mallet C. (2009), Contribution of Airborne Full-Waveform Lidar and Image Data for Urban Scene Classification, ICIP'2009, 16th IEEE International Conference on Image Processing, p. 1667-1672.
Cohen J. (1960), A Coefficient of Agreement for Nominal Scales, Educational and Psychological Measurement, vol. 20, n°1, p. 37-46
Cutler D. R., Edwards T. C., Beard K. H., Cutler A., Hess K. T., Gibson J., Lawler J. J. (2007), Random forests for classification in ecology, Ecology, vol. 88, n°11, p. 2783-2792.
Dietterich T. G. (1995), Overfitting and under-computing in machine learning, Computing Surveys, vol. 27, n°3, p. 326-327.
Dietterich T. G. (1997), Machine-Learning Research: Four Current Directions, AI Magazine, vol. 18, p. 97-136.
Dietterich T. G. (2000), Ensemble Methods in Machine Learning, 1st International Workshop on Multiple Classifier Systems, p. 1-15.
Efron B., Tibshirani R. (1994), An Introduction to the Bootstrap, Chapman & Hall/CRC.
Freund Y., Schapire R. E. (1997), A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences, vol. 55, n°1, p. 119-139.
Garey M., Johnson D. (1979), Computers and Intractability : A Guide to the Theory of NPCompleteness, Publisher : W. H. Freeman.
Giacinto G., Roli F., Fumera G. (2000), Design of Effective Multiple Classifier Systems by Clustering of Classifiers, ICPR'2000, 15th International Conference on Pattern Recognition, p. 160-163.
Gislason P., Benediktsson J., Sveinsson J. (2006), Random Forests for land cover classification, Pattern Recognition Letters, vol. 27, n°4, p. 294-300.
Guo L., Boukir S., Chehata N. (2010a), Support vectors selection for supervised learning using an ensemble approach, ICPR'2010, 20th IAPR International Conference on Pattern Recognition, p. 37-40.
Guo L., Chehata N., Boukir S. (2010b), A two-pass random forests classification of airborne lidar and image data on urban scenes, ICIP'2010, 17th IEEE International Conference on Image Processing, p. 1369-1372.
Ho T. K. (1998), The Random Subspace Method for Constructing Decision Forests, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, n°8, p. 832-844.
Kohavi R., Wolpert D. H. (1996), Bias Plus Variance Decomposition for Zero-One Loss Functions, 13th International Conference of Machine Learning, ICML'96, p. 275-283.
Kuncheva L. (2004), Combining Pattern Classifiers : Methods and Algorithms, WileyInterscience. ISBN : 0471210781.
Kuncheva L. I. (2002), Switching Between Selection and Fusion in Combining Classifiers: An Experiment, IEEE Transactions on Systems, Man, and Cybernetics, Part B : Cybernetics, vol. 32, n°2, p. 146-156.
Kuncheva L. I., Whitaker C. J. (2003), Measures of diversity in classifier ensembles and Their Relationship with the Ensemble Accuracy, Machine Learning, vol. 51, p. 181-207.
Lazarevic A., Obradovic Z. (2001), Effective pruning of neural network classifiers, IEEE/INNS International Conference on Neural Networks, p. 796-801.
Mangiameli P., West D., Rampal R. (2004), Model selection for medical diagnosis decision support systems, Decision Support Systems, vol. 36, n°3, p. 247-259.
Margineantu D. D., Dietterich T. G. (1997), Pruning Adaptive Boosting, ICML'1997, 14th International Conference on Machine Learning, p. 211-218.
Martínez-Muñoz G., Hernández-Lobato D., Suárez A. (2009), An Analysis of Ensemble Pruning Techniques Based on Ordered Aggregation, IEEE Transactions in Pattern Analysis and Machine Intelligence, vol. 31, n°2, p. 245-259.
Martínez-Muñoz G., Suárez A. (2006), Pruning in ordered bagging ensembles, ICML'2006, 23rd International Conference on Machine Learning, p. 609-616.
Pang H., Zhao H. (2008), Building pathway clusters from Random Forests classification using class votes, BMC Bioinformatics.
Partridge D., Yates W. B. (1996), Engineering multiversion neural-net systems, Neural Computation, vol. 8, n°4, p. 869-893.
R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. 2009, ISBN 3-900051-07-0.
Schapire R. E., Freund Y., Bartlett P., Lee W. S. (1998), Boosting the margin: a new explanation for the effectiveness of voting methods, The Annals of Statistics, vol. 26, n°5, p. 1651-1686.
Shannon C. (1948), A mathematical theory of communication, Bell System Technical Journal, vol. 27, n°3, p. 379-423.
Tao D., Tang X., Li X., Wu X. (2006), Asymmetric Bagging and Random Subspace for Support Vector Machines-Based Relevance Feedback in Image Retrieval, IEEE transactions on pattern analysis and machine intelligence, vol. 28, n°7, p. 1088-1099.
Tsoumakas G., Partalas I., Vlahavas I. (2009), Applications of Supervised and Unsupervised Ensemble Methods, vol. 245, Springer, chapter An Ensemble Pruning Primer, p. 1-13.
Yang Y., Korb K., Ting K. M., Webb G. I. (2005), Ensemble Selection for SuperParent-OneDependence Estimators, AI 2005: Advances in Artificial Intelligence, p. 102-112.
Zhang Y., Burer S., Street W. N. (2006), Ensemble Pruning Via Semi-definite Programming, Journal of Machine Learning Research, vol. 7, p. 1315-1338.
Zhou Z.-H., Tang W. (2003), Selective ensemble of decision trees, 9th International Conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, p. 476-483.
Zhou Z.-H., Wu J., Tang W. (2002), Ensembling neural networks: Many could be better than all, Artificial Intelligence, vol. 137, n°1-2, p. 239-263.