A classifier ensemble for classification of dynamic data. Application to an indoor air quality problem

Philippe Thomas, William Derigent, Marie-Christine Suhner

Université de Lorraine, CRAN, UMR 7039, Campus Sciences, BP 70239, 54506 Vandoeuvre-lès-Nancy cedex, France; CNRS, CRAN, UMR 7039, France

Corresponding Author Email: prénom.nom@univ-lorraine.fr
Pages: 375-391
DOI: https://doi.org/10.3166/JESA.49.375-391
Abstract: 

Indoor air quality has a significant impact on people's exposure to pollutants. The Airbox Lab company is currently designing a connected object, called Footbot, which measures several parameters related to indoor air quality every minute: temperature, humidity, VOC concentration, CO2, formaldehyde and particulate matter (PM). Moreover, Footbot ought to include data analysis features able to identify different domestic situations (presence, cooking, housework and so on) from the gathered data. The final purpose is to help users avoid situations that degrade air quality. In this paper, two different tools (neural networks and decision trees) are tested and compared on this dynamic data classification problem. To improve classifier performance, classifier ensembles are also studied.
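
Although the paper's implementation is not reproduced on this page, the sketch below illustrates, under stated assumptions, the kind of comparison the abstract describes: a single decision tree, a small neural network and a bagged decision-tree ensemble classifying windowed sensor readings into domestic situations. It assumes scikit-learn and purely synthetic data; the feature layout, labels and hyper-parameters are placeholders, not the authors' method.

# Illustrative sketch only (not the authors' implementation): comparing a
# decision tree, a small neural network and a bagged tree ensemble on
# hypothetical windowed sensor features. All data, labels, window sizes
# and hyper-parameters are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import BaggingClassifier

rng = np.random.default_rng(0)

# Hypothetical features: 6 sensor channels (temperature, humidity, VOC,
# CO2, formaldehyde, PM) over a 5-minute window -> 30 features per sample.
X = rng.normal(size=(1000, 30))
# Hypothetical labels for 4 domestic situations:
# 0 = absence, 1 = presence, 2 = cooking, 3 = housework.
y = rng.integers(0, 4, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifiers = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "neural network (MLP)": MLPClassifier(hidden_layer_sizes=(20,),
                                          max_iter=500, random_state=0),
    # Bagging: each tree is fitted on a bootstrap resample of the training
    # set and the ensemble predicts by majority vote (the default base
    # estimator of BaggingClassifier is a decision tree).
    "bagged tree ensemble": BaggingClassifier(n_estimators=25,
                                              random_state=0),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.3f}")

On real Footbot data, each feature vector would be built from consecutive one-minute measurements rather than random numbers, and the labels would come from annotated domestic situations.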

Keywords: 

indoor air quality, neural networks, decision trees, classifier ensemble.

1. Introduction
2. A brief review of classification problems
3. Tools for classification
4. Industrial application
5. Conclusion
Acknowledgements

The authors thank the AIRBOX LAB company for its support of their work.
