Water Quality Modeling Using Artificial Intelligence-based Tools

C. Couto
H. Vicente
J. Machado
A. Abelha
J. Neves

Department of Chemistry, University of Évora, Portugal

Department of Chemistry and Chemistry Centre of Évora, University of Évora, Portugal

Department of Informatics, University of Minho, Braga, Portugal

Corresponding Author Email: 
horbite@gmail.com; hvicente@uevora.pt; jmac, abelha, jneves@di.uminho.pt
5 September 2012

Water, like any other natural resource of the biosphere, is scarce, and its judicious use includes safeguarding its quality. Indeed, there is wide concern that an inefficient water management system may become one of the major obstacles to human-centered sustainable development. The assessment of reservoir water quality is constrained by geographic considerations, the number of parameters to be considered, and the large financial resources needed to obtain such data. Under these circumstances, modeling water quality in reservoirs is essential for solving environmental problems and has lately established itself as a relevant tool for the sustainable and harmonious progress of populations. The analysis and development of forecast models based on Artificial Intelligence tools and new problem-solving methodologies has proven to be an alternative, enabling a proactive approach that may contribute decisively to diagnosing, preserving, and rehabilitating reservoirs. In particular, this work describes the training, validation, and application of Artificial Neural Networks (ANNs) and Decision Trees (DTs) to forecast the water quality of the Odivelas reservoir, in Portugal, over a period of 10 years. The input variables of the ANN model are chemical oxygen demand (COD), dissolved oxygen (DO), oxidability, and total suspended solids (TSS); the DT model uses these inputs plus water conductivity and temperature. The performance of the two models, evaluated in terms of the coincidence matrix created by matching predicted and actual values, is very similar: the percentage of matches (correct predictions) relative to the number of presented cases is 98.8% for the training set and 97.4% for the testing set.
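The evaluation measure mentioned in the abstract can be illustrated with a minimal sketch. It builds a coincidence matrix from actual and predicted classes and computes the percentage of matching cases. The class labels ("good", "fair", "poor"), the sample data, and the function names are hypothetical and chosen only for illustration; the abstract does not state the water quality classes or the software used.

```python
from collections import Counter


def coincidence_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def percent_correct(matrix):
    """Percentage of cases on the main diagonal (predicted == actual)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total if total else 0.0


# Hypothetical class labels and data, not taken from the paper.
labels = ["good", "fair", "poor"]
actual = ["good", "good", "fair", "poor", "fair", "poor", "good", "fair"]
predicted = ["good", "fair", "fair", "poor", "fair", "poor", "good", "fair"]

matrix = coincidence_matrix(actual, predicted, labels)
accuracy = percent_correct(matrix)

for label, row in zip(labels, matrix):
    print(label, row)
print(f"Correct: {accuracy:.1f}%")
```

Computing this figure separately for the training and testing sets, as the abstract reports, shows how well a model generalizes to cases it was not trained on.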


Artificial Neural Networks, data mining, Decision Trees, water quality

