Cluster Analysis of Fatal Accidents Series in the Infor.MO Database: Analysis, Evidence and Research Perspectives

M. Lombardi, G. Rossi

Safety Engineer, University of Roma La Sapienza, Rome, Italy

31 December 2013



Current applications of cluster analysis techniques do not include work accidents. The more established applications in statistical data analysis include pattern recognition, image analysis and information retrieval.

The aim of this study is to provide a quantitative assessment, based on statistical processing of historical data, that highlights the causal link between accidents and recurring predictive events. On the basis of the information provided by the analysis, it is possible to propose preventive strategies targeted at reducing the number of accidents, mainly fatal accidents.

Starting from the collection of fatal accidents in the Infor.MO database (INAIL), we aggregated the registered accident cases and subjected them to cluster analysis. With reference to the generators of the areas of mortal danger flow, this analysis could reveal typical accidents, namely preferential geneses that, by sharing the causes of the same mortal energy flow, could explain a large number of events.

Running the analysis requires a methodological assumption: the accident phenomenon, like any algebraic entity, is described as a case represented in an algebraic space.

The n dimensions used to describe the phenomenon are the n generators of the danger areas. On this premise, each accident can be represented by a Boolean n-tuple of coordinates in the space R^n. This makes it possible to transform the descriptions of accidents into an algebraic and statistical case study to which the statistical cluster analysis protocol can be applied.
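The Boolean representation can be illustrated with a minimal Python sketch. The generator names below are hypothetical placeholders, not categories taken from the Infor.MO database, and the Hamming distance is only one plausible way to compare two accidents; the study does not state which distance it uses.

```python
# Hypothetical danger-area generators; the names are illustrative only
# and are NOT taken from the Infor.MO database.
GENERATORS = (
    "missing_guardrail",
    "unstable_surface",
    "no_harness",
    "improper_procedure",
)


def encode(active):
    """Return the Boolean n-tuple (as 0/1 values) describing one accident.

    `active` is the set of generators recorded for that accident.
    """
    unknown = set(active) - set(GENERATORS)
    if unknown:
        raise ValueError(f"unknown generators: {sorted(unknown)}")
    return tuple(int(g in active) for g in GENERATORS)


def hamming(a, b):
    """Count the coordinates in which two accident vectors differ."""
    return sum(x != y for x, y in zip(a, b))


accident_a = encode({"missing_guardrail", "no_harness"})
accident_b = encode({"missing_guardrail", "unstable_surface", "no_harness"})
distance = hamming(accident_a, accident_b)
```

Under these assumptions each accident becomes a point with coordinates in {0, 1}, and two accidents that share most of their generators lie close together.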

Applying this method to the fatal accidents in the Infor.MO database related to the ATECO Construction Sector (F), with particular reference to 'falls from heights', has shown successful clustering.
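As a rough illustration of how Boolean accident vectors might be grouped, the sketch below links any two accidents whose Hamming distance is at most a chosen threshold and reports the resulting connected groups (a simple single-linkage scheme). This is an assumption made for illustration, not the clustering protocol used in the study, and the five sample vectors are invented.

```python
from itertools import combinations


def hamming(a, b):
    """Count the coordinates in which two Boolean vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def single_linkage(points, max_dist):
    """Group point indices into connected components, where two points are
    linked when their Hamming distance is at most `max_dist`."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if hamming(points[i], points[j]) <= max_dist:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Invented sample data: five accidents described by four generators.
accidents = [
    (1, 0, 1, 0),
    (1, 1, 1, 0),
    (1, 0, 1, 0),
    (0, 0, 0, 1),
    (0, 1, 0, 1),
]
clusters = single_linkage(accidents, max_dist=1)
```

With this sample, the first three accidents form one group and the last two another; each group would correspond to a candidate "typical accident" sharing nearly the same generators.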

The results of the analysis are intended to verify the effectiveness of prevention/protection measures, with a view to maximum efficiency.


Accidents at work, Boolean analysis, cluster analysis, falls from heights, fatal accidents database, slipping

