Empirical Analysis on Multiple Regression Modeling Method of Compositional Data

Zhihui Zhang, Zhepeng Zheng, Chenxia Suo, Yong Yang

Beijing Institute of Petrochemical Technology, Beijing 102617, China

Postdoctoral Programme, Bank of Zhengzhou, Zhengzhou 450018, China

Corresponding Author Email: 
10 November 2017
20 March 2018
31 December 2018

This paper combines the logratio transformation of compositional data with partial least squares (PLS) path analysis and proposes a method for building a multiple linear regression model in which the dependent variable is compositional data and the several relevant independent variables are also compositional data. The method satisfies the fixed-sum constraint of compositional data, overcomes the adverse effect that complete multicollinearity within compositional data has on modeling, and highlights the role and significance of the thematic meaning of each composition in the model. As an application case, the paper uses the proposed method to establish a regression model relating the employment demand of Beijing's three industries to investment and GDP, based on structural data on Beijing tertiary industry investment (including real estate), GDP and employment.
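As a rough illustration of the first ingredient named above, the sketch below applies a logratio transformation to a composition and maps it back. It assumes the centered logratio (clr) form described by Aitchison; the abstract does not say which logratio variant the paper uses, and the function names and the three shares are hypothetical, not data from the paper.

```python
import math


def clr(composition):
    """Centered logratio transform of a composition with strictly positive parts."""
    if any(x <= 0 for x in composition):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals taking the log of each part
    # divided by the geometric mean of all parts.
    return [v - mean_log for v in logs]


def clr_inverse(coords):
    """Map clr coordinates back to a composition whose parts sum to 1."""
    exps = [math.exp(v) for v in coords]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical shares of three industries (illustrative only).
shares = [0.05, 0.25, 0.70]
coords = clr(shares)
recovered = clr_inverse(coords)
```

Under this assumed variant, the transformed coordinates are unconstrained in sign but always sum to zero, so they remain exactly collinear. That is one way to see why a collinearity-tolerant estimator such as PLS is paired with the transformation, and why any prediction mapped back through the inverse automatically satisfies the fixed-sum constraint.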


compositional data, multiple regression data, partial least squares path analysis

1. Introduction
2. Compositional Data and Logratio Transformation
3. PLS Path Model and the Regression Model of Compositional Data
4. Case Analysis
5. Summary

[1] Aitchison J. (1986). The statistical analysis of compositional data. London: Chapman and Hall.

[2] Chin WW. (1998). The partial least squares approach for structural equation modeling. In: Marcoulides GA. (Ed.), Modern Methods for Business Research. Lawrence Erlbaum Associates, 295-336.

[3] Guinot C, Latreille J, Tenenhaus M. (2001). PLS path modelling and multiple table analysis. Application to the cosmetic habits of women in Ile-de-France. Chemometrics and Intelligent Laboratory Systems 58: 247-259.

[4] Lohmöller JB. (1989). Latent variables path modeling with partial least squares. Physica-Verlag, Heidelberg 34(1): 110-111.

[5] Bayol MP, Foye ADL, Tellier C, Tenenhaus M. (2000). Use of PLS path modeling to estimate the European consumer satisfaction index (ECSI) model. Statistica Applicata – Italian Journal of Applied Statistics 12(3): 361-375.

[6] Wang HW, Huang W. (2013). Linear regression model of compositional data. System Engineering (2).