Regression analysis of expanded polystyrene properties

Our own measurements examine the dependence of the tensile strength of expanded polystyrene (EPS) on its bulk density. Thirty samples were used to calculate the correlation coefficients between these two properties. In addition to the standard Pearson coefficient, we also calculate the Spearman and Kendall rank correlation coefficients. By testing hypotheses, we verify the correlation in the entire population. After finding a relatively close correlation (0.6-0.8), we apply different regression models, mainly polynomial but also exponential. We evaluate the parameters of the models, their point estimates and confidence intervals. Based on the characteristics of each of the seven regressions, we found the exponential form of the dependence to be the best, ahead of the linear polynomial. A more complex mathematical model is not always a more accurate approximation. A simple model, on the other hand, is easy to use and can also reflect the examined dependence more closely.


INTRODUCTION
Expanded polystyrene (EPS) is a proven heat-insulating and sound-insulating material. It is lightweight, with a low bulk density that makes it easy to process [4]. Its strength properties can sometimes be exploited as well, and it is therefore appropriate to measure its tensile strength [2]. The interdependence between the density and strength of EPS is presented by many distributors [8, 11]. To determine the intensity of this relationship, it is advantageous to apply correlation analysis [1]. A mathematical description of such a dependence can be provided by regression analysis, which offers a number of characteristics for evaluating every regression model [14]. It is then important to evaluate these values correctly, so that we need not rely on subjective judgment but can classify the suitability or unsuitability of the chosen mathematical formula by comparing specific numbers.

Expanded polystyrene (EPS) - properties
Expanded polystyrene (EPS) is an organic material that belongs to the group of foam plastics. It is made from the chemical substance styrene, which is expanded (foamed) with water vapor and a blowing agent (pentane). Industrial styrene and pentane are extracted from petroleum, although they also occur commonly in nature. EPS is one of the most widely used plastics, right after PVC, polyethylene and polypropylene. The most highly valued properties of polystyrene are its excellent thermal insulation, its easy workability due to its low bulk density, and its affordability.
EPS foam consists of about 2% polystyrene and 98% air. It is in effect engineered, wrapped air, which explains its excellent insulating ability. It has high compressive, tensile and flexural strengths that increase linearly with increasing bulk density. Its low weight reduces the load on the load-bearing structure, the transport costs and the effort in use. EPS is insoluble in water and its cells do not take water into their structure, which gives it very low water absorption. It can absorb water vapor only to a certain extent, and it is therefore important to ensure that the dew point does not lie inside the polystyrene structure for a long time. Polystyrene products have been used for many years for food packaging, which is evidence of their health safety. The rapid growth in the use of EPS as thermal insulation has necessitated the development of so-called self-extinguishing polystyrene, which meets the strict requirements for fire protection of buildings. Polystyrene used as thermal insulation reduces fuel consumption, making it very environmentally friendly. Last but not least, EPS production is also ecological and polystyrene is recyclable.
In construction, EPS is applied primarily for thermal insulation, but also for sound insulation. It can be used to create various decorative or shaping elements, for air filling, or for the foundations of traffic structures on soft soils. Everyone has encountered polystyrene as a packaging material, and its field of application is far from exhausted.
The extensive use of expanded polystyrene was our main motivation for determining the tensile strength of EPS and the corresponding density. The tests were performed on a jaw tear machine, and the bulk density was determined for each sample [7, 9, 10].

Correlation coefficients and correlability
Correlation analysis examines the relations between random variables, the most important being the intensity of their interdependence. The result of this examination is a correlation coefficient r, which takes values from -1 to +1. Values close to -1 or +1 indicate a strong correlation, whereas values close to 0 indicate a weak correlation or none at all. The Pearson correlation coefficient is calculated from the relationship

$$ r = \frac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\sqrt{\left(\overline{x^2} - \bar{x}^2\right)\left(\overline{y^2} - \bar{y}^2\right)}} \tag{1} $$

where Cov(x, y), Var(x), Var(y) are the covariance and the variances of the measurements of the variables x, y, and a bar above a quantity denotes its mean value.
An assumption for applying the coefficient (1) is a linear relationship between the variables x, y and their normal probability distributions.
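The Pearson coefficient (1) can be evaluated directly from the mean values it contains. The following is only a minimal sketch: the function name `pearson_r` and the sample values are hypothetical and are not the measured EPS data.

```python
import math


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson coefficient computed from means, as in relation (1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    # Covariance and variances expressed through means of products and squares.
    cov = mean([a * b for a, b in zip(x, y)]) - mx * my
    var_x = mean([a * a for a in x]) - mx * mx
    var_y = mean([b * b for b in y]) - my * my
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not the measurements of this study).
density = [15.0, 18.0, 20.0, 25.0, 30.0]
strength = [110.0, 150.0, 140.0, 210.0, 230.0]
r = pearson_r(density, strength)
print(f"r = {r:.3f}")
```

Because the formula subtracts products of means, it can lose precision for data with a large mean and a small spread; a centered form (subtracting the means first) is numerically safer in that case.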
Rank correlation coefficients are not sensitive to extreme values. Nor do they require assumptions about the type of distribution of the measurements or about a linear dependence between the variables. The Spearman rank correlation coefficient is calculated from the formula

$$ r_S = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\,(n^2 - 1)} \tag{2} $$

where $d_i = R_i^x - R_i^y$, i = 1, 2, …, n, are the differences between the ranks $R_i^x$ and $R_i^y$, and n is the number of observations. As can be seen from relation (2), the ranks of the measured quantities in the set are used instead of their values.
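The Kendall rank correlation coefficient is calculated from the formula

$$ r_K = \frac{c - d}{\sqrt{\left(c + d + x_{\mathrm{extra}}\right)\left(c + d + y_{\mathrm{extra}}\right)}} $$

where $c = \sum c_i$ and $d = \sum d_i$ are the numbers of concordant and discordant pairs of ranks between the independent variable x and the dependent variable y, and $x_{\mathrm{extra}}$ and $y_{\mathrm{extra}}$ are the numbers of pairs with the same rank only in x and only in y, respectively.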
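Both rank coefficients can be sketched in a few lines. The code below assumes the usual textbook forms: the Spearman coefficient from the rank differences for data without ties, and the Kendall coefficient in its tie-corrected form with counts of concordant, discordant and tied pairs. The function names and the sample values are hypothetical.

```python
import math


def ranks(values):
    """Ranks 1..n of the values; assumes all values are distinct (no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman coefficient from squared rank differences (no ties assumed)."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def kendall(x, y):
    """Kendall coefficient with a correction for pairs tied in one variable."""
    c = d = x_extra = y_extra = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied in both variables: not counted
            if dx == 0:
                x_extra += 1      # tied only in x
            elif dy == 0:
                y_extra += 1      # tied only in y
            elif dx * dy > 0:
                c += 1            # concordant pair
            else:
                d += 1            # discordant pair
    return (c - d) / math.sqrt((c + d + x_extra) * (c + d + y_extra))


# Hypothetical values for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(f"Spearman = {spearman(x, y):.2f}, Kendall = {kendall(x, y):.2f}")
```

For data with tied values, the Spearman coefficient would need average ranks instead of the simple rank positions used here.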
The sample correlation coefficients r are point estimates of the correlation coefficient ρ of the population. Therefore, a nonzero r does not mean that ρ is nonzero too. Testing the hypothesis H_0: the random variables are uncorrelated (ρ = 0) against the alternative hypothesis H_1: the random variables are correlated (ρ ≠ 0) is especially important when the number of measurements is small. This test can be performed for all three types of correlation coefficients described (Pearson, Spearman, Kendall) and is most often carried out at a significance level of α = 0.05. Refer to the literature for further details [5, 6].
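For the Pearson coefficient, a common form of this test uses a Student t statistic with n - 2 degrees of freedom. The text does not state which statistic was actually used, so the following is only a sketch under that assumption. The values of r are illustrative, and the critical value 2.048 is the tabulated two-sided value for 28 degrees of freedom at the 0.05 level.

```python
import math


def t_statistic(r, n):
    """Test statistic for H0: rho = 0 (Student t, n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Tabulated two-sided critical value of Student's t, 28 degrees of freedom, alpha = 0.05.
T_CRIT_28 = 2.048

n = 30  # number of samples, as in the study
for r in (0.3, 0.6, 0.8):  # illustrative coefficients, not the measured ones
    t = t_statistic(r, n)
    decision = "reject H0" if abs(t) > T_CRIT_28 else "do not reject H0"
    print(f"r = {r:.1f}: t = {t:.3f} -> {decision}")
```

With 30 samples, coefficients in the range 0.6-0.8 lie well beyond the critical value, whereas a coefficient of 0.3 would not allow H_0 to be rejected.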

Regression analysis
While correlation analysis determines the existence and intensity of the dependence between variables, regression analysis looks for the form of this dependence. The general form of a function that is linear in its parameters is written

$$ y = \sum_{j=0}^{k} a_j \varphi_j(x) $$

where $a_j$ are parameters (constants) and $\varphi_j(x)$ are functions of the independent variable that contain no further parameters [6]. The point estimation of the parameters $a_j$ most commonly uses the least squares method, which minimizes the sum of squares for error (SSE)

$$ \mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 $$

where $y_i$ are the measured values and $\hat{y}_i$ are the values given by the model. The confidence intervals of the parameters $a_j$ are based on the Student distribution with n-k-1 degrees of freedom.
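The least squares estimate for a model that is linear in its parameters can be sketched by forming and solving the normal equations. This is only an illustration, not necessarily the procedure used for the results in this paper. The helper names (`solve`, `fit`, `sse`), the basis lists and the sample values are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][c] * solution[c] for c in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit(basis, x, y):
    """Least squares estimates a_j for the model y ~ sum_j a_j * phi_j(x)."""
    design = [[phi(xi) for phi in basis] for xi in x]
    m = len(basis)
    # Normal equations: (design^T design) a = design^T y
    gram = [[sum(row[p] * row[q] for row in design) for q in range(m)]
            for p in range(m)]
    moment = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(m)]
    return solve(gram, moment)


def sse(basis, coeffs, x, y):
    """Sum of squares for error of the fitted model."""
    fitted = [sum(a * phi(xi) for a, phi in zip(coeffs, basis)) for xi in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Basis functions phi_j(x) for a 1st and a 2nd degree polynomial.
linear = [lambda x: 1.0, lambda x: x]
quadratic = linear + [lambda x: x * x]

# Hypothetical values for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
for name, basis in (("linear", linear), ("quadratic", quadratic)):
    a = fit(basis, xs, ys)
    print(name, [round(v, 4) for v in a], round(sse(basis, a, xs, ys), 5))
```

The normal equations become ill-conditioned for higher polynomial degrees, so a production implementation would normally rely on an orthogonal decomposition instead.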
From SSE it is possible to determine the mean squared error (MSE) and the root mean squared error (RMSE)

$$ \mathrm{MSE} = \frac{\mathrm{SSE}}{n - k - 1}, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}} $$

When the least squares method is used, the relationship

$$ \mathrm{SST} = \mathrm{SSR} + \mathrm{SSE} \tag{7} $$

must hold, where SST is the total sum of squares and SSR is the sum of squares for regression; the remaining notation is explained above. According to (7), the coefficient of determination is $R^2 = \mathrm{SSR}/\mathrm{SST} = 1 - \mathrm{SSE}/\mathrm{SST}$. The regression model as a whole can be tested through the significance of the coefficient of determination $R^2$, that is, of all regression coefficients with the exception of the absolute coefficient $a_0$. At the selected significance level α we test the hypothesis $H_0$: $a_1 = a_2 = … = a_k = 0$, i.e. the regression model is statistically insignificant, against the alternative hypothesis $H_1$: at least one of the regression coefficients $a_1, a_2, …, a_k$ is nonzero, i.e. the regression model is statistically significant.
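The characteristics named above can be collected in one small function. The sketch assumes the usual definitions: MSE as SSE divided by the n-k-1 degrees of freedom, the adjusted coefficient of determination as 1 - (1 - R^2)(n - 1)/(n - k - 1), and an F statistic for the test of the model as a whole. The text does not spell out these forms, so they are assumptions, and the data are hypothetical.

```python
import math


def regression_summary(y, y_hat, k):
    """Fit characteristics for a model with k regressors plus the constant a_0."""
    n = len(y)
    y_mean = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # error
    ssr = sum((fi - y_mean) ** 2 for fi in y_hat)           # regression
    sst = sum((yi - y_mean) ** 2 for yi in y)               # total
    dfe = n - k - 1                                         # degrees of freedom
    mse = sse / dfe
    r2 = 1 - sse / sst
    return {
        "SSE": sse, "SSR": ssr, "SST": sst, "DFE": dfe,
        "MSE": mse, "RMSE": math.sqrt(mse),
        "R2": r2,
        "R2_adj": 1 - (1 - r2) * (n - 1) / dfe,  # assumed form of the modified R^2
        "F": (ssr / k) / mse,                    # assumed statistic of the overall test
    }


# Hypothetical data; y_hat comes from the least squares line y = 2.2 + 0.6 x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.2 + 0.6 * xi for xi in x]
summary = regression_summary(y, y_hat, k=1)
for key, value in summary.items():
    print(f"{key:7s} {value:.4f}")
```

Because the adjusted coefficient penalizes every additional parameter through the factor (n - 1)/(n - k - 1), it can fall when a higher-degree term reduces SSE only slightly, while the unadjusted coefficient can only rise.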

RESULTS AND DISCUSSION
The obtained values of bulk density and tensile strength are given in Tab. 1. We performed 30 measurements, which are summarized statistically by the boxplots in Fig. 1. The interdependence of the two sets was estimated by the different types of correlation coefficients in Tab. 2. Their values are positive and relatively high, which indicates a direct dependence between the quantities. As mentioned above, it is appropriate to verify the correlation with the hypothesis H_0: ρ = 0 at the significance level α = 0.05. The p-values in the second row of Tab. 2 are less than α, which makes it possible to reject H_0 and assume ρ ≠ 0.
In Fig. 2, we plotted the measured values and fitted the first three polynomials and the exponential function. The polynomials of 4th, 5th and 6th degree are not shown, to keep the figure clear. However, there is an apparent tendency: the higher the degree of the polynomial, the better it adapts to the measured values. The 3rd degree polynomial, in green, is noteworthy because it curves downward on the right side, which may indicate a poor ability to predict tensile strengths for higher bulk densities, as it is likely to tend to zero. The characteristics of regression analysis were explained in the theoretical part and are presented for the individual models in Tab. 3. It can be seen that the sum of squares for error (SSE) decreases with increasing degree of the polynomial, and that the exponential model lies between the linear and quadratic functions. The root mean squared error (RMSE) decreases only up to the 3rd degree polynomial and then begins to grow. The coefficient of determination R^2 improves with increasing degree of the polynomial, but its modified form R*^2 improves only up to the 3rd degree of the polynomial and then begins to decrease. The meaning of the SSR and SST characteristics is explained above; DFE represents the number of degrees of freedom in the model. The test of significance of the regression model as a whole is done at the level α = 0.05, and since the p-values are less than this value, we can conclude that the linear, quadratic, cubic and 4th degree models contribute significantly to the estimation of the dependent variable. We performed the test only up to the 4th degree polynomial, because numerical problems arose at higher degrees. In Tab. 4, we present point estimates of the coefficients in the regression models. For polynomials, the index j of the coefficient a_j is also the power of the independent variable, x^j, to which the coefficient belongs.
Since these are only values calculated from our particular realization of the random variables X, Y, it is also appropriate to construct confidence intervals of the coefficients, which are given in Tab. 5. However, except for the linear and exponential models, all intervals at the higher degrees of the polynomial contain zero, which weakens the regression model. In the final Figures 3-6, we plot the 95% confidence intervals of the regression models for the 1st, 2nd and 3rd degree polynomials as well as for the exponential model.
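Whether a confidence interval of a coefficient includes zero can be illustrated on the simplest case, the slope of a linear model. The sketch uses the standard error formula for simple linear regression and takes the Student critical value as an input. The function name, the data and the critical value 3.182 (two-sided, 3 degrees of freedom, 0.05 level) are illustrative assumptions and not values from Tab. 5.

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Confidence interval for the slope a_1 of the model y = a_0 + a_1 x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = my - a1 * mx
    sse = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)                 # n - k - 1 degrees of freedom with k = 1
    std_err = math.sqrt(mse / sxx)      # standard error of the slope
    return a1 - t_crit * std_err, a1 + t_crit * std_err


T_CRIT_3 = 3.182  # tabulated two-sided value, 3 degrees of freedom, alpha = 0.05

# Hypothetical data sets for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
noisy = [2.0, 4.0, 5.0, 4.0, 5.0]
tight = [2.1, 3.9, 6.2, 7.8, 10.1]

for name, y in (("noisy", noisy), ("tight", tight)):
    low, high = slope_confidence_interval(x, y, T_CRIT_3)
    verdict = "includes zero" if low <= 0.0 <= high else "excludes zero"
    print(f"{name}: [{low:.3f}, {high:.3f}] {verdict}")
```

An interval that includes zero means that, at the chosen level, the data do not rule out the coefficient being zero, so the corresponding term may add nothing to the model.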
After reviewing all results, the linear and exponential forms of regression appear to be the best choices, while the exponential shows slightly more favorable values.

CONCLUSIONS
Expanded polystyrene (EPS) is a material that is at once environmentally friendly, practical and reliable [4]. Its uses as a building material are varied, and its properties therefore deserve to be determined carefully [3]. As the bulk density of EPS increases, its tensile strength increases too [8].
It is important to determine this functional dependence as accurately as possible [11]. The intensity of the dependence is specified by various correlation coefficients, and their values in the range of about 0.6-0.8, found from the measured values, indicate a relatively strong correlation [13]. The seven types of regression models used made it possible to calculate the characteristics of each corresponding model [12]. As the results show, it is not necessary to complicate the mathematical formulation of the regression: a linear polynomial is sufficient, since it shows the best parameters among the polynomials and is also easier to calculate. Only the exponential function performed better, and it too has a relatively uncomplicated formula.