Classification and regression tree (CART) modelling for analysis of shear strength of FRP-RC members

Abstract The shear behaviour of concrete members reinforced with fibre-reinforced polymer (FRP) bars differs from that of steel-reinforced concrete members. The use of non-parametric techniques, which can explain the relationship between different properties of FRP members in more detail, has not yet been well researched in this area. This study used a non-parametric technique, namely Classification and Regression Tree (CART), to predict the shear capacity of FRP-reinforced concrete members. The members were reinforced only in the longitudinal direction, without vertical reinforcement. A total of 216 experimental results were used to train and test the CART model. The outcomes from CART were compared to traditional models and equations, as well as to another non-parametric model from previous research. The comparison showed that CART has better accuracy than the other models. It can therefore be used to develop guidelines for the design of FRP-reinforced concrete members.


Introduction
Chloride-ion-induced corrosion in structures reinforced with steel bars accounts for about 40% of modern highway bridge repair expenses (Weyers, Prowell, Sprinkel, & Vorster, 1993). To address this issue, FRP rebar is used as a replacement for conventional reinforcement owing to its resistance to corrosion. In addition, FRP rebar has high strength, excellent fatigue resistance, and non-magnetic properties (Nanni, 1993; Wegian & Abdalla, 2005; ACI, 2006). Nevertheless, the main disadvantages of these bars are their linear elastic behaviour and their lower modulus of elasticity compared to steel bars.
Members without web reinforcement carry shear by means of arch action, residual tensile stress across cracks ($f_t$), the un-cracked concrete compression zone ($V_{cz}$), dowel action of the longitudinal reinforcement ($V_d$), and aggregate interlock ($V_a$) (ACI-ASCE Committee 445 on Shear & Torsion, 1998). The detailed mechanism is shown in Figure 1. The sum of these contributions is called the concrete contribution to shear resistance, $V_c$. The aim of this paper is to determine this contribution.
The shear strength of concrete members was found to vary linearly with $\sqrt{f'_c}$, where $f'_c$ is the compressive strength of concrete (ACI-ASCE Committee 445 on Shear & Torsion, 1998). The strength also decreases as the amount of longitudinal reinforcement decreases. In contrast, the shear capacity of concrete members increases as the shear span-to-depth ratio ($a/d$) decreases (ACI-ASCE Committee 445 on Shear & Torsion, 1998). Similarly, the shear capacity of a member decreases with increasing depth as long as all other factors remain the same (Kani, 1967).
The predictions obtained with the available methods are widely scattered, and some are very conservative. Some methods have been revised and are becoming more rigorous.
A Generalized Regression Neural Network (GRNN) model has been used successfully to estimate the shear capacity of FRP-reinforced members without web reinforcement. This model was found to be more precise than some of the available design codes and methods (Alam & Gazder, 2020). However, the performance of the model depends on the training samples, and some effort may be required to determine the optimum network structure. Several other computational techniques have also been proposed by various authors for predicting shear capacity. These include computational intelligence (Naderpour, Haji, & Mirrashid, 2020), Artificial Neural Networks (ANN) (Bashir & Ashour, 2012; Lee & Lee, 2014; Naderpour, Poursaeidi, & Ahmadi, 2018), bio-inspired predictive models, gene expression programming, regression analysis (Jumaa & Yousif, 2018), and genetic programming (Kara, 2011).
On the other hand, Classification and Regression Tree (CART) analysis is a non-parametric technique that allows variables to be included at more than one level of the tree. In this way, complex dependencies that remain hidden in GRNNs can also be discovered. With its adaptive interpretation capabilities, CART has successfully handled complex non-linearity between predictors and response (Gong, Sun, Shu, & Huang, 2018). It can also handle multi-collinearity in the data more appropriately. In addition, CART analysis provides a model that can be interpreted through logical statements to understand the effects of various variables on the target variable, a feature that is often not found in other data mining tools (Shaaban & Pande, 2016). Its use for modelling continuous variables has gained acceptance in many fields. Some researchers have also compared the accuracy of CART with that of other popular techniques, such as artificial neural networks and support vector machines, and found that regression trees were more accurate (Rodriguez-Galiano, Sanchez-Castillo, Chica-Olmo, & Chica-Rivas, 2015). In spite of these advantages and its growing popularity, CART is rarely applied in some fields. The modelling of concrete properties is one such field in which its application cannot be found. This is despite the fact that a regression tree can be used directly to develop a rule base for concrete mix design for optimum strength. Hence, with this aim in mind, a regression tree was used in this study to model the shear strength of FRP-reinforced concrete members. The results were then compared with those of the GRNN model and other traditional models to assess whether CART can represent the behaviour of FRP members better than the other available models.

Description of the models
Ten different types of models were applied to the data, eight of which can be considered traditional methods, including that adopted by the American Concrete Institute (Table 1). The details of the GRNN model can be found in Alam and Gazder (2020). The following sections provide a brief explanation of the CART model that was developed for this study.
CART is a technique that has evolved due to advances in computer technology and the limitations of other statistical techniques (Loh, 2011). While other statistical techniques require prior knowledge of the model type, CART avoids this requirement by treating the variables as classifiers. These classifiers can have a multilevel impact on the dependent (output) variable (Loh, 2011). A CART model is a series of nodes that represent combinations of different levels of the input variables. At each level of the tree, the variable that provides the best split is selected and the succeeding nodes are developed for that variable (Loh, 2011).
Similar to other regression models, each node in the regression tree provides a mean value of the dependent variable for the series of nodes up to that level. A variable can be considered more important if it is the primary split (top level node) in the tree or affects the tree at multiple levels. Alternatively, variables that do not appear to be the best split in any of the levels can be eliminated from the tree (Loh, 2011). In contrast to ANNs, the CART model gives a clear image of the data distribution with respect to the output variable.
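The split selection described above can be illustrated with a minimal sketch. It assumes that the "best split" is the one that minimises the total sum of squared errors of the two child nodes, which is the usual regression-tree criterion but is not stated explicitly in this paper. The function names and the small data set are hypothetical and serve only as an illustration.

```python
from statistics import fmean

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, targets):
    """Return (feature_index, threshold, total_sse) of the split that
    minimises the summed SSE of the two child nodes.
    Assumed criterion: least squares; not taken from the paper."""
    best = None
    for j in range(len(rows[0])):
        levels = sorted({r[j] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (j, t, total)
    return best

# Made-up data: each row is (depth d in mm, a/d); targets are shear strengths.
rows = [(150, 3.0), (200, 5.0), (250, 3.0), (400, 5.0), (500, 3.0), (600, 5.0)]
targets = [20.0, 22.0, 21.0, 60.0, 62.0, 61.0]
feature, threshold, err = best_split(rows, targets)
```

In this made-up example the depth separates the low and high strengths cleanly, so it is chosen as the primary split, and each child node would report the mean of its own targets.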
As stated earlier, CART takes the form of a regression tree that shows the expected (average) response of the dependent variable in a given scenario. Although its structure seems more suitable for classification problems, it has been used in the past for prediction problems (Demisse et al., 2017). Some studies have compared its accuracy with that of competing models, and it was found to perform better in a number of cases (Aertsen, Kint, Van Orshoven, Özkan, & Muys, 2010).

Dataset
A test data set of 216 rectangular beams and corresponding one-way slabs that failed in shear was compiled from the available literature. The samples were reinforced only in the longitudinal direction, with no vertical reinforcement. The samples were simply supported and were tested in a four-point flexure arrangement. In this study, the following input parameters are used to predict shear strength: shear span-to-depth ratio ($a/d$), beam depth ($d$), concrete compressive strength ($f'_c$) and axial rigidity of the longitudinal reinforcement ($\rho_f E_f$). To reduce the number of input parameters, the shear strength was normalized by the width of the beam. The normalized output was then multiplied by the width to obtain the shear capacity. The compressive strength of the samples ranged from 24.0 to 88.3 MPa, the shear span-to-depth ratio from 2.5 to 6.5, the reinforcement ratio from 0.11 to 2.63%, and the effective depth from 140 to 940 mm. Table 2 shows the detailed data set, and Table 3 shows the minimum, maximum, mean, standard deviation and coefficient of variation (CoV) of the data set parameters.

Procedure
The data were divided into two sets comprising 144 samples (67% of the samples) and 72 samples (33% of the samples). The larger set was used to train the CART model, while the 72 remaining samples were used to test the accuracy of the model on unseen values. A larger training set is expected to provide more consistent and accurate predictions. The samples were randomly assigned to each data set. The accuracy of the model was assessed by computing the ratio of experimental to predicted shear strength for each sample, and then calculating the mean and the CoV (%) of this ratio. Furthermore, the Mean Absolute Percent Error (MAPE) was calculated for the CART model on the training and test data sets, as well as on their combination, for comparison with previous models. The details of the regression tree in the CART analysis are shown in Figure 2 and Table 4.
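The random 144/72 split and the three accuracy measures described above can be sketched as follows. This is a minimal illustration under stated assumptions: the CoV uses the sample standard deviation, the MAPE is taken relative to the experimental value, and the seed, function name and numerical values are hypothetical rather than taken from the study.

```python
import random
from statistics import fmean, stdev

def accuracy_metrics(experimental, predicted):
    """Return (mean ratio, CoV in %, MAPE in %) for paired strength values.
    Assumptions: sample standard deviation; MAPE relative to experiment."""
    ratios = [e / p for e, p in zip(experimental, predicted)]
    mean_ratio = fmean(ratios)
    cov_percent = 100 * stdev(ratios) / mean_ratio
    ape = [abs(e - p) / e for e, p in zip(experimental, predicted)]
    mape_percent = 100 * fmean(ape)
    return mean_ratio, cov_percent, mape_percent

# Random assignment of 216 sample indices to training (144) and test (72).
# The seed is arbitrary and only makes the sketch repeatable.
indices = list(range(216))
random.Random(0).shuffle(indices)
train_idx, test_idx = indices[:144], indices[144:]

# Made-up experimental and predicted shear strengths for illustration.
experimental = [20.0, 40.0, 50.0]
predicted = [20.0, 50.0, 40.0]
mean_ratio, cov_percent, mape_percent = accuracy_metrics(experimental, predicted)
```

A mean ratio close to 1.0 with a small CoV indicates predictions that are unbiased and consistent, while the MAPE summarises the typical size of the error regardless of its sign.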
Each node in Figure 2 shows the mean (expected) value of the shear strength when the criteria for that node are met. It also shows the node number and the limiting criteria (in the box) for the node. Each node uses one variable for its decision criterion. For example, node 2 applies to samples having $d$ less than or equal to 307.5 mm. Table 4 provides the details for each node: the node number, the number of samples in the combination denoted by the node, the mean value of the output variable (shear strength) for that combination, the variable that splits the node at the next level, and the value chosen as the split criterion.
For example, node 8 has a sample size of 24, i.e., 24 samples from the available data set meet the criteria of node 4 in the regression tree. These 24 samples have a mean shear strength of 23.02 kN. Since it is a terminal node, there is no split variable for it, which means that samples meeting the conditions of that node are not affected by any other variable. The CART model specifies the conditions for a specific value of the shear strength. For example, node 7 shows that the expected shear strength is 26.42 kN when $d \le 219$ mm and $\rho_f E_f$ is greater than 1,115 MPa. Figure 3 shows the accuracy parameters of the CART model for the training and test data sets. The mean ratio of experimental to predicted values was 1.0 and 1.02, the CoV was 16% and 21.1%, and the MAPE was 13.08 and 17.04 for the training and test data sets, respectively. These values are reasonably close to one another for the training and test data sets, so the model can be considered robust. This is further supported by the regression plots shown in Figures 3 and 4.
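The node 7 conditions quoted above can be written as a logical statement, which is how a CART rule would typically be applied in practice. The sketch below is illustrative only: it assumes the depth condition is "at most 219 mm", encodes just this one node rather than the full tree of Figure 2, and uses hypothetical function and constant names.

```python
# Mean shear strength reported in the text for node 7 (kN).
NODE7_EXPECTED_SHEAR_KN = 26.42

def node7_rule(d_mm, rho_f_e_f_mpa):
    """True when a sample meets the node 7 conditions.
    Assumption: the depth limit is inclusive (d <= 219 mm)."""
    return d_mm <= 219 and rho_f_e_f_mpa > 1115

def expected_shear(d_mm, rho_f_e_f_mpa):
    """Return the node 7 mean if its rule applies, otherwise None.
    The remaining branches of the tree are not reproduced here."""
    if node7_rule(d_mm, rho_f_e_f_mpa):
        return NODE7_EXPECTED_SHEAR_KN
    return None

shallow_stiff = expected_shear(200, 1500)   # falls under node 7
deep_member = expected_shear(400, 1500)     # outside node 7
```

Each terminal node of the tree corresponds to one such conjunction of conditions, so the complete model amounts to a set of mutually exclusive rules.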

Results and discussions
All models, including the GRNN from a previous study (Alam & Gazder, 2020), were assessed using the mean (m), coefficient of variation (CoV), and MAPE of the prediction, based on the ratio of the test value to the estimated value of the shear strength. The values are provided in Table 5 for comparison, including CART and the traditional models. They demonstrate that the developed CART model is more accurate than the traditional models as well as the GRNN, with a lower m.
Moreover, the CART analysis shows that the most important variables in the tree are $d$ and then $\rho_f E_f$, as they appear in the top two levels of the tree. Naderpour et al. (2018) observed that the most important parameters in ANN modelling are the depth of the beam ($d$) and the elastic modulus of the reinforcement ($E_f$). The current results are similar to those authors' results. All variables show non-linear effects, which can be seen from their occurrence at multiple levels of the tree. The highest value of the shear strength is 181.7 kN for node 21, which is reached when the depth is more than 519 mm and $\rho_f E_f$ is less than 2,072 MPa. Since it was established earlier that the CART model is more accurate than the other models, it was used to develop rules for the shear design of members. Table 6 shows the rules that were developed with CART. For each terminal node in the CART model, a 95% confidence interval was calculated using its mean and its variance. These rules can serve as convenient guidelines for design engineers working with FRP-reinforced structural members. Table 6 shows that shallow beams with high elasticity, as indicated by $\rho_f E_f$, have lower shear strength, irrespective of the strength of the concrete and the shear span-to-depth ratio. The strength becomes even lower when low-strength concrete is used. Conversely, greater depths with moderate elasticity give the highest range of shear strength, irrespective of other factors. The shear strength increases with increasing depth, $\rho_f E_f$ and $f'_c$, while it decreases with increasing shear span-to-depth ratio. The advantages of the CART model over other models are its higher accuracy and its convenient tree structure.
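The 95% confidence interval attached to each terminal node can be sketched from the node mean and variance. The paper does not state the exact formula, so the version below assumes a normal approximation for the node mean (mean plus or minus 1.96 standard errors). The function name is hypothetical, and the variance in the example is a made-up value; only the mean of 23.02 kN and the sample size of 24 come from the node 8 example in the text.

```python
from math import sqrt

def node_interval_95(mean, variance, n):
    """95% interval for a terminal-node mean.
    Assumption: normal approximation, half-width = 1.96 * sqrt(variance / n)."""
    half_width = 1.96 * sqrt(variance / n)
    return mean - half_width, mean + half_width

# Node 8 in the text: mean 23.02 kN from 24 samples.
# The variance of 16.0 kN^2 is a placeholder, not a reported value.
lower, upper = node_interval_95(23.02, 16.0, 24)
```

For small terminal nodes, a Student t quantile in place of 1.96 would give a somewhat wider and more cautious interval.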

Conclusions
This paper presents the prediction of the shear strength of FRP-reinforced concrete members without diagonal reinforcement, carried out using CART and some existing design codes and guidelines from previous studies. A data set of 216 test results for rectangular beams and corresponding one-way slabs was used. The performance of the CART model developed in the current study was compared to that of other models used to predict shear strength. The main conclusions of this research are as follows:
1. The performance of CART is better than that of the other models, including GRNN; unlike GRNN, it also shows the trend of change in shear strength with respect to the input variables. It can be recommended for use in developing standards and guidelines for the design of FRP-reinforced members. Furthermore, the CART model provides more insight into the interrelations between the variables, as listed below.
2. The most important variable in the CART model for predicting shear strength is the depth of the member.
3. At higher depth (>307.5 mm) and an elasticity coefficient $\rho_f E_f$ (>2,072 MPa), the shear strength is the highest, irrespective of concrete strength and shear span-to-depth ratio.
4. The CART model also shows that $\rho_f E_f$ of more than 1,115 MPa combined with a depth of less than 219 mm has a detrimental effect on the shear strength.
5. Since the non-parametric techniques (CART and GRNN) have performed better than the conventional models, there is scope to explore these non-parametric techniques for steel-reinforced members.

Disclosure statement
No conflict of interest.