Correlation and path coefficient analysis of polygenic traits of upland cotton genotypes grown in Zimbabwe

Abstract Cotton is an important crop whose traits are associated to differing degrees because of genetic and environmental factors. To determine the degree of association between seed cotton yield and other important traits, a study was conducted using a randomized complete block design (RCBD) experiment with ten genotypes. Data on seed cotton yield, ginning outturn (GOT), lint yield, boll weight, bolls per plant, seed weight, plant height, and fibre length, elongation, fineness and strength were collected and analysed. Genotypic and phenotypic correlation analysis was done in Meta R. Direct and indirect effects were estimated using path analysis in Microsoft Excel. Analysis of variance revealed significant differences for boll weight, seed weight, GOT and plant height. At the genotypic level, seed cotton yield was correlated with lint yield (r = 0.71***), fibre elongation (r = 0.54***), bolls per plant (r = 0.27***), seed weight (r = 0.22***), strength (r = 0.21***) and fineness (r = 0.13*). Ginning outturn was correlated with lint yield (r = 0.70***), elongation (r = 0.60***) and strength (r = 0.50***). Boll weight was correlated with seed weight (r = 0.56***), whilst plant height was highly associated with fibre strength (r = 0.58***). The high adjusted R square (0.98), low standard error (0.12) and low residual effect (R = 0.01) in the regression analysis indicated that the variability of seed cotton yield was explained by the causal variables. Lint yield had the highest direct effect on seed cotton yield (r = 1.055). Traits that could be used for indirect selection were bolls per plant (r = 0.006), seed weight (0.022) and/or plant height (0.012). Gin outturn (0.737) had the highest indirect contribution to seed cotton yield through lint yield, followed by strength (0.012) through plant height, seed weight (0.011) through boll weight, and fibre fineness (0.010) through boll weight. It was therefore concluded that selection of high-yielding cotton genotypes could place more emphasis on lint yield, boll weight, plant height and bolls per plant for better-performing lines. Gin outturn and fibre strength could be used indirectly to improve seed cotton yield through other traits.

ABOUT THE AUTHOR Chapepa Blessing is a plant breeder working at the Cotton Research Institute of Zimbabwe as a cotton breeder. His work covers the development and implementation of the national cotton breeding program, data analysis and information dissemination, in the Department of Research and Specialist Services of the Ministry of Agriculture. Other research interests include crop modelling using remote sensing technology and molecular tool applications. Marco Mare is also a plant breeder at the Cotton Research Institute and is involved in the development and implementation of the national cotton breeding program, data analysis and information dissemination. Washington Mubvekeri is the head of the Cotton Research Institute and is responsible for all cotton research programs in the country. Dr. Dumisani Kutywayo is the director of research in the Department of Research and Specialist Services and plays a critical role in ensuring quality research outputs.

PUBLIC INTEREST STATEMENT
Cotton variety development is a complex and rigorous activity because many traits of economic importance within the plant are negatively correlated. It is therefore imperative that these traits are studied and understood so that a database can be built to support informed decisions when selecting the best varieties.

Introduction
Cotton (Gossypium hirsutum L.) is one of the most important industrial crops in the world (Farias et al., 2016a). It is used for both fibre and feed production. It accounts for a third of the global trade, with the major exporters being the United States, Uzbekistan, and India, and the major importers being China and Southeast Asian countries. In Zimbabwe, it is the third most important cash crop and benefits all levels of the cotton value chain (AMA, 2018). It contributes significantly to the gross domestic product of the economy and as such has received a lot of attention in funding and research. This has resulted in considerable improvement of local varieties in yield and other associated traits (Riaz et al., 2019). However, the relationships that exist amongst the traits are greatly influenced by genetic constitution and the environment, and they need clarification under existing conditions through exploration of genotypic and phenotypic correlations (A.M. Khan et al., 2017).
Estimates of genotypic and phenotypic correlations among characters are useful in planning and evaluating breeding programmes. Knowledge of the correlation that exists between important characteristics facilitates the selection of genotypes with high field and fibre performance in cotton (Shabbir et al., 2016). The genetic correlation value offers a measure of the genetic inter-relationship between characteristics and may explain the degree of relationship between characters genetically, rather than phenotypically. Just as the phenotypic variance of a trait can be partitioned into environmental and genetic components, the covariance between two traits can also be partitioned into environmental and genetic components (Ahmad et al., 2016). The overall effect is that selection of cotton genotypes based on one trait may result in gross changes in the genetic constitution influencing the expression of other important agronomic traits (Farias et al., 2016b). These changes arise because correlations do not consider the relative importance of the direct and indirect effects of the traits on seed cotton yield (Lynch & Walsh, 1998). Wright (1921) proposed path analysis, which provides an understanding of trait association by partitioning the correlation coefficients into direct and indirect effects on the trait of interest, which in the case of cotton is seed cotton yield. The causal variables are first standardised, and the estimates of these effects are then obtained from the regression equations. A few studies on cotton using fibre yield as the primary dependent variable have applied this method for breeding purposes (Iqbal et al., 2003; Tyagi et al., 1998). The objective of the study was to establish the relationship between seed cotton yield and polygenic traits and to determine the relative importance of the direct and indirect effects each trait has on seed cotton yield.

Description of the study sites
The field evaluations were carried out at seven locations that are representative of the cotton growing areas in Zimbabwe, namely Kadoma (29° 53ʹ east), Matikwa (32° 14ʹ east), Shamva (31° 71ʹ east), Kuwirirana (30° 48ʹ east), Muzarabani (31° 00ʹ east), Wozhele (30° 14ʹ east) and Chitekete (28° 56ʹ east). The site selection was based on representation of all the major cotton growing areas within the middleveld cotton production areas of the country, making the locations random effects.

Experimental layout and design
The trials were laid out in randomized complete block designs (RCBD) with three replications. The gross plot size was 36 m² and the net plot size, from which the data were collected, was 16 m². At each location, the agronomic practices recommended for raising a successful crop in the Cotton Handbook of 1998 were applied.

Data collection and data analysis
Data were collected on seed cotton yield, boll weight, ginning outturn (GOT), plant height and bolls per plant, and fibre data were collected on fibre strength, fibre length, fibre elongation and micronaire (fibre fineness). Analysis of variance was carried out for these traits using GenStat version 18. Correlations of both types (genetic and phenotypic) were calculated using Meta R software version 6.0 from seasonal means, using the analysis of variance and covariance procedures proposed by Al-Jibouri et al. (1958) and Falconer and Mackay (1996), as follows:

$$ r_P = \frac{\mathrm{COV}_P(A,B)}{\sqrt{V_P(A)\,V_P(B)}} $$

where $r_P$ is the phenotypic correlation coefficient; $\mathrm{COV}_P(A,B) = \mathrm{COV}_G(A,B) + V_e(A,B)$ is the phenotypic covariance between variables A and B; $V_P(A) = V_G(A) + V_e(A)$ is the phenotypic variance for variable A; $V_P(B) = V_G(B) + V_e(B)$ is the phenotypic variance for variable B; $V_e = M_e/r$ is the error variance; and $r$ is the number of replications.
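The correlation formula above can be illustrated with a short Python sketch. It is not the Meta R implementation; the component values are hypothetical, and the genotypic correlation is assumed to follow the same ratio form with the genotypic components in place of the phenotypic ones, which is the usual convention but is not spelled out in the text.

```python
import math

def correlation_from_components(cov_ab, var_a, var_b):
    """Return COV(A,B) / sqrt(V(A) * V(B))."""
    return cov_ab / math.sqrt(var_a * var_b)

# Hypothetical components for two traits A and B (not data from this study).
cov_g, v_g_a, v_g_b = 0.30, 0.50, 0.80   # genotypic covariance and variances
cov_e, v_e_a, v_e_b = 0.05, 0.25, 0.40   # error terms, assumed already divided by r

# Phenotypic components are the sums of the genotypic and error components.
cov_p = cov_g + cov_e
v_p_a = v_g_a + v_e_a
v_p_b = v_g_b + v_e_b

r_g = correlation_from_components(cov_g, v_g_a, v_g_b)  # assumed genotypic analogue
r_p = correlation_from_components(cov_p, v_p_a, v_p_b)  # formula given in the text
print(f"r_G = {r_g:.3f}, r_P = {r_p:.3f}")
```

With these illustrative numbers the genotypic correlation is larger than the phenotypic one, because the error variances inflate the denominator more than the error covariance inflates the numerator.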
Mean scores of the various traits were standardized, since the units of measurement of the traits were different, using the expression z = (x − m)/sd, where z = standardized score, x = score, m = group mean and sd = group standard deviation. Regression analysis was then carried out using Microsoft Excel 2007 to determine the partial regression coefficients (also known as the direct path coefficients). These were then multiplied by the correlation coefficients to determine the indirect path coefficients (Akintunde, 2012). Seed cotton yield was considered the dependent variable and the rest of the traits were considered independent variables. Statistical significance of the correlation coefficients for the different trait associations was determined using Microsoft Excel 2007.
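The procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the Excel workflow used in the study: the trait names and correlation values are hypothetical, the sample standard deviation is assumed for standardization, and the direct effects are obtained by solving the normal equations on the correlation matrix, which gives the same coefficients as a regression on standardized variables.

```python
import statistics

def standardize(values):
    """z = (x - m) / sd, assuming the sample standard deviation."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - m) / sd for x in values]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

z = standardize([2.0, 4.0, 6.0])  # mean 4, sd 2

# Hypothetical correlations among three causal traits (illustrative only).
traits = ["lint_yield", "boll_weight", "plant_height"]
r_xx = [[1.0, 0.3, 0.2],
        [0.3, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
# Hypothetical correlations of each trait with seed cotton yield.
r_xy = [0.9, 0.4, 0.3]

# Direct path coefficients: standardized partial regression coefficients.
direct = solve(r_xx, r_xy)

# Indirect effect of trait i through trait j: r_ij multiplied by the direct effect of j.
k = len(traits)
indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(k)]
            for i in range(k)]

for i, name in enumerate(traits):
    total = direct[i] + sum(indirect[i])
    print(f"{name}: direct={direct[i]:.3f} total={total:.3f} r={r_xy[i]:.3f}")
```

For each trait, the direct effect plus the sum of its indirect effects reproduces its correlation with the dependent variable, which is the partition that path analysis relies on.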

Results and discussion
The analysis of variance for the mean values of the 10 genotypes revealed highly significant differences (P < 0.001) for boll weight and seed weight, and significant differences (P < 0.05) for gin outturn percentage and plant height, as shown by the mean squares of the different traits in Table 1. According to Salahuddin et al. (2010), this provides enough evidence of significant genetic variability for these traits among the genotypes, which in turn influences other traits associated with them.
Table 2 shows the results of genotypic correlation analysis of seed cotton yield and the associated traits (significance levels: * = 0.05, ** = 0.01, *** = 0.001). More traits showed high positive correlation at the genotypic level than at the phenotypic level, which indicated the high inherent association of the traits in the study (Table 1). Seed cotton yield was significantly and highly positively correlated with lint yield (r = 0.71***) and fibre elongation (r = 0.54***) at the genotypic level. Other significant positive genotypic correlations existed between seed cotton yield and bolls per plant (r = 0.27***), seed weight (r = 0.22***), fibre strength (r = 0.21***) and fibre fineness (r = 0.13*), with a negative correlation being observed with plant height (r = −0.52***). A strong and significant genotypic relationship was observed between GOT and lint yield (r = 0.70***), fibre elongation (r = 0.60***) and fibre strength (r = 0.50***). Positive and significant relationships also existed between GOT and boll weight (r = 0.24***), plant height (r = 0.30***), fibre length (r = 0.11**) and micronaire (r = 0.40***). Boll weight was genotypically correlated with seed weight (r = 0.56***), whilst plant height was highly associated genotypically with fibre strength (r = 0.58***). Elongation was also correlated genotypically with strength (r = 0.55***), whilst strength was correlated with micronaire (r = 0.71***). Khalid et al. (2018) similarly reported positive and significant correlation of seed cotton yield with plant height, number of bolls per plant and boll weight. In another study, Salahuddin et al. (2010) also found high correlation between seed cotton yield and boll numbers per plant, plant height and boll weight. Hence, selection of cotton plants having these distinguishing traits will facilitate seed cotton yield improvement.
At the phenotypic level, there was a high positive and significant (P < 0.001) correlation between seed cotton yield and lint yield (r = 0.93), whilst positive and significant (P < 0.001) correlations existed between seed cotton yield and boll weight (r = 0.37) and plant height (r = 0.31), as shown in Table 3. Seed weight was significantly and positively correlated with boll weight (r = 0.61***), elongation with fibre length (r = 0.54***), fibre strength with fibre length (r = 0.72***), fibre micronaire with fibre length (r = 0.63***), fibre micronaire with fibre elongation (r = 0.56***) and fibre micronaire with fibre strength (r = 0.73***). Where the phenotypic correlation was higher than the genetic correlation, the apparent association between the two traits would be due not only to genes but also to a favourable influence of the environment, and where values were zero or insignificant the two traits would be independent (Hussain et al., 2010).
The positive association of seed weight with boll weight, plant height and bolls per plant will facilitate simultaneous improvement through selection for these traits, as the latter are associated with seed cotton yield. However, simultaneous improvement of seed weight and seed cotton yield, or of plant height and seed cotton yield, is very difficult due to the negative association within each pair of traits. Independent selection should be done for the improvement of such traits.
In this study, the regression analysis of seed cotton yield against the associated traits, presented in Table 4, revealed a high adjusted R square (0.98), a low standard error (0.12) and a low residual effect (R = 0.01). This indicated that most of the variability of seed cotton yield was explained by the causal variables in the study. The P value for the intercept was less than alpha (P < 0.05), implying that the slope of the regression line was significantly different from zero, and the null hypothesis of no effect of the causal variables on seed cotton yield was rejected. As noted by Zaiontz (2015), this indicated that the regression model was a significantly good fit. The P values of the coefficients of all the traits except gin outturn, lint yield, boll weight and seed weight were greater than 0.05, suggesting that these four traits contributed significantly to the observed variation in seed cotton yield.
The direct and indirect effects of the traits on seed cotton yield are presented in Table 5. Lint yield had the highest direct effect on seed cotton yield (r = 1.055), which implied that lint yield could be used as a marker for direct selection. However, since this trait can only be determined after harvest, traits that can be used in the field for indirect selection could be bolls per plant (r = 0.006), seed weight (0.022) and/or plant height (0.012), as they have direct effect values that are greater than their correlation coefficients with seed cotton yield (r = 0.00, r = −0.005 and r = 0.004, respectively). Doggett (1988) reported that the height of the plant influences the number of nodes produced, which equates to the number of leaves produced, thereby increasing the photosynthetic capacity and hence the yield.
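The goodness-of-fit quantities quoted for the regression can be related to the path coefficients with a small sketch. The formulas below are common textbook definitions (coefficient of determination as the sum of direct effect times correlation, adjusted R square, and residual effect as the square root of the unexplained fraction); the source does not state which formulas Excel or the authors applied, and all input values, including the number of observations, are hypothetical.

```python
import math

def r_squared(direct, r_with_yield):
    """R^2 of a standardized regression: sum of direct effect * correlation."""
    return sum(p * r for p, r in zip(direct, r_with_yield))

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjust R^2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def residual_effect(r2):
    """Path-analysis residual: square root of the unexplained fraction."""
    return math.sqrt(1.0 - r2)

# Hypothetical direct effects and correlations with yield (not study data).
direct = [0.85, 0.10, 0.05]
r_with_yield = [0.90, 0.40, 0.30]

r2 = r_squared(direct, r_with_yield)
adj = adjusted_r_squared(r2, n_obs=30, n_predictors=3)  # n_obs is assumed
res = residual_effect(r2)
print(f"R2={r2:.3f} adjusted R2={adj:.3f} residual={res:.3f}")
```

A residual effect near zero under this definition means the causal traits in the model account for almost all of the variation in the dependent variable.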
The indirect correlation coefficient between seed cotton yield and elongation (r = 0.000) and the direct effect (r = 0.002) implied that the observed phenotypic correlation coefficient explained the true association (Akatwijuka et al., 2019). Gin outturn percentage, boll weight, fibre elongation and fibre fineness had positive genetic correlations but negligible to positive direct effects, indicating that the indirect effects caused the observed correlation.
The path coefficient analysis of the indirect and direct effects of the associated traits on seed cotton yield revealed that gin outturn (0.737) had the highest indirect contribution to seed cotton yield through lint yield, followed by strength (0.012) through plant height, seed weight (0.011) through boll weight, and fibre fineness (0.010) through boll weight, indicating the importance of these traits to plant height, boll weight and seed cotton yield. These traits need to be considered simultaneously and carefully when selecting for yield improvement in cotton.
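As a rough arithmetic check of the largest indirect effect, the reported genotypic correlation between GOT and lint yield can be multiplied by the reported direct effect of lint yield. This sketch assumes that the genotypic correlation was the one used in the path analysis, which the text does not state.

```python
# Values reported in the text.
r_got_lint = 0.70     # genotypic correlation between GOT and lint yield
direct_lint = 1.055   # direct effect of lint yield on seed cotton yield

# Indirect effect of GOT through lint yield = correlation * direct effect.
indirect_got_via_lint = r_got_lint * direct_lint
print(f"{indirect_got_via_lint:.4f}")
```

The product is about 0.7385, close to the reported 0.737. A correlation rounded to two decimals is enough to account for a gap of that size.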

Conclusion
The study concluded that, when selecting high-yielding cotton genotypes for the middleveld areas of Zimbabwe, more emphasis should be given to genotypes with high lint yield, high boll weight, average plant height and a high number of bolls per plant, as these traits were closely associated. This was the order of importance of the traits in terms of their association with seed cotton yield improvement. However, seed cotton yield improvement greatly affected seed weight, and similarly high seed weights were associated with a low number of bolls per plant. It was also concluded that a greater number of bolls per plant greatly reduced boll weight, so a balance has to be reached to achieve optimum yields. Fibre elongation, strength and fineness were genetically linked to seed cotton yield, and it was concluded that these could play positive roles as selection criteria to improve seed cotton yield. It was also concluded that gin outturn and fibre strength could be used indirectly to improve seed cotton yield through other traits such as plant height and boll weight.