Symmetric kappa as a function of unweighted kappas

ABSTRACT It is shown that a symmetric kappa corresponding to a c × c table with c ⩾ 3 categories can be written as a function of the unweighted kappa corresponding to the same table and the c(c − 1)/2 distinct unweighted kappas associated with the (c − 1) × (c − 1) tables that are obtained by combining two categories. The result is a new MGB-type result.


Introduction
In many applications in the social, behavioral, and biomedical sciences, rating systems and scales are used to classify subjects and objects into mutually exclusive categories. Examples are a pathologist who rates the severity of lesions from scans, a psychologist who assesses the degree of anxiety of a client, and competing diagnostic devices that classify the extent of a disease in patients. Categorical ratings usually entail a certain degree of subjective judgment. Therefore, an important issue is the assessment of the reliability of the categorical rating system. One way to assess reliability is to let two raters independently classify the same set of subjects with the rating system. The reliability of the system can then be assessed by analyzing the agreement between the raters. If there is high agreement between the ratings, there is consensus in the diagnosis and the ratings can be considered interchangeable.
Researchers usually express agreement between raters in a single number. Commonly used coefficients for doing this are Cohen's unweighted kappa in the case of nominal (unordered) categories (Warrens, 2013a, 2015a; Yang and Zhou, 2014) and weighted kappa in the case of ordered categories (Warrens, 2012a, 2014; Yang and Zhou, 2015). Both kappa coefficients correct for agreement due to chance. Not all disagreements may be equally important. Weighted kappa allows the use of weights to take the importance of the disagreements between the categories into account. For example, with ordered categories one usually expects more disagreement or confusion between adjacent categories than between categories that are further apart.
A criticism frequently formulated against the use of weighted kappa is that the weights are arbitrarily defined (Vanbelle and Albert, 2009; Warrens, 2012a). To find support for certain weighting schemes several authors have studied algebraic and analytic properties of variants of weighted kappa. For example, kappa with quadratic weights can be interpreted as an intraclass correlation coefficient (Schuster, 2004). Furthermore, squared weights can be chosen such that weighted kappa is equivalent to another intraclass correlation or the Pearson correlation (Warrens, 2014). The latter result suggests that for rating systems with ordinal categories we may abandon the weighted kappa altogether and replace it with the Pearson correlation.
Several authors have also found support for linear kappa, i.e., weighted kappa with linear weights. In each case, it was shown that the kappa corresponding to the original contingency table is a function of kappas corresponding to smaller tables that can be obtained by merging adjacent categories of the original table. Vanbelle and Albert (2009) showed that the components of linear kappa can be obtained from all distinct 2 × 2 tables. In fact, linear kappa is a weighted average of the 2 × 2 kappas, where the weights are the denominators of the 2 × 2 kappas (Warrens, 2011a). This result was generalized by Warrens (2012b), who showed that the overall linear kappa is a weighted average of the linear kappas corresponding to all distinct smaller tables of a specific size that can be obtained by merging adjacent categories. Warrens (2013b, 2015b) showed that the properties of linear kappa extend to a more general kappa coefficient called additive kappa, i.e., weighted kappa with additive weights. With additive weights, the categories have an underlying one-dimensional interval scale but are not required to be equally spaced. The properties of linear and additive kappa preserve, in a way, an analogous property of unweighted kappa. With ordered categories it only makes sense to combine adjacent categories. With unordered categories, we may combine all the categories in different ways. Warrens (2011b) showed that, given a partition type of the categories, the overall kappa is a weighted average of the kappas of the collapsed tables corresponding to all partitions of that type. The weights are the denominators of the kappas corresponding to the collapsed tables.
In the above-mentioned research on unweighted, linear, and additive kappa it was shown that an overall kappa coefficient is a function of the same type of kappa coefficients associated with smaller tables. A completely new type of result was presented in Moradzadeh et al. (2017), or MGB for short. These authors showed that certain weighted kappas are functions of unweighted kappas. The unweighted kappas correspond to smaller tables that can be obtained by combining adjacent categories. A requirement with regard to the weighting schemes is that pairs of categories that are the same number of steps apart have identical weights. Examples are the linear and quadratic kappa.
In this article, we present a new MGB-type result. It is shown that a symmetric kappa corresponding to a c × c table with c ≥ 3 categories can be written as a function of the unweighted kappa corresponding to the same table and the c(c − 1)/2 distinct unweighted kappas associated with the (c − 1) × (c − 1) tables that are obtained by combining two categories. The article is organized as follows. Definitions are presented in Section 2 and the main result is presented in Section 3. In Section 4, it is shown, in the context of a rating system with three categories, how the above-mentioned results and the new result are related. Section 5 contains a conclusion.

Weighted and unweighted kappa
In this section, we define Cohen's weighted kappa and unweighted kappa. Suppose two raters classify independently the same group of N subjects into one of c ≥ 2 categories. The categories are defined in advance and the raters use the same c categories. Let $\pi_{ij}$ for i, j = 1, 2, ..., c denote the proportion of subjects classified into category i by the first rater and category j by the second rater. The square contingency table $\{\pi_{ij}\}$ reflects the joint agreement between the two raters. The marginal totals are denoted by $\pi_{i+}$ for the first rater and $\pi_{+i}$ for the second rater. The totals reflect how often the raters have used the categories.
Let the weight $w_{ij}$ be a real number with $0 \le w_{ij} \le 1$ for i, j = 1, 2, ..., c. We set $w_{ij} = 1$ if i = j. Hence, the elements on the main diagonal of the contingency table $\{\pi_{ij}\}$ get the maximum weight of 1. Commonly used weights are the linear weights (Vanbelle and Albert, 2009; Warrens, 2011a, 2012b) and quadratic weights (Schuster, 2004; Warrens, 2012a, 2013a). Weighted kappa can be defined as
$$\kappa_w = \frac{O_w - E_w}{1 - E_w}, \tag{1}$$
where
$$O_w = \sum_{i=1}^{c} \sum_{j=1}^{c} w_{ij} \pi_{ij}$$
is the weighted observed agreement and
$$E_w = \sum_{i=1}^{c} \sum_{j=1}^{c} w_{ij} \pi_{i+} \pi_{+j}$$
is the weighted expected agreement. To avoid the indeterminate case, we assume that $E_w < 1$. The kappa coefficient in (1) has value 1 if the weighted observed agreement equals unity ($O_w = 1$) and value 0 if the weighted observed agreement is equal to the weighted expected agreement ($O_w = E_w$). If we use the identity weights ($w_{ij} = 1$ if i = j and $w_{ij} = 0$ otherwise) in (1), we obtain Cohen's unweighted kappa given by
$$\kappa = \frac{O - E}{1 - E}, \tag{5}$$
where
$$O = \sum_{i=1}^{c} \pi_{ii}$$
is the raw observed agreement and
$$E = \sum_{i=1}^{c} \pi_{i+} \pi_{+i}$$
is the expected agreement. The kappa coefficient in (5) has value 1 if there is perfect observed agreement between the raters (O = 1) and value 0 if the observed agreement is equal to the agreement expected under independence (O = E).
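The definitions of weighted and unweighted kappa in this section can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch: the function names, the table of counts, and the use of the standard linear weights $1 - |i - j|/(c - 1)$ are assumptions made for the example and do not come from the article.

```python
import numpy as np

def weighted_kappa(p, w):
    """Weighted kappa (O_w - E_w) / (1 - E_w) for proportions p and weights w."""
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    row = p.sum(axis=1)                    # marginal totals of the first rater
    col = p.sum(axis=0)                    # marginal totals of the second rater
    o_w = np.sum(w * p)                    # weighted observed agreement
    e_w = np.sum(w * np.outer(row, col))   # weighted expected agreement
    return (o_w - e_w) / (1.0 - e_w)

def unweighted_kappa(p):
    """Cohen's unweighted kappa: weighted kappa with identity weights."""
    return weighted_kappa(p, np.eye(len(p)))

# Hypothetical 3 x 3 table of counts (made up for illustration only).
counts = np.array([[20, 5, 0],
                   [3, 15, 2],
                   [1, 4, 10]], dtype=float)
p = counts / counts.sum()

# Linear weights 1 - |i - j| / (c - 1); ones on the main diagonal.
c = p.shape[0]
idx = np.arange(c)
linear_w = 1.0 - np.abs(idx[:, None] - idx[None, :]) / (c - 1)

print("unweighted kappa:", unweighted_kappa(p))
print("linear kappa:    ", weighted_kappa(p, linear_w))
```

Passing the identity matrix as the weight matrix makes the reduction from weighted to unweighted kappa explicit, which mirrors how the unweighted coefficient is obtained from definition (1).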

A theorem
In this section, it is shown that weighted kappa with symmetric weights can be expressed as a function of unweighted kappas. In the case of symmetric kappa the (i, j)th cell and (j, i)th cell of the contingency table $\{\pi_{ij}\}$ have the same weight, i.e., $w_{ij} = w_{ji}$. The commonly used unweighted kappa, linear kappa, and quadratic kappa are examples of symmetric kappa. Since $w_{ii} = 1$ for all i and $w_{ij} = w_{ji}$ for all i and j, symmetric kappa can be written as
$$\kappa_w = \frac{O - E + \sum_{i<j} w_{ij} \left( \pi_{ij} + \pi_{ji} - \pi_{i+}\pi_{+j} - \pi_{j+}\pi_{+i} \right)}{1 - E - \sum_{i<j} w_{ij} \left( \pi_{i+}\pi_{+j} + \pi_{j+}\pi_{+i} \right)}. \tag{8}$$
In Theorem 3.1, we are interested in the observed agreement, expected agreement and unweighted kappa corresponding to the (c − 1) × (c − 1) table that is obtained if categories i and j are combined. The categories can be combined by merging the ith and jth row and the ith and jth column of the square contingency table $\{\pi_{ij}\}$. Theorem 3.1 shows that for a given c × c table the corresponding symmetric kappa can be written as a function of the unweighted kappa associated with the c × c table and the c(c − 1)/2 distinct unweighted kappas associated with the (c − 1) × (c − 1) tables that are obtained by combining two categories.
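Theorem 3.1. Suppose c ≥ 3. Let $O_{ij}$ and $E_{ij}$ denote, respectively, the observed and expected agreement of the (c − 1) × (c − 1) table that is obtained by combining categories i and j. The associated unweighted kappa is given by
$$\kappa_{ij} = \frac{O_{ij} - E_{ij}}{1 - E_{ij}}.$$
It holds that
$$\kappa_w = \frac{\left( 1 - \sum_{i<j} w_{ij} \right) \kappa (1 - E) + \sum_{i<j} w_{ij} \kappa_{ij} (1 - E_{ij})}{\left( 1 - \sum_{i<j} w_{ij} \right) (1 - E) + \sum_{i<j} w_{ij} (1 - E_{ij})}. \tag{10}$$
Proof. Using the identities
$$\kappa (1 - E) = O - E \quad \text{and} \quad \kappa_{ij} (1 - E_{ij}) = O_{ij} - E_{ij}$$
and combining different terms, kappa (10) can be written as
$$\kappa_w = \frac{\left( 1 - \sum_{i<j} w_{ij} \right) (O - E) + \sum_{i<j} w_{ij} (O_{ij} - E_{ij})}{\left( 1 - \sum_{i<j} w_{ij} \right) (1 - E) + \sum_{i<j} w_{ij} (1 - E_{ij})}. \tag{11}$$
Next, if we combine categories i and j the observed and expected agreement of the collapsed (c − 1) × (c − 1) table are given by
$$O_{ij} = O + \pi_{ij} + \pi_{ji} \quad \text{and} \quad E_{ij} = E + \pi_{i+}\pi_{+j} + \pi_{j+}\pi_{+i}.$$
Using these identities in kappa (11), we obtain kappa (8), which completes the proof.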
Although formula (11) is a simpler expression of symmetric kappa, formula (10) specifies how symmetric kappa is a function of the unweighted kappas. Furthermore, if
$$\sum_{i<j} w_{ij} \le 1, \tag{14}$$
all quantities in the numerator and denominator of (10), except the unweighted kappa values, are non-negative. Thus, if inequality (14) holds symmetric kappa is not just a function but a weighted average of the unweighted kappas as well.
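A numerical check of Theorem 3.1 may help make the construction concrete. The following minimal sketch, assuming NumPy and a randomly generated 4 × 4 table with random symmetric weights (both made up for illustration), computes symmetric kappa once directly from the weights and once from the unweighted kappas of the original table and of the six collapsed 3 × 3 tables. All function names are hypothetical.

```python
import itertools
import numpy as np

def agreement(p):
    """Observed agreement O and expected agreement E of a square table."""
    return np.trace(p), p.sum(axis=1) @ p.sum(axis=0)

def merge(p, i, j):
    """Combine categories i and j: merge rows i, j and columns i, j."""
    q = p.copy()
    q[i, :] += q[j, :]
    q[:, i] += q[:, j]
    return np.delete(np.delete(q, j, axis=0), j, axis=1)

def symmetric_kappa_direct(p, w):
    """Weighted kappa computed directly from the weight matrix."""
    e_w = np.sum(w * np.outer(p.sum(axis=1), p.sum(axis=0)))
    return (np.sum(w * p) - e_w) / (1 - e_w)

def symmetric_kappa_from_unweighted(p, w):
    """Symmetric kappa assembled from unweighted kappas, as in Theorem 3.1."""
    o, e = agreement(p)
    kappa = (o - e) / (1 - e)
    pairs = list(itertools.combinations(range(p.shape[0]), 2))
    rest = 1 - sum(w[i, j] for i, j in pairs)   # coefficient of the overall kappa
    num = rest * kappa * (1 - e)
    den = rest * (1 - e)
    for i, j in pairs:
        o_ij, e_ij = agreement(merge(p, i, j))
        kappa_ij = (o_ij - e_ij) / (1 - e_ij)
        num += w[i, j] * kappa_ij * (1 - e_ij)
        den += w[i, j] * (1 - e_ij)
    return num / den

# Random table of proportions and random symmetric weights (illustrative only).
rng = np.random.default_rng(0)
p = rng.random((4, 4))
p /= p.sum()
upper = np.triu(rng.random((4, 4)), k=1)
w = upper + upper.T + np.eye(4)

direct = symmetric_kappa_direct(p, w)
via_theorem = symmetric_kappa_from_unweighted(p, w)
print(direct, via_theorem, np.isclose(direct, via_theorem))
```

The random weights in this sketch will typically sum to more than 1 over the pairs i < j. The two computations should still agree, because the identity in the theorem is purely algebraic; only the weighted-average interpretation depends on the sum of the weights not exceeding 1.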

The 3 × 3 case
In this section, we show in detail how Theorem 3.1 from Section 3 is related to other results from the literature. For notational convenience, we assume that the rating system has c = 3 categories, i.e., the contingency table $\{\pi_{ij}\}$ has size 3 × 3. Let $\kappa$ denote the corresponding unweighted kappa and let $\kappa_{12}$, $\kappa_{13}$ and $\kappa_{23}$ denote the three unweighted kappas corresponding to the 2 × 2 tables that are obtained by combining categories 1 and 2, 1 and 3, and 2 and 3, respectively. Furthermore, define $w = w_{12} + w_{13} + w_{23}$.
Theorem 3.1 shows that symmetric kappa can be written as
$$\kappa_w = \frac{(1 - w) \kappa (1 - E) + w_{12} \kappa_{12} (1 - E_{12}) + w_{13} \kappa_{13} (1 - E_{13}) + w_{23} \kappa_{23} (1 - E_{23})}{(1 - w)(1 - E) + w_{12} (1 - E_{12}) + w_{13} (1 - E_{13}) + w_{23} (1 - E_{23})}. \tag{16}$$
Since we have $E, E_{12}, E_{13}, E_{23} < 1$, all weights in (16) are non-negative if $w \le 1$. In other words, formula (16) shows that symmetric kappa is not only a function, but also a weighted average of the four unweighted kappas if and only if $w < 1$. Moradzadeh et al. (2017) considered the case where $w_{12} = v = w_{23}$ and $w_{13} = 0$. In this case, symmetric kappa is given by
$$\kappa_w = \frac{(1 - 2v) \kappa (1 - E) + v \kappa_{12} (1 - E_{12}) + v \kappa_{23} (1 - E_{23})}{(1 - 2v)(1 - E) + v (1 - E_{12}) + v (1 - E_{23})}. \tag{17}$$
All weights in (17) are non-negative if $v \le 1/2$. In other words, formula (17) shows that symmetric kappa is not only a function, but also a weighted average of the three unweighted kappas if and only if $v < 1/2$. Formulas (16) and (17) show that the result in Moradzadeh et al. (2017) is a special case of Theorem 3.1 in Section 3 if the rating system has precisely three categories. If the system has four or more categories the result in Moradzadeh et al. (2017) and Theorem 3.1 are different.
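Next, if the weights $w_{12}$, $w_{13}$ and $w_{23}$ in (16) are chosen such that $w = 1$ symmetric kappa becomes
$$\kappa_w = \frac{w_{12} \kappa_{12} (1 - E_{12}) + w_{13} \kappa_{13} (1 - E_{13}) + w_{23} \kappa_{23} (1 - E_{23})}{w_{12} (1 - E_{12}) + w_{13} (1 - E_{13}) + w_{23} (1 - E_{23})}. \tag{18}$$
Formula (18) shows that symmetric kappa is a weighted average of the three 2 × 2 kappas only if $w = 1$. If we also require that $w_{13} = 0$ symmetric kappa becomes additive kappa (Warrens, 2013b, 2015b). The latter kappa is given by
$$\kappa_w = \frac{w_{12} \kappa_{12} (1 - E_{12}) + w_{23} \kappa_{23} (1 - E_{23})}{w_{12} (1 - E_{12}) + w_{23} (1 - E_{23})}. \tag{19}$$
Formula (19) shows that additive kappa is the unique kappa that is a weighted average of kappas $\kappa_{12}$ and $\kappa_{23}$. Finally, we may require that the distance between categories 1 and 2 and between categories 2 and 3 is equal, i.e., $w_{12} = w_{23}$. If we use this condition in (19) additive kappa reduces to linear kappa (Warrens, 2011a, 2012b). The latter kappa is given by
$$\kappa_w = \frac{\kappa_{12} (1 - E_{12}) + \kappa_{23} (1 - E_{23})}{(1 - E_{12}) + (1 - E_{23})}.$$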
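The chain of special cases in the 3 × 3 setting can be traced numerically. The sketch below, assuming NumPy and a made-up table of counts, returns the four unweighted kappas together with their normalized weights in the general decomposition and then plugs in the linear weights $w_{12} = w_{23} = 1/2$, $w_{13} = 0$. The helper names `kappa`, `merge` and `decompose_3x3` are hypothetical.

```python
import numpy as np

def kappa(p):
    """Unweighted kappa of a table of proportions and its denominator 1 - E."""
    e = p.sum(axis=1) @ p.sum(axis=0)
    return (np.trace(p) - e) / (1 - e), 1 - e

def merge(p, i, j):
    """Combine categories i and j (0-based) of a square table."""
    q = p.copy()
    q[i, :] += q[j, :]
    q[:, i] += q[:, j]
    return np.delete(np.delete(q, j, axis=0), j, axis=1)

def decompose_3x3(p, w12, w13, w23):
    """Kappas (kappa, kappa_12, kappa_13, kappa_23) and their normalized weights."""
    pieces = [kappa(p), kappa(merge(p, 0, 1)),
              kappa(merge(p, 0, 2)), kappa(merge(p, 1, 2))]
    kappas = np.array([k for k, _ in pieces])
    denoms = np.array([d for _, d in pieces])
    coef = np.array([1 - (w12 + w13 + w23), w12, w13, w23])
    weights = coef * denoms
    return kappas, weights / weights.sum()

# Hypothetical 3 x 3 table of counts (made up for illustration only).
counts = np.array([[20, 5, 0],
                   [3, 15, 2],
                   [1, 4, 10]], dtype=float)
p = counts / counts.sum()

# Linear weights: w12 = w23 = 1/2 and w13 = 0, so that w = 1.
kappas, weights = decompose_3x3(p, 0.5, 0.0, 0.5)
linear_kappa = kappas @ weights
print("kappas: ", kappas)
print("weights:", weights)
print("linear kappa:", linear_kappa)
```

With these weights the coefficients of the overall kappa and of the kappa for combining categories 1 and 3 vanish, so only the two kappas obtained by merging adjacent categories contribute. Trying other values of the three weights, for example with their sum below 1, shows how the weight on the overall unweighted kappa reappears.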

Conclusion
In this article, we presented a new MGB-type result. The result holds for any weighted kappa that has a symmetric weighting scheme. A weighting scheme is symmetric if the (i, j)th cell and ( j, i)th cell have the same weight. The result specifies that symmetric kappa corresponding to a c × c table with c ≥ 3 categories can be written as a function of the unweighted kappa corresponding to the same table and the c(c − 1)/2 distinct unweighted kappas associated with the (c − 1) × (c − 1) tables that are obtained by combining two categories. Moradzadeh et al. (2017) also showed that weighted kappa can be written as a function of various unweighted kappas. Their result requires that pairs of categories that are the same number of steps apart have identical weights. This condition is less general than the requirement of symmetry in this article.