A blinded validation of the Swedish version of the Clinical Assessment Interview for Negative Symptoms (CAINS)

Abstract
Purpose: The Clinical Assessment Interview for Negative Symptoms (CAINS) was developed in order to advance the assessment of negative symptoms. The aim of this study was to validate the Swedish version of the CAINS.
Materials and methods: Thirty-four out-patients with a schizophrenia spectrum disorder were recruited. All patients were videotaped while interviewed with the CAINS and the Brief Psychiatric Rating Scale (BPRS). Another rater watched the video recordings in the reverse order, enabling a blinded design. The patients also filled in self-reported measures of depression, quality of life, and social and vocational functioning. We calculated inter-rater agreement and internal consistency for the CAINS. We also calculated validity measures by correlating the subscales Motivation and Pleasure (CAINS-MAP) and Expression (CAINS-EXP) to subscales of the BPRS.
Results: The blinded inter-rater agreement for the CAINS total score was high (ICC = 0.92) but slightly lower for the expression subscale (ICC = 0.76). Cronbach’s alpha was 0.84 for the total score. Convergent validity with the negative symptoms subscale of BPRS was different for the blinded and the unblinded data, with a CAINS-MAP correlation of 0.10 (p = 0.580) and a CAINS-EXP correlation of 0.48 (p = 0.004) in the blinded data. The unblinded data had a CAINS-MAP correlation of 0.38 (p = 0.026) and a CAINS-EXP correlation of 0.87 (p < 0.001). Self-rated measures of anhedonia correlated to CAINS-MAP with a coefficient of 0.68 (p < 0.001), while the CAINS-EXP only had a correlation of 0.16 (p = 0.366) to these measures.
Conclusion: The Swedish version of the CAINS displays adequate psychometric properties in line with earlier validation studies.


Introduction
Negative symptoms comprise motivational deficits such as anhedonia, avolition and apathy, as well as expressional deficits such as alogia and blunted affect [1]. These symptoms have been shown to be related to the social and occupational difficulties that plague many individuals with schizophrenia [2]. While antipsychotic medication can ameliorate positive symptoms or reduce the risk of rehospitalization [3,4], these drugs have proven less effective in ameliorating negative symptoms [5].
The earliest assessment scales for negative symptoms were subscales of already existing global rating scales such as the Brief Psychiatric Rating Scale (BPRS) [6,7] and the Positive and Negative Syndrome Scale (PANSS) [8]. One of the most commonly used specific scales for negative symptoms is the Scale for the Assessment of Negative Symptoms (SANS) [9], although there are many others available [10].
The fact that specific treatments for negative symptoms were still lacking led to a consensus conference in 2006 [11]. During the conference, the SANS was deemed more suitable than the PANSS for assessing negative symptoms, but it was also criticized for the inclusion of certain items that were not considered necessary for the negative symptom concept (e.g. inappropriate affect and attentional deficits). There were also calls for assessing consummatory aspects of anhedonia (as opposed to anticipatory ones) and for including the desire for social interactions. The consensus statement stressed the need for a new instrument for assessing negative symptoms, and a group of researchers was assembled to construct such an instrument [12]. This led to the development of the Clinical Assessment Interview for Negative Symptoms (CAINS) [13-15].
The CAINS is a semi-structured interview with 13 items and takes approximately 30 min to administer. It is divided into two parts, reflecting the two subdomains of negative symptoms that had been proposed earlier [1]. The first nine items assess the subject's motivation and pleasure for activities such as spending time with family/friends/partners, taking part in social activities/work/studying and doing recreational activities. This part also includes items assessing both recent and future frequencies of such activities, thereby enabling assessments of both consummatory and anticipatory pleasure. This subscale is called the CAINS Motivation and Pleasure subscale (CAINS-MAP). The second subscale is called the CAINS Expression subscale (CAINS-EXP), and it consists of four items in which the interviewer observes the subject's facial and vocal expression, expressive gestures and quantity of speech. All items are rated 0-4 where 0 is described as 'No impairment', 1 as 'Mild deficit', 2 as 'Moderate deficit', 3 as 'Moderately severe deficit' and 4 as 'Severe deficit'. The total score of the CAINS thus ranges from 0 to 52.
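The scoring scheme described above can be illustrated with a minimal sketch. It assumes only what the text states: 13 items rated 0-4, the first nine forming the CAINS-MAP and the last four the CAINS-EXP. The function name `score_cains` and the returned dictionary keys are hypothetical and not part of the instrument.

```python
from typing import Dict, Sequence

N_MAP_ITEMS = 9  # items 1-9: Motivation and Pleasure (CAINS-MAP)
N_EXP_ITEMS = 4  # items 10-13: Expression (CAINS-EXP)


def score_cains(items: Sequence[int]) -> Dict[str, int]:
    """Sum 13 item ratings (each 0-4) into subscale and total scores.

    Hypothetical helper for illustration; item order follows the text
    (nine MAP items first, then four EXP items).
    """
    if len(items) != N_MAP_ITEMS + N_EXP_ITEMS:
        raise ValueError("expected 13 item ratings")
    if any(not 0 <= rating <= 4 for rating in items):
        raise ValueError("each item must be rated 0-4")
    map_score = sum(items[:N_MAP_ITEMS])
    exp_score = sum(items[N_MAP_ITEMS:])
    return {"MAP": map_score, "EXP": exp_score, "total": map_score + exp_score}
```

With every item rated 4 ('Severe deficit') the subscale scores are 36 and 16, which gives the maximum total of 52 mentioned above.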
The CAINS was developed in several steps, and the final validation study concluded that the CAINS is reliable and valid, shows high intra- and inter-rater agreement, good internal consistency, and strong convergent and discriminant validity, and that it is linked to functional outcome [15]. These results were replicated when the CAINS was compared to another newly developed rating scale for negative symptoms, the Brief Negative Symptom Scale [16].
Several translations of the CAINS have been made. The original developers have provided educational material as well as video recordings to facilitate the training process. To date, there are validation studies for versions of the CAINS in German [17], Mandarin and Cantonese [18], Spanish [19], Korean [20,21] and Serbian [22]. There has also been a validation of the English version in Singapore [23]. All of these studies have agreed that the CAINS exhibits good psychometric properties. However, few have adopted a blinded study design as recommended by the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) initiative [24-26]. The aim of this study was to assess the reliability and validity of the Swedish version of the CAINS in a blinded study design.

Subjects
Participants were patients at the Uppsala University Hospital with a diagnosis of schizophrenia or schizoaffective disorder according to case notes and confirmed by a clinical interview using the Mini-International Neuropsychiatric Interview (M.I.N.I.) [27]. The sample consisted of 34 participants from two different projects: the first was a randomized controlled trial (n = 16) of repetitive transcranial magnetic stimulation (clinicaltrials.gov ID: NCT02905604), and the second was the validation study of the CAINS (n = 18). Inclusion criteria for both studies were being 18 to 59 years old and having a diagnosis of schizophrenia spectrum disorder according to the International Statistical Classification of Diseases and Related Health Problems (ICD-10). For the participants from the trial, there was also an inclusion criterion of scoring less than 40 points on the Motivation And Pleasure Scale-Self-Report (MAP-SR), reflecting a certain degree of negative symptom burden [28]. Exclusion criteria were a severe medical condition, epilepsy, metal or cochlear implant in the head, changes in medication in the last month, addiction (illicit drugs or alcohol), pregnancy and insufficient command of Swedish. All participants provided written informed consent and the study procedures were approved by the regional ethical board. The study was conducted in accordance with the Declaration of Helsinki.

Procedures
The CAINS was translated into Swedish by two of the authors (RB and JB) with permission from the original developers. It was re-translated into English by an independent authorized translator and the re-translated version was approved by the original developers (see Supplementary material).
All raters (two psychiatrists, one resident in psychiatry and one clinical psychologist) were trained in using the CAINS under supervision by the most senior psychiatrist (RB). Training consisted of reading the manual and conducting at least one interview together with RB. In addition, all raters watched and assessed nine videotaped interviews of both CAINS and BPRS, and scoring was discussed among the raters to reach optimal agreement. Data collection was conducted at the Uppsala University Hospital between 2016 and 2020. Participants were scheduled for a visit to the research facility and were first interviewed with the CAINS and BPRS (in that order) by one of the raters. The interviews were videotaped. Another rater watched the videos in the reverse order (i.e. first BPRS and then CAINS). The video raters were not given any specific instructions on how to view the recordings and were thus able to, for example, pause and rewind them. However, the raters reported that they mostly watched and rated without pausing. When analyzing the data, we used the CAINS scores from the live interview, but the BPRS scores from the video rater. This procedure ensured that the scores from the different scales were blinded from each other, as recommended by the QUADAS initiative [24-26]. A research nurse also administered cognitive tests and the participant filled in self-report assessments. The cognitive tests used in this study were the Animal Naming Test, which assesses semantic verbal fluency by asking the participant to name as many animals as possible in one minute [29], and the Digit Symbol Substitution Test (DSST), which assesses processing speed [30].
The convergent validity was examined using the negative symptoms subscale of the BPRS as the reference test [31]. This subscale consists of the sum of the items Blunted affect, Emotional withdrawal, and Motor retardation. The items are scored from 1 (no symptoms) to 7 (very severe). We also used the sum of two items from the self-rated Montgomery-Åsberg Depression Rating Scale (MADRS-S) [32], namely the items Lassitude and Inability to feel. These items are scored 0 to 6, where 6 is the most severe. The Clinical Global Impression (CGI) scale [33], which is a clinician-rated 7-point Likert scale where 7 is the most severe, was also used and was rated by the same rater who conducted the CAINS live rating. Correlation to general well-being was examined using the EQ-VAS questionnaire [34]. We calculated correlations for both the total CAINS score and the scores of the subscales CAINS-MAP and CAINS-EXP. We also tested whether the correlations between the CAINS ratings and blinded versus unblinded ratings of the BPRS negative symptoms subscale were statistically different.
For discriminant validity we used the affective symptoms subscale of the BPRS (sum of the items Anxiety, Depression, Suicidality, and Guilt) and the total MADRS-S score. We also used the positive symptoms subscale of the BPRS (sum of the items Suspiciousness, Hallucinations, and Unusual thought content) for discriminating positive symptomatology [31]. Discrimination from cognitive function was assessed with the Animal Naming Test and with the DSST. Sensitivity to extrapyramidal side effects was examined with the Extrapyramidal Symptoms Rating Scale (ESRS) [35]. In order to assess the relationship between the CAINS and functioning we used the Sheehan Disability Scale [36], in which the participant judges to what extent the symptoms have disturbed their work/studies, social life and family interactions, on a scale from 0 to 10, where 10 is equal to 'Very much'.
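The BPRS subscale definitions used as reference measures in this section can be written out as a short sketch. The item groupings and the 1-7 rating range are taken from the text; the dictionary layout and the helper name `bprs_subscale` are assumptions made for illustration only.

```python
from typing import Dict, List

# Item groupings as described in the text; the key names are hypothetical labels.
BPRS_SUBSCALES: Dict[str, List[str]] = {
    "negative": ["Blunted affect", "Emotional withdrawal", "Motor retardation"],
    "affective": ["Anxiety", "Depression", "Suicidality", "Guilt"],
    "positive": ["Suspiciousness", "Hallucinations", "Unusual thought content"],
}


def bprs_subscale(ratings: Dict[str, int], name: str) -> int:
    """Sum the item ratings (each 1-7) that belong to one BPRS subscale."""
    values = [ratings[item] for item in BPRS_SUBSCALES[name]]
    if any(not 1 <= value <= 7 for value in values):
        raise ValueError("BPRS items are rated 1 (no symptoms) to 7 (very severe)")
    return sum(values)
```

Because each item starts at 1, the negative symptoms subscale ranges from 3 to 21, which is worth keeping in mind when comparing it with the 0-based CAINS scores.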

Statistics
All data were assessed for normality by visual inspection of histograms and by Kolmogorov-Smirnov and Shapiro-Wilk tests for normality. Where data were not normally distributed we compared the mean and the median values. Inter-rater agreement was assessed by using the intraclass correlation coefficient (ICC) [37]. Internal consistency was estimated using Cronbach's alpha [38] with a 95% confidence interval (CI). Linearity between correlation variables was assessed by visual inspection of scatter plots. Validity analyses were conducted using the Pearson correlation coefficient (and Spearman's rank correlation coefficient, which did not yield substantially different results, data not shown). SPSS version 26 was used for the statistical analyses. Differences between blinded versus unblinded correlations were tested using the r.test function in the psych package [39] in R version 3.6.3.
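As an illustration of the internal consistency estimate, the following is a minimal sketch of Cronbach's alpha using the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). It is not the SPSS procedure used in the study, it does not compute the confidence interval, and the function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a subjects-by-items table of ratings.

    rows[i][j] is the rating of subject i on item j. Sample variances
    (n - 1 in the denominator) are used throughout.
    """
    n_subjects = len(rows)
    k = len(rows[0])
    if n_subjects < 2 or k < 2:
        raise ValueError("need at least two subjects and two items")
    item_vars = [variance([float(row[j]) for row in rows]) for j in range(k)]
    total_var = variance([float(sum(row)) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

For the CAINS, each row would hold one participant's 13 item ratings; alpha approaches 1 as the items covary more strongly.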

Demographic and clinical characteristics
No data differed substantially between the mean and the median values and the means are therefore presented. A majority of the participants were males and had some comorbidity with prior depression and/or anxiety. Almost a third had a primary diagnosis of schizoaffective disorder. All participants were prescribed antipsychotics (most commonly atypical ones). Over a third were prescribed clozapine. See Table 1 for all descriptive data.

Inter-rater agreement
The inter-rater agreement obtained during the training (n = 9) was high for the CAINS (CAINS total ICC = 0.97, CAINS-MAP ICC = 0.99 and CAINS-EXP ICC = 0.83) as well as for the BPRS and its subscales (all ICCs above 0.89). The inter-rater agreement coefficients between the live interview and the video rating are presented in Table 2. Coefficients were high for the CAINS total score (ICC = 0.92) and for the motivation and pleasure subscale (ICC = 0.95), but lower for the expression subscale (ICC = 0.76). The ICC for the total and subscale scores of the BPRS was above 0.91 for all but the negative symptoms subscale, where the ICC was 0.62.
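For readers unfamiliar with the ICC, the sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a subjects-by-raters table. The text does not state which ICC model was used, so the choice of ICC(2,1) is an assumption, and the function name `icc_2_1` is hypothetical.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) from a table where scores[i][j] is rater j's score for subject i."""
    n = len(scores)     # number of subjects
    k = len(scores[0])  # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subjects mean square
    msc = ss_cols / (k - 1)                 # between-raters mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

In the present design each row would hold the live rating and the video rating of one participant (k = 2). Because this form penalizes systematic differences between raters, it can come out lower than a consistency-type ICC on the same data.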

Convergent validity
See Table 4 for all convergent validity measures and the Supplementary material for a scatter matrix. The results differed between the blinded and the unblinded data. For the blinded data, the correlation coefficient (r) between the CAINS total score and the BPRS negative symptoms subscale was 0.25 (p = 0.162), and between the CAINS-MAP and the BPRS negative symptoms subscale 0.10 (p = 0.580). The correlation coefficient between the CAINS-EXP and the BPRS negative symptoms subscale was 0.48 (p = 0.004). For the unblinded data (i.e. only data from the live interview, where the rater was aware of the results from the CAINS when conducting the BPRS interview) there were significant correlations throughout. However, only the correlations of the CAINS-EXP with the blinded versus the unblinded BPRS ratings were statistically different (z = 3.19, p < 0.001), whereas the corresponding correlations were not statistically different for the CAINS total score (z = 1.79, p = 0.07) or for the CAINS-MAP (z = 1.18, p = 0.24). The self-rated MADRS-S anhedonia subscale correlated moderately with the CAINS total score and the CAINS-MAP, but only weakly with the CAINS-EXP. There were also moderate correlations between all CAINS subscales and CGI. Self-rated general well-being on the EQ-VAS showed moderate negative correlations with the CAINS total score and the CAINS-MAP (i.e. the higher the score on the CAINS, the lower the general well-being), but only a weak correlation with the CAINS-EXP.
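The comparison of blinded and unblinded correlations can be sketched with the Fisher r-to-z test for two correlations. The sketch assumes that the two correlations are treated as coming from independent samples of n = 34 each; the exact arguments passed to r.test are not stated in the text, so this is an illustration rather than a reproduction of the analysis. The function name `compare_correlations` is hypothetical.

```python
import math
from typing import Tuple


def compare_correlations(r1: float, r2: float, n1: int, n2: int) -> Tuple[float, float]:
    """Fisher r-to-z test for the difference between two correlations.

    Assumes the correlations are independent. Returns the z statistic
    and the two-sided p-value from the standard normal distribution.
    """
    z1 = math.atanh(r1)  # Fisher transformation of each correlation
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# CAINS-EXP versus BPRS negative symptoms: unblinded r = 0.87, blinded r = 0.48.
z_exp, p_exp = compare_correlations(0.87, 0.48, 34, 34)
# CAINS-MAP versus BPRS negative symptoms: unblinded r = 0.38, blinded r = 0.10.
z_map, p_map = compare_correlations(0.38, 0.10, 34, 34)
```

Applied to the rounded coefficients reported above, this gives z of about 3.19 for the CAINS-EXP and about 1.18 for the CAINS-MAP, in agreement with the reported z values.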

Discriminant validity
See Table 5 for all discriminant validity measures. There were weak correlations between the CAINS and the BPRS positive symptoms subscale. The correlation coefficient between the BPRS affective symptoms subscale and the CAINS total score was 0.30 (p = 0.084), and between that subscale and the CAINS-EXP 0.36 (p = 0.035). These results were similar for the blinded and the unblinded data (data not shown). The total MADRS-S score correlated moderately with the CAINS total score and the CAINS-MAP, but only weakly with the CAINS-EXP. There were no correlations between the CAINS and cognitive function (Animal Naming Test or DSST), nor with extrapyramidal side effects or functional outcome.

Discussion
This is one of the few blinded validation studies of the CAINS. Only the initial study [13] and a subset of the patients in the final validation study [15] report blinding procedures. Some other studies administered the reference and the index test in random order, but were still unblinded [17,19]. Bearing this in mind, it is interesting to note that our results differ somewhat between the blinded and the unblinded data. For the blinded data, the CAINS total score and the CAINS-MAP did not correlate significantly with the BPRS negative symptoms subscale. However, the CAINS-EXP did, which is notable since the BPRS negative symptoms subscale consists of three items that are all observational in character (as is the CAINS-EXP). It should also be noted that the correlation between the CAINS-EXP rating and the blinded BPRS rating was significantly weaker than the correlation with the unblinded BPRS rating. The CAINS-MAP is based on patient reports, and it is therefore also interesting to note that the CAINS-MAP correlated well with the self-reported MADRS-S anhedonia subscale, whereas the CAINS-EXP did not. We can thus distinguish a discrepancy between reported and observational measures, while our results within each of these categories are plausible.
Still, our blinded data do not show good convergent validity with the BPRS negative symptoms subscale. One reason for this might be the somewhat lower inter-rater agreement observed for this subscale (ICC = 0.62). This value was obtained even though we put effort into obtaining good inter-rater agreement during the training process (ICC = 0.89 for the BPRS negative symptoms subscale). However, during the training process there was an opportunity for the raters to discuss and reach agreement after having done the independent ratings, an opportunity that naturally did not exist for the video ratings in the actual study. This might have contributed to the lower inter-rater agreement. Almost all the other validation studies have reported slightly lower inter-rater agreement for the CAINS-EXP compared to the CAINS-MAP, and it is perhaps not surprising that assessing the expression items on a video recording is even more challenging than during live interviews.
For comparison, one can note that the final validation study of the CAINS [15] resulted in almost the same correlation coefficient between the CAINS-EXP and the BPRS negative symptoms subscale as in our study (0.52 versus our 0.48). The correlation coefficient for the CAINS-MAP in that study was 0.28, and in our study 0.10. A Korean study that also used the BPRS negative symptoms subscale (albeit without describing the exact items included) reported correlation coefficients of 0.46 for the CAINS-MAP, 0.69 for the CAINS-EXP and 0.60 for the CAINS total score. That study, however, included only patients with schizophrenia (i.e. not schizoaffective disorder), which might have yielded a slightly different sample. It was also not reported to be blinded. The last validation study using the BPRS negative symptoms subscale (also not stating the actual items) reported a correlation coefficient of 0.798 for the CAINS total score [22]. The majority of the patients in that study were, however, diagnosed with unspecified nonorganic psychosis, which might hamper comparisons, and no blinding procedures were reported.
The fact that our unblinded data show significant convergent correlations throughout is worth a special mention. However, only for the CAINS-EXP did the correlations with the blinded versus the unblinded BPRS ratings differ statistically; the corresponding differences for the CAINS total score and the CAINS-MAP were not significant, which may in part be due to the small sample size. The results for the unblinded data are more in line with the other validation studies, but it cannot be ruled out that knowledge of the scores on the index test has affected the scoring of the reference test (or vice versa) in other studies and in our unblinded data.
Higher scores on the CAINS-MAP correlated with lower general well-being measured with the EQ-VAS. In fact, here we see one of the strongest correlation coefficients in our study (-0.73). There are no other studies available for comparison with this measure, but we believe that it possesses certain strengths in being easier for the patients to understand, thus converging with the aim to engage patients in the research [41]. There is a risk that the impact of various symptoms is given different weight by clinicians and patients, and these measures can nuance that picture. At the same time, negative symptoms have long been regarded as burdensome for the patients, even though the patients themselves do not always report them as the origin of their hardship. An inherent problem with motivational deficits is naturally that the motivation to change might also be low, and affective flattening will of course also imply that even negative feelings are dulled. Our results indicate that what we deem to be motivational and pleasure deficits indeed correlate with lower general well-being. This gives hope that by treating negative symptoms, we might actually also make the patients feel better.
For the discriminant validity, there was a significant correlation between the CAINS-EXP and the BPRS affective symptoms subscale. Although the two are distinguishable, it can be expected that patients with affective symptoms also exhibit expressional deficits such as those commonly seen in depression [42]. It has been argued that negative symptoms in schizophrenia and depressive symptoms are indeed distinct domains, but that there is an overlap [1,43]. In line with this, we found no significant correlation between the CAINS-MAP and the BPRS affective symptoms subscale. The CAINS-MAP can thus be said not to assess only affective symptoms as captured by the BPRS.
The CAINS total score correlated to BPRS affective symptoms subscale, but this is probably driven by the correlation seen for CAINS-EXP, and it has been argued by the initial developers that the CAINS can preferably be analyzed with the two subscales separated [15].
There was, however, a correlation between the CAINS-MAP and the total MADRS-S score, indicating an overlap between negative symptom ratings and depressive symptoms. This is the first time that the CAINS has been compared to the MADRS-S, and comparisons with other studies are therefore not possible. Most of the other validation studies have used the Calgary Depression Scale for Schizophrenia (CDSS) [44] for discriminant validity. The results have been mixed; some have found significant correlations [19,20], while others have not [15,21]. Our study would have benefited from also including the CDSS, but the MADRS-S is somewhat of a gold standard rating scale for depression in Sweden. It is also worth noting that a third of the participants in our study had a primary diagnosis of schizoaffective disorder.
We only used two tests of cognitive function in this study, but our results echo earlier findings that the CAINS does not merely assess deficits in cognitive function [15]. However, one study did find a correlation between the CAINS and verbal fluency, and the authors argue that this might reflect a common fronto-striatal dysfunction [20].
In our study, we could not replicate the finding that the CAINS is correlated to functional outcome for the patients [15,17,18]. It is noteworthy, though, that we used a different assessment scale for functioning.
Regarding internal consistency, our results (Cronbach's alpha = 0.84 for the total CAINS) are in line with other validation studies. The mean Cronbach's alpha for the last eight validation studies (excluding the first two studies since they used an earlier version of the CAINS) was 0.88 (SD = 0.07). Our results are comparable to the final validation study [15] with very similar Cronbach's alphas and nearly identical ICCs.
The main strength of this study is the blinded procedure. The major limitation is the modest sample size, leaving open the possibility that our results would have been different had we recruited more patients. The generalizability of our sample can also be questioned. Overall, our sample resembles those from the majority of the other studies in variables such as age, gender and medication status. We judge that the main difference is the high percentage of patients with schizoaffective disorder (32%). Some studies do not include patients with schizoaffective disorder at all, while those that do report percentages of 14% [15], 21% [17] and 16% [22]. Further, comorbidity was seldom reported in other studies, although some excluded patients with a mood disorder episode within a certain time prior to the study (usually one month). In our study, as many as 41% had comorbidity with a depressive disorder, although not ongoing. The general psychiatric patient also tends to present with a high degree of comorbidity, so that isolating a single diagnosis yields a rather unrepresentative sample. It is also worth noting that the comorbidity assessments originate from the M.I.N.I. interview undertaken during the inclusion process, in which the diagnostic process is largely limited to checklists. An overlap is therefore expected. To summarize, these issues might have influenced our results, mainly regarding convergent and discriminant validity for affective symptoms.
Another major difference between our study and earlier ones is our somewhat different inclusion criteria. Since we co-recruited patients for a repetitive transcranial magnetic stimulation study, we had to apply standard inclusion and exclusion criteria for this intervention [45]. However, these exclusion criteria consist of rather rare conditions, such as metal implants in the head or epilepsy, which the majority of patients with schizophrenia spectrum disorders do not have. It is therefore unlikely that these criteria have affected our sample in a substantial way. For the same trial, we also used a cut-off of 40 points on the MAP-SR as an inclusion criterion (i.e. the patients had to exhibit a certain level of negative symptoms). This might have weighted our sample towards a dominance of negative symptoms over positive symptoms. However, the CAINS mean scores and the BPRS negative symptoms subscale scores resemble those in other studies.
Our study would have been strengthened if we had also used other negative symptom ratings, such as the PANSS and the SANS, but the BPRS has been used in several of the other validation studies, especially the first ones. Of course, there is always an inherent contradiction in comparing a new concept of negative symptoms with an old one that has been argued not to be accurate. A perfect convergent validity would then not necessarily add anything new.

Conclusion
In this small blinded validation study of the Swedish version of the CAINS, we conclude that this assessment interview for negative symptoms seems to have adequate psychometric properties. It is encouraged for further use and validation in a Swedish context.