Validation of the 2011 and 2016 American College of Rheumatology diagnostic criteria for fibromyalgia in a Chinese population

Abstract Purpose To provide a foundation for clinical diagnosis, epidemiological investigation and intervention trials, we examined the reliability and validity of the American College of Rheumatology (ACR) 2011 and 2016 survey diagnostic criteria among Chinese patients based on the fibromyalgia severity (FS) scale. Methods In this study, 200 fibromyalgia patients diagnosed according to the 1990 criteria (1990c) were matched with rheumatoid arthritis (RA) patients based on age and gender. The FS scale score and its subscales were examined to determine their correlations with the revised fibromyalgia impact questionnaire (FIQR). Receiver operator characteristic (ROC) analysis was performed, and test-retest reliability, internal consistency, and construct validity were examined. Results The area under the curve (AUC) for the ACR 2011c and 2016c was 0.870 and 0.845, respectively, and the sensitivity and specificity were 78.0% and 96.0% for the 2011c and 70.5% and 98.5% for the 2016c, respectively. The FS scale and its subscales were confirmed to exhibit good internal consistency, and they were significantly correlated with the FIQR, thereby indicating adequate construct validity. Using a lower cutoff value of 11 points for the FS scale score, combined with the generalized pain requirement, might be a more effective approach in the Chinese population; this approach yielded an AUC of 0.923, a sensitivity of 87.0% and a specificity of 97.5%. Conclusion The 2011c and 2016c are reliable instruments for diagnosing fibromyalgia patients in China. The FS scale could be a valid tool to assist in fibromyalgia diagnosis, and a cutoff value of 11 points is more suitable in Chinese patients. Trial registration ClinicalTrials.gov ID: NCT03381131


Introduction
Fibromyalgia is a chronic disorder characterized by generalized pain, fatigue, insomnia, and mental disorders, such as anxiety and depression [1,2]. This condition affects up to 5% of the general population globally, making it the second most common rheumatologic condition (trailing only osteoarthritis) [3]. However, in mainland China, it remains challenging to diagnose fibromyalgia, leading to a high frequency of delayed diagnoses and misdiagnoses. A recent cross-sectional study reported that 90% of fibromyalgia patients were misdiagnosed at their first visit complaining of fibromyalgia symptoms [4], and fibromyalgia was correctly diagnosed an average of 2 years later [5]. Moreover, the misdiagnosis rate of fibromyalgia is also high, reaching 87% [6].
There are several hypotheses regarding the difficulty of diagnosing fibromyalgia in China. First, there is a lack of knowledge and a misunderstanding of this syndrome among doctors, even rheumatologists [7]. Second, according to previous findings, Chinese patients may have milder fibromyalgia-related symptoms and a better quality of life (QOL) than individuals from other cultural and ethnic groups [4]; therefore, the ACR diagnostic criteria for fibromyalgia may not be generalizable to the Chinese population. Third, the diagnostic level of fibromyalgia in China needs to be improved.
Since fibromyalgia is a complex clinical syndrome without distinct biomarker(s) [8], the diagnosis of fibromyalgia was and still is mostly made on the basis of clinical presentation and the exclusion of alternative diagnoses. To facilitate the diagnosis of fibromyalgia, in 1990, the ACR developed fibromyalgia classification criteria that require the presence of generalized pain for 3 months and a physical examination to identify tenderness at 11 or more of the 18 specific tender point counts (TPCs) [9]. Although the ACR 1990 criteria (1990c) are recognized as the gold standard for fibromyalgia diagnosis, they are not widely used because of the challenging nature of the TPC examination, which is the main reason for delayed diagnoses of fibromyalgia [10]. In addition, the 1990c lack the ability to characterize the clinical symptoms and severity and the ability to monitor the progression of the disease. Therefore, the ACR developed new preliminary diagnostic criteria for fibromyalgia in 2010 [11] as well as a self-report questionnaire for patient surveys and clinical research in 2011 [12]; in these new criteria, the TPC examination was eliminated.
In 2016, the ACR published a revision to the 2010/2011 fibromyalgia diagnostic criteria [13] that validated their utility, addressed the previously identified problems, and emphasized the clinical importance of the fibromyalgia severity (FS) scale. Although the above-mentioned criteria were developed and validated meticulously elsewhere in the world, they were not evaluated and validated in China before being adopted as the diagnostic standard. This is problematic because symptoms and symptom severity are subjective and are likely influenced by racial and cultural backgrounds. Therefore, to provide a solid foundation for clinical diagnosis and future investigation and research, this study aimed to verify the applicability, reliability and validity of the ACR 2011 criteria (2011c) and 2016 criteria (2016c) for fibromyalgia in China, and hypothesized that the 2011c and 2016c were reliable instruments for diagnosing Chinese patients with fibromyalgia. It also aimed to examine the 2011c and 2016c from a clinical diagnostic perspective in order to identify the optimal cutoff score for using the FS scale to diagnose fibromyalgia among Chinese patients.

Sample size estimation and research design
The Power Analysis and Sample Size (PASS) version 11.0 software (www.ncss.com) was used to calculate the sample size of this diagnostic test. A previous ACR 2016c assessment study conducted in a Norwegian fibromyalgia population resulted in a sensitivity of 88.8% and a positive ratio of 86% according to the ACR 1990c [14]. Based on a confidence level (1-α) = 0.95 and a confidence interval width (two-sided) = 0.2, a total of 178 patients with fibromyalgia should be investigated. Allowing for about 10% of questionnaires not being completed adequately, about 200 cases were required in the fibromyalgia group.
This multicentre diagnostic trial was conducted from 1 October 2018 to 30 September 2021. We recruited 200 patients with fibromyalgia who met the 1990c from the rheumatology clinics at 6 hospitals. The rheumatoid arthritis (RA) group was matched with the fibromyalgia group based on gender and age. The study protocol was approved by the ethics committee of the Guang'anmen Hospital (approval number 2018-059-KY) and was performed in accordance with the Declaration of Helsinki, and written informed consent was obtained from all participants. Investigator training, including tenderness site examination, was conducted prior to the clinical study.
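The text does not state which formula PASS applied. As a rough check only, the sketch below assumes the common normal-approximation approach for estimating a sensitivity (positives needed = z² · Se · (1 − Se) / d², then divided by the expected positive ratio). The function name `cases_needed` and the half-width parameter `half_width` (d) are illustrative assumptions, not taken from the study.

```python
import math
from statistics import NormalDist

def cases_needed(sensitivity, positive_ratio, half_width, confidence=0.95):
    """Return (positives needed, total to recruit) before rounding up.

    Assumed normal-approximation formula; not confirmed as the PASS method.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    positives = z ** 2 * sensitivity * (1 - sensitivity) / half_width ** 2
    return positives, positives / positive_ratio

# Inputs reported in the text: sensitivity 88.8%, positive ratio 86%.
# half_width=0.05 is an assumption chosen for illustration.
positives, total = cases_needed(0.888, 0.86, half_width=0.05)
print(math.ceil(positives), math.ceil(total))
```

Under this assumed formula, a half-width of 0.05 gives 178 cases, the figure reported above; whether this margin corresponds to the stated interval width depends on how PASS defines it.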

Patient selection
The inclusion criterion for fibromyalgia participants was receiving a diagnosis of fibromyalgia based on the 1990c at the first outpatient visit, regardless of whether they had previously been diagnosed with fibromyalgia. The inclusion criteria for control participants were (i) no previous diagnosis of fibromyalgia and not meeting the criteria for fibromyalgia on the day of the visit and (ii) a previous diagnosis of RA by a rheumatologist according to the ACR 1987 criteria [15] or the 2010 criteria for RA [16]. Patients with (i) severe mental diseases, including schizophrenia, personality disorders, dissociative disorders, mood disorders, post-traumatic stress disorder, etc., or (ii) fractures, defined neuropathic causes or other nonrheumatic causes of pain were excluded from the study.

Study procedure
Patients visiting the outpatient clinic for body pain were first screened for the diagnosis of fibromyalgia using the 1990c by a trained investigator. If they met the inclusion criteria, their demographic and disease-related information was collected after informed consent was obtained. The assessment of fibromyalgia based on the 2011c and 2016c was completed separately by the patients themselves and by the same investigator at the first visit and at a second visit within 7 to 14 days. The investigators in this study were well trained and experienced in checking the TPC and using the scale.

Translation of Chinese versions of the 2011c and 2016c
With the permission of the original author by e-mail correspondence, the 2011c and 2016c were translated into Chinese following an adequate translation procedure according to optimal standards [17]. Two native Chinese-speaking experts, including one physician and one layperson, independently translated the 2011c and 2016c. Then, two native English-speaking experts, including one physician and one layperson, back-translated the Chinese versions of the 2011c and 2016c. After the translation, two bilingual multidisciplinary experts compared the original versions with the translated versions and modified and improved the translated versions to ensure that they were fully understandable, and the experts verified the cross-cultural semantic consistency between the two versions. Two bilingual patients volunteered to complete the translated versions and to discuss the translated items with their physicians to ensure that the semantics were clear; then, the translated versions were further modified and improved. Finally, the ACR 2011c and 2016c were translated and culturally adapted successfully.

Questionnaires
The FS scale includes two subscales: the widespread pain index (WPI) and the symptom severity scale (SSS). The WPI assesses the number of areas in which the patient has had pain over the last week, with a score ranging from 0 to 19. The SSS measures the severity level of the following symptoms over the previous week, with a score ranging from 0 to 12: fatigue, waking unrefreshed, cognitive symptoms and the extent of somatic symptoms. A higher score on the SSS indicates more severe fibromyalgia-related symptoms. The sum of the WPI and SSS generates the FS scale score, which ranges from 0 to 31; higher scores indicate worse fibromyalgia severity [12,13].
The diagnosis of fibromyalgia can be made only if 1) the diffuse symptoms last for at least 3 months and 2) the symptoms are not better explained by other disorders. Furthermore, the 2011c for fibromyalgia were as follows: WPI ≥7 and SSS score ≥5, or WPI of 3-6 and SSS score ≥9. The 2016c for fibromyalgia were as follows: WPI ≥7 and SSS score ≥5, or WPI of 4-6 and SSS score ≥9; additionally, there must be generalized pain involving at least 4 of 5 areas (excluding jaw, chest and abdomen pain).
The revised fibromyalgia impact questionnaire (FIQR) is a self-administered questionnaire with 21 questions across three subscales: (1) the 'function' subscale focuses on the ability to perform large muscle tasks, (2) the 'overall impact' subscale focuses on the overall impact of fibromyalgia, and (3) the 'symptoms' subscale focuses on the common fibromyalgia-related symptoms. The total FIQR score ranges from 0 to 100, with lower scores indicating more improvement or less negative impact, and this questionnaire has been validated in Chinese patients with fibromyalgia [18].
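The score thresholds above can be written as a short decision rule. The following is a minimal sketch, assuming the 3-month duration condition has already been checked separately; the function and parameter names (`fs_score`, `meets_2011c`, `meets_2016c`, `pain_regions`) are illustrative and do not come from the ACR documents.

```python
def fs_score(wpi, sss):
    """FS scale score: sum of the WPI (0-19) and the SSS (0-12)."""
    return wpi + sss

def meets_2011c(wpi, sss):
    """Score thresholds of the 2011c (duration condition checked elsewhere)."""
    return (wpi >= 7 and sss >= 5) or (3 <= wpi <= 6 and sss >= 9)

def meets_2016c(wpi, sss, pain_regions):
    """Score thresholds of the 2016c plus pain in at least 4 of 5 regions."""
    score_ok = (wpi >= 7 and sss >= 5) or (4 <= wpi <= 6 and sss >= 9)
    return score_ok and pain_regions >= 4

# A patient with WPI 3 and SSS 9 satisfies the 2011c thresholds but not the
# 2016c thresholds, because the 2016c raise the lower WPI bound from 3 to 4.
print(meets_2011c(3, 9), meets_2016c(3, 9, pain_regions=5))
```

The two rules differ only in the lower WPI bound and the regional pain requirement, which is why every 2016c-positive patient is also 2011c-positive.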

Statistics
Statistical Package for the Social Sciences (SPSS) version 26.0 (IBM SPSS, China) was used for statistical analysis. The validity and reliability of the ACR 2011c and 2016c were analysed in line with the relevant consensus-based standards for the selection of health measurement instruments (COSMIN) checklist [19].
Qualitative data were presented as numbers and percentages, and continuous variables were summarized as the mean and standard deviation (SD), or as the median and interquartile range (IQR) if they were not normally distributed. The fibromyalgia group and RA group were compared with the Mann-Whitney U test or independent-samples t-test for continuous variables and the chi-square test for qualitative data. The standardized mean difference effect size statistic was used in the comparisons of continuous variables between the fibromyalgia group and RA group. The test-retest reliability of the WPI, SSS and FS scales was assessed with Spearman's correlation coefficient, with values near 1 indicating a strong correlation and values near 0 indicating a weak correlation. Cronbach's α coefficient was analysed to assess the reliability and internal consistency of the 2011c and 2016c, with values in the range of 0.7 to 0.9 representing good internal consistency [20]. Construct validity was analysed by comparing the FS scale and its subscales (WPI and SSS scores) with the FIQR, including the FIQR total score and its subscales (FIQR-function, -overall, and -symptom scores), using Spearman correlation coefficient analysis.
The receiver operator characteristic (ROC) analyses of the 2011c and 2016c were generated based on whether patients met these two diagnostic criteria. The area under the curve (AUC), sensitivity and specificity were calculated to assess the reliability and validity of the 2011c and 2016c. In addition, positive and negative predictive values (PPV and NPV) and positive and negative likelihood ratios (PLR and NLR) were calculated to assess accuracy. ROC curves were analysed to determine whether new cutoff values of the FS scale combined with the 'at least 4 of 5 regions' criterion could lead to better diagnostic indices than the 2011c and 2016c.
All statistical tests were two-sided. To avoid comparison bias, p values less than 0.01 were considered statistically significant.
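For readers unfamiliar with Cronbach's α, the sketch below shows the standard formula, α = k/(k − 1) · (1 − Σ item variances / variance of total scores), in plain Python. The study itself used SPSS; the function name `cronbach_alpha` and the demo scores are hypothetical and only illustrate the calculation.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, all in the same respondent order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores for 3 items answered by 5 respondents (not study data).
demo_items = [
    [2, 1, 3, 0, 2],
    [3, 1, 2, 1, 2],
    [2, 0, 3, 1, 3],
]
alpha = cronbach_alpha(demo_items)
print(f"alpha = {alpha:.2f}")
```

The function needs at least two items and non-constant total scores; otherwise the divisions are undefined.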

Comparison of demographic and clinical characteristics
The demographic and clinical characteristics of the fibromyalgia group and RA group are shown in Table 1. The mean age was 48.7 years in the fibromyalgia group and 49.2 years in the RA group, and the proportion of females was 87%. A shorter symptom duration, higher WPI, SSS, and FS scale scores, as well as more TPCs were observed in the fibromyalgia group (p < 0.001). The percentages of those who met the 2011c and 2016c were significantly greater in the fibromyalgia group than in the RA group (p < 0.001).

The internal consistency reliability
The internal consistency and test-retest reliability were determined by examining the data of the first visit and the second visit in the fibromyalgia group. The results of Spearman's correlation analysis for the FS scale, its subscales, and all 25 single items revealed highly positive correlations (ranging from 0.53 to 0.82) for each item in the two diagnostic criteria. Cronbach's α coefficient was 0.82 for the total FS scale score, 0.84 for the WPI, and 0.65 for the SSS.

The construct validity
Spearman's correlation coefficients between the FS scale and its subscales (WPI and SSS) on the one hand and the FIQR total score and its subscales on the other were calculated to examine construct validity, and the results are shown in Table 2. The FS scale, WPI and SSS were significantly correlated with all the FIQR scores (total, function, overall, and symptom scores), which indicates good construct validity.

Validity analysis of ACR 2011c and 2016c
The AUC, sensitivity, specificity, PPV, NPV, PLR and NLR of the 2011c and 2016c are shown in Table 3. The high specificity of the 2011c (96.0%) and the 2016c (98.5%) indicates that those two criteria are reliable for diagnosing fibromyalgia patients in China; however, the sensitivity values were low (78.0% and 70.5% for the 2011c and 2016c, respectively), and the AUC values were 0.870 for the 2011c and 0.845 for the 2016c.
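These indices all follow from a 2×2 table. The sketch below is illustrative only: the 156 true positives among 200 fibromyalgia patients are reported in this study, whereas the control counts (192 negatives among an assumed 200 RA controls) are an assumption back-calculated from the 96.0% specificity. For a dichotomous test, the ROC curve has a single operating point, so the AUC equals the mean of sensitivity and specificity.

```python
def diagnostic_indices(tp, fn, tn, fp):
    """Standard accuracy indices from the cells of a 2x2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sens / false_positive_rate,  # undefined when fp == 0
        "nlr": (1 - sens) / spec,
        "auc": (sens + spec) / 2,           # single-threshold ROC curve
    }

# 2011c: tp and fn from the text; tn and fp assume 200 RA controls.
idx_2011 = diagnostic_indices(tp=156, fn=44, tn=192, fp=8)
print({name: round(value, 3) for name, value in idx_2011.items()})
```

With these counts the sketch gives an AUC of 0.87, in line with the value reported for the 2011c; the same relation holds for the 2016c, since (0.705 + 0.985) / 2 = 0.845.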

Characteristics comparison of patients positive or negative against 2011c and 2016c
Comparative analyses were performed to investigate differences in characteristics between patients who were positive and those who were negative based on the 2011c and 2016c among patients who met the 1990c, and the results are shown in Table 4. No differences were found in terms of TPC among the 2011c-positive, 2011c-negative (which also means failing to satisfy the 2016c), 2016c-positive (which also means satisfying the 2011c), 2016c-negative, and 2011c-positive but 2016c-negative groups. All inconsistencies occurred in the 15 2011c-positive cases that failed to meet the 2016c: none of these 15 cases met the generalized pain requirement of 4 pain regions, and 2 cases also did not reach the minimum WPI score of 4 (the minimum score is 3 in the 2011c). However, it seems that 1) males are less likely to meet either the 2011c or the 2016c, 2) the shorter the duration of fibromyalgia is, the less likely patients are to meet either the 2011c or the 2016c, and 3) young patients tend to meet the 2011c but not the 2016c. Additionally, the patients who satisfied either the 2011c or the 2016c had a longer symptom duration and higher WPI, SSS and FS scale scores than those who failed to meet the criteria.

PPV: positive predictive value; NPV: negative predictive value; PLR: positive likelihood ratio; NLR: negative likelihood ratio. FS: fibromyalgia severity; WPI: widespread pain index; SSS: symptom severity scale. The 2011c include (1) a WPI ≥7 and an SSS score ≥5 or a WPI of 3-6 and an SSS score ≥9, (2) symptoms present at a similar level of severity for at least 3 months; the 2016c include (1) a WPI ≥7 and an SSS score ≥5 or a WPI of 4-6 and an SSS score ≥9, (2) widespread pain involving at least 4 of 5 specific areas (excluding the jaw, chest and abdomen), (3) diffuse symptoms lasting for at least 3 months.

ROC analyses for the cutoff point of the FS scale based on the generalized pain criterion
Considering that the 2016c are the latest set of ACR fibromyalgia criteria and the more advanced standard, which additionally applies the generalized pain criterion, we performed ROC analysis for different cutoff scores of the FS scale applied together with the generalized pain criterion. As shown in Table 5, the results show higher AUC and sensitivity values but lower specificity values than the original ACR 2016c. The highest AUC value was 0.923 when the cutoff score of the FS scale was 11; at this score, the sensitivity and specificity were 87.0% and 97.5%, respectively.
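The cutoff search described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' SPSS procedure: a patient is counted as positive when the FS score reaches the cutoff and pain involves at least 4 of 5 regions, and the toy data and the names `scan_cutoffs`, `cases` and `controls` are hypothetical.

```python
def scan_cutoffs(cases, controls, cutoffs):
    """cases/controls: lists of (fs_score, pain_regions) tuples.

    Returns one (cutoff, sensitivity, specificity, auc) row per cutoff.
    """
    def positive(fs, regions, cutoff):
        return fs >= cutoff and regions >= 4

    rows = []
    for cutoff in cutoffs:
        sens = sum(positive(fs, r, cutoff) for fs, r in cases) / len(cases)
        spec = sum(not positive(fs, r, cutoff) for fs, r in controls) / len(controls)
        rows.append((cutoff, sens, spec, (sens + spec) / 2))
    return rows

# Hypothetical toy data (not study data).
cases = [(18, 5), (12, 4), (11, 4), (10, 5), (15, 3)]
controls = [(5, 2), (12, 4), (8, 4), (3, 1), (9, 5)]

rows = scan_cutoffs(cases, controls, range(10, 14))
best = max(rows, key=lambda row: row[3])  # row with the highest AUC
print(best)
```

Applied to the reported values at a cutoff of 11, the same single-point relation gives (0.870 + 0.975) / 2 = 0.9225, consistent with the AUC of 0.923 stated above.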

Discussion
Our findings reveal that the 2011c and 2016c are valid instruments with high specificity that can be used to diagnose fibromyalgia in the Chinese population. The demographic characteristics and clinical features of the participants in this multicentre study were similar to those in a previous report of Chinese patients with fibromyalgia [4]. Our results also suggest that, in the presence of chronic pain (lasting for at least 3 months), an FS scale score ≥ 11 combined with generalized pain (pain involving at least 4 of the 5 areas) may be more suitable for use as modified 2016c and is worthy of further study to determine whether it could be used as a valid tool for diagnosis in China. Some adjustments to the original versions of Wolfe et al. had to be made when validating the 2011c and 2016c for Chinese fibromyalgia patients. On the one hand, based on the advanced diagnostic concept of the 2016c that fibromyalgia need not be distinguished from other diseases that are present in order to be diagnosed, we also eliminated this condition when using the 2011c for diagnosis. On the other hand, all items of the WPI and SSS were put into tables, and the interpretation of the scoring criteria was added as notes. These changes make the FS scale, WPI and SSS scoring procedures simpler and more comfortable for both patients and physicians, and thus allow a more rapid clinical diagnosis of fibromyalgia.
The results of the current study reveal that the 2011c and 2016c had strong reliability and a good level of internal consistency in China. These findings are acceptable and comparable with previous reports. The 2016c validation study in Korean patients with fibromyalgia also showed an acceptable Cronbach's α coefficient of 0.942 (95% CI: 0.930-0.964) [21], similar to a previous French study [22]. However, a weak internal correlation among the items of the SSS was found; this was indicated by the Cronbach's α coefficient of 0.65 for the SSS when considering both the 2011c and 2016c. Nevertheless, this relatively low reliability of the SSS was also shown in a few studies [23,24], which indicated the variance in the severity of fibromyalgia symptoms.
The 2011c and 2016c presented acceptable sensitivity (78.0% and 70.5%, respectively) and specificity values (96.0% and 98.5%, respectively) in Chinese patients with fibromyalgia. Those findings show that the 2011c tend to have lower sensitivity but higher specificity than reported in validation studies from other countries, such as America (sensitivity: 86% and specificity: 90%) [13], Spain (sensitivity: 88.3% and specificity: 91.8%) [25], Italy (sensitivity: 79.8% and specificity: 91.7%) [26] and Norway (sensitivity: 93.9% and specificity: 71.3%) [14]. In contrast, a Japanese version that was tested in 462 fibromyalgia patients with 231 RA or osteoarthritis controls yielded a sensitivity of 64% and a specificity of 96% [24]. Similarly, the 2016c are likely to have lower sensitivity but higher specificity than those found in validation studies from America (sensitivity: [value missing] and specificity: 90%) [13], Italy (sensitivity: 78% and specificity: 90.5%) [26], Norway (sensitivity: 88.8% and specificity: 77.5%) [14] and South Korea (sensitivity: 93.1% and specificity: 90.7%) [21]. However, our findings are in line with a Spanish study that found a low sensitivity (75.6%) and a high specificity (99.7%) when patients were diagnosed by both the 1990c and the ACR 2011c [25]. Contrary to our findings, a Norwegian study found high sensitivity when using the Norwegian version of the 2011c (93.9%) and 2016c (88.8%) but low specificity values (71.3% and 77.5%, respectively) among patients who fulfilled the ACR 1990c [14]. Additionally, the AUC values for the 2011c and 2016c are 0.870 and 0.845, respectively, which is similar to the AUC value of 0.86 observed for the 2016c in the Norwegian study [14], higher than the AUC of 0.79 for the 2010c observed in a Spanish study [27] and lower than the AUC of 0.97 for the 2016c observed in a Korean study [21]. Sociocultural characteristics may explain the different results in terms of sensitivity and specificity among different populations. Variables that differ among fibromyalgia patients from different countries or races, such as culture [28], social factors [29], diet structure [30], body composition [31], anxiety and depression level [32,33], and past experiences [34], can interfere with the clinical presentation and symptom severity of fibromyalgia. Thus, sociocultural factors should be taken into account in the diagnosis of fibromyalgia.
Among the 200 patients who satisfied the 1990c, there were 156 patients (78.0%) who met the 2011c and 141 patients (70.5%) who met the 2016c. This resulted in agreement in 141 cases (70.5%) and disagreement in 15 cases (7.5%). This result came about because 15 of the 2011c-positive cases failed to meet the new generalized pain requirement, and 2 of these cases also failed to meet the WPI minimum score of 4. The 2016c do not classify as positive any patients who were negative based on the 2011c. All the changes in criteria status occurred in those who were positive based on the 2011c, amounting to almost 10% (15/156) of all 2011c-positive subjects. The 2016c were revised from the 2011c to ensure that patients with regional pain syndromes would not be misclassified as having fibromyalgia. However, the cost of this assurance was that nearly 30% (59/200 cases) of patients diagnosed by the 1990c and 10% (15/156 cases) of patients diagnosed by the 2011c fail to meet the 2016c. Thus, these changes were useful but costly because they reduced the sensitivity of the diagnostic criteria and increased the limitations in reliability and validity.
It seems that an improvement in sensitivity without a significant decrease in specificity can be achieved by using a lower cutoff score for the FS scale instead of both of its subscales (WPI and SSS) among Chinese patients. Scores on the FS scale and its subscales were significantly different between the fibromyalgia and nonfibromyalgia control groups. The FS scale could differentiate fibromyalgia from other chronic pain disorders with acceptable sensitivity, specificity, PPV and NPV values. Using a lower cutoff score for the FS scale (≥11), conditional on meeting the generalized pain requirement, might be a more effective approach in the Chinese population, yielding a higher AUC of 0.923 and a sensitivity and specificity of 87.0% and 97.5%, respectively. This finding indicates that symptoms among Chinese patients with fibromyalgia are likely to be less severe, which is consistent with our previous study [4]. The use of a lower cutoff score for the FS scale should be researched further to test its reliability in assessing symptom intensity and diagnosing fibromyalgia in Chinese patients.
There are some limitations in the current study. Firstly, the disease duration of patients in the fibromyalgia group was similar to that in our previous report (median duration of fibromyalgia: 24.0 vs. 24.0 months) [4] but was shorter than that in the RA group (median: 24.0 vs. 49.5 months). Additional Chinese epidemiological studies are needed to determine whether the onset age of fibromyalgia is older than that of RA. Secondly, none of the fibromyalgia subjects were recruited from a primary care setting; thus, the results probably do not reflect the performance of these criteria in the general population. Thirdly, the control group consisted solely of matched RA patients, which may limit the generalizability of the findings. However, including patients with myofascial pain syndrome or generalized osteoarthritis as controls was not feasible in clinical practice because of their low prevalence, so the study did not include other diseases as controls.
Although the ACR 2011c and 2016c were validated at the tertiary level of care, the present study shows acceptable validity of those tools for fibromyalgia diagnosis in research settings among Chinese patients with fibromyalgia in a fairly large sample. It remains to be clarified whether the new cutoff value for the FS scale could yield better AUC, sensitivity, and specificity values for the diagnosis of fibromyalgia among Chinese patients. This approach should be examined further in clinical and research settings to enhance the treatment response and outcomes of this disorder.
In conclusion, the 2011c and 2016c are reliable instruments for diagnosing fibromyalgia patients in China. The FS scale could be a valid tool to assist in fibromyalgia diagnosis, and a cutoff value of 11 points is more suitable in Chinese patients.

Table 1. Demographic and clinical characteristics of the fibromyalgia and rheumatoid arthritis groups (fibromyalgia diagnosis met the ACR 1990c).

Table 2. FS construct validity: Spearman's correlation coefficients among subscales of the FIQR and FS scale.
FS: fibromyalgia severity, the sum of the WPI and SSS scores; FIQR: revised fibromyalgia impact questionnaire; WPI: widespread pain index; SSS: symptom severity scale. * Correlation is significant at the 0.01 level (2-tailed).

Table 4. Differences between fibromyalgia patients according to their diagnostic condition, with the fibromyalgia diagnoses meeting the ACR 1990c.

Table 5. Receiver operator characteristic analyses based on the FS cutoff point in fibromyalgia patients meeting the generalized pain requirement of 4 pain regions.