Psychometric evaluation of the appearance anxiety inventory in adolescents with body dysmorphic disorder

ABSTRACT The Appearance Anxiety Inventory (AAI) is a self-report measure assessing the typical cognitions and behaviours of body dysmorphic disorder (BDD). Despite its use in research and clinical settings, its psychometric properties have not been evaluated in young people with BDD. We examined the factor structure, reliability, validity, and sensitivity to change of the AAI in 182 youths with BDD (82.9% girls; Mage = 15.56, SD = 1.37) consecutively referred to two specialist outpatient clinics in Stockholm, Sweden (n = 97) and London, England (n = 85). An exploratory factor analysis identified three factors, namely “threat monitoring”, “camouflaging”, and “avoidance”, explaining 48.15% of the variance. The scale showed good internal consistency (McDonald's omega = 0.83) and adequate convergent validity with the Yale-Brown Obsessive-Compulsive Scale Modified for Body Dysmorphic Disorder for Adolescents (BDD-YBOCS-A; rs = 0.42) and the Clinical Global Impression-Severity Scale (rs = 0.32). Sensitivity to change was adequate, with AAI total scores and individual factor scores significantly decreasing over time in the subgroup of participants receiving multimodal treatment for BDD (n = 79). Change in AAI scores over treatment showed a positive, statistically significant, moderate-to-good correlation (r = 0.55) with changes in BDD symptom severity, measured by the BDD-YBOCS-A. The study provides empirical support for the use of the AAI in young people with BDD in clinical settings.


Introduction
Body dysmorphic disorder (BDD) is characterized by an excessive preoccupation with perceived defects in appearance and associated time-consuming rituals and avoidance. The aim of this study is hence to conduct a comprehensive psychometric evaluation of the AAI in a large sample of 182 well-characterized youths with BDD referred to specialist clinics in Stockholm, Sweden and London, England. Further, we explored the AAI's sensitivity to change in a subsample of young people who had received specialist multimodal treatment and provided pre- and post-treatment data (n = 79).

Participants
This secondary data analysis includes a clinical sample of 210 well-characterized youths with BDD, consecutively referred to two specialist paediatric obsessive-compulsive and related disorders outpatient clinics in Stockholm, Sweden (n = 108) and in London, England (n = 102) between 2011 and 2023. Detailed characteristics of the sample have been provided elsewhere (Rautio, Gumpert, et al., 2022; Rautio, Jassi, et al., 2022). For this study, 182 adolescents (Stockholm n = 97, London n = 85) of the original sample of 210 participants had available AAI data at pre-treatment and were included in the analysis. Of these, 79 adolescents (Stockholm n = 67, London n = 12) received multimodal treatment for BDD and had available post-treatment AAI data. The remaining participants (n = 103, 56.59%) either did not receive treatment at their respective clinics or had no post-treatment data available (e.g., still in treatment).

Settings and procedures
The study was approved by the Regional Ethical Review Board in Stockholm (reference number 2015/1977-31/4) and by the South London and Maudsley Child and Adolescent Mental Health Service Audit Committee. At the Stockholm site, informed consent was required and provided by all patients and their parents/legal guardians. At the London site, informed consent was not required because the study was part of an audit of routinely collected clinical data.
Participants from both the Stockholm and London specialist obsessive-compulsive and related disorders outpatient clinics underwent similar intake procedures, consisting of a 3-hour assessment by a multidisciplinary team where they completed a series of interviews, including a full psychiatric and developmental history. In Stockholm, this included the Mini International Neuropsychiatric Interview for Children (MINI-KID) (Sheehan et al., 1998), supplemented with additional modules for obsessive-compulsive and related disorders. In London, this was done with the Development and Well-Being Assessment (DAWBA) (Goodman et al., 2000). Clinical diagnoses were made according to DSM-5 criteria. Following the assessment, adolescents were either offered treatment at the specialist clinic or referred to more appropriate services. All youths who were offered multimodal treatment received individual sessions of cognitive-behaviour therapy (CBT) for paediatric BDD according to an evidence-based treatment protocol (Mataix-Cols et al., 2015; Rautio, Gumpert, et al., 2022). The same treatment protocol was followed at both sites. Additionally, a subset of patients was also treated with medication when considered clinically relevant. All patients who were offered treatment were assessed again after treatment. Since this is a naturalistic study, not all patients completed all measures at all time-points, and analyses were done on available data. However, our sample size included over 10 cases per indicator variable and, following previous studies on the appropriateness of sample sizes (Nunnally, 1978), it is considered to be sufficiently powered to perform the planned analyses. Further details on the clinical settings, assessment procedures, and treatments are provided elsewhere (Rautio, Gumpert, et al., 2022; Rautio, Jassi, et al., 2022).

Measures
The following clinician-administered and self-reported measures were administered to all participants at both sites at baseline and post-treatment, unless otherwise specified.
The AAI is a self-reported 10-item measure that covers typical BDD cognitions and behaviours (Veale et al., 2014). Items are scored on a 0-4 Likert scale, yielding a total score range from 0 to 40, with higher scores denoting greater symptom severity. The original psychometric evaluation of the scale (Veale et al., 2014) suggested two subscales: avoidance, including items that describe strategies to hide body parts of concern or avoid feared situations (e.g., "I try to camouflage or alter aspects of my appearance") and threat monitoring, referring to excessive checking behaviours to verify exactly how one looks (e.g., "I brood about past events or reasons to explain why I look the way I do"). For the current study, the two sites used slightly different versions of the AAI, with Stockholm using the original 10-item version (Veale et al., 2014), while London employed a 14-item version that was an earlier iteration of the instrument (D. Veale, personal communication, 31 December 2022). Only the 10 items included in the final version of the scale were utilized in this study. The translation process of the Swedish version of the AAI was conducted according to international standards; the original English version was translated into Swedish by a Swedish BDD expert. An independent bilingual translator then carried out a reverse translation from Swedish to English. The final version was then approved by the scale's creator, Professor David Veale.
The BDD-YBOCS-A is a widely used clinician-administered, semi-structured interview that measures BDD symptom severity and has demonstrated good reliability and sensitivity to change, as well as adequate convergent and divergent validity (Monzani et al., 2022). The instrument contains 12 Likert-type items ranging from 0 to 4: five questions on obsessions, five on compulsions, one about insight, and one to measure avoidance. The total BDD severity score ranges from 0 to 48, with higher scores denoting greater symptom severity.
The Clinical Global Impression-Severity scale (CGI-S) is a clinician-rated single-item measure of symptom severity, in this case of BDD symptoms, ranging from 1 ("normal, not at all ill") to 7 ("among the most extremely ill patients"; Busner & Targum, 2007). It is often used in treatment trials (Busner & Targum, 2007) and has shown good concurrent validity and sensitivity to change (Leon et al., 1993).
The Children's Global Assessment Scale (CGAS) is a clinician-rated single-item measure of the global functioning of a young person during the last month. Scores range from 1 (more disabled) to 100 (best functioning). The CGAS has good psychometric properties, with high reliability as well as discriminant and concurrent validity (Shaffer et al., 1983).
Self-reported depressive symptoms were assessed by means of different instruments. In Stockholm, the Children's Depression Inventory-Short Version (CDI-S), a 10-item instrument (Allgaier et al., 2012), was used from 2015 and was replaced in 2018 with the Short Mood and Feeling Questionnaire (SMFQ-C), a 13-item measure (Rhew et al., 2010). In London, the 33-item Mood and Feeling Questionnaire (MFQ-C) was used throughout the whole inclusion period (Burleson Daviss et al., 2006). All these measures of depressive symptoms have shown good psychometric properties (Allgaier et al., 2012; Burleson Daviss et al., 2006; Rhew et al., 2010; Thabrew et al., 2018). A z-transformation was conducted to combine the scores from these instruments (for details see Rautio, Gumpert, et al., 2022).
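As an illustration of how scores from instruments with different ranges can be placed on a common scale, the following minimal sketch standardizes raw totals within each instrument and then pools them. It assumes within-instrument standardization with the sample standard deviation; the exact procedure used in the study is described in Rautio, Gumpert, et al. (2022) and may differ. The function name and all scores are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 within one instrument."""
    m, s = mean(values), stdev(values)  # sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]

# Hypothetical raw totals, grouped by the instrument each participant completed.
raw = {
    "CDI-S": [4, 9, 12, 15],
    "SMFQ-C": [6, 10, 14, 22],
    "MFQ-C": [20, 31, 42, 55],
}

# Standardize within each instrument, then pool onto one common scale.
pooled = [z for scores in raw.values() for z in z_scores(scores)]
```

After this step every participant has a depression score expressed in standard deviation units relative to others who completed the same instrument, which is what allows a single correlation with the AAI to be computed across sites.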

Data analysis
Exploratory factor analysis with direct oblimin rotation was conducted to examine the factor structure of the AAI at baseline, using principal axis factoring as the data were significantly non-normal. Exploratory factor analysis was chosen over confirmatory factor analysis due to previous inconsistent results and the absence of a theoretically based expected factor structure. Parallel analysis was conducted to determine the number of factors, as this is regarded as a more accurate method compared to the traditional scree plot and Kaiser rule approach (Wilson & Cooper, 2008; Zwick & Velicer, 1986).
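The logic of parallel analysis can be sketched as follows: eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same dimensions, and factors are retained only while the observed eigenvalue exceeds its random counterpart. The sketch below is the classic Horn procedure on full correlation-matrix eigenvalues using NumPy, not the implementation in jamovi that was used in the study; the function name, the 95th-percentile threshold, and the simulated data are assumptions for illustration.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading observed eigenvalues that exceed the chosen percentile
    of eigenvalues from random normal data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        eig = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        simulated[i] = np.sort(eig)[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break  # stop at the first eigenvalue that random data can match
        n_factors += 1
    return n_factors

# Simulated example: six variables driven by two uncorrelated latent factors.
rng = np.random.default_rng(1)
latent = rng.standard_normal((500, 2))
loadings = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
demo = latent @ loadings.T + 0.6 * rng.standard_normal((500, 6))
n_retained = parallel_analysis(demo)
```

In this simulated case the two-factor structure is strong enough that the procedure recovers two factors; with real questionnaire data the result depends on sample size, the threshold chosen, and the extraction method.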
McDonald's omega was used for evaluation of internal consistency of the AAI as data were not normally distributed (Xiao & Hau, 2023). A recommended minimal value of 0.70 was regarded as an acceptable internal consistency (McNeish, 2018). Reliability assessment also included item-rest correlation (IRC). A recommended minimal correlation value of 0.2 to 0.4 was considered to be an acceptable contribution of the item to the measure (Hobart & Cano, 2009).
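The item-rest correlation described above can be computed directly: each item is correlated with the sum of all remaining items. The following is a minimal sketch using Pearson's correlation and hypothetical responses to a four-item scale scored 0-4; the function names, the data, and the use of 0.20 as the flagging threshold are illustrative assumptions, and the statistical software used in the study may compute the coefficient differently.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def item_rest_correlations(responses):
    """responses: one list of item scores per participant."""
    n_items = len(responses[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total minus the item
        result.append(pearson(item, rest))
    return result

# Hypothetical responses from six participants to four items.
responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 0],
    [2, 2, 1, 3],
    [3, 2, 3, 2],
    [3, 4, 3, 4],
    [4, 3, 4, 4],
]
irc = item_rest_correlations(responses)
# Item numbers (1-based) falling below the lower bound of the recommended range.
flagged = [j + 1 for j, r in enumerate(irc) if r < 0.20]
```

Removing the item from the total before correlating avoids inflating the coefficient through the item's correlation with itself.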
Convergent and divergent validity were evaluated using Spearman's correlation of the AAI with clinician-rated BDD symptom severity (BDD-YBOCS-A and CGI-S) and functional impairment (CGAS), as well as self-reported depressive symptoms (combined z-scores of the CDI-S, MFQ-C, and SMFQ-C).
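Spearman's correlation is Pearson's correlation applied to ranks, with tied observations receiving the average of the ranks they span. A minimal sketch under that definition follows; the function names and the example values are hypothetical and not taken from the study data.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical self-report totals and clinician ratings for five participants.
self_report = [18, 25, 31, 22, 36]
clinician = [24, 30, 29, 27, 38]
rho = spearman(self_report, clinician)
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone transformations of either measure, which is why it suits non-normally distributed scores.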
To evaluate the sensitivity to change of the AAI, we conducted paired-samples t-tests to calculate the pre- to post-treatment changes in the AAI and correlations of the change scores (post-treatment minus pre-treatment values) of the AAI and the BDD-YBOCS-A, using Pearson's correlation as change scores were normally distributed. A significant decrease in total pre- to post-treatment scores, within-group effect size (Cohen's d), and a significant correlation of the change scores would constitute evidence of sensitivity to change.
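The paired comparison described above can be sketched as follows. The example computes the t statistic and degrees of freedom for post-treatment minus pre-treatment change scores, together with a within-group Cohen's d defined here as the mean change divided by the standard deviation of the change scores. That definition of d is an assumption, since several within-group variants exist and the text does not state which was used. The function name and the scores are hypothetical, and obtaining a p-value would additionally require the t distribution from a statistics package.

```python
from statistics import mean, stdev

def paired_t_and_d(pre, post):
    """Return (t, df, d) for paired pre- and post-treatment scores."""
    diffs = [b - a for a, b in zip(pre, post)]  # post minus pre
    n = len(diffs)
    m, s = mean(diffs), stdev(diffs)  # sample SD of the change scores
    t = m / (s / n ** 0.5)            # paired-samples t statistic
    d = m / s                         # assumed within-group effect size
    return t, n - 1, d

# Hypothetical AAI totals before and after treatment for five participants.
pre = [30, 28, 35, 26, 32]
post = [20, 25, 22, 24, 21]
t, df, d = paired_t_and_d(pre, post)
```

With change scores defined as post minus pre, a negative t and a negative d indicate a reduction in symptoms over treatment.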
The threshold of statistical significance (p-value) was set to 0.05. The magnitude of all correlations was considered using guidelines from Colton (1974), where correlations ranging from 0 to 0.25 mean little or no relationship, correlations from 0.25 to 0.50 indicate a weak to fair relationship, correlations from 0.50 to 0.75 a moderate to good relationship, and correlations above 0.75 are interpreted as good to excellent. All statistical analyses were conducted in Jamovi (The jamovi program, 2022).
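The interpretive bands above can be expressed as a small lookup. In this sketch the sign of the correlation is ignored and values falling exactly on a boundary are assigned to the lower band; both choices are assumptions, as is the function name.

```python
def colton_label(r):
    """Describe the magnitude of a correlation using the bands cited above."""
    size = abs(r)  # direction is ignored, only magnitude is classified
    if size <= 0.25:
        return "little or no relationship"
    if size <= 0.50:
        return "weak to fair"
    if size <= 0.75:
        return "moderate to good"
    return "good to excellent"

# Applied to coefficients of the size reported in this study.
labels = {r: colton_label(r) for r in (0.42, 0.32, -0.29, 0.55)}
```

Under these bands, coefficients of 0.42, 0.32, and −0.29 fall in the weak to fair range, and 0.55 falls in the moderate to good range.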

Sample characteristics
The sample characteristics are shown in Table 1. The majority of participants were girls (n = 150, 82.87%). The mean age at intake was 15.56 years (SD = 1.37). At baseline, the mean AAI total score was 28.54 (SD = 7.18), the mean BDD-YBOCS-A score was 32.27 (SD = 5.59), and the mean CGI-S score was 4.92 (SD = 0.79), overall corresponding to moderate to severe levels of BDD symptom severity.

Factor structure
Bartlett's test of sphericity was significant (χ²(45) = 570, p < .001), demonstrating an acceptable number of significant correlations among variables for a factor analysis. The Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) for the overall sample was good (0.81). From the factor analysis, we extracted three factors accounting for 48.15% of the total variance, using optimal implementation of Parallel Analysis (PA) for determining the number of dimensions (see Figure 1 for the scree plot). Factor loadings after rotation ranged from 0.34 to 0.91 and are reported in Table 2. Factor 1, characterized by threat monitoring, accounted for 17.94% of the total variance and included all items of the threat monitoring subscale suggested by Veale et al. (2014) (i.e., items 1, 2, 4, 6, and 8), as well as one additional item assessing the mental act of comparing oneself to others. Factor 2, characterized by camouflaging behaviours, accounted for 17.59% of the total variance and included items related to thinking about or trying to alter one's appearance as well as preventing others from seeing aspects of one's appearance (i.e., items 5, 9, and 10). Factor 3, characterized by avoidance, accounted for 12.62% of the total variance and included items related to avoiding people, situations or looking at oneself, as well as preventing others from seeing aspects of one's appearance (i.e., items 3, 7, and 10). One cross-loading occurred, with item 10 (i.e., preventing others from seeing aspects of one's appearance) loading highly on both Factors 2 and 3 (>0.4).

Internal consistency
Means, standard deviations, and reliability statistics for individual item scores on the AAI at baseline are presented in Table 3. McDonald's ω for the 10 items in the whole sample was 0.82, demonstrating good internal consistency. The internal consistency of the scale could not be improved by removing any of the individual items from the measure. All item-rest correlations were also positive and greater than 0.20, ranging from 0.28 to 0.67, supporting the adequate contribution of all items to the total score. Further, McDonald's ω was 0.71 for the 5 items loading on Factor 1, 0.80 for the 3 items loading on Factor 2, and 0.72 for the 3 items loading on Factor 3, hence demonstrating good internal consistency. However, if item 10, loading on both the second and third factor, were to be removed from either of the factors, the internal consistency would still be acceptable for Factor 2 (ω = 0.78) but not for Factor 3 (ω = 0.59). The internal consistency of the individual factors could not be improved by removing any of the items.

Convergent and divergent validity
In support of convergent validity, there was a statistically significant and weak to fair correlation between the mean total score on the AAI and the clinician-rated BDD-YBOCS-A (rs = 0.42, p < .001) and between the AAI and the CGI-S (rs = 0.32, p < .001). However, the AAI also showed similarly sized correlations with self-reported depressive symptoms at baseline (rs = 0.49, p < .001) and with clinician-rated CGAS scores (rs = −0.29, p < .001).
Supplementary Table S1 shows the baseline differences between those who did vs. those who did not have available post-treatment data on the AAI. No significant differences between groups were observed, except for the fact that participants with missing data were slightly older (M = 15.82, SD = 1.24) than those with available data (M = 15.22, SD = 1.5; t = 2.97, df = 179, p = 0.003, d = 0.45 [95% CI, 0.20, 0.99]).

Discussion
The current study is the first psychometric evaluation of the AAI in a large clinical sample of well-characterized youths diagnosed with BDD. The exploratory factor analysis suggested a three-factor structure for the AAI, which contrasted with the two-factor structure identified in an adult clinical sample and the one-factor structure identified in a community sample of adults and adolescents (Roberts et al., 2018; Veale et al., 2014). Our first factor was very similar to the "threat monitoring" factor suggested by Veale et al. (2014), apart from also including item 1 (i.e., "comparing myself to others"). In the previous study, this item was suggested to belong to the "avoidance" factor, although it did also load on the "threat monitoring" factor to a lesser extent. Our remaining factors capture the typical BDD behaviours of camouflaging (i.e., Factor 2) and avoidance (i.e., Factor 3). In Veale et al. (2014), these items loaded on the same factor. While both camouflaging and avoidance are functionally related and are both thought to fuel and maintain BDD preoccupations, having two separate factors also makes good clinical sense. Thus, we speculate that some young people may predominantly employ one or the other strategy in their attempt to cope with their BDD preoccupations. Item 10 (i.e., "preventing others seeing aspects of one's appearance within particular situations [e.g., by changing my posture, avoiding bright lights etc.]"), targeting both rituals and avoidance, demonstrated complexity as it loaded on both the second and third factor. However, the factors' internal consistency results suggest that item 10 should be included in Factor 3.
Consistent with previous findings in both clinical samples of adults with BDD and community samples of adults and adolescents (Roberts et al., 2018; Veale et al., 2014), our results showed that the AAI has good internal consistency. Each item of the AAI was also positively correlated with the total score minus that item, suggesting that the AAI is a cohesive measure. Additionally, internal consistency was also acceptable for all three factors. If item 10 were to be removed from either the second or third factor, the internal consistency would still be acceptable for Factor 2, but not for Factor 3. A possible explanation for this may be the relatively low explained variances (Factor 2: 17.59%, Factor 3: 12.62%), as well as the reduced number of items, which makes factors more unstable (Costello & Osborne, 2005). Taken together, the results indicate that both total score and scores for the three factors of the AAI may be used, with item 10 being included in the "avoidance" factor to make each subscale as coherent as possible. However, subscales should be interpreted cautiously and further testing of additional items is warranted in future studies.
In line with previous research (Roberts et al., 2018; Veale et al., 2014; Yurtsever et al., 2022), support was also found for convergent validity, with the AAI correlating with the BDD-YBOCS-A and the CGI-S, both assessing BDD symptom severity. The fact that the correlation between the AAI and BDD-YBOCS-A was only modest may be argued to be in line with the intended differences in measurement design by Veale et al. (2014), where the AAI was developed as a process measure, in comparison to the BDD-YBOCS-A, which primarily assesses treatment outcomes and can be recognized as a distinct but partially overlapping factor. However, convergent validity in this study was lower than in previous studies (Roberts et al., 2018; Veale et al., 2014; Yurtsever et al., 2022), though this may be attributed to common method variance, as all but Veale et al. (2014) only used other self-report measures. Additionally, differences in sample characteristics (e.g., symptom severity, age) may have influenced convergent validity. Further, evidence for divergent validity was more modest, as the AAI correlated similarly with self-reported depression as with BDD symptom severity (BDD-YBOCS-A, CGI-S), and, somewhat more weakly, with global functioning (CGAS). These findings may also be attributed to common method variance, as the AAI and depressive measures were completed by the same informant (i.e., self-reported) while the other measures were clinician-reported. The high correlation between symptoms of BDD and depression could also be explained by depression being one of the most common comorbid disorders in BDD (Veale et al., 2016), which is also true for this sample, as reported in our previous publication (Rautio, Jassi, et al., 2022). Future research should aim to further examine the validity of the AAI in clinical samples of adolescents using additional measures for divergent validity.
In line with the findings by Veale et al. (2014), the AAI also demonstrated good sensitivity to change, with scores in the total scale and subscales all significantly decreasing from baseline to post-treatment. Furthermore, the total change scores on the AAI and the BDD-YBOCS-A from baseline to post-treatment were significantly inter-correlated. Thus, results support the use of the AAI as a brief and easily administered measure to gauge or monitor treatment progress.
The strengths of the study include a well-characterised and relatively diverse sample from two European countries and a wide participant age range within a population of youths. The study also had some limitations. Participants were recruited from two specialist BDD clinics and included individuals with relatively severe BDD symptoms, thus limiting the generalizability of the results to milder samples. Furthermore, data for some clinical characteristics of the sample (e.g., comorbidity and use of medication) were not available for the current study. Additionally, we did not have information on the participants' ethnic background, which is needed to evaluate if the scale is suitable for different ethnic groups. Thus, replication of the current results in a new sample of youths with BDD would strengthen the findings. The large majority of participants were girls (82.87%). Although this may be typical in clinical settings, both for adults and adolescents (Krebs et al., 2017; Phillips et al., 2006; Veale et al., 2016), further work will be needed to evaluate the suitability of the scale across different gender groups. We were also unable to evaluate the test-retest reliability of the AAI. Finally, data for this study were collected at two different clinics and over a period of several years, which resulted in some differences in data collection, as well as data loss in different parts of the process.

Conclusions
This study was the first evaluation of the psychometric properties of the AAI in a clinical sample of youths with BDD. The AAI showed a three-factor structure, strong internal consistency, acceptable convergent validity, and good sensitivity to change. Taken together, the study supports the use of the AAI in young people with BDD as an outcome or process measure for use during treatment, in both research and routine clinical practice.

Figure 1. Scree plot of the Appearance Anxiety Inventory (AAI) using principal axis factoring.
Version; CGAS, Children's Global Assessment Scale; CGI-S, The Clinical Global Impression-Severity scale; MFQ-C, Mood and Feeling Questionnaire-Child Version; SD, standard deviation; SMFQ-C, Short Mood and Feeling Questionnaire-Child Version; CDI-S, Children's Depression Inventory-Short Version.

Table 2. Factor loadings for individual items of the AAI at baseline (n = 182).
1. I compare aspects of my appearance to others: 0.558, −0.069, 0.143
2. I check my appearance (e.g., in mirrors, by touching with fingers or by taking photos of myself)
Note. a "Principal axis factoring" extraction method was used in combination with "oblimin" rotation.

Table 3. Means, standard deviations, and reliability statistics for individual items of the AAI at baseline (n = 182).
Note. a Item-rest correlation = correlation coefficient for the correlation between the item score and the total scale score minus the item score; b Internal consistency of the whole scale if item deleted; c Internal consistency of Factor 1 if item deleted; d Internal consistency of Factor 2 if item deleted; e Internal consistency of Factor 3 if item deleted. Abbreviations: AAI, Appearance Anxiety Inventory; SD, standard deviation.