Character Strengths in Adults and Adolescents: Their Measurement and Association with Well-Being

Abstract
Character strengths are assessed in adults and adolescents using different measurements. However, a comparison of character strengths across age groups requires the equivalence of these measurements. The present study examined the comparability of the two questionnaires most frequently used in research: the VIA Inventory of Strengths (VIA-IS) for adults and the VIA Inventory of Strengths for Youth (VIA-Youth). A sample of N = 170 high-school students aged about 18 years and up to two informants (N = 164 mostly parents; N = 156 mostly friends and partners) completed both character strengths questionnaires and several well-being questionnaires. The psychometric characteristics and the correlations with well-being scales were examined once exclusively on the basis of self-ratings and once on the basis of combined self- and informant-ratings. Substantial differences between the questionnaires were found in each of the criteria examined (e.g., identification of signature strengths, or largest associations with well-being). The results indicate that the VIA-IS and the VIA-Youth measure character strengths differently, so that a comparison across age groups may lead to biased conclusions. Therefore, differences in character strengths between adults and adolescents should not be interpreted exclusively in terms of differences on the trait level, as these differences may be caused by nonequivalent questionnaires.

The study of character strengths originated more than 20 years ago from the cross talk between the emerging fields of positive youth development and positive psychology. At the beginning of this research, it soon became clear that two questions needed to be answered, namely "how can one define the concepts of 'strength' and 'highest potential' and how can one tell that a positive youth development program has succeeded in meeting its goals?" (Peterson & Seligman, 2004, p. v).
The first question was answered through the development of the Values in Action (VIA) classification of character strengths and virtues. The VIA classification distinguishes 24 strengths, each of which is assigned to one of six core virtues (i.e., wisdom and knowledge, courage, humanity, justice, temperance, and transcendence). To answer the second question, self-report scales for each of the 24 strengths were developed based on the VIA classification. In the beginning, different versions of the survey for adults (i.e., 18 years and above), the VIA Inventory of Strengths (VIA-IS), were designed and evaluated. The final version of the VIA-IS consisted of 240 items; the scales correlated substantially with self-nominations (>.50), were stable over a 4-month period (>.70), and only a few scales had a marginal correlation with social desirability (Peterson et al., 2005a). Since then, many translations into other languages have been undertaken, more than a million people have completed the scale, and more than 350 journal articles using the scale have been published (Ruch & Stahlmann, 2020). In many of these studies, the criterion-related validity (e.g., well-being; Bruna et al., 2019; health; Proyer et al., 2013) and evidence for measurement invariance (e.g., with regard to different cultures; McGrath, 2016) of the VIA-IS could be demonstrated, whereas studies on the factorial validity provided mixed results (for an overview, see Feraco et al., 2021).
As the VIA project focused on building a bridge between the emerging field of positive psychology and its application in youth development, it is not surprising that an inventory for young people (aged 10-17) followed soon after: the VIA Inventory of Strengths for Youth (VIA-Youth; Park & Peterson, 2006). Initially, separate inventories for preadolescents and adolescents were created by adapting items from the adult survey and phrasing them in a developmentally appropriate way, but they were later merged into one. The final inventory for adolescents presented by Park and Peterson (2006) contained 198 items. Park and Peterson (2006) presented results from different US samples and reported reliability estimates of α > .70 for all 24 scales and six-month test-retest correlations between .46 and .71. In addition, several studies provided evidence for the criterion-related validity (e.g., well-being; Ruch et al., 2014; academic achievement; Wagner et al., 2020) and the measurement invariance (e.g., with regard to different cultures; McGrath & Walker, 2016) of the VIA-Youth, whereas its factorial validity seems to differ from that of the VIA-IS (e.g., McGrath & Walker, 2016).
In general, with the VIA-IS and the VIA-Youth, two psychometrically sound and valid measurements of character strengths are available to research as well as to practice. Although both measurements have their different domains of applicability (i.e., adolescents vs. adults), they are implicitly expected to yield comparable results to generalize findings across the entire life span. However, and in contrast to other personality measures (for a comparison of different personality measures for adults, see Pace & Brannick, 2010), no comparison of the two was undertaken conceptually or empirically, although discrepancies in results have become more apparent over time, as will be discussed below, calling into question the assumed equivalency of the questionnaires.

Comparison of the VIA-IS and the VIA-Youth
Indications on the conceptual level for equivalence and nonequivalence
From a conceptual perspective, several factors suggest that the VIA-IS and the VIA-Youth will yield similar results, whereas other factors may make them differ. First, both inventories are based on the same classification of 24 strengths. Second, there is some overlap in the item substance; Peterson and Seligman (2004) wrote that they "created separate inventories for preadolescents and adolescents by adapting items from the adult survey and phrasing them in what [they] thought were developmentally appropriate ways" (p. 634). If the substance of many items was preserved in the reformulation, the two versions should be more similar. Third, both versions apply the same answer format (5-point Likert scale) and word the items in an extreme fashion ("I always … "), thereby anchoring the upper end of the scale. This is also seen as a prerequisite for comparing across strengths, that is, for rank ordering strengths from top to bottom within an individual.
Other factors will make the two inventories different. First, the three points above that can make the inventories similar also have the potential to make them dissimilar if they fail. Second, contrary to the VIA-IS, the VIA-Youth has inverted items, which might increase the variability between the two measures because of reduced acquiescence effects. Also, impulsivity, a lower reading level, and time urgency might lead respondents to overlook the negation in the VIA-Youth items. Similar issues can occur with the polarity of the response scales. While the scale in the VIA-IS ranges from 1 = very much unlike me to 5 = very much like me, the opposite is true for the VIA-Youth (i.e., 1 = very much like me to 5 = very much unlike me). Third, the number of items and thus the time required to complete the questionnaires varies slightly: the VIA-IS contains 240 items, whereas the VIA-Youth contains only 198 items.

Empirical indications for the nonequivalence
The authors of the character strengths questionnaires never claimed that the two versions are strictly parallel, and the two appear to belong to different domains in the literature. Although differences, for example, in the expression of character strengths between adults and adolescents were discussed in the article that introduced the VIA-Youth (Park & Peterson, 2006), no reference to the measurement specifics of the VIA-IS was made in this article; and to our knowledge no study exists that tests the same hypothesis with the VIA-Youth (i.e., among adolescents) and the VIA-IS (i.e., among adults). Nevertheless, findings based on adolescents and adults, and especially discrepancies between these two groups, are repeatedly discussed comparatively in the literature.
For example, the first criterion defining character strength states that a strength "contributes to various fulfillments that constitute the good life, for oneself and for others" (Peterson & Seligman, 2004, p. 17). Therefore, the 24 strengths of the VIA-Youth should predict well-being during childhood and adolescence, and the VIA-IS during adulthood, using a life satisfaction scale for students (e.g., SLSS; Huebner, 1991) and adults (e.g., SWLS; Diener et al., 1985), respectively. However, this criterion reveals a crucial discrepancy between the two questionnaires. Bruna et al. (2019) included 28-30 studies in their meta-analysis regarding the relation between character strengths and well-being in adults. For the sake of comparison and as no such analysis exists for adolescents, we used Goh et al.'s (2016) approach to meta-analyze the relation between character strengths and well-being across two published studies (Blanca et al., 2018; Ruch et al., 2014) and seven unpublished data sets from our own lab (see Supplement Table S7).² According to Bruna et al. (2019), the highest five correlations for the VIA-IS were hope (.56), zest (.52), love (.44), gratitude (.44), and curiosity (.40). According to our data, the highest five coefficients found for the VIA-Youth were love (.49), hope (.49), gratitude (.48), zest (.45), and social intelligence (.33). Despite some other discrepancies, curiosity was only weakly correlated with life satisfaction (.14) in our data. Thus, curiosity was not among the happiness strengths and actually ranked in the last places in contrast to Bruna et al. (2019; see Supplement Table S7). This observation that VIA-IS curiosity correlates well with life satisfaction in adults (see also Park et al., 2004), but VIA-Youth curiosity does not (see also Park & Peterson, 2006; Ruch et al., 2014), is often mentioned in the literature (e.g., Peterson & Park, 2009; Ruch et al., 2014) without a conclusive explanation. In fact, it is unclear whether this finding is caused by differences in the samples (adults vs. adolescents) or whether differences in the assessment of life satisfaction (SWLS; SLSS) or in the assessment of character strengths (VIA-IS, VIA-Youth) are the reasons. If this discrepancy is indeed due to differences in the samples, then this would not only have consequences for the validity of character strengths (i.e., a different correlation pattern indicates a different criterion-related validity of character strengths for adults and adolescents) but would also have an impact on the development of intervention programs for adolescents in educational contexts (i.e., interventions usually target those character strengths associated with well-being, which would mean that programs for adults but not for adolescents should include curiosity).
² Please note that a systematic literature review was outside the scope of the present study. The results presented here serve to illustrate the discrepancy often described in the literature regarding the relation between curiosity and life satisfaction. However, the results do not replace a systematic meta-analysis and should be interpreted with caution.
Furthermore, Park and Peterson (2006) reported different highest strengths for adults and adolescents. In detail, they found a rank correlation of .53 between the order of character strengths elevation of adults (based on Park et al., 2004) and adolescents. In addition, they emphasized that appreciation of beauty and excellence, honesty, leadership, and open-mindedness³ were more elevated among adults than adolescents, whereas hope, teamwork, and zest were more elevated among adolescents than adults. Likewise, Heintz and Ruch (2021) presented a meta-analysis (47 samples with a total N of 1,098,748) that investigated cross-sectional age differences in the 24 character strengths. Although they did not compare age differences between adolescents and adults because of the different questionnaires used in the respective age groups (i.e., VIA-Youth for adolescents and VIA-IS for adults), a closer inspection of the relative elevation reveals some unusually large gaps exactly at the age at which the questionnaire for adolescents (up to 17 years) is replaced by the questionnaire for adults (from 18 years). This gap indicated sometimes a higher (e.g., curiosity, leadership, fairness) and sometimes a lower (e.g., bravery, teamwork, gratitude) relative elevation among adults compared to adolescents (see their Figure 2). However, as Heintz and Ruch (2021) correctly emphasized, in order to interpret these differences in terms of developmental trajectories, comparable questionnaires are required, as it cannot be ruled out that the questionnaires measure character strengths differently. This issue is especially important for the assessment of the signature strengths (i.e., the character strengths that represent the core of a person's identity), which are used, for example, for vocational and career counseling. If both questionnaires measure character strengths differently, then in the worst-case, albeit somewhat constructed, scenario, a person whose character strengths were assessed shortly before the 18th birthday could receive a different career recommendation than if the character strengths were assessed shortly after the 18th birthday.
In summary, the adult and adolescent versions of the VIA questionnaires are expected to yield comparable results, but there are several findings in the literature that indicate either true differences in character strengths between adults and adolescents or non-comparable measures. As age span and test forms (VIA-IS vs. VIA-Youth) were confounded in previous studies, it is not clear whether age differences or nonequivalent test forms are the reason for these differences. In addition, it should be noted that one of the most important comparisons with regard to the construct validity of both questionnaires has not been made so far, namely how highly the homologous scales of the character strengths questionnaires actually intercorrelate (i.e., whether they measure the same construct). For example, a meta-analysis comparing the convergent validity of questionnaires for adults regarding the Five Factor model of personality demonstrated only a moderate correlation (r_mean ≈ .60), which was interpreted as insufficient (Pace & Brannick, 2010). To extend these findings to character strengths questionnaires and, thus, to determine the extent to which the results found for one questionnaire would also be valid for the other, both character strengths questionnaires and the same life satisfaction measures were assessed in the same sample, that is, a sample for which both instruments are applicable (e.g., 18-year-olds). If the difference still emerges, then it is, at least in part, due to the forms of the character strengths questionnaires.

The present study
The present study aims at empirically comparing the assessment of 24 character strengths with the VIA-IS and the VIA-Youth based on the same sample. Thus, we want to shed light on whether potential differences in the assessment of character strengths are small enough to allow the results to be compared across questionnaires and, thus, across age groups. If the VIA-IS and the VIA-Youth are sufficiently equivalent, then only negligible differences should be observed.
We have considered the following criteria regarding the equivalence of the questionnaires based on commonly used criteria of test construction and validation (e.g., Flake et al., 2017; Ziegler, 2014).⁴ First, we examined the descriptive statistics of both questionnaires (i.e., mean scale scores, internal consistency, and signature strengths), in particular with regard to whether the differences between adolescents and adults are due to the questionnaires used in the respective age group. Second, and for the first time in character strength research, we examined the construct validity in terms of convergent and discriminant correlations of the character strengths scales from both questionnaires. Third, we compared the associations between character strengths and well-being with regard to the potentially different criterion-related validity of character strengths in adolescents and adults (e.g., Peterson & Park, 2009).
³ In the following we use the term open-mindedness, which is sometimes referred to as judgment. Please also note that we use the term modesty in the present study, which is sometimes referred to as humility.
⁴ Measurement invariance (e.g., Meredith, 1993) is commonly used to test the assumption of constant measurement features of one questionnaire (or parallel questionnaires) in different groups (e.g., with respect to age or language). In the present study, however, two different questionnaires (i.e., different number of items, non-corresponding item contents, etc.) are investigated on a single sample with regard to their equivalence. Therefore, the traditional invariance approach (e.g., configural, metric, scalar equivalence) cannot be applied based on different measurements (Tyrell et al., 2019). Alternative approaches to investigate measurement invariance based on different measurements rely on scale scores instead of single items (McGrath & Walker, 2016) or item parcels (e.g., Tyrell et al., 2019), which in turn are themselves subject to criticism (e.g., Marsh et al., 2013; Meade & Kroustalis, 2006). This is especially relevant for the questionnaires used here, as previous research has emphasized challenges with the measurement models of the respective questionnaires (e.g., Ng et al., 2017). With this in mind, these alternative approaches to testing measurement invariance based on different questionnaires have not been considered in the present study.
In order to draw conclusions about the equivalence of the VIA-IS and the VIA-Youth, age effects and the effects of the measurements of well-being have to be ruled out. Therefore, and with regard to the former, participants aged about 18 years took part in the present study, as this is an age at which both character strengths questionnaires can be applied. In addition, the homogeneous age means that no developmental trends can be present in the data. Regarding the influence of well-being measurements, different questionnaires were considered, which are commonly used in the research field of character strengths and which were developed either specifically for adults or for adolescents. This approach makes it possible to investigate whether the differences in the associations between character strengths and well-being reported in the literature are due to different and age-specific measurements of well-being. The present study focuses on self-reported measurements of character strengths and well-being. Self-reports are the most common assessment method within the research field of character strengths. Therefore, the findings obtained in the present study can be related to those of previous studies. However, self-reports are prone to several biases (e.g., socially desirable answers, answer tendencies, response styles), which could influence the association between character strengths and well-being (McCrae, 1982). Therefore, we also included informant-reports in our study and additionally present analyses based on both forms to investigate whether the results change when information from different sources is considered. Furthermore, as this is the first study in which a comparison of both questionnaires is conducted, we did not formulate specific expectations regarding differences, but consider the present study as exploratory (see also de Groot, 2014).

Participants
Self-rating sample
We asked 175 students to participate in our study. Three participants did not provide the permission of their parents and, thus, were excluded from the study. Furthermore, two participants did not complete the character strengths questionnaires and were excluded as well. The final size of the self-rating sample was N = 170. Participants were primarily high school students (97.60%). The average age was 18.30 years (SD = 0.68, Min = 17, Max = 20) and 77.60% of the participants were female.

Character strengths
The VIA Inventory of Strengths (VIA-IS; Peterson et al., 2005a) consists of 240 items for the self-report assessment of the 24 character strengths (10 items per strength) in adults (18 years and above). All items of the VIA-IS are positively worded and have to be answered using a 5-point Likert-style format (i.e., 1 = very much unlike me; 5 = very much like me). Scale scores were computed by calculating the means of the respective items. We used the German version of the VIA-IS, which showed good reliabilities for all 24 scales (i.e., Cronbach's α_Mdn = .77; test-retest reliability for 3 months: r_Mdn = .85).
The German version (Ruch et al., 2014) of the VIA Inventory of Strengths for Youth (VIA-Youth; Park & Peterson, 2006) was additionally applied. The VIA-Youth was developed to assess the 24 character strengths in adolescents (10 to 17 years). It consists of 198 items (i.e., 7-9 items per strength) and about one third of the items are negatively worded. All items have to be answered using a 5-point Likert-style format (i.e., 1 = very much like me; 5 = very much unlike me). For the present study, the response scale of the VIA-Youth was reverse coded in order to have the same orientation as the VIA-IS. Scale scores were computed by calculating the means of the respective items after inverting the negatively worded items. Ruch et al. (2014) reported good reliabilities of the German version of the VIA-Youth (i.e., Cronbach's α_Mdn = .77; test-retest reliability for 4 months: r_Mdn = .72).
In addition to the self-rating form, an informant-rating form of both measurements was applied. The informant-rating forms were identical to the self-rating questionnaires, but all items were rephrased for informant-evaluations. The same answer format was used but with rephrased categories (e.g., 1 = very much unlike him/her). The informant-rating versions of the measurements demonstrated good reliabilities in previous studies: Cronbach's α_Mdn = .81 and .80 (Ruch et al., 2014) for the VIA-IS and the VIA-Youth, respectively.
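The two recoding steps described for the VIA-Youth (reversing the response scale, then inverting negatively worded items) can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in R); the function names and the three example responses are hypothetical and serve only to show the logic.

```python
from statistics import mean


def reverse_code(response, scale_min=1, scale_max=5):
    """Mirror a Likert response around the scale midpoint (1 <-> 5, 2 <-> 4)."""
    return scale_min + scale_max - response


def via_youth_scale_score(responses, negatively_worded):
    """Mean item score after aligning all items with the VIA-IS orientation.

    responses: raw answers on the original VIA-Youth scale
               (1 = very much like me, 5 = very much unlike me).
    negatively_worded: one boolean per item (hypothetical item key).
    """
    aligned = []
    for raw, is_negative in zip(responses, negatively_worded):
        score = reverse_code(raw)        # step 1: flip the response scale
        if is_negative:
            score = reverse_code(score)  # step 2: invert negatively worded items
        aligned.append(score)
    return mean(aligned)


# Hypothetical answers to a three-item scale; the third item is negatively worded.
example_score = via_youth_scale_score([1, 2, 4], [False, False, True])
print(round(example_score, 2))
```

Note that for a negatively worded item the two reversals cancel out, so its raw response already has the same orientation as a VIA-IS response.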

Well-Being
The Satisfaction with Life Scale (SWLS; Diener et al., 1985) in the German version is a global measure of satisfaction with life in adults. The scale consists of five positively worded items, which have to be answered using a 7-point Likert-style format (i.e., 1 = strongly disagree, 7 = strongly agree). All items were averaged to calculate a scale score. Previous studies demonstrated a high reliability of the German version of the SWLS (e.g., Cronbach's α = .86).
The Students' Life Satisfaction Scale (SLSS; Huebner, 1991) adapted to German by Weber et al. (2013) is a seven-item measure of global satisfaction with life in youth. Its items are positively worded and it uses a 6-point Likert-style format (i.e., 1 = strongly disagree, 6 = strongly agree). All items were averaged to calculate a global score. The German version of the SLSS showed good reliabilities in previous studies (e.g., Cronbach's α = .88; test-retest reliability for 4 months r = .55; Weber et al., 2013).
The German version (e.g., Weber et al., 2013) of the Brief Multidimensional Students' Life Satisfaction Scale (BMSLSS; Seligson et al., 2003) was applied to measure adolescents' satisfaction with five specific topics (i.e., family, relationships, school experiences, self, and living environment) as well as overall life satisfaction. The six items (i.e., the original five topic-specific items and the additional overall item) are positively worded and use a 7-point Likert-style format (i.e., 1 = extremely unsatisfied, 7 = extremely satisfied). All items were averaged in order to calculate a general satisfaction with life score. Reliabilities of the German version of the BMSLSS (i.e., without the overall life satisfaction item) can be considered as good (Cronbach's α = .75; Weber et al., 2013).
The Orientation to Happiness scale (OTH; Peterson et al., 2005b) in the German version was applied. The OTH scale is an 18-item questionnaire for the subjective assessment in adults of the life of pleasure, the life of engagement, and the life of meaning (six items each). All items are positively worded and have to be answered using a 5-point Likert-style format (i.e., 1 = very much like me; 5 = very much unlike me). The reliability of the German version was acceptable in previous studies: Cronbach's α ≥ .63, test-retest reliability for 3 months r ≥ .63.
Furthermore, informant-report forms of these four questionnaires were applied. The informant-report forms were identical to the self-report forms except that the items and response scales were formulated in the third person singular. Previous studies provided evidence regarding the reliability and validity of some of these informant-report forms in German or English (e.g., SWLS: Pavot & Diener, 2009; OTH: Wagner et al., 2019).

Procedure
Data were collected in schools in Switzerland and in Germany. An informed instructor introduced the study to all participants directly in the classrooms. Participants completed the questionnaires at home and within one week. The questionnaires were divided into two sessions, each lasting about 30-45 minutes. The order of presentation of the VIA questionnaires was randomized (i.e., half of the participants first completed the VIA-IS and then the VIA-Youth, and the other half first completed the VIA-Youth and then the VIA-IS). All participants received written individualized feedback on their character strengths and additional information on the meaning of each of the strengths and the VIA classification. All participants attended voluntarily and without remuneration. Additionally, all participants younger than 18 years provided the permission of their parents or legal guardians. According to the local university guidelines, no ethics approval was required for this study.

Statistical analysis
All analyses were performed once with the self-rating sample (N = 170) and once with the combined sample of self-ratings and informant-ratings. With regard to the latter, all participants were considered for whom self-rating data and data from at least one informant-rating were available (N = 166). The self-ratings and informant-ratings were aggregated into one global score (see Supplement Table S2 and Supplement Table S4 for the correlations between self-ratings and informant-ratings for the VIA questionnaires and the well-being measures, respectively). As averaging across multiple raters increases trait variance and decreases method variance (e.g., Chang et al., 2012), the findings based on the combined scores provide potentially less biased insights, in particular regarding the association with well-being.⁵
In the first step, we inspected the descriptive statistics of both character strengths questionnaires. In detail, we examined whether the average scale score for each character strength was similar between the VIA-IS and the VIA-Youth. We considered a mean difference (i.e., Cohen's |d_z| for correlated measurements; see, e.g., formula 6 in Lakens, 2013) of 0.22 as substantial, as it approximately corresponds to a small effect according to Gignac and Szodorai (2016).⁶ Following an equivalence testing approach with regard to mean differences (Lakens, 2017), we deemed a difference as statistically significant if the 90% confidence interval included the critical effect size and the 95% confidence interval did not include zero. An undetermined difference implies that the sample size was not large enough to determine whether the mean difference was statistically different or not. Given the sample size, the critical difference of |d_z| = .22, and α = .05, a sensitivity analysis revealed a statistical power for each equivalence test of 0.80 for the self-rating sample and 0.79 for the combined sample. Furthermore, McDonald's ω was calculated as a reliability measure of the scale scores (see Kelley & Pornprasertmanit, 2016).
⁵ Although it is interesting to investigate the informant-report form separately (e.g., Buschor et al., 2013), we have decided to use the aggregation approach. In our view, this approach is more suitable for answering our research question (i.e., reducing shared method variance due to aggregation instead of relying on another source of shared method variance; see also Wittmann, 1988).
⁶ Due to a lack of standards for interpreting effect sizes in the context of comparability of measurement instruments, we used Gignac and Szodorai (2016).
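As an illustration of the effect size used in this step, the following Python sketch computes Cohen's d_z for paired scale scores as the mean of the differences divided by their standard deviation. The percentile bootstrap interval is an assumption made for this sketch; the text does not state how the confidence intervals for the mean differences were obtained, and the decision rule itself is not implemented here. The scale scores are hypothetical.

```python
import random
from statistics import mean, stdev

CRITICAL_DZ = 0.22  # threshold for a substantial difference, as stated in the text


def cohens_dz(x, y):
    """Standardized mean difference for paired scores: mean(diff) / sd(diff)."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / stdev(diffs)


def bootstrap_ci(x, y, level=0.95, draws=2000, seed=1):
    """Percentile bootstrap interval for d_z (an illustrative choice)."""
    rng = random.Random(seed)
    pairs = list(zip(x, y))
    estimates = []
    for _ in range(draws):
        sample = [rng.choice(pairs) for _ in pairs]  # resample persons
        diffs = [a - b for a, b in sample]
        if len(set(diffs)) > 1:  # skip resamples without any variance
            xs, ys = zip(*sample)
            estimates.append(cohens_dz(xs, ys))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper


# Hypothetical scale scores of six persons on one strength.
via_is = [4.0, 3.5, 4.25, 3.75, 4.5, 3.0]
via_youth = [3.5, 3.5, 3.75, 3.5, 4.0, 3.25]

dz = cohens_dz(via_is, via_youth)
ci_95 = bootstrap_ci(via_is, via_youth, level=0.95)
print(round(dz, 2), abs(dz) >= CRITICAL_DZ, ci_95)
```

With real data, each of the 24 strengths would be processed in this way, once for the self-ratings and once for the combined scores.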
In addition to the comparison of the average scale scores, we investigated the signature strengths (i.e., the character strengths with the highest scale score; e.g., McGrath & Wallace, 2021) of both character strengths questionnaires. As tied scale scores can be a challenge in identifying a person's signature strengths (e.g., Blanchard et al., 2019), we applied a more liberal approach and classified all character strengths with the five⁷ highest scale scores as signature strengths. We then counted how often each character strength was a signature strength across all participants based on bootstrapped rank order positions. Differences regarding signature strengths between the questionnaires were examined based on bootstrapped 95% confidence intervals. In addition, we compared per person how often the signature strengths in the VIA-IS were also signature strengths in the VIA-Youth.
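The liberal classification with ties and the per-person overlap can be sketched in Python as follows. The function names and the scores of the seven strengths are hypothetical, and the bootstrapped rank order positions mentioned above are not reproduced here.

```python
def signature_strengths(scores, k=5):
    """All strengths scoring at least as high as the k-th highest score (ties included)."""
    cutoff = sorted(scores.values(), reverse=True)[k - 1]
    return {name for name, value in scores.items() if value >= cutoff}


def signature_overlap(scores_a, scores_b, k=5):
    """Number of strengths that are signature strengths in both questionnaires."""
    return len(signature_strengths(scores_a, k) & signature_strengths(scores_b, k))


# Hypothetical scale scores of one person (only seven strengths for brevity).
via_is = {"hope": 4.5, "zest": 4.4, "love": 4.3, "gratitude": 4.2,
          "curiosity": 4.1, "humor": 4.1, "fairness": 3.0}
via_youth = {"hope": 4.4, "zest": 4.0, "love": 4.6, "gratitude": 4.5,
             "curiosity": 3.2, "humor": 4.2, "fairness": 4.1}

print(sorted(signature_strengths(via_is)))    # six strengths because of the tie at 4.1
print(signature_overlap(via_is, via_youth))   # strengths shared by both top lists
```

In this example, curiosity and humor share the fifth-highest VIA-IS score, so six signature strengths are identified for this person.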
In the second step, we examined the construct validity of the character strengths by means of a classical multitrait-multimethod (MTMM) analysis (Campbell & Fiske, 1959). As a strict test for construct validity, the lowest convergent validity coefficient (i.e., monotrait-heteromethod correlations) should be higher in value than the highest discriminant validity coefficient (i.e., heterotrait-heteromethod correlations). Furthermore, none of the correlations between the character strengths within each measurement (i.e., heterotrait-monomethod correlations) should be higher than the lowest convergent validity coefficient. In addition to this global test, the convergent and discriminant correlations were separately examined for each character strength.
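The global MTMM test can be expressed compactly. The following Python sketch, with hypothetical function names and toy data for two strengths, compares the lowest convergent correlation with the highest heterotrait-heteromethod and heterotrait-monomethod correlations. It uses uncorrected Pearson correlations and is meant only to make the comparison rule explicit.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mtmm_summary(method_a, method_b):
    """Summarize the strict MTMM test for two methods measuring the same traits."""
    traits = list(method_a)
    # Monotrait-heteromethod: same strength, different questionnaire.
    convergent = [pearson(method_a[t], method_b[t]) for t in traits]
    # Heterotrait-heteromethod: different strengths, different questionnaires.
    hetero_hetero = [pearson(method_a[s], method_b[t])
                     for s in traits for t in traits if s != t]
    # Heterotrait-monomethod: different strengths within one questionnaire.
    hetero_mono = [pearson(m[s], m[t])
                   for m in (method_a, method_b)
                   for s, t in combinations(traits, 2)]
    lowest = min(convergent)
    return {
        "lowest_convergent": lowest,
        "highest_heterotrait_heteromethod": max(hetero_hetero),
        "highest_heterotrait_monomethod": max(hetero_mono),
        "passes_strict_test": lowest > max(hetero_hetero) and lowest > max(hetero_mono),
    }


# Toy data: scores of five persons on two strengths in each questionnaire.
via_is = {"hope": [1, 2, 3, 4, 5], "humor": [4, 1, 5, 1, 4]}
via_youth = {"hope": [1, 2, 3, 4, 5], "humor": [4, 1, 5, 1, 4]}

summary = mtmm_summary(via_is, via_youth)
print(summary["passes_strict_test"])
```

With 24 strengths, the same function would compare 24 convergent correlations against 552 heterotrait-heteromethod and 552 heterotrait-monomethod correlations.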
In the final step, we investigated the association between character strengths and well-being in terms of criterion validity. As the different well-being measurements had different scales (i.e., 5-point vs. 7-point Likert-style format), all scale scores were transformed into percent of maximum possible (POMP) scores (Cohen et al., 1999) to facilitate comparability of the descriptive statistics. McDonald's ω was calculated to estimate the reliability after examining the structural validity (i.e., unidimensionality) of the well-being measures (not presented in detail). The correlations between character strengths scales and the well-being scales were evaluated based on Pearson's correlation coefficients. With regard to the difference of correlations between well-being and character strengths measured either with the VIA-IS or with the VIA-Youth, we calculated for each character strength difference scores of correlations (based on the Fisher z-transformation) and their confidence intervals based on Wilcox' (2016) bootstrap approach. Following an equivalence testing approach with regard to differences in correlations (Lakens, 2017), we deemed a difference as significant if the 90% confidence interval included the critical effect size and the 95% confidence interval did not include zero. We considered a difference of the correlations of |Δr| = .22 to be significant. This effect size, which approximately corresponds to a medium effect according to Gignac and Szodorai (2016), allows the differences to be examined statistically with a statistical power of 0.80 for both samples (based on a sensitivity analysis given the sample size, the critical difference |Δr|, and α = .05). As a smaller effect size would result in less statistical power for the given sample size, we considered this effect size to be appropriate.
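Two of the transformations named in this step have simple closed forms, sketched here in Python with hypothetical function names. POMP scores rescale a score to the range 0 to 100 given the scale endpoints, and the Fisher z-transformation is the inverse hyperbolic tangent of a correlation. The bootstrap confidence intervals are not reproduced in this sketch.

```python
from math import atanh


def pomp(score, scale_min, scale_max):
    """Percent of maximum possible score: 0 at the scale minimum, 100 at the maximum."""
    return 100 * (score - scale_min) / (scale_max - scale_min)


def fisher_z_difference(r_a, r_b):
    """Difference of two correlations after the Fisher z-transformation."""
    return atanh(r_a) - atanh(r_b)


# The midpoint of a 7-point scale and of a 5-point scale map to the same POMP score.
print(pomp(4, 1, 7), pomp(3, 1, 5))

# Illustrative input only: the curiosity correlations quoted in the introduction
# (.40 for the VIA-IS, .14 for the VIA-Youth), which stem from different samples.
print(round(fisher_z_difference(0.40, 0.14), 3))
```

Because POMP scores are a linear transformation, they change the descriptive statistics but leave correlations with other variables untouched.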
The maximum percentage of missing data (i.e., across all items per person) was 4.1% and 8.4% for the self-rating sample and the informant-rating sample, respectively. Missing data were imputed on item level based on polytomous regression. In order to control for measurement error, all correlation coefficients including the confidence intervals (see Padilla & Veprinsky, 2012) were corrected for attenuation based on Spearman's (1904) formula. Bootstrapped analyses were conducted with 10,000 draws. For all analyses, we used the software R (Version 3.6.2; R Core Team, 2019). 8
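Spearman's correction for attenuation divides the observed correlation by the square root of the product of the two reliabilities. The following sketch shows the formula; the function name is hypothetical, and applying the same division to confidence interval bounds is only one simple option whose correspondence to the procedure used in the study is an assumption.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's (1904) correction: r_xy / sqrt(rel_x * rel_y).

    rel_x and rel_y are reliability estimates (e.g., McDonald's omega).
    The result is not clipped, so it can reach or exceed 1.00 when the
    observed correlation is close to the reliability ceiling.
    """
    return r_xy / math.sqrt(rel_x * rel_y)


# Example: an observed correlation of .50 between two scales that each
# have a reliability of .80.
corrected = disattenuate(0.50, 0.80, 0.80)
```

Because the correction grows as reliabilities shrink, corrected coefficients for the less reliable scales should be read with particular caution.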

Descriptive statistics
Self-rating sample
Supplement Table S1 displays the descriptive statistics of each character strength, the mean differences of the scale scores of the VIA-IS and the VIA-Youth, and the reliability estimations. According to the equivalence testing approach, 18 of the 24 scales had a different average scale score.
The median reliability (ω_Mdn) of the VIA-IS and the VIA-Youth scales was .82 (ω_Min = .60, ω_Max = .92) and .72 (ω_Min = .60, ω_Max = .90), respectively. Reliability was comparable for most of the scales (i.e., Δω_Mdn = .03).

Combined self- and informant-rating sample
Descriptive statistics and the reliability estimations are displayed in Supplement Table S2. Similar to the self-rating sample, 19 of the 24 VIA-IS scales had a substantially different mean scale score compared to the VIA-Youth scales.
The reliabilities were higher compared to the self-rating sample. The median reliability of the VIA-IS and the VIA-Youth scales was .85 (ω_Min = .72, ω_Max = .96) and .82 (ω_Min = .75, ω_Max = .94), respectively. Reliabilities were slightly more similar between the VIA-IS and the VIA-Youth scales compared to the self-rating sample (i.e., Δω_Mdn = .02).

7 The pattern of findings reported here does not change when a different number of signature strengths (i.e., 3 or 7) was considered. With regard to the tied scale scores, it should be noted that more than 5 signature strengths could be identified per person. For example, if the fifth and sixth highest character strength had the same scale score, then 6 signature strengths were identified for this person.
8 We, furthermore, used the R-packages boot (Version 1.3.23; Davison & Hinkley, 1997)

Signature strengths
Self-rating sample
In the next step, we investigated the signature strengths (i.e., the five character strengths with the highest scale scores) in both questionnaires. Figure 1 displays the relative frequency for each character strength separately for the VIA-IS and VIA-Youth. As can be seen, there are substantial differences: The character strengths of zest, social intelligence, fairness, and leadership were more often a signature strength in the VIA-IS than in the VIA-Youth. The reverse was true for bravery, forgiveness, appreciation of beauty and excellence, gratitude, and hope, which were more often a signature strength in the VIA-Youth than in the VIA-IS. Furthermore, we examined how often the same five signature strengths were identified in the VIA-IS and the VIA-Youth. The signature strengths in the VIA-IS were the same as in the VIA-Youth for only 5.29% of the participants. If one or two discrepancies were allowed, the percentage increased to 31.18% and 73.53%, respectively.
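The identification of signature strengths, including the handling of tied scale scores, can be sketched as follows. The function names are hypothetical, and the way discrepancies between two profiles are counted (the larger number of strengths found in one set but not in the other) is an assumption, since the exact counting rule is not spelled out in the text.

```python
def signature_strengths(scores, k=5):
    """Return the k highest-scoring strengths as a set.

    scores maps strength names to scale scores. Strengths tied with the
    k-th highest score are all included, so the set can exceed k members.
    """
    ranked = sorted(scores.values(), reverse=True)
    cutoff = ranked[k - 1]
    return {name for name, value in scores.items() if value >= cutoff}


def n_discrepancies(sig_a, sig_b):
    """Number of mismatches between two signature-strength sets
    (assumed rule: the larger of the two one-sided set differences)."""
    return max(len(sig_a - sig_b), len(sig_b - sig_a))


# Toy profiles for one person on two questionnaires (made-up values).
profile_adult = {"zest": 5.0, "hope": 4.8, "love": 4.5, "humor": 4.2,
                 "fairness": 4.0, "bravery": 4.0, "prudence": 3.0}
profile_youth = {"zest": 5.0, "hope": 4.8, "love": 4.5, "humor": 4.2,
                 "fairness": 3.9, "bravery": 4.0, "prudence": 3.0}

sig_adult = signature_strengths(profile_adult)  # tie at rank 5: six strengths
sig_youth = signature_strengths(profile_youth)  # exactly five strengths
mismatches = n_discrepancies(sig_adult, sig_youth)
```

In the toy data, a difference of 0.1 scale points on a single strength is enough to change the signature set, which illustrates why a rank-based cutoff is sensitive to small score differences between instruments.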

Combined self- and informant-rating sample
The character strengths of curiosity, open-mindedness, social intelligence, and fairness were more often identified as signature strengths in the VIA-IS than in the VIA-Youth (i.e., the difference found in the self-rating sample for zest and leadership did not emerge here, whereas a difference for curiosity and open-mindedness did). Similar to the findings based on the self-rating sample, the character strengths of forgiveness, appreciation of beauty and excellence, gratitude, and hope were more often identified as signature strengths in the VIA-Youth than in the VIA-IS (i.e., the difference found in the self-rating sample for bravery did not emerge here). Finally, the signature strengths of the VIA-IS were the same as in the VIA-Youth in only 1.81% of cases. Allowing for one or two discrepancies increased the percentage to 27.71% and 64.46%, respectively.

Multitrait-Multimethod (MTMM) analysis
The complete 48 × 48 MTMM matrices among the 24 character strengths from the VIA-IS and the VIA-Youth for both samples can be found as an online supplement: https://osf.io/m7xg9/.

Self-rating sample
The median convergent correlation (monotrait-heteromethod correlations) across all character strengths was substantial with r_Mdn = .89 (r_Min = .63 for self-regulation, r_Max = .97 for fairness). The median discriminant correlation (heterotrait-heteromethod correlations) was substantially lower with r_Mdn = .21 (r_Min = −.34, r_Max = .80). As 16 discriminant correlations were numerically higher than the lowest convergent validity correlation, the strict test of construct validity failed. Table 1 summarizes the convergent and discriminant correlations per character strength. Most of the character strengths showed evidence of construct validity (i.e., convergent validity coefficients were substantially higher than discriminant validity coefficients). However, for seven character strengths this was not the case: curiosity, perspective, zest, teamwork, leadership, self-regulation, and hope. They showed a higher discriminant correlation with character strengths either from the other measurement (heterotrait-heteromethod correlations) or within the same measurement (heterotrait-monomethod correlations) compared to the convergent correlation (monotrait-heteromethod correlation). For example, the leadership scale of the VIA-IS and the VIA-Youth had a convergent correlation of r = .66. The highest heterotrait-heteromethod correlation for the VIA-IS leadership scale was found for the VIA-Youth fairness scale (r = .75), and for the VIA-Youth leadership scale it was the VIA-IS social intelligence scale (r = .74). Furthermore, the highest heterotrait-monomethod correlation for the VIA-IS leadership scale was found for the VIA-IS fairness scale (r = .92), and for the VIA-Youth leadership scale it was the VIA-Youth perspective scale (r = .69). Thus, the leadership scale showed several discriminant correlations that were higher than the convergent correlation.

Combined self- and informant-rating sample
The median convergent correlation across all character strengths was similar to the self-rating sample (r_Mdn = .89, r_Min = .71 for self-regulation, r_Max = 1.00 for perseverance). Although the median discriminant correlation was slightly higher compared to the self-rating sample (r_Mdn = .31, r_Min = −.37, r_Max = .84), only 12 discriminant correlations were numerically higher than the lowest convergent validity correlation.
Supplement Table S3 summarizes the convergent and discriminant correlations per character strength. Except for three (i.e., curiosity, leadership, and self-regulation), all character strengths provided evidence of construct validity (i.e., convergent validity coefficients were higher than discriminant validity coefficients).
Although the strict test of construct validity also failed in the self- and informant-rating sample, the aggregation across multiple raters led to better findings regarding the construct validity compared to the self-rating sample.

Preliminary analyses
Self-rating sample. Supplement Table S4 displays the descriptive statistics, reliability estimates, and correlations for the well-being scales. The three life satisfaction scales (i.e., SWLS, SLSS, BMSLSS) showed good or very good reliability estimates (ω ≥ .76). As mentioned in the introduction, potential differences with regard to the association between character strengths and well-being can also be caused by different measurements of well-being. However, the very high inter-scale correlation between the well-being measurements (r ≥ .94) as well as the findings of additional analyses following the approach proposed by Gignac and Kretzschmar (2017; not presented in detail) indicate that there was no evidence for unique variances of the specific well-being scales. Moreover, there was also no evidence that the association between character strengths (measured via VIA-IS or VIA-Youth) and well-being differs depending on the questionnaire used to measure life satisfaction (i.e., SWLS, SLSS, BMSLSS). 9 In summary, there were no significant differences between the life satisfaction questionnaires that could influence the association between character strengths and life satisfaction. Therefore, we present only the results regarding a single, aggregated score for life satisfaction based on the three life satisfaction scales for all further analyses.
9 In detail, we examined whether questionnaires for the same age group (e.g., adults: VIA-IS and SWLS; adolescents: VIA-Youth and SLSS, BMSLSS) show different relations compared to questionnaires for different age groups (e.g., VIA-IS and SLSS, BMSLSS; VIA-Youth and SWLS) based on an equivalence testing approach similar to that used for the main analyses.
The Orientation to Happiness (OTH) scales showed relatively low reliabilities (ω ≥ .63), which, however, are comparable to previous studies (e.g., . The correlations between the OTH subscales themselves (.00 ≤ r ≤ .34) as well as between the OTH subscales and the life satisfaction scales (.06 ≤ r ≤ .43) were low to medium. We interpret these findings as indicating that the OTH subscales capture heterogeneous and distinct aspects of well-being (Peterson et al., 2005b) and, thus, should be considered separately.
Combined self- and informant-rating sample. As can be seen in Supplement Table S5, reliability estimates were higher than in the self-rating sample (ω ≥ .83). The associations between the three life satisfaction scales (i.e., SWLS, SLSS, BMSLSS) also supported the use of an aggregated life satisfaction score (r ≥ .95). The Orientation to Happiness (OTH) scales showed higher reliability estimates (ω ≥ .68) and correlations with each other (.19 ≤ r ≤ .58) compared to the self-rating sample. The associations between the OTH scales and the life satisfaction scales (.04 ≤ r ≤ .39) were similar to those in the self-rating sample.
Association between character strengths and well-being
Self-rating sample. The correlations between the well-being scales and the character strengths are displayed in Table 2. With regard to life satisfaction and from a descriptive perspective, there were several correlations that differed between the VIA-IS and the VIA-Youth (i.e., |Δr|_Mdn = .08; .01 ≤ |Δr| ≤ .34). However, the equivalence testing approach revealed that only seven character strengths showed a substantially different association with life satisfaction: creativity, curiosity, love of learning, bravery, zest, love, and gratitude. The median difference of the correlations between the character strengths and the OTH scales was |Δr|_Mdn = .08 (.01 ≤ |Δr| ≤ .31). Based on the equivalence testing approach, there were ten character strengths that showed a substantially different correlation for at least one OTH subscale: creativity, curiosity, love of learning, bravery, kindness, fairness, forgiveness, prudence, humor, and religiousness.
In summary, 13 character strengths demonstrated a different association with at least one well-being scale depending on whether they were measured with the VIA-IS or VIA-Youth.
Combined self- and informant-rating sample. Supplement Table S6 displays the correlations between character strengths and well-being scales based on the combined sample. In general, the differences between the VIA-IS and the VIA-Youth regarding the association with life satisfaction were similar to those of the self-rating sample (i.e., |Δr|_Mdn = .07; .00 ≤ |Δr| ≤ .27). However, there was only a partial overlap with the self-rating sample with regard to the results of the equivalence testing approach. Whereas no substantially different association was found for creativity and bravery, in addition to the other five character strengths mentioned above, a significantly different correlation was also found for kindness in the self- and informant-rating sample.
With regard to the OTH scales, the median difference of the correlations was |Δr|_Mdn = .11 (.00 ≤ |Δr| ≤ .33). In addition to the ten character strengths described in the self-rating sample, honesty, social intelligence, teamwork, self-regulation, gratitude, and hope were also identified as character strengths that showed a different correlation pattern depending on the measurement instrument.
In summary, 18 character strengths had substantially different associations with at least one well-being measure depending on whether the character strength was measured with the VIA-IS or the VIA-Youth.

Discussion
The present study aimed to answer the question of whether it is justified to compare the character strengths of adults and adolescents across the two questionnaires VIA-IS (Peterson et al., 2005a) and VIA-Youth (Park & Peterson, 2006). Based on a self-report sample and a combined self- and informant-report sample, we observed many similarities between the questionnaires, but also several discrepancies regarding the psychometric characteristics of the questionnaires and the relations to well-being. As self-reports have been used in the vast majority of research, the following discussion refers only to the self-report sample, while the results of the combined self- and informant-report sample are addressed later.
(Non-)equivalence of the VIA-IS and the VIA-Youth

Descriptive statistics
About three-quarters of the character strengths scales showed different average scale scores. These findings provide evidence that a direct comparison of the relative elevation of character strengths of adults and adolescents (e.g., Park & Peterson, 2006) based on the two questionnaires is not justified. These results also provide insights into the age differences presented in the meta-analyses of Heintz and Ruch (2021). As age and measurement were necessarily confounded in their analyses, it can be concluded that potential differences between the age groups of 16-17 (i.e., using the VIA-Youth) and 18-20 (i.e., using the VIA-IS) are not only based on developmental changes, but may also be caused to a considerable extent by the different measurements per age group. As a consequence of the different average scale scores, nine character strengths were on average more often identified as a signature strength either in the VIA-IS or in the VIA-Youth. Furthermore, the exact same five signature strengths were identified in the VIA-IS and the VIA-Youth in only about 5% of cases. Although the percentage increased to almost 75% if up to two deviations were allowed, the results indicate that the identification of signature strengths varies considerably depending on the questionnaire used.
The implications of these differences have to be considered from a methodological, conceptual and practical perspective. With regard to the latter, the implications can be described by following the illustrative example from the introduction. If a person were to go to career counseling shortly before their 18th birthday, then based on the VIA-Youth, they would most likely receive a different strengths profile and, thus, a different career choice recommendation than if the person went to counseling shortly after their birthday and, thus, had completed the VIA-IS. Strictly speaking, therefore, it does not seem reasonable for only one or the other questionnaire to be used in this age range. As long as no questionnaires are available that provide comparable results across the lifespan, it would at least be a possibility to combine the information from both questionnaires in applied contexts.
From a conceptual point of view, neither an evaluation of the longitudinal development of character strengths during the transition from adolescence to adulthood nor a cross-sectional comparison of the manifestation of character strengths between adolescents and adults is appropriate, as emphasized by Heintz and Ruch (2021). To solve this problem, it would be possible to empirically determine correction factors based on representative samples, that is, to establish, for example, to what extent a given score on a character strengths scale of the VIA-Youth corresponds to a score on the same scale of the VIA-IS. From our point of view, however, it is unclear whether this effort is justified compared to a new development of equivalent questionnaires (see below).
The methodological perspective is especially relevant for the identification of the signature strengths. In the present study, we used the simplest and most commonly used approach, identifying signature strengths via the highest scale scores of the VIA-IS or VIA-Youth (e.g., Gander et al., 2013; Littman-Ovadia et al., 2017). However, as pointed out by McGrath and Wallace (2021) and others, this approach is not optimal due to tied scale scores. For example, we only classified the five highest scale scores as signature strengths, even though the difference between the lowest signature strength and the highest non-signature strength was relatively small. We therefore advise caution in using this approach to identify signature strengths based on these two questionnaires. Future research has to demonstrate whether standardized norm data based on representative samples per questionnaire, different algorithms to identify signature strengths (McGrath & Wallace, 2021), or specific questionnaires aiming to assess signature strengths (e.g., Signature Strengths Survey; McGrath, 2019) can minimize or even eliminate the differences found in the present study.

Construct and criterion validity
The multitrait-multimethod analysis revealed that convergent validity (r_Mdn = .89) between the two character strengths questionnaires is higher than the typical overlap between other personality questionnaires (see, e.g., Pace & Brannick, 2010). However, seven character strengths scales showed a lower convergent correlation compared to discriminant correlations. Furthermore, 13 character strengths showed a different association with at least one of the well-being measures in terms of criterion validity. These results demonstrate that the two character strengths questionnaires do indeed cover the same domain but are not fully comparable in terms of the construct and criterion validity of character strengths.
One of the key findings of previous research is the striking fact that curiosity is one of the character strengths most strongly associated with life satisfaction among adults, but not among adolescents. Our results demonstrate that, for the same people (i.e., with the same trait level of curiosity), the correlation between curiosity and life satisfaction differs significantly depending on whether one uses the VIA-IS or the VIA-Youth questionnaire. Therefore, we can conclude that comparative interpretations regarding the role of curiosity for life satisfaction in the different age groups are not appropriate based on the two different questionnaires. Of course, the question inevitably arises as to how these differences originate. From an explorative perspective, we therefore took a closer look at the items for curiosity in both questionnaires and their correlations with life satisfaction. In the VIA-IS, the item "I think my life is extremely interesting" showed a particularly high correlation with life satisfaction (r = .58), while the correlations of the other six items were much lower (r_Mdn = .19; .04 ≤ r ≤ .28). In the VIA-Youth, the two inverted items ("I don't have many questions about things" and "I am not curious about things") had the lowest correlations with life satisfaction (r = −.03 and .04; recoded) compared to the other six non-inverted items (r_Mdn = .20; .05 ≤ r ≤ .26). If one excludes these three items from the curiosity scale score of the VIA-IS and the VIA-Youth, respectively, the correlations between curiosity and life satisfaction are much more similar (.33 vs. .26; compared to .40 vs. .21 based on all items). However, whereas the convergent correlation between the two modified curiosity scale scores remained similarly high (.72; compared to .79 based on all items), there was a significant difference in the means of the scales (d_z = 0.29; compared to 0.08 based on all items).
The example is intended to illustrate that while it is indeed possible to optimize equivalence for certain criteria by selectively excluding items, unforeseen side effects are likely to occur at the same time. The most serious threat of such a data-driven approach is to content validity, so that in the end abbreviated measurements are created (see, e.g., Ng et al., 2017) that have to be critically questioned with regard to an appropriate representation of the constructs.
In summary, what becomes clear in the overview is that there is not a problem with a few specific scales, but that different scales are problematic depending on the research question.

Limitations and implications for future research
The findings presented here need to be interpreted in light of some limitations and specific features of the study. First, the sample was not representative of the population. Thus, the differences between the questionnaires should not be interpreted in terms of absolute values or used to determine correction factors (e.g., a mean scale score in the VIA-IS corresponds to a mean scale score in the VIA-Youth) but rather indicate that a comparison of character strengths across the questionnaires (e.g., the most common signature strengths of adults and adolescents) is generally not appropriate. Furthermore, the results of the study should be interpreted as rather conservative and, therefore, an absent effect on a particular scale in our study does not necessarily mean that there is no difference. Although the analysis strategy was chosen to ensure a statistical power of .80 for all analyses, only differences of at least medium effect size could be tested by inferential statistics. Therefore, a replication of the present study should be carried out with a larger sample size (see Kretzschmar & Gignac, 2019), so that even smaller differences, which were also descriptively observed in the present study, can be investigated by inferential statistics. Nevertheless, one could also argue that the small effect sizes in the present study have little practical relevance. However, since the comparability of measures for adolescents and adults is important especially in the context of personality development, we follow Roberts et al.'s (2006; see also Funder & Ozer, 2019) perspective that even very small effect sizes are relevant. Indeed, interpreting the small effect sizes typically reported in the personality development literature is only valid if the differences are not due to nonequivalent measurement instruments. In addition, further criteria should also be taken into account to evaluate the equivalence of the questionnaires. For example, a more gender-balanced sample than the present one could be used to examine whether the gender differences, which are more pronounced in adults than in adolescents (see Heintz et al., 2019), are also due to the different questionnaires used in the respective age groups. Moreover, an increasing body of work is concerned with the hierarchical conceptualization of character strengths and virtues (for an overview, see e.g., Feraco et al., 2021). Much larger sample sizes than the one used in the present study (see, e.g., Hirschfeld et al., 2014; Ng et al., 2017), different methodological approaches (e.g., Partsch et al., 2021), and various cultural contexts (e.g., Duan et al., 2012; Khumalo et al., 2008) should be considered in future studies to investigate whether, for example, adolescents might have a different structure of character than adults (McGrath & Walker, 2016) or whether this is the result of different character strengths measurements.
The second important limitation relates to the measurements used in the study. The VIA-IS (Peterson et al., 2005a) was recently revised (VIA-IS-R; McGrath & Wallace, 2021) with the aim of correcting some shortcomings of the previous version. It is unclear to what extent the findings presented here also apply to a comparison of the VIA-IS-R and the VIA-Youth. However, if we look at the criteria concerning the development of the VIA-IS-R (e.g., unambiguous and more appropriate wording, factorial validity, number of items; McGrath, 2019), it becomes clear that the convergence of the VIA-IS-R and the VIA-Youth was not considered. In fact, the item "I think my life is extremely interesting" that significantly enhanced the correlation between the VIA-IS curiosity scale and life satisfaction is also included in the revised version VIA-IS-R. In addition, a recent study provided evidence for the equivalence of the VIA-IS and the VIA-IS-R (Vylobkova et al., 2021). Therefore, it is necessary that future studies also examine the equivalence of the VIA-IS-R and the VIA-Youth before character strengths in adults and adolescents are compared based on these questionnaires. Ideally, the equivalence of the questionnaires should also be a criterion to be considered in future revisions, especially when the VIA-Youth is revised.
A specific feature of the present study is the consideration of self-reports and informant-reports. Although most of the studies in positive or personality psychology are based on self-reports, they are prone to several biases due to mono-method variance (e.g., response styles). As demonstrated in this study, findings based on self-reports or combined self- and informant-reports can differ substantially. Whereas the differences in descriptive statistics were relatively comparable, the combined self- and informant-report sample showed fewer violations of construct validity in the multitrait-multimethod analysis. In fact, there were only three scales in the combined sample for which there were issues regarding construct validity. It should be noted that some of these scales were also identified as problematic in previous research (i.e., the leadership scale in particular), so these scales were revised in the new version of the VIA-IS (see McGrath & Wallace, 2021). However, there was a tendency for more differences between the questionnaires in terms of the association with well-being in the combined sample. Given the finding that trait variance is increased by aggregation across different raters (Chang et al., 2012), it can be assumed that the results of the combined self- and informant-report sample better reflect the potential differences and similarities in character strengths assessment. Future research should further investigate whether aggregation across raters actually provides more valid results when examining the construct validity of personality traits across different age groups and/or measurements.

Conclusion
The VIA-IS and VIA-Youth questionnaires can be considered as reliable and valid in the specific age group indicated for them. However, the present study provides evidence that the differences between the VIA-IS and the VIA-Youth are not small enough to allow a comparison across age groups (i.e., adolescents vs. adults). Therefore, differences in character strengths between adults and adolescents should not be interpreted in terms of differences on the trait level (e.g., different signature strengths, different importance of character strengths regarding well-being) based on these questionnaires. The present study, thus, also emphasizes the need for further developed measurements in positive psychology (e.g., to study the longitudinal development; Gander et al., 2019) and research on the equivalence of personality questionnaires, particularly with regard to adolescents and adults.

Data availability statement
The data are not publicly available as the consent form excluded data sharing with third parties based on wording that was common at the time. The data that support the findings of this study are available on request from the corresponding author, AK.