Sociolinguistic context matters: Exploring differences in contextual linguistic diversity in South Africa and England

ABSTRACT Individual reports of language history, use, and proficiency are generally considered sufficient for language profiling. Yet, these variables alone neglect the contribution of contextual linguistic diversity to one’s overall language repertoire. In this study we used the Contextual Linguistic Profile Questionnaire to evaluate whether there is a difference in contextual linguistic diversity between participants across the linguistically dissimilar contexts of South Africa and England. We further assessed whether self-reported lingualism status groups (monolinguals, bilinguals, multilinguals) scored differently on contextual linguistic diversity to evaluate the utility and uniformity of categorical labels across varying contexts, and investigated how codeswitching and socio-economic status contributed to these effects. Our results demonstrated that contextual linguistic diversity differs between nations: South Africans score higher, promotion of multilingualism is dependent on socio-economic status only in England, lingualism status is not contextually comparable when measured categorically, and codeswitching accounts for linguistic features of South Africans.


Introduction
A growing body of literature has highlighted the impact of context on bilingual language use (Beatty-Martínez, Valdés Kroff, & Dussias, 2018; Kreiner & Degani, 2015; Montrul, 2015) and its effects on cognition more broadly (Abutalebi & Green, 2016; Freedman et al., 2014). It has been well documented that second language (L2) learning imposes almost immediate and long-term changes to the first language (L1), irrespective of whether the L2 is learned early or late in life (Bice & Kroll, 2015; Chang, 2012; Schmid, 2013). However, fewer studies have addressed whether the sociolinguistic context from which speakers are drawn contributes to their language repertoire. This area of research is extended across the language continuum to speakers who may not consider themselves to be bilingual (speakers/learners of two languages) or multilingual (speakers/learners of more than two languages) in the usual senses of the words, but who may be immersed in a linguistically diverse context and gain linguistic information from their environment, without necessarily having an acute awareness of such gains, or the impetus to learn another language.
Contextual diversity can include instances of interpersonal communication where it is not necessary for an interlocutor to attain full comprehension, knowledge, or use of the totality of languages available within their specific context, but where mutual understanding and recognition of meaning may nevertheless be reached. Such situations are largely evident in countries with a colonial past, where English is not always a principal mother tongue, yet occupies a place of prestige and socio-political power (Tsimpli et al., 2019). Accordingly, language knowledge can include linguistic information via either active or passive linguistic exposure (Wigdorowitz, Pérez, & Tsimpli, 2020). Active linguistic exposure is the direct use, production, and upkeep of language(s) that one employs in their regular communicative endeavors, where the speaker has a conscious representation and has developed some degree of proficiency in each of their languages. In contrast, passive linguistic exposure is the summation of linguistic knowledge as a consequence of implicit and contextual linguistic exposure and input that is derived from the sociolinguistic environment within which the speaker is situated. Passive linguistic exposure is more commonly observed in multilingual societies and settings, where language information is mainly gained implicitly as a consequence of the sociolinguistic milieu.
While active linguistic exposure is the common model used in language research, passive exposure is steadily gaining recognition as an integral feature of one's language repertoire. For example, ambient exposure to multilingualism in a more multilingual society (southern California compared to central Pennsylvania) predisposes the brain of monolingual adults to learning another language (Bice & Kroll, 2019), while adults who are passively exposed to foreign speech sounds manifest an enhancement of auditory discrimination (Kurkela, Hämäläinen, Leppänen, Shu, & Astikainen, 2019). In fact, this learning effect seems to extend beyond the auditory domain, as it has been found that primary school children in India coming from monolingual households show an advantage in fluid intelligence as an effect of sociolinguistic diversity in their school and community settings. Moreover, extensive literature has demonstrated that language knowledge is observed in the absence of deliberate learning and, possibly, via limited language exposure (Gullberg, Roberts, Dimroth, Veroude, & Indefrey, 2010; Oh et al., 2020), when participants receive linguistic input from an immersive learning setting rather than a non-immersive one (Kroll, Dussias, & Bajo, 2018; Morgan-Short, Steinhauer, Sanz, & Ullman, 2012; Pliatsikas, DeLuca, Moschopoulou, & Saddy, 2017), and via receptive input in heritage bilingual communities (Sherkina-Lieber, Perez-Leroux, & Johns, 2011).
Furthermore, the effects of passive exposure also appear to endure over one's lifespan. For instance, adults who consistently overheard a language as children were able to learn native-like phonological features of their overheard language despite not retaining explicit knowledge or awareness of the exposed-to language (Au et al., 2002; Au, Oh, Knightly, Jun, & Romo, 2008; Knightly, Jun, Oh, & Au, 2003). Additionally, behavioral and neuroimaging research on functionally monolingual international adoptees suggests that even durationally minimal exposure to a language in childhood, with no subsequent maintenance or conscious recollection of the language, leads to long-term linguistic and neural effects similar to those expected in bilinguals who speak the "lost" language (Oh, Au, & Jun, 2010; Pierce, Chen, Delcenserie, Genesee, & Klein, 2015). Overall, this evidence indicates that passive linguistic exposure has an important effect on one's language repertoire.
In light of these findings, it is possible that in linguistically diverse, multilingual contexts, where various languages are spoken and displayed across the linguistic landscape (Gorter, 2006), at least some linguistic information is cognitively integrated by the people situated within and exposed to such contexts. Consequently, exposure to a linguistically diverse context over a substantial period is imperative to one's language repertoire, particularly when exploring the holistic linguistic experience of speakers across the continuum of language knowledge.

Context and the lingualism status spectrum
Another area of importance regards the classification of individuals into respective language groups, such as the commonly used descriptors of monolingualism, bilingualism, and multilingualism. We refer to these as an individual's lingualism status since these describe some continuous stature of language experience. Yet, an obvious caveat observed from the accumulated literature is the lack of clarity and regularity in conceptualizing the lingualism status of participants and the often-imposed categorical grouping of participants into one of these groups (Surrain & Luk, 2019). We propose that a categorical classification of one's lingualism status may pose unique problems to the conception of the individual's language knowledge and experience, given the limitations of the nature and diversity of language experience imposed by one linguistic label. It is therefore necessary to evaluate whether speakers are homogeneous in their contextual language exposure if they classify themselves with the same linguistic label because, under the conventional view, we would expect monolinguals (extended to bilinguals and multilinguals) from one context to be largely equivalent to monolinguals from another. Yet, are monolinguals in multilingual societies equal to (i) bilinguals/multilinguals within the same sociolinguistic context, or (ii) monolinguals in less linguistically diverse contexts? It is only when contextual linguistic diversity and, in particular, passive linguistic exposure is added to an epistemology of language experience that the prospect of greater understanding about the confounds of one's language profile is real.
As noted, if monolingualism is a unanimous experience, then it is assumed that those who identify or are classified as monolingual have homogeneous and stable linguistic knowledge. However, such a restricted classification often fails to take contextual linguistic diversity and classificatory norms into account. The categorization of participants into boxed language groups, whether self-reported or designated, may therefore conceal nuances of language knowledge and experience that could alter research findings across various domains. For instance, studies have found differences in electrophysiological responses to language processing across seemingly similar monolingual groups differing in contextual linguistic exposure (Bice & Kroll, 2019) and proficiency (Pakulak & Neville, 2010). In addition, the situation is further complicated because different linguistic contexts employ unique standards and ideologies as to what counts as monolingualism, bilingualism, and multilingualism, as well as how language knowledge may be integrated as a result of the general linguistic milieu. As is the binary status quo, bilinguals are usually compared to monolingual counterpart "control" groups to assess whether bilingualism has an effect on X behavioral, linguistic, or cognitive outcome. Moreover, the majority of bilingualism research has been conducted within Global North, Western, or Anglosphere contexts, where conceptions of language knowledge and use may have particular meanings distinct from those from other, more linguistically diverse contexts outside of the Global North.
In a study exploring the consequences of context on language use, Beatty-Martínez et al. (2020) compared three groups of ostensibly equivalent, highly proficient Spanish-English bilinguals, situated across three contexts that differed in language practice, on lexical production and proactive/reactive inhibitory control. One group lived in Granada, Spain, where Spanish and English are mainly used separately and across specific domains (e.g., Spanish at home, English at school/work, little-to-no codeswitching). The second group lived in San Juan, Puerto Rico, where Spanish and English are more integrated and there is greater flexibility of language choice across domains and more opportunities to codeswitch. The last group consisted of participants born and raised in Spanish-speaking environments who immigrated to the United States during childhood or adolescence and were studying at State College, Pennsylvania. Although English is the dominant language used across domains in this State, there are varied but limited opportunities for Spanish use and codeswitching. Overall, they found that lexical access and how it relates to cognitive control greatly depends on the language practices and demands of the linguistic context. Specifically, accuracy on a picture naming task was modulated by inhibitory control only for the immersed bilinguals in the United States but not the other two groups, suggesting that this group of bilinguals needed to actively monitor their context prior to speaking, as their language choice was facilitated in relation to the language of their interlocutors, which had habitually become English. The authors acknowledged that most studies would normally aggregate such seemingly similar speakers into a single Spanish-English bilingual group, and that this approach could lead to "a failure to characterize the complexity associated with the context of language use" (Beatty-Martínez et al., 2020, p. 15).
Although much work has been done to describe and quantify language experience, there is little uniformity or standard method as to how and by what measure this should be captured. Furthermore, assessment efforts have largely amassed around children's linguistic knowledge, with far fewer questionnaires available for adults (for a review see Kašćelan et al., 2022). Given the expansive interest in multilingualism research, the goal should be attaining language knowledge about the participants that meets a best practice standard, and we propose that this includes a measure that captures sociolinguistic experience along with history, usage, and proficiency indicators. In order to appropriately characterize speakers within dynamic language contexts, it is clear that we need to consider their diverse experiences with respect to contextual language diversity and use (Bice & Kroll, 2019; Kroll et al., 2018; Tsimpli et al., 2020; Wigdorowitz et al., 2020). Having a holistic measure of language experience that captures information about sociolinguistic context can, accordingly, indicate where the speaker is linguistically immersed, including how certain languages may be privileged over others, and the flexibility, codeswitching, and exchange of language practices.
One recent questionnaire has been designed to evaluate both contextual and individual linguistic diversity as distinct features comprising the language profile: The Contextual and Individual Linguistic Diversity Questionnaire (CILD-Q; Wigdorowitz et al., 2020). The CILD-Q assesses contextual linguistic diversity by three scales: a) Multilingualism in Context, which encompasses the contextual use, societal practice, and community language norms (e.g., codeswitching) of multiple languages in addition to the dominant language within a context including via interlocutors, the media, and across the linguistic landscape (e.g., signage); b) Multilingualism in Practice, which includes individual exposure of linguistic diversity as a feature of spoken engagement that one is either directly or indirectly (i.e., ancillary engagement such as overhearing a conversation) involved in; and c) Linguistic Diversity Promotion, which is the societal and governmental promotion and encouragement of language variation and use within the context. Both Multilingualism in Context and Multilingualism in Practice scales consist of some items about codeswitching. The former reports on codeswitching about the general language situation within the context, while the latter concerns communicative practices of a more personal nature. Moreover, the CILD-Q is part of a larger language profiling questionnaire: The Contextual Linguistic Profile Questionnaire (CLiP-Q), which also captures demographic information, language history, use and proficiency, as well as socio-economic status (SES). As mentioned, apart from contextual linguistic diversity, most of these factors are usually considered in bilingual studies; however, SES has received less attention.
SES is commonly measured as a proxy of parental and/or personal educational attainment, occupation, and/or household assets and income, and it has been consistently found to have a large and sustained impact on language development (Brito & Noble, 2014; Letourneau, Duffett-Leger, Levac, Watson, & Young-Morris, 2013). Unsurprisingly, individuals from lower SES backgrounds have poorer linguistic outcomes compared to those from higher SES backgrounds, whereby SES is operationalized as the quality and quantity of the childhood linguistic environment (De Cat, 2020). Given its ubiquitous influence, Wigdorowitz and colleagues (2020) also acknowledged that any measure of language profiling must include SES indicators if it is to provide a comprehensive overview of the population under investigation. Accordingly, factors that characterize the sociolinguistic context and the nature of language input, including SES, cannot be undervalued when the goal is to obtain a best practice perspective able to tease apart the contributions of contextual and individual language experience in order to attain a comprehensive account of linguistic knowledge and influence.

The present study
This study investigates the importance of contextual linguistic diversity when comparing groups from different sociolinguistic contexts, but where English is the lingua franca. To this end, we recruited participants from South Africa and England. South Africa is more multilingual than England in terms of the number of speakers of multiple languages as well as language policy (Constitution of the Republic of South Africa, 1996; Department for Education, 2014; Statistics South Africa, 2012; The British Academy, 2019). Although English is the dominant language in both countries, it is not numerically predominant and may not be the most proficient language of many South Africans, where it is reported as the fourth most common L1 (9.6%) after Zulu (22.7%), Xhosa (16.0%), and Afrikaans (13.5%) (Statistics South Africa, 2012). While English is the only de facto official language of England, reported as the main language for 92% of the population (Office for National Statistics, 2013), 11 languages have attained official status in South Africa (Mesthrie, 2002). In fact, there are more opportunities for active and, crucially, passive exposure to multiple languages for South African speakers, whereas the opportunities for widespread and diverse language use are scarcer and more restricted to particular circumstances and engagements (such as amongst family or religious gatherings) in England.
Both groups were assessed on the CLiP-Q (Wigdorowitz et al., 2020), a language profiling measure designed to capture imperative information about contextual and individual linguistic diversity (regarding both active and passive linguistic exposure) as well as additional variables associated with explaining linguistic findings, to understand the relationship between contextual linguistic diversity (CILD-Q scales), sociolinguistic context (England vs South Africa), and SES, on the one hand, and contextual linguistic diversity, lingualism status (monolinguals vs bilinguals vs multilinguals), and SES, on the other hand. Finally, we also explored the interplay between sociolinguistic context and lingualism status by analyzing codeswitching practice. The specific research questions and predictions are as follows:
RQ1. (A) Do people who live in a more multilingual context (South Africa) report greater contextual linguistic diversity than those from a less multilingual context (England)? (B) Do individual differences in SES explain these differences? Given that South Africa is a more multilingual country than England, and that South African speakers have more opportunities for passive linguistic exposure than speakers from England, we predict that South Africans will score higher than participants from England in terms of their contextual linguistic diversity overall, and across all three scales of the CILD-Q: Multilingualism in Context, Multilingualism in Practice, and Linguistic Diversity Promotion. We also predict that SES will not differ between countries since our sample broadly represents individuals across a range of socio-economic standing. However, given the prevalence of multilingualism in South Africa, we predict that across the socio-economic spectrum, contextual linguistic diversity will be consistently experienced and promoted irrespective of one's SES.
In contrast, for the England participants, SES could influence the uptake of multilingualism in two possible ways. First, those with lower SES could score highly on contextual linguistic diversity, especially if it is the case that lower SES groups use more than one language, as has been reported in many immigrant populations (Fernández Reino, 2019). As a second alternative, participants in England with high SES could score higher on contextual linguistic diversity because of an awareness that foreign language learning is an index of higher education, global advancement, and cognitive benefit (Hogan-Brun, 2017). This hypothesis would be particularly associated with higher SES families sending their children to independent/private schools, in which foreign language learning is valued and promoted more than in state schools (Tinsley, 2019).
RQ2. (A) Is lingualism status contextually consistent when assessing linguistic diversity in South Africa and England? (B) Do individual differences in SES contribute to any possible differences? Given that contextual linguistic diversity is a phenomenon comprising active and passive linguistic input as derived from the sociolinguistic milieu, we consider it distinct from lingualism status, and therefore predict that self-reported monolinguals from South Africa will score closely to their bilingual and multilingual counterparts on the CILD-Q, who at the same time, are expected to attain close scores (overall and across the three scales), due to the great contextual linguistic diversity of this country. In contrast, because England is less linguistically diverse, we predict a larger difference between monolinguals and bilinguals/multilinguals. Regarding the last two groups, it is less clear whether bilinguals and multilinguals will differ from one another, since these participants usually have explicit knowledge of more than one language, and therefore do experience overt linguistic diversity. Therefore, as with South Africans, we do not expect differences on contextual linguistic diversity between bilinguals and multilinguals from England. Finally, whether individual differences in SES should play a role regarding lingualism status in each country remains completely exploratory.
RQ3. Does codeswitching contribute to the effects of contextual linguistic diversity in South Africa and England across lingualism status groups? As mentioned above, only two of the three CILD-Q scales have items that refer to codeswitching: Multilingualism in Context (MIC) and Multilingualism in Practice (MIP). Recall, MIC reports on the global practice of codeswitching within the context, and in contrast, MIP concerns communicative practices of a more personal nature. Accordingly, the codeswitching analysis focuses on these two scales, distinguishing between non-codeswitching and codeswitching items. Our predictions are as follows. Regarding MIC, we predict that monolingual, bilingual, and multilingual South Africans will score higher on both codeswitching and non-codeswitching items than counterpart lingualism groups from England, given that individuals living in multilingual contexts will be exposed, purely as a consequence of their natural language environment, to more linguistic variation than individuals living in predominantly unilingual contexts. In contrast, for the MIP scale, differences across countries are exclusively expected in the codeswitching items. Specifically, given that this scale relates more directly to one's personal language experience, we predict that South African monolinguals will score higher on codeswitching items than monolinguals from England, since codeswitching is a positive and acceptable linguistic attribute for South Africans across the lingualism spectrum (Mesthrie, 2002). It is less clear whether bilinguals and multilinguals will differ in their codeswitching practice across South Africa and England since these participants report to have access to more than one language and may engage directly or overhear others engaging in codeswitching. Despite this, we expect South African bilinguals and multilinguals to score higher on codeswitching items than the same groups from England.

Participants
The country that the participants reported to have lived in for the longest amount of time guided the classification of the sample into their respective South African and England groups. An initial sample of 353 participants accessed the CLiP-Q but were removed if they had incomplete or anomalous responses, leaving a sample of 269 participants (South Africans = 67.29%, significantly older than England participants). To address the unequal sample size and reduce the age disparity, we computed a propensity score match with age as the matching variable. Matching yielded a final sample of 176 (88 participants across each group; South Africans: M age = 28.53, SD = 7.81, female = 84.1%, male = 15.9%; England: M age = 22.80, SD = 6.10, female = 62.5%, male = 36.4%, non-binary = 1.1%). English was reported as the most proficient language of 144 participants and the second most proficient language of 28 participants, with all participants reporting daily English exposure (see Table 1 for English proficiency ratings). Furthermore, English was reported as the primary medium of instruction for the majority of participants across primary (70.35%) and secondary (70.86%) school settings. Through self-report, our sample was divided into 63

Contextual Linguistic Profile Questionnaire (CLiP-Q)
The CLiP-Q (Wigdorowitz et al., 2020) is a holistic language profiling measure which captures information that has been found to be imperative to language profiling research. The questionnaire includes the following sections:
A. Demographic information. Participants report the country they have lived in over the majority of their lives, which leads them to answer country-specific demographic questions about their nationality, country and province/region of birth and current residence, total years lived in reported country, age, gender, and ethnicity.
B. Contextual and Individual Linguistic Diversity Questionnaire (CILD-Q). As mentioned above, the CILD-Q differentiates linguistic diversity within the individual and as a feature of their contextual exposure in relation to the country participants report to have lived in over the majority of their lives (i.e., where they have received the greatest sociolinguistic exposure). Because of its dominance, English is the reference language used to frame the questions. Participants respond, on a Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree), to 18 items that have been found to reliably measure three scales pertaining to contextual linguistic diversity: Multilingualism in Context (MIC), Multilingualism in Practice (MIP), and Linguistic Diversity Promotion (LDP; see description above and Table 2 for CILD-Q items). Higher mean scores reflect greater general exposure to, and mixing of, languages other than English across spoken and written domains (MIC), greater individual and conversational exposure to linguistic diversity (including engagement in codeswitching; MIP), and higher governmental and societal encouragement of language diversity (LDP). Item order is randomized across participants.
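As an illustration of the scale scoring described in Section B, the following is a minimal sketch in Python that averages Likert ratings within each scale. The item identifiers and the assumption of six items per scale are hypothetical; the actual item-to-scale assignment is the one listed in Table 2, and this is not the authors' scoring script.

```python
from statistics import mean

# Hypothetical item-to-scale assignment: the text states only that 18 items
# measure three scales. Six items per scale and the item names are assumptions.
SCALE_ITEMS = {
    "MIC": [f"item{i:02d}" for i in range(1, 7)],
    "MIP": [f"item{i:02d}" for i in range(7, 13)],
    "LDP": [f"item{i:02d}" for i in range(13, 19)],
}


def score_cildq(responses):
    """Return the mean Likert rating (1-5) for each CILD-Q scale."""
    scores = {}
    for scale, items in SCALE_ITEMS.items():
        ratings = [responses[item] for item in items]
        if any(not 1 <= rating <= 5 for rating in ratings):
            raise ValueError(f"{scale}: ratings must lie between 1 and 5")
        scores[scale] = mean(ratings)
    return scores


# Example respondent: high contextual exposure (MIC), moderate personal
# exposure (MIP), low perceived promotion (LDP).
example = {f"item{i:02d}": 5 for i in range(1, 7)}
example.update({f"item{i:02d}": 3 for i in range(7, 13)})
example.update({f"item{i:02d}": 1 for i in range(13, 19)})
print(score_cildq(example))
```

Keeping the mapping in a single dictionary means the scoring function does not change if the real item assignment differs from the one assumed here.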
C. Language history, use, and proficiency. This section is concerned with individual accounts of language history, use, and proficiency and is divided into two components. The first gathers information about general language background including all spoken and written languages, home and most comfortable language, formally and informally learned languages, medium of schooling, exposure to English, and lingualism status self-classification. The second component asks questions about participants' first, second, and third most proficient languages, with repeated questions for each language. Information is obtained about age milestones (e.g., acquisition, writing); years of language use; ability in speaking, understanding, reading, and writing; extent of language use with interlocutors, across activities, and when engaging in personal states (e.g., thinking); and degree of cultural association. Lastly, an open-ended question is presented probing participants for any additional language background and usage information they deem to be noteworthy.
D. Socio-economic status (SES). A composite SES score is computed from variables associated with SES, including an index of household assets (home security, computer, paid TV subscription, internet access, car, and domestic worker and/or gardener; 0 = no, 1 = yes, summed), annual household income (value range according to tax brackets; 1 = Less than R195,850/£11,850 to 6 = R1,500,001/£150,001 and above), self, maternal, and paternal level of education (1 = lower than or up to Grade 11/less than or up to GCSE to 6 = PhD). Scores from each variable were averaged to create a composite SES score ranging from 1 (low SES) to 6 (high SES).
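The composite described in Section D can be sketched as a simple average of its five components. The sketch below is in Python and makes two assumptions that the text does not settle: the variable names are hypothetical, and the summed asset index (which can range from 0 to 6) is averaged as-is, without any rescaling.

```python
from statistics import mean

# Hypothetical keys for the six household assets named in the text.
ASSET_ITEMS = (
    "home_security", "computer", "paid_tv",
    "internet", "car", "domestic_help",
)


def composite_ses(assets, income_band, edu_self, edu_mother, edu_father):
    """Average the asset index, income band, and three education codes."""
    # Each asset is coded 0 = no, 1 = yes, and the codes are summed.
    asset_index = sum(1 for name in ASSET_ITEMS if assets.get(name, False))
    for value in (income_band, edu_self, edu_mother, edu_father):
        if not 1 <= value <= 6:
            raise ValueError("income and education codes must lie between 1 and 6")
    return mean([asset_index, income_band, edu_self, edu_mother, edu_father])


respondent_assets = {
    "home_security": True, "computer": True, "paid_tv": False,
    "internet": True, "car": True, "domestic_help": False,
}
ses = composite_ses(respondent_assets, income_band=3,
                    edu_self=4, edu_mother=2, edu_father=2)
print(ses)  # (4 + 3 + 4 + 2 + 2) / 5 = 3
```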
Taken together, the four sections of the CLiP-Q provide a comprehensive linguistic profile of adults situated across various sociolinguistic contexts.

Procedure
Ethical clearance was granted from the University of Cambridge Research Ethics Committee prior to conducting the research. The CLiP-Q was distributed online across the researchers' networks (e.g., listservs, social media sites) via Qualtrics and took around 30 minutes to complete. Participation was voluntary and informed consent was requested once participants were informed of the study overview, inclusion criteria, and ethical guidelines.

Data analysis
Linear mixed-effects models (LME) were fitted in R using the "lme4" package (Bates et al., 2020), which accounts for both fixed and random effects. LME models can estimate participant and item-level data under one analytic framework, which increases the generalizability of the results (Baayen, Davidson, & Bates, 2008). Participants and items were entered as random factors in all models. In addition, fixed factors varied according to the research question. For instance, to address RQ1, Model 1 included nationality (England vs South Africa), CILD-Q scales (MIC vs MIP vs LDP), and SES as fixed factors, whereas to address RQ2, Models 2 and 3 included the fixed factors of lingualism status (monolingual vs bilingual vs multilingual), CILD-Q scales (MIC vs MIP vs LDP), and SES, for the South African and England samples separately. SES was considered as a continuous variable, using centered values. In all cases, the dependent variable was the mean score (from 1 to 5) obtained from the CILD-Q.
To determine the optimal structure for the random and fixed components of each LME model, the procedure outlined by Zuur, Ieno, Walker, Saveliev, and Smith (2009) was followed. We first looked for the best random structure using restricted maximum likelihood, while the full fixed structure (i.e., a three-way interaction in all cases) was retained (Barr, Levy, Scheepers, & Tily, 2013). More specifically, the random structure was tested by running an ANOVA between all possible models containing the various combinations of intercepts and/or slopes, using nationality and lingualism status as random slopes, while keeping the full fixed structure (see Pérez, Joseph, Bajo, & Nation, 2016, for the same rationale). The model with the lowest AIC and BIC values was selected.
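For orientation, the intercept-only version of the model described above can be written out as follows. This is a sketch rather than the authors' exact specification: all symbols are introduced here for illustration, and $y_{ij}$ is assumed to be the CILD-Q rating of participant $i$ on item $j$.

```latex
y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + w_j + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
w_j \sim \mathcal{N}(0, \sigma_w^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here $\mathbf{x}_{ij}$ collects the fixed factors (for Model 1: nationality, scale, centered SES, i.e., $\mathrm{SES}_i - \overline{\mathrm{SES}}$, and their interactions), $u_i$ is the random intercept for participants, and $w_j$ is the random intercept for items. In lme4 formula syntax, and with hypothetical column names, this would correspond to something like `score ~ nationality * scale * ses_c + (1 | participant) + (1 | item)`.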
Once the best random structure was identified, we then tried to obtain the best fixed structure. To do this, we ran stepwise model comparisons from the most complex model (i.e., three-way interaction) to the simplest model (i.e., main effect), by selecting the significant χ 2 test for the loglikelihood, using maximum likelihood. Third, χ 2 and p values were provided by the ANOVA function of the "lmerTest" package using Satterthwaite's approximation for denominator degrees of freedom (Kuznetsova, Brockhoff, Christensen, & Jensen, 2020). It was also important to evaluate the effect size of the significant effects, in order to describe the proportion of the total variability attributed to the factor, and in this way, provide an indication of the practical significance of the result. Partial eta squared (η p 2 ) was calculated using the "eta_sq" function of the "sjstats" package (Lüdecke, 2020). To qualify the two-way interactions, the "testInteractions" function of the "phia" package (De Rosario-Martínez, 2015) and "lsmeans" function of the "lsmeans" package (Lenth, 2018) were used for post-hoc analyses with Bonferroni correction where necessary. For the three-way interactions, we divided the data into subsets according to the levels of the CILD-Q scales and fitted adjusted LME for these subsets.
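The quantities used in this model-selection procedure (AIC, BIC, and the likelihood-ratio χ² test between nested models) follow standard formulas, sketched below in Python. The log-likelihoods, parameter counts, and number of observations are made-up values for illustration, and the χ² tail probability is implemented only for 1 or 2 degrees of freedom, where a closed form exists; the study itself obtained these values from the R packages named above.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: penalizes parameters by log(n_obs)."""
    return n_params * math.log(n_obs) - 2 * loglik


def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic for two nested models fitted by ML."""
    return 2 * (loglik_full - loglik_reduced)


def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2 only."""
    if x < 0:
        raise ValueError("the chi-square statistic cannot be negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise NotImplementedError("only df = 1 or df = 2 are supported in this sketch")


# Made-up example: dropping a 2-df interaction term from a model.
ll_full, ll_reduced = -2510.4, -2514.2
stat = lrt_statistic(ll_reduced, ll_full)
p_value = chi2_sf(stat, df=2)
print(round(stat, 2), round(p_value, 4))
print(aic(ll_full, 12) < aic(ll_reduced, 10))  # True if the full model has lower AIC
```

A significant test indicates that the more complex model fits reliably better, so the term is retained; otherwise the simpler model is preferred.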
Finally, to address RQ3, we ran independent samples t-tests with a Bonferroni correction, comparing lingualism status (monolingual vs bilingual vs multilingual) across nationality (South Africa vs England) in all crossed conditions including either MIC or MIP scales and non-codeswitching and codeswitching items.

Results
Results of the LME analysis are presented in three main sections with a focus on the fixed effects output. Summary details regarding model fit and random effects of the significant models are provided in the Appendix. We first evaluated whether participants from South Africa and England differed in their contextual linguistic diversity and whether possible differences were explained by SES (RQ1). Secondly, we evaluated whether there were differences of lingualism status within each country and, once more, if individual differences in SES explained their effects (RQ2). Finally, we explored the role of codeswitching in the MIC and MIP scales to investigate differences in lingualism status across South Africa and England (RQ3). Means and standard deviations for item-level data are provided in Table 2, while means and standard error values for CILD-Q scores across nationality groups as a factor of linguistic status are provided in Table 3.

Contextual linguistic diversity across countries
Regarding differences between countries (Model 1, see Appendix), our LME model showed a significant main effect of nationality, F(1,187.46) = 13.96, p < .001, ηp² = .07, where, as expected, participants from South Africa (M = 3.85, SE = .12) scored higher on the CILD-Q than participants from England (M = 2.50, SE = .17). In addition, the two-way interaction of nationality × scale was marginally significant, F(2,184.82) = 2.99, p = .05, ηp² = .03, where pairwise comparisons with Bonferroni correction showed that South Africans scored higher than participants from England on two CILD-Q scales: MIC, t(125.4) = 11.51, p < .001; and LDP, t(29.2) = 5.18, p < .001. No difference was found in the MIP scale (p = .19; see Figure 1). To further clarify this interaction, a second pairwise comparison was run dividing by nationality. This analysis demonstrated that the three scales did not differ from one another for the South African group (ps > .55), but they did for the England group. Post hoc analyses showed that this group had lower scores in MIC compared to MIP, t(24.1) = −4.19, p < .01, but no differences were found between MIC and LDP, nor between MIP and LDP (ps > .15). More importantly, the previous interaction was qualified by a three-way interaction with SES: nationality × scale × SES, F(2,174.67) = 3.83, p < .05, ηp² = .04. To understand this interaction, the data were subset by scale. This division produced a non-significant interaction of nationality × SES for both the MIC and MIP scales (ps > .45). In contrast, the same interaction was significant in the LDP scale, F(1,175.98) = 5.37, p < .05, where pairwise comparisons showed that SES explained differences in LDP in England, χ²(1) = 6.15, p < .05, but not in South Africa (p = .88). Specifically, England participants with higher SES scored higher in LDP, while those with lower SES scored lower on this scale (see Figure 2). No other effects were significant (all ps > .12).
Altogether, our findings regarding contextual linguistic diversity across countries suggest that, compared to England participants, South Africans had greater exposure to linguistic diversity within their context, and the promotion of linguistic diversity across governmental and societal strata was greater. In addition, contextual linguistic diversity was experienced consistently for South Africans, but the England group varied in how they experienced it. In fact, individual differences in SES showed that contextual linguistic diversity depends on this factor, but only in England, which is a less linguistically diverse context. That is, in England, SES appeared to influence whether there was promotion and endorsement of multilingualism, but for South Africans, multilingualism was promoted regardless of one's socio-economic standing.

Contextual linguistic diversity across lingualism status
Taking into account the unequal sample sizes between the two countries across the lingualism conditions, a separate model for each nationality was run to evaluate differences across lingualism status. No significant effects were found in the South African group (Model 2; all ps > .14). Regarding differences across lingualism status for England (Model 3, see Appendix), our LME model showed a significant effect of scale, F(2,23.82) = 10.51, p < .001, ηp² = .47, where pairwise comparisons demonstrated differences between participants across the scales. England participants scored significantly lower on MIC in comparison to MIP, t(25) = 4.12, p < .01, and LDP, t(23.7) = 2.53, p < .05. In contrast, no difference was found between MIP and LDP (p = .59; see England scale scores in Figure 1). No other main or interaction effect was significant (all ps > .13).
Taken together, self-reported monolinguals, bilinguals, and multilinguals did not differ from one another in terms of their contextual linguistic diversity in South Africa or England.

Contextual linguistic diversity and codeswitching
Finally, to explore whether some of the differences found across nations and/or lingualism status were a result of items that refer specifically to codeswitching, we ran independent t-tests comparing South Africa and England participants across monolinguals, bilinguals, and multilinguals in all crossed conditions including either MIC or MIP scales and non-codeswitching and codeswitching items (see Table 4). A Bonferroni correction, applied separately to the two types of items, set the alpha at .008.
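The corrected threshold and the effect size reported in Table 4 can be sketched as follows, in Python for illustration only. Two points are assumptions rather than statements from the text: the count of six comparisons per item type (three lingualism groups × two scales), which is inferred because .05 / 6 is approximately .008, and the group means, standard deviations, and sample sizes passed to the Hedges' g function, which are invented placeholders.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test alpha after a Bonferroni correction."""
    return alpha / n_comparisons


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges' g: Cohen's d with the small-sample bias correction."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    cohens_d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d * correction


# Assumed: 3 lingualism groups x 2 scales = 6 tests per item type.
alpha_per_test = bonferroni_alpha(0.05, 6)
print(f"Per-test alpha: {alpha_per_test:.4f}")

# Placeholder summaries for two groups (not values from Table 4).
g = hedges_g(mean1=4.0, sd1=1.0, n1=20, mean2=3.0, sd2=1.0, n2=20)
print(f"Hedges' g: {g:.3f}")
```

The correction factor matters most for small groups, which is relevant here given the unequal and sometimes small lingualism subgroups in the England sample.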
Results showed that codeswitching played a role depending on the CILD-Q scale. South Africans scored higher than England participants on both codeswitching and non-codeswitching items in the MIC scale (all ps < .001). Therefore, codeswitching alone did not account for the differences observed in contextual exposure to multilingualism. In contrast, codeswitching explained differences between countries in the MIP scale. That is, there were no significant differences between the two nations across the three lingualism status groups for the non-codeswitching items (all ps > .43), whereas codeswitching items revealed differences between lingualism groups across the two countries. More specifically, monolinguals and multilinguals in South Africa scored higher than their England counterparts on the codeswitching items (ps = .05 and < .001, respectively, after Bonferroni correction); in contrast, the bilingual groups did not show significant differences across countries on codeswitching items (p = .19). These findings indicate that codeswitching was a feature of direct communicative practices of South Africans who self-identified as monolinguals or multilinguals compared to the same populations in England. By contrast, codeswitching did not seem to be a feature more greatly experienced by South African bilinguals than England bilinguals, suggesting that bilingualism was more similarly defined across the two countries, with equivalent opportunities for codeswitching.

Discussion
A core aim of sociolinguistic research is to describe and quantify language experience systematically and accurately. Typically, this is carried out using a language profiling measure where linguistic information of input and usage, age of acquisition and exposure, and proficiency is captured (de Bruin, 2019). Language profiling is largely the first point of data collection in empirical studies aimed at investigating differences across designated language groups. The information gleaned from these measures plays a vital role in the description, division, and comparison of participants (Luk & Bialystok, 2013; Silva-Corvalán & Treffers-Daller, 2015). Therefore, it is essential that a language profile measure incorporates a broad and valid range of linguistic information that can be used, at minimum, as a baseline for describing an individual's language experience. Accordingly, it was important to assess, using a holistic language profile measure, whether exposure to a predominantly multilingual or unilingual context (where English is the lingua franca) affects linguistic knowledge overall, as well as whether self-classification of lingualism status is contextually consistent, taking socio-economic status and codeswitching behavior into account.

Table 4. Mean, standard deviation, p-value (t-test comparison) and g-value (Hedges' g effect size) for non-codeswitching and codeswitching item clusters of Multilingualism in Context (MIC) and Multilingualism in Practice (MIP) across nationality groups, as a factor of lingualism status.

Contextual linguistic diversity across contexts
First, we tried to understand whether people who live in a more multilingual context (South Africa) report greater contextual linguistic diversity than those from a less multilingual context (England; RQ1A). Unsurprisingly, the results show that the sociolinguistic context in which people are immersed does contribute to their linguistic experience overall, even when the prominent language across the contexts is the same. In line with our hypothesis, South Africans reported greater overall contextual linguistic diversity, and more specifically, higher multilingual exposure (MIC) and multilingual endorsement (LDP), in comparison to participants from England. It is noteworthy, though, that the differences between nationality groups did not emerge for communicative engagement (MIP; see Figure 1). In this way, we can be confident that contextual linguistic diversity is greatly influenced by exposure to multilingual content within one's environment (whether through the media, on signage, or as a factor of creative and intermingled language practices at the community level), in addition to the value placed on multilingualism by society and the government. The lack of difference between South Africa and England participants on MIP could be due to the fact that this scale relates more to active instances of communicative engagement between interlocutors, where an individual may be privy to an exchange either through directly partaking in it or being an overhearer. In this way, for participants who are exposed to additional languages other than English, their exposure here is more explicit and obvious, and may involve unique strategies of communication (such as codeswitching); as such, this is reflected in the smaller mean differences between the nationality groups. Nonetheless, future research should clarify this interpretation.
In addition, South Africans scored similarly across all of the CILD-Q scales, but in England, participants scored higher on communicative multilingualism (MIP) in comparison to contextual multilingualism (MIC). This suggests that contextual linguistic diversity is more consistently experienced and pervasive in South Africa, while in England this is not the case. In both countries English attains societal language dominance status since it is the lingua franca and privileged in terms of use across pedagogy, government, business, and media. Yet, since South Africa is a unique setting of widespread linguistic fusion, which hosts numerous languages and a diverse array of dialects, we argue that the country's speakers are ubiquitously exposed, both actively and passively, to linguistic diversity in a way distinct from those situated in England. For instance, a South African adult has undoubtedly been surrounded by speakers and content of languages other than their L1 (more so, on average, than speakers situated in England), but may not use or have acute knowledge of most or any of these languages. Their knowledge here may be predominantly an awareness of linguistic diversity, but perhaps without actual receptive or productive ability in an additional language, although their language skills "may vary along a continuum from zero to full ability" (Tsimpli et al., 2020, p. 2). Therefore, the context of language use is an imperative aspect to consider when comparisons are to be drawn about individuals situated in different sociolinguistic settings.

The interplay of multilingual promotion and socio-economic status
We next wanted to address whether SES contributed to the effects of contextual linguistic diversity in South Africa and England (RQ1B). Indeed, we found that the results appear to be dependent on SES, but only in England, which is a less linguistically diverse context in terms of both the number of speakers of multiple languages and policy. In England, SES appears to influence whether there is promotion and endorsement of multilingualism, with those of a higher socio-economic standing acknowledging a greater uptake and acceptance of multilingualism than those of a lower socio-economic standing. A possible explanation for this finding can be linked to the notion of what Hogan-Brun (2017) calls Linguanomics (the economics of language), whereby multilingualism is viewed as an economic asset and prospect of human capital, as measured both tangibly (e.g., teaching, translation) and intangibly (e.g., culture, identity, human rights).
Our results suggest that those of a high socio-economic standing in England perceive multilingualism as being adequately promoted by the people and country more generally, while those of lower economic advantage perceive multilingualism as being promoted less adequately. It is not the case that the lingualism status of the participants is influencing this finding, since the distribution of those with low SES scores (< 3.5; monolinguals = 22, bilinguals = 7, multilinguals = 14) is fairly similar to that of those with high SES scores (≥ 3.5; monolinguals = 22, bilinguals = 11, multilinguals = 12). Rather, the perception of multilingual endorsement is greater for participants with more economic resources, who may therefore have access to such resources through education or other privileged means. Compare, for instance, access to language education in state-funded versus independent schools across England. Although there has been a steady decline in language education across both school sectors, language learning remains more of a pedagogical priority and viable subject option for students attending independent schools (Collen, 2020; Tinsley, 2019). This disparate priority may be one of the factors that accounts for the observed results, since 41% of the sample reported having attended an independent or grammar secondary school, and of those who attended a state school, 42% had received some secondary education in a medium of instruction other than English. Having the economic and social means to gain experience with language education may, therefore, foster the view that multilingualism is promoted within the country.
By contrast, for South Africans, multilingualism appears to be promoted by society and the government regardless of one's socio-economic standing. That is, across the low to high socio-economic spectrum, participants equally agree that multilingualism is encouraged. This may be largely influenced by language policies that have been implemented to safeguard multilingualism (e.g., National Language Policy Framework, South African Department of Arts and Culture, 2003), the pedagogical necessity for a multilingual nation, as well as a pledge of inclusiveness and fair representation, including that of language equity in post-apartheid South Africa (Plüddemann, 2015; Weideman, Read, & du Plessis, 2021). Given that this finding is particularly novel, we encourage further research to investigate these results in more depth to unpack the influence of SES on multilingual promotion.

Contextual linguistic diversity and lingualism status
We next investigated whether lingualism status was contextually consistent when assessing contextual linguistic diversity in South Africa and England separately (RQ2). We found that self-reported monolinguals, bilinguals, and multilinguals did not differ from one another in terms of their contextual linguistic diversity in South Africa or England. That is, South African/England self-described monolinguals scored closely to South African/England self-described bilinguals and multilinguals on the CILD-Q, and likewise, the bilinguals and multilinguals scored closely to one another. In line with our hypotheses, a null result of lingualism status on contextual linguistic diversity was present in the South African group, but our findings partially differed from our second hypothesis, where we predicted England monolinguals to score lower than their bilingual and multilingual counterparts given that this is a less linguistically diverse context. In fact, what we did find was a null result for lingualism status on contextual linguistic diversity for England too.
Interestingly, these results suggest that, on the one hand, regardless of the individual lingualism status descriptor one identifies with, one's sociolinguistic context is shown to be a consistent contributor to language experience. Contextual linguistic diversity is therefore an imperative variable to consider over and above lingualism status, since it is not always possible to show categorical differences across groups that classify themselves under separate linguistic labels. Importantly though, Tsimpli et al. (2020, p. 2) note that "linguistic diversity at the societal level may or may not translate as multilingualism at the individual level." There is thus a distinction between active and passive language exposure, although both contribute holistically to the language repertoire.
On the other hand, this is an illuminating finding, adding evidence to the growing body of literature questioning the appropriateness of a categorical notation of language experience. Recently, bilingualism has been disputed as a categorical variable; no definitive experiential threshold suddenly transforms an individual from a monolingual to a bilingual (Luk & Bialystok, 2013;Marian & Hayakawa, 2021;Takahesu Tabori et al., 2018). Rather, language experience is much more complex and multifaceted than the oftentimes reductionist approach it is condensed to. In this sense, our study provides substantial and supplementary evidence supporting this claim, and therefore offers the opportunity to explore similar questions where lingualism status is treated on a continuum, rather than categorically.
Clearly, the conceptions of lingualism categories are not as bounded as has been previously assumed (for a review see Surrain & Luk, 2019). We have shown that self-descriptions of lingualism status in one context can appear similar, thereby providing further evidence illustrating how heterogeneous language groups are, even if people identify themselves under the same linguistic label (also see Beatty-Martínez et al., 2020; Bice & Kroll, 2019; de Bruin, 2019). Accordingly, the more linguistic information we can obtain from an individual, the better placed we will be to make group classifications. We should therefore strive toward a graded approach that positions speakers on a continuum from less multilingual toward more multilingual (Gullifer et al., 2018; Gullifer & Titone, 2020; Luk & Bialystok, 2013; Marian & Hayakawa, 2021), specifically with an acknowledgment of their contextual linguistic experience (Wigdorowitz et al., 2020).

Contextual linguistic diversity and codeswitching
Lastly, we aimed to investigate whether codeswitching contributed to the effects of contextual linguistic diversity in South Africa and England across lingualism status groups (RQ3). In line with our hypotheses, we found that codeswitching did not account for the differences across groups in the MIC scale (which addresses more of the general consensus of language use within the context), where all South African lingualism groups scored higher than the same groups from England across items that both contained and did not contain codeswitching. In contrast, codeswitching, as a direct communicative practice, accounted for the higher MIP scores reported by South African multilinguals and monolinguals (this scale refers more specifically to one's personal language practice and engagements). We therefore cannot overlook the fact that there may be a difference in codeswitching practice between South African and England so-called multilinguals, and especially monolinguals. This finding is striking if we assume that monolinguals have access to one language only and codeswitching is generally a proposed bilingual phenomenon (e.g., Beatty-Martínez, Navarro-Torres, & Dussias, 2020) that requires access to more than one language.
In addition, there was no difference in MIP codeswitching items for the bilingual groups across the two contexts, nor were there any differences for lingualism groups across the nations on MIP items that did not refer to codeswitching. The fact that codeswitching explained the higher MIP scores reported by multilingual and monolingual South Africans but not for the bilingual group across the contexts may also suggest that bilingualism is a more concretely defined notion of language knowledge when contrasted with monolingualism and multilingualism, and can be taken to refer specifically to the knowledge of two languages. In contrast, monolingualism and multilingualism are more nuanced notions of language and may be broadly influenced by linguistic (e.g., number of languages spoken, proficiency) and non-linguistic (e.g., community-based norms) factors. Alternatively, these findings further illustrate the problematic categorization of language knowledge as encompassed within a single lingualism group, as mentioned above.
Essentially, the main point of interest is how speakers have perceived themselves in linguistic terms. If it is expected that a typical monolingual or multilingual would not engage in codeswitching because this phenomenon is considered a "bilingual" practice, then it is surprising that South African multilinguals and monolinguals report codeswitching more so than their England counterparts. The sociolinguistic context in which individuals are situated is then clearly an essential contributor to one's language repertoire. More precisely, individuals situated in multilingual contexts have language experiences that are different from individuals situated in unilingual contexts, even though they may classify themselves under the same linguistic label. Codeswitching is a common communicative practice in South Africa (Slabbert & Finlayson, 1999), observed across formal and informal settings (Mabule, 2019; Rose & Van Dulm, 2011) and in different communities (McCormick, 2002; Slabbert & Finlayson, 2002), suggesting that it is a widespread and pervasive part of many people's language repertoires. In England, however, it is less common, and occurs mainly, if at all, within immigrant or heritage communities (Promprakai, 2018). Furthermore, attitudes and community-based norms facilitating (or impeding) codeswitching are also important to consider (Beatty-Martínez et al., 2020). In multilingual societies, speakers' intentions to codeswitch may therefore be driven by pragmatic and interactional opportunities irrespective of one's lingualism status.

Conclusions
While individual reports of language history, use, and proficiency have generally been considered sufficient for language profiling, these variables alone neglect contextual linguistic experience as a factor that contributes to one's overall language repertoire. We have demonstrated that a language profiling measure, such as the CLiP-Q, that captures variables pertinent to language knowledge and experience overall should be a goal toward best practice. When conclusions are to be drawn about the role of factors that contribute to linguistic experience and knowledge, it is imperative that researchers are cognizant of the types of information they have acquired. A full picture cannot be drawn if it lacks an inspection of the sociolinguistic experience and exposure of those being evaluated, given how important sociolinguistic context is to one's linguistic repertoire. Fundamentally, we argue that contextual linguistic diversity can influence one's sensitivity to sociolinguistic variation that may be devoid of actual awareness or use of multiple languages, and furthermore, it can shape the status of languages within the socio-cultural context. The nature of contextual linguistic diversity has a profound effect on the linguistic knowledge and use of inhabitants of these contexts. This sociolinguistic factor should be prioritized if research in the language sciences is to make progress.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Appendix. Significant Linear Mixed-Effect (LME) models: Fixed and random effects
Model 1. LME on the CILD-Q scores including the fixed factors of nationality, scale, and SES