Linguistic and Cognitive Abilities in Children with Specific Language Impairment as Compared to Children with High-Functioning Autism

ABSTRACT This study investigates the question as to whether and how the linguistic and other cognitive abilities of children with Specific Language Impairment (SLI) differ from those of children with High-Functioning Autism (HFA). To this end, 27 Dutch-speaking elementary-school-age children with SLI, 27 age-matched children with HFA, and a control group of 27 age-matched Typically Developing (TD) children were experimentally tested on various components of grammar, pragmatics, and nonverbal cognition. Prima facie, the results suggest a resemblance between SLI and HFA in their lower-than-TD performance on pragmatics. However, the children with SLI perform significantly weaker than the TD children on grammar and several cognition tests, while the children with HFA do not. It is concluded that, despite their initial resemblance in terms of pragmatics, children with SLI have profoundly different profiles from children with HFA in terms of grammar and nonverbal cognition and can thus not be considered as instantiations of the same continuum, as proposed by Bishop (2010).


Introduction
The diagnosis of children with SLI has sparked a tremendous amount of research and literature, including comparisons with other impaired populations, such as children with Williams Syndrome, Down Syndrome, AD(H)D, and Autism Spectrum Disorder (ASD) (e.g., Laws & Bishop 2004; Rice, Warren & Betz 2005). This study focuses on the differentiation between children with SLI and children with High-Functioning Autism (HFA). In this study, children with HFA are defined as children with an ASD who have fluent speech and normal intelligence. Although children with SLI have traditionally been hypothesized to have poor language but normal cognitive abilities (Leonard 1998), other more recent studies suggest that many children with SLI also show cognitive impairments (e.g., Henry, Messer & Nash 2012). Vice versa, children with (high-functioning) autism are described as having "persistent deficits in social communication and social interaction" (DSM-5, American Psychiatric Association 2013), suggesting that pragmatics (i.e., the social use of language) is the domain of primary impairment (see Baron-Cohen 1988 and Eigsti et al. 2011 for reviews). Yet, recent research also indicates difficulties in the grammar of children with autism (e.g., Eigsti & Bennetto 2009; Perovic, Modyanova & Wexler 2013).
This raises the question as to whether SLI and autism are part of the same spectrum or, as Bishop (2010) coins it, the same "continuum." Bishop takes the above-chance co-occurrence of SLI and autism as an indication that they share the same etiology. Additionally, family studies have shown that relatives of individuals with SLI or autism often show mild symptoms, such as subtle phonological difficulties or mild social and communicative difficulties (Bailey et al. 1998). Bishop then concludes that SLI and autism correspond to points on a continuum of impairment, rather than being all-or-none diseases (Bishop 2010:619).
The current study focuses exactly on this question: Should SLI and autism (in particular, HFA) be distinguished as different impairments, or do they show so much overlap that they cannot really be separated and should be considered instantiations of the same continuum? Moreover, if the latter, does this mean that SLI and (high-functioning) autism belong to the same phenotype, with the same underlying etiology?
It is argued that this question can only be answered if a large and diverse test battery, including different types of linguistic and nonlinguistic cognitive tests, is used. Only then is it possible to reveal linguistic and cognitive profiles of different types of children. To preview the results, it is shown that, despite some resemblance of SLI and HFA in terms of pragmatics, the grammatical and nonverbal cognitive profiles of the children with SLI differ strongly from those of the children with HFA. This suggests that SLI and HFA are not part of the same spectrum or continuum and potentially have different underlying etiologies. It also suggests that the language problems of children with SLI have a different cause than the language problems of children with HFA and should thus be recognized and treated in different ways.
An additional focus of this study is the relationship between grammatical and nonverbal cognitive abilities. For example, does low nonverbal reasoning ability or low nonverbal working memory predict poor grammar? Although the SLI results show no evidence for a link between nonverbal reasoning and grammar, nonverbal working memory scores do seem to have some predictive power with respect to grammatical ability.
Finally, the question is asked what the SLI and HFA results can tell us about the dissociation of language components such as grammar and pragmatics. If there is a subgroup of the children with SLI that performs well on the pragmatics-driven phenomena but poorly on the grammar tests, this suggests a dissociation between the investigated phenomena in pragmatics and in grammar. Vice versa, if it can be shown that a subgroup of children with HFA makes errors on the pragmatics-driven tests but performs well on grammar, this would show the opposite dissociation. The results demonstrate that this is indeed the case, providing a double dissociation between the grammatical and pragmatics-driven phenomena under investigation.

Theoretical background
2.1. Grammatical phenomena
2.1.1. The Mass-Count Distinction
Languages that mark number on nouns, including English and Dutch, distinguish mass nouns from count nouns morphosyntactically. As exemplified in the schema in (1), count nouns such as dog can be preceded by an indefinite determiner, they can be pluralized by attaching a plural suffix, and they can be preceded by a numeral. In contrast, mass nouns, such as dough, cannot have an indefinite determiner or a bare numeral and can only be pluralized by adding a measure phrase.
(1) Morphosyntactic differences between count and mass

Count                          Mass
dog (a dog, three dogs)        water (*a water, *three waters, a bottle of water)
rope (a rope, three ropes)     rope (a piece of rope)

Interestingly, as the last row in the schema in (1) shows, there are nouns that can behave as either a count noun or a mass noun, for example, rope. Such nouns are often referred to as "flexible nouns" (see Barner & Snedeker 2005). Borer (2005) takes this further by claiming that in principle, all nouns are flexible in the sense that the syntax in which they appear determines their status as mass or count, as illustrated in (2) and (3):
(2) There is dog in the soup
(3) Territorial waters
In (2), a classical count noun is used in mass syntax (without a determiner or numeral and without a plural suffix) and is therefore interpreted as mass, whereas in (3), a classical mass noun is used in count syntax (with a plural suffix) and is therefore interpreted as count. For the purposes of this study, I follow Borer in assuming that it is mass or count syntax that determines the mass or count interpretation of the noun, and the mass-count distinction is therefore grammatical in nature. In the current study's mass-count experiment, truly flexible nouns such as rope(s) and pizza(s) are used, in which the plural morpheme is the only element distinguishing between mass and count.
Previous research on the mass-count distinction in Dutch shows that Dutch-acquiring children start distinguishing mass from count nouns around age 6 (van . As for children with autism or SLI, studies on the mass-count distinction are rare. Froud and van der Lely (2008) investigated 17 English-speaking children with G(rammatical)-SLI (aged 8;00-15;06), two groups of younger TD children (mean age 6;02 and 7;04), and a group of chronologically age-matched controls. In a production task, children were presented with novel nouns with a simple CVC structure (e.g., dap) together with potential syntactic and semantic cues. Both the syntactic cues (some dap for mass, a dap for count) and the semantic cues (substance referents for mass nouns, object referents for count nouns) could signal a mass or a count reading. The results suggest that in the TD children the integration of syntactic and semantic cues matures over time, whereas the children with G-SLI perform unlike any of the control groups, as they are not able to discriminate between mass and count novel nouns. For instance, they pluralize nouns in mass syntax, showing only limited use of syntactic cues to distinguish between novel count and mass nouns.
To date, there are no studies examining the acquisition of the mass-count distinction in children with autism. The present study is therefore the first to investigate this phenomenon in children with HFA.

Subject-Verb Agreement
Agreement between the subject and the verb in Dutch is expressed by a suffix on the verb as illustrated in the schema in (4):

(4) Subject-verb agreement in Dutch for the verb werken ('to work')

Person   Singular                Plural
1        (ik) werk               (wij) werk-en
2        (jij) werk-t            (jullie) werk-en
3        (hij/zij/het) werk-t    (zij) werk-en

Verbal agreement inflection on the Dutch verb expresses person (in the singular) and number. As these are typically grammatical features, subject-verb agreement is a grammatical phenomenon. Previous research on the acquisition of subject-verb agreement in TD Dutch shows that children generally correctly express verbal agreement from the moment they begin to produce two-word utterances, between the ages of 2 and 3 (Blom 2003). As for Dutch-speaking children with SLI, it is well known that they experience problems producing correctly inflected finite verbs (de Jong 1999; Rispens & Been 2007). Whereas TD children no longer produce root infinitives (i.e., mama lopen 'mommy walk') by age 3 (Blom 2003), children with SLI continue to do so until the age of 8 (Wexler, Schaeffer & Bol 2004). To date, there is no study systematically investigating subject-verb agreement in Dutch-speaking children with (high-functioning) autism. Now that the two grammatical phenomena to be tested (mass-count distinction and subject-verb agreement) have been described, I turn to the pragmatics-driven phenomena.
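The agreement paradigm in (4) can be made concrete with a small sketch. The function name inflect and the suffix table are illustrative assumptions, not part of the study's materials, and the sketch deliberately ignores Dutch spelling adjustments and irregular verbs; it only covers regular stems such as werk.

```python
# Present-tense agreement suffixes from schema (4), keyed by (person, number).
# Hypothetical helper for illustration; covers regular stems like "werk" only.
SUFFIXES = {
    (1, "sg"): "",
    (2, "sg"): "t",
    (3, "sg"): "t",
    (1, "pl"): "en",
    (2, "pl"): "en",
    (3, "pl"): "en",
}


def inflect(stem, person, number):
    """Attach the agreement suffix to a verb stem (no spelling rules applied)."""
    return stem + SUFFIXES[(person, number)]


# Print the full paradigm for the stem used in schema (4).
for person in (1, 2, 3):
    print(person, inflect("werk", person, "sg"), inflect("werk", person, "pl"))
```

Person is only distinguished in the singular, so all three plural cells receive the same suffix in this table.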

Article Choice
As argued by Stalnaker (1974) and Heim (1982) (among many others), the choice between a definite and an indefinite article as in (5) and (6) depends on knowledge of speaker/hearer assumptions (to be elaborated on subsequently) and can thus be assumed to be part of pragmatics.
(5) De jongen woonde in een groot kasteel.
'This is a story about a (certain) boy. The boy lived in a big castle'
(6) Ik heb zin om een boek te lezen (wat voor boek dan ook).
'I feel like reading a book' (whatever book it may be).
The first sentence in (5) contains the noun jongen ('boy'), which is introduced by the speaker while its referent is still unknown to the hearer. Therefore, the indefinite article een ('a') is chosen. In the second sentence, the referent of jongen is known to both the speaker and the hearer, yielding the choice of the definite article de ('the'). In (6), the referent of the noun boek ('book') is unknown to both speaker and hearer, resulting in the choice for an indefinite article as well.
Inspired by Schaeffer & Matthewson (2005), I propose the schema in (7) for the canonical realizations of the three possible assumption states in the Dutch adult article system ("Assumed by X" is shorthand for "X has grounds for an existential assertion"):

(7) The Dutch adult article system

A   Assumed by speaker and hearer            Part of common ground        de ('the')
B   Assumed by speaker only                  Not part of common ground    een ('a')
C   Assumed by neither speaker nor hearer    Not part of common ground    een ('a')

Furthermore, Hawkins (1991) and Horn (2006) propose that in the interpretation of indefinite NPs, the Maxim of Quantity (Grice 1975) is involved: Be as informative as is required/necessary (not more and not less). Based on this maxim, adults draw a scalar implicature when they interpret indefinite NPs. Scalar implicatures are implicitly communicated propositions linked to relatively weak terms (consider, for example, how some pragmatically implies not all) (Pouscoulous et al. 2007). The general consensus is that the weaker term (e.g., the quantifier some), while logically/semantically compatible with a stronger term from the same scale (e.g., all), prompts the inference because the speaker did not use the stronger term. Hawkins and Horn propose a Definiteness Scale, in which the is the logically stronger and most informative member of the pair: <a, the>. Indefinite interpretations are then analyzed as implicatures that result from not using the definite article in corresponding expressions (Hawkins 1991:417).
As the choice between a definite and an indefinite article requires the consideration of speaker and hearer assumptions, as well as adherence to the Maxim of Quantity, I consider it a phenomenon driven by pragmatics.
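The three assumption states in schema (7) amount to a simple decision rule, sketched here. The function names and the boolean encoding of the assumption states are assumptions made for illustration; the sketch uses de for the definite article as in the schema and leaves aside the neuter definite article het. The hearer-only state is not part of the schema and is therefore rejected.

```python
def target_article(speaker_assumes, hearer_assumes):
    """Return the canonical article for an assumption state in schema (7)."""
    if speaker_assumes and hearer_assumes:
        return "de"   # state A: part of common ground
    if hearer_assumes:
        # Not one of the three states listed in schema (7).
        raise ValueError("hearer-only state is not covered by schema (7)")
    return "een"      # states B and C: not part of common ground


def is_substitution(speaker_assumes, hearer_assumes, produced):
    """True if the produced article is the other member of the pair de/een."""
    target = target_article(speaker_assumes, hearer_assumes)
    return produced in ("de", "een") and produced != target
```

For example, producing een where both speaker and hearer assume the referent would count as a substitution under this encoding.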
Interestingly, van Hout, Harrigan & de Villiers (2010) also found nonadultlike interpretation of the indefinite article a. Their results indicate that English-acquiring children aged 3;07-5;03 (mean: 4;06) incorrectly interpret a as referring to a unique referent 59% of the time. van Hout, Harrigan & de Villiers (2010) explain this nonadultlike interpretation of indefinites by the failure to draw scalar implicatures.

Direct Object Scrambling
Direct object scrambling concerns the placement of a direct object before or after an adverb or negation, as exemplified in (8)-(11). Referential objects strongly prefer to be scrambled, as illustrated in (8)-(10). In contrast, nonreferential objects must remain unscrambled, as shown in (11). Like definiteness, referentiality is tied to the different states of speaker and hearer beliefs. In Schaeffer (2000) I argue that the reason for the preference of referential objects to be scrambled is as follows: In order to obtain its referential interpretation, a referential object tries to be as close as possible to its antecedent (which is outside the sentence, in the preceding discourse), i.e., toward the left periphery of the sentence.
Direct object scrambling is an interface operation involving (at least) pragmatics as well as grammar: Pragmatics, because consideration of both speaker and hearer beliefs is necessary to establish (non-) referentiality; grammar, because direct object scrambling involves syntactic placement of the referential direct object in a noncanonical fronted position, assuming that its canonical position is sister-of-V.
Schaeffer (2000) carried out an Elicited Production Task on direct object scrambling (DOS) over negation and different types of adverbs with 49 monolingual TD Dutch-acquiring children between the ages of 2 and 7. She found that 2- and (to a lesser extent) 3-year-old TD Dutch-acquiring children often fail to scramble over negation in referential contexts (70% and 28% respectively for definite DPs and 69% and 27% respectively for proper names). Schaeffer attributes this to the failure to distinguish between different types of referentiality, distinguishing between noun phrases such as the girl, and nouns such as the sun, resulting in nonmovement of the referential object. As the relevant distinction relates to speaker/hearer assumptions and what they are based on, it is concluded that it is the children's underdeveloped pragmatics that causes the failure to scramble referential direct objects in young Dutch-speaking children.

Hypotheses and Predictions
Taking Bishop's (2010) hypothesis that SLI and ASD are part of the same continuum, it is predicted that children with SLI and children with ASD show an overlap in their linguistic and nonlinguistic cognitive profiles. That is, some children originally diagnosed with SLI will show a similar linguistic and nonlinguistic cognitive profile as some children originally diagnosed with ASD. If this prediction is not borne out, and no overlapping profiles between SLI and ASD are found, this provides evidence for different underlying causes of SLI and ASD and therefore for a distinct etiology of SLI as compared to ASD.

Participants
Four groups of participants were tested, as presented in Table 1. The children with HFA were recruited through Dutch organizations for autism, autism groups on Facebook, and personal contacts, and had an official diagnosis of Autism Spectrum Disorder by a psychiatrist, based on the DSM-IV (American Psychiatric Association 2000). The children with SLI were selected from special schools for children with speech and language problems in The Netherlands. Children with an IQ < 85 and/or officially diagnosed with any additional disorder (such as autism in the SLI group, language impairment in the HFA group, or AD(H)D) were not included. Nevertheless, we do not exclude the existence of comorbidity with other developmental disorders in both the SLI and the HFA group. All child participants were individually matched on age and gender.
To further confirm the SLI and HFA diagnoses of the impaired populations, age-normalized scores of expressive and receptive linguistic ability were obtained from the Dutch version of the Clinical Evaluation of Language Fundamentals (CELF-4-NL) (Semel et al. 2008). Whereas the SLI group performed far below the norm score of the 50th percentile (mean = 7.9, SD = 7.34), as expected, the HFA and TD groups performed around or above the norm score (HFA: mean = 53.7, SD = 29.5; TD: mean = 73.4, SD = 24.9). Additionally, parents of all HFA participants completed the Dutch version of the Children's Communication Checklist (CCC-2-NL) (Geurts 2005), a parents' questionnaire whose scores provide an impression of pragmatic skills (among other things). As expected, the HFA group has a high SIDI (Social Interaction Difference Index) score (mean = 81.6, SD = 19.8), indicating pragmatic difficulties. In contrast, the SLI group has a low SIDI score (mean = 14.9, SD = 12.9), suggesting no particular pragmatic weaknesses.

Materials and Procedure
The tasks used for the current study are part of a large battery of 16 tests designed to investigate the grammatical, pragmatic, and cognitive development of children with SLI and children with HFA, in collaboration with Iris Duinmeijer, at the University of Amsterdam.

Mass-Count
Using a Quantity Judgment Task based on Barner & Snedeker (2005), participants are presented with an image containing two characters. In every picture, one character has two large objects and the other has four, five, or six smaller objects of the same kind. Importantly, the two large objects have a larger combined volume and surface area than the smaller objects combined.
All nouns used for the current study are flexible between mass and count: When occurring with plural marking they are count, e.g., strings; without plural marking they are mass, e.g., string.¹ Participants are asked which of the two characters has more X. A noun (X) presented in count syntax elicits a response based on number of individual items; a noun presented in mass syntax (or more specifically, in the absence of count syntax) elicits a response based on volume. Illustrations are given in (12) (count syntax) and (13) (mass syntax).
The mass-count experiment contains two conditions (mass and count), with 12 experimental items each, and 8 fillers, presented in pseudo-randomized order. The filler items are count nouns for which the accompanying images differ only in number, rather than in both number and overall volume as in the experimental count condition. The utterances were prerecorded and shown using Microsoft PowerPoint.
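The logic of the Quantity Judgment Task can be sketched as follows: with plural marking (count syntax) the target answer is the character with more individual items, and without it (mass syntax) the character with the larger total volume. The function name, the dictionary layout, and the volume figures are hypothetical and serve only to illustrate the design; they are not taken from the actual stimuli.

```python
def expected_choice(plural_marked, scene):
    """Return the character an adult-like response would pick.

    scene maps a character name to {"count": items, "volume": total volume}.
    Count syntax (plural marking) is judged by number, mass syntax by volume.
    """
    dimension = "count" if plural_marked else "volume"
    return max(scene, key=lambda character: scene[character][dimension])


# Hypothetical scene: two large pieces of string versus five small ones,
# where the two large pieces have the larger combined volume.
scene = {
    "character_A": {"count": 2, "volume": 10.0},
    "character_B": {"count": 5, "volume": 6.0},
}
print(expected_choice(True, scene))   # "Who has more strings?"
print(expected_choice(False, scene))  # "Who has more string?"
```

Because number and volume point to different characters in every experimental picture, the response itself reveals which reading the participant assigned to the noun.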

¹ The mass-count experiment presented in the present study is part of a larger mass-count experiment containing additional conditions such as count, classical mass, and object-mass.

Subject-Verb Agreement
Subject-verb agreement was tested by an Elicited Production Task (Duinmeijer 2012) in which participants are asked to describe actions presented on cards. There are five conditions (first, second, and third person singular and first and second person plural) with 12 items each, using six transitive verbs twice (bake, comb, read, clean, film, and drink). The task is presented as a game involving the participant, the experimenter, and a doll named Kim. All three receive a pile of 30 cards depicting a person acting out one of the six verbs mentioned. The cards lie face down, and in each round the top card of each pile is turned over. The participant is then asked to say what everyone is doing (i.e., What are you doing? What am I doing? What is Kim doing? What are we doing?). In some cases two of the cards depict the same action and object, eliciting a plural.

Article Choice
Article choice was tested by an Elicited Production Task based on Schaeffer & Matthewson (2005) in which participants are asked to describe an event in a picture or short video clip displayed on a computer screen to an experimenter (A) who cannot see the screen, while a second experimenter (B) is sitting next to the participant.
The article choice experiment consists of 18 fillers (testing direct object scrambling; see section 3.2.4) and 18 experimental items in three conditions: six items in a definite condition, six items in an indefinite-referential condition, and six items in an indefinite-nonreferential condition, corresponding to the schema in (7) in section 2.2.1. All items were shown using Microsoft PowerPoint. An illustration of the definite condition is given in (14).

Direct Object Scrambling
As mentioned in section 3.2.3, direct object scrambling and article choice were tested in the same experiment, serving as fillers for each other. The direct object scrambling part is based on methods used by Schaeffer (2000). All experimental items concern direct object scrambling with respect to sentential negation, rather than adverbs, because they provide the clearest contexts for obligatory (non-)scrambling.
As in the article choice experiment, three different conditions are distinguished, namely (i) definite (six test items), (ii) indefinite-referential (six test items), and (iii) indefinite-nonreferential (six test items). A sample scenario of the definite condition is provided in (15). For illustrations of the other conditions, see Schaeffer (in press).
Patrick goes not the book read
Note that in the preamble by experimenter B there is a sentence containing both negation niet and a direct object (dat 'that'). Yet, the direct object is in neither scrambled nor nonscrambled position. Dutch allows for a third position for a direct object, namely, the topicalized position at the beginning of the structure. By employing this topicalized position for the direct object, giving away a (non-)scrambled order in the preamble is avoided.

Other measures (CELF, ToM, inhibition, reasoning skills, working memory)
In addition to the experimental tasks described previously, several background measures were administered. Age-normalized scores of expressive and receptive linguistic ability were obtained from the Dutch version of the Clinical Evaluation of Language Fundamentals (CELF-4-NL) (Semel et al. 2008). Age-normalized scores of nonverbal intelligence were obtained from the Raven's Progressive Matrices (Raven 1976). Additionally, a nonverbal False Belief (FB) task was conducted (Colle, Baron-Cohen & Hill 2007). In order to assess inhibition, a nonverbal task was administered (VIMI Hand-Fist game, Henry, Messer & Nash 2012), in which participants were requested to copy and inhibit certain hand positions. Nonverbal Working Memory (WM) was tested with the so-called odd-one-out task (Henry 2001), in which the participant had to point at the odd-one-out figure (out of three) and subsequently indicate the (blank) positions where the odd-one-out figures were before (maximum of six). Finally, three verbal WM tasks were also administered: In the Forward Digit Span task (WISC-R; Wechsler 1974) the participant is asked to repeat digits, up to a level of a maximum of eight digits in a row. The task in the Backward Digit Span (WISC-R; Wechsler 1974) is to repeat the digits in reverse order. In the Non-Word Repetition (NWR) task (Rispens & Baker 2012), the participant has to repeat nonsense words, varying in syllable length and phonotactic probability.

Results and discussion
4.1. Results grammar
4.1.1. Mass-count results
Figure 1 presents the collapsed accuracy scores for the flexible mass and count conditions for all groups.³,⁴ A Kruskal-Wallis test reveals a difference between accuracy scores of the groups, χ²(3) = 33.732, p ≤ .001, with a mean rank of 46.89 for the HFA group, 27.05 for the SLI group, 60.84 for the TD group, and 73.50 for the adult group. A pairwise post hoc Kruskal-Wallis test shows that the TD group as a whole does not significantly differ from the adults (p = 1.0). Crucially, the children with SLI perform significantly worse than the TD children (p ≤ .001), while the children with HFA do not (p = .352). Moreover, the children with HFA perform significantly better than the children with SLI (p = .043).
Figure 2 presents the accuracy scores for subject-verb agreement for all groups. The statistics on the subject-verb agreement results show a picture very similar to those of the mass-count results. A Kruskal-Wallis test reveals a difference between accuracy scores of the groups, χ²(3) = 26.708, p ≤ .001, with a mean rank of 51.196 for the HFA group, 28.13 for the SLI group, 60.31 for the TD group, and 60.13 for the adult group. A pairwise post hoc Kruskal-Wallis test shows that the TD group as a whole does not significantly differ from the adults (p = .892). Crucially, the children with SLI perform significantly worse than the TD children (p ≤ .001), while the children with HFA do not (p = .158). Moreover, the children with HFA perform significantly better than the children with SLI (p = .001).
³ One of the items in the mass condition was excluded (Wie heeft er meer papier? 'Who has more paper?'), since even the adults had an accuracy score of only 38.46% on this item. The TD children (35.71%) and the children with SLI (17.86%) and ASD (50%) also performed less well on this item as compared to the other items.
⁴ For the results on all mass-count conditions, see Creemers (2014).
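For readers less familiar with the Kruskal-Wallis test, the following sketch shows how the test statistic and the mean ranks per group are obtained from raw scores. It is a minimal illustration under stated assumptions: the function names are hypothetical, no tie correction is applied (statistical packages usually apply one, which matters for accuracy data with many tied scores), and no p-value is computed. The statistic is referred to a chi-square distribution with one degree of freedom fewer than the number of groups, which is why four groups yield three degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, mean rank per group) for a dict of group name -> scores.

    Uncorrected H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1).
    """
    names = list(groups)
    pooled = [score for name in names for score in groups[name]]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted_sum = 0.0
    mean_ranks = {}
    start = 0
    for name in names:
        n = len(groups[name])
        rank_sum = sum(ranks[start:start + n])
        mean_ranks[name] = rank_sum / n
        weighted_sum += rank_sum ** 2 / n
        start += n
    h = 12 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)
    return h, mean_ranks
```

A lower mean rank for a group indicates that its scores tend to fall toward the bottom of the pooled ordering.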

Article Choice Results
also Schaeffer, van Witteloostuijn & de Haan 2014). Substitution responses are responses in which a definite article is used in an indefinite condition or an indefinite article is used in the definite condition.
As for the indefinite conditions, the tall bars in Figures 3 and 4 show that all children and adults overwhelmingly choose to produce the target indefinite article. Kruskal-Wallis tests show that none of the differences between child groups or response types is statistically significant. In the definite condition, in contrast, a Kruskal-Wallis test reveals a difference between the child groups, H(2) = 8.676, p < .05. Mann-Whitney U tests between the different pairs of groups show that the difference between the HFA and the TD group is significant (U = 245, p < .05); the same holds for the SLI and the TD group (U = 224.5, p ≤ .005). The HFA and SLI groups do not differ significantly from one another (U = 356, p = .875). Both the HFA group and the SLI group produce correct definite articles significantly less often than the TD group. This lower use of correct definite articles in the definite condition is largely due to the substitution of indefinite articles in both groups: 15% in the HFA group and 13% in the SLI group. Both these numbers are significantly higher than that of the TD children (4%): HFA: U = 272, p < .05, SLI: U = 265.5, p < .05.
Figures 6 and 7 present the proportions of scrambled, nonscrambled, and irrelevant responses in the nonreferential (Figure 6) versus the referential (Figure 7) conditions. As indicated by the tall light gray bars for each group, virtually all indefinite nonreferential direct objects remain unscrambled. In fact, a Kruskal-Wallis test reveals no significant differences between the groups in the proportion of direct objects correctly left unscrambled (χ² = 7.611, p = .055), with a mean rank of 47.09 for the HFA group, 41.79 for the SLI group, 56.61 for the TD group, and 61.03 for the adult group. In addition, there is no significant difference regarding the proportion of scrambled items between the groups (Kruskal-Wallis test: χ² = 3.716, p = .294), with a mean rank of 54.00 for the HFA group, 50.61 for the SLI group, 48.89 for the TD group, and 47.00 for the adult group.
The same holds for the proportion of irrelevant answers (χ² = 6.608, p = .086), with a mean rank of 52.64 for the HFA group, 59.21 for the SLI group, 44.25 for the TD group, and 42.44 for the adult group. In contrast, Figure 7, representing the (definite and indefinite) referential conditions, in which scrambling is obligatory, does show differences in the accuracy rate of HFA and SLI children versus the TD children and adults. In fact, Kruskal-Wallis tests reveal a difference in the proportion of correctly scrambled items between groups (χ² = 32.728, p ≤ .001) with a mean rank of 45.50 for the HFA group, 29.04 for the SLI group, 64.93 for the TD group, and 71.56 for the adult group. Similar results are found for the proportion of items incorrectly left unscrambled (χ² = 32.969, p ≤ .001) with a mean rank of 55.25 for the HFA group, 71.66 for the SLI group, 34.59 for the TD group, and 33.00 for the adult group.
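The pairwise Mann-Whitney U values reported for the article choice comparisons can be understood through the following sketch, which counts, over all cross-group pairs, how often a score in one group exceeds a score in the other (ties count as one half). The function name is hypothetical, and reporting the smaller of the two U values is an assumed convention here; the sketch does not compute a p-value.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U statistics."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0   # a "wins" this pair
            elif a == b:
                u_a += 0.5   # ties are split between the groups
    # The two U values always sum to the number of cross-group pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)
```

The maximum possible U is the product of the two group sizes, so a U close to half that product indicates heavily overlapping score distributions.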

Direct Object Scrambling Results
A pairwise post hoc Kruskal-Wallis test shows that the TD group does not differ significantly from adults on either measure (scrambled: p = .378 and unscrambled: p = .791). However, the HFA group scrambles referential direct objects only at a rate of 63%, while the children with SLI do this even less often: 45%. A one-sample t-test shows that the SLI group's performance does not differ significantly from chance (t = -1.119, p = .273), whereas the children with HFA perform above chance level (t = 2.126, p < .05). Moreover, pairwise post hoc Kruskal-Wallis tests reveal that the scrambling rates of the HFA group (63%) as well as the SLI group (45%) are significantly lower than those of the TD group (84%) (HFA-TD: p < .01, SLI-TD: p < .001). Looking at it from the perspective of nonscrambled structures, pairwise post hoc Kruskal-Wallis tests reveal that both the HFA and the SLI groups fail to scramble referential direct objects significantly more often than their TD peers: The HFA group does this at a rate of 27% and the SLI group at a rate of 40% (HFA-TD: p < .01, SLI-TD: p < .001).
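The comparison against chance can be illustrated with a one-sample t statistic, sketched below using only the Python standard library. The function name is hypothetical, and the chance level is passed in as a parameter; a value of 0.5 for the proportion of scrambled responses is an assumption that is consistent with the negative t value reported for a group scrambling 45% of the time, but the text does not state the chance level explicitly.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, chance):
    """Return t = (mean - chance) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(scores)
    standard_error = stdev(scores) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(scores) - chance) / standard_error
```

A negative t indicates a group mean below the chance level, a positive t a mean above it; significance is then read from the t distribution with n - 1 degrees of freedom.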

Interim Summary of Results
The following picture emerges from the examination of the results on the two grammatical and on the two pragmatics-driven tests of our test battery:

(16) Grammar (mass-count distinction and subject-verb agreement):     SLI < TD    HFA = TD
     Pragmatics (article choice and direct object scrambling):       SLI < TD    HFA < TD

As predicted, the children with SLI perform significantly worse than their TD age-mates on grammar, while the children with HFA do not differ from the TD children on these tests. In contrast, both the SLI and the HFA groups perform significantly worse than the TD controls on parts of the two pragmatics-driven tests. This was predicted for the children with HFA but not for the children with SLI. Does this mean that children with SLI and children with HFA have similar language impairments? To answer this question, a more individual analysis was performed, as presented in section 4.4.

Individual Results Pragmatics as Compared to Grammar and Cognition
For both pragmatics-driven tests, the participants were classified as "passers" or "failers." In the article choice task, participants received a "pass" if they produced zero or one (out of six) indefinite articles in the definite condition, and a "fail" if they produced two or more (out of six) indefinite articles in the definite condition. In the direct object scrambling task, participants were assigned a "pass" if they left no more than 2 out of 12 referential items unscrambled, and a "fail" if they left at least 3 out of 12 referential items unscrambled. This results in the pattern in (17):

(17) Article choice
     SLI    4/27 failers (age 8-10, mean 9;3)
     HFA    6/27 failers (age 6-11, mean 8;8)
     Direct object scrambling
     SLI    18/28 failers (age 6-13, mean 9;11)
     HFA    15/28 failers (age 5-14, mean 9;10)

The schema in (17) shows that there are comparable numbers of failers in the SLI and HFA groups: 4 versus 6 respectively, for article choice, and 18 versus 15 for direct object scrambling. This only confirms the resemblance of SLI and HFA in pragmatics. Nevertheless, a comparison of the failer groups' pragmatics results with their performance on the grammatical tests and the cognitive tests reveals very different profiles. As for the grammatical tests, besides the mass-count and the subject-verb agreement scores, we also included the scores of the "Sentence Recalling" subtest of the CELF-4-NL (Semel et al. 2008), since these scores have been argued to be an indication of grammatical skills (see Polišenská, Chiat & Roy 2015).
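The pass/fail criteria described above can be stated as two small functions. The function names are hypothetical; the thresholds are those given in the text (at most one indefinite article out of six definite items for article choice, and at most 2 out of 12 referential items left unscrambled for direct object scrambling).

```python
def classify_article_choice(indefinites_in_definite_condition):
    """Pass with 0 or 1 indefinite articles (out of 6) in the definite condition."""
    return "pass" if indefinites_in_definite_condition <= 1 else "fail"


def classify_scrambling(referential_items_left_unscrambled):
    """Pass with at most 2 (out of 12) referential items left unscrambled."""
    return "pass" if referential_items_left_unscrambled <= 2 else "fail"


def count_failers(error_counts, classify):
    """Count the participants in a group who are classified as failers."""
    return sum(1 for errors in error_counts if classify(errors) == "fail")
```

Applied to a list of per-participant error counts, count_failers yields the kind of tally reported per group in the schema of failers.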
Regarding the cognitive tests, we present the scores on the nonverbal cognitive tests as mentioned in section 3.3 (Theory of Mind, Inhibition, Reasoning Skills, Working Memory). Tables 2 and 3 demonstrate that the SLI pragmatic failers perform worse than their TD controls at grammar but that the HFA pragmatic failers are mostly TD-like at grammar. Thus, despite the resemblance of the SLI and HFA failer subgroups in terms of pragmatics, their grammatical profiles differ strongly. As Table 3 shows, the HFA direct object scrambling failers differ significantly from the TD children on both mass-count and sentence repetition. If the direct object scrambling experiment only reflected pragmatic skills, this would be an unexpected result. However, besides pragmatic knowledge of speaker/hearer assumptions, DOS also involves syntactic knowledge of word order, in particular direct object placement. Thus, failure on DOS can be caused by either a pragmatic or a syntactic deficit, or both. I propose that the children with SLI fail this test mainly because of their weak grammatical/syntactic skills. Children with HFA fail the DOS test mainly because of their weak pragmatic skills, but a small subgroup of the children with HFA may also fail DOS because of weak grammar. This is confirmed by the analysis of Creemers (2014), who compares the same children's results on mass-count to those of article choice. She finds that there are nine children with HFA who fail the mass-count experiment, suggesting that there is an HFA subgroup that is slightly grammatically impaired. 11 (Some of) these children are probably responsible for the significantly lower performance of the HFA DOS failers on mass-count and on sentence repetition. Interestingly, the mass-count and sentence repetition scores in the article choice failer groups (Table 2) show no such patterns: The HFA AC failers perform TD-like on both mass-count and on sentence repetition (and on subject-verb agreement, for that matter). This is in line with the analysis that article choice (speaker/hearer knowledge) involves mainly pragmatic knowledge, as opposed to grammatical knowledge (see Schaeffer, Van Witteloostuijn & De Haan, 2014). Returning now to the profile issue, the scores on mass-count, subject-verb agreement, and sentence repetition clearly show that the SLI pragmatic failers have a weak grammatical profile, whereas the HFA pragmatic failers show relatively strong grammatical skills. This argues against a phenotypical overlap between SLI and HFA and may suggest different underlying etiologies for SLI and HFA.
Notes to Tables 2 and 3: Inhibition is reported as accuracy scores on inhibition items, Raven's as percentile scores, and working memory as memory level (max. 6).
This profile difference between SLI and HFA is further emphasized by the scores on the nonverbal cognitive tests. Parallel to the grammatical differences between the SLI and HFA groups, the SLI pragmatic failers are also weak at reasoning skills and working memory, whereas the HFA pragmatic failers generally perform TD-like on these cognitive measures, except for the HFA scrambling failers, who show a marginally significantly worse performance than the TD children on inhibition. Nevertheless, the HFA AC failers are TD-like on the inhibition measure. Interestingly, the nonverbal Theory of Mind scores are normal in all groups, even in the HFA group.
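The passer/failer criteria used in this section for the two pragmatics-driven tests can be written out as a minimal sketch. The cut-offs (at most 1 of 6 indefinite articles in the definite condition; at most 2 of 12 referential items left unscrambled) are those stated above. The function names and the three example children are hypothetical illustrations and not data from the study.

```python
# Cut-offs follow the criteria stated in the text; all names are illustrative.

def article_choice_outcome(indefinites_in_definite_condition: int) -> str:
    """Pass with 0 or 1 (out of 6) indefinite articles in the definite condition."""
    if not 0 <= indefinites_in_definite_condition <= 6:
        raise ValueError("expected a count between 0 and 6")
    return "pass" if indefinites_in_definite_condition <= 1 else "fail"


def scrambling_outcome(unscrambled_referential_items: int) -> str:
    """Pass with at most 2 (out of 12) referential items left unscrambled."""
    if not 0 <= unscrambled_referential_items <= 12:
        raise ValueError("expected a count between 0 and 12")
    return "pass" if unscrambled_referential_items <= 2 else "fail"


# Hypothetical children (NOT study data):
# (indefinite articles in definite condition, unscrambled referential items)
children = {"child_A": (0, 2), "child_B": (2, 1), "child_C": (1, 5)}

outcomes = {
    name: (article_choice_outcome(ac), scrambling_outcome(dos))
    for name, (ac, dos) in children.items()
}

for name, (ac_result, dos_result) in outcomes.items():
    print(f"{name}: article choice = {ac_result}, scrambling = {dos_result}")
```

Because the two tests are classified separately, a child can pass one and fail the other, which is why the failer subgroups for article choice and direct object scrambling differ in size.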

Overall summary and repercussions for SLI as compared to autism
In sections 4.1, 4.2, and 4.3 we saw that a subgroup of the children with HFA as well as a subgroup of the children with SLI have a pragmatic impairment. However, despite similarities in terms of pragmatics, the two subgroups have strikingly different profiles regarding grammar and other cognitive skills, such as nonverbal reasoning skills, and nonverbal working memory. Whereas the HFA pragmatics failers show virtually no weaknesses in these areas, the SLI pragmatics failers also perform significantly worse than their TD peers on grammar, nonverbal reasoning skills, and nonverbal working memory. From this, it is difficult to conclude that SLI and HFA are instantiations of the same continuum, as Bishop (2010) proposes. The clearly different profiles of SLI and HFA on grammar and nonlinguistic cognitive functions suggest that the underlying causes for the pragmatic impairments in HFA and SLI may well be very different.
Recall that the pragmatic weakness only concerns a subgroup of SLI (and a subgroup of HFA, for that matter). There are also children with SLI who show no pragmatic impairment: 8 children with SLI pass the direct object scrambling test, and 23 children with SLI pass the article choice test. Yet, all SLI subgroups have grammar scores that are significantly lower than those of the TD children and can thus be considered to have a grammatical impairment. In contrast, the HFA pragmatics failers mostly have TD-like scores on the grammatical tests. This means that, besides an SLI subgroup that is grammatically weak but pragmatically strong, we can identify an HFA subgroup with the opposite pattern: grammatically strong but pragmatically weak. In other words, the test results reveal a double dissociation: The grammar required for the phenomena under investigation can be impaired independently of the pragmatics required for the phenomena under investigation and vice versa. Such a double dissociation suggests that (parts of) grammar and (parts of) pragmatics can be impaired independently, providing evidence for language-internal modularity.
The deficiencies in both grammar and cognition observed in the SLI group raise the question as to whether there is a relationship between grammatical impairment and cognitive weaknesses. Let us first consider the SLI group's nonverbal reasoning skills, which are significantly lower than those of the TD controls. According to the Raven's instruction manual, seven children in the SLI group can be categorized as failers on this test (scores between 6 and 10). However, as we saw before, all children with SLI have low grammatical scores. A Spearman's rank-order correlation reveals no significant correlations between the SLI group's Raven's scores and their grammatical scores (mass-count: p = .597, subject-verb agreement: p = .423, sentence repetition: p = .110). Thus, the grammatical deficit in SLI cannot be caused by lower nonverbal reasoning skills.
In contrast, weak nonverbal working memory does seem to be related to weak grammatical skills. A Spearman's rank-order correlation reveals a significant correlation between the SLI group's nonverbal working memory scores and their scores on the mass-count task (p ≤ .001) and on the subject-verb agreement task (p < .01) (but not between nonverbal working memory and the performance on the sentence repetition task of the CELF, p = .732). In order to learn more about the nature of this correlation, the SLI group was divided up into "working memory passers" (memory level 3 or higher out of 6) and "working memory failers" (memory level lower than 3 out of 6). The numbers in Table 4 show that the working memory failers perform significantly worse on mass-count and subject-verb agreement than the working memory passers (but not on sentence repetition). This suggests that nonverbal working memory has some predictive power regarding grammar outcome and thus that weak nonverbal working memory may negatively impact grammatical performance, as is demonstrated by the SLI group tested in the current study. If this is true, (one of) the underlying cause(s) for grammatical impairment in SLI may be sought in (nonverbal) working memory skills. Nonetheless, weak working memory could never be the sole underlying cause of grammatical impairment, since all the children with SLI are grammatically weak, while only a subgroup (N = 9) fails on working memory tasks, as Table 4 shows.
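The two analysis steps described above, a Spearman's rank-order correlation followed by a split into working memory passers and failers, can be illustrated with a short sketch. It uses only the Python standard library and computes the coefficient rho alone; obtaining a p-value would require a statistics package. The cut-off (memory level 3 or higher out of 6) is taken from the text. The function names and the scores are hypothetical and are not the study's data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


def split_by_memory_level(memory_levels, cutoff=3):
    """Return indices of passers (level >= cutoff) and failers (level < cutoff)."""
    passers = [i for i, m in enumerate(memory_levels) if m >= cutoff]
    failers = [i for i, m in enumerate(memory_levels) if m < cutoff]
    return passers, failers


# Hypothetical scores for six children (NOT study data).
memory_levels = [1, 2, 2, 3, 4, 5]        # nonverbal memory level, max. 6
mass_count = [40, 55, 50, 70, 65, 90]     # illustrative task scores

rho = spearman_rho(memory_levels, mass_count)
passers, failers = split_by_memory_level(memory_levels)

print(f"rho = {rho:.3f}")
print("mean score, passers:", mean(mass_count[i] for i in passers))
print("mean score, failers:", mean(mass_count[i] for i in failers))
```

A rank-based coefficient suits an ordinal measure such as a memory level from 1 to 6, since it relies only on the ordering of the scores and not on the distances between them.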
If nonverbal working memory influences grammatical abilities, it is expected that verbal working memory also correlates with grammar, since this contains a language component as well. The inclusion of verbal working memory tasks such as Forward Digit Span, Backward Digit Span, and Non-Word Repetition (NWR) in our test battery allows us to calculate the correlations between the grammatical tasks and the verbal working memory tasks as well. 12 A Spearman's rank-order correlation in the SLI group indicates a significant correlation between the scores on the forward digit span and sentence repetition (p < .001) and mass-count (p < .001), but not with subject-verb agreement, although there is a trend there (p = .064). Similar correlations were found regarding the Backward Digit Span scores in the SLI group: with sentence repetition (p = .018) and with mass-count (p = .002) but not with subject-verb agreement (p = .323). Finally, the SLI children's NWR scores correlate significantly with sentence repetition (p < .001), mass-count (p < .001), and subject-verb agreement (p < .001). This provides further evidence for the hypothesis that working memory abilities (both verbal and nonverbal) are linked to grammatical abilities. Additional support for this hypothesis comes from some older studies reporting that "the limitations imposed by low working memory capacity are felt only or mostly in complex sentences" (King & Just 1991; Miyake, Carpenter & Just 1994). However, in these studies no distinction is made between verbal and nonverbal working memory, and it is debatable whether the mass-count distinction and subject-verb agreement should be considered "complex sentences." More recent corroborating results are provided by Henry, Messer & Nash (2012), who report correlations between language and verbal and nonverbal working memory in a group of 41 children with SLI aged 8;01-14;01. However, Henry et al. also find correlations between language and nonverbal inhibition, something that the current study does not show. The fact that Henry et al. do not distinguish explicitly between different components of language (e.g., grammar, pragmatics), as the current study does, makes the results somewhat difficult to compare. Finally, a study by Meir and Armon-Lotem (2014) shows that children with SLI (regardless of being mono- or bilingual) are outperformed by their TD peers on the forward digit span, nonword repetition, and a sentence repetition task. Additionally, they report associations between language proficiency and verbal working memory (forward digit span, NWR, and sentence repetition). In other words, the effect of SLI was observed for all verbal working memory tasks.
In summary, several studies point to a correlation between general language abilities and working memory in SLI, and most studies find this for verbal working memory. This is not all that surprising, since verbal working memory contains a language component, and if language is weak, this may affect the performance on a working memory task including language. The novel finding of the current study is the correlation and possibly the causal relationship between weak nonverbal working memory and problems in the grammar component of language in SLI. In this respect, the children with SLI clearly distinguish themselves from the children with HFA, who show no weaknesses in either grammar or nonverbal working memory.
Interestingly, the current study reveals no correlations between the SLI group's nonverbal working memory scores and their scores on the pragmatics tests (Article Choice: p = .411 and Direct Object Scrambling: p = .409). This indicates that weak nonverbal working memory is not related to just any linguistic weakness but specifically to a grammatical deficit, as opposed to a pragmatic deficiency.
Returning then to Bishop's (2010) hypothesis that SLI and HFA (ASD) are part of the same continuum, the results of this study clearly point to the opposite: Despite a prima facie resemblance between SLI and HFA in terms of pragmatics (direct object scrambling and article choice), the SLI profile in terms of grammar (mass-count, subject-verb agreement, and sentence repetition) and in terms of cognition (nonverbal reasoning, verbal and nonverbal working memory) strongly differs from that of the children with HFA: The children with SLI show weaknesses in the relevant grammatical and cognitive skills; the children with HFA are TD-like in this respect. This suggests different underlying causes for the language problems children with SLI and children with HFA experience and thus different etiologies for the two pathological populations. I proposed that one of the underlying causes for grammar problems in children with SLI may be an underdeveloped (non-)verbal working memory. However, this cannot be the entire explanation of the grammatical weakness of children with SLI, since there are also children with SLI who have TD-like working memory scores but still underperform at the grammar tests. Moreover, underdeveloped working memory is not responsible for the pragmatic errors made by both the children with SLI and the children with HFA. None of the other cognitive functions tested in this study (nonverbal reasoning, nonverbal inhibition, Theory of Mind), nor grammar, can explain the pragmatic errors either. Further research is thus needed to investigate the underlying causes of the pragmatic weakness in SLI and in HFA, which are likely to be different.
12 Forward and backward digit span are often considered nonverbal working memory tasks. However, in repeating the numbers, the words for the numbers are processed, comprehended, and produced. In this sense the digit span task does have a language component.
One idea is that the weak grammatical abilities in SLI cause lower performance in pragmatics. This is particularly attractive for direct object scrambling, since this phenomenon clearly also includes a grammatical (syntactic) component, namely, object placement/ word order. For article choice this is less clear, but if we assume that semantics is also included in grammar, problems with the semantic uniqueness requirement for the definite article may be the culprit for the errors by the children with SLI. In contrast, the errors in the pragmatics-driven tests by the children with HFA would be caused by underdeveloped pragmatic knowledge, related to speaker/hearer assumptions.
Finally, although behavioral results such as the ones described in the current study can make suggestions about the different underlying causes/etiologies of children with SLI versus children with HFA, they cannot prove that this is indeed the case. Further study with other experimental techniques is needed to investigate the real underlying causes of the children's language problems.

Conclusion
The aim of the current study was to characterize and potentially differentiate the language problems of children with Specific Language Impairment as compared to those of children with High-Functioning Autism. The study shows the importance of a large and diverse test battery, focusing on specific language components, such as grammar and pragmatics, and on other, nonlinguistic cognitive skills. When several different cognitive abilities are distinguished and investigated, distinct profiles arise, suggesting different underlying causes for the language difficulties in SLI and in HFA.
These profiles are as follows: All children with SLI are grammatically impaired, and a subgroup has an additional pragmatic impairment. Furthermore, subgroups of SLI have weak nonverbal reasoning and working memory skills. Correlation tests and passer/failer analyses suggest that weak working memory (but not weak nonverbal reasoning) is a contributing factor to grammatical problems in SLI. However, since all children with SLI experience difficulties in grammar, but not all of them have working memory problems, working memory cannot be the entire explanation, and part of the grammatical difficulties in SLI may be due to a representational deficit. Further research with other experimental techniques (such as reaction time, eye tracking, and ERP) is needed to investigate the relative roles of working memory problems and other potential deficiencies with respect to grammatical impairment.
In contrast, the HFA group shows an impairment only in the pragmatics-driven phenomena. The fact that parts of grammar can be impaired independently of parts of pragmatics (in an SLI subgroup) and that parts of pragmatics can be impaired independently of parts of grammar (in HFA) provides a double dissociation between the grammatical and pragmatic knowledge required for the phenomena under investigation.
As for the impairment of pragmatics in both the SLI and the HFA groups, future research should be conducted to unveil the exact nature of the pragmatic errors in the two different groups. One way to do this is to carry out a nonverbal scalar implicature experiment. If the pragmatic errors we find in the SLI group are related to their weak grammatical abilities, the children with SLI are predicted to perform TD-like on such a nonverbal scalar implicature experiment. In contrast, if the low pragmatics scores in the HFA group are really due to a failure to draw a scalar implicature, the children with HFA are predicted to fail on nonverbal scalar implicature as well.