General- and Language-Specific Factors Influence Reference Tracking in Speech and Gesture in Discourse

ABSTRACT Referent accessibility influences expressions in speech and gestures in similar ways. Speakers mostly use richer forms such as noun phrases (NPs) in speech and gesture more when referents have low accessibility, whereas they use reduced forms such as pronouns more often and gesture less when referents have high accessibility. We investigated the relationships between speech and gesture during reference tracking in a pro-drop language, Turkish. Overt pronouns were not strongly associated with accessibility but with pragmatic context (i.e., marking similarity, contrast). Nevertheless, speakers gestured more when referents were re-introduced versus maintained and when referents were expressed with NPs versus pronouns. Pragmatic context did not influence gestures. Further, pronouns in low-accessibility contexts were accompanied by gestures, possibly for reference disambiguation, more often than previously found for non-pro-drop languages in such contexts. These findings enhance our understanding of the relationships between speech and gesture at the discourse level.


Introduction
In order to produce coherent discourse, speakers track the novelty versus continuity of the entities they mention by choosing between richer versus reduced forms of referring expressions (Ariel, 1990; Givón, 1976). Speakers usually introduce referents with a richer referring expression (RE), such as "a child," and tend to maintain reference to the same entity with a reduced form, such as "she," later in the discourse. Speakers vary the richness of the referring expression they use, taking the accessibility and the discourse status of referents into account. When referents are introduced into discourse, they are new and they do not have activated and accessible representations in the memories of speakers and addressees. Therefore, they need to be expressed with richer forms of referring expressions for successful communication. When referents are maintained, however, reduced forms such as pronouns and in some cases null pronouns (i.e., argument drop) may be sufficient to track referents, because those referents already have accessible representations. Varying the richness of referring expressions in relation to accessibility and discourse status (i.e., whether referents are (re)introduced or maintained) has been shown to be a language-general strategy across typologically different languages (Aksu-Koç & Nicolopoulou, 2015; Arnold, 2010; Contemori & Dussias, 2016; Debreslioska & Gullberg, 2017; Hendriks, Koster, & Hoeks, 2014; Hickmann & Hendriks, 1999; Perniss & Özyürek, 2015).
Recent research has shown that reference tracking is a multimodal phenomenon and that referent accessibility influences referring expressions in speech and gestures in similar ways (Gullberg, 2006; Levy & McNeill, 1992; So, Kita, & Goldin-Meadow, 2009). During reference tracking, speakers may produce gestures that accompany referring expressions and vary the presence versus absence of gestures according to the discourse status of referents. For example, speakers tend to produce gestures more often with re-introduced referents versus maintained referents (Debreslioska, Özyürek, Gullberg, & Perniss, 2013; Levy & Fowler, 2000; Levy & McNeill, 1992; Perniss & Özyürek, 2015). Gestures that accompany referring expressions are also found to be sensitive to the richness of expressions in speech. Speakers tend to gesture more with referents that are expressed with richer referring expressions in speech such as noun phrases (NPs) as opposed to reduced referring expressions such as overt pronouns (Debreslioska & Gullberg, 2017; Gullberg, 2006; Perniss & Özyürek, 2015; Yoshioka, 2008). Therefore, speech and gesture are closely related at the level of discourse production (Levy & McNeill, 1992).
Previous studies that approached reference tracking as a multimodal phenomenon have mostly focused on non-pro-drop languages like English and German. In such languages, referent accessibility and the richness of referring expressions in speech go hand-in-hand. That is, NPs as richer referential forms are used for referents with low accessibility, and overt pronouns as reduced referential forms are used for referents with high accessibility (Arnold, 1998; Carminati, 2002; Kibrik, 2011). Looking at non-pro-drop languages only, however, it is not possible to disentangle two explanations of why speakers gesture more with NPs than with overt pronouns. One is that the richness of expression in gesture parallels that of speech, which can be attributed to speech and gesture being part of an integrated system (Kita & Özyürek, 2003; McNeill, 1992; So et al., 2009). The other is that the accessibility context modulates the production of referring expressions in speech and gestures in similar ways.
Here, we study a language that is typologically different from those on which the majority of previous research has focused: Turkish, a pro-drop language in which the relationship between the use of overt pronouns and referent accessibility is less prominent and less understood. In pro-drop languages like Turkish, NPs mark low accessibility but it is null pronouns (argument drop) rather than overt pronouns that are the preferred markers of high accessibility (Carminati, 2002; Kibrik, 2011). That is, speakers of Turkish usually maintain referents with a null pronoun as in (1b). Further, the use of overt pronouns in pro-drop languages has been suggested to be motivated mainly by the pragmatic context, that is, whether or not referents are marked for pragmatic information such as similarity, contrast, or topic shift (Enç, 1986). For example, the subject referent in (2b), ablam "(my) sister," is maintained with an overt pronoun in (2c) rather than with a null pronoun because it is contrasted with the subject referent in (2a), annem "(my) mother." In Turkish, pronouns as reduced forms might be mainly markers of pragmatic information and may not necessarily go hand-in-hand with accessibility of the referents, unlike in non-pro-drop languages. This poses interesting questions for the orchestration of speech and gesture use in discourse. Very little is known about how gestures and referring expressions are used for reference tracking in languages such as Turkish.
"(My) sister j was also going to come."
c. Fakat o j son anda iptal etti.
"But she j cancelled at the last moment."
The aim of this study is to understand the relationships between the discourse status of subject referents (i.e., whether subject referents are re-introduced or maintained) and the use of richer and reduced forms of referring expressions (i.e., NPs, overt and null pronouns) in elicited narratives in Turkish, and the use of gestures in relation to these types of expressions. For speech, we examine whether the use of overt versus null pronouns is influenced by the discourse status of referents and/or pragmatic context (i.e., similarity or contrast among referents as well as topic shift). We also examine whether discourse status influences the use of overt pronouns as opposed to NPs in Turkish. We expect that overt pronouns will be less influenced by the discourse status of referents and mainly influenced by pragmatic context, unlike in non-pro-drop languages. If so, this raises interesting questions such as whether the presence/absence of gestures still aligns with the richness of referring expressions in speech (i.e., NPs vs. overt pronouns) and/or with the discourse status of referents and the pragmatic context. As such, this study aims to be the first comprehensive investigation of speech and gesture synchrony in discourse in a pro-drop language. The findings of this study will expand and generalize our understanding of the principles of coordinating information in speech and gesture during reference tracking as a multimodal phenomenon.

Reference tracking in speech
Previous studies have shown that speakers vary the form of REs they use depending on the discourse status of referents that they mention. They tend to introduce and re-introduce referents with richer forms such as NPs but maintain them with reduced forms such as pronouns. This relationship between the richness of the referring expression and the discourse status conforms to the Principle of Quantity for topic continuity (Arnold, 2010; Givón, 1984; Perniss & Özyürek, 2015) and the Accessibility Theory (Ariel, 1990). That is, new and less accessible referents are expressed with richer expressions, while reduced referring expressions are usually informative enough for maintained and more accessible referents. This principle is found to be present across typologically different spoken languages as well as in sign languages (Aksu-Koç & Nicolopoulou, 2015; Contemori & Dussias, 2016; Debreslioska & Gullberg, 2017; Hendriks et al., 2014; Hickmann & Hendriks, 1999; Perniss & Özyürek, 2015).
Although different languages adhere to these principles during reference tracking, languages show variation regarding which REs they prefer to mark the same discourse status. For example, while speakers of pro-drop languages like Italian and Turkish drop highly accessible referents (i.e., using null pronouns), speakers of non-pro-drop languages prefer using overt pronouns in such contexts (Carminati, 2002; Kibrik, 2011). Overt pronouns as referring expressions are also available in pro-drop languages, yet they are used for marking similarity, contrast, or topic shift between the referents, and therefore they are pragmatically marked forms compared to null pronouns (Carminati, 2002; Enç, 1986; Silva-Corvalán, 1994). It is not clear, however, whether and how discourse status of referents also influences the use of pronouns in pro-drop languages, and whether this influence interacts with the influence of pragmatic marking on referents.
Although there is a rich literature on the use of overt and null pronouns in Turkish, the initial analyses were only theoretical (Enç, 1986; Erguvanlı-Taylan, 1986; Özsoy, 1987) or drew their data from fiction novels (Kerslake, 1987; Turan, 1995). A few studies with naturalistic production data from adults either focused on one form only (e.g., NPs only in Küntay, 2002) or collapsed overt and null pronouns into one category as a function of discourse status without focusing on the distribution of pronouns in pragmatic contexts (Aksu-Koç & Nicolopoulou, 2015). Hence, data-driven analysis of overt pronouns in relation to null pronouns during reference tracking in adult Turkish is missing from the literature. In particular, whether discourse status of referents modulates the use of overt pronouns is still not very clear. Additionally, previous studies of subject pronouns in pro-drop languages in general and also in Turkish have mostly focused only on the contexts in which pronouns are present and described the discourse-pragmatic function of the pronouns only in those contexts, without looking, for example, at the contexts in which null pronouns are used (e.g., Doğruöz, 2007; Haznedar, 2010). In this study, we address the use of referring expressions in Turkish in relation to both discourse status and pragmatic contexts with the aim of contributing to the existing literature on reference production.

Reference tracking in gesture
Speakers employ co-speech gestures in systematic ways during reference tracking. Similar to speech, gestures that accompany referring expressions have been found to be sensitive to accessibility and the discourse status of referents. Speakers are more likely to gesture while re-introducing referents than while maintaining them (Azar & Özyürek, 2015; Gullberg, 2006; Levy & Fowler, 2000; Levy & McNeill, 1992; Perniss & Özyürek, 2015; So & Lim, 2012; So, Lim, & Tan, 2014; Yoshioka, 2008). Gestures are also argued to be sensitive to the richness of expression in speech such that speakers are more likely to gesture with referents that are expressed with richer forms in speech, for example, NPs, as opposed to reduced forms, such as pronouns (Azar & Özyürek, 2015; Gullberg, 2006; Levy & McNeill, 1992; Marslen-Wilson, Levy, & Tyler, 1982; Perniss & Özyürek, 2015). Hence, similar to speech, gestures are sensitive to the Principle of Quantity for topic continuity (Givón, 1984) and the Accessibility Theory (Ariel, 1990).
The majority of previous findings on the speech-gesture relation in discourse are based on non-pro-drop languages where the richness of the RE in speech (e.g., NPs vs. pronouns) goes hand-in-hand with discourse status of referents. That is, NPs are mainly used for re-introduced (less accessible) referents and pronouns for maintained (highly accessible) referents in non-pro-drop languages (Contemori & Dussias, 2016; Hendriks et al., 2014; Perniss & Özyürek, 2015). Thus, in such languages it is not possible to differentiate whether it is the richness of the expressions or the discourse status of the referent that gestures are sensitive to. Therefore, studying pro-drop languages where overt pronouns might not be associated with a particular discourse status can enable us to dissociate the influence of the richness of expression in speech from that of discourse status of referents on gesture production. This will shed new light on the relationship between gesture production and speech production during reference tracking.
As stated previously, the use of pronouns in pro-drop languages is assumed to be governed by the pragmatic context. As another novel contribution to the literature on multimodal reference tracking, here we also explore whether co-speech gestures that speakers use during reference tracking are also sensitive to pragmatic context in which overt pronouns are used. That is, we look at whether speakers are more likely to accompany pronouns with gestures when pronouns mark referents for similarity or contrast in speech (regarding subject referents' states and actions) as opposed to when they do not. Although this relationship has not been explored so far, it has been shown that prosodic prominence as a pragmatic marker in speech is mostly associated with beat gestures in both production and processing of language (Krahmer & Swerts, 2007). Beat gestures are short and quick hand movements such as up and down, or back and forth (McNeill, 1992), and they do not express content in relation to the speech they accompany. Whether gestures that accompany referring expressions during reference tracking are also sensitive to pragmatic marking is not known yet.
Research that has investigated speech-gesture interaction cross-linguistically and also in Turkish has mostly focused on motion event expressions (e.g., Furman, Küntay, & Özyürek, 2014; Kita & Özyürek, 2003; Özçalışkan, 2016; Özyürek, 2002). Studies examining co-speech gestures accompanying referring expressions in Turkish, on the other hand, are very few and mostly about children's utterances. In a recent study, Ateş and Küntay (2018) found that by age 1;09, children showed sensitivity to discourse status by using deictic gestures predominantly with new referents. Additionally, Turkish-speaking children were found to use gestures to clarify potentially ambiguous speech (Ateş & Küntay, 2018; Demir, So, Özyürek, & Goldin-Meadow, 2012). In Ateş and Küntay, for example, children were approximately six times more likely to use reduced REs (overt and null pronouns) for new referents when they did accompany those new referents with a pointing gesture as opposed to when they did not. It should, however, be noted that children in those two studies had referents available in their physical surrounding during data collection, which allowed them to point to the present objects when they underspecified them in speech (i.e., when they used overt or null pronouns for new referents). Most of these cases occurred when the children and the caregivers interacted about an object during toy-play. Therefore, how adult speakers of Turkish use gestures while referring to physically absent third-person referents during narrative production and how discourse status, the type of referring expression that is used in speech, and the pragmatic marking of referents influence multimodal reference tracking are still open questions.

Present study
This study examines multimodal reference tracking strategies in a pro-drop language, Turkish, by eliciting narratives of two silent videos. Regarding speech, we ask how discourse status of referents modulates the use of overt pronouns as opposed to null pronouns and NPs. Additionally, we ask whether pragmatic context indeed modulates the use of overt versus null pronouns as previously suggested (Enç, 1986). Regarding gestures, we ask whether both discourse status of referents and the richness of the expressions in speech (NPs versus pronouns) modulate the presence/absence of gestures even though pronouns in Turkish may not be associated with a certain discourse status (unlike in non-pro-drop languages where pronouns usually mark maintained referents). We also explored whether co-speech gestures accompanying overt subject pronouns were sensitive to pragmatic context. That is, we analyzed whether overt pronouns that do mark similarity or contrast between referents in speech are more likely to be accompanied by gestures than overt pronouns that do not mark such information in speech.
As for our contribution to the literature on Turkish, first we study reference tracking in a controlled setting, controlling for the topics to be narrated and in a context in which referents to be mentioned are not physically present. Second, we do not a priori assume that pronouns are mainly used to mark pragmatic information. Instead, we examine the influence of the discourse status of referents in addition to pragmatic contexts as a possible factor that may also modulate the use of overt pronouns. Thus, we aim for a quantitative, hypothesis-testing approach to understanding the role of pronouns in Turkish rather than only describing the contexts where pronouns are used. Finally, we aim to understand reference tracking with a multimodal approach and study the use of speech and gesture for third-person subject referents in narratives by adult speakers of Turkish for the first time (see, however, Azar & Özyürek, 2015 for a similar research question with a smaller sample size).

The pronominal system of Turkish
Turkish is a pro-drop language that may have clauses without overt subject arguments (Kerslake, 1987) and the discourse-pragmatic context determines the choice between overt and null pronouns (Enç, 1986; Erguvanlı-Taylan, 1986; Kerslake, 1987; Özsoy, 1987; Turan, 1995). The third-person pronoun in Turkish (o for singular and onlar for plural) does not encode gender or animacy. Further, it has the same form as the distal demonstrative pronoun (Göksel & Kerslake, 2005).
Overt pronouns in Turkish are suggested to be mainly used when speakers mark pragmatic information such as similarity (see 3d) or contrast (see 4c) between different discourse referents and/or actions performed by those referents (Enç, 1986; Kerslake, 1987). For example, in (3d), the similarity of the actions of the two discourse referents, that is, ordering coffee, is highlighted by the use of an overt pronoun as opposed to a null pronoun and the overt pronoun is accompanied by dA "also," which is a clitic that marks topic and focus in Turkish (Azar, Backus, & Özyürek, 2016; Bican, 2000). In (4c), on the other hand, the overt pronoun is preferred over the null pronoun in order to contrast the actions of the two discourse referents in terms of failing versus passing the exam.
"Yesterday (I) j was having coffee by the Bosphorus."
b. Pelin k beni görmüş.
"Selin m too took the same exam."
c. Ama o m geçmiş.
"But she m passed."
Overall, the relative distribution of null and overt pronouns in Turkish has so far been studied mainly with regard to pragmatic context. However, it is not known whether and how discourse status of referents also plays a role in the quantitative distribution of these two forms in narratives.

Predictions
Regarding speech, we expect speakers to use mainly NPs to re-introduce referents and to use null pronouns to maintain referents considering Turkish is a pro-drop language (Enç, 1986). Therefore, we do not expect overt pronouns to be the dominant form in either re-introduced or maintained referent contexts. As for the use of overt pronouns in relation to null pronouns, one might expect the relative distribution of overt pronouns to be higher in re-introduced referent contexts than in maintained referent contexts considering null pronouns are assumed to be the default forms to mark high accessibility in pro-drop languages. We also expect the speakers to use overt pronouns mainly in contexts that signal similarity or contrast among discourse referents or their actions and states (i.e., pragmatically marked contexts) and to use null pronouns in contexts that do not signal such information (i.e., pragmatically unmarked contexts), in line with previous theoretical accounts of pronouns in Turkish (cf. Enç, 1986). As for the use of overt pronouns in relation to NPs, we expect the relative distribution of overt pronouns to be possibly higher in maintained referent contexts as we expect speakers to use mainly NPs in re-introduced referent contexts, considering such a pattern has been found for several languages (Aksu-Koç & Nicolopoulou, 2015; Arnold, 2010; Contemori & Dussias, 2016; Hendriks et al., 2014; Perniss & Özyürek, 2015).
Regarding gestures, we expect to find an effect of both discourse status and the richness of RE on the production of gestures in Turkish. We predict that the speakers will be sensitive to the general principles of referent accessibility and that they will be more likely to gesture with re-introduced referents than with maintained referents. Further, we expect speakers to be more likely to produce gestures when they express referents with NPs in speech compared to overt pronouns, even though pronouns in Turkish might not be strongly associated with a certain discourse status. Finally, if gestures are sensitive to the pragmatic prominence of pronouns, we expect that overt pronouns will be more likely to be accompanied by gestures when they mark similarity or contrast compared to when they do not mark such information.

Participants
Twenty pairs of native speakers of Turkish studying in Istanbul (17 females; M age = 22.2) participated in the study in return for payment or course credits. They had normal or corrected-to-normal vision and no history of language impairment.

Stimuli
We used two short silent videos to elicit narratives. In one video (kitchen video) three women were engaged in cooking activities (Perniss & Özyürek, 2015), and in the other video (office video) two women and a man were engaged in office activities (Azar et al., 2016; Azar, Backus, & Özyürek, 2017). Both videos contained three characters to give speakers enough opportunities to switch between different characters so that they produce several referring expressions in each discourse status context. Figure 1 illustrates stills depicting different segments from each video. (See Appendices, Table A1 and Table A2 for a detailed list of events taking place in each video stimulus.)

Procedure
Participants were assigned roles as speaker or addressee randomly. Speakers watched two stimulus videos one by one on a laptop screen and narrated what they had watched to the addressees. The computer screen turned white after each video played and stayed white during the narrations. The addressees did not see the videos before or during the narrations. They could ask clarification questions after each narration was complete. They were also informed that they would be given two written short questions about each narrative. The purpose of this was to ensure that speakers included enough details in their narratives and that addressees paid attention to the narratives. Once the instructions were given, the experimenter left the room and came back after each narrative with the questions for the addressee. Each session was video-recorded.

Data coding
We used the video and audio annotation tool ELAN for data transcription and annotation of both speech and co-speech gestures (see Lausberg & Sloetjes, 2009 for more information).

Speech coding
A native speaker of Turkish transcribed the data from 20 speakers. First, we divided the narratives into clauses, units with a single subject argument and a single predicate (Berman & Slobin, 1994). We coded coordinated clauses as separate clauses (e.g., "the woman who is cooking took the vegetables and she put them into the pan" was coded as two clauses). We coded only animate subjects to control for animacy as a possible factor affecting referent accessibility (Rohde & Kehler, 2014; Vogels, Maes, & Krahmer, 2014), and omitted commentary about the characters (e.g., "I think she is the mother") from the analyses to be able to compare our results with previous studies of reference tracking in extended discourse, which followed a similar coding scheme (e.g., Debreslioska et al., 2013; Gullberg, 2006; Perniss & Özyürek, 2015; Yoshioka, 2008).

Discourse status and referring expression type
We coded each subject referent for discourse status (re-introduced; maintained), taking into account subject-to-subject coreference, following Hickmann and Hendriks (1999). A re-introduced subject referent is mentioned in the previous discourse but not in the immediately preceding clause. A maintained subject referent, on the other hand, is the same referent as the subject of the immediately preceding clause. A referent is maintained only if the exact same referent was mentioned as the subject argument in the previous clause. That is, changes from plural to singular (e.g., from "three women" to "one of the women" in the next clause) or vice versa were coded as re-introduced (cf. Debreslioska et al., 2013). The first mention of referents, coded as introduced, was not included in the analyses, as we are interested in the use of REs after referents are introduced into discourse. Note that although we did not code and analyze the subject referents of commentary clauses, we took them into account while coding the subject of the next clause as either re-introduced or maintained.
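The coding rule above can be illustrated with a minimal sketch. It is not the procedure used in the study (coding was done manually in ELAN); the function name, the representation of each subject as a set of individual characters, and the character labels are all hypothetical. The sketch assumes that a referent counts as already introduced once every individual in it has been mentioned, and that a referent is maintained only if it is identical to the subject of the immediately preceding clause, so that plural-to-singular changes come out as re-introduced.

```python
from typing import FrozenSet, List, Optional


def code_discourse_status(subjects: List[FrozenSet[str]]) -> List[str]:
    """Label the subject of each clause as introduced, maintained, or re-introduced.

    Each subject is a frozenset of (hypothetical) character identifiers,
    so a plural subject such as "three women" is a set of three members.
    """
    mentioned = set()  # every individual mentioned so far
    previous: Optional[FrozenSet[str]] = None  # subject of the preceding clause
    labels = []
    for referent in subjects:
        if not referent <= mentioned:
            # At least one individual has not been mentioned before.
            label = "introduced"
        elif referent == previous:
            # Exactly the same referent as the preceding subject.
            label = "maintained"
        else:
            # Known referent, but not the subject of the preceding clause
            # (this includes plural-to-singular changes and vice versa).
            label = "re-introduced"
        labels.append(label)
        mentioned |= referent
        previous = referent
    return labels


# Hypothetical sequence: "three women", "one of the women", same woman again,
# another woman, then back to the first woman.
clauses = [
    frozenset({"w1", "w2", "w3"}),
    frozenset({"w1"}),
    frozenset({"w1"}),
    frozenset({"w2"}),
    frozenset({"w1"}),
]
print(code_discourse_status(clauses))
```

In this sketch commentary clauses would simply be kept in the input sequence, so that they affect the label of the following clause, and their own labels would be discarded afterwards.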
We later coded each subject referent for one of the following referring expression (RE) types: noun phrase (NP) (see Appendices, Table B for the detailed list of noun phrase constructions that occurred in our dataset, but note these were all collapsed), overt pronoun (personal pronoun, demonstrative pronoun, indefinite pronoun), and null pronoun (omitted subject referents). We coded constructions with omitted head nouns as NPs. Those were partitive constructions with an ablative where the head noun was omitted, for example, (kadınlardan) iki tanesi "two of (the women)," and constructions where the head noun that a relative clause modified was omitted, for example, domates kesen (kadın) "the (woman) who is/was cutting tomatoes" (cf. Göksel & Kerslake, 2005). We coded the pronominalized indefinite determiners diğeri/öbürü "other one" as pronouns, following Göksel and Kerslake (2005). Those pronouns were used rarely and exclusively for re-introduced referents (N = 14). A second coder coded around 10% of the subject referring expressions for RE type and the two coders had 100% agreement, Cohen's kappa = 1.000, p < 0.001. Example (5) exemplifies coding of discourse status and RE type.
"A woman i is working on a computer." introduced/NP
b. Bir erkek t de kağıtları düzene sokuyor.
"A man t is organizing sheets of paper." introduced/NP
c. Ø t sınıflandırıyor.
"The woman i is getting (her) phone." re-introduced/NP

Pragmatic context
Two native speakers of Turkish coded clauses with an overt or a null subject pronoun for whether speakers organized clauses in a way that would signal similarity or contrast between different referents or between the propositions related to referents as well as topic shift. The two coders reached 100% agreement in a meeting where the initial discrepancies were discussed and resolved (Cohen's kappa for the initial agreement was 0.838, p < 0.001). (6c) is an example of a similarity context; the subject referent of (6c) cannot open the jar in the stimulus video, similar to the subject referent of (6a), who fails to open the jar as well. (6e), on the other hand, is an example of contrast; the third woman in the stimulus video, the subject referent of (6e), opens the jar after the other two women fail to do so. (7e) is an example of topic shift. Clauses (7a-d) are about the cooking activities performed by the characters in the stimulus video, while in (7e) the topic shifts from kitchen activities to the characters not talking to each other. There were only four cases of topic shift, which were all encoded with null pronouns and accompanied by topic markers such as bu arada "by the way."
(6) a. Domates kesen kadın t bir kavanozu açamıyor.
"The woman who is cutting tomatoes t cannot open a jar." re-introduced/NP
b. Daha sonra diğer kız j alıyor.
"Later the other girl j takes (it)." re-introduced/NP
c. O j da açamıyor.
"The ones who are at the table z are cutting tomato, cucumber." maintained/NP
e. Bu arada ∅ z hiç konuşmuyorlar.
"By the way, (they) z are not talking at all." maintained/null pronoun
Additionally, we coded whether pronouns were accompanied by the emphatic marker dA "also" (as in 6c). This clitic has been suggested to be a focus marker (Enç, 1986) in Turkish and it has been shown to accompany pronouns when used for maintained subjects marking similarity (Azar et al., 2016).

Gesture coding
We first identified the gesture strokes, the meaningful part of the gestural movement (Kendon, 2004; McNeill, 1992) as the expressive segments of the stream of manual production (Kita, van der Hulst, & van Gijn, 1998). Later we coded co-speech gestures that temporally aligned with subject arguments in speech. Following previous studies of multimodal reference tracking (Gullberg, 2006; Perniss & Özyürek, 2015; Yoshioka, 2008), we coded the presence/absence of a gesture for each subject referring expression in speech. Therefore, each gesture that accompanied a referring expression had a single value regarding discourse status (re-introduced or maintained) and the RE type in speech (NP or overt pronoun). Note that when the subject argument was dropped in speech as in the case of null pronouns, the subject slot was linguistically empty and therefore, it was not possible for gestures to temporally align with the subject. Hence, we only analyzed co-speech gestures that temporally aligned with subject arguments that were expressed with either an NP or an overt pronoun in speech.
We only analyzed the gestures that anchored subject referents in gesture space (see Figures 2 and 3) by means of an index-finger pointing or a whole-hand extended gesture because when gestures were located as such, there was a link between the location of those gestures in gesture space and the location of the characters in the stimulus videos. This made it easier to judge whether gestures were indeed associated with the subject referents. There were in total 210 subject referring expressions that were accompanied by such gestures and both types of gestures occurred about equally frequently in the data set (47% index-finger and 53% whole-hand gestures). A second coder coded around 30% of the gestures for reliability. The two coders had an initial agreement of 85% for the presence of a stroke and high agreement for the gesture type (index-finger or whole-hand gestures versus any other category of gestures, Cohen's kappa = 0.884, p < 0.001). The two coders resolved all disagreements in a meeting.
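For readers less familiar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa corrects raw agreement for the agreement expected by chance. The function name and the toy labels are hypothetical and are not taken from the study's data; the sketch only illustrates the standard formula, kappa = (observed - expected) / (1 - expected).

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable], coder_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels: "target" = index-finger or whole-hand gesture,
# "other" = any other gesture category.
first_coder = ["target", "target", "other", "other"]
second_coder = ["target", "other", "other", "other"]
print(cohens_kappa(first_coder, second_coder))
```

With these toy labels the raw agreement is 75%, yet half of that agreement would be expected by chance, which is why kappa is a stricter criterion than percent agreement.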
We excluded two classes of gestures that we believed were unlikely to be associated with subject referents: iconic and beat gestures. We excluded beat gestures because they do not depict information about the referent but rather direct attention to the rhythmical peak of speech (McNeill, 1992). We excluded iconic gestures (e.g., a stirring gesture or a cutting gesture) because we considered them to be more about the predicate than specifically about subjects: they were not localized in the gesture space associated with subject referents, and most of them overlapped not only with subject REs but also extended temporally over the predicates of the clauses. In total we excluded 55 gestures that temporally aligned with subject referring expressions; 25 of those gestures were iconic gestures and 30 were beat gestures.

Analyses
We analyzed the data using generalized linear mixed-effects logistic regression with the glmer function from the lme4 package (cf. Bates, Maechler, Bolker, & Walker, 2015) in R, version 3.3.2. See Appendices, Table C1 and Table C2 for specifications of the models reported in this section. All analyses made use of variants of the generalized linear model with binomial error structure because the dependent variables were binary, coded as 1 for presence and as 0 for absence of a category (following Debreslioska & Gullberg, 2017). The analyses accounted for the random variation for participants by including random intercepts and random slopes in the models (see Baayen, Davidson, & Bates, 2008 for more information on mixed-effects modelling in language research). Sometimes a maximal model with both random intercepts and slopes (cf. Barr, Levy, Scheepers, & Tily, 2013) did not converge, or the model returned a perfect correlation (± 1.00) between the random factors, which suggests the data might have been over-fitted. We explain below the procedure that we followed in those cases for each analysis. Although all analyses were run on presence/absence of a category as the dependent variable, figures show mean proportions of a category across all participants for ease of illustration.

Reference tracking in speech
The speakers produced 969 subject referring expressions in total, 561 of which were maintained referents (10% NPs, 10% overt pronouns, 80% null pronouns) and 408 were re-introduced referents (74% NPs, 7% overt pronouns, 19% null pronouns). This distribution shows that the most commonly used RE types in Turkish are NPs and null pronouns; NPs were mainly used for re-introduced referents and null pronouns for maintained referents. We will first report the analyses on the use of overt pronouns as opposed to null pronouns, and later on the use of overt pronouns as opposed to NPs.

Overt versus null pronouns
We first examined the effect of discourse status and pragmatic context on the use of overt as opposed to null pronouns, excluding NPs from the analysis. Figure 4 (see Note 2) illustrates the proportions of overt pronouns in re-introduced and maintained referent contexts (note that the proportions of overt and null pronouns together add up to 100% in each context). (See (6c) and (6e) for examples of subject referents in pragmatically marked contexts, that is, contexts that signal similarity or contrast between referents or the actions related to them.) The dependent variable was presence/absence of overt pronoun, and the fixed factors were Discourse Status (maintained, re-introduced) and Pragmatic Context (marked, unmarked). The maximal model with random intercepts for participants and by-participant random slopes for Discourse Status and Pragmatic Context did not converge. Following the advice in Brauer and Curtin (2017, p. 16), we first removed the interaction of random slopes for Discourse Status and Pragmatic Context from the model; however, the model still did not converge. Next, we forced random intercepts and slopes not to be correlated, which again returned a non-converging model. We then removed the random intercepts from the model; however, the model still did not converge. Finally, we removed the random slopes and re-introduced random intercepts into the model, which this time converged (see Note 3). Note that by excluding the random slopes from the model, we assume that the effect of Discourse Status and Pragmatic Context on the use of REs is invariant across participants. The analysis did not return a significant main effect of Discourse Status (β = 0.818, SE = 0.446, z-value = 1.834, p = 0.066), but there was a significant main effect of Pragmatic Context (β = −3.299, SE = 0.351, z-value = −9.391, p < 0.00001). There was no significant interaction of Discourse Status and Pragmatic Context (β = 0.397, SE = 0.642, z-value = 0.619, p = 0.536).
The analysis suggests that the discourse status of referents did not reliably influence the use of overt pronouns as opposed to null pronouns, even though there was a trend for using overt pronouns as opposed to null pronouns more often in re-introduced referent contexts than in maintained referent contexts. On the other hand, the speakers used overt pronouns as opposed to null pronouns more often in marked contexts than in unmarked contexts, which is in line with previous theoretical analyses of overt versus null pronouns in Turkish. Table 1 summarizes the distribution of overt and null pronouns across marked and unmarked contexts in re-introduced and maintained referent contexts. Sixty-two percent of overt pronouns that were used in pragmatically marked contexts marked referents for similarity and 38% for contrast. Additionally, whenever speakers used pronouns to mark the similarity of actions performed by two referents, they accompanied 100% of those pronouns (N = 38) with the focus marker dA "also." Note that the speakers used null pronouns in re-introduced referent contexts relatively often. This occurred mainly when referents had been previously introduced as a group performing a joint activity a few clauses earlier (e.g., two girls are slicing vegetables at the table). When those referents were re-introduced further in the discourse, they were re-introduced with a null pronoun (e.g., ∅ bir kavanoz açamaya çalışıyolar "(They) are trying to open a jar") and the predicate was marked for 3rd person plural (-lAr); therefore, the subject referent was unambiguous.
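As a reading aid, the z-values and p-values reported above can be approximately recovered from the estimates and standard errors alone, since a Wald z-value is the estimate divided by its standard error. The following is a minimal sketch in Python (the original analysis was run in R with lme4); the function name `wald_test` is hypothetical, and small discrepancies from the reported values are expected because the inputs are rounded to three decimals.

```python
import math

def wald_test(beta, se):
    """Return (z, two-sided p) for a Wald test of one coefficient."""
    z = beta / se
    # Two-sided tail probability under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Estimates and standard errors as reported in the text (rounded values).
reported = {
    "Discourse Status": (0.818, 0.446),
    "Pragmatic Context": (-3.299, 0.351),
    "Discourse Status x Pragmatic Context": (0.397, 0.642),
}

for name, (beta, se) in reported.items():
    z, p = wald_test(beta, se)
    print(f"{name}: z = {z:.3f}, p = {p:.3g}")
```

Under these assumptions, the recomputed values agree with the reported ones to within rounding error.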

Overt pronouns versus NPs
We next examined whether discourse status also influenced the use of overt pronouns as opposed to NPs in Turkish. This time we analyzed overt pronouns and NPs only, excluding null pronouns from the analysis. The dependent variable was presence/absence of overt pronoun (as opposed to an NP) and the fixed factor was Discourse Status (maintained, re-introduced). Figure 5 illustrates the proportions of overt pronouns in re-introduced and maintained referent contexts. The maximal model with random intercepts for participants and by-participant random slopes for Discourse Status returned a perfect correlation between the random factors (1.00), which indicates that the model might have been overfitted. We first took out the interaction of random factors by forcing the random intercepts and random slopes not to be correlated. That model still returned a perfect correlation (1.00) between the slopes for the two levels of Discourse Status (maintained, re-introduced). We then simplified the model by taking out random slopes. Note that by excluding the random slopes for Discourse Status from the model, we assume that the effect of Discourse Status on the use of REs is invariant across participants. The simplified model with only random intercepts (see Note 4) returned a significant main effect of Discourse Status (β = −2.293, SE = 0.278, z-value = −8.243, p < 0.00001) such that speakers used overt pronouns as opposed to NPs less frequently in re-introduced referent contexts than in maintained referent contexts. The findings so far show that speakers of Turkish prefer null pronouns over overt pronouns to maintain referents, and NPs over overt pronouns to re-introduce referents. Therefore, it seems that the use of overt versus null pronouns in Turkish is not strongly associated with discourse status but with pragmatic context.
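The notes to this section report likelihood ratio tests comparing models with and without by-participant random slopes. As a minimal sketch of how those p-values relate to the reported χ2 statistics, the Python snippet below evaluates the upper-tail probability of the chi-square distribution. The helper `chi2_sf` is hypothetical and only implements the closed forms for 2 and 3 degrees of freedom, which are the cases reported in the notes; the χ2 values themselves are taken from the notes as given.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 2 or 3 only)."""
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("this sketch only covers df = 2 and df = 3")

# (chi-square statistic, degrees of freedom) as reported in Notes 3 to 5.
reported_tests = [(1.112, 2), (0.819, 2), (1.152, 3), (0.005, 3)]

for stat, df in reported_tests:
    print(f"chi2({df}) = {stat}: p = {chi2_sf(stat, df):.3f}")
```

A large p-value here means that adding random slopes did not reliably improve the fit, which is the stated reason for reporting the intercept-only models.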

Reference tracking in gesture
Out of 441 subject referring expressions in speech (NPs and pronouns), 210 were accompanied by gestures that temporally aligned with them. Table 2 summarizes the raw speech and gesture data across referring expression types and discourse status categories. We first tested the effect of Discourse Status (maintained, re-introduced) and richness of expression in speech (i.e., RE Type: NP or overt pronoun) on the speakers' likelihood of accompanying subject referents with a gesture (see Figure 6 for the mean proportions of NPs and overt pronouns in speech that were accompanied by gestures across maintained and re-introduced referent contexts). The dependent variable was presence/absence of a gesture. The maximal model with random intercepts for participants and by-participant random slopes for Discourse Status and RE Type did not converge. We followed the same procedure as in the analysis for the influence of discourse status and pragmatic context on the use of overt as opposed to null pronouns in speech. That is, we first removed the interaction of random slopes for Discourse Status and RE Type; however, the model still did not converge. Next, we forced the random intercepts and slopes not to be correlated, which returned a converging model. The analysis returned a significant main effect of Discourse Status (β = 0.676, SE = 0.328, z-value = 2.062, p = 0.039) and a significant main effect of RE Type (β = −1.134, SE = 0.538, z-value = −2.109, p = 0.035) but no significant interaction of the two (β = 0.834, SE = 0.686, z-value = 1.216, p = 0.224). The analyses revealed that the speakers were more likely to gesture with re-introduced referents than with maintained referents and also were more likely to gesture with NPs than with pronouns.
Finally, we examined whether pragmatic context influenced the speakers' likelihood of accompanying overt pronouns by gestures. The dependent variable was the presence/absence of a gesture accompanying pronouns and the fixed factor was Pragmatic Context. The maximal model with random intercepts for participants and by-participant random slopes for Pragmatic Context returned a perfect correlation between the random factors (1.00). We first took out the interaction of random factors by forcing the random intercepts and random slopes not to be correlated. That model still returned a perfect correlation (1.00) between the slopes for the two categories of Pragmatic Context (marked, unmarked). We then simplified the model by taking out random slopes (see Note 5). Note that by excluding the random slopes for Pragmatic Context from the model, we assume that the effect of Pragmatic Context on the use of gestures with overt pronouns is invariant across participants. The simplified model with only random intercepts for participants did not return a significant main effect of Pragmatic Context (β = 0.199, SE = 0.568, z-value = 0.350, p = 0.726). The analysis suggests that the gestures accompanying subject pronouns during reference tracking in Turkish are not sensitive to the pragmatic information the pronouns mark in speech. Table 3 summarizes the proportion of pragmatically marked and unmarked overt pronouns that were accompanied by gestures.

Table 2. Raw speech and gesture data across referring expression types and discourse status categories.
Discourse status   NPs in speech   Overt pronouns in speech   NPs with gesture   Overt pronouns with gesture
Maintained         58              54                         22                 11
Re-introduced      300             29                         164                13
Total              358             83                         186                24
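To make the raw counts in Table 2 easier to interpret, the following minimal Python sketch converts them into pooled gesture rates per cell. The assignment of the four count columns (NPs in speech, overt pronouns in speech, NPs with gesture, overt pronouns with gesture) is an assumption inferred from the totals reported in the text (441 referring expressions, 210 of them with gestures). Pooled rates of this kind need not match the mean proportions computed per participant that the figures show.

```python
# Counts read from Table 2 as (expressions in speech, expressions with gesture).
# The column assignment is an assumption inferred from the reported totals.
counts = {
    ("maintained", "NP"): (58, 22),
    ("maintained", "overt pronoun"): (54, 11),
    ("re-introduced", "NP"): (300, 164),
    ("re-introduced", "overt pronoun"): (29, 13),
}

total_speech = sum(n for n, _ in counts.values())
total_gesture = sum(g for _, g in counts.values())

# Pooled proportion of expressions accompanied by a gesture, per cell.
rates = {cell: g / n for cell, (n, g) in counts.items()}

print(f"total expressions: {total_speech}, with gesture: {total_gesture}")
for (status, re_type), rate in rates.items():
    print(f"{status:13s} {re_type:14s} {rate:.1%}")
```

The two totals serve as a consistency check on the assumed column order, since they reproduce the 441 and 210 stated in the text.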

Summary and discussion
In this study, we investigated multimodal reference tracking strategies in Turkish, a language that is typologically different from the majority of languages that have been studied in this domain. Previous studies in non-pro-drop languages (e.g., English, German) have shown that gestures are more likely to accompany re-introduced referents than maintained referents and also referents that are expressed with richer expressions in speech versus reduced expressions. In those languages, however, the richness of the REs in speech (e.g., NPs vs. pronouns) goes hand-in-hand with the discourse status of referents. That is, NPs are mainly used for re-introduced (low-accessibility) referents and pronouns for maintained (high-accessibility) referents. Therefore, in non-pro-drop languages, it is not possible to differentiate whether it is the richness of the REs or the accessibility of the referents to which gestures are sensitive. Here, we investigated whether previous findings for the influence of discourse status and richness of referring expressions in speech (i.e., NP vs. pronoun) on gesture production also held for pro-drop Turkish, a language where the use of pronouns may not necessarily be associated with the discourse status of referents but with pragmatic context, that is, whether referents are marked for pragmatic information such as similarity, contrast, or topic shift.

Reference tracking in speech
We investigated the role of discourse status and pragmatic contexts on the use of subject referring expressions (NPs, overt and null pronouns) in Turkish in a comprehensive way.
In line with the general principles of reference tracking, we found that speakers of Turkish used richer forms (NPs) predominantly for re-introduced referents and reduced forms, such as null pronouns, for maintained referents. As for the use of overt pronouns as opposed to null pronouns, there was no strong association between the discourse status of referents and the use of pronouns, even though there was a trend for using overt pronouns more often in re-introduced referent contexts than in maintained referent contexts. This trend may suggest that the competition between overt and null pronouns is stronger in re-introduced referent contexts (25% overt pronouns as opposed to 75% null pronouns) than in maintained referent contexts (10% overt pronouns and 90% null pronouns). Note, however, that the relative distribution of the two forms in Turkish differs from that in non-pro-drop languages, where null pronouns would be infrequently used and overt pronouns would be preferred over null pronouns in maintained referent contexts. On the other hand, pragmatic context modulated the use of overt pronouns as opposed to null pronouns. Speakers were more likely to use overt pronouns in the contexts that signaled similarity or contrast among discourse referents (i.e., pragmatically marked contexts), while null pronouns were predominantly used in the contexts that did not signal such information (i.e., pragmatically unmarked contexts). These findings are in line with previous theoretical accounts of the pragmatic status of pronouns in Turkish.

Table 3. Proportion of pragmatically marked and unmarked overt pronouns in speech that were accompanied by gestures. The proportions were calculated as the number of pronouns that were accompanied by gestures divided by the number of pronouns in speech.

As for the use of overt pronouns as opposed to NPs, we found that speakers used fewer overt pronouns in re-introduced referent contexts compared to maintained referent contexts. Additionally, they did not seem to have a strong preference for overt pronouns over NPs in maintained referent contexts (45% overt pronouns and 55% NPs). This is different than what we would see in non-pro-drop languages where speakers would have a strong preference for overt pronouns in such contexts (e.g., 97% overt pronouns and 3% NPs in maintained referent contexts in German in Perniss & Özyürek, 2015).
Overall speech findings showed that speakers of Turkish preferred null pronouns over overt pronouns to maintain referents, and NPs over overt pronouns to re-introduce referents. Considering overt pronouns were not used as a default/preferred marker of a certain discourse status, we can say that overt pronouns in Turkish are not strongly associated with discourse status but their main function is to mark pragmatic information. Based on our findings, we suggest that even though discourse status is a universal strategy that governs the choice between richer and reduced REs in general, the scope and the details of its effect may show cross-linguistic variation.

Reference tracking in gesture
Regarding gestures, we investigated whether both the discourse status of referents and the richness of the expressions in speech (NPs versus overt pronouns) modulated the presence/absence of gestures. We also explored whether co-speech gestures accompanying overt subject pronouns were sensitive to pragmatic context. That is, we analyzed whether overt pronouns that mark similarity or contrast between referents in speech are more likely to be accompanied by gestures than overt pronouns that do not mark such information in speech.
Even though we did not find the use of overt pronouns to be strongly associated with the discourse status of referents in speech, we found that gestures were influenced by both the discourse status of referents and the richness of referring expressions that were used in speech. Speakers of Turkish were more likely to gesture with re-introduced referents than with maintained referents. Speakers were also more likely to accompany referents with gestures when referents were expressed with an NP as opposed to a pronoun in speech. Hence, speakers of Turkish produced gestures in ways that were in line with the Principle of Quantity for topic continuity (Givón, 1984) and the Accessibility Theory (Ariel, 1990). Therefore, our findings are in line with those of previous studies that examined multimodal reference tracking mainly in non-pro-drop languages (Debreslioska & Gullberg, 2017;Gullberg, 2006;Levy & McNeill, 1992;Perniss & Özyürek, 2015). As a novel contribution to the literature, we showed that this is also the case in a pro-drop language, suggesting that speakers take both discourse status and the richness of speech into account while accompanying discourse referents with gestures as a possibly language-general strategy of multimodal referent tracking. Additionally, speakers of Turkish gestured with NPs more than with pronouns even though pronouns were not strongly associated with the discourse status of referents in Turkish, which contributes to the idea that speech and gesture are parts of an integrated system (Kita & Özyürek, 2003;McNeill, 1992;So et al., 2009) and gestures are sensitive to the richness of expression in speech. Our findings show that this is the case also in a pro-drop language.
As an additional note, the speakers in this study accompanied 51% of overt pronouns with gestures when overt pronouns were used in re-introduced referent contexts. For comparison, Perniss and Özyürek (2015) found that in German only 15% of overt pronouns in re-introduced referent contexts were accompanied by gestures (we calculated the proportions from the numbers provided in Table 1 in Perniss & Özyürek, 2015). It seems that pronouns in Turkish are relatively frequently accompanied by gestures when they are used in low-accessibility contexts compared to non-pro-drop German (see Figure 7 for an example of gestures accompanying pronouns in re-introduced referent contexts in the narratives we elicited). We suggest that this difference might point to a language-specific effect. Speakers of German may not frequently accompany overt pronouns with gestures because pronouns are habitually high accessibility markers in German, and expressions that mark high accessibility in speech are usually not accompanied by gestures. Thus, pronouns may be associated with infrequent gestures in German. In Turkish, however, pronouns are not strongly associated with high accessibility and possibly not with infrequent gestures, either. Pronouns then may be more likely to be accompanied by gestures in Turkish compared to a non-pro-drop language like German. It is also possible that speakers of Turkish use gestures to disambiguate referents when they are underspecified in speech in low-accessibility contexts, as would be the case when pronouns are used for re-introduced referents. This would be in line with what Ateş and Küntay (2018) found for Turkish-speaking children. That is, children usually accompanied pronouns with gestures to disambiguate their speech when they used pronouns in low-accessibility contexts.
Speakers of Turkish from very early on may develop a strategy of using gestures to specify potentially ambiguous referents when they use reduced expressions in speech, which they continue doing when they are adults, as well. Note that we did not systematically investigate the relation between under-specificity in speech and gestures in this study. Hence, a proposal like we outlined here would merit further research.

Conclusion
We investigated the relationships between speech and gesture during reference tracking in a pro-drop language, Turkish. We showed that overt pronouns were not strongly associated with the discourse status of referents in Turkish unlike in non-pro-drop languages where overt pronouns are the most commonly used RE type for maintained referents. Nevertheless, we found that speakers of Turkish were more likely to accompany subject referents with gestures when referents were re-introduced versus maintained and when referents were expressed with NPs versus overt pronouns in speech. Our findings, therefore, support those from previous research on multimodal reference tracking, which showed that the discourse status of referents and the richness of expression used in speech influence the use of gestures in discourse. Studying multimodal reference tracking extensively in a pro-drop language for the first time, we showed that both of these factors influence gesture production possibly as a language-general strategy. Further, as a possible language-specific finding, when pronouns were used in low-accessibility contexts in Turkish, they were likely to be accompanied by gestures more often than found for non-pro-drop languages in such contexts, possibly to disambiguate referents. The claims we present here, however, would merit further research on pro-drop languages different than Turkish.

Figure 7. The speaker first mentions that two women sitting at the table cannot open a jar. Then she re-introduces the character that is highlighted in still (a) with a pronominalized indefinite determiner and the character that is highlighted in still (b) with a third-person pronoun. Her whole-hand gestures temporally align with subject pronoun in bold in both (a) and (b).
Finally, we showed that even though pronouns in pro-drop languages were modulated by whether referents are pragmatically marked or not (e.g., for similarity or contrast), gestures were not sensitive to this kind of pragmatic information.
Studying a typologically different language, the findings we presented here illuminate further mechanisms underlying the orchestration of speech and gesture and they could be important for theories that try to account for the relations between information encoded in speech and gesture at the discourse level.

Notes
3. When the model with random slopes failed to converge, we also tried a different approach where we started with a "simpler" model with only fixed factors and by-participant random intercepts, added random slopes for one fixed factor at a time (first Discourse Status, then removing Discourse Status and introducing Pragmatic Context) and compared the "fuller" model with the random slope to the intercept-only model. The likelihood ratio test examining the variation accounted for when including/excluding random slopes in the model showed that adding by-participant random slopes for Discourse Status or Pragmatic Context did not account for more variation than the model with by-participant random intercepts only, χ2(2) = 1.112, p = 0.573 and χ2(2) = 0.819, p = 0.664, respectively. We therefore report the model only with random intercepts.
4. The log likelihood ratio test comparing the models with and without random slopes suggested that the model with random slopes did not account for more variation, χ2(3) = 1.152, p = 0.765.
5. The log likelihood ratio test comparing the models with and without random slopes suggested that the model with random slopes did not account for more variation, χ2(3) = 0.005, p = 0.999.

Example from the data (Turkish; English)
bare noun: anne; mother
demonstrative + noun: o kız; that girl
heavy modifier + noun: yemek yapan kadın; the woman who is cooking
simple modifier + noun: iki kadın; two women
definite noun: annesi; (his/her) mother
heavy modifier without head noun: masada oturan (kadın); (the woman) who is sitting at the