Understanding the dynamics of MOOC discussion forums with simulation investigation for empirical network analysis (SIENA)

Abstract This study attempts to make inferences about the mechanisms that drive network change over time. It adopts simulation investigation for empirical network analysis to examine the patterns and evolution of relationships formed in the context of a massive open online course (MOOC) discussion forum. Four network effects—homophily, reciprocity, transitivity, and preferential attachment—were tested to explain the dynamic mechanisms of interaction in the MOOC forum. Understanding the network dynamics of relationships will allow us to explore how to develop a robust peer-supported learning environment and in turn improve the online learning experience in MOOCs.


Introduction
As an online mode of teaching and learning, massive open online courses (MOOCs) have generally defied conventional wisdom as well as a growing body of knowledge in online education by placing enormously large numbers of learners in educational contexts with little to no support with their learning. In addition, most contemporary MOOCs have tended to adopt a predominantly content-centric approach to teaching and learning with little or no regard to the value of promoting and supporting a rich set of interactions between and among students and their teachers about the subject matter (see Naidu, in press). From experience with online education, as well as education more generally, we know that this is not a good idea because doing so invariably leads to problems with student persistence with their learning (see Klingsieck, Fries, Horz, & Hofer, 2012; McElroy & Lubich, 2013). Simply transporting what might have worked in a face-to-face educational setting into the online mode is neither advisable nor an effective approach to teaching and learning online, as doing so amplifies the act of delivering educational resources to students while undermining the importance of generating and promoting interaction among them and their teachers about the subject matter, as well as the potential of the online mode for supporting such interaction (Anderson, 2003; Harasim, 1989; Henri, 1992).
Interactions among participants in online educational settings can take many forms (see Bernard et al., 2009). How these forms of interactions can be understood and explored is critical to the design of online educational experiences. Furthermore, seeing these online learning interactions either as an artifact of the learning process or a product of learning can help to decide how they can and should be examined. In many courses with relatively small numbers of students, for example, in conventional online distance and flexible learning settings, the contents of the discussion forum, as intended by the instructor or perceived by the learners, is usually seen as an artifact of the learning outcomes. Interactions among participants in such discussion forums usually include instructional support from tutors and moderators, and their quality and quantity are taken into consideration in the formative assessment of student learning. For this reason, it is important to examine these interactions more carefully in order to be able to understand whether or not, and to what extent, learning is occurring.
The MOOC discussion forum is often an optional, open, and loosely structured environment, and the contributions in it can be seen as artifacts of the students' learning process. A major purpose of the discussion forum and its activities is to allow students to engage in an exploration of their ideas to develop their knowledge and understanding of the subject matter, hopefully resulting in improved learning. As part of this experience, instructors and students partake in posing questions and offering answers in discussion forums, expecting to form complex relationships and social structures in order to develop knowledge and understanding about the subject matter.
The conceptualization of online learning environments as MOOCs by Siemens and Downes, in fact, arose out of their realization of the role of this kind of connective knowledge building (Siemens & Downes, 2011). In so doing, MOOCs promote the idea of self-regulating, peer-supported, and collaborative learning environments that are driven by a wide variety and range of interactions between and among students and teachers. These interactions can have important consequences for how students learn in such a learning environment. To realize this potential, it is necessary to understand how online interactions in discussion forums can be designed to create and foster autonomous peer-supported online learning for a massive and global student body.
The massiveness of MOOCs and their open architecture serve as a dynamic educational environment for the study of interactions among participants, where learners can join and leave the space at will and at any time. This offers educational researchers a unique opportunity for understanding the dynamics of interactions among participants. A promising approach for the analysis of the dynamics of such free-flowing discussion forums is simulation investigation for empirical network analysis (SIENA), which enables insight into the changes of the studied network using a family of actor-driven models, that is, models in which an individual actor creates, maintains, or terminates ties to other actors (Snijders, van de Bunt, & Steglich, 2010). This approach considers changes in networks as a stochastic process, where the probabilities of changes are dependent on the network structure (e.g., network closure) and on actor attributes captured by observed covariates (e.g., role). The changes in relations over time affect a network's potential to foster or inhibit the development of self-generating and peer-supported learning environments. Using data from a MOOC discussion forum, and with the help of empirical network analysis, this study sought to measure the dynamic mechanisms by which networks are formed among MOOC participants.

Potential of network science for studying MOOC interactions
There is growing interest in analyzing large numbers of interactions in educational settings with the help of methods drawn from the field of network science, and in particular network analysis, which is derived from graph theory (see Bondy & Murty, 2008). The scholarly world has gradually realized that the network paradigm, which provides deep insights into network phenomena, has shifted the way we understand the nature of interactions in traditional group settings. It regards scalable interactions as a network by describing its parts and how they are connected to each other. Such a network representation of interactions is a powerful approach for studying the characteristics of scalable interactions (Gillani, Yasseri, Eynon, & Hjorth, 2014).
Network analysis has the potential to provide the metrics we need to understand the structural patterns of scalable interactions in MOOCs. More generally, in providing the metrics for quantifying social structure at different levels (i.e., actor, dyadic, triadic, and global), it has tremendous potential as an analytical tool for addressing fundamental questions in e-learning, that is, how actors, relationships, and network formations are maintained and developed to support learning. Classic types of social network analysis (SNA) studies (e.g., Xu, Zhang, Li, & Yang, 2015) have tended to be descriptive, using data from one point in time to generate a network graph. The resulting graph tends to look complicated, and it is difficult to explain what it contains. A major problem with these studies has been their assumption that discussion forums are stationary or in equilibrium.
More recent studies have realized the importance of exploring the changes in discussion forums. A few, for instance, have studied how interaction changes through forum discussions and called for measuring the changing interactions at the network level. These studies have utilized conventional SNA, but they have attempted to capture the changes in interactions by presenting network structures at different points in time. For example, Yang, Sinha, Adamson, and Rosé (2013), in preliminary work that applied SNA to network graphs measured at different points in time, examined how students interacted with their community as it existed at the time of their entry. This type of research is by nature exploratory rather than hypothesis-driven. Such studies have their limitations: they generate only static snapshots of a network at different points in time, and they fail to link the observed behavioral patterns within a network to the underlying effects of network structure and the characteristics of actors that may explain why these patterns emerge.
Since 2014, collaboration between network scientists and educational researchers has fortuitously resulted in the development of an interdisciplinary approach to understanding the dynamics of networks in educational settings. As an example, the work by Gillani et al. (2014) adopted complex network analysis to explore the vulnerability and information diffusion potential of MOOC discussion forums. In their study, they performed the same diffusion simulation on a MOOC forum network and on a randomized network. This approach, commonly used in the field of network science, is not widely seen in educational research and has greatly helped to uncover patterns and dynamics of discussion forums, which would not be possible with traditional methods. In another such interdisciplinary study, Kellogg, Booth, and Oliver (2014) took a social network perspective to study peer-supported learning in MOOC forums and found that MOOCs can be leveraged to foster these networks and facilitate peer-supported learning. Their research sheds light on the dynamic mechanisms driving the emergence of discussion networks and the channels by which these networks may be used in achieving the desired learning outcomes.
There is a need for research that utilizes dynamic network analysis to identify the underlying mechanisms influencing changes in interactions in MOOC networks and to further understand the social complexity of learning in such scalable network-based contexts. A major premise of this paper is that network science and, in particular, SIENA, has the potential to help us better understand the changing structures of MOOC discussion forums, and it should be considered a priority area for interdisciplinary research in such educational settings.

Potentials of understanding network dynamics for social learning
The effects of network formation on learning processes have been receiving a good deal of attention at international conferences, for example, at the 7th International Conference on Networked Learning held in 2010 in Denmark (Haythornthwaite & De Laat, 2010). The network view offers important contributions to knowledge about how learning is occurring among participants, as it provides measures such as hierarchy, clustering, or connectivity to describe what holds the network together (Haythornthwaite & De Laat, 2010). But educational research typically lacks a clear understanding of the structural dynamics of networks. Studies of how network structures impact learning in optional, open, and loosely structured settings are rare. There is an urgent need to explore how the overall change in relations in a network could affect a network's potential to foster or inhibit the development of a better learning environment in MOOC educational settings.
Understanding learning from a network perspective is not new. The social and communal foundations of learning are widely recognized (see Lave & Wenger, 1991; Sloep et al., 2012). A learner's ability to learn is seen more in terms of their ability to interact and work productively with their peers than in their ability to perform as an individual, with individuals developing a shared understanding through the use of technologies (Luckin et al., 2009). This view recognizes that learning is a social process which takes place beyond the individual mind (see Levy, 2011; Viswanathan, 2012), and SNA sits comfortably within this broad perspective of social learning theory.
An example of this is the notion of networked learning, a term that arose around the mid-1990s, which aims to understand network processes and properties in educational settings (Haythornthwaite & De Laat, 2010). The concept of networked learning is well placed to address the way learning takes place informally, with its focus on building and cultivating social networks and seeing technology as one part of this process rather than as an end in itself (De Laat, 2006). A network of relationships consists of resources as well as contexts in which learning takes place. Empirical studies along these lines have shown evidence of the central role networks play in the learning process (e.g., Kellogg et al., 2014; Mansur, Yusof, & Othman, 2011; Siciliano, 2016; Stepanyan, Borau, & Ullrich, 2010; Stepanyan, Mather, & Dalrymple, 2013).
A derivative of social learning theory is the notion of connectivism, which proposes that learning is best achieved through the formation of networks, in which connections are the key to effective learning (see Siemens, 2005). The interactions by which people learn from others comprise a network of relationships (Sie et al., 2012). That is, when learning socially from peers, people implicitly form relationships with them. Over time, these relationships lead to the building of a community of practice, which is widely acknowledged as a closely knit social structure associated with learning and commitment to joint practice (Wenger, 1998). These views of the role of social learning underscore the importance of taking a network perspective to understanding learning and learning processes. In this regard, understanding the formation of the network of relationships will allow us to explore how social learning takes place and in turn improve the design of the online learning experience in MOOCs.
The theory of information transmission also offers some insights into the learning process from a network perspective. The structure of networks can affect participants' access to information and resources. In discussion forums, through the interaction and communication of participants, information and knowledge can be dispersed among the participants. This is described in the literature as information transmission or information distribution. In the theory of social learning, information transmission plays a critical role in giving learners access to relevant information and other participants' knowledge with relatively low effort in a discussion forum. Arguably, the potential of a network to allow information diffusion depends to a large extent on the network structure and the actors involved (Toivonen et al., 2009).
High rates of attrition and steeply unequal participation patterns from learners are common in MOOCs and similar educational settings, and these issues continue to deserve significant research attention towards building a robust and sustainable learning environment (Clow, 2013). In the social-ecological systems literature, the concepts of sustainability, resilience, and robustness are implicitly similar or, at least, highly related terms (Berkes & Folke, 1998). These concepts refer to the capacity of a system to retain its basic functions when subject to unforeseen change. In MOOCs, if highly active learners drop out, the discussion network is likely to encounter a sudden change in its structure, and the discussion might shift in a way where others jump in to lead the conversation, move in a different direction, or fizzle out. The challenges arising from the large scale inherent in MOOCs can be addressed by leveraging the power and potential of massive numbers to develop a robust online learning environment (Kellogg et al., 2014). The more robust a scaled learning environment, the more effectively it can address the problem of the lack of instructional and social supports.
Examining the structural dynamics of networks has the potential to offer important insights into how to design a better social learning environment in MOOCs. Who interacts with whom by posting to the discussion forum, and how these interactions come to form a certain social structure, can inform how social learning is taking place among participants in MOOCs. The current study attempts to explore the mechanisms that drive the structural dynamics of networks in the context of MOOCs.

Effects of network dynamics
This study focuses specifically on using network configurations (reciprocity, transitivity, and preferential attachment) and an actor covariate (homophily) to examine the overall change in relationships formed in a MOOC discussion forum, and to explain how these four network effects could be used as metrics to inform the design of a better social learning environment.
Reciprocity is a very important measurement of mutual relationships in network settings and is studied on the dyadic level through a dyad census. Reciprocity refers to a communicative relationship in which a conversation is paired up with a returned flow. This does not necessarily mean that an equal amount of information is transmitted at a quantitative or qualitative level, but the emphasis is on there being a return of conversation flow. Research has shown that it is important that participants use the forum not only to express their own ideas and thoughts but also to interact with others by responding to their messages (Arvaja, Rasku-Puttonen, Häkkinen, & Eteläpelto, 2003). Students who actively engage in forum discussions are likely to receive more feedback from peers and to develop their thinking further. Reciprocal interaction is considered a vitally important part of sharing the cognitive processes at a social level (Resnick, Levine, & Teasley, 1993). To test network dynamics and the tendency to form mutual relationships, the following hypothesis was formulated:
H1: There is a tendency towards reciprocation in the studied MOOC network (i → j and j → i). (dyadic level)
A transitive relationship, in which A connects to B, B connects to C, and A also connects to C, may be more conducive to social learning, as participants are more likely to receive stimuli from multiple peers as the desired information diffuses through a network (Centola, 2010; Todo, Matous, & Mojo, 2015). Linking to different clusters is expected to support information transmission, as these clusters might serve to provide an individual with a variety of information sources (Hartman & Johnson, 1989). To support social learning, relations in a network should generally be characterized by a high tendency for reciprocity and transitivity, which leads to network cohesiveness. In cohesive subgroups (Everett & Borgatti, 1994), trust can be established (Sparrowe & Liden, 1997), which is supposed to form a learning-supported environment (Booher & Innes, 2002; Liebeskind, Oliver, Zucker, & Brewer, 1996). In such a learning-supported environment, robust exchange may occur within a large network (i.e., as in MOOCs) rather than in small groups as might be expected. To test network dynamics and the tendency to forge network cohesiveness, the following hypothesis was formulated:
H2: There is a tendency towards network cohesiveness (i.e., increasing transitivity and reducing distance between actors; i → j, j → k and i → k). (triadic level)
Homophily, which is a term coined by Lazarsfeld and Merton (1954), refers to the tendency of individuals to associate with those similar to themselves. Homophily is one of the most pervasive and robust tendencies of the way in which people interact with others (see McPherson, Smith-Lovin, & Cook, 2001). Intuitively, understanding the changing patterns of homophily in a network is very important for learning. The distribution of knowledge and the flow of ideas mostly occur among individuals who are similar, or homophilous (Rogers, 1995, p. 18). A network with a high degree of average homophily among actors is likely to disseminate information and (tacit) knowledge rapidly, that is, the actors have a better source for learning (Cross, Borgatti, & Parker, 2001; Powell, 1990). In this study, teaching staff (including instructors and teaching assistants) and students are considered as two distinctive groups of actors. To understand what role instructors play towards encouraging social interaction and supporting learning in MOOCs, and whether students can act as learning companions to assist learning, it is very important to unravel the interactions between students and instructors in MOOC learning environments. Empirical investigation of the actual case of MOOC discussion forums is essential for this kind of understanding (Selwyn, 2011). Given that most MOOCs are open-ended, voluntary learning environments and are likely to attract more intellectually open-minded learners, students are likely to play an important role in promoting interactions in MOOC discussion forums. To test network dynamics and the tendency towards homophily, the following hypothesis was formulated:
H3: There is a tendency towards an increasing volume of interactions between students.
Preferential attachment represents the tendency of heavily connected nodes to receive more connections in a network. That is, if a new participant contributes to the forum, the probability of replying to or being replied to by another would be proportional to its degree. The change in networks resembles a positive feedback loop where initially random variations (e.g., a participant having started to contribute earlier than others) are increasingly enlarged, thus greatly amplifying differences among participants. This dynamic process in networks is often described as the Matthew effect (Merton, 1968), or the rich-get-richer phenomenon, which is known to generate a power-law distribution in networks (Barabási & Albert, 1999). The tendency of preferential attachment results in a highly centralized network structure. Network centralization is a measure of how unevenly centrality is distributed in a network (Scott, 2000). Centrality is an actor-related measure and can be defined in different ways that relate to the importance or power of a participant in a network. Highly centralized networks appear to be conducive to the efficient transmission of information (Crona & Bodin, 2006), as the central participants play an important role in delivering messages. But central participants can manipulate the communications in networks, and thus, centralized networks are not likely to enable optimum levels of intellectual exchange because of the high imbalances of power in such settings (Leavitt, 1951). Highly centralized networks tend to be less robust to abrupt changes and can be vulnerable due to their strong reliance on a few heavily linked individuals. Learning processes are more likely to collapse if a central participant leaves the networks (Nicolini & Ocenasek, 1998). The high dropout rate of MOOC learners will affect communication patterns in MOOC forums, as a central participant is not likely to be replaced by another. 
To test network dynamics and the tendency towards preferential attachment, the following hypothesis was formulated:
H4: There is a tendency towards preferential attachment within the studied network.
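The rich-get-richer dynamic behind preferential attachment can be illustrated with a small simulation. The sketch below is not the model estimated in this study; it is a textbook-style growth process in which each newcomer replies to one existing participant chosen with probability proportional to that participant's current degree. The function name, the seed, and the number of newcomers are assumptions made purely for illustration.

```python
import random
from collections import Counter


def simulate_preferential_attachment(n_newcomers, seed=0):
    """Grow a toy reply network by degree-proportional attachment.

    Each newcomer creates one tie to an existing participant, chosen with
    probability proportional to that participant's current degree.
    Returns a Counter mapping participant id -> degree.
    """
    rng = random.Random(seed)
    # Start from two participants joined by one tie so all degrees are positive.
    degree = Counter({0: 1, 1: 1})
    for newcomer in range(2, n_newcomers + 2):
        participants = list(degree)
        weights = [degree[p] for p in participants]
        target = rng.choices(participants, weights=weights, k=1)[0]
        degree[target] += 1    # the chosen participant gains a tie
        degree[newcomer] += 1  # the newcomer enters with degree one
    return degree


degree = simulate_preferential_attachment(2000)
top_share = sum(d for _, d in degree.most_common(20)) / sum(degree.values())
print(f"share of tie ends held by the 20 best-connected participants: {top_share:.2f}")
```

Early entrants accumulate a disproportionate share of ties in such a process, which is the kind of centralization the hypothesis is concerned with.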

Context of the study
The context of this study is the 'Introduction to Psychology' (30700313X) Spring 2014 course at Tsinghua University (https://www.xuetangx.com/courses/TsinghuaX/30700313X/_/about), which is a required undergraduate course for majors in the Department of Psychology. In 2013, Tsinghua University launched its new learning portal XuetangX to host local MOOCs as well as courses from a consortium of leading universities worldwide. XuetangX, powered by the open-source platform edX, aims to increase Chinese students' access to high quality education, while transforming universities' campus-based learning experience. 'Introduction to Psychology' was one of the first courses to be launched at XuetangX as part of this initiative. The course began on 10 April 2014 and ran for about 5 months through to 12 September 2014. More than 10,000 learners registered for the course. As one of the most popular MOOCs in China, the course was offered again in Autumn 2014, Spring 2015, and Autumn 2015.
The instructor for the course has taught 'Introduction to Psychology' at Tsinghua University for a number of years. Two teaching assistants (TAs) joined the course, and they were responsible for releasing course information and answering questions asked by students in a discussion forum.
A set of videos, narrated by the instructor, and averaging less than 10 min each, was released on a weekly basis. Videos were interspersed with online exercises that allowed learners to put into practice the concepts covered in the videos. The course also provided a discussion forum for students. As there was no group assigned by the instructor, the discussion forum served as an optional, open, and loosely structured environment for participants to communicate and collaborate if they wished to do so. Compared to the first release of other Chinese MOOCs at XuetangX, this course had a rather popular discussion forum where students asked questions and answered those posed by others. About 5000 discussion messages were posted online during the first offering of the course (April 2014-September 2014). Some of these messages were in the form of enquiries about subject matter, exercises, course materials, and the logistics of the course. Students also used the discussion forum as a platform to report on their own study, as well as seek out social activities. Some of the messages also included feedback on the course.
The ubiquitous online discussion forum has long been seen as a suitable place for asynchronous communication and discussion among participants on a large scale. In these settings, students can either create new posts to initiate a discussion or join a discussion by either replying to posts or commenting on any reply to a post. Each one of these discussions may consist of one or more posts, which might be responded to with replies or comments. The discussion forum embedded in the MOOC platform maintains a simple interface for its use. No advanced functions, such as rich site summary subscription, are provided; thus the discussion forum will not automatically deliver every new message to its participants via email. Participants must visit the interface of the forum to follow discussions posted by others. When using the forum, participants decide whether to read or respond to a post. In this sense, participants can be very selective with regard to what interests them.

Method
This study adopted SIENA, which performs statistical estimation of stochastic actor-driven models for repeated measures of empirical networks. The family of the dynamic actor-driven models assumes that participants who are represented by the nodes in the network play a crucial role in changing their connections to others (Steglich, Snijders, & Pearson, 2010). The behaviors of actors (treated as dependent variables) affect network dynamics, and in return, the network dynamics influence their behaviors, which is seen as a co-evolution of networks and behavior (Snijders, 2011). The method is implemented in the R package RSiena (SIENA version 4), developed by Snijders et al. (2010; https://www.stats.ox.ac.uk/~snijders/siena/).
The underlying assumption of actor-driven models for the analysis of data is that all network changes can be broken down into very small steps, so-called ministeps, in which one actor creates or terminates one outgoing tie. These ministeps are probabilistic and made sequentially. Whether an actor makes a ministep, as well as which ministep an actor will make, is estimated by means of an objective function that actors are assumed to optimize. Hence, based on the objective function, all possible ministeps are evaluated, and the acts of the actor are determined.
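To make the ministep idea concrete, the following sketch evaluates every change available to one actor and turns the objective-function values into choice probabilities. It assumes the multinomial-logit form commonly described for stochastic actor-oriented models (probability proportional to the exponential of the objective function); the function names, the toy objective, and its weights are hypothetical and are not the estimates reported in this study.

```python
import math


def ministep_probabilities(adj, actor, objective):
    """Return {option: probability} for one ministep of `actor`.

    Options are toggling the outgoing tie to each other actor j, plus None
    for "change nothing". Each option is weighted by
    exp(objective(network after the change, actor)).
    """
    n = len(adj)
    scores = {}
    for option in [None] + [j for j in range(n) if j != actor]:
        candidate = [row[:] for row in adj]  # copy, leave the input untouched
        if option is not None:
            candidate[actor][option] = 1 - candidate[actor][option]
        scores[option] = objective(candidate, actor)
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {o: math.exp(s - top) for o, s in scores.items()}
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}


def toy_objective(adj, i, beta_outdegree=-1.5, beta_reciprocity=2.0):
    # Hypothetical weights, chosen only for illustration.
    outdegree = sum(adj[i])
    reciprocated = sum(adj[i][j] * adj[j][i] for j in range(len(adj)))
    return beta_outdegree * outdegree + beta_reciprocity * reciprocated


# Actor 1 has already replied to actor 0; nobody else has replied to anyone.
adj = [[0, 0, 0],
       [1, 0, 0],
       [0, 0, 0]]
probs = ministep_probabilities(adj, actor=0, objective=toy_objective)
for option, p in probs.items():
    print(option, round(p, 3))
```

With these illustrative weights, actor 0 is most likely to reciprocate the tie from actor 1, less likely to do nothing, and least likely to start an unreciprocated tie to actor 2.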
The objective function, which is used in this research, depends on two types of effects: structural and covariate. Structural effects capture endogenous network mechanisms. In this research, the following structural effects were used:
• Reciprocity is represented by the number of reciprocated ties (measure of mutuality). Reciprocity estimates the probability of participant B replying to participant A, given that A has replied to B.
• Transitive triplets or transitivity is represented by the number of ties to participants who are the friends of my friends (measure of network closure; that is, transitivity estimates the probability of A replying to C, if A has replied to B, and B has replied to C).
• Activity of alter is defined by the sum of the outdegrees of the others to whom the actor is tied (measure of activity attraction; that is, participants who have already received many replies are likely to be replied to by others).
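The three structural effects listed above can be read as simple counts on a directed adjacency matrix. The sketch below computes them for a single actor, following the verbal definitions given in the list; the function names and the three-person example network are assumptions for illustration, and the exact statistics used by the estimation software may be parameterized differently.

```python
def reciprocity(adj, i):
    """Number of actor i's outgoing ties that are returned."""
    return sum(adj[i][j] * adj[j][i] for j in range(len(adj)))


def transitive_triplets(adj, i):
    """Number of ordered pairs (j, h) with i -> j, j -> h and i -> h."""
    n = len(adj)
    return sum(adj[i][j] * adj[j][h] * adj[i][h]
               for j in range(n) for h in range(n))


def alter_activity(adj, i):
    """Sum of the outdegrees of the actors to whom i is tied."""
    return sum(adj[i][j] * sum(adj[j]) for j in range(len(adj)))


# Toy network with A=0, B=1, C=2: A -> B, A -> C, B -> A, B -> C.
adj = [[0, 1, 1],
       [1, 0, 1],
       [0, 0, 0]]
print(reciprocity(adj, 0))          # A's tie to B is returned
print(transitive_triplets(adj, 0))  # A -> B, B -> C, A -> C
print(alter_activity(adj, 0))       # outdegree of B plus outdegree of C
```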
Covariate effects estimate the network dynamics based on exogenous factors, for example, the role of actors. In this research, one dyadic constant covariate effect (same role) was used:
• Same role: role was coded as 0 for students and as 1 for the non-student roles (i.e., instructor and TA).
Based on the conjectured hypotheses, three models were defined. In model 1, two structural effects, reciprocity and transitivity, were estimated to either accept or reject H1 'there is a tendency towards reciprocation in the studied MOOC network', and H2 'there is a tendency towards network cohesiveness.' In model 2, the two structural effects, reciprocity and transitivity, plus the covariate effect, same role, were estimated to either accept or reject H3 'there is a tendency towards an increasing volume of interactions between students.' In model 3, the three structural effects, reciprocity, transitivity, and activity of alter, were used to either accept or reject H4 'there is a tendency towards preferential attachment within the studied network.'
In SIENA, the conditional method of moments estimation (MoM) was used to decide how significant the structural and covariate effects were (Ripley, Snijders, Boda, Voros, & Preciado, 2015). The conditioning variable was the total number of observed changes (distance) in the network variable. For models where the convergence was unsatisfactory (i.e., the absolute value of the t-statistic was above 0.25 for some effects from the model), the analysis was rerun using the obtained results as a starting value (Ripley et al., 2015). This procedure was repeated several times until convergence was reached.
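The rerun-until-converged procedure can be sketched as a simple loop. The code below is not the RSiena interface: `estimate` stands for any hypothetical routine that takes starting values and returns new parameter values together with one convergence t-statistic per effect, and `toy_estimate` is a made-up stand-in that only serves to make the example runnable. Only the 0.25 threshold comes from the text.

```python
def estimate_until_converged(estimate, start, threshold=0.25, max_runs=10):
    """Re-run `estimate` from its previous result until every |t| <= threshold.

    Returns the final parameter values and the number of runs needed.
    """
    params = start
    for run in range(1, max_runs + 1):
        params, t_stats = estimate(params)
        if all(abs(t) <= threshold for t in t_stats):
            return params, run
    raise RuntimeError(f"no convergence after {max_runs} runs")


def toy_estimate(params, target=(2.0, 1.0)):
    # Hypothetical estimator: moves halfway to a fixed target on each run and
    # reports the remaining distance as a stand-in for the t-statistics.
    new_params = [p + 0.5 * (t - p) for p, t in zip(params, target)]
    t_stats = [t - p for p, t in zip(new_params, target)]
    return new_params, t_stats


params, runs = estimate_until_converged(toy_estimate, start=[0.0, 0.0])
print(params, runs)
```

Capping the number of runs is a design choice of this sketch, so that a model that never converges is reported rather than retried indefinitely.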
Messages posted after 2 September 2014 were excluded from data analysis, because these messages are related to greetings on Teacher's Day on 10 September. As there are only 8% of weighted ties, the non-valued reduction of this original network will not significantly alter network dynamics. In this research, the weighted network was transformed into a binary network, which contains 1915 participants and 4547 ties. The density of the binary network (the total of all tie values divided by the number of possible ties) is 0.001. This suggests that the interaction in the network was relatively sparse. Since most of the ties are always absent, sparse adjacency matrices were used to store the relational data.
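As a quick check on the figures above, the sketch below binarizes a weighted tie list and computes the density of a directed network as the number of ties divided by the number of possible ordered pairs. Storing only the ties that exist, here as a set of (sender, receiver) pairs, is one simple sparse representation; the helper names and the three-person example are assumptions for illustration.

```python
def binarize(weighted_ties):
    """Map {(sender, receiver): weight} to the set of ties with positive weight."""
    return {pair for pair, weight in weighted_ties.items() if weight > 0}


def density(n_actors, n_ties):
    """Ties divided by the n * (n - 1) possible directed ties (no self-ties)."""
    return n_ties / (n_actors * (n_actors - 1))


# Toy example: one tie carries a weight above one and is reduced to presence.
weighted = {("a", "b"): 3, ("b", "a"): 1, ("c", "a"): 1}
ties = binarize(weighted)
print(density(3, len(ties)))

# Figures reported for the studied network: 1915 participants and 4547 ties.
print(round(density(1915, 4547), 4))
```

The second value rounds to the reported density of 0.001.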
The evolving network was examined using SIENA with six time periods (see Observation time in Table 1), with an approximately equal number of messages (i.e., posts, replies, and comments) being posted in each time period. As interactions in MOOCs are asynchronous and replies to messages can occur later, the network grows with time. Thus, the accumulated number of links was used for analysis in each period. As a measure of stability, the Jaccard coefficients (see Snijders et al., 2010) for two sequential periods, varying from 0.529 to 0.892, indicate that the network dynamics across the six periods are smooth enough, which justifies the use of six periods as appropriate in this study.
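The Jaccard coefficient between two consecutive observations is the number of ties present in both, divided by the number of ties present in at least one. The sketch below computes it on sets of (sender, receiver) pairs; the function name and the toy waves are assumptions for illustration and do not reproduce the study's data.

```python
def jaccard(ties_before, ties_after):
    """Ties present in both waves divided by ties present in either wave."""
    union = ties_before | ties_after
    if not union:
        raise ValueError("both waves are empty")
    return len(ties_before & ties_after) / len(union)


# Cumulative waves: the later wave contains every tie of the earlier one.
wave_1 = {("a", "b"), ("b", "c")}
wave_2 = wave_1 | {("c", "a"), ("b", "a")}
print(jaccard(wave_1, wave_2))
```

Because accumulated ties are used, each wave contains the previous one, so the coefficient reduces to the ratio of the earlier tie count to the later tie count.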

Descriptive statistics of the discussion network
In this MOOC, 1915 participants posted 5251 messages in total, of which 217 are original posts, 2553 are replies to the original posts, and 2481 are comments on the replies (see Table 2). On average, each discussion thread attracted 23 replies and comments, ranging from a minimum of one to a maximum of 560. The instructor led 26 discussion threads, replied to 17 messages, and commented on 91 messages. TAs posted two messages, replied to 24 messages, and commented on 134 messages. Table 3 presents the results of SIENA estimation. The rate parameter estimates the speed with which the dependent variable will change over two sequential time periods. As shown in Table 3, a participant on average had 0.3698 times the opportunity to change one of their reciprocal or transitive ties in the first period of observational time, and 0.3850 times the opportunity to change their outgoing ties to those similar to themselves starting with the first time period.
Table 3. SIENA estimation results with standard errors in parentheses.

SIENA estimations
* non-significant at p = 0.05.
Reciprocity estimates the tendency of participant B to reply to A, if A has replied to B. Transitive triplets estimate the tendency of participant A to reply to C, given that participant A has replied to B and B has replied to C. The effects of reciprocity and transitivity were estimated in model 1. As shown in Table 3, these two effects are significant (p < 0.001) and the coefficients are positive. The reciprocity effect in model 1 varies from 2.9055 to 5.8586 across time periods. Thus, hypothesis H1, which asserts that there is a tendency to create reciprocal links, is accepted. This means that if participant A replied to participant B earlier, there is a tendency for participant B to reply to participant A later in the studied network. The transitivity effect is significant with a positive coefficient ranging from 1.2575 to 3.1882. At the triadic level, it is confirmed that there is a tendency for participant A to reply to participant C, if A has replied to B and B has replied to C. The results of model 1 indicate a tendency for participants to create mutual relationships at both dyadic and triadic levels, which leads to cohesiveness in the studied network. This confirms hypothesis H2: that there is a tendency towards network cohesiveness.
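To make the two structural configurations concrete, the following sketch counts them in a small directed network. It only counts the observed configurations; it does not estimate the SIENA effects, which are obtained by simulation. The function names and the tie representation are assumptions made for illustration.

```python
def count_reciprocated_pairs(ties):
    """Number of unordered pairs {A, B} where A replied to B and B replied to A."""
    ties = set(ties)
    ordered = sum(1 for (a, b) in ties if a != b and (b, a) in ties)
    return ordered // 2  # each mutual pair is seen once from each side


def count_transitive_triplets(ties):
    """Number of ordered triples (A, B, C) with ties A->B, B->C and A->C."""
    ties = set(ties)
    successors = {}
    for a, b in ties:
        successors.setdefault(a, set()).add(b)
    count = 0
    for a, b in ties:
        if a == b:
            continue  # ignore self-loops
        for c in successors.get(b, ()):
            if c != a and c != b and (a, c) in ties:
                count += 1
    return count


# Toy network: A and B reply to each other, and both reply to C.
toy = {("A", "B"), ("B", "A"), ("B", "C"), ("A", "C")}
```

In the toy network there is one reciprocated pair (A and B) and two transitive triplets: A→B→C closed by A→C, and B→A→C closed by B→C.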
Hypothesis H3 explores the influence of the same role on participants' interaction. To test this hypothesis, the same role covariate was estimated in model 2 (as shown in Table 3), and reciprocity and transitivity were used as control variables. As shown in Table 3, same role is a significant covariate effect (p < 0.001) with negative coefficients in the six periods of observational time. The considered network is therefore heterophilic; that is, participants do not have a preference for creating links with those similar to themselves. Thus, H3 is rejected, indicating that there is no tendency towards an increasing volume of interactions between students.
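A simple descriptive counterpart to the same role effect is the share of ties that connect participants with the same role. The sketch below uses the role coding given earlier (0 for students, 1 for instructor and TAs); the function name and data layout are hypothetical, and the raw share is not the SIENA estimate, since it does not control for group sizes or for reciprocity and transitivity.

```python
def same_role_share(ties, role):
    """Fraction of directed ties whose sender and receiver have the same role code."""
    ties = list(ties)
    if not ties:
        raise ValueError("no ties to evaluate")
    same = sum(1 for sender, receiver in ties if role[sender] == role[receiver])
    return same / len(ties)


# Toy example: two students (0) and one teacher (1).
roles = {"s1": 0, "s2": 0, "t": 1}
toy_ties = [("s1", "t"), ("s2", "t"), ("t", "s1"), ("s1", "s2")]
share = same_role_share(toy_ties, roles)
```

Here only one of the four ties links two participants with the same role, so the share is 0.25, a toy illustration of a pattern in which most interaction crosses role boundaries.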
Hypothesis H4 states that there is a tendency towards preferential attachment within the studied network. The activity of alter estimation was calculated to test this hypothesis. In model 3 (as shown in Table 3), the activity of alter effect is significant (p < 0.005), with a coefficient that is positive but relatively small: it varies from 0.0545 to 0.2920 in different time periods. Thus, H4 is accepted, indicating a tendency for participants who are actively involved in forum discussions in the early stages to become even more engaged over time.

Discussion and conclusion
This study is an empirical investigation of the network dynamics of a MOOC discussion forum, which contributes to the knowledge of how online interactions can be designed to create and foster a better networked learning environment to meet the challenges of accommodating a massive and global student body. In an attempt to understand the mechanisms that drive the structural dynamics of networks in the context of MOOCs, a SIENA analysis using actor-driven models was performed.
Designing a robust and sustainable learning environment in MOOCs where learners can join and leave the space at their will is a challenging task. As stated earlier, particular concerns are the extremely high rates of attrition and the pattern of steeply unequal participation in MOOCs (Clow, 2013). Due to the massive number of registered learners, it is difficult for MOOC instructors to interact with individual students on a one-to-one basis (Bruff, Fisher, McEwen, & Smith, 2013). The key to a successful MOOC is to be able to promote openness by creating a self-generated and learner-supported learning environment where learners are intellectually open to change and willing to share ideas with others. And to be able to achieve this outcome, it is important to understand the structural dynamics of MOOC discussion forums which comprise the engines for the development of such a robust learning environment.
The results of the present study show that preferential attachment, commonly referred to as the rich-get-richer effect, is present in the studied network. Participants who are actively involved in forum discussions are likely to become even more engaged, and these participants are likely to play a key role in delivering messages. Their contributions to the discussion forum appear to promote the efficient transmission of information. However, it is also likely that some participants manipulate the communication flow in the MOOC forums. If these participants drop out of the course, the discussion is most likely to discontinue. Such a learning context is not a robust learning community, as was illustrated in the work of Kellogg et al. (2014). How to lessen preferential attachment effects to maintain efficient information transmission as well as increase robustness of a discussion forum in the design of a networked learning environment is key to the development of sustainable MOOCs.
Reciprocity and transitivity are represented by the number of reciprocated and triple ties in this study. The results of this study show that the effects of reciprocity and transitivity are significantly evident in the studied MOOC forum. This finding echoes the results of some qualitative studies using interviews to study MOOC participation. For example, in their qualitative study Waite, Mackness, Roberts, and Lovegrove (2013) also found reciprocal relationships in learner participation in a MOOC. This study further identified an increased level of cohesiveness in the studied network, since participants tend to respond to reciprocal partners and connect to others in a transitive way. The measurable effect of these metrics on participant interaction appears to be network closure. The change towards network closure potentially affects the behaviors of participants in the discussion forum. In such a network setting, participants are likely to become more selective when interacting with others. Buder, Schwind, Rudat, and Bodemer (2015) have also realized this problem in scalable discussions, and tested a new design of navigation in their study. More studies using network effects to inform the design of a social learning environment that could accommodate a global scale body of students are urgently needed.
The significant effects of reciprocity and transitivity also lead to the creation of many different cohesive clusters, which support the transmission of different information sources. This network structure provides students with multiple channels to access information and knowledge, so that they can obtain information or (tacit) knowledge from divergent student groups. The volume of information and knowledge distributed in such a scalable network appears to be greater than that in homogeneous networks. Increasing the level of reciprocity and transitivity in scalable discussions seems like a useful strategy that merits serious attention. This strategy is the key to promoting 'global-scale conversations' (Gillani & Eynon, 2014, p. 25), and can be used to help improve learning in the context of MOOCs.
Role heterophily and its measurable effect on participant interaction are significantly evident in the studied MOOC forum. There is no apparent tendency for students to reply to the messages initiated by their student peers. This finding is in contrast to the findings reported in the literature (e.g., Kellogg et al., 2014), which indicate that students are likely to play an important role in promoting interactions in the forum. Given the larger number of students (relative to the number of teaching staff), this pattern of participant interaction is not conducive to information transmission. In such a network, information and (tacit) knowledge are disseminated slowly within the group of students, while information travels faster from teaching staff to students. The heterophily effect observed in this study is possibly a result of the typically teacher-centered view of teaching in the Chinese educational context. In this context, teachers are very highly respected by their students, and are seen as the source of all knowledge and the correct answer. The process of receiving feedback directly from a teacher in a discussion forum is valued over and above any feedback from the discussions between and among students themselves. Peer-supported learning has only recently been introduced in conventional Chinese universities, and as such, its practice needs to be further refined. The heterophilic effect observed in this study might also reflect Chinese perceptions of MOOCs. MOOCs were first introduced in China as a pathway to high quality education on a larger scale. Since then, offering MOOCs in China has been seen as a key strategic goal for increasing access to high quality education and to highly regarded professors from elite universities, as well as an opportunity for students to meet and interact with other students, instructors, and TAs.
Yet despite the wealth of experience of open universities in China in providing educational opportunities on a large scale, few lessons have been learned by MOOC providers, the new players on the scene (Wei, 2008), and not surprisingly, the notion of peer-supported learning remains a vision not yet fully realized by providers in the Chinese online education system.
To conclude, this study adopted SIENA to explore the mechanisms that drive the dynamics of interactions in the context of a MOOC offered by a highly rated Chinese university. Investigating how interactions occur and change over time highlights the unique potential of MOOCs to accommodate a global-scale body of students, as well as offering new insights into the pedagogical value of MOOCs. The results of this study suggest that when designing a robust, peer-supported online learning environment at scale, there is a need to consider how to increase the effects of reciprocity, transitivity, and homophily from a network perspective. Moreover, how to lessen the preferential attachment effect to avoid vulnerability of a discussion network, while maintaining efficient information transmission, is key to the design of a sustainable networked learning environment.
Maxim Skryabin is a senior data scientist at Stepic.org, and works as an associate at Beijing Normal University and as an associate professor at ITMO University. He has a PhD in mathematics and physics, an MA in age and developmental psychology, and an MA in comparative education.
Xiongwei Song is an associate professor at the Chinese Academy of Governance. He obtained his PhD from the University of Sheffield, UK. His work has been interdisciplinary, including using learning theories to interpret the process of policy implementation, and adopting the framework of deliberation in democracy.