Rational-Critical User Discussions: How Argument Strength and the Conditions Set by News Organizations Are Linked to (Reasoned) Disagreement

Abstract
Because online discussions can influence the individual and societal formation of opinions, their quality has been a subject of widespread interest. From a deliberative perspective, rational argumentation and critical reflection are central criteria for good discourse. Drawing on research on the perception of arguments and the conditions of disagreement, we ask how argument strength is linked to the likelihood of receiving (reasoned) disagreement and whether the discussion norms and technical features set by the news organizations moderate this effect. Based on a manual content analysis of 14,690 user comments on nine German news websites, we find that comments with a higher argument strength are more likely to receive disagreement in general and reasoned disagreement in particular. Further, the levels of (reasoned) disagreement are higher on platforms with strong discussion norms and supporting technical features. The results show that the quality of a discussion can be related to both users' argumentation and the decisions of the news organizations.

violated (Reich 2011). From a strategic viewpoint, user comment sections can be seen as a tool to compete with other news organizations for readers and to retain readers on the site (Reich 2011; Manosevitch and Tenenboim 2017). Because comments can be perceived as indicators of relevance, discussions with a high level of controversial interaction and well-reasoned arguments can increase traffic and thus be beneficial for the news brand.
Previous research has studied the degree to which user discussions fulfil deliberative requirements. The results vary, depending on the context, both for the quality of arguments and for the expression of disagreement (Collins and Nerlich 2015;Freelon 2015;Graham 2009;Oz, Zheng, and Chen 2018). These studies usually analyse the criteria in an aggregated way (e.g. the share of rational comments per website) in order to compare different discussion contexts. However, how users react to each other also depends on previous discussion content they encounter in the discussion (Graham 2009). The relationship between rational-critical expressions and the content of previous comments has not been studied so far. Therefore, this study asks: Under which conditions does a rational-critical discourse evolve in user comment discussions? In this article, we will discuss the requirements from the perspective of deliberative democracy and combine them with empirically falsifiable assumptions about the effects of argument strength and platform features set by news organizations.

Rational-Critical Discourse from a Deliberative Perspective
Based on Habermas' (1996) work on communication in the public sphere, the theory of deliberative democracy posits that political decisions should be legitimized by public deliberation, i.e. societal discourse and negotiation about solutions for collective problems. Thus, the exchange of arguments among citizens is supposed to promote the formation of opinions and wills (Graham 2009). Scholars have debated whether user discussions improve democratic debate or open up the space for anti-democratic behaviour (Strandberg and Berg 2013). Their anonymous nature and low barriers of entry can enable free speech (Ruiz et al. 2011), but also disrespectful expressions (Coe, Kenski, and Rains 2014). News organizations can promote deliberative norms by defining commenting policies and technical features for the discussion, discouraging and sanctioning violations of discussion rules (Ksiazek 2015). They can highlight valuable comments for the audience, journalists or political decision-makers and thus enable them to follow up on the discussion (Manosevitch and Tenenboim 2017). Thus, news organizations intervene in the discussions, which could be understood as a violation of the free speech condition necessary for deliberation (Wessler 2018). However, considering the prevalence of disrespectful and generalizing content (Coe, Kenski, and Rains 2014), moderation can also be crucial in enabling deliberative discourse (Ruiz et al. 2011).
To clarify this ambiguity, research has assessed the deliberative quality of user discussions and studied its conditions (Wessler 2018). Following this tradition, we do not understand comment sections as perfect realizations of the deliberative ideal, but as partial public spheres which offer the opportunity for deliberative discussions between citizens. To approach the individual and societal implications of discussions that partly fulfil deliberative criteria, it is crucial to gain a deeper knowledge of the specific processes that take place between individual deliberative criteria (Mutz 2008).
Among the deliberative criteria identified in the literature (Wessler 2018), two constitute the rational-critical nature of deliberation: rationality (i.e. providing justifications) and critical reflection (i.e. critically replying to other participants' arguments). These criteria are central to the classical concept of deliberation (Graham 2009; Wessler 2018), because they are strongly associated with the cross-cutting exchange of opinions. Gathering and evaluating arguments for or against a certain condition is the central process of deliberating (Landemore and Mercier 2012). The interplay of rationality and critical reflection is an elementary precondition for citizens to gain knowledge about the arguments of the other side and to feel competent to engage in a discussion with people with different opinions (Graham 2009; Mutz 2006). When these criteria are fulfilled in a discussion, a process of mutual understanding can take place and the discussion is less likely to become polarized into two isolated groups that cease to engage with each other (Graham 2009; Stromer-Galley 2003). Although other deliberative criteria are also relevant for deliberative discourse, we consider the analysis of these two criteria especially relevant for understanding the dynamics of user discussions. The way they interact has not been studied so far. Previous research has studied how disagreement is linked to civility (Chen and Lu 2017), but has not looked into the connection to rationality.

Rationality
A rational-critical discourse requires participants to present well-reasoned arguments (Graham 2009). These are often understood as claims supported by justifications (Wessler 2018). In a narrower sense, valid arguments have to be based on evidence, i.e. objective facts or sources that can be traced by others (Stroud et al. 2015). This demand follows the understanding that the strength of an argument is determined by the amount of supporting information (Ricco 2008; Price and Neijens 1997; Kuhn, Shaw, and Felton 1997). There has been extensive theoretical debate about the deliberative focus on rational argumentation, which has led to a wider conceptualization that accounts for other communicative styles, such as storytelling or emotional talk (Wessler 2018). If these styles are used as justifications for a claim, they can be considered a special case of reasoning, which is not strictly rational in its content, but in its function. Justifications that are not based on evidence can be based on conditions, sentiment, or appeals to a precedent, a perceived majority or an authority (Kuhn, Shaw, and Felton 1997). We can therefore distinguish different degrees of rationality, with verifiable justifications having the highest degree, and non-verifiable justifications having a medium degree. There are different expected effects of rational argumentation: On the individual level, people are expected to sharpen their beliefs and argument repertoire and gain a broader knowledge of the arguments of the other side (Mutz 2008). On the societal level, rational discourse is expected to improve comprehensibility, common understanding (Bächtiger et al. 2009; Jensen 2003), and decision-making (Wessler 2008). Research has found that in online discussions, citizens present justifications to varying degrees (Collins and Nerlich 2015; Freelon 2015; Jensen 2003; Ruiz et al. 2011; Wright and Street 2007).
Justifications with verifiable evidence appear less frequently than others (Santana 2019; Stroud et al. 2015). Asynchronous digital environments can promote rational conversation because participants have enough time to develop well-reasoned arguments (Graham 2009). Regarding the effects of rationality, research has shown that news items containing justifications lead to larger argument repertoires (van der Wurff, Swert, and Lecheler 2018) and more participation in user discussions (Marzinkowski and Engelmann 2018). Furthermore, people who participate in rational-critical discussions acquire a broader knowledge about the arguments and a better understanding of the opposing side, which in turn favours participation (Cappella, Price, and Nir 2002).

Critical Reflection
For a rational-critical discourse to take place, those participants who disagree with another person's argument or claim are expected to express this disagreement (Chen and Lu 2017;Graham 2009). As Graham (2009) argues, disagreement alone does not fully capture all requirements of the critical in rational-critical discourse, because it can be expressed without a prior reflection of the arguments. He suggests critical arguments as a more appropriate criterion. Thus, two levels of critical reflection can be identified, a simple expression of disagreement, and a more demanding expression of reasoned disagreement. From the normative perspective, disagreement is essential for the development of deliberative discourse. It highlights the different positions and argumentations, shows where they are conflicted, and enables the development of new ideas, compromises, and political tolerance (Stromer-Galley 2007;Stromer-Galley and Muhlberger 2009). Previous studies have found varying levels of disagreement in political discussions (Freelon 2015;Ruiz et al. 2011). Disagreement has rarely been investigated in combination with other deliberative criteria (Graham 2009). Qualitative studies have analysed critical reflection and found it to be practiced frequently (Tanner 2001;Dahlberg 2001). Quantitative studies, however, have been missing so far. Regarding the effects of critical reflection, previous studies have found that exposure to opposing views promotes one's awareness of political arguments and political tolerance (Mutz 2008) and can lead to further critical participation (Lu 2019).
Conceptually, the elements of rational-critical discourse are strongly connected (Cappella, Price, and Nir 2002). By disagreeing, participants establish a relation between their own arguments and the arguments of previous statements. To study this connection, the normative assumptions discussed above have to be supported by falsifiable theories on the perception and effects of argument strength. The following section will therefore present a theoretical model to explain critical reactions in rational argumentation settings.

Theoretical Explanations for the Effectuation of Rational-Critical Discourse
The Perception of Attitude-Challenging Views
Encountering attitude-challenging comments in comment sections is very likely because affiliation is not based on a personal acquaintance or specific interests (Barnidge 2017; Lu 2019). When people encounter attitude-challenging views, their previous beliefs are challenged and a feeling of unease arises, described as cognitive dissonance (Chen and Lu 2017). Dissonant cognitions can be "any knowledge, opinion, or belief about the environment, about oneself, or about one's behaviour" (Festinger 1957, p. 3). Thus, any comment expressing an opinion or information that stands in conflict with one's own beliefs can trigger this effect. People strive towards consistency of their beliefs (Festinger 1957). The described psychological unease arises because dissonant cognitions challenge one's self-image (Garrett 2009). When confronted with dissonant cognitions, people aim at reducing the dissonance and achieving consonance in order to strengthen their self-image (Festinger 1957; Jeong et al. 2019; Lu 2019). Leung (2009) found that online commenters engage in discussions in order to establish their identity, gain respect, or build up confidence. Increasing self-esteem and expressing one's own identity are also important motivations for writing comments (Springer, Engelmann, and Pfaffinger 2015).
The theory of cognitive dissonance states that the higher the meaning attributed to a dissonant cognition, the higher the strength of cognitive dissonance and thus the need to react (Festinger 1957). If a cognition is only moderately dissonant, the urge to reduce the dissonance will also be moderate. In this scenario, the fear that adding a new cognitive element could further increase the dissonance may prevent people from reacting. If a cognition creates a strong degree of dissonance, however, this fear is less likely and the urge to reduce the dissonance dominates (Festinger 1957). Festinger (1957) argues that the degree of dissonance evoked by a cognition depends on the importance attributed to it. In the context of a discussion thread with diverging opinions, we assume that this importance depends on the level of persuasiveness that users attribute to the cognition. This assumption is based on previous research that argues that the presence and quality of a justification, sources, or additional information to an opinion are relevant for its persuasiveness (Gunther and Liebhart 2006; Kim, Wyatt, and Katz 1999; Price and Neijens 1997). This assumption is in line with other theoretical perspectives. First, studies on the Elaboration Likelihood Model (Petty, Cacioppo, and Goldman 1981) argue that a higher attributed relevance to an argument can lead to higher levels of elaboration and understanding of arguments (O'Keefe and Jackson 1995). Second, studies on corrective actions in online discussions have shown that the potential reach that is attributed to a piece of information can increase the willingness to argue against it (Rojas 2010): if an argument is perceived as more evidence-based, it can also be expected to be more convincing to others. Based on these considerations, we assume that expressions with a higher argument strength, and thus a higher relevance, have a higher reach in the sense of other users processing them.

Critical Reactions to Attitude-Challenging Views
From cognitive dissonance research, we know that dissonant cognitions create an urge to seek consonance, e.g. through increased civil or political engagement (Kim and Chen 2016). One possible way is the removal of a dissonant cognition (Festinger 1957). Applied to user discussions, this can be understood as expressing disagreement with an opinion-challenging comment. This way, users aim to correct the perceived wrong, take weight from the other side, and convince others of the "right" opinion (Jeong et al. 2019). According to the original theory, people have other options, such as adding a consonant cognition or evading the discussion environment (Festinger 1957). These reactions can also be expected in online communication environments (Lu 2019). However, research found that correcting perceived wrongs, balancing discussions or expressing disagreement with opposing views are important motivations for writing comments (Diakopoulos and Naaman 2011). This can be observed especially for users who are highly involved in and opinionated about a topic (Kwak et al. 2020;Lu 2019). Also, active argumentation against attitude-challenging comments can be expected from people with strong pre-existing beliefs that are cognitively motivated to engage in the discussion. As Post (2019) argues, partisans actively search for attitude-challenging content because they want to know what the other side is thinking. This can be expected especially from people with high issue attention, political activity, informational needs, and informational utility. These characteristics have also been shown to predict online opinion expression (Leung 2009;Springer, Engelmann, and Pfaffinger 2015). Taber and Lodge (2006) found that partisans spend more time on processing attitude-challenging information than attitude-confirming information and actively engage in developing counter-arguments. 
The same study also discovered that participants rated attitude-confirming arguments as stronger than attitude-challenging ones. Heussen et al. (2011) found that negative evidence can have an impact on perceived argument strength. Furthermore, Kappes et al. (2020) showed that a higher argument strength does not make attitude-challenging arguments more likely to change one's opinion. So far, to our knowledge, the links between different degrees of argument strength and the expression of disagreement have not been studied. Therefore, we hypothesize:
H1: The higher the argument strength of a comment, the more likely it will receive disagreement.
Dealing with contradiction leads to a better knowledge of the arguments of the other side (Cappella, Price, and Nir 2002). By dealing with the arguments of the other side more intensively, one must also respond to such statements convincingly if one decides to contradict them. The theory of cognitive dissonance assumes that the strength of dissonance influences how active one becomes or how much resistance one offers to a dissonant cognition (Festinger 1957). So far, this has not been considered in terms of the strength of newly added cognitions. However, it can be assumed that well-reasoned arguments must also be answered with a well-reasoned disagreement to counter the stronger dissonance or persuasion. Assertions or subjective feelings can be rejected more easily than verifiable evidence. Passing on knowledge, persuading others, and opposing attitude-challenging opinions are strong motivations for the public expression of opinions in online discussions (Diakopoulos and Naaman 2011). Taber and Lodge (2006) found that cognitive dissonance leads to motivated reasoning, which means that partisans spend more time on processing attitude-challenging than on attitude-confirming information, because they want to elaborate on an argumentation against these views. A dissonant cognition with a higher persuasiveness (e.g. a cited source) increases the likelihood that people will not only disagree but present their own argument against it to further weaken its potential influence (Rojas 2010). At the same time, being exposed to attitude-challenging arguments also increases one's awareness and tolerance for the arguments of the other side (Valenzuela, Park, and Kee 2009). We assume that the presence of a justification of counter-attitudinal information makes it more challenging (van der Wurff, Swert, and Lecheler 2018). As Kuhn, Shaw, and Felton (1997) showed in an experimental setting, dyadic interaction with other discussants can improve the argumentative sophistication. 
So far, this relationship has not been studied in the context of online discussions. Due to the user characteristics described above, we assume that these effects can also take place here:
H2: The higher the argument strength of a comment, the more likely it will receive reasoned disagreement.

Conditions of Rational-Critical Discourse Set by News Organizations
Previous research has studied the influence exercised by platforms hosting online discussions (e.g. Aragón, Gómez, and Kaltenbrunner 2017; Ksiazek 2015; Peacock, Scacco, and Stroud 2019). As each news organization chooses a different design and sets its own norms, different outcomes regarding the amount of rational-critical discourse can be expected on different websites. Considering previous literature on this topic, we identify differences regarding (1) the discussion norms set by the website and (2) platform-specific technical criteria.
Concerning the discussion norms, both the expression of well-reasoned arguments and of disagreement can be promoted to different degrees. First of all, the policies published by the news organization can name criteria for the desired argument quality and ask users to conduct a controversial debate (Ksiazek 2015). Violations of the rules can be a motive for banning comments from publication (Ksiazek 2015), carried out by different moderation styles. In some cases, comments are only published if a moderator approves them for publication, in other cases, norm-violating comments are deleted after publication. This post-moderation approach can be conducted visibly or invisibly (Wright and Street 2007). The visible way implies that deleted comments continue to be displayed in the discussion thread, with their content removed. The moderators can add a reason for the deletion in order to exercise the compliance of this norm visibly (Yeo et al. 2019). The norms that news organizations want to enforce in the discussion can thus be visible to users to different degrees when they decide on writing a comment.
Technical features can also promote the visibility of arguments and disagreement (Peacock, Scacco, and Stroud 2019). The expression of disagreement can be enhanced by promoting replies in general through a reply function (Peacock, Scacco, and Stroud 2019). When offered a feature that allows them to reply directly to another user, users can be expected to feel more entitled to express disagreement than if they have to actively mention another user. Aragón, Gómez, and Kaltenbrunner (2017) studied the differences between a linear and a hierarchical thread visualization. They found that a hierarchical view increased the level of replies between users and the size of subthreads, i.e. sequences of comments created by replying to previous comments. As reply comments become more visible in the context of previous comments in a hierarchical view, we can also assume that arguments become more understandable, and therefore the intention to engage with other comments increases.
When users read a dissonant comment and have to decide whether to write a disagreeing reply, the social norms and technical features can influence this decision. For example, remembering that well-reasoned arguments are desired by the moderators can lead them to think of a reason for their disagreement. Therefore, we assume that social norms promoting rational-critical discourse and technical features that promote (reasoned) disagreement will influence the perception of the argument strength and the following (reasoned) disagreement:
H3: The relationship between argument strength and (reasoned) disagreement is moderated by discussion norms and technical features provided by the website.

Sample
The analysis is based on a manual relational content analysis of 14,690 user comments. The media sample consisted of nine German online media outlets that were among the top 100 journalistic German news sites regarding online reach (IVW 2017), offered a comment section that was actively being used, and published at least three articles on the first topic (see below). They vary regarding the norms they mention in their commenting policies, whether they apply visible moderation of comments, and regarding the presence of a reply function and a thread visualization (see Table 1). Three media outlets can be identified that fulfil almost all of the discussion norms and technical features discussed above: zeit.de, welt.de, and sueddeutsche.de. Three outlets provide all of the technical features, but only demand one of the discussion norms, "provide sources/quotations": tagesspiegel.de, huffington-post.de, and taz.de. The other three outlets do not completely fulfil either area: rp-online.de, focus.de, and spiegel.de. All selected media can be classified as quality media that report on a broad range of topics.
To study the expression of disagreement, we included journalistic articles on two highly controversial issues: the debate about introducing an upper limit for the number of refugees to be taken into Germany (November 2015-November 2016) and the debate about reforms in the German pension system (November 2016-September 2017, pre-election period). Articles on the first issue were selected from days on which at least three of the news media published articles. For the second issue, all published articles were collected. All user comments from these articles were collected, resulting in 174 news articles and 14,690 comments. This allowed us to capture all interactions between users, as demanded in previous studies (Jensen 2003; Wright and Street 2007). Only a few discussions were actively closed by the moderators. Therefore, the comments were saved several weeks after the publication of the articles to ensure that the discussions had stopped. Sequences of one initial comment and the following reply comments were randomly assigned to six coders, who were trained for several months and regularly supervised during the coding procedure. Reliability was measured using pairwise agreement following Holsti (PA) and Krippendorff's alpha (K-a). The reliability scores can be found in the following section. A total of 222 comments were excluded because they had been deleted or shortened by the moderators, resulting in 14,468 comments for further analysis.
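The two reliability coefficients named above can be illustrated with a minimal Python sketch for a nominal variable. The function names and the toy codings are hypothetical and not taken from the study, and the alpha computation assumes exactly two coders and no missing values, which is simpler than the six-coder setup described here. The toy data also illustrate a general property of chance-corrected coefficients: at identical raw agreement, alpha is lower when one category is rare.

```python
from collections import Counter
from itertools import permutations


def pairwise_agreement(coder_a, coder_b):
    """Share of units on which both coders assigned the same code (Holsti-style)."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for nominal data, two coders, no missing values."""
    # Coincidence matrix: each unit contributes both ordered value pairs.
    coincidences = Counter()
    for a, b in zip(coder_a, coder_b):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1
    # Marginal totals per category and total number of pairable values.
    totals = Counter()
    for (c, _k), count in coincidences.items():
        totals[c] += count
    n = sum(totals.values())
    observed = sum(count for (c, k), count in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k] for c, k in permutations(totals, 2))
    if expected == 0:
        return 1.0  # convention used in this sketch when only one category occurs
    return 1 - (n - 1) * observed / expected


def make_codes(both_no, both_yes, only_a, only_b):
    """Build two hypothetical coders' binary codings from a 2x2 agreement table."""
    a = [0] * both_no + [1] * both_yes + [1] * only_a + [0] * only_b
    b = [0] * both_no + [1] * both_yes + [0] * only_a + [1] * only_b
    return a, b


# Hypothetical data: identical raw agreement, different category distributions.
skewed_a, skewed_b = make_codes(both_no=88, both_yes=4, only_a=4, only_b=4)
balanced_a, balanced_b = make_codes(both_no=46, both_yes=46, only_a=4, only_b=4)

pa_skewed = pairwise_agreement(skewed_a, skewed_b)
pa_balanced = pairwise_agreement(balanced_a, balanced_b)
alpha_skewed = krippendorff_alpha_nominal(skewed_a, skewed_b)
alpha_balanced = krippendorff_alpha_nominal(balanced_a, balanced_b)
```

In this toy setup both data sets have the same pairwise agreement, while alpha is clearly lower for the skewed codings than for the balanced ones.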

Measures
To code for the rationality of a comment, in a first step, it was assessed whether it contained a justification (PA = .81; K-a = .62). A justification was understood as the provision of some kind of evidence for a claim. When a justification was identified, in a second step, we classified whether it constituted verifiable information, e.g. external sources, or non-verifiable expressions, such as personal reasons, generalizations or hypothetical claims (Jensen 2003; Stromer-Galley 2007).
To calculate whether a comment received (reasoned) disagreement, first, the number of replies each comment received was aggregated. Second, it was identified how many of the answers were agreeing, disagreeing, or neutral (PA = .90; K-a = .77). On this basis, we calculated the number of disagreeing replies a comment received. Third, we determined whether the disagreeing answers contained comments presenting a justification to support their claim, based on the operationalization described above. Thus, it was calculated how many replies with justified disagreement a comment received. Table 2 shows the descriptive statistics of the main variables for analysis. The number of disagreeing replies per comment ranged from 0 to 13 (M = .34, SD = .70) and was non-normally distributed, with a skewness of 3.82 (SE = .02) and a kurtosis of 31.02 (SE = .04). The number of replies with justified disagreement per comment ranged from 0 to 9 (M = .26, SD = .59). It also deviated from the normal distribution, with a skewness of 3.66 (SE = .02) and a kurtosis of 26.72 (SE = .04). Therefore, we dichotomized both variables for further analysis into "disagreement received (no/yes)" and "justified disagreement received (no/yes)".
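The aggregation and dichotomization steps described above can be sketched in Python as follows. This is only an illustration under assumed data structures: the field names (`parent_id`, `stance`, `has_justification`), the comment identifiers, and the toy replies are hypothetical and do not reflect the study's actual coding files.

```python
from collections import Counter

# Hypothetical coded replies; each reply points to the comment it answers.
replies = [
    {"parent_id": "c1", "stance": "disagree", "has_justification": True},
    {"parent_id": "c1", "stance": "disagree", "has_justification": False},
    {"parent_id": "c1", "stance": "agree", "has_justification": True},
    {"parent_id": "c2", "stance": "neutral", "has_justification": False},
]
comment_ids = ["c1", "c2", "c3"]  # c3 received no replies at all


def count_disagreement(coded_replies):
    """Count disagreeing and justified disagreeing replies per parent comment."""
    disagreeing = Counter()
    justified = Counter()
    for reply in coded_replies:
        if reply["stance"] != "disagree":
            continue
        disagreeing[reply["parent_id"]] += 1
        if reply["has_justification"]:
            justified[reply["parent_id"]] += 1
    return disagreeing, justified


def dichotomize(counts, ids):
    """Turn skewed count variables into 0/1 indicators (no/yes)."""
    return {cid: int(counts[cid] > 0) for cid in ids}


disagreeing, justified = count_disagreement(replies)
disagreement_received = dichotomize(disagreeing, comment_ids)
justified_disagreement_received = dichotomize(justified, comment_ids)
```

Dichotomizing discards how many disagreeing replies a comment received, which is the trade-off accepted here in exchange for outcome variables that suit a logistic model.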
The platforms in our sample were grouped into three types: type A with high levels in both discussion norms and technical features, type B with low levels in discussion norms and high levels in technical features, and type C with low levels in both discussion norms and technical features. Within each medium type, there are variations regarding underlying print outlets, reach, and editorial lines (see Table 1).
Several control variables were taken into account. The issue, divided into discussions on a limit for refugees and on the pension reform, was considered because both issues may have specific characteristics, based on morality, affectedness, and controversy, that might lead to different forms of expression (Gearhart and Zhang 2018). Due to the thread structure in most of the platforms analysed, we also included a variable to differentiate between first-level (i.e. comments not replying to another user) and reply comments (i.e. comments replying to another user). This differentiation did not depend on the presence of a reply button, i.e. replies could also be created by addressing other users (e.g. by using the @ indicator). As replies are sometimes hidden in the default view, first-level comments are more visible and might therefore have a higher probability of receiving replies in general. Additionally, we controlled for a basic criterion from deliberation theory, respect for other participants' expressions (Monnoyer-Smith and Wojcik 2012). Chen and Lu (2017) argue that missing respect can constitute a threat to the recipient's identity when combined with a dissonant cognition, resulting in a higher willingness to reply and to take corrective actions. Therefore, missing respect was coded in the form of expressions that contained degradations of other people's expressions (Graham 2009; Steenbergen et al. 2003). Missing respect was detected in 13% of the comments (PA = .92; K-a = .57). We also coded for negative emotions because of the increased acknowledgment of emotional speech in deliberation research. Moreover, previous results indicate that negative emotions evoke replies (Oegema, Wang, and Kleinnijenhuis 2010) and go together with disagreement (Barnidge 2018). Negative emotions were coded when anger, disappointment, negative surprise, or fear were expressed. These emotions were identified either through the use of emotional expressions (e.g. "I hate taxes", "This is really disappointing") or through highly negatively connoted language (e.g. "Your comments are disgusting!" or the use of proverbs) (Winkler 2005). Negative emotions were found in 15% of the comments (PA = .91; K-a = .65). The pairwise agreement of all variables exceeds the critical value of .70 (Frey, Botan, and Kreps 2000). The Krippendorff's alpha values of missing respect and negative emotions were lower due to the highly skewed distribution of the nominal variables because both indicators could only be found in a minority of the comments. Therefore, here, the pairwise agreement can be considered a more suitable coefficient (Feng 2014).

Results
Our analysis was conducted with two logistic regression models, one for the likelihood of receiving disagreeing replies and one for the likelihood of receiving replies with justified disagreement. The results can be found in Table 3. Both models differ significantly from the null model (-2 Log-Likelihood = 16116.68/13627.83; Chi² = 1563.70***/1109.73***). The model for the likelihood of receiving disagreement explains 15% of the variance, the model for justified disagreement 12%.
In H1, we assumed that the higher the argument strength of an initial comment, the more likely it will receive disagreement. To differentiate levels of argument strength, two dummy variables for non-verifiable justification, understood as medium argument strength, and verifiable justification, understood as high argument strength, were included in the analysis. Compared to comments without a justification, non-verifiable justifications increase the likelihood of receiving disagreement by 44% (Exp(B) = 1.44, p < .001). The highest argument strength is attributed to comments with verifiable justifications. In comparison with unjustified comments, verifiable justifications increase the likelihood of receiving disagreement by 83% (Exp(B) = 1.83, p < .001). Thus, verifiable justifications have a higher likelihood of receiving disagreement than non-verifiable ones, which leads us to confirm H1.
H2 stated that the higher the argument strength of an initial comment, the more likely it will receive reasoned disagreement. Reasoned disagreement was measured with disagreeing responses that included a justification. Comments with non-verifiable justifications have a 54% higher likelihood of receiving justified disagreement than non-justified comments (Exp(B) = 1.54, p < .001). Verifiable justifications increase the likelihood of receiving justified disagreement by 102% (Exp(B) = 2.02, p < .001). Therefore, H2 can be confirmed.
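Exp(B) values from a logistic regression are odds ratios, so the percentages reported for H1 and H2 describe changes in the odds of receiving (justified) disagreement. The following Python sketch shows the conversion and, as an illustration only, what such an odds ratio would imply for a probability. The baseline probability of 0.20 and all function names are hypothetical assumptions and not values from the study.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio Exp(B)."""
    return (odds_ratio - 1.0) * 100.0


def probability_after(baseline_probability, odds_ratio):
    """Probability obtained after multiplying the baseline odds by the odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios as reported in the text (reference: comments without justification).
reported_odds_ratios = {
    "disagreement, non-verifiable": 1.44,
    "disagreement, verifiable": 1.83,
    "justified disagreement, non-verifiable": 1.54,
    "justified disagreement, verifiable": 2.02,
}

HYPOTHETICAL_BASELINE = 0.20  # assumed baseline probability, for illustration only

for label, odds_ratio in reported_odds_ratios.items():
    coefficient = math.log(odds_ratio)  # B on the logit scale
    change = percent_change_in_odds(odds_ratio)
    shifted = probability_after(HYPOTHETICAL_BASELINE, odds_ratio)
    print(f"{label}: B={coefficient:.3f}, odds {change:+.0f}%, "
          f"p {HYPOTHETICAL_BASELINE:.2f} -> {shifted:.3f}")
```

Because the step from odds to probability depends on the baseline, the same odds ratio corresponds to different absolute changes in probability for different starting points.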
H3 claimed that discussion norms and technical features moderate the relationship between argument strength and received (reasoned) disagreement. To this end, we included two medium types in our models, type A representing high levels in discussion norms and technical features and type B low levels in discussion norms, but high levels in technical features. No moderating effect could be detected in either model, leading us to reject H3. However, both platform types have a direct effect on received disagreement (type A: Exp(B) = 2.47, p < .001; type B: Exp(B) = 2.07, p < .001) and received justified disagreement (type A: Exp(B) = 1.75, p < .001; type B: Exp(B) = 1.36, p < .01). Comments on platforms with high levels of discussion norms and technical features have the highest likelihood of receiving (reasoned) disagreement. Comments on platforms with low levels of discussion norms but high levels of technical features (type B) also have a higher likelihood of receiving (reasoned) disagreement than comments on platforms with low levels of discussion norms and technical features. However, the effect of the argument strength on both dependent variables is independent of the platform type.
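What the absence of a moderating effect implies can be sketched in Python. In a logistic model without interaction terms, the odds ratios of two predictors simply multiply. The sketch below combines the reported Exp(B) values for received disagreement under the simplifying assumptions that the interaction terms are exactly zero and that all other predictors are held constant. The baseline probability of .10 is purely hypothetical and serves only to translate odds into probabilities; the function names are illustrative.

```python
def combined_odds_ratio(*odds_ratios):
    """Multiply odds ratios, which is valid only without interaction terms."""
    product = 1.0
    for ratio in odds_ratios:
        product *= ratio
    return product


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Shift a baseline probability by an odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Reported Exp(B) values for received disagreement.
TYPE_A = 2.47      # high discussion norms, high technical features
TYPE_B = 2.07      # low discussion norms, high technical features
VERIFIABLE = 1.83  # verifiable justification vs. no justification

# Hypothetical baseline: an unjustified comment on a platform with low
# levels of both conditions (assumed value, not reported in the study).
HYPOTHETICAL_BASELINE = 0.10

for label, platform in {"type A": TYPE_A, "type B": TYPE_B}.items():
    ratio = combined_odds_ratio(platform, VERIFIABLE)
    probability = apply_odds_ratio(HYPOTHETICAL_BASELINE, ratio)
    print(
        f"{label} + verifiable justification: "
        f"odds ratio = {ratio:.2f}, probability = {probability:.2f}"
    )
```

Under these assumptions, a comment with a verifiable justification on a type A platform would have roughly 4.5 times the odds of receiving disagreement of an unjustified comment on a platform with low levels of both conditions. A significant interaction term would have added a further multiplicative factor to this product.
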

Discussion
In this study, we asked whether rational-critical discourse in user discussions can be enhanced by preceding comments and conditions set by news organizations. Our results show that comments containing a justification are more likely to receive replies expressing disagreement, and reasoned disagreement in particular, than comments without a justification. In both cases, the likelihood is higher for comments containing a verifiable justification than for those with a non-verifiable justification. Furthermore, we found that comments on platforms that promote well-reasoned arguments and reactions between users have a higher likelihood of receiving (reasoned) disagreement, independent of their argument strength. Important factors are the provision of technical features, such as a reply function and a thread visualization, and the definition and enforcement of discussion norms. The effects of argument strength can be explained by the cognitive dissonance that strong arguments evoke and by their assumed persuasive effect on other commenters, both of which urge users to correct the perceived wrong. Our study thus offers support for applying assumptions derived from cognitive dissonance (Festinger 1957) to the content of individual comments in user discussions (Chen and Lu 2017;Kwak et al. 2020). It thereby advances our understanding of why and how users express disagreement. It also adds weight to findings on cognitive and correcting motivations for participating in news discussions (Leung 2009;Springer, Engelmann, and Pfaffinger 2015) and provides a conceptualization of argument strength that can be further examined in future studies.
The differences between the platform types in our sample can be explained by two different factors: discussion norms, enhanced by guidelines and moderation practices, and technical features offered by the platforms (Reich 2011). The more explicit the rules for writing comments, and the more transparently they are enforced, the higher the overall level of critical reflection and reasoned disagreement. This contributes to the literature on the relationship between journalistic moderation and discussion quality, which has found that different moderation styles can promote discussion norms (Ksiazek 2015;Ruiz et al. 2011;Wright and Street 2007). Following our results, moderation can also enhance critical reflection and controversial discussions between users of different opinions. Moreover, platforms can promote reactions between users by implementing a reply function and a hierarchical discussion view that improve the visibility of argumentative structures (Aragón, Gómez, and Kaltenbrunner 2017). These features enable users to read other users' comments and directly reply to them. From previous research, we know that participants and recipients of discussions in which rational-critical arguments are exchanged will increase their knowledge about different arguments and facts, strengthen their own opinion, and participate more often (Shah et al. 2015;Zerback and Fawzi 2017). Thus, news organizations that implement higher levels of discussion norms and technical features can expect a livelier public discourse and less fragmented conversations (Shah et al. 2015). Increasing the visibility and the enforcement of discussion norms could further promote this goal, which could suit both the public value created by the discussions and the strategic interests of the news brands. Our results add evidence to the literature on the relationship between online journalism and its audience.
From a normative perspective, better argumentation improves the overall quality of the discussions and can give journalists a better picture of what their audiences think about their work (Manosevitch and Tenenboim 2017;Reich 2011). Furthermore, it adds a new perspective on how journalists can identify valuable comments as sources for their work (Manosevitch and Tenenboim 2017;Reich 2011). Our results also indicate that some users are interested in a rational-critical exchange of arguments, which provides new insights about the audience's motivations to engage in the discussion (Springer, Engelmann, and Pfaffinger 2015). From a strategic perspective, good discussions can be valuable because they can improve the sense of community and retain users on the site (Manosevitch and Tenenboim 2017).
Our results also fit the normative expectations from deliberation theory that the expression of arguments will lead to a rational-critical debate (Bächtiger et al. 2009;Mutz 2006). Thus, they contribute to the literature on deliberation, showing that efforts to adhere to deliberative standards can have positive effects. This study is thus a step towards a better understanding of the relations between different deliberative criteria (Mutz 2008). The theoretical framework offers additional explanations about the internal and external processes of deliberation (Landemore and Mercier 2012). Our results also add evidence to the scientific debate on designing deliberative spaces, which has shown that different discussion designs can lead to different levels of deliberative criteria (Aragón, Gómez, and Kaltenbrunner 2017; Peacock, Scacco, and Stroud 2019). We conclude that critical reflection and controversial debate can be enhanced by technical features. However, which overall effect these deliberative interactions have is still an open question. As Mutz (2006) argues, discussions with diverging opinions do not necessarily lead to outcomes expected by deliberation theory. Not all participants necessarily share deliberative values, and non-desirable forms of discussion (e.g. disrespect, anger) can turn the discussion towards destructiveness or bitterness. Previous research has identified discussions with missing respect and argumentation (Dori-Hacohen and Shavit 2013; Strandberg and Berg 2013). Research following up on the questions addressed here should not ask whether discussions are deliberative, but under which conditions people engage rationally and critically in discussions and how rationality and critical reflection are related to other deliberative criteria (Mutz 2008). Future research should also examine whether the same effects can be found in different discussion contexts such as social media platforms.
As Barnidge (2017) argues, social media platforms promote the visibility of opposing arguments because they encourage different types of relationships and controversial discussion styles. At the same time, comments on social media platforms show lower levels of rationality and critical reflection (Esau, Friess, and Eilders 2017;Freelon 2015), which limits the probability of (reasoned) disagreement and could lead to more homogeneous discussion environments. Research should also include technical features and analyse if discussion norms are established and enforced. Finally, the conditions for (reasoned) disagreement cannot be generalized for different discussion contexts but depend on the specific circumstances.

Limitations
The results presented in this study are based on a large sample of user comment threads that were coded entirely and analysed on the level of reactions between users. However, there are some limitations to our results. First, due to the methodological approach chosen for this study, we only observed the comments that had been written, but do not know how many or which comments users have read before writing a comment. Thus, we do not know how many users decided against writing disagreeing replies to the comments. Second, the psychological effects discussed here could not be tested directly, because a content analysis of actual discussions does not measure them. Whether cognitive dissonance, presumed reach, and need for cognition lead to disagreeing answers should be tested in experimental settings. To this end, user comments can be manipulated across different argument strengths and opinions. Measures of cognitive dissonance (Jeong et al. 2019), perceived argument strength (Taber and Lodge 2006), perceived relevance (Lu 2019), and presumed influence of counter-attitudinal comments (Gunther and Liebhart 2006) can be used to compare the relative explanatory power of the different approaches discussed in our study. Third, the validity of our models has some limitations, because several of the examined characteristics rarely appear in the same comment. Thus, the distributions of several independent variables are skewed with an overrepresentation of zeroes. This problem cannot be solved in a non-experimental approach such as content analysis. Lastly, our sample is based on two specific issues. Although both issues are well-suited for studying rational-critical discourse due to their highly controversial nature, they might not be representative of all issues that are discussed in the media. Differences regarding the levels of rationality or controversy of the discussions can be expected compared to other issues.
While the first discussion had a high potential for emotional language, the second was marked by a very detailed discussion of different macroeconomic models, but also of values of resource distribution. Future research could explore the differences between topics by taking into account the distinction between theoretical and practical discourses and the associated truth and rightness claims from deliberation theory (Wessler 2018).

Conclusion
This study made several contributions to research on user discussions. We combined normative assumptions about argument quality with empirically verifiable assumptions from previous research to explain disagreeing reactions comments receive in a discussion. We accounted for the reply structure in a discussion and showed how it can be analysed in an empirical model. Users that are interested in a rational-critical exchange of arguments can promote this norm by presenting well-reasoned arguments that other users can reply to. Our results also have implications for the practice of offering and moderating user comments. First of all, news organizations can decide upon specific criteria for desirable discourse and publish them in their guidelines. Also, displaying deleted comments can remind users of the rules and demonstrate that they are enforced. Therefore, reasons why comments are deleted should also be provided to help users understand which discussion styles are expected. Moreover, hierarchical discussion views and a reply function can be important tools to make the exchange of arguments more visible and encourage replies to other users. Thus, our results add new insights to the research about the audience's motivations and behaviour and the relationships between online journalism, its audiences, and public discourse in general. The discussions on news websites are controversial and some users are motivated to engage in a rational-critical exchange of arguments, which can be further encouraged by users and news organizations. This way, user discussions could become a space of improved mutual understanding and knowledge about arguments and facts.

Note
1. To be classified as a thread visualization, all reply comments need to be visualizable (by default or by extending the thread) on the main page and beneath their original comments. On spiegel.de, reply comments were sorted chronologically, so that they did not appear next to their initial comments. On focus.de, if a comment received more than one reply, in order to read all reply comments, a new page had to be opened.