Discussion of Climate Change on Reddit: Polarized Discourse or Deliberative Debate?

ABSTRACT Studies of climate discourse on social media platforms often find evidence of polarization, echo chambers, and misinformation. However, the literature’s overwhelming reliance on Twitter makes it difficult to understand whether these phenomena generalize across other social media platforms. Here we present the first study to examine climate change discourse on Reddit, a popular – yet understudied – locus for climate debate. This contributes to the literature through expansion of the empirical base for the study of online communication about climate change beyond Twitter. Additionally, platform architecture of Reddit differs from many social media platforms in several ways which might impact the quality of the climate debate. We investigate this through topic modeling, community detection, and analysis of sources of information on a large corpus of Reddit data from 2017. Evidence of polarization is found through the topics discussed and sources of information shared. Yet, while some communities are dominated by particular ideological viewpoints, others are more suggestive of deliberative debate. We find little evidence for the presence of polarized echo chambers in the network structure on Reddit. These findings challenge our understanding of social media discourse around climate change and suggest that platform architecture plays a key role in shaping climate debate online.


Introduction
Whilst the scientific community has reached a near-unanimous consensus on anthropogenic climate change (Anderegg et al., 2010), it remains a polarized issue for some (Harvey et al., 2018; van Eck et al., 2020), underpinned by two opposing logics of "skeptics" and "the convinced" (Hoffman, 2011). With the rise of the "network society," defined by Castells (2004, p. 3) as "a society whose social structure is made up of networks powered by micro-electronics-based information and communications technologies," online discourse, particularly on social media, is of particular interest. Schäfer (2012) highlighted two opposing camps on the impact of the rise of online and social media. "Cyber-optimists" perceive multiple benefits, such as improved communication of science, improved societal communication, and enhanced user engagement and understanding. "Cyber-pessimists" believe that online and social media may cause echo chambers (homogeneous clusters of like-minded users where information, and misinformation, "echoes" round the group), polarization (where views diverge and two or more opposing communities form), and a rise in users falling prey to misinformation. In contrast, others have suggested that online discussion forums may encourage deliberative democracy: providing opportunities for the civil exchange of views, the presentation and consideration of possible alternatives, and arrival at an informed opinion (Collins & Nerlich, 2015; Wright & Street, 2007).
Reddit, the self-proclaimed "front page of the internet," is a relatively little-studied social media site but is more visited than Twitter globally and in both the UK and the US (Alexa.com, 2021). The research value of Reddit as a data source arises from two key aspects. First, it offers an expansion of the empirical base for the study of online communication about climate change: a 2019 systematic and critical review of the literature on social media and climate change found a substantial bias toward Twitter studies (Pearce et al., 2019). Second, the platform architecture of Reddit differs from many social media platforms (including the lack of user profiles, length of posts, theme-based rather than follower-based information flows, and community moderation), all of which might impact the quality of the climate debate (see Freelon, 2015).
Here we present the first analysis of climate change discourse on Reddit, giving a broad overview of its structure and content. We do this by investigating the community structure formed by the interactions of users, considering the topics of discussion, and investigating the sources of information referenced by users within their discussions. We also look at the crossover between these features of the Reddit climate discourse, by examining how topics and information sources vary between different user communities. Previous studies have highlighted that a deliberative democracy relies on a broadly informed public and a healthy ecosystem of competing ideas, and that polarization and echo chambers are therefore a threat to democracy (Sunstein, 2001). Similarly, a skeptical discourse and misinformation have likely confused the climate change discourse, increased existing political polarization, led to political inaction, and stalled support for or led to the rejection of mitigation policies (Treen et al., 2020). Throughout the paper, we are therefore particularly interested in whether the phenomena found in climate discourse on other social media platforms, such as polarization, echo chambers, misinformation, and skeptical discourse, are also present on Reddit, or whether Reddit is more reflective of deliberative democracy in action. This study is intended to be a first step towards investigating climate change discourse on Reddit, identifying areas of interest upon which others may build in future research.

Literature review
A growing body of scholarship examines online discussion about climate change across a range of digital platforms, including major social media sites (Pearce et al., 2019), blogs (Elgesem et al., 2015; Sharman, 2014), and user comments on climate-related news articles (de Kraker et al., 2014). Regardless of the platform, scholars often find evidence of polarization (van Eck et al., 2020), echo chambers (Walter et al., 2018), and misinformation, often linked with climate skepticism, denial, or alarmism (Treen et al., 2020). The presence of these characteristics in climate discourse has been found to have negative impacts such as political inaction (Brulle, 2014) and stalled support for or rejection of mitigation policies (Cook et al., 2018), and could even be considered a threat to a democratic society (Finkel et al., 2020). This section first outlines the empirical literature on social media and climate change, and then describes why Reddit is an important case in the study of online climate communication.

Climate change on social media
Social media sites are interactive sites where content is created and shared by and with virtual communities and networks of individuals. Harvey et al. (2018, p. 281) claim that Facebook, Twitter, and other social media outlets "provide powerful voices in the battle for public opinion" about climate change. Facebook has been accused of giving "implicit support" to climate change deniers (Stampler, 2019), and of hypocritically launching a new "Climate Change Center" to provide accurate scientific information whilst not taking down climate misinformation (Durkee, 2020). In late 2020 Twitter was prominent in the mainstream news for flagging several tweets by then US President Donald Trump as being disinformation (Dawson, 2020).
Despite this, there are few empirical studies investigating climate change polarization, skepticism, or misinformation on social media, and most of these are on Twitter. Williams et al. (2015) use network analysis to study Twitter users communicating about climate change and find that discussions about climate change on Twitter often occur in polarizing echo chambers, divided into "sceptics" and "activists," although they did find limited evidence of some networks where these two groups interact. Samantray and Pin (2019) investigate the effect of homophily on the level of polarization on Twitter and surprisingly find that the evolution of homophily over time negatively affects the evolution of polarization. Two studies use frame analysis to investigate climate change discussions on Twitter, both finding that the frames vary based on the context. Jang and Hart (2015) specifically look at frames that promote skepticism about climate change, and find they are prevalent in specific regional and political contexts, whilst Roxburgh et al. (2019) focus on three extreme weather events and find that the framing varies by event, with the characteristics of the event and the socio-political context at the time of the event being key factors in this. Jacques and Knox (2016) perform a summative content analysis of climate change denial discourses on Twitter to understand why individuals reject climate science. They find three major discourses related to climate science being a conspiracy, each of which focused on climate politics rather than science and reflected a belief that climate science is a "wholesale fraud." One study uses Facebook data: Bloomfield and Tillery (2019) select two prominent climate denial groups on Facebook to analyse how climate change denial circulates online. They find that these Facebook pages act as echo chambers, with misinformation circulating.

Reddit as a locus for climate debate
We investigate Reddit because its platform architecture differs from that of platforms such as Twitter in four distinct ways, and we hypothesize how each may impact the quality of climate discussion (following Freelon, 2015). First, Reddit has an extremely generous character limit of 40,000 (Reddit.com, 2015), considerably greater than Twitter's 280 characters: greater post length may encourage more deliberative conversation. Second, Reddit is structured around themed topics called subreddits rather than a follower-based social network, and third, user information on Reddit is limited to the user handle only (i.e. unlike Twitter, there is no user profile where details such as location, interests, political views or affiliation may be found). Both of these factors may lead to more deliberative, topic-led discussion rather than values-led discussion. Last, user moderation is a key part of the Reddit platform, unlike Twitter. Freelon (2015) finds that platform design features are a strong predictor, along with users' left/right issue stance, of the nature of the discussion that takes place and whether it reflects a deliberative democracy. The findings of previous studies of Reddit show mixed evidence as to whether Reddit's platform design makes it more or less susceptible to echo chambers, misinformation, and polarization. One analysis of political discourse on Reddit finds evidence of incivility, indicative of polarized discourse, links to controversial media outlets associated with misinformation, and evidence of negative partisanship (Nithyanand et al., 2017). Another study identified Trump and Clinton supporters around the 2016 US presidential election and found more cross-cutting political interactions than within-group interactions, suggesting the absence of echo chambers (De Francisci Morales et al., 2021).
An absence of echo chambers does not necessarily mean there is deliberative debate; however, echo chambers can undermine or inhibit deliberative democracy, and cross-cutting interactions do at least mean there is an open channel for information flow (Williams et al., 2015).

Overview
Since the Reddit dataset used here is large and complex, our research methodology focuses on computational techniques for the extraction and analysis of data. We use network analysis to find community structures based on many pairwise interactions between users; natural language processing to identify topics of discussion amongst the huge volume of posts and comments; as well as a variety of statistical techniques to determine the size and significance of observed patterns. The outputs from this computational work are then set in the broader socio-political context.

Data
Our study draws on two publicly available datasets of Reddit posts (BigQuery, 2019a) and comments (BigQuery, 2019b) during the period from 1st April 2017 to 30th June 2017. This was an important period in climate politics, leading up to the announcement of the US withdrawal from the Paris Agreement on 1st June and covering the month following the announcement. While there is an r/climatechange subreddit devoted to the issue, the vast majority of climate-related discussion takes place in other relevant subreddits (e.g. r/science, r/politics, r/environment, and r/worldnews). Therefore a keyword approach was used to identify all posts containing the words "climate change" or "global warming" across all subreddits. Whilst there are limitations to this approach and some relevant content may be missed, it is commonly used in climate communications research (for example, Boykoff & Boykoff, 2004; Painter, 2013). All comments created in response to these posts were also included in the dataset, regardless of whether they included the chosen keywords. This procedure led to a total of 18,558 posts and 267,147 comments from 93,850 users related to the issue of climate change (see Online Appendix A for a more detailed description of our data collection procedures, and Online Appendix B for Exploratory Data Analysis examining temporal trends and the size of posts and comments).
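The selection rule described above can be illustrated with a minimal sketch. It is not the extraction code used for this study (see Online Appendix A for that procedure); the record fields `id`, `title`, `body` and `post_id` are hypothetical, and case-insensitive matching is an assumption.

```python
# Illustrative sketch only: field names and case-insensitive matching are assumptions.
KEYWORDS = ("climate change", "global warming")


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword phrase occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def select_corpus(posts, comments):
    """Keep keyword-matching posts plus every comment made in response to them."""
    kept_posts = [p for p in posts if matches_keywords(p["title"] + " " + p["body"])]
    kept_ids = {p["id"] for p in kept_posts}
    # Comments are kept because of their parent post, not their own wording.
    kept_comments = [c for c in comments if c["post_id"] in kept_ids]
    return kept_posts, kept_comments


posts = [
    {"id": "p1", "title": "Is Global Warming accelerating?", "body": ""},
    {"id": "p2", "title": "Best hiking boots", "body": "Any advice?"},
]
comments = [
    {"id": "c1", "post_id": "p1", "body": "Source?"},
    {"id": "c2", "post_id": "p2", "body": "climate change ruined my trail"},
]
kept_posts, kept_comments = select_corpus(posts, comments)
print([p["id"] for p in kept_posts], [c["id"] for c in kept_comments])
```

In this toy input, comment `c1` is retained even though it contains no keyword, whereas `c2` is dropped because its parent post did not match.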

Detecting communities
Whilst Reddit is structured into subreddits that could be considered communities, it was unknown whether some users carried their discussions across subreddits and whether there were subcommunities within subreddits, so a community detection approach that ignores subreddit boundaries was taken. The first step in understanding the communities involved in the Reddit discussions on climate change was to construct a "reply network" in which two users are connected if one commented on an earlier post/comment by the other. Where the author of the post was showing as "deleted" (i.e. the user account has been deleted), these users were removed from the dataset. The resulting dataset was loaded into the Gephi graph visualization tool, and a ForceAtlas2 layout (Jacomy et al., 2014) was applied. The in-built modularity maximization algorithm was applied to detect community structure (Blondel et al., 2008) (see Online Appendix C.1).
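To make the reply network and the modularity score concrete, the following is a small sketch under stated assumptions. It does not reproduce Gephi's implementation: it only builds an undirected, weighted reply network and evaluates the modularity Q of a partition that is supplied by hand, which is the quantity that modularity maximization seeks to increase. The "[deleted]" marker, the exclusion of self-replies, and the use of edge weights are assumptions for illustration.

```python
from collections import defaultdict


def build_reply_network(replies):
    """Build an undirected weighted edge map from (replier, parent_author) pairs.

    Assumptions: deleted accounts appear as "[deleted]" and are dropped,
    and self-replies are ignored.
    """
    edges = defaultdict(int)
    for replier, parent_author in replies:
        if "[deleted]" in (replier, parent_author) or replier == parent_author:
            continue
        edges[frozenset((replier, parent_author))] += 1
    return dict(edges)


def modularity(edges, community_of):
    """Modularity Q = sum over communities of L_c/m - (d_c/(2m))^2."""
    total_weight = sum(edges.values())  # m
    degree = defaultdict(int)
    internal = defaultdict(int)  # L_c: edge weight inside each community
    for pair, weight in edges.items():
        u, v = tuple(pair)
        degree[u] += weight
        degree[v] += weight
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += weight
    community_degree = defaultdict(int)  # d_c: summed degree per community
    for user, d in degree.items():
        community_degree[community_of[user]] += d
    return sum(
        internal[c] / total_weight - (community_degree[c] / (2 * total_weight)) ** 2
        for c in community_degree
    )


# Two triangles of users joined by a single reply, plus one deleted author.
replies = [("a", "b"), ("b", "c"), ("c", "a"),
           ("d", "e"), ("e", "f"), ("f", "d"),
           ("c", "d"), ("[deleted]", "a")]
edges = build_reply_network(replies)
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, partition)
print(round(q, 3))
```

For this toy network the two-triangle partition gives Q = 5/14, whereas placing all six users in one community gives Q = 0.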

Measuring topics
Following previous scholarship on climate communication (Boussalis & Coan, 2016), we utilize a Latent Dirichlet Allocation (LDA) model (Blei et al., 2003) to identify the topics of discussion in Reddit posts and comments. LDA is a statistical approach to topic detection that identifies topics as groups of words that commonly co-occur in a set of documents. Each document is assumed to contain a distribution of topics (e.g. "climate science," "climate policy," etc.), while each topic is defined by a distribution of words (e.g. the "climate science" topic could include words such as "model" and "data"). Further details on text processing, as well as estimation of and outputs from the LDA model, are given in Online Appendix C.2.
The topics found by LDA are clusters of co-occurring words, which may be linked to thematic topics as commonly understood by people. As such, a two-step process of labeling and interpretation was applied to match the estimated topics to substantively meaningful themes. The 25 topics found by the LDA model were initially labeled based on the high-probability words associated with each of them. A validation exercise was then undertaken whereby the ten documents with the highest representation of each topic were manually examined and the accuracy of the label assessed. Where the documents were duplicated (e.g. a user posted the same text in reply to different posts or comments) or in a foreign language, the document with the next highest representation was additionally considered. In this way, three topics were identified as being "junk," and the topic labels for the remaining 22 topics were refined.
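The validation step lends itself to a short sketch. Assuming the fitted model yields, for each document, a list of topic proportions (here the hypothetical `doc_topic`), the most representative documents for a topic can be selected while skipping duplicated texts. The whitespace and case normalization used to detect duplicates is an assumption, and foreign-language filtering is not attempted.

```python
def top_documents(texts, doc_topic, topic, n=10):
    """Return indices of the n highest-weighted, non-duplicate documents for a topic."""
    ranked = sorted(range(len(texts)), key=lambda i: doc_topic[i][topic], reverse=True)
    seen, chosen = set(), []
    for i in ranked:
        # Treat texts that differ only in case or spacing as duplicates (assumption).
        key = " ".join(texts[i].lower().split())
        if key in seen:
            continue
        seen.add(key)
        chosen.append(i)
        if len(chosen) == n:
            break
    return chosen


texts = ["Models and data", "models   and DATA", "Paris deal", "Tax policy"]
doc_topic = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
selected = top_documents(texts, doc_topic, topic=0, n=2)
print(selected)
```

Here the second text duplicates the first, so the document with the next highest representation of the topic is taken instead.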

Measuring ideological bias in external sources of information
The sources of information (hyperlinked domains in the text) for the full dataset were identified as described in Online Appendix C.3. To investigate whether there is any ideological bias in the sources of information being shared, the top 25 domains being shared for the whole dataset were assessed using two published lists: MediaBiasFactCheck (MBFC), representing political leaning, and "ClimateBiasScore" (CBS), representing environmental leaning (Cann et al., 2021). We apply a score to the MBFC ratings as shown in Figure 1 (see Online Appendix C.3 for definitions of the categories), whilst CBS is a score spread from −1.1 to +2, with zero being neutral and the most negative and positive scores being the most environmentalist and climate skeptic, respectively.

Communities of interest
Community detection on the complete reply network found 801 communities with high modularity of 0.764. Of these, 727 are isolated components, and 74 are communities within the giant component, two of which are only connected by a single interaction to the remainder of the giant component (see Online Appendix C.1). The network diagram for the remaining 72 communities in the giant component (Figure 2) shows a high degree of connectivity between the communities, with 4,055 edges. Previous studies on Twitter (Williams et al., 2015) showed that climate discussion on the platform was characterized by environmentalist and skeptic echo chambers; this pattern is typical of polarization, with opinions at two extremes and little interaction across the divide (see also Adamic & Glance (2005), who found a two-cluster network topology for US political echo chambers). Here we were looking for similar evidence, which would have appeared as clearly segregated, single-opinion communities on either side of the debate. This is notwithstanding the platform architecture of Reddit, in which the existence of subreddits creates a semi-structured interaction landscape; this still allows echo chambers to manifest as either subreddits or cross-subreddit communities in which all participants share the same view. There is little evidence of polarized echo chambers in terms of network structure; here, all the communities are interlinked within a single-cluster giant component. This does not preclude opinion-based polarization in the character of debate revealed by topic modeling and source usage.
For example, a community focused on one politicized aspect of climate change and using only sources from one ideological position might be said to be biased towards one side of the debate; if there are many instances of this, with a bias towards either side of the debate, then the overall debate would be polarized (see below).
Figure 1. The categories of bias assigned to domains by MediaBiasFactCheck (MediaBiasFactCheck.com, 2021), and the score assigned for this study to enable quantitative analysis.
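The distinction between isolated components and the giant component can be sketched with a breadth-first search over an edge list. This is an illustrative sketch rather than the procedure used in Gephi, and the user names are placeholders.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


components = connected_components([("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")])
giant_component = components[0]
print(len(components), sorted(giant_component))
```

The largest component plays the role of the giant component; every other component is isolated from it by construction.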
Analysis was undertaken to identify the relationship between the communities and the subreddits in which community members had authored posts/comments. The results of this analysis are shown in Figure D2 in Online Appendix D, with an extract of this for the six largest communities given in Figure 3. These six largest communities account for more than 30% of the total user population in this dataset (see Figure D1 in Online Appendix D). To keep the manual investigation of communities manageable, while also capturing a significant proportion of the total community, we focus on these six largest communities, clearly visible on the network diagram (Figure 2) in lilac, green, black, blue, orange, and pink.
Figure 2. Network of communities in the giant component. The nodes are the communities, with the size of the node representing the size of the community. The six largest communities which we focus on in this paper are coloured lilac, green, black, blue, orange and pink. All remaining nodes are grey.
It can be seen in Figure 3 that the members of the communities post in a number of subreddits. Often, but not always, there is a dominant subreddit (the analysis in Online Appendix D of the largest 25 communities found nine for which no single subreddit was accountable for more than 50% of the posts/comments). Manual inspection showed that two communities were largely formed around single posts which generated a large number of comments, three were formed around a larger number of posts that tended to come from one or two specific subreddits, and one was largely from a specific subreddit but with a single post within this being a major contributor (Table 1).

Topics of discussion
From the topic modeling, 22 topics of discussion were identified. The topic labels and five representative topic words for each of these are shown in Table 2 below, with the full set of topic words and topic labels given in Online Appendix C.2.

Topic clusters
To help make sense of these topics and to further validate them, the Jensen-Shannon distance between the topics was calculated and plotted in 2-D space. This enabled the topics to be grouped into six "clusters" through a qualitative assessment, supported by k-means clustering (see Methods). For ease of visualization, the area formed by each of the six topic clusters is manually drawn and shaded and given a "cluster label," as shown in Figure 4. The similarity between topics, represented by how close together they appear in Figure 4, and further highlighted by the clustering, appears intuitively consistent. For instance, the "Global Warming, causes and impacts" cluster contains interlinked topics identified as the "Greenhouse effect," "Global warming," the "impact of Global Warming on the sea / ice / rivers," "Veganism," "Energy sources," and "Apocalyptic / post-apocalyptic fiction," for which one imagined apocalypse is that humans have destroyed the Earth through Global Warming. Similarly, the related topics of "Globalisation / Developing Nations" and "Economics" also form a cluster. Topics covering various forms of "Debate" are close together and form a cluster, as do two topics that discuss "US Politics" from two different angles. It is interesting that the "US Politics" topics are also close to a topic about UK-based discussion and UK politics.
Table 2. Full list of topic words and topic labels.
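For readers unfamiliar with the Jensen-Shannon distance, a minimal sketch follows. It treats each topic as a probability distribution over the same vocabulary and uses base-2 logarithms, so that the distance lies between 0 (identical topics) and 1 (topics with no words in common). The logarithm base and the toy distributions are assumptions for illustration, not values taken from the fitted model.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


# Toy word distributions over a four-word vocabulary (hypothetical values).
topic_a = [0.7, 0.2, 0.1, 0.0]
topic_b = [0.1, 0.2, 0.3, 0.4]
print(jensen_shannon_distance(topic_a, topic_a))
print(jensen_shannon_distance([1.0, 0.0], [0.0, 1.0]))
print(jensen_shannon_distance(topic_a, topic_b) == jensen_shannon_distance(topic_b, topic_a))
```

Unlike the Kullback-Leibler divergence, this measure is symmetric and always finite, which makes it suitable as the pairwise distance behind a 2-D projection of topics.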

Dominant topics
An analysis of which topic(s) are dominant for each post/comment found that for the whole dataset, the three most dominant topics are "Incivil debate," "Denial / scepticism," and "US Politics with a heavy focus on Trump and Russia." The frequent words in the "Incivil debate" topic are suggestive of unfriendly language (fucking, bullshit, fuck) and name-calling (idiot, ignorant, asshole, stupid), with these features previously found to characterize discussion between skeptics and the convinced (Elgesem et al., 2015; Koteyko et al., 2013). Impoliteness has also been found to be a "hallmark of homophily" (Andersson, 2021), which can lead to echo chambers (Shin et al., 2017; Del Vicario et al., 2016), which can, in turn, exacerbate opinion polarization (Sunstein, 2007). The frequent words in the "Denial / scepticism" topic include deny, denier, denial, and skeptic. Although the words "alarmism"/"alarmist" do not appear here, or indeed in any topic, the word "catastrophic," which could be associated with an alarmist viewpoint (Risbey, 2008), does appear in this same topic. The third most dominant topic is suggestive of the politicization of climate change, with the names of Hillary Clinton, Donald Trump, and Obama, along with the terms "President," "Office," "Administration," "Leaders," and the words "White" and "House" all being present in this topic.

Topics by community
To understand which topics were most important within each community, all documents (posts/comments) authored by a member of one of the communities of interest were labeled with the community ID, and the dominant topics counted across the posts/comments making up each community. Figure 5 shows the proportion of posts/comments for which each topic is dominant for each of the six communities of interest (columns), with the "background level" (i.e. the proportion of posts/comments for which each topic is dominant across the full text) shown as a line on each chart for comparison. Each of the six communities of interest has a different set of dominant topics.
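The comparison against the background level amounts to a ratio of two proportions, which the following sketch illustrates. It assumes a single dominant topic per document (the topic with the largest weight), which simplifies the analysis described here, and the topic weights are hypothetical.

```python
from collections import Counter


def dominant_topic(topic_weights):
    """Index of the topic with the largest weight in one document."""
    return max(range(len(topic_weights)), key=lambda t: topic_weights[t])


def dominant_share(documents):
    """Proportion of documents for which each topic is dominant."""
    counts = Counter(dominant_topic(weights) for weights in documents)
    return {topic: count / len(documents) for topic, count in counts.items()}


def ratio_to_background(community_docs, all_docs):
    """Community share of each dominant topic divided by its share in the full text."""
    background = dominant_share(all_docs)
    community = dominant_share(community_docs)
    return {topic: share / background[topic] for topic, share in community.items()}


# Hypothetical two-topic weights; the community documents are a subset of all documents.
community_docs = [[0.8, 0.2], [0.9, 0.1], [0.3, 0.7], [0.4, 0.6]]
other_docs = [[0.2, 0.8], [0.1, 0.9], [0.3, 0.7], [0.4, 0.6]]
all_docs = community_docs + other_docs
ratios = ratio_to_background(community_docs, all_docs)
print(ratios)
```

A ratio above 1 indicates that a topic is over-represented in the community relative to the full text; in this toy example topic 0 appears at twice its background level.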
Community 33, based largely in the Politics subreddit, has "Denial / scepticism" as its most dominant topic, with a high proportion of such posts and comments (3.7x background level), and "US and Paris agreement" as the second-ranked topic (3.0x background level). Community 396, largely based on the two subreddits "Unitedkingdom" and "ukpolitics," has the topic "Policy particularly focused on UK politics" as its most dominant topic (3.2x background level). This is perhaps an unsurprising result given the subreddits around which this community is largely formed. Community 126, largely based on a post criticizing Bill Nye's television series, has "media debate" as its most dominant topic (2.9x background level).
Other strong results are seen for community 46, a community largely based around a post on the "IAmA" subreddit by Bill Nye, where "Scientific consensus" and "Personal reflection" are both dominant (both 1.8x background level). In community 54, largely based around the "Politics" subreddit and, in particular, a post titled "Vice President Mike Pence says climate change is just an issue for the left," the "Identity-driven debate" topic was dominant (1.8x background level), the second and third most dominant topics being "US politics with a heavy focus on Trump and Russia" and "US Domestic Politics." This seems logical for a community dominated by the "Politics" subreddit and with a high proportion of its posts/comments being linked to a post about Mike Pence.

Sources of information
A useful way to characterize a community of Reddit users is by the sources of information that they reference in their discussions. Figure 6 shows the top-25 web domains based on the volume of references made to them (measured as embedded hyperlinks) in all posts/comments across the full dataset, excluding within-platform links to other Reddit content (domains reddit.com and redd.it). The top-25 domains account for 45% of all shared links in this dataset (excluding the Reddit links).
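Counting referenced domains can be sketched as follows. This is not the procedure of Online Appendix C.3: the URL pattern is a deliberately simple assumption, and hosts are only normalized by stripping a leading "www.", so subdomains such as en.wikipedia.org are counted separately rather than being grouped under a single site.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simplified URL pattern (assumption): stop at whitespace or closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")
EXCLUDED = {"reddit.com", "redd.it"}  # within-platform links


def domain_of(url):
    """Lower-cased host name of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_excluded(domain):
    """True for the excluded domains and any of their subdomains."""
    return any(domain == d or domain.endswith("." + d) for d in EXCLUDED)


def top_domains(texts, n=25):
    """Most frequently linked domains across all texts, excluding Reddit links."""
    counts = Counter()
    for text in texts:
        for url in URL_PATTERN.findall(text):
            domain = domain_of(url)
            if domain and not is_excluded(domain):
                counts[domain] += 1
    return counts.most_common(n)


texts = [
    "See https://en.wikipedia.org/wiki/Climate and [NASA](https://climate.nasa.gov/evidence/).",
    "Old thread: https://www.reddit.com/r/science/ vs https://en.wikipedia.org/wiki/IPCC",
    "pic https://i.redd.it/abc.jpg",
]
ranking = top_domains(texts)
print(ranking)
```

Grouping hosts into registered domains would need a public suffix list, which is outside the scope of this sketch.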
Across the full dataset, the most shared domain is Wikipedia, a free online encyclopedia created and edited by volunteers around the world (Wikipedia, 2021a). The next most heavily referenced domains are platforms with user-generated content (e.g. the social media platforms YouTube and Twitter), or other content-sharing sites (e.g. Imgur, commonly used for sharing memes, and Smmry, a tool to summarize an article or piece of text into a specified, smaller number of sentences). Only two of the top-10 shared domains have traditionally "expert-generated" or "trusted" content: The Guardian (British newspaper) and NASA (US scientific organization). The second and fourth most shared domains are YouTube and Imgur, for sharing videos and images respectively, showing the importance of videos and memes/imagery in the discussion. A recent study investigated memes in relation to the "conflicting logics" of climate change discourse and found that memes clearly demonstrate either a skeptic or "convinced" logic (Ross & Rivers, 2019), and so the high volume of links to Imgur might be suggestive of the presence of polarized viewpoints, although this was not investigated here. A number of mainstream media and scientific sources are also present in the top-25 domains, whilst the IPCC, the authoritative, global assessment of climate change information, comes in at 35th.
An analysis of the sources of information shared by each of the six communities of interest is given in Online Appendix D.

Measuring ideological bias in sources of information
Political bias was investigated for the top-25 domains using The MediaBiasFactCheck (MBFC) rating. Ratings were found for 11 of the 25 domains, with five being assessed as "Pro Science," four as "Left Center Bias," and two as "Least Biased." Converting these to scores and weighting them based on the number of links gives an overall mean score of −0.230. The domains for which there were no ratings on MediaBiasFactCheck were primarily those for which there are multiple contributors with various political leanings, for example, YouTube, Twitter, WordPress, and therefore it is impossible to assign a single position.
Eight of the top 25 domains appeared on the CBS list for environmentalist/skeptic bias, with scores ranging from −1 to −0.4 and a weighted mean score of −0.616. Note that CBS excludes sites without editorial control, such as Twitter, WordPress, and Reddit as they may not have a single ideological position.
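The link-weighted mean used for both measures can be sketched as below. The domain names, link counts and scores are placeholders and do not correspond to the MBFC or CBS values used in this study; domains without a rating are simply left out of the average, mirroring the treatment of unrated domains described above.

```python
def weighted_mean_score(link_counts, scores):
    """Mean bias score over rated domains, weighted by number of shared links."""
    rated = {domain: n for domain, n in link_counts.items() if domain in scores}
    total_links = sum(rated.values())
    if total_links == 0:
        return None  # no rated domains, so no score can be computed
    return sum(scores[domain] * n for domain, n in rated.items()) / total_links


# Placeholder data: site-c.example has no rating and is therefore ignored.
link_counts = {"site-a.example": 30, "site-b.example": 10, "site-c.example": 60}
scores = {"site-a.example": -1.0, "site-b.example": 1.0}
mean_score = weighted_mean_score(link_counts, scores)
print(mean_score)
```

Because unrated domains drop out entirely, a heavily shared but unrated site has no influence on the result, which is the source of the potential bias acknowledged in the text.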
These two measures suggest an overall leaning that is somewhat left-wing politically and environmentalist in its climate perspective. Whilst it is recognized that there is a risk of bias with so many domains not appearing on the lists, it is still considered a worthwhile exercise to investigate the measures for the domains that are on the lists as those excluded from the lists are primarily domains that do not have a single political/environmental position.
The sources of information within each community were identified by filtering the set of source domains to only include posts and comments authored by a member of each of the communities of interest.
The weighted average MBFC and CBS scores for the top 25 domains shared by the six communities are plotted in Figure 7, with the horizontal axis representing the average MBFC score whilst the vertical axis represents the average CBS score. The error bars show the standard error of the mean. It can be seen that Community 133 is an outlier, appearing in the top right quadrant of the scatter plot (right bias and climate skeptic) when all other communities are in the bottom left quadrant (left bias and environmentalist). Consistent with the full dataset, the average MBFC scores for Communities 54, 126, 46, and 33 are slightly left of center, whilst that of Community 396 is neutral on average. Community 133, however, has a significant right bias on average. Most domains for most communities are either "Least Biased," "Pro Science," or "Left Center Bias," whereas Community 133, largely based around The_Donald, has 3 domains in the top 25 that are assessed as "Conspiracy - Pseudo Science" and a further 2 highlighted as "Questionable Sources," all right leaning. Several of these domains are associated with misinformation. This is consistent with the findings of previous research which assessed the URLs shared by six subreddits, selected for their propensity to share news URLs, categorizing them as either mainstream or alternative, and found that alternative URLs made up a much higher proportion of the URLs shared on The_Donald than on the other five subreddits (Zannettou et al., 2017). The average CBS scores tell a similar story, with most of the communities being slightly environmentalist, although less so than the average for the full dataset, whereas the average score for Community 133 is slightly climate skeptic. Given that Community 133 is largely formed around The_Donald, a subreddit dedicated to Donald Trump, who is well known for his climate skeptic viewpoints, these findings are not surprising.

Discussion
Exploratory data analysis (Online Appendix B) revealed that there is a great deal of discussion about climate change on Reddit, with the issue arising in multiple contexts including politics, news and television programs, and celebrities. Whilst there were peaks related to specific events, such as Trump pulling out of the Paris Agreement, there is a continuous underlying level of discussion. Most posts and comments are shorter than the Twitter limit, so despite Reddit having a much more generous limit, the majority of posts and comments do not make use of it. There is, however, a very long tail for the character length of both posts and comments, such that the mean length of comments, and especially of posts, is above the Twitter limit of 280 characters. This suggests a pattern with a few long posts and a large number of shorter comments.
Community analysis, based on the network structure of which users interact by responding to each other's posts/comments, found that communities are formed beyond the subreddit structure, with most communities comprising posts across a number of subreddits, albeit often with one dominant subreddit. There was a high degree of connectedness between the communities, which is not what we would expect to see if there were polarized echo chambers of the kind previously seen on Twitter (Williams et al., 2015).
Topic analysis suggests that discussions of climate change on Reddit are wide-ranging, with topic themes including global warming and its causes and impacts, US and UK politics, policy and regulation, world economics, and the scientific consensus, with debate in various forms also forming part of the discussion. Across the full text, incivil debate with its associated name-calling and unfriendly language was dominant in more posts and comments than any other topic, with skepticism/denial coming a close second, and US Politics with a heavy focus on Trump and Russia third. This is suggestive of a polarized discourse between skeptics/deniers and "the convinced" and also highlights the strong link between politics and climate change.
The topics of discussion were found to vary by community, with the most dominant topic differing both between communities and from the background levels seen in the full text. For some communities, the dominant topic seemed closely linked to the dominant subreddit or post around which the community was largely formed. For example, a community emerging around a single post about a particular television show had "media debate" as its dominant topic, whilst the community largely formed around two subreddits related to the United Kingdom and UK Politics had "Policy - particularly focused on UK Politics" as its dominant topic. One very strong result here was that "Denial / scepticism" was the most dominant topic (with 3.7 times the proportion of posts and comments seen in the full text) for Community 33, for which the Politics subreddit is the dominant element. This is perhaps reflective of the partisan nature of climate change, particularly in the US, with denial and skepticism being more strongly associated with right-wing politics.
Investigation of the sources of information used found that only two of the top-ten shared domains across the full text have traditionally "expert-generated"/"trusted" content (the Guardian and NASA), although the most shared domain, Wikipedia, is considered by some to also fall into this category. The other domains in the top-ten most shared are all user-generated/sharing content sites. YouTube and Imgur are both in the top four most shared domains, highlighting the importance of video and memes/imagery in the discussion. The sources of information analysis showed some variation between communities, with Community 33 having a much higher number of links than the other five communities, and a different profile of most-shared domains. Perhaps the starkest difference in the sources of information being shared by the communities is the right political bias and climate skeptic bias of the domains most frequently shared by Community 133, which is largely based around the subreddit "The_Donald." This is suggestive of this community being an echo chamber, which is perhaps not surprising given that one of the rules of this subreddit was that any material that was critical of Donald Trump would be removed (Reddit.com, 2019). The domains being shared may also be indicative of the presence of misinformation.

Polarization or deliberative democracy?
One of the aims of this study was to investigate whether the phenomena found in climate discourse on other social media platforms, such as polarization, echo chambers, misinformation, and skeptical discourse, are also present on Reddit. While we do not explicitly test for polarization, we find a variety of suggestive evidence relating to this concept. Both the topics discussed and the sources of information being used suggest the presence of climate skeptic viewpoints, which are often linked with misinformation, as well as environmentalist opinions. However, there was also evidence suggestive of more deliberative debate, with topic themes found on a wide range of subjects and many topics suggestive of debate that is not incivil.
A community largely based on the The_Donald subreddit was found to be a potential echo chamber with strong right-wing/skeptic views. However, subsequent to the time period of this study, The_Donald was quarantined in 2019 and eventually banned by the platform in June 2020 for hate speech (nytimes.com, 2020). Other than the community formed around The_Donald, we find no further evidence of echo chambers on Reddit related to climate change. This is perhaps surprising given the evidence of polarization found elsewhere in social media, and the common cycle of polarization leading to homophily and echo chambers (Treen et al., 2020). However, similar findings were reported by a study on political interactions on Reddit, which showed that polarization around a highly controversial issue does not preclude the existence of cross-cutting political interactions and an absence of echo chambers (De Francisci Morales et al., 2021). Our findings here support the proposition that Reddit is less susceptible to echo chambers than other platforms.
In summary, we have presented the first look at climate change discourse on Reddit, expanding the empirical research beyond Twitter. We find evidence for polarization and climate skeptic viewpoints that are linked to misinformation, but no strong evidence of echo chambers other than a single rogue community that has since been banned. As described in the literature review, whilst the lack of echo chambers does not necessarily mean there is deliberative debate, we do find evidence that Reddit allows for the flow of information between polarized users, in contrast to other social media platforms like Twitter. Potential reasons for this are that user interactions on Reddit are structured around subject themes, as opposed to networks built around users and social interactions; that the longer-form text allows for a deeper level of debate; and the level of moderation on Reddit.

Limitations
It is important to note that this study only covers a limited time period centered on a significant political event. Future studies could investigate whether these findings hold across longer time periods, as well as around specific events. A further important point relating to the overarching methodology of this study is that it is based on computational analysis rather than qualitative analysis; this approach is based on statistical regularities and may miss nuances and subtleties of the discourse. Furthermore, the keyword-based data collection captured a substantial amount of non-climate-themed material, and conversely may have missed some climate-themed material where the chosen keywords were not used in the initial post.

Future research
Our research offers a first step in exploring climate change discourse on Reddit and we hope that others will build on this to further investigate this important subject. In particular, future studies could investigate whether Reddit is a locus of deliberative debate, and if so, what features of Reddit contribute to this. Freelon (2015) finds that two significant predictors of whether the discussions on a platform will be deliberative or not are the users' left/right issue stances and the design features of a platform, with the presence of moderators being a factor within this. In the case of Reddit, the standpoint of each subreddit is likely to be reflected in its user base, and the different rules and guidelines and moderation levels within each subreddit almost create a series of "mini-platforms" with different features and operation, albeit where users can, and do, post across these "mini-platforms" as we see in our findings. Similarly, Wright and Street (2007) highlight that the format and operation of online discussion forums play a key role in whether the discussions are reflective of deliberative democracy or polarization.
The analysis of information sources also provides direction for future research activities. First, Wikipedia was the most shared domain, suggesting there is a high level of climate change content on Wikipedia. Yet only one previous study specifically investigates this topic (Esteves Gonçalves da Costa & Cukierman, 2019), and it covers Portuguese-language pages of Wikipedia. Interestingly, climate change articles were the focus of a project to develop a platform to provide analysis and visualization of controversies in Wikipedia articles, as it is one of the most intensely discussed topics on Wikipedia (Borra et al., 2014). Future studies could consider climate change debate on English-language Wikipedia. Second, the finding that YouTube and Imgur are two of the most shared domains shows the importance of visual content, in the form of both videos and memes/images, in the climate change debate. A few studies have been carried out into climate change on YouTube (e.g. Andersson, 2021; Shapiro & Park, 2015; Uldam & Askanius, 2013). However, these mostly focus on user comments in response to videos, rather than the content of the videos themselves. A potential direction for future research is YouTube content, both the visuals and the audio. No previous research was found specifically looking at climate change memes on Imgur, although the broader topic of climate change memes has been the subject of some recent research (e.g. Ross & Rivers, 2019; Zhang & Pinto, 2021) and may be worthy of further investigation given the prevalence of these in the sources of information for climate change discussion on Reddit.

Disclosure statement
No potential conflict of interest was reported by the author(s).