Whose ideas are worth spreading? The representation of women and ethnic groups in TED talks

ABSTRACT We investigate the representation of women and ethnic groups in TED talks, which reach a large online audience on YouTube with science-related content and topics on societal change. We argue that gaps in representation can create a misleading perception of science and the respective topics discussed in these talks. We validate annotations from an image recognition algorithm for identifying speaker ethnicity and gender to compile a data set of 2333 TED talks and 1.2 million YouTube comments. Findings show that more than half of all talks were given by white male speakers. While the share of women increased over time, it remains consistently low for non-white speakers. Topic modelling further shows that the share of talks addressing inequalities which affect both groups is low, but increasing over time. However, talks about inequalities and those given by female speakers receive substantially more negative sentiment on YouTube than others. Our findings highlight the importance of speaker and topic diversity on digital platforms to reduce stereotypes about scientists and science-related content.


Introduction
With the digitization of human interactions affecting societies all over the world, digital platforms play an ever-increasing role in the creation and distribution of information. People consume information and entertainment content on social media platforms, online news and videos, which heavily influences public opinion and shapes social identities. As with traditional media sources, what matters in the digital sphere is not only what kind of information is distributed but also who gets to spread content. Despite the benefits of digitization, there is increasing concern about the level of digital democracy and whether audience concentration might lead to a new form of discourse elitism online. This raises the important question of how disadvantaged groups across societies are affected by this development and to what extent they are represented in the digital sphere. While a large body of literature is available on representation and negative dispositions towards these groups in traditional media and entertainment outlets (Bleich, Bloemraad, and de Graauw 2015; Shor et al. 2015), little research has been done with special focus on digital platforms.
In our paper, we investigate the representation of women and ethnic groups in TED talks. We focus on TED talks as a substantial part of the digital sphere because they reach a very large audience with science-related content and often discuss important matters of societal change. To do so, we build our theoretical argument on the literature about descriptive and substantive representation from the field of political science (Mansbridge 1999; Dovi 2002). A group is thus considered to be descriptively represented if a sufficient number of its members are part of a system of interest. In our case, this concerns the extent to which women and ethnic groups are able to give talks and represent themselves on a global stage like TED. In contrast, substantive representation is achieved when the needs and issues of groups are addressed, e.g. when topics important for women and ethnic groups are discussed in TED talks. Gaps in both forms of representation can create a misleading perception of science and the respective topics that are discussed in these talks. Our first research question is therefore: RQ1: To what extent are women and ethnic minorities descriptively and substantively represented in TED talks?
Furthermore, we examine how digital audiences respond to the representation of women and ethnic groups, as this allows us to gain insights into the public perception of these groups and the related topics. An important feature of most digital platforms is their responsive feedback systems, where users can give positive or negative feedback on content through digital interactions such as commenting, sharing, liking or disliking. Unlike traditional newspaper articles, most online media are thus not consumed in isolation (Scheufele 2018). Comments from other users can have a decisive impact on what someone thinks about an article or video in the first place. Through these social cues, a neutrally framed article can, for instance, be perceived as heavily biased solely due to viewers' commenting behaviour (Anderson et al. 2014). Such behaviour has been observed across social groups and socio-demographic backgrounds, but highly social individuals in particular tend to be more affected by such developments and filter bubbles (Bar-Gill and Gandal 2017). On YouTube, positive or negative comments about TED talks and speakers could thus create polarization or amplify prejudices or stereotypes towards certain social groups or topics. Ultimately, such behaviour can create feedback loops in which users adopt the predominating opinion of others (Rothschild and Malhotra 2014). Our second research question is thus a more exploratory one, namely: RQ2: How do descriptive and substantive representation of women and ethnic groups in TED talks affect viewer sentiment on YouTube?
To measure descriptive representation of speakers, we utilize and validate annotations from an image recognition algorithm. The algorithm detects faces within images of speakers and assigns probabilities for sex and ethnicity based upon physical appearance. Results of our analysis show that more than half of all TED talks were given by white male speakers and, while the share of talks by women increased from 2006 to 2017, it remained consistently low for non-white speakers. To analyse substantive representation, we apply structural topic modelling to the transcripts of all TED talks to identify content in which the needs and issues of women and ethnic minorities are addressed (Roberts et al. 2014). We identify a topic about inequality discussions which are relevant to both groups. Overall, this topic constitutes only 3% of talk transcripts, but the trend is increasing over time. Furthermore, sentiment analysis of YouTube comments suggests that public sentiment is positive for non-white speakers, but negative for talks about inequalities, such as violence against women or racism. Talks given by women also receive substantially more negative and to some extent hateful comments than those given by men. We discuss how our findings highlight the importance of speaker diversity on global digital platforms to reduce stereotypes about science-related content.

Media effects and representation in the digital sphere
Representation in digital media such as TED talks is important, as a large body of literature suggests that media consumption affects public opinion (see Tewksbury and Scheufele 2009; Valkenburg, Peter, and Walther 2016). In that regard, three theoretical concepts are particularly relevant (Scheufele and Tewksbury 2007): Agenda-setting describes how the media influence the salience of certain topics in public discourse (McCombs and Reynolds 2009). Priming is considered to be an extension of agenda-setting (Scheufele and Tewksbury 2007), which connects news content with certain benchmarks for evaluation that can result in a change of the standards that people use to make assessments of particular topics (Iyengar and Kinder 1987). Framing, on the other hand, illustrates how public opinion is influenced through the way certain information is presented. As such, frames define problems, identify their causes, render a moral judgment and suggest potential solutions (Entman 1993; Scheufele 1999).
Users consuming media in the digital sphere are thus influenced constantly through these mechanisms. For instance, if users of social media channels like Facebook are exposed to a certain set of political information only, it raises their perceived importance of these policy issues and it ultimately affects what people will think about (Feezell 2018). Priming of certain aspects by political elites in public debates, e.g. fake news, also influences citizens in their way of how they evaluate news media as a whole (van Duyn and Collier 2019). Priming can also subconsciously evoke certain feelings like aggressiveness (Buchanan 2015) and quite often it appears that biased content is most popular (Peer and Ksiazek 2011), as users become more engaged in user-content interaction when content is popular (Ksiazek, Peer, and Lessard 2016). There is also evidence for stronger user engagement in political campaigns when candidates tend to attack their opponents more often (Xenos, Macafee, and Pole 2017). And finally, framing not only shapes how we think about certain issues but also what people share with others on social media platforms (Valenzuela, Piña, and Ramírez 2017) or whether and how strongly they become engaged in political campaigns (Pond and Lewis 2019).
Media effects are subject to a multitude of issues like societal norms and values, pressures through interest groups or the ideological orientation of journalists and social media personalities (Tewksbury and Scheufele 2009). Related to that, a crucial factor is the question of who actually communicates information in the media. As Merskin (2017, 1098) argues, mainstream media 'are powerful hegemonic tools' that enforce cultural reproduction of dominant ideologies, e.g. in terms of race and gender, and also regarding economic or social values. If the representation of certain groups of society in the media is unbalanced, inaccurate or stereotypical, it not only negatively shapes how non-minority members think about such groups and their interests but also how these groups think about themselves. It is therefore both the visibility and the nature of the visibility of groups in the media that shape how we perceive certain phenomena in society (Merskin 2017).
When it comes to the representation of disadvantaged groups in traditional media outlets, e.g. in newspapers and television, a large number of studies already investigated outcomes for women and ethnic minorities. For instance, scholars analysed media coverage of migrants and minorities to see what kind of information is presented, who is allowed to participate and how minorities are represented (Miller 2006;Bleich, Bloemraad, and de Graauw 2015). Many similar studies have been conducted about the representation of women, e.g. in radio, television and film (Fonda, Morgan, and Steinem 2017) as well as in newspapers (Harp, Bachmann, and Loke 2014) and other printed news (Shor et al. 2015). The overall image depicted by studies about these groups is clear: both women and certain ethnic groups often suffer from under-representation and coverage that enforces stereotypes and negative attitudes.
Scholars also started to examine whether similar patterns exist for the increasingly available share of digital content, but the body of available work is still small in comparison. Jia et al. (2016) collected visual and textual data of online newspapers to analyse how gender is represented in online news. They found that women were more likely to be represented visually than they were mentioned within texts and that online news sources are still male dominated. Other scholars found that social media platforms, such as YouTube (Guo and Harlow 2014), Facebook and Twitter (Matamoros-Fernández 2017), are used to spread content enforcing stereotypes or xenophobic attitudes about ethnic minorities. However, while these studies provided important insights, they do not enhance our knowledge about the connection between representation and the interaction with the digital sphere, i.e. in comment sections of blogs, web pages or social media platforms. Qualitative work by Sobieraj (2017) about digital sexism shows that studying these interactive environments is crucial to make sense of how the public perceives digital content. This interactive part of digital spaces enables users to give positive or negative feedback on content by digital interactions such as commenting, sharing, liking or disliking. One key contribution of this paper is thus not only to study representation in the digital sphere (RQ1) but also the public sentiment about it (RQ2).
For our theoretical concept of representation, we draw upon terminologies from the large body of political science literature about representation in political systems. A group is thus considered to be descriptively represented if a sufficient number of its members are part of a system of interest (Pitkin 1967;Mansbridge 1999;Dovi 2002). As a simple example, the gender distribution in most societies is close to even, which makes it desirable that half of all members of national parliaments are women. In comparison to representative political systems, it is often difficult to evaluate whether the descriptive representation of certain groups is adequate in the digital sphere. Political systems usually have well-defined populations of people for which they are responsible, such as all residents of a country. For digital platforms, this is not generally true. Web pages like that of TED media or social media platforms like YouTube reach global audiences and distributions of their socio-demographic attributes are often unknown. Thus even though it is important to what extent women and ethnic groups are allowed to give talks and represent themselves on a global stage like TED, it is also difficult to agree upon what enough representation is. Nevertheless, members of disadvantaged groups need to be represented in the digital sphere, as under-representation can lead to the same issues that scholars found in traditional media sources.
Moreover, substantive representation assesses system responsiveness, a normative ideal of democracy. For instance, in an electorate with a high share of citizens of immigrant origin, their needs and interests should find more consideration in the activities of their representatives (Dahl 1971). In our case, this applies when topics that are important for women and ethnic groups are discussed in TED talks. TED speakers are not elected representatives, but it is still crucial whether they address the needs and interests of women and ethnic groups on the TED stage. In that regard, the substantive representation of groups can be connected to and affected by descriptive representation, but not necessarily so. Often, it is believed that a 'critical mass' or 'critical actors' can be sufficient (MacDonald and O'Brien 2011; see also Celis and Childs 2008; Celis and Erzeel 2015). Group members are more likely to be substantively represented by people who look alike and have similar needs. In this paper, we make use of both concepts of representation and apply them to studies of the digital sphere. We refer to descriptive representation as the presence of women and ethnic groups on the TED stage as speakers. As for substantive representation, we examine whether needs and issues relevant to women and ethnic minorities are addressed in the content of TED talks. Being invited for a TED talk gives speakers the possibility to spread content that is important to them, and we therefore assume that descriptive representation in the digital sphere also fosters substantive representation.
Much in the way of the thermostat model (Wlezien 1995;Soroka and Wlezien 2010), public sentiment is likely to react according to how well certain groups of society are represented both in terms of their physical presence and the topics that concern them the most. As this relationship is heavily amplified through media attention (Williams and Schoonvelde 2018), we will use an exploratory approach to investigate reactions towards descriptive and substantive representation. It is possible that stronger representation of certain ethnic groups can also increase negative sentiments among some members of society. For instance, women and ethnic minorities often suffer from hate speech or other forms of discrimination in online environments and digital media outlets (Chetty and Alathur 2018). Thus it is likely that we can observe similar patterns for TED Talks, too. In general, we expect that stronger representation of women and ethnic groups will also trigger reactions in public sentiment.

Digital democracy and discourse elitism
Based on the premise that 'meaningful democratic participation requires that the voices of citizens in politics be clear, loud and equal' (Verba, Schlozman, and Brady 1995), many expected the Internet to become a powerful tool to democratize communication and politics, an 'army of Davids' against big media and political elites (Reynolds 2006). The Internet, so the argument of many early journalists, commentators and scholars, would empower citizens to participate in online communities that create content collaboratively, thereby diminishing the influence of traditional media juggernauts (Benkler 2006). And indeed, there is evidence that the Internet has increased the potential for citizens to make meaningful and widely heard contributions online, or that so-called 'burglar alarm' models of reporting misdemeanour and wrongdoing become more effective (Zaller 2003;Hindman 2008). However, there is growing concern about the online establishment of new forms of discourse elitism.
According to Matthew Hindman, the infrastructure of the web does not decrease the centralization of news content; often it actually increases centralization to levels far beyond traditional news outlets in the offline world (Hindman 2008, 2009, 2018). This is true for four reasons: first, online traffic and the link structure of the internet follow a power-law distribution, in which few sites receive the bulk of visits and most sites remain fairly untouched (Huberman et al. 1998). Thus most people get to see similar content and the potential of immensely diversified information is never reached. Much of this has to do with the way citizens search for information online. Second, and against popular conviction, even digital natives are not universally savvy online (Hargittai 2010) and many people rely on well-known and long-established procedures to browse the web. Such skill-based habits of use can then induce cognitive lock-in, i.e. the repeated use of products or sites online creates cognitive switching costs which increasingly prohibit the user from looking for potential alternatives (Johnson, Bellman, and Lohse 2003; Murray and Häubl 2007). Third, 'digital distribution is never free' and 'costs of audience building' are huge (Hindman 2018, 167). Unlike traditional media, it is not the production but rather the distribution of information that is so expensive. Digital survival hinges upon one's stickiness, i.e. the ability to attract users and make them return to one's website over and over again (Hindman 2018). Large online players therefore have a considerable advantage, as they can rely on economies of scale in terms of staff, equipment, speed and especially data that can be used for personalization through recommender systems (Sundar et al. 2015; Valkenburg, Peter, and Walther 2016). That way, large sites can direct users to their desired content much faster than any small niche site could.
And finally, those who get heard in the online sphere are not representative of average citizens, but to a substantial degree 'well-educated white male professionals' (Hindman 2009, 128). For the USA, it has been shown that most successful blogs are run by educational, business and technical elites or journalists from traditional media outlets with Ivy League degrees (Hindman 2009). Overall, digital activism is strongly affected by ethnicity and class (Schradie 2018). Entman and Usher (2018) further extend this line of thought by stressing the importance of platforms, algorithms, digital analytics, ideological media and rogue actors in framing news content and ultimately shaping public opinion. Such developments can become alarming if the content that is distributed through a small number of big players represents the opinion of a particular interest group only. Spreading lopsided arguments to large audiences can tilt public opinion in one way or the other. For instance, if a news outlet predominantly hosts neo-liberal economists and politicians to discuss major topics like the Euro crisis or the refugee crisis, some solutions might be presented as if there were no viable alternative. User and viewer opinion could thus be pushed in a certain direction, while commenting behaviour can add to that effect and increase polarization among viewers. Unfortunately, the growing presence and consumption of soft news, which presents news in episodic frames without additional context and often in personalized manners, pulls viewers' opinions heavily towards one side or the other (Ter Wal 2002; Baum 2004; Boukes and Boomgaarden 2015). It is thus particularly important to study the representation of disadvantaged groups like women and ethnic minorities in successful online media outlets such as TED talks. Furthermore, it is relevant to examine how topics relating to these social groups are represented in the online sphere and how viewers respond to such topics.

Why study TED talks?
Studying the representation of social groups in digital media is empirically challenging due to the huge number of potentially relevant data sources. Depending on research motivations, this can make it difficult to assess reasons for choosing a particular source over another. We consider TED talks a particularly interesting case because they introduce science-related content to the public and in doing so reach a very large online audience. TED talks are part of conferences organized by the media organization TED: Technology, Entertainment and Design, which was founded in 1984. Talks from TED conferences are distributed for free across several platforms under the slogan Ideas Worth Spreading. Invited speakers have included scientists like Stephen Hawking and entrepreneurs like Bill Gates, as well as some activists and entertainers. In general, the populist nature of TED talks (Tsou et al. 2014) can be characterized by a sales pitch atmosphere, passionate styles of delivery and hints at feelings of self-actualization and inspiration (Ludewig 2017). For this reason, some scholars consider TED talks a source of information for the masses rather than for scientists. Speakers giving TED talks also present relatively few counter-arguments to their proposed ideas and thus potentially undermine alternative viewpoints (Singh Chawla 2016). Moreover, TED conferences are often considered elitist events, restricted to those who can pay significant entrance fees. To give one example, attending a 2018 TED conference in person was priced at around $10,000, which results in a very special audience (Turnaround Management Association 2017; Schwartz 2018). While potentially any person can be suggested as a speaker via a nomination form on the TED website, it is unknown exactly how TED selects the speakers for its elitist stage.
One reason why we analyse TED talks in this paper is the significant size of their audience. Talks are available on the TED homepage, via several mobile applications and, most importantly, on the YouTube video sharing platform. In 2012, TED celebrated 1 billion views for videos uploaded on its own web page, stating that talks were being viewed 1.5 million times a day. In April 2018, Socialblade Analytics ranked TED's YouTube channel among the top 250 globally, as it had attracted more than 9 million subscribers and received more than 1 billion views in total.
TED talks also get covered regularly in global publications like The Guardian (Giussani 2015) and The New York Times Magazine (Dominus 2017). Moreover, as the predominant language of talk content is English, TED provides transcripts in more than 100 languages. For most YouTube videos of talks, captions are also available in several languages. Regarding scientific coverage, only a few studies have investigated TED talks. One study examined presentation characteristics of TED talks and concluded that they can be a valuable source for teaching and communicating ideas to students (Kedrowicz and Taylor 2016). Other scholars found that women gave fewer TED talks than men. In sum, TED talks reach a very large audience and provide content in many languages, which makes them a relevant case for analysing representation in the digital sphere.
Another reason for focusing on TED talks is that they are considered a 'highly successful disseminator of science-related videos' (Sugimoto et al. 2013, 1) and the content of talks is often related to important matters of societal change. Their popularization of science and digital content about Ideas Worth Spreading draws attention from people across several societies. Due to the large audience, under-representation of certain groups on the TED stage is particularly problematic. If, for instance, TED talks were predominantly presented by white male speakers, this could lead to the development or amplification of stereotypes among the audience. Decades of Draw-A-Scientist studies have shown that as children get older and stereotypes manifest, they predominantly associate science with men (Chambers 1983; Miller et al. 2018). Other work further suggests that the development of stereotypes (Shor et al. 2015), negative attitudes (Van Klingeren et al. 2015) and lowered self-esteem (Martins and Harrison 2012) can at least in part be ascribed to media exposure. It is therefore important for the global audience of TED that women and ethnic groups are adequately represented descriptively and substantively so as not to further enhance stereotypes and negative attitudes.

Data and methods
To analyse the representation in TED talks, we first applied web scraping techniques to retrieve content about talks and corresponding speakers from the TED homepage. Content from the TED website is distributed as creative commons licensed material. We collected information from the very first talk given in 2006 up until the time of data collection in May 2017. Second, we used YouTube Data Tools (Rieder 2015) to find the related YouTube videos for these talks and collected metadata about video metrics such as dislikes and video views. Subsequently, we also retrieved all 1.2 million available user comments for the YouTube videos and used a combination of approximate string matching and manual coding to merge data from TED and YouTube. In this process, we noticed that if talks appear on the TED page, they will generally also be available on the main YouTube channel, regardless of whether they are conventional talks, from the TEDx subbranch, or from specific events like TEDWomen. Our sample thus includes all talks that TED considers to be relevant enough for its main channel. Furthermore, we restricted our sample by removing a small number of talks that do not contain language (e.g. music concerts or dancing), or without human speakers as main presenters (e.g. an entertainment talk about Einstein the Parrot).
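The approximate string matching step can be illustrated with a minimal sketch. The paper does not specify the exact procedure, so the similarity measure (difflib's ratio), the threshold and the example titles below are illustrative assumptions, not the authors' implementation (and shown in Python rather than R):

```python
from difflib import SequenceMatcher

def best_match(ted_title, youtube_titles, threshold=0.8):
    """Return the YouTube title most similar to a TED talk title,
    or None if no candidate clears the similarity threshold."""
    scored = [(SequenceMatcher(None, ted_title.lower(), yt.lower()).ratio(), yt)
              for yt in youtube_titles]
    score, match = max(scored)
    return match if score >= threshold else None

# Hypothetical candidate titles for illustration
candidates = ["The danger of a single story | TED", "Some other video"]
print(best_match("The danger of a single story", candidates))
# → The danger of a single story | TED
```

Pairs below the threshold would then be resolved by manual coding, as described above.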

Descriptive representation: speaker gender and ethnicity
Capturing attributes of thousands of speakers by manual coding is a resource-draining task. We therefore decided to utilize an automated approach in the form of an image recognition algorithm. As facial recognition software is increasingly deployed by companies and government agencies (Dearden 2018), scholars have recently started to analyse its performance. They found that face recognition algorithms by Microsoft, IBM and Face++ can produce substantially biased results by identifying the sex of darker-skinned females less accurately in comparison to other skin tones (Buolamwini and Gebru 2018). In this work, we use the Kairos diversity recognition software for gender and ethnicity annotations. In 2018, the company released a statement about restricting access to these features in light of concerns about face recognition systems and their consequences for privacy and safety (Brackeen 2018). As of February 2019, gender and ethnicity annotations are still available via the Kairos Application Programming Interface (API). The Kairos API detects faces within images of speakers and assigns corresponding probabilities for sex and ethnicity based upon facial features. Our data set contains images of all speakers publicly available at the TED homepage, which we used as input for the diversity recognition algorithm. Figure 1 shows the annotation output for two publicly available example images of TED speakers.
Depicted on the left-hand side is Kimberlé Williams Crenshaw, a scientist and full-time professor who gave a TED talk about race and gender bias. The famous entrepreneur Bill Gates is a regular TED speaker and depicted on the right-hand side. In both cases, the algorithm correctly annotates the sex of the speakers. Regarding ethnicity, it is only possible to differentiate between Asian, Black, Hispanic, White and a category for ethnic groups other than those mentioned. In both cases, the mode prediction of the algorithm is in line with labels assigned by human coders (see supporting information S1). We automated the process of sending each image to the image recognition service and then merged all annotations with our data. For ethnicity, we always coded the mode prediction of the algorithm as the final value. As can be seen in supporting information S1, the majority of the probability mass for each ethnic group lies at either 0 or 1, which is why using the mode does not result in a noticeable loss of information. To evaluate algorithmic performance, we conducted several validity checks and computed inter-rater agreements between two human coders and the algorithm for 200 randomly selected images of TED speakers. For all images in this sample, we computed Fleiss' Kappa for pairwise and overall ratings between the two human coders and the algorithm (see supporting information S1). Overall, Kappa values are satisfactory, with 0.85 for ethnicity and 0.95 for sex. For the ethnicity categories Hispanic and Other, agreement is lower in comparison to the remaining categories. Qualitative inspections revealed that the small number of disagreements for these categories cannot solely be ascribed to poor algorithmic performance, as both pairwise comparisons between human coders and between humans and the algorithm were not always in line. For instance, some speaker images were labelled as White by one rater and Hispanic by another rater or the algorithm.
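Fleiss' Kappa can be computed directly from per-item label counts. The following minimal sketch (in Python rather than R, and using toy labels rather than the actual 200-image validation sample) shows the calculation for three raters, mirroring the two-coders-plus-algorithm setup:

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items rated by the same number of raters;
    `ratings` is a list of per-item label lists."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # n_ij: how many raters put item i into category j
    counts = [Counter(item) for item in ratings]
    # per-item observed agreement P_i, then the mean P_bar
    p_i = [(sum(c[cat] ** 2 for cat in categories) - n_raters)
           / (n_raters * (n_raters - 1)) for c in counts]
    p_bar = sum(p_i) / n_items
    # marginal category proportions p_j and chance agreement P_e
    p_j = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Toy example: two human coders and the algorithm rating four speakers
cats = ["White", "Black"]
items = [["White"] * 3, ["White", "White", "Black"],
         ["Black"] * 3, ["White", "Black", "Black"]]
print(round(fleiss_kappa(items, cats), 4))  # → 0.3333
```

The toy value is far below the reported 0.85 and 0.95 because the example deliberately includes disagreements on half of the items.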
Overall, our validation results nevertheless suggest that annotations from the image recognition algorithm are sufficient for our research task. We also conducted additional analyses for which we treated human annotations as the gold standard. In doing so, we assessed the performance for predicting ethnicity and gender with F1 scores and compared predictions of the image recognition algorithm with predictions based on the names of speakers (Imai and Khanna 2016;Wais 2016). Results indicate that the image recognition algorithm outperforms name-based approaches for both ethnicity and gender (see supporting information S1).
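The F1 comparison against the human gold standard can be sketched as follows. The paper does not state which averaging it uses, so macro-averaging (equal weight per category, which protects rare categories like Hispanic and Other) and the toy labels are assumptions for illustration:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute per-class F1 and average with
    equal weight, so rare categories count as much as frequent ones."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Human labels as gold standard vs. algorithm predictions (toy data)
gold = ["White", "White", "Black", "Black"]
pred = ["White", "Black", "Black", "Black"]
print(round(macro_f1(gold, pred), 4))  # → 0.7333
```

The same score would be computed once for the image-based predictions and once for the name-based predictions, and the two compared.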
One issue with our data is that a small number of TED talks were given by groups and not by single persons, which makes it difficult to analyse descriptive representation when attributes like ethnicity are not identical across group members. We therefore identified talks with more than one speaker and only included these talks in our data if all members have identical annotations for sex and ethnicity, which reduces our final data set to 2333 TED talks.
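The filtering rule for multi-speaker talks can be expressed compactly: keep a talk only if all of its speakers received identical annotations. A minimal sketch (in Python, with hypothetical talk IDs and labels):

```python
from collections import defaultdict

def filter_consistent_talks(annotations):
    """Keep a talk only if every speaker on it received identical
    sex and ethnicity annotations; `annotations` holds one
    (talk_id, sex, ethnicity) tuple per speaker."""
    by_talk = defaultdict(set)
    for talk_id, sex, ethnicity in annotations:
        by_talk[talk_id].add((sex, ethnicity))
    return sorted(t for t, labels in by_talk.items() if len(labels) == 1)

# Hypothetical annotations: talk 2 has speakers with differing labels
rows = [(1, "F", "White"), (2, "M", "White"), (2, "F", "Asian"), (3, "M", "Black")]
print(filter_consistent_talks(rows))  # → [1, 3]
```

Single-speaker talks trivially satisfy the condition, so only mixed-attribute group talks are dropped.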
Regarding descriptive representation, we need to highlight one important limitation at this point: algorithms such as the image recognition algorithm obviously neglect the fact that gender is a non-binary construct. The labels an algorithm assigns may not be in line with the gender a person identifies with. The same applies to race or ethnic groups, which are also not objective categories, but rather social constructs that may vary across cultures. With that in mind, we still believe that our (over-)simplified indicators are useful for examining the representation of women and ethnic groups.

Substantive representation: content of TED talks
Ideally, as TED talks are available as videos, substantive representation of women and ethnic groups could be analysed in both visual and audio data. Scholars have recently started to work on signal processing for audio data (Knox and Lucas 2018), but models for video content that are useful for social scientists are still to be developed. For this reason, we focus on the transcripts of TED talks to examine substantive representation, as we assume that what speakers say in their presentations is the most important way to address the needs and issues of disadvantaged groups. We chose English-language transcripts because TED talks are almost exclusively given in English and it is also the language for which the most transcriptions are available.
To analyse substantive representation, we apply topic modelling to all talk transcripts. Topic models are a method for automated content analysis. They allow researchers to discover latent themes in text documents, where a topic can be understood as a set of words representing such a theme and each document is represented as a mixture of topics. As an example, after fitting a topic model, 60% of a talk transcript might be assigned to a technology topic consisting of words like computer, machine, device and algorithm. A further 30% of its content could capture an internet topic with words like internet, online, website and link, and the remaining 10% would be spread across other topics. To prepare our textual data for topic modelling, we used the programming language R (R Core Team 2018) and the packages quanteda (Benoit 2018) and tidyverse (Wickham 2016) to process all transcriptions into a corpus with common methods of text analysis. Texts were treated as bags of words, in which each term represents a feature and word order is ignored. In addition, stop words without semantic meaning, such as and or the, as well as very infrequent terms, were removed from the corpus. We further applied a stemming algorithm to all terms, so that words with similar semantic meaning, such as inequality and inequalities, are reduced to their common word stem inequ (Grimmer and Stewart 2013).
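The preprocessing steps (lowercasing, stop-word removal, stemming, bag of words) were implemented with quanteda in R; the following Python sketch mimics them with a deliberately crude suffix stemmer rather than a real stemming algorithm:

```python
import re
from collections import Counter

# Tiny illustrative stop-word list, not the full list used in the paper
STOP_WORDS = {"and", "the", "a", "of", "to", "in", "is", "it"}

def naive_stem(word):
    # Crude suffix stripping; the paper used a proper stemming algorithm.
    for suffix in ("alities", "ality", "ities", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def to_bag_of_words(text):
    """Lowercase, tokenize, drop stop words, stem, and count terms;
    word order is discarded, as in a bag-of-words representation."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(naive_stem(t) for t in tokens if t not in STOP_WORDS)
```

On the paper's own example, both inequality and inequalities reduce to the stem inequ.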
As we expect that substantive representation of disadvantaged groups can be affected by descriptive representation and developments over time, we utilize a novel variant of topic models called the Structural Topic Model (Roberts et al. 2014). Structural Topic Models enable us not only to examine topic proportions for each TED talk but also to analyse how these proportions vary depending on the date of the TED talk, speaker gender and speaker ethnicity. Although topic modelling is very useful for automatically categorizing large text corpora, one limitation is that the number of topics has to be determined by the analyst. To find the best model for our research goals, we utilized the R package stminsights (Schwemmer 2018) to qualitatively inspect and validate several models. More details about our validation procedure are available in Supporting Information S2. The validation showed a model with 30 topics to be superior to the others, both in terms of desirable statistical properties and in terms of its usefulness for our research task. We therefore chose this version as our final model and assigned labels to each of its topics.

Public sentiment on YouTube
To examine our second research question, whether descriptive and substantive representation affect viewer sentiment, we (i) apply sentiment analysis to all 1.2 million YouTube comments on TED talks and (ii) count the respective (dis-)likes of each video. Sentiment analysis describes methods for measuring people's opinions, sentiments, attitudes and emotions from language (Liu 2012). As with all sentiment analyses (e.g. Tsou et al. 2014), our results apply to those who comment on YouTube, which means we cannot draw inferences about the sentiment of people who remain silent. Nevertheless, YouTube comments appear below the corresponding videos and are therefore visible to viewers regardless of whether they comment themselves. It is therefore reasonable to assume that the comment sentiment of TED talks is likely to affect many viewers who do not share their opinions on the platform. While there are different ways to conduct sentiment analysis, most of them rely on the use of dictionaries. With the aim of detecting whether people speak in a positive or negative way about something, polarity dictionaries created by researchers usually include word weights. These weights are then used to produce an overall sentiment value for each text document. Although sentiment analysis is a commonly used instrument for analysing social media discourse, it is associated with a number of limitations (Puschmann and Powell 2018). Among other issues, sentiment analysis is constrained by two problems in particular. First, results strongly depend on the dictionary in use; a dictionary created for examining moods in short social media messages like Tweets is unlikely to properly capture the mood in, for instance, political speeches. Second, most implementations of sentiment analysis are very simple and do not account for basic features of human language like valence shifters.
Valence shifters like not ('I do not like it') or really ('I really like it') are commonly used in spoken and written language to alter the original meaning or sentiment of words.
In our paper, we apply a novel, sentence-based variant of sentiment analysis from the sentimentr package (Rinker 2018) that combines established dictionaries for polarity terms (Hu and Liu 2004; Jockers 2017) and valence shifters. Moreover, this implementation also accounts for the use of internet slang and emoticons, which frequently occur in social media data. For each TED talk, we calculate the average sentiment value over all corresponding YouTube comments. Thirteen YouTube videos of TED talks were not included in the analysis as comments on these videos either did not yet exist or were disabled on the platform. Overall, the average comment sentiment across all TED talks is 0.06 and the median is 0.02. The following examples show YouTube comments with very positive or very negative sentiment values:

- 'epic epic epic epic epic epic epic epic epic epic epic epic' (sentiment score: 2.77)
- 'wow, sir I am really really inspired by your speech every thing what you explained was awesome but the last minutes of your speech in this video was pure motivation "Yes We Can" that's brilliant' (sentiment score: 2.96)
- 'GO FUCK YOURSELF YOU ARROGANT PRICK GO FUCK YOURSELF YOU ARROGANT PRICK GO FUCK YOURSELF YOU ARROGANT PRICK' (sentiment score: −3.74)
- 'lies, lies & more lies' (sentiment score: −2.30)

After calculating sentiment scores for each video, we use these scores as the dependent variable in a generalized linear model. As independent variables, we incorporate speaker gender and ethnicity and the topic proportions from our structural topic model, while controlling for popularity (views on YouTube) and time (YouTube upload date). Finally, we calculated predicted values for our representation covariates with the R package ggeffects (Lüdecke 2018). Given the limitations of sentiment analyses, we computed another regression model with the same covariates for predicting the number of dislikes for each video.
Model information is available in Supporting Information S4 and the results are very similar to those obtained from the sentiment analysis.
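A toy version of dictionary-based sentiment with valence shifters can illustrate the intuition behind the approach. The word weights below are invented for this sketch and are not sentimentr's dictionaries:

```python
import math

POLARITY = {"like": 1.0, "awesome": 1.0, "brilliant": 1.0,
            "lies": -1.0, "arrogant": -1.0, "epic": 0.8}  # toy weights
NEGATORS = {"not", "never"}
AMPLIFIERS = {"really", "very"}

def sentence_sentiment(sentence):
    """Toy sentence-level score: a polarity word is flipped by a
    preceding negator and boosted by a preceding amplifier, then
    the sum is scaled by sqrt(word count), loosely mimicking
    sentence-based approaches such as sentimentr."""
    words = sentence.lower().split()
    score = 0.0
    for i, w in enumerate(words):
        if w in POLARITY:
            weight = POLARITY[w]
            if i > 0 and words[i - 1] in NEGATORS:
                weight = -weight          # 'not like' flips polarity
            elif i > 0 and words[i - 1] in AMPLIFIERS:
                weight *= 1.8             # 'really like' amplifies
            score += weight
    return score / math.sqrt(len(words)) if words else 0.0
```

So 'I do not like it' scores negative, while 'I really like it' scores higher than the unamplified 'I like it'.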

Descriptive representation
Applying the image recognition algorithm to the images of all TED speakers, we find that overall, 68% of speakers are men and thus only about one-third of all speakers are women. With regard to ethnicity, 80.2% of all speakers are classified as White, with the remainder consisting of 7.3% Black, 5.4% Hispanic, 4.7% Asian and 2.4% other ethnic groups. These numbers show that male speakers present on TED stages substantially more often than women. Moreover, only one in five speakers is non-white. To put this into perspective, combining both attributes shows that 56.2% of speakers are both white and male. Figure 2 further illustrates how the descriptive representation of women and ethnic groups developed over time. Regarding speaker gender, the figure shows that, despite the overall low share of women speakers on TED stages, the trend increased over the years. In fact, in the year of our data collection, up until May 2017, women were for the first time more often present as speakers than men. When it comes to ethnicity, however, the representation of non-white speakers is constantly low across time, with the share of white speakers declining only by a small margin. To put this in context, we can compare numbers for TED talks given between January 2016 and May 2017 with demographics from the United States, where TED is headquartered and many TED conferences took place. As can be seen in Table 1, TED has moved towards an equal distribution of the sexes.
At the same time, white speakers are over-represented, and some ethnic groups, like Asians and especially Hispanics, are still substantially under-represented in comparison to the United States. A much better comparison would be possible with knowledge about the geographical distribution of TED viewers, for instance on YouTube. Unfortunately, TED did not provide such data upon request from the authors. Nevertheless, it can be expected that the share of non-white viewers is higher in many countries outside North America and Europe. Measured against such a global audience, speakers from non-white ethnic groups are still under-represented in TED talks.

Substantive representation
From our structural topic model, we identified one of the 30 topics as strongly related to the substantive representation of women and ethnic groups. We labelled it the inequality topic, as the corresponding talks predominantly address unequal treatment of either women or ethnic groups. The topic accounts for about 3% of all TED talk content. Table 2 shows the most important stemmed terms as indicated by the frex metric, which captures terms that are both frequent in and exclusive to a given topic (Lucas et al. 2015, 19). Labels, proportions and frex terms for all other topics are included in Supporting Information S2.
As indicated by the most important frex terms, the topic captures content about gender (women, men, gender, girl, boy, feminist) and sexuality (gay, sexual, sex). References to ethnicity (black) and several terms related to violence and misuse of power (rape, slaveri, abus, violenc) are also apparent. The stemmed term equal refers to equality, suggesting another important aspect of substantive representation for women and ethnic groups. In addition, we utilized a feature of the structural topic model to find the most representative TED talks, for which corresponding titles are also included in the table. The titles provide further evidence that the topic is strongly related to inequality structures, bias and negative attitudes towards women and certain ethnic groups, but also children.
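One common formulation of the frex metric combines a word's frequency rank within a topic and its exclusivity to that topic via a weighted harmonic mean of empirical CDF values. The sketch below follows that formulation and is illustrative only; stm's actual implementation differs in details:

```python
def frex(topic_word_probs, topic, word_idx, weight=0.5):
    """Weighted harmonic mean of a word's within-topic frequency and
    its exclusivity (probability under this topic relative to all
    topics), both expressed as empirical CDF values."""
    def ecdf(values, x):
        return sum(v <= x for v in values) / len(values)

    n_words = len(topic_word_probs[topic])
    freq = topic_word_probs[topic][word_idx]
    excl = freq / sum(row[word_idx] for row in topic_word_probs)
    freq_cdf = ecdf(topic_word_probs[topic], freq)
    excl_all = [topic_word_probs[topic][v] /
                sum(row[v] for row in topic_word_probs)
                for v in range(n_words)]
    excl_cdf = ecdf(excl_all, excl)
    return 1.0 / (weight / freq_cdf + (1 - weight) / excl_cdf)
```

In a toy two-topic, three-word model, a word that is both most frequent and most exclusive for a topic attains the maximum frex value of 1.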
As topics of a structural topic model are by design allowed to correlate with each other, we can examine which topics frequently co-occur in TED talks. To visualize topic connections, Figure 3 contains output from hierarchical Ward clustering of topic proportions on the left-hand side.
The topic clusters reveal that inequality is correlated with content about family as well as children & school, and to a lesser extent with the topics law & politics, war & terror, money & business and countries & poverty. As we incorporated speaker gender and ethnicity in the estimation process of our topic model, we can assess whether these covariates affect the likelihood of talking about inequalities related to women and ethnic groups. We also incorporated the date of talks to examine whether TED preferences for certain topics change over time. Figure 3(b) includes correlations between all topics and covariates, with p values adjusted for multiple comparisons. As expected, descriptive representation affects substantive representation: women as well as non-white speakers are more likely to discuss the corresponding inequalities on the TED stage. Supporting Information S3 contains estimates for topic proportions dependent on the same covariates, which are in line with the correlation patterns. The output in subfigure (b) further shows that, over time, TED talks increasingly covered content related to inequality. It can also be seen that women are less likely to give talks about computers & technology and that the environment topic is predominantly discussed by white male speakers. In summary, inequalities relevant for women and certain ethnic groups are addressed in a small but increasing share of talks, suggesting that both groups are substantively represented on the TED stage, and members of these groups are themselves more likely to talk about these inequalities.
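The paper does not name its adjustment procedure; the Benjamini-Hochberg step-up correction sketched below is one common choice for adjusting p values across many topic-covariate tests:

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (step-up FDR procedure):
    sort p-values ascending, scale each by m/rank, then enforce
    monotonicity from the largest rank downwards."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, p_values[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted
```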

YouTube sentiment
To recap our final research question, we examine whether the descriptive and substantive representation of women and ethnic groups affects how TED talks are perceived by the digital audience. Figure 4 illustrates regression estimates for the average sentiment of YouTube comments for each TED talk. Topic proportions of topic models always sum to 1, which is why including every topic proportion in a regression model induces perfect collinearity. For the following analysis, we therefore removed one topic that is irrelevant for our research task, labelled work & misc, from our regression model. Figure 4(a) shows standardized coefficients for speaker attributes, topic proportions and controls. Following the recommendation of Gelman (2008), numeric variables were divided by two standard deviations so that they can be compared to the binary variables for gender and ethnicity. Surprisingly, the output suggests that, holding speaker gender and the topical content of TED talks constant, non-white speakers receive more positive comments than white speakers. Supplementary material S3 includes a visualization from an additional regression model with an interaction between ethnicity and date, showing that sentiment differences between white and non-white speakers diminish over time. With regard to speaker gender, the sentiment of talks given by female speakers is generally more negative than for men on the TED stage. However, the most important predictors for receiving negative feedback on YouTube are related to talk content. Average sentiment values are most negative for talks with high proportions of the substantive representation topic (inequality) and for talks about law & politics and war & terror. This could be further evidence for the hostile media phenomenon, i.e. the perception of people with strong pre-existing attitudes that media coverage is biased against their own point of view (Gunther and Chia 2001).
As a result, sentiment values in such highly polarizing talks are also substantially more negative. In addition to standardized coefficients, Figure 4(b.1-b.3) shows effect estimates for the representation covariates, where all other variables were held at their observed values. The output confirms that descriptive representation affects public sentiment, but the effect of substantive representation is stronger. To test the robustness of our results, we also analysed the number of dislikes a TED talk receives on YouTube, using a negative binomial model and the same set of covariates. Regression tables and visualizations for both models are available in Supporting Information S3. Results of the dislike regression model are very similar, also showing more positive sentiment for talks by non-white speakers, but more negative sentiment for female speakers and talks about inequality.
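Gelman's two-standard-deviation scaling used above is a one-line transformation. A Python sketch (the paper's models were fit in R):

```python
import math

def standardize_2sd(values):
    """Gelman (2008): centre a numeric predictor and divide by two
    standard deviations, so that its coefficient is on roughly the
    same scale as that of a binary predictor."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / (2 * sd) for v in values]
```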

Discussion and conclusion
This paper was motivated by our interest in the representation of women and different ethnic groups in the digital sphere. By utilizing automated methods for image annotation and text analysis, we examined to what extent members of both groups give TED talks and whether their specific needs are discussed on stage. We showed that more than half of talks were given by white male speakers and, while the share of talks by women increased over time, it is constantly low for non-white speakers. We further identified a small but increasing share of TED talk content being strongly related to inequalities and the substantive representation of women and ethnic groups. Both women and certain ethnic groups were more likely to discuss such inequalities on stage. Moreover, we were specifically interested in examining the feedback systems of digital platforms as a means to capture public sentiment. Analysis of YouTube comments on TED talks and their (dis-) likes showed that while public sentiment is positive for non-white speakers, it is negative for women and talks about gender and ethnicity related inequalities.
Regarding descriptive representation, it seems as if TED media have increased their efforts to achieve a more balanced gender representation in TED talks. The share of female speakers was below one-third in 2006, but in the first half of 2017, more women than men gave TED talks. One possible reason is that this was a reaction to an earlier study about scientific careers and TED talks, which was covered in the media (Taylor 2013) and revealed an overall low share of female speakers. Nevertheless, while the representation of women shows improvement, the situation is different for non-white ethnic groups in TED talks. Only one in five TED talks is given by a non-white speaker, with no evidence of improvement over time. Digital content providers like TED media should increase their efforts to ensure that talking about science and important matters of societal change on a global stage does not remain a privilege of white people. Otherwise, the under-representation of certain ethnic groups in the digital sphere can, similar to traditional media sources, further reinforce stereotypes and negative attitudes.
When it comes to public sentiment, we showed that female presenters receive more negative feedback on YouTube than male speakers. To our surprise, the sentiment towards non-white speakers was on average more positive than towards white speakers, and this finding holds across additional robustness checks. There is no reason to assume that non-white speakers generally perform better on stage, and negative attitudes would rather be expected towards other ethnic groups than towards white speakers. Future research could enhance our knowledge about this puzzling finding. Regarding substantive representation, our findings provide evidence of negative public sentiment towards talks about inequalities related to women and certain ethnic groups. These talks often contain depressing rather than entertaining content, which might to some extent be reflected in video dislikes and comment sentiment. Some of the YouTube comments in our sample, as hinted at by one of our examples above, contain very harsh language. Nevertheless, they are visible to all users, which raises concerns about YouTube's decision making related to removing obnoxious content.
Although computational methods enable promising social science research, as demonstrated in this paper, their use also comes with important limitations. The image recognition algorithm that we used to measure descriptive representation can only retrieve binary annotations for the sex of persons. These simplified annotations may not be in line with the gender a speaker identifies with; relatedly, gender is increasingly considered a non-binary social construct. Likewise, as mentioned above, the algorithm is limited to recognizing only certain ethnic groups and is unable to fully grasp the concept of ethnicity, which is also a complex social construct. Furthermore, the share of non-white speakers in TED talks was so low that we had to aggregate all non-white categories. With regard to the measurement of public sentiment on platforms like YouTube, our knowledge of the user population is very limited. We cannot assess whether only specific people on YouTube, e.g. people with negative attitudes towards white speakers, prefer to engage via commenting or liking. Nevertheless, regardless of how one interacts publicly on YouTube, these interactions are always visible on the platform. For this reason, quantification of public sentiment in the digital sphere is still useful, even though we do not know what kind of users produce this content. We are, however, only able to measure the sentiment of users who actually commented on talks. Future research could use a more experimental setting to investigate the sentiment of silent users and how commenting behaviour can affect their opinions.
With regard to the use of image recognition in general, the authors of this paper would like to highlight that we agree with the many citizens, scholars and politicians who consider some applications, e.g. mass surveillance systems, troublesome and unethical. We fully support initiatives like the Safe Face Pledge, which provides guidelines for ethical principles of facial analysis technology. However, we still think it is worth studying whether image recognition can also be utilized for good causes, which is why we examined whether its performance is reliable enough for social science research.
Despite the limitations of our work, this paper contributes to the emerging body of literature on the representation of disadvantaged groups in the digital sphere. As a substantial part of that sphere, TED talks are a main institution for popularizing science, reaching millions of people around the globe. Our results raise some concerns, particularly about the representation of certain ethnic groups in these talks. This highlights the importance of speaker diversity for reducing stereotypes about scientists and people driving societal change. However, our knowledge about new features of these platforms, like interactive feedback systems and their utilization for digitally interacting with content by or about minorities, is still limited. We encourage scholars to further examine such feedback systems and the public sentiment in response to the representation of women and ethnic groups on digital platforms.

Disclosure statement
No potential conflict of interest was reported by the authors.