ABSTRACT

Numbers don’t speak for themselves – yet taking numbers for granted (numerism) is widespread. In fact, journalists often rely heavily on numbers precisely because they are widely considered objective. As a team of journalists and social scientists, we undertook a qualitative exploration of clauses and entire news reports that are particularly quantitatively dense. The dense clauses were often grammatically complex and assumed familiarity with sophisticated concepts. They were rarely associated with explanations of data collection methods. Meanwhile, the dense news reports were all about economy or health topics, chiefly brief updates on an ongoing event (e.g., stock market fluctuations; COVID-19 cases). We suggest that journalists can support public understanding by:

  • Providing more detail about research methods;

  • Writing shorter, clearer sentences;

  • Providing context behind statistics;

  • Being transparent about uncertainty; and

  • Indicating where consensus lies.

We also encourage news organizations to consider structural changes like rethinking their relationship with newswires and working closely with statisticians.

Introduction

Researchers have consistently found that news content contains an awful lot of statistics (e.g., Maier Citation2002; Cushion et al. Citation2017). Starting from news users’ perspective, other researchers have focused on the skills people need to make sense of statistics in the news. For example, Gal (Citation2002) and Utts (Citation2003) identify common types of statistical knowledge that crop up again and again in news stories. The authors of this paper are concerned not just with the types of knowledge, but with the density of quantitative content.

We are currently engaged in a multi-year study to understand the relationship between U.S. adults’ news consumption and their quantitative reasoning. This study is part of a long-term collaboration between a social science research group and a major news organization in the U.S. (Barchas-Lichtenstein et al. Citation2020, Citation2021). Our research program treats both media (cf. Couldry Citation2004) and quantitative reasoning (cf. Oughton Citation2018) as social practices, and includes research with journalists on the news production process as well as research with news users on their engagement with news.

This is one of a series of papers drawn from a single mixed-methods study of the current news landscape that sought to assess the amount and types of quantitative reasoning required by a wide array of media sources (Voiklis et al. Citation2022). We focused on four topic areas identified as quantitatively dense (Cushion et al. Citation2017): economics, health, science, and – because we collected data in a presidential election year – politics. Because the authors’ varying professional experience is often key to our interpretation, this paper includes occasional snippets from our ongoing dialogue as footnotes.

Numerism

Numbers do not speak for themselves.Footnote1 All the same, many people believe that they do. An ideology we call numerism, which accords a privileged epistemic status to quantification, is widespread. Within this ideology, numbers are taken for granted. They are assumed to be objective and truthful at their core, and the conditions of their production are elided (cf. Porter Citation1995; McConway Citation2016). As Porter (Citation1995, 8) observes,

The appeal of numbers is especially compelling to bureaucratic officials who lack the mandate of a popular election, or divine right. Arbitrariness and bias are the most usual grounds upon which such officials are criticized. A decision made by the numbers (or by explicit rules of some other sort) has at least the appearance of being fair and impersonal. Scientific objectivity thus provides an answer to a moral demand for impartiality and fairness. Quantification is a way of making decisions without seeming to decide. Objectivity lends authority to officials who have very little of their own.

People may be particularly tempted to make these assumptions of official statistics, which are backed by the resources and authority of states, although they are “beset by all the same problems as other types of numbers” (Barchas-Lichtenstein Citation2021).

The ideology of numerism is linked to the widespread and ultimately false idea that science is value-neutral,Footnote2 and is an extreme form of what historians of science call mechanical objectivity, or the notion that following consistent procedures leads automatically to truth.Footnote3 Porter (Citation1995) traces mechanical objectivity back to large-scale bureaucracies and their need for trust and commensurability across distances, particularly accountants, actuaries, and state engineers. He also observes that mechanical objectivity is impossible in practice, due to the vast amount of tacit knowledge that always underlies the collection of numbers. Mechanical objectivity is a close relative of scientism,

“slavish imitation of the method and language of Science … an attitude which is decidedly unscientific in the true sense of the word, since it involves a mechanical and uncritical application of habits of thought to fields different from those in which they have been formed” (Hayek Citation1942, 259).

Scientism suggests that the methods of the natural sciences are uniquely suited to knowledge production, regardless of the appropriateness of those methods to the questions at hand.

A number of researchers suggest that journalists fall prey to the temptations of numerism at least somewhat frequently. In an audit of a daily newspaper, Maier (Citation2002) observed that common errors included both errors of computation and errors of interpretation. He blames this on “a tendency for reporters and editors to take numbers on faith, overlooking implausible claims and numerical inconsistencies” (Maier Citation2002, 516). Van Witsen (Citation2018, Citation2020) also found that journalists typically take statistics for granted, particularly since they are embedded in reporting routines. In particular, journalists frequently used language that eliminated or minimized uncertainty, rather than recognizing it as inherent in statistical methods.

A recent review article pinpoints numerism among journalists as due, in part, to a misunderstanding about the nature of statistics.

Statistics and mathematics are two different things: It is not necessary to be adept at mathematics to be able to use statistics effectively. However frightening they might look, statistical analyses are about the application of valid reasoning, not calculation. Mistakes are often made in the news, but few involve getting the math wrong: Most are due to flaws in the logic applied to data and their context. (Nguyen and Lugo-Ocando Citation2016, 4-5)

For the most part, statistics use the laws of probability to generalize about a large group (of measurements, data, or people) by looking at smaller subsets of that group. They let us describe things in the context of variability and uncertainty without exhaustive measurement, by identifying what is most likely. However, the applicability of statistical methods and tools depends on characteristics of the data set. As such, both methods and caveats are key contextual elements – but they are infrequently reported in the news (Portilla Citation2016; Bhatti and Pederson Citation2016).

Our previous study (Voiklis et al. Citation2022) found that, on average, health and economics stories in our corpus required far more quantitative reasoning than science and politics stories. We also found an extremely uneven density of quantification: some stories referred to multiple quantitative concepts per sentence, while others barely used any quantification at all. Here, we take a closer look at phrases and full news reports that rely particularly heavily on quantification. We use quantitative reasoning (see Karaali et al. Citation2016; Barchas-Lichtenstein et al. Citation2021), statistical literacy (Gal Citation2002), and statistical reasoning more or less interchangeably, because the numbers we find in the news are largely statistics, as we’ll show below. In particular, we focus on the following two research questions:

RQ1: What characteristics do quantitatively dense clauses share?

RQ2: What characteristics do quantitatively dense stories share?

Data Journalism & Other Quantitative Epistemologies

We ground this study in the research on what others have called journalism’s “quantitative turn” (Coddington Citation2015) or “quantitatively-oriented forms of journalism” (Splendore Citation2016). Coddington (Citation2015) lays out a typology of three forms of journalism (computer-assisted reporting (CAR), data journalism, and computational journalism) that considers both professional and epistemological factors. For him, data journalism occupies a middle ground – it maintains traditional journalism’s focus on storytelling over data, while it breaks with traditional journalism in its understanding of its users as active makers of meaning. Building on this work, Splendore (Citation2016) finds that computational journalism and automated journalism are more “data-driven” than is data journalism, which continues to follow a journalistic logic.

Despite these categorization efforts, practitioners use the term data journalism much more often than these alternatives – and there is still considerable disagreement about precisely what data journalism is. Recent research using a wide range of methods fails to find a single accepted standard (Young et al. Citation2018; Lewis and Waters Citation2018). A content analysis of 26 Canadian finalists and award winners between 2012 and 2015 found a heavy reliance on free digital tools (Young et al. Citation2018), leading to overrepresentation of simple and readily available analysis techniques, a state of affairs similar to what Usher (Citation2020, 252) calls “epistemological debt.” A metajournalistic discourse analysis, looking at definitions of data journalism appearing in 612 news stories from 41 countries, found that numerical analysis is the most common definition overall (Lewis and Waters Citation2018). However, data journalism was more strongly associated with electoral predictions in the U.S. than elsewhere, due in part to associations with Nate Silver and the FiveThirtyEight site.

A related strand of research looks beyond self-identified data journalists to the use of statistics and data by a broader set of journalists – and these studies largely find that statistics go unscrutinized. Van Witsen (Citation2018) notes that many of the journalists he interviewed ascribe a special epistemic status to numbers, believing that “numbers provide direct access to a kind of truth not available from live sources or eyewitness descriptions.” Cushion and colleagues (Citation2017) looked at statistical claims across a wide range of UK media. They found that statistics in news coverage typically reinforced hegemonic institutional voices: journalists did not typically challenge or independently review statistical claims made by politicians and business elites. Similarly, Van Witsen (Citation2020) studied 95 articles covering a single announcement by NASA and NOAA and found that journalists relied on scientists to judge the quality of measurement. In particular, the articles he analyzed were much more likely to treat scientists’ claims as certain and policymakers’ claims as doubtful. Another study found that journalists who cover election polls often lack in-house methodological expertise, and they tend to attribute more precision to results than is warranted (Toff Citation2019).

The journalist co-authors offer additional professional context to these findings: there are often gaps in official data that contribute to which stories go untold, particularly about the experiences of marginalized groups. And skepticism about statistics collected by advocacy groups – as if official statistics somehow provided a “view from nowhere” (cf. Barchas-Lichtenstein Citation2021) – can make it even more difficult to report on these issues at all.

Statistics in Economic and Health Reporting

Our prior research found that economic and health stories were far more quantitatively dense than science and politics stories (Voiklis et al. Citation2022). These results are largely consistent with earlier work. For example, Cushion et al. (Citation2017) found that 75% of economy stories contained at least one statistic, as did 38.5% of health stories; only 32.5% of politics stories and 24% of science stories did so.Footnote4 We focus here on the first two topic areas because they require somewhat different types of quantitative reasoning.

Economic reporting has typically focused on (quantitative) indicators at the expense of focusing on mechanisms for change (Jensen Citation1987). Specialist financial outlets may well be more statistically dense than economic reporting in other outlets (Manning Citation2013), since these outlets follow their own set of news values and see their primary role as informing investors (Doyle Citation2006; Boukes and Vliegenthart Citation2020). Some researchers have noted that economic and financial journalism could promote public literacy about many of these indicators, but there is no consensus among these journalists about whether they are supposed to serve such a role (Doyle Citation2006; Tambini Citation2010). Other studies have noted that economic journalism can sometimes create feedback loops: journalists may treat statistical estimates as more precise than is warranted, which can magnify the importance of small fluctuations in the stock market and create panic (Manski Citation2015; Hope Citation2010; Kleinnijenhuis et al. Citation2013).

Health information more generally is characterized by pervasive and highly complex numerical information (see Reyna et al. Citation2009, for a review), and health journalism is no exception. A content analysis of two women’s magazines, Cosmopolitan and Latina, found that more than half of all health stories contained numbers, even though most stories were quite short: less than one page (Len-Ríos and Hinnant Citation2014). Health journalists report that they find it important to include data and statistics in their reporting, which may be due to a lack of awareness of general numeracy, a belief that readers are relatively health-literate, or a desire for credibility (Hinnant and Len-Ríos Citation2009).Footnote5 Given that their audiences likely have diverse literacies and diverse information needs, journalists use a range of practices to try to increase both accessibility and credibility (Hinnant et al. Citation2012). Some common challenges in health journalism include anchoring effects, misapplications of conditional probability, confusions between correlation and causation, and implausible precision (Paulos Citation1996).

The COVID-19 Context

We collected our news stories in late February 2020. COVID-19 is the single most frequent topic in our data corpus. At least 62 of the 230 stories address either health or economic concerns related to the virus. This is all the more surprising because the U.S. media was still largely treating COVID-19 as a distant problem at that time. An additional 11 stories address influenza without mentioning the coronavirus. It is obvious in retrospect that there was already community transmission of COVID-19 in the U.S., and many “influenza-like illnesses” may have been unrecognized cases of COVID-19.

COVID-19 journalism has focused heavily on numbers while “gloss[ing] over the rather messy procedures used to create those numbers” (Best Citation2020, 4). Within our data corpus, the numbers often take prominence over the economic and geopolitical circumstances behind them (cf. Briggs and Nichter Citation2009; Briggs Citation2011). Even within the strictly quantitative realm, we have seen the issues raised by Ancker (Citation2020): comparing absolute numbers instead of population-adjusted proportions, the absence of benchmark or threshold values, and a high level of precision without acknowledging the attendant uncertainty.

Methods

Data Collection and Processing

Using Google News, we collected the text of six news reports per day for one week in February 2020 in each of the following categories: Business, Science, Health, and Politics. We defined a news report operationally as the complete text or transcribed audio from a single hyperlink; we also use “story” to refer to this unit of content. We used the first six reports collected each day, after excluding those that were either (a) from sources outside the U.S. or (b) marked as opinion pieces, and removing all duplicates in this data set so that no story was represented more than once. In addition, we collected the top three reports a day from the same content areas on the PBS NewsHour website to ensure that public media – which has an explicit mission to educate the public – was well represented. The size of the data set was determined to provide sufficient data for our initial goal (to develop a machine-coding algorithm) without creating an overly burdensome coding task for us human researchers.

We collected both videos and text stories and replaced each video with its official transcript or captions. Since punctuation has no analogue in speech, we preferred to rely on journalists’ segmentation rather than our own, so that we would be comparing like with like. However, two of the 40 videos in our corpus did not have official transcripts available; we transcribed these by hand, using pause length as a proxy for punctuation. After all stories were represented in text, we automatically parsed each report into clauses, defined by the presence of periods and semi-colons rather than grammatically, in order to speed the coding process.Footnote6 See Voiklis et al. (Citation2022) for details of data collection and processing, including inter-rater reliability.
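The punctuation-based segmentation described above can be illustrated with a short sketch. This is not the parser used in the study, whose implementation is not described here; the function name `split_into_clauses` and the sample sentence are hypothetical. The sketch assumes that a clause ends at any period or semi-colon that is followed by whitespace.

```python
import re


def split_into_clauses(text: str) -> list[str]:
    """Split text into orthographic clauses at periods and semi-colons.

    Hypothetical helper for illustration only. A boundary is any '.' or ';'
    followed by whitespace, so decimals such as "0.4%" stay intact.
    """
    # The lookbehind keeps the punctuation mark attached to its clause.
    parts = re.split(r"(?<=[.;])\s+", text.strip())
    return [part for part in parts if part]


sample = "Stocks fell 0.4%; bonds rose. Analysts were surprised."
for clause in split_into_clauses(sample):
    print(clause)
```

A rule this simple also splits after abbreviations such as "U.S." when they are followed by a space, which is one reason a purely orthographic definition of the clause trades accuracy for coding speed.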

Coding and Codebook

A team of three researchers coded the stories in our database using Dedoose, a qualitative data analysis software package. We coded each clause within each story (Appendix A), as well as each news report as a whole (Voiklis et al. Citation2022). Story-level codes were largely developed top-down, based on Gal’s (Citation2002) statistical literacy framework identifying key components of statistical literacy required for engagement with news, particularly news magazines. Clause-level codes were largely developed bottom-up through discussion and iteration based on a preliminary review of a number of news reports not included in the final data set.

Close Reading

While the story-level codes provide a sense of the understanding needed at the big-picture level, they cannot tell us how dense a story is, much less a clause within that story. To explore this question of density, after all data were coded at both levels, the first author identified the clauses that had the largest number of clause-level codes, as well as stories that had a particularly high number of these clauses. In total, 166 clauses contained four codes or more; these clauses came from 72 different stories. The first author also identified a small set of stories that had the highest density of codes. Out of the 230 stories in the data set, only 13 had an average of 2 or more codes per clause. A close reading of these clauses and reports allowed us to identify concepts in each content area that required particularly sophisticated quantitative reasoning and – in some cases – specialist knowledge.
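The two selection rules described above (clauses with four or more codes, and stories averaging two or more codes per clause) can be sketched as follows. The data layout, function names, and toy rows are assumptions made for illustration; they are not the Dedoose export format or the study's data.

```python
from collections import defaultdict
from statistics import mean

# Each row is (story_id, clause_index, number_of_clause_level_codes).
# These rows are invented toy values, not data from the study.
rows = [
    ("A", 0, 4), ("A", 1, 2), ("A", 2, 0),
    ("B", 0, 0), ("B", 1, 1), ("B", 2, 4),
]


def find_dense_clauses(rows, min_codes=4):
    """Return the rows whose clause carries at least min_codes codes."""
    return [row for row in rows if row[2] >= min_codes]


def find_dense_stories(rows, min_mean=2.0):
    """Return story ids whose mean codes per clause is at least min_mean."""
    counts_by_story = defaultdict(list)
    for story_id, _clause_index, n_codes in rows:
        counts_by_story[story_id].append(n_codes)
    return sorted(
        story_id
        for story_id, counts in counts_by_story.items()
        if mean(counts) >= min_mean
    )


print(find_dense_clauses(rows))  # both stories contribute one dense clause
print(find_dense_stories(rows))  # only story "A" averages 2 or more codes
```

In the toy data both stories contain a dense clause, but only one is dense overall, which mirrors the distinction the analysis draws between dense clauses and dense stories.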

Collaborative Dialogue

The production of a multiple-authored paper is always dialogic, and this work is no exception. Three co-authors of this paper were employees of a well-known journalism outlet; the other six were employed by a social science research institute. Two of those six have formal training and experience as journalists. Each author’s professional experience and judgment played a critical role in our interpretation of the data we present here.

We therefore introduce a convention for bringing some of these “backstage” conversations to the fore. When a specific author’s positionality strongly informs our collective interpretation, we quote their comments (from meetings, email correspondence, or comments on earlier drafts) using their initials to highlight the importance of their standpoint.

Characteristics of the Data set

Before turning to our analysis, we outline the characteristics of the data set we collected. Our data set contains a total of 230 stories collected between February 18 and February 24, 2020. During this period, a single topic — the spread of COVID-19 and associated economic shocks — was already somewhat dominant; over one-fourth of the stories address this topic, including many stories that were not considered Health stories.

Of the 230 stories, 167 were collected through Google News while 63 came from PBS NewsHour, including stories syndicated from the Associated Press. Only 40 stories were videos, while the other 190 were in text format. They were well-balanced among the four content areas, ranging from 53 science stories to 61 politics stories.Footnote7 The politics stories were somewhat longer than stories in the other content areas (Table 1).

Table 1. Number and median length of stories in each content area.

The 230 stories in the data set were produced by 74 distinct outlets. Two researchers categorized each outlet in two different ways. First, we noted whether the outlet was a legacy media source or an online-first publication in case there were differences between them. Of these outlets, 38 were online-first, 35 were legacy media, and one did not fit into either category. Second, we clustered these outlets into 12 general categories (Table 2), given research suggesting that specialist publications tend to assume higher levels of audience knowledge (e.g., Manning Citation2013).

Table 2. Categories of producing outlets.

Among the stories collected through Google News, there were notable differences between content areas (Figure 1). Nearly all economy stories were produced by national outlets with a general audience or outlets specializing in the economy. Meanwhile, a plurality of health stories were produced by local news outlets, followed closely by newswires, national outlets with a general audience, and outlets that specialize in the economy. National outlets with a general audience produced almost half of the politics stories in this corpus, followed by outlets specializing in the economy and then by political specialist outlets and local news outlets. Finally, more than half of the science stories were produced by outlets specializing in STEM, followed by national outlets with a general audience.

Figure 1. News outlet types by topic area.

A graph showing the distribution of topic area by category of news outlet. The data table is available at https://bit.ly/3yCf90Y.

Each of the four content areas shared some general characteristics that the coders noted during the classification process. We delve into the details of these characteristics in our analyses:

Economy articles were generally fairly code-dense, reporting on such topics as interest rates, profits, and stock prices. Many articles addressed change over time in these economic variables, often due to some external change. Most articles also included some sort of quantifiable prediction or forecast. The coder also noted that they did not code references to “the Dow” as Central Tendencies unless it was explicitly specified as “the Dow Jones Industrial Average” because readers unfamiliar with this number might not know how it is calculated.

Almost all Health stories were about COVID-19 or the seasonal flu. These stories often included references to case counts “confirmed” by local health authorities, sometimes including considerable detail about who was or was not included in such counts. We also saw a number of references to clusters and outbreaks in specific areas. Some of these stories also focused on the likelihood of WHO declaring COVID-19 a pandemic (which it eventually did). Comparisons between the two diseases (COVID-19 and influenza) were also somewhat frequent.

The numbers that showed up most frequently in politics stories typically referred to years or dollars. Compared to Economy and Health stories, magnitudes were less precise: “thousands of voters” or “millions of dollars.” Even politics stories that had several clause-level codes present did not necessarily require any statistical literacy to be understood. Stories that did require statistical literacy were typically about elections, addressing issues like demographic differences in political attitudes, “electability,” and various candidates’ likelihood of winning.

Science stories about research studies had quite a few statistical concepts. Space exploration was a big topic the week we collected data, due to announcements that SpaceX was planning to launch tourists into orbit and that Japan was planning a Phobos mission. Stories about these announcements typically only included information about distances, costs, and perhaps comparisons to earlier missions.

Quantitatively Dense Clauses

In general, the quantitative analysis showed that health and economy stories received more story-level and more clause-level codes than politics and science stories (Voiklis et al. Citation2022). And quantitative information was distributed quite unevenly within stories, with some clauses considerably denser than others. Of the 9504 clauses in our data set, more than half were coded as lacking quantitative information. However, the mean clause received 0.77 codes due to the presence of clauses that received multiple codes (Table 3).

Table 3. Number of codes per clause in our data set.

Quantitative information was also distributed quite unevenly across stories. The 166 clauses with 4 codes or more came from 72 different stories (Table 4). A relatively small number of stories (19) accounted for more than half (n = 87) of those clauses. In other words, these dense clauses were split between stories that contained only one or two such clauses (53 stories) and those with three, four, or even eight dense clauses (19 stories). Similarly, a small number of outlets were responsible for the bulk of these clauses: 34 (20.5%) of them were found in Associated Press stories, 16 (9.6%) in CNBC stories, 15 (9%) each in Fox News and CNN stories, and 14 (8.4%) in New York Times stories.Footnote8 All other outlets accounted for 8 such clauses or fewer. All told, national outlets accounted for 48 (28.9%) such clauses, economy specialists for 42 (25.3%), newswires for 40 (24.1%), and partisan outlets for 17 (10.2%).Footnote9 The other 19 were divided between the various outlet types.

Table 4. Distribution of high-code clauses.

Nearly half of those 166 clauses – 77 of them – were in Economy stories, while 53 were in Health stories. Only 24 were found in politics stories, and 12 were found in science stories.
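Readers who want to check the shares reported above can do so with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable and function names are ours and are purely illustrative.

```python
# Counts of dense clauses (four or more codes) as reported in the text.
TOTAL_DENSE_CLAUSES = 166

by_outlet_type = {
    "national": 48,
    "economy specialist": 42,
    "newswire": 40,
    "partisan": 17,
}
by_topic = {"economy": 77, "health": 53, "politics": 24, "science": 12}


def share(count, total=TOTAL_DENSE_CLAUSES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


outlet_shares = {name: share(n) for name, n in by_outlet_type.items()}
remainder = TOTAL_DENSE_CLAUSES - sum(by_outlet_type.values())

print(outlet_shares)           # percentages for the four named outlet types
print(remainder)               # clauses left for all other outlet types
print(sum(by_topic.values()))  # topic counts should add up to the total
```

The four named outlet types reproduce the reported percentages and leave 19 clauses for the remaining types, and the four topic counts sum to 166.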

Discourse Features

There is near-total agreement among journalists that news stories should follow a structure known as the “inverted pyramid”: they should begin with the big picture and zoom in on details as they go on.Footnote10 Given this structural convention, the location of sentences within stories tells us something about their relative importance. When dense quantification appears early in a story, it suggests that the story is about quantification in some meaningful way, rather than treating the quantification as supporting detail.

Seven stories begin with code-dense clauses, and all seven of them follow the pattern described above. The numbers are the story, in some meaningful sense:

Excerpt 1: “SEOUL, South Korea (AP) — South Korea reported an eight-fold jump in viral infections Saturday with more than 400 cases mostly linked to a church and a hospital, while the death toll in Iran climbed to six and a dozen towns in Italy effectively went into lockdowns as health officials around the world battle a new virus that has spread from China” (Kim Citation2020).

Excerpt 2: “The World Health Organization confirmed that over 77,794 people are infected with the coronavirus, with 2,348 deaths in China and 11 deaths outside of China” (Newberger Citation2020).

Excerpt 3: “China said on Thursday it lowered its benchmark lending rates — a move that was widely expected by analysts as the world’s second-largest economy faced threats from an outbreak of a deadly coronavirus” (Lee Citation2020).

In addition to these three examples, this group included stories about the pre-emptive cancellation of the Venice Carnival (Bruno and D’Emilio Citation2020), a White House economic report (Boak Citation2020), the relative deadliness of COVID-19 and flu (Colen Citation2020), and recent flu deaths (MacAneny Citation2020). All but one of these stories had quantitative headlines. Another 24 dense clauses appeared in sentences 1–5 – and 21 more in sentences 6–10. In general, stories with dense clauses appearing early were heavily quantified throughout.

Grammatical Complexity

One journalist co-author (IIT) highlighted a tendency to switch to academic language when discussing numbers that may itself be rooted in conceptions of how “objectively” important and “official” they are. Across topics, many of the clauses that contained particularly large numbers of codes were grammatically complex, with multiple grammatical clauses. Consider three examples.

Excerpt 4: “In Virginia, only 16 of 122 licensed hospitals provide sexual assault forensic exams, and only about 150 of the state’s 94,000 registered nurses are credentialed forensic nurses, according to a 2019 study by the Virginia Joint Commission on Health Care” (Lavoie Citation2020).

Excerpt 5: “All demographic groups, all groups of Americans from whites to Hispanics, African Americans, women, are looking at record low unemployment levels and Americans in Gallup poll seem to give this president a tip on that one” (Fox News Citation2020).

Excerpt 6: “In a Fox Business television interview last week, White House economic adviser Larry Kudlow said the epidemic could lower U.S. GDP by 0.2% or 0.3% in the first quarter, though he added there was a ‘lot of uncertainty' surrounding that estimate” (Miller Citation2020).

An online reading level calculator placed all three of these at a “college graduate” level of difficulty (Kincaid et al. Citation1975). That is, complicated concepts were often tightly packed.Footnote11
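To show how such readability scores are typically computed, the sketch below implements the Flesch-Kincaid grade level and the Flesch reading ease formulas associated with Kincaid et al. (1975). We do not know which calculator or which formula produced the rating reported above, so this is an assumption. The functions take word, sentence, and syllable counts as inputs because automatic syllable counting is itself only approximate; the example counts are invented.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level: higher values mean harder text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch reading ease: lower values mean harder text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Invented counts: 100 words in 5 sentences with 150 syllables.
print(round(flesch_kincaid_grade(100, 5, 150), 2))
print(round(flesch_reading_ease(100, 5, 150), 3))

# The same words and syllables packed into a single sentence score as harder.
print(flesch_kincaid_grade(100, 1, 150) > flesch_kincaid_grade(100, 5, 150))
```

Both formulas depend on words per sentence, which is why a single long sentence containing several grammatical clauses is rated as difficult even when its vocabulary is ordinary.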

Topical Features

Each topic area also included specific concepts that tended to be particularly quantitatively dense. We discuss them in turn.

Economics

Understanding economics stories often requires quantitatively dense concepts. For one, stories about the stock market tend to report change over time in multiple ways. Specifically, change is often reported in both percentage and absolute values.

Excerpt 7: “The Dow Jones Industrial Average (DJIA) advanced 115.84 points, or 0.4%, to 29,348.03” (Marketwatch Citation2020).

This clause received the following codes: comparison because it reports change over time, magnitude and scale because it provides the absolute value and absolute change, proportion or percentage since it reports a percentage change, and central tendencies and exceptions since the DJIA is, after all, an average.
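The arithmetic that Excerpt 7 leaves implicit can be made explicit. The sketch below recovers the previous close from the two figures quoted in the excerpt and checks that the absolute and percentage changes agree; the function name is ours, and the previous close is derived rather than reported in the story.

```python
def describe_change(new_value: float, absolute_change: float):
    """Return (previous value, percentage change) implied by a reported change."""
    previous = new_value - absolute_change
    percent_change = absolute_change / previous * 100
    return previous, percent_change


# Figures quoted in Excerpt 7: up 115.84 points to 29,348.03.
previous_close, percent_change = describe_change(29348.03, 115.84)
print(f"Previous close: {previous_close:.2f}")
print(f"Percentage change: {percent_change:.1f}%")
```

The percentage is computed against the previous close, not the new value. A reader has to know this to see that "115.84 points" and "0.4%" are two descriptions of the same change.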

Stories reporting on other major economic indicators—including GDP, wages, and unemployment—tend to use similar formulations. Even when they do not provide both absolute and percentage change, they sometimes provide comparisons between groups that are themselves defined quantitatively.

Excerpt 8: “Wages at the 95th percentile grew by 4.5% last year, while the median increase was just 1%” (Tappe Citation2020).

This clause asks readers to understand the percentage change over time (comparison) in median (central tendency) wages, as well as the variability in change over time at different parts of the wage spectrum. Consider how much easier it is to understand the following version: Workers earning the highest salaries saw a 4.5% increase, while the typical worker saw an increase of just 1%.

Quantifiable forecasts and projections also figure prominently in economics reporting, including comparisons of current data to earlier projections.

Excerpt 9: “But Trump’s tax cut was not part of that forecast — and the budget deficit is 42% — or $301 billion — higher today than the CBO estimated at the time” (Boak Citation2020).

The reference to forecasting was coded as probability, while the comparison is reported as both a percentage and a magnitude.
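Excerpt 9 also leaves the underlying levels implicit. As a rough sketch, and assuming the 42% and the $301 billion describe the same gap relative to the same CBO estimate, the two levels can be backed out as follows. Both derived values are approximations from rounded figures and do not appear in the source.

```python
gap_billions = 301.0   # dollar gap quoted in Excerpt 9
gap_share = 0.42       # the same gap as a share of the CBO estimate (42%)

# Assumption: gap = gap_share * estimate, so estimate = gap / gap_share.
implied_estimate = gap_billions / gap_share
implied_current = implied_estimate + gap_billions

print(f"implied CBO estimate: about ${implied_estimate:,.0f} billion")
print(f"implied current deficit: about ${implied_current:,.0f} billion")
```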

Health

Among health stories, epidemiological concepts typically received multiple codes. In particular, many clauses focused on official reports of confirmed COVID-19 cases. Consider the following:

Excerpt 10: “On Monday, South Korea, the hardest hit country outside China, reported 231 more cases, bringing its total to 833 cases and seven deaths” (Landler Citation2020).

Reports of official state counts always received the official statistics and enumeration codes, as well as the magnitude code on most occasions, since absolute numbers were typically reported. This sentence also included a comparison over time, as well as a reference to concentration (“the hardest hit country outside China”).

In addition to a heavy reliance on official statistics, health stories also often included references to central tendencies and exceptions, nodding either to record numbers or to typical ones. They also included frequent comparisons between different groups or places, such as in excerpt 11:

Excerpt 11: “While the flu season so far has been harshest on children - Minnesota already has reported a record 762 outbreaks of flu-like illnesses in schools - the CDC reported that the vaccine has been more effective for that age group” (Olson Citation2020).

Because many stories addressed both the emerging COVID-19 crisis and its effects on the global economy, the concepts addressed above showed up in Economy, Health, and Politics stories.

Science

We could not easily identify patterns among science stories because relatively few of them contained these kinds of clauses. We did, however, notice a tendency to report quantitative research results without the necessary context to evaluate or interpret those results.

Excerpt 12: “A study in 2018, for example, found methane emissions from oil and natural gas were 60 percent higher than those reported by the US Environmental Protection Agency” (Cassella Citation2020).

Excerpt 12 mentions official statistics and compares results from a recent non-official study to the official EPA numbers, using a percentage. It also implies that at least one set of numbers may not be generalizable, likely due to sampling or methodological error - but does not provide any insight into why these numbers differ or how they are collected. Nor is that information provided elsewhere in the article.

Among both health and science stories, some reports on research findings contained similarly code-dense clauses.

Excerpt 13: “An analysis of survey data from more than 800,000 U.S. adults found skin cancer may be more common among gay and bisexual men and people who are gender non-conforming, researchers report in JAMA Dermatology” (Reuters Citation2020).

By drawing our attention to the magnitude of the sample, this sentence refers to sampling as a research method. It also references variability among different groups of adults, and specifically of an implied proportion – the rate of skin cancer incidence. Science stories contained a broader range of topics than the other areas, making it difficult to isolate any one set of concepts that were particularly quantitatively dense.

Politics

Highly quantitative concepts appeared most frequently in politics stories related to polling and elections. We treated major polling organizations Gallup and Pew as official statistics organizations, since they have a quasi-official status and typically make all findings public. Journalists also tend to treat them similarly as matters of “routine” reporting (see Toff Citation2019; Van Witsen Citation2018, Citation2020).

Excerpt 14: “That's very similar to its previous poll and marks the first time since January 2017 that Trump has a net (approval - disapproval) positive approval rating in the Gallup poll” (Enten Citation2020).

Excerpt 14 includes a comparison to a prior poll, a recognition of central tendencies in that this poll is exceptional compared to the prior three years, and a percentage—although both approval and disapproval ratings are left unspecified. Arguably, it also contains a reference to sampling, but a subtle one.

The term electability recurred in a number of politics stories. While it was rarely if ever quantified directly, we saw relatively frequent comparisons of electability between candidates for office. We also understood the concept as a reference to probability, however indirect; it was often linked explicitly to head-to-head polls and discussed in terms of variability among states or groups of voters.

Quantitatively Dense Stories

Shifting to quantitative density at the story level, the stories in our data set received between 0 and 3.31 codes per clause. The median story received 0.77 codes per clause, and 13 stories received 2 codes per clause or more. Interestingly, the number of code-dense clauses in a story did not correlate particularly well with its overall density.
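The difference between the two measures can be shown with a small sketch. The stories and code counts below are invented for illustration and are not taken from the data set. The threshold of four codes for a dense clause follows note 6.

```python
from statistics import median

# Hypothetical data: codes assigned to each clause of three invented stories.
stories = {
    "short_update": [2, 3, 2],
    "long_feature": [4, 5, 0, 0, 0, 0, 0, 0, 0, 0],
    "light_story": [0, 1, 0, 0],
}

DENSE_CLAUSE_THRESHOLD = 4  # clauses with 4 codes or more (see note 6)

# Story-level density: total codes divided by number of clauses.
density = {name: sum(codes) / len(codes) for name, codes in stories.items()}

# Clause-level measure: how many individual clauses meet the threshold.
dense_clauses = {
    name: sum(1 for n in codes if n >= DENSE_CLAUSE_THRESHOLD)
    for name, codes in stories.items()
}

for name in stories:
    print(f"{name}: {density[name]:.2f} codes per clause, "
          f"{dense_clauses[name]} dense clause(s)")
print(f"median density: {median(density.values()):.2f}")
```

In this invented example, the short update is the densest story overall yet contains no dense clauses, while the long feature has two dense clauses but a much lower overall density.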

Looking at the most code-dense stories in our data set (Table 5), two features immediately stand out: First, this group includes only stories categorized under Economy and Health.Footnote12 And second, these stories are all shorter than the median story, which was 32 clauses long. (Meanwhile, the stories with the greatest number of quantitatively-dense clauses (Appendix B) were more varied in length, and included three science stories and a politics story.)

Table 5. The most code-dense stories in our data set ( ≥2 codes per clause).

Not only were these stories consistently dense throughout, they also addressed a relatively small number of topics. Six of the seven health stories focused on influenza in the US, including one that compared the relative danger of influenza and COVID-19 and one that compared Americans’ attitudes towards these two illnesses. The last health story reported on COVID-19 globally. All of these stories included numbers that reported on the scale of the 2019-2020 flu outbreak, and nearly all of them compared statistics across ages and geographies, which required an ability to understand multiple quantitative concepts nested within one another. (We caution that these findings may be due to the timing of data collection in February 2020, before there was awareness of COVID-19 community spread in the U.S.)

Meanwhile, two of the six economy stories (Marketwatch Citation2020; Newburger Citation2020) also contained quite a lot of COVID-related statistics. Both of these articles reported on national and global economies, rather than on specific conditions. Another two stories reported on growth in U.S. manufacturing (Cox Citation2020; Robb Citation2020), one discussed U.S. economic growth as a whole (Boak Citation2020), and the last reported on U.S. income inequality (Tappe Citation2020).

Five of these stories were produced by financial specialty publications (Bloomberg, Marketwatch, and CNBC), one by a medical specialty publication (Medscape), two by the Associated Press, one by CNN, and one by The Blaze, a conservative source of online news.

The technical publications, in particular, seem to be geared towards expert audiences, leaving a great deal of background information unexplained and using jargon without remark. For example, one of the CNBC pieces (Cox Citation2020) includes the following text:

Excerpt 15: “Early in the week, New York's Empire State Manufacturing Survey for general business conditions posted a reading of 12.9, up 8 points from January and its best level since May. … The indexes are percentage measures of companies expecting growth or contraction.”

While this text attempts to contextualize the numbers, the guidelines it provides are easily misinterpreted. One might think that only 12.9% of businesses surveyed reported growth; however, this statistic is actually the difference between those who reported growth and those who reported decline. Yet a different technical publication’s piece on the same topic explained it much more clearly and concisely:

Excerpt 16: “The [Philadelphia] regional Fed bank’s index jumped to 36.7 in February from 17 in the prior month. Any reading above zero indicates improving conditions” (Robb Citation2020).
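The difference-based reading described for Excerpt 15 can be sketched in a few lines. The response shares below are hypothetical, chosen only so that the result matches the 12.9 reading. The actual survey responses are not reported in the excerpt, and the function name is illustrative.

```python
def diffusion_index(pct_better: float, pct_worse: float) -> float:
    """Share of respondents reporting improvement minus share reporting decline."""
    return pct_better - pct_worse

# Hypothetical shares (percent of respondents); the remainder reported no change.
pct_better = 35.0
pct_worse = 22.1

reading = diffusion_index(pct_better, pct_worse)
print(f"index reading: {reading:.1f}")

# A reading of 12.9 does not mean that 12.9% of firms reported growth:
# in this invented example, 35% did. Any reading above zero means more
# respondents reported improvement than decline.
```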

Similarly, the Medscape piece (Brown Citation2020) includes the following terms without definition or explanation: influenza activity, influenza A(H1N1)pdm09 viruses, cumulative hospitalization rate, epidemic threshold, vaccine effectiveness estimates. And in fact, Medscape is most explicit about its expert audience, describing itself as “the leading online global destination for physicians and healthcare professionals worldwide, offering the latest medical news and expert perspectives” (Medscape, Citationn.d.). Even the local news stories tacitly assumed relatively high levels of background knowledge: they included overall statistics as well as information about the distribution by age, geography, and flu strain (Austin Citation2020; MacAneny Citation2020; Olson Citation2020).

Discussion & Implications

Under strained staffing levels in an increasing number of newsrooms in the United States, journalists work under tight deadlines to collect facts, conduct interviews, analyze data, write copy, take photos, gather sound, shoot and edit video, create graphics, and lay out and publish the story. Amid historic industry-wide layoffs and attrition, a single journalist at a growing number of outlets is often required to perform most of these roles. That single journalist is likely a generalist who may not have particular expertise in statistics, and as such may not be able to challenge official numbers or provide relevant methodological background.

All the same, that work is vital to help the public better understand their world and what is at stake during a global pandemic. Journalists must constantly negotiate the pressures of doing their job accurately, fairly and speedily with the responsibility of serving people’s need to stay well-informed. That means listening in on government briefings and parsing official data and statistics to spot trends or findings that deserve a broader audience’s attention. And the journalist’s role becomes more difficult when insufficient testing obscures outbreak patterns, inaccurate record-keeping misrepresents vaccine equity, or stonewalling ignores questions and defies accountability. Each story represents a valiant effort to deliver on that complex promise to tell the truth, despite those obstacles and more.

Returning to the research questions, we asked:

RQ1: What characteristics do quantitatively dense clauses share?

RQ2: What characteristics do quantitatively dense stories share?

Quantitatively dense clauses (RQ1) can occur anywhere within the “inverted pyramid” story structure. When they appear very early, they likely indicate numbers that are newsworthy in and of themselves. When these sentences appear later in a story, they typically provide supporting details for the larger story. Across story placement, they share several traits. They are often grammatically complex, with multiple clauses. Even before accounting for content, this complexity means they are relatively difficult to understand. Many of them assume familiarity with sophisticated quantitative measures like economic indicators and epidemiological concepts. Audiences who lack this prerequisite knowledge may find this type of writing inaccessible, particularly because these sentences and the stories that contain them rarely take the time to fully explain research methods. In particular, references to official statistics typically left data collection methods unquestioned and unexplained (cf. Barchas-Lichtenstein Citation2021). Without an understanding of the underlying methods (e.g., how BLS calculates unemployment), news users may not be prepared to make meaning of changes and trends, particularly at times of social disruption (e.g., COVID-19, cf. Irwin Citation2020).Footnote13 In combination, all of these traits seem to suggest these journalists are speaking to a more sophisticated target audience – and may be leaving typical news users behind.

We also saw commonalities among quantitatively dense stories (RQ2). They were all short pieces in either the Economy or Health area. Many of them reported on fluctuations in an ongoing or recurring event (e.g., the stock market; flu season). Such stories focused heavily on statistics. Like the dense sentences, these stories typically took familiarity with sophisticated terminology for granted. Some of these stories also provided multiple approaches or competing statistics without explicitly articulating the relationship between them. For example, stories about COVID-19 provided multiple different estimates of key figures like transmissibility or mortality – often without explaining the assumptions or methods that led to those results.Footnote14

Nearly every type of source was responsible for at least one dense clause. National news sources, economy specialists, and newswires collectively produced more than three-quarters of these dense clauses. Once we considered overall representation in the database, we found that economy specialist outlets and newswires were largely responsible for dense clauses. Economics specialty publications also produced a plurality of dense stories, with local news and newswires also contributing a considerable share.

Audience design likely accounts for the high density of economy specialist and newswire pieces, but we believe the highly dense local news stories have a different explanation. Manning (Citation2013, 179) observes that “specialist news outlets, such as the Financial Times, operate with distinct news values on the assumption that their target readership is more inclined to absorb detailed financial data.” This observation likely explains our results. Similarly, newswire pieces are written with an audience of journalists in mind – and journalists likely have a more sophisticated statistical understanding than the general public (McConway Citation2016). For local outlets, on the other hand, highly dense stories may be a matter of resource constraints. All three local stories in this category were about the status of the flu season. These are “routine” stories (cf. Van Witsen Citation2018, Citation2020), which are produced very quickly. This may be especially true for local outlets, which are under tremendous financial pressure. On top of that, these stories were written at a time when journalists reporting on a health beat were grappling with one of the largest and most stressful stories of their careers to date.

Implications

Given these common characteristics, the authors make five recommendations to journalists – and two more to news organizations – who seek to help the public make sense of statistics.

First, to help news users make sense of results, provide more detail about research methods. Most of the references to methods in our corpus were brief nods, not detailed explanations. News users need to know what researchers or government agencies did. They also need to know why they did it. For the public to think critically about numbers, they need to understand the trade-offs between different procedures for generating them. Journalists cannot assume these trade-offs are self-evident. In particular, not all news users may understand the concepts of probability and ambiguity that underlie all generalizable research.

Second, write shorter, clearer sentences. Breaking up quantitative information across sentences is one of the simplest ways to demystify it. Rather than expecting audiences to make sense of multiple comparisons, stick to one concept per sentence. Norris and Phillips (Citation2003, 224) define “reading and writing when the content is science” as “the fundamental sense of scientific literacy.” We suggest that the same is true for quantitative reasoning more broadly: fundamental literacy plays a role. Simplifying the language used and the sentence structures, then, should leave more processing ability available for quantitative reasoning.

Third, provide context for interpretation. Gal and Ograjenšek (Citation2017) note that the public needs a better understanding of the system of official statistics and the principles that underlie it. Specifically, the public needs to understand that there are shared principles and methods, and that official statistics data collection procedures are frequently designed for comparability across times and places. We agree with health informatics professor Jessica Ancker (Citation2020, 4), who suggests that journalists always answer five key questions rather than taking public understanding for granted:

  1. What does the number mean in English and why is it important?

  2. What’s the possible range of the number?

  3. Are there important categories or thresholds to interpret it?

  4. What comparison values might help the reader understand the importance of the number?

  5. Is there uncertainty about the number?

Answering these questions, or linking to resources that do, would improve quantitative coverage across the board. Visual and interactive materials can also be valuable here: some people have an easier time interpreting a graph or chart while others prefer text.

Fourth, be transparent about uncertainty. We frequently saw point estimates reported with little or no information about likely ranges – but not all news users understand the distinction between different types of uncertainty (Attaway et al. Citation2020). Specifically, confidence intervals measure the uncertainty that is inherent in probability-based research methods, while there are other ways of accounting for uncertainty due to such concerns as non-response bias (Sedgwick Citation2014) and social desirability bias (Grimm Citation2010). Explaining these sources of uncertainty and providing ranges, rather than point estimates, can help news users recognize that uncertainty is a feature, not a bug. That is, statistics quantifies uncertainty rather than eliminating it. And the uncertainty in all statistics is an opportunity to think critically about numbers and the goals of those who produce and promulgate those numbers. Thus, reporting on this uncertainty falls squarely within the duty of journalists to provide the public with the information needed for informed decision making.
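As a minimal sketch of what reporting a range rather than a point estimate involves, the code below computes an approximate 95% confidence interval for a survey proportion using the normal approximation. The poll figures are hypothetical and the function name is illustrative. The interval reflects sampling error only, and it says nothing about non-response or social desirability bias.

```python
import math

def proportion_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval for a proportion (normal approximation)."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 49% approval among 1,000 randomly sampled respondents.
low, high = proportion_interval(0.49, 1000)
print(f"point estimate: 49%, likely range: {100 * low:.0f}% to {100 * high:.0f}%")
```

Reporting the range alongside the point estimate makes it visible that a one-point change between two such polls falls well within sampling error.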

Fifth, indicate where consensus lies. We often saw multiple measures or multiple models reported without a clear explanation of either the differences between them or their relative support. “Both-sides” reporting strategies are not well suited to situations where the vast majority of credible experts agree, because these strategies falsely legitimize fringe opinions. In contrast, “weight of evidence” reporting strategies allot column space or on-air time proportional to the amount of evidence supporting competing claims (Kohl et al. Citation2016). Such strategies help journalists ensure that they do not undermine solid findings by focusing overly on controversy. Such strategies also help readers recognize some of the sites of disagreement within the research community.

All these changes are primarily a matter of journalistic style. We also suggest two changes at a more structural level.

Consider making changes in the organization’s relationship with newswire content. Some of the densest clauses and stories we read were republished directly from the Associated Press and other newswires. We urge news organizations to add or link to explanations of statistical concepts and findings, rather than reprinting this content unchanged. We also encourage news organizations to demand that newswires provide more accessible content.

We also encourage a type of collaboration that would support newsrooms in making these stylistic changes: work more closely with statisticians, particularly official statisticians. We are far from the first to make this recommendation. Kevin McConway (Citation2016), W. Martin Podehl (Citation2002), and Wayne Smith (Citation2005) are among the statisticians who have argued for a close working relationship with journalists. As Smith (Citation2005, 6) notes, these relationships must start from an understanding that statisticians and journalists have complementary expertise:

It is not the role of a statistical agency to create statisticians out of journalists. It is to help journalists in whatever way possible to do their job. In addition, statisticians cannot be expected to become journalists any more than statistical agencies can expect all journalists to become statistically literate.

In the end, journalists do not need to do the calculations themselves to help the public reason with numbers. They simply need to ask themselves “what do I need to know to assess this number?” and walk through it. Where are the numbers coming from? Which numbers are being discussed more regularly than others? What information is missing from the picture you've painted with the data you've chosen for a specific piece, or even from the study or organization that's providing it? Doing so can help everyone understand the numbers, question the numbers, and – ultimately – use the numbers.

Funding Statement

This material is based upon work supported by the National Science Foundation under grant number 1906802. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Acknowledgments

Discussions with a great number of colleagues at both PBS NewsHour and Knology have informed this work, particularly Rupu Gupta, Erica Hendry, Nicole LaMarca, and Megan McGrew. We are also grateful to our project evaluators and advisors, who have participated in those conversations — and several of whom have read earlier drafts of this piece: Jim Corter, Jim Hammerman, Eric Hochberg, Danny Bernard Martin, Caitlin Petre, Jonathan Stray, Nikki Usher, and Darryl Yong. We are also grateful to two anonymous reviewers for comments which have clarified our thinking.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Data Availability

The full data set, including all coded clauses, is available at https://bit.ly/3jIF3K0.

Notes

1 The many ways in which this is so are beyond the scope of the present paper, but see D’Ignazio and Klein (Citation2020) for a synthesis of many recent critiques.

2 PP (journalist): “This ideology calls to mind the debate over whether algorithms can be biased. Of course they can, but for a time people assumed they were as pure as numbers.”

3 Within the context of journalism, the notion of mechanical objectivity has also been used to critique the unreflexive symmetry we know informally as “both-sidesism.” For example, Galison (Citation2015) quotes an editorial published in TIME in 1950 that uses this term to take The New York Times to task.

4 Their categories are somewhat more granular than ours, and they found a large number of statistics in topic areas that would likely have been subsumed under one of ours: 58.5% of energy stories contained statistics, as did 54.1% of social policy stories, 49.7% of business stories, and 47.9% of taxation stories.

5 IIT (journalist): “My sense is that [the desire for credibility] is a fairly prominent motivator, underscored by the idea that numbers are objective and legitimate.”

6 Upon beginning to examine the clauses with 4 codes or more, we found some parsing errors. We thus excluded those that should have been further divided into multiple clauses (by our operational definition), leaving us with 166 clauses for close reading. In at least some cases, at least one of the component clauses may have received 4 codes on its own. However, we excluded these to be conservative.

7 It is important to note that we removed duplicate stories from the data set: some stories appeared on multiple dates, and some stories appeared in multiple content areas. That is, these stories appeared only once in the data set for analysis.

8 Fox News, the Associated Press, and – to a lesser extent – CNBC are more heavily represented among high-code clauses than in the data corpus as a whole: 11.7% of clauses in the database were produced by the Associated Press, 6.1% by CNBC, 2.5% by Fox News, 9% by CNN (including CNN’s wire service), and 8.2% by the New York Times.

9 National general outlets are under-represented compared to their representation in the database as a whole, while all other outlet types are over-represented: 53% of all clauses were produced by national outlets, 12.5% by economy specialists, 13.3% by newswires, and 3.2% by partisan outlets.

10 We are indebted to a colleague for observing in a conference session that “journalists say everything they know up front.”

11 As an example, we've rewritten the first example to make it more accessible: In 2019, the Virginia Joint Commission on Healthcare did a study. They found that Virginia has 122 licensed hospitals. Only 16 of them provide sexual assault forensic exams. The state also has 94,000 registered nurses. But only about 150 of them are credentialed forensic nurses.

12 By contrast, the densest Politics story has 1.87 codes per clause. It is the 23rd most dense story in the dataset. The densest Science story has 1.72 codes per clause. It is the 31st most dense story in the dataset - behind 13 Health stories, 16 Economy stories, and 1 Politics story.

13 For example, Irwin (Citation2020) walks through common economic indicators – the unemployment rate, the 'establishment survey,' and GDP – to outline how COVID-19 shutdowns interact with data collection and reporting conventions to produce irregular results.

14 IIT (journalist): “I'd argue that this problem still [in May 2021] largely has gone unaddressed in a lot of reporting, especially mortality.” She recommends https://ourworldindata.org/covid-mortality-risk

15 We expected to see both left-wing and right-wing outlets, but all three outlets in this category were right-wing: Fox News, The Daily Wire, and The Blaze.

16 Rather than attempting a theoretical definition of news, which remains contested (see, e.g., Armstrong et al. Citation2015; Cunningham et al. Citation2016; Edgerly and Vraga Citation2019; Ekström and Westlund Citation2019), we defined it functionally and accepted Google News’ inclusion criteria at face value. The single story from NASA was a YouTube video from a series called “NASA Explorers” intended for general audiences, which we judged to be comparable to other science stories in the sample. The story was relatively light on quantification and thus not one of the stories we focus on for analysis in this paper.

References

References from data set

Appendix A

Clause-level codes (full version)

Appendix B: The stories with the largest number of dense excerpts in our data set (≥ 3 dense excerpts)
