Annual Report Narratives and the Cost of Equity Capital: U.K. Evidence of a U-shaped Relation

Abstract
We hypothesize and test for a U-shaped relation between the cost of equity capital and the level of disclosure in annual report narratives. Using a computer-generated word-count-based index of the level of disclosure in U.K. annual report narratives, we document a negative relation with the cost of equity capital at low levels of disclosure, and a positive relation at higher levels of disclosure, together implying the presence of an optimal level of disclosure. We interpret the positive relation at higher levels of disclosure as evidence of uninformative clutter increasing the cost of equity capital. Additional analyses indicate the presence of both firm-level learning and regulatory corporate reporting initiatives as factors shaping adjustments towards optimum levels of disclosure.


Introduction
A recurring theme in the financial disclosure literature is the relation between disclosure and the cost of equity capital. Several studies hypothesize, and find, a negative (and linear) relation between the cost of capital and corporate disclosure (see Botosan, 2006, for an overview). Beyer, Cohen, Lys, and Walther (2010), however, conclude that the evidence on the negative relation remains inconclusive. This inconclusiveness especially applies to studies that measure disclosure via annual report disclosures.
The present paper postulates and tests for the existence of a non-linear U-shaped relation between the cost of equity capital and the volume of narrative disclosures. On the one hand, prior theoretical literature explains how increases in disclosure can drive a reduction in the cost of capital by reducing (priced) information asymmetry. On the other hand, the market may respond negatively if management increases the clutter in annual reports (both in terms of the volume and complexity of disclosures) in order to obfuscate underlying poor performance. Paradoxically, the opportunities for the generation of uninformative clutter may be increased by regulatory demands for increased disclosure. Taken together, the positive and negative theoretical effects of increased disclosure could lead to a U-shaped relation between the cost of capital and disclosure.
We use U.K. data to test for a U-shaped relation. We believe that the U.K. is particularly well suited to examine the effect of annual report disclosure on the cost of equity capital as it is a major stock market economy where a considerable amount of thought and experimentation has been given by regulators (and companies) regarding the content of the annual report. In particular, unlike the U.S.A., where annual reporting is shaped by a highly structured form 10-K, U.K. firms have considerable discretion over both the content and the format of their annual report disclosures.
In order to test our main hypothesis, we construct an index of the volume of the disclosures in an annual report. Our disclosure index, Discindex, is constructed by aggregating the full sample ranks of eight component measures of annual report disclosure. Seven of the eight components of Discindex capture the word counts of annual report commentary relating to strategy, performance, causality relations, forward-looking information, governance and remuneration (combined), other front-end commentary, and rear-end commentary. The eighth component measures the readability of annual report performance commentary.
In our main model, we regress an empirical proxy for the cost of equity capital on Discindex and Discindex squared as well as a number of control variables and fixed effects. The results provide strong support for the existence of a U-shaped relation. However, a significant challenge to modeling the relation between the cost of equity capital and disclosure is that disclosure is an endogenous variable that is chosen by firms. In a cross-sectional setting, this means that disclosure could have different directional effects on the cost of equity capital, and this could lead the researcher to conclude no effect. This is even more so the case when disclosure is partially endogenous, that is, when firms deviate from optimum levels due to random (regulatory) shocks and (subsequent) learning effects (Larcker & Rusticus, 2007). Larcker and Rusticus (2007) highlight this point, that is, the possibility that firms are only partially successful in making optimal policy choices and learn to make better choices over time. This notion of partial adjustment to optimal policy choices is particularly compatible with our framework, especially as it is likely to co-exist, or even be driven by, regulatory overload clutter. As such, in additional analyses we test both effects.
First, we test for the possibility that firms do not know their optimum disclosure choices with certainty, and that they learn gradually both from their own experiences and from the experiences of other firms. To test for this element of learning over time, we estimate the distance from the optimum volume of disclosure, as implied by the cost of equity capital model, and test the extent to which this distance predicts subsequent changes in disclosure, distinguishing between firms that over-disclose (positive distance) and firms that under-disclose (negative distance). We find that positive and negative deviations from optimum volume levels predict, respectively, declines and rises in subsequent disclosure levels, consistent with firms adjusting their volume of narrative disclosures towards the optimum level over time.
Second, we test for the impact of regulatory corporate reporting initiatives in triggering, or constraining, clutter. During our sample period, we identify two regulatory initiatives that were targeted to improve the content of U.K. annual report narratives. The first one was the introduction of the Business Review in annual report narratives in 2006 and the second was the 2010 revision of the U.K. Corporate Governance Code requiring firms to provide a clear explanation of the business strategy in the narratives. In between these initiatives, regular reviews of the disclosure practices of U.K. firms revealed clutter to be rising within the narrative sections, mainly as a behavioral response to regulatory information overload. The Financial Reporting Council (FRC) responded by issuing a set of guidelines for making corporate reports less complex and more relevant by removing unnecessary text (2009, 2011). In light of these guidelines, the two regulatory initiatives offer scope for examining the effect of regulation on both the 'rising' and 'cutting' of clutter in annual reports. Using the excess of actual disclosure over and above optimal disclosure as our operational proxy for clutter, we document both a significant rise and a significant decline in the amount of overdisclosure around the two new regulations. These results highlight the role of regulatory reporting initiatives as factors shaping adjustments towards optimum volumes of narrative disclosures.
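To make the two quantities used in these additional analyses concrete, the following minimal Python sketch computes the signed distance from an optimum and splits it into over-disclosure (the clutter proxy) and under-disclosure. The function name `split_distance` and the example optimum of 0.6 are hypothetical and chosen for illustration only; in the paper the optimum is implied by the estimated cost of equity capital model.

```python
def split_distance(discindex, optimum):
    """Return (distance, over, under) for a list of disclosure scores.

    distance: actual disclosure minus the optimum (signed)
    over:     excess over the optimum, zero otherwise (clutter proxy)
    under:    shortfall below the optimum (<= 0), zero otherwise
    """
    distance = [score - optimum for score in discindex]
    over = [max(d, 0.0) for d in distance]
    under = [min(d, 0.0) for d in distance]
    return distance, over, under


if __name__ == "__main__":
    # Hypothetical scores and optimum, not values from the paper.
    distance, over, under = split_distance([0.2, 0.6, 0.9], optimum=0.6)
    print(distance, over, under)
```

Keeping the positive and negative parts separate allows each to enter a later regression with its own coefficient, which matches the distinction drawn between over-disclosing and under-disclosing firms.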
In addition to providing a basis for generating our disclosure index, our disclosure data provides word counts for individual sections and specific types of language (e.g., strategic keywords) and these individual sections and types of language can be traced back to major U.K. regulatory reporting initiatives. This enables our analysis to (i) highlight the presence of both intended and unintended consequences of these initiatives for the cost of equity capital and (ii) offer insights for the effect of individual initiatives on the content of the annual report. With respect to the latter, results of additional analysis unraveling the effects of individual components of our disclosure index suggest that increases in performance commentary and governance and remuneration sections of the annual report may have been successful in reducing the cost of equity capital, although we do not claim to have established a causal effect. These findings are broadly supportive of the disclosure developments encouraged by the U.K. regulator, and may be useful to firms when considering which types of narratives to increase or decrease.
In summary, our paper provides new insights into the relation between the cost of equity capital and annual report narrative disclosures. Specifically, we draw from theory and practice to provide a framework, and associated evidence, for a non-linear U-shaped association between annual report narratives and the cost of equity capital. Our findings imply the existence of an optimum volume of narrative disclosures in the annual report and the existence of uninformative disclosures, or clutter, associated with high volumes of narratives. These results imply the possibility of managing the costs of equity downwards not only through increasing disclosure volumes at below optimal volume levels, but, equally important, also through removing narratives when narrative disclosures have reached above-optimum volume levels. Our analysis also highlights that reaching optimal disclosure levels is not a straightforward task as firm learning and regulatory overload might impede firms from reaching optimal levels immediately.

U.K. Regulatory Framework
U.K. annual reports can be viewed as a 'one stop shop' for corporate information about various aspects shaping firm value. Also, U.K. firms are given considerable discretion over both the content and the format of their reports. Such discretion allows companies to tell their value creation story in their own way, but it can also lead to excessively lengthy (cluttered) annual reports.
Over the last two decades, U.K. firms were subject to numerous regulatory and legislative developments in corporate reporting requirements. Below we outline developments that shaped the provision of narrative disclosures related to management performance commentary, forward-looking information, corporate governance, executive remuneration, and corporate strategy.
30 V. Athanasakou et al.
Since 1998, the U.K. Combined Code on Corporate Governance has been the flagship policy instrument used to provide regulatory guidance to U.K. listed firms about their overall governance, remuneration, and financial reporting practices. Over the years, the Code has been developed through a program of engagement with companies, investors, and other stakeholders. As a result, significant demands were added to the Code in revisions issued in 2003, 2006, 2008, 2012, 2014, and 2016. In addition to establishing the principles of corporate governance in the U.K., the Combined Code also imposed a requirement to provide extensive disclosures on how companies comply with the Code within their annual reports. Most companies responded to these disclosure requirements by producing a separate statement of corporate governance within their annual reports. Many of the more routine disclosures were published as part of the directors' report but disclosures requiring explanations and additional voluntary content about corporate governance were mostly placed in the corporate governance report.
A second important development in U.K. corporate reporting rules was the requirement for firms to produce a remuneration report as part of their annual report. The 1998 version of the Combined Code established the principle that the annual report should contain a statement of the remuneration policy of the firm and details of the remuneration of each director. The requirement to issue a remuneration report and detailed specific disclosure requirements were subsequently enshrined in statutory regulations, starting with The Directors' Remuneration Report Regulations in 2002, and then with the more stringent Large and Medium-Sized Companies and Groups (Accounts and Reports) Regulations in 2008, 2013, and 2016. GC100 (2016) provides detailed guidance on the remuneration disclosure regulations, which came into force on 1 October 2013, that is, towards the end of our sample period. In contrast to the U.S.A., annual reports in the U.K. must contain a comprehensive remuneration report, and this is the main source of this information. In the U.S.A., many firms disclose their remuneration details in their proxy statements, and these are only included in the 10-K filing by reference.
Three other areas where the Combined Code encouraged new, mostly narrative, disclosures were management performance commentary, forward-looking information, and business strategy. A key pilot initiative was the requirement by Section 417 of the Companies Act 2006 for firms to include a business review in their annual report narratives. 1 The business review introduction was reinforced by the publication of Reporting Statement 1 (RS1), which recommended that U.K. listed firms publish an Operating and Financial Review. 2 The FRC issued this statement as a formulation of best practice that would have persuasive rather than mandatory force. The statement encouraged greater disclosure in the form of forward-looking information as well as performance reviews on current year financial performance and end-of-year financial position. A substantial follow-up initiative to RS1 was the 2010 revision of the U.K. Corporate Governance Code encouraging explicit management commentary in annual report narratives about firm strategy, objectives, and the business model. In 2013 strategic reporting became a mandatory section for all U.K. annual reports through an amendment of U.K. Company Law.

Theoretical Motivation, Prior Empirical Evidence, and Hypothesis Development
From a theoretical perspective, annual report narratives could have a beneficial effect, no effect, or even a harmful effect on the cost of equity capital. A beneficial effect arises when narratives are successful in reducing information risk: if this risk is priced, then the cost of capital should decline (Barry & Brown, 1984, 1986; Brown, 1979; Easley & O'Hara, 2004). However, if markets react less completely to information that is less easily extracted from public disclosures, managers have incentives to obfuscate information when firm performance is bad (Bloomfield, 2002). Under this framework more narratives may reflect management attempts to mask poor underlying performance and not reduce information risk. In this case, additional disclosure may not reduce the cost of capital, but may 'backfire' and instead increase the cost of capital, especially if investors perceive overly lengthy narratives as more difficult to process and as increasing information risk. Alternatively, when there are no direct costs of misreporting, as is likely to be the case with qualitative financial disclosures, managers may make misleading disclosures. In equilibrium, investors might treat such disclosures as non-informative as they are independent of managers' private information ('babbling equilibrium', Stocken, 2000) in which case more disclosures may either not matter or may even backfire. 3
Regulatory reporting initiatives play an interesting role in this theoretical framework. Initiatives ostensibly targeted at reducing information asymmetry may, paradoxically, increase the scope for meaningless financial disclosures that impede users from understanding firm value creation (see, e.g., Bloomfield, 2002). 4 The regulatory overload of reporting initiatives may offer a channel to management for communicating meaningful information, but may also provide a way to intentionally obfuscate a poor underlying economic reality. At the same time clutter may be unintentional if firms are unable to successfully identify optimal levels of disclosure immediately after a new regulatory initiative. Indeed, an FRC review highlighted that clutter may reflect a behavioral response to excess regulatory demand for more information, coupled with a lack of sufficient guidance on what constitutes material information (2009). It is plausible that when firms face a rising regulatory demand for information and regulatory guidance on materiality which focuses on what to include rather than on what to take out, they err on the side of caution by devising long checklists of items for inclusion in the annual report. Peer disclosure dynamics may further exacerbate this regulatory overload clutter. Institutional theory arguments suggest that, in the absence of normative guidelines, organizations mimic peers to obtain 'social proof' of appropriate behavior as a cognitive heuristic to reduce uncertainty (Dacin, Goodstein, & Scott, 2002; Deephouse, 1996; DiMaggio & Powell, 1983). Under this framework, information overload, or clutter, arises in a firm's attempt to reinforce legitimacy through social herding, or as the FRC notes, 'firms may disclose information simply because everyone else does' (2009, 2011). Irrespective of the existence of managerial intent, clutter in the annual reports, as a manifestation of the regulatory reporting paradox, acts as a key source of tension in the effects of annual report disclosures on the cost of capital.
1 Within the Business Review firms provide a commentary on corporate objectives, strategy and resources available to deliver those objectives, risk and uncertainties facing the entity, and trends and factors likely to affect the company's future.
2 RS1 is the only reporting statement the FRC has ever issued.
From an empirical perspective, prior studies based on annual report narrative disclosures show that these narratives typically exhibit the predictable effect of reducing the cost of equity capital but the effect is largely contingent on factors associated with the firm's information environment. Botosan (1997) employs a disclosure index to manually score the annual report disclosures of 122 U.S. companies and finds a significant negative relation between the overall disclosure index and the cost of equity capital, but only for firms with low analyst following. Francis, Nanda, and Olsson (2008) use a similar index approach to rate 677 U.S. annual reports in 2001. They document evidence of a negative association between their index and the cost of equity capital, but the effect is reduced or disappears completely once they condition on earnings quality. Using automated content analysis, Kothari, Li, and Short (2009) find that the management disclosures in annual reports associated with different aspects of risk (e.g., market, firm, organizational, reputational, performance or regulatory risk) exhibit predictable increases in the cost of equity capital effects for small cap stocks but inconsistent effects for large cap stocks. 5 Grüning (2011) produces an automated disclosure index rating German annual reports along several dimensions (e.g., information about markets, customers, employees, corporate governance, R&D, capital markets and corporate strategy) and shows that the index is negatively related to various information asymmetry proxies, including bid-ask spread.
Turning to the possibility that clutter and other forms of obfuscation may generate harmful capital market effects, a number of studies examine the determinants and/or capital market consequences of annual report length and readability in this respect. 6 A prominent study in this pool, Li (2008) examines annual reports of 50,000 firm-years from 1994 to 2004 and finds that longer annual reports and annual reports with a lower readability, as measured by a higher Fog Index, exhibit lower and less persistent earnings performance. Lehavy, Li, and Merkley (2011) find that analyst following, the amount of effort incurred to generate analyst reports, and the informativeness of analysts' reports, are greater for less readable annual reports. They also find that less readable annual reports are associated with more earnings forecast dispersion and less accurate earnings forecasts. A number of other studies document negative equity capital market consequences of longer or less readable annual reports. Miller (2010) finds that small equity investors reduce their trading when annual reports increase in length and that longer annual reports are associated with reduced trading consensus for small investors. Lawrence (2013) finds that less sophisticated investors prefer clearer and more readable 10-K filings. You and Zhang (2009) employ a simple word count measure and document that the initial under-reaction to 10-K filings, and the subsequent return drift, tend to be higher for annual reports with an above-median word count. Lee (2012) shows that less (more) of the earnings-related information is impounded in stock prices during the three-day filing window (post-filing drift window) for quarterly reports with an above-median change in length or readability. Callen, Khan, and Lu (2013) find that less readable annual reports have a higher stock price delay. Bonsall and Miller (2017) provide similar evidence for the credit markets. 
They measure textual complexity at the most recent annual report filed before the bond offering date and show that less readable annual reports are associated with a higher cost of debt. At the same time, Bonsall and Miller (2017) note that at present 'there is no evidence supporting the notion that financial statements [length or] readability is associated with a higher cost of equity' (p. 614).
Summarizing the discussion in this section, while it is conceptually plausible for annual report narratives to have a positive effect, no effect, and even a negative effect on the cost of equity capital, empirical studies on narratives often focus on examining a one-directional association. We fill this gap in the literature by considering forces triggering opposite sign associations between financial disclosures in the annual report narratives and the cost of equity capital, and use this as a basis for building a hypothesis for a non-linear U-shaped association. From a theoretical perspective, U.K. annual report narratives, especially in light of the rise of different types of commentaries about firm value, may, on one hand, serve to decrease information asymmetry, yielding cost of capital benefits. On the other hand, the development of new types of commentaries may raise scope for using narratives as a way of distracting users' attention from the firm's underlying performance. Such obfuscation attempts may be reflected in lengthier and less readable commentaries. Also, in the presence of a rising regulatory demand for information in annual report narratives, lengthier narratives may reflect regulatory overload clutter as a firm-level learning effect or a behavioral response to ensuring compliance with norms. Either way, at higher levels of disclosure, further increases in disclosure may have either no impact on the cost of equity capital, or may even backfire and increase the cost of equity capital. We therefore hypothesize a U-shaped association between annual report narratives and the cost of equity capital, with the cost of equity capital decreasing with increasing disclosure in narratives up to some point, but then increasing at higher levels of disclosure beyond that point as clutter obfuscates the annual report content. 
Formally, we test the following hypothesis:
H1: The association between annual report narrative disclosures and the cost of equity capital is a U-shaped relation implying a negative marginal association at low levels of disclosure and a positive marginal association at high levels of disclosure.

Measuring Disclosure
To construct our disclosure index, Discindex, we begin by assigning annual report narratives to content categories based on the CFIE dataset of automated disclosure scores. 7 The CFIE website contains word frequency counts, and a small number of other linguistic properties, for U.K. annual reports published in calendar years 2003-2014 by firms listed on the London Stock Exchange (LSE). For each annual report, word counts, and other linguistic properties, are given for the (i) Chairman's Statement, (ii) Chief Executive Officer's (CEO) Review, (iii) Chief Financial Officer's (CFO) Review, (iv) aggregate business review sections (including CEO review, CFO review, business review, and operating reviews), (v) aggregate management performance commentary (including Chairman's Statement, CEO review, CFO review, business review, operating review, financial review, strategy and business model review), (vi) corporate governance statements, and (vii) remuneration reports. Linguistic properties, other than word counts, comprise page counts, measures of net tone and uncertainty, volumes of forward-looking and causal reasoning commentary, and two readability indices. Most linguistic properties are calculated for the entire annual report and for the seven generic sections in (i) to (vii). 8 El-Haj, Alves, Rayson, Walker, and Young (2020) discuss in detail the motivation behind the CFIE project and provide a full discussion of the steps followed to generate the disclosure scores from digital annual report PDF files. In summary, the process involves three main steps: retrieval of document structure, classification of content, and retrieval and analysis of textual content.
Retrieval of document structure begins by automating the detection of the page containing the annual report table of contents. The information in the contents table is then used to identify document structure, that is, individual section headings, and the pages on which these sections begin and end.
The classification of content was done in two stages. The first stage involved distinguishing simply between narrative sections ('Narratives') and sections containing the mandatory financial statements ('Financials') including financial statement footnotes. This binary classification task was based on an allocation of section headings found in the table of contents to either Narratives or Financials. The second stage involved allocating individual narrative headings to the seven generic sections (i) to (vii). Three of these seven sections, namely management performance commentary, remuneration report, and corporate governance, correspond directly to the U.K. reporting initiatives outlined in Section 2.1. All narrative annual report sections other than those allocated to management performance commentary, remuneration report, or corporate governance, are assigned to a residual category comprising primarily Corporate Social Responsibility (CSR), directors' reports, and risk reports.
The retrieval and analysis of text step involved retrieving the textual content and then calculating (and storing in Excel) the word frequency counts, and the other linguistic properties, for the entire annual report, for Narratives and Financials, and for the generic sections (i) to (vii). 9 The present paper exploits the ability of the CFIE dataset to identify the volume of narratives in individual sections of the annual report, especially the sections associated with management performance commentary, remuneration, and corporate governance. It also exploits the ability of the CFIE tools to count the volume of specific categories of content, namely the volume of forward-looking and causal reasoning keywords. Our measure of disclosure, Discindex, aggregates six CFIE-based word count measures, one CFIE-based readability measure, plus one non-CFIE measure of the volume of strategic discussion, into an overall index that reflects the length (and readability) of annual report narratives. Appendix 1 provides full details of how Discindex was constructed. In particular, Equation (A1) presents the formula that shows how the eight categories of content (which we refer to as sub-indices) are aggregated into a single overall index, i.e., Discindex.
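The rank-based aggregation can be illustrated with a short Python sketch. It is not a reproduction of Equation (A1): the equal-weighted sum of component ranks, the average-rank treatment of ties, and the function names `fractional_ranks` and `disc_index` are all assumptions made for illustration. The caller is assumed to have oriented every component (including the readability measure) so that larger values mean more disclosure.

```python
def fractional_ranks(values):
    """Average-tie ranks scaled to (0, 1] by dividing by the sample size."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank / n
        start = end + 1
    return ranks


def disc_index(components):
    """components: dict mapping a component name to its raw scores,
    one score per firm-year, all lists of equal length.

    Assumption: the index is the rank of the equal-weighted sum of
    component fractional ranks (the paper's Equation (A1) may differ).
    """
    n = len(next(iter(components.values())))
    total = [0.0] * n
    for raw in components.values():
        for i, r in enumerate(fractional_ranks(raw)):
            total[i] += r
    return fractional_ranks(total)
```

Ranking each component before aggregating puts word counts of very different magnitudes on a common scale, so that no single long section dominates the index.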

Estimating the Cost of Equity Capital
The cost of equity capital can be calculated in various ways including from realized future returns and from estimates implied by analyst earnings forecasts. In the present paper, we follow the approach of Hou, Van Dijk, and Zhang (2012) and generate estimates implied by a cross-sectional earnings prediction model. 10 Employing an earnings prediction model allows us to investigate the effects of disclosure for the firms where we would expect to have the largest effect, namely the smaller companies that are often neglected by other researchers due to a lack of analyst earnings forecasts. We present details for the Hou et al. (2012) earnings prediction model and our results for this model in the Internet Appendix (henceforth IA).
With the earnings predictions in hand we then estimate the cost of equity capital via four valuation models. This is typically referred to as reverse engineering and yields implied cost of equity capital measures that are expected by the market and 'justify' the market price given the current earnings predictions. We follow the prior literature such as Hail and Leuz (2006) and use four valuation models: Claus and Thomas (2001) (Coec_ct), Gebhardt, Lee, and Swaminathan (2001) (Coec_gls), Ohlson and Juettner-Nauroth (2005) (Coec_oj) and Easton (2004) (Coec_peg). Specifically, to minimize measurement error, we follow the prior literature and employ as our cost of capital estimate the average estimate from the four valuation models, and we require at least two estimates to calculate the average. 11 The IA presents details for the four valuation models. We find in untabulated tests that the four cost of equity proxies are positively correlated with future realized firm returns (even after controlling for firm fixed effects) thus providing partial validation for the four cost of equity capital measures.
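The averaging rule described above (mean of the available model estimates, requiring at least two) can be sketched as follows. The function name `average_coec`, the use of `None` for a missing estimate, and the example values are illustrative assumptions, not figures from the paper.

```python
def average_coec(estimates, min_valid=2):
    """Average the non-missing implied cost of equity estimates.

    estimates: values for Coec_ct, Coec_gls, Coec_oj, Coec_peg,
               with None marking a model that could not be solved.
    Returns None when fewer than `min_valid` estimates are available.
    """
    valid = [e for e in estimates if e is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # Hypothetical firm-year with two of four estimates available.
    print(average_coec([0.08, None, 0.12, None]))
    # Only one estimate available: the firm-year is treated as missing.
    print(average_coec([0.08, None, None, None]))
```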

Regression Model
We employ the following regression model to test H1:

$$\mathit{Coec} = \beta_0 + \beta_1\,\mathit{Discindex} + \beta_2\,\mathit{Discindex\_squared} + \beta_3\,\mathit{Beta} + \beta_4\,\mathit{MTBV} + \beta_5\,\mathit{LogTA} + \beta_6\,\mathit{Leverage} + \beta_7\,\mathit{LogAge} + \beta_8\,\mathit{AbsAwca} + \text{Fixed effects} + \varepsilon \quad (1)$$

Our data consists of firm-year values of the variables of interest (but we omit time and firm subscripts in Equation (1) for simplicity). The dependent variable, Coec, is an estimate of the average implied cost of equity capital. The main independent variables of interest in (1) are Discindex and Discindex_squared and they are our measures for the volume of annual report narratives and squared volume of annual report narratives. The hypothesized U-shaped relation between the cost of equity capital and narratives implies the prediction of a negative coefficient on Discindex and a positive coefficient on Discindex_squared. The other variables in (1) are control variables that have been identified in the prior literature as potential determinants of the cost of equity capital. Specifically, Beta is motivated by the literature on capital asset pricing. Market-to-book, MTBV, and the natural logarithm of total assets, LogTA, are motivated by empirical asset pricing studies that regularly find a tendency for the cost of equity capital to be associated with these variables. Leverage is motivated by a basic insight from corporate finance which shows that, for a given stream of firm cash flows, the cost of equity capital increases with leverage. LogAge is included to control for the age of the company since younger firms tend to have a higher cost of capital than more mature firms. Finally, AbsAwca represents absolute abnormal working capital accruals as estimated from the modified Jones model and is motivated by the earnings management literature, including Dechow, Sloan, and Sweeney (1995) who suggest that high levels of earnings management increase the cost of equity capital. In contrast, and as discussed earlier, Francis et al. (2008) show that the relation between disclosure and the cost of equity capital becomes insignificant once one controls for earnings quality. All variable definitions are given in Appendix 2.
We estimate Equation (1) using ordinary least squares, adding industry fixed effects, and firm fixed effects. We view our firm fixed effects results as being the most reliable as there are good reasons to expect the cost of equity capital to vary across firms. Firm fixed effects offer a potential for controlling for endogeneity due to heterogeneity in unobserved firm characteristics beyond those captured by the control variables and industrial effects. 12
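As a simplified illustration of the firm fixed effects estimation, the sketch below fits a quadratic in disclosure by the within transformation (demeaning each variable by firm) and reports the disclosure level at which the fitted curve turns. It is a stripped-down assumption-laden version of the model: the control variables are omitted, only two regressors are used so the normal equations can be solved by hand, and the names `within_quadratic_fit` and `turning_point` are hypothetical.

```python
from collections import defaultdict


def within_quadratic_fit(firm_ids, x, y):
    """Firm fixed-effects (within) OLS of y on x and x**2.

    Returns (b1, b2), the slopes on x and x**2. Controls are omitted
    in this sketch, unlike the paper's full specification.
    """
    groups = defaultdict(list)
    for i, firm in enumerate(firm_ids):
        groups[firm].append(i)

    def demean(col):
        # Subtract each firm's own mean, which absorbs the firm intercept.
        out = [0.0] * len(col)
        for idx in groups.values():
            mean = sum(col[i] for i in idx) / len(idx)
            for i in idx:
                out[i] = col[i] - mean
        return out

    d1 = demean(list(x))
    d2 = demean([v * v for v in x])
    dy = demean(list(y))

    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear after demeaning")
    # Cramer's rule on the 2x2 normal equations.
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def turning_point(b1, b2):
    """Disclosure level where the fitted quadratic has zero slope."""
    return -b1 / (2 * b2)
```

With a negative `b1` and a positive `b2`, the turning point is a minimum, which corresponds to the hypothesized U shape; whether it lies inside the observed range of the disclosure index still has to be checked separately.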

Sample Selection
We obtain share price, return, and company accounting data from Datastream. Our initial CFIE sample consists of 10,443 U.K. listed firm-years for which we have disclosure scores from the CFIE dataset. We then delete 52 observations due to (i) an inability to find the matching Datastream company codes (9 observations) and (ii) the CFIE dataset containing two documents per firm-year (43 observations). Our disclosure scores, defined as fractional ranks for the eight index components and percentile ranks for Discindex, are built on the resulting dataset of 10,391 CFIE observations. To arrive at our final sample we apply two more filters, as shown in Table 1, Panel A. Panel B shows the year-by-year composition of the sample. Finally, we winsorize all variables in our analysis, other than the disclosure score ranks, at the 1st and 99th percentiles to mitigate the effects of outliers.
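The winsorization step can be illustrated with a minimal Python sketch. The function name `winsorize` and the interpolation rule used for the percentiles (the standard library's "inclusive" method) are assumptions made for illustration; the text does not state which percentile definition was used.

```python
import statistics
from typing import List, Sequence


def winsorize(values: Sequence[float], lower_pct: int = 1, upper_pct: int = 99) -> List[float]:
    """Clamp values to the [lower_pct, upper_pct] percentile range.

    Assumption: percentiles use linear interpolation over the observed
    data (method="inclusive"); other conventions give slightly different cut-offs.
    """
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]


# Hypothetical data: the integers 1..101, so the 1st and 99th percentiles are 2 and 100.
example = winsorize(list(range(1, 102)))
```

In this sketch only the two extreme observations are pulled in. Following the text, the disclosure score ranks themselves would not be passed through such a function.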

Descriptive Statistics
This section presents summary statistics for the main variables in the empirical analysis. Table 2 reports summary statistics for the final (estimation) sample of 5152 firm-year observations. Panel A shows the distribution of Discindex and the distributions of the underlying eight ranked disclosure measures. The mean and median values are all greater than 0.50 because we rank disclosure scores before deleting firm-years with missing variables and because firm-years with non-missing variables tend to have higher disclosure ranks, on average, than firm-years with missing variables. Panel B shows that there is some variation in the mean and median cost of equity capital estimates across the four different valuation models. In particular, the means and medians of the earnings growth model measures (Coec_oj and Coec_peg), for which we have a restricted sample since they require earnings growth, are materially higher than the means and medians of the measures based on the residual income valuation models (Coec_gls and Coec_ct). For the control variables in Panel C, a notable feature is that the mean of Beta is only 0.61, perhaps suggesting that thin trading has led to a downward-biased estimate. Note, however, that Beta is only one of several risk factors in our regression model. The mean leverage of the firms in our sample is 25% and the mean market-to-book value is 2.83. Mean values of LogTA and LogAge are 11.8 and 3.14, respectively. Panel D reports the median values for Discindex and its fractional rank components by year. The two lowest median values for Discindex (0.45 and 0.45) occur in the first two years of the sample period and the three highest values (0.74, 0.78, and 0.74) occur in the last three years of the sample period. There is a marked increase in the median Discindex from 2005 to 2006, presumably due to the implementation of IFRS.
Panel D indicates that most of the median fractional ranks of the seven components based on word counts increase materially over time. The slight exception is the causal word count, which follows a more irregular pattern, increasing slightly from 2003 to 2013 but dropping back in 2014. With regard to the median Fog index, we define the fractional rank as the inverse of Fog, so lower values of the median fractional rank indicate lower readability, that is, a tendency for readability to worsen over time. Further details of the empirical properties of Discindex and its components are presented in the IA. The IA also documents the relation between the fractional ranks of the components of Discindex and the raw scores from which the fractional ranks are derived. 14
14 A correlation table for the main regression variables is also presented and discussed in the IA.

Main Results
Table 3 reports the results of a panel regression of the cost of equity capital on Discindex, Discindex_squared, and the control variables. Column (1) presents results for a linear model without either industry or firm fixed effects. In column (2) we add the Discindex_squared term. Column (3) controls for industry fixed effects and column (4) for firm fixed effects. We regard the firm fixed effects regression as our main result. In column (1) we document a statistically significant negative relation between Discindex and the cost of equity when the quadratic disclosure term is omitted, in line with prior research that documents such a beneficial linear effect of disclosure (e.g., Botosan & Plumlee, 2002). Furthermore, all three non-linear models in (2) to (4) show a significant negative coefficient on Discindex and a significant positive coefficient on Discindex_squared. The size and significance of the two coefficients are much lower for the firm fixed effects regression. However, the t-statistics associated with Discindex and Discindex_squared in (4) are −4.39 and 5.14, respectively, indicating that the two coefficients remain highly significant even in the firm fixed effects model. 15 Overall these results strongly support our U-shaped hypothesis. 16
15 To assess the sensitivity of our findings in Table 3 to the way the components of the index are aggregated, we run the fixed effects regression with all the individual components and all the squared values of the components included as separate variables. The adjusted R-squared of the model increases only marginally, from 0.43 to 0.44. Consistent with the U-shaped hypothesis, the t-value of the average value of the individual (linear) coefficients is −5.41, and the t-value of the average value of the squared term coefficients is 6.33.
16 The IA presents robustness results where our implied Coec estimates are replaced with realized future returns, acknowledging that realized returns are a noisy proxy for Coec (e.g., Campbell, 1991; Elton, 1999; Vuolteenaho, 2002). In addition, the IA addresses the concerns of Wang (2015, 2017) about implied Coec estimates.
In columns (2) to (4) we use the estimated coefficients on Discindex and Discindex_squared to calculate an implied optimum disclosure level (ODL). Specifically, we derive the partial derivative of Coec in Equation (1) with respect to Discindex and set this derivative equal to 0, that is, ∂Coec/∂Discindex = β1 + 2β2 Discindex = 0. Rearranging yields an optimal Discindex of −β1/(2β2). Using the estimates of β1 and β2 of −0.169 and 0.154 from column (4) yields an ODL of 55%, which is lower than the corresponding implied ODL of 64% in column (2). The implied ODL under industry fixed effects falls between these two values at 62%. 17
17 Changes in sample composition are also unlikely to underlie our results, on account of the large size of the cross-section in our sample (1183 firms in our final sample, with an average of 4.4 years per firm). As a robustness test, we also repeat the analysis for a 'quasi-balanced' sample of firms that have at least 10 observations over our sample period. The average number of observations per firm for this test rises to 10.8. We document similar results for this sample. We find optima that are similar to those reported in Table 3 (e.g., in column (2) [...]
With respect to the control variables, leverage is significantly positive and firm size is significantly negative in all regressions, consistent with prior literature. Also, market-to-book is significantly negative in all four specifications. We do not obtain consistent evidence of significance for Beta, AbsAwca, and LogAge. The adjusted R-squared of our models lies between 43% and 50%, depending on the fixed effect specification.
To illustrate the intuition of the ODL, we plot the quadratic relation between the cost of equity capital and annual report narrative disclosure for the firm fixed effects specification in Figure 1. Our measure of narrative disclosure seems to reduce the cost of equity capital until the minimum point is reached. Beyond this point, the cost of equity capital increases with additional narrative commentary.
Table 3. Notes: [...] (1) and an additional quadratic term for disclosure in columns (2) to (4). Next to our standard control variables (see Appendix 2 for definitions) we include no fixed effects (columns (1) and (2)), industry fixed effects (column (3)), and firm fixed effects (column (4)). The t-statistics are adjusted for firm error clustering; *, **, and *** denote significance at 10%, 5%, and 1%. We calculate the optimal Discindex as −β1/(2β2), where β1 is the coefficient on Discindex and β2 is the coefficient on Discindex_squared (as in Equation (1)).
Figure 1. [...] Table 3, column (4) (that is, after allowing for firm fixed effects)
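The optimum disclosure level calculation described in this section can be checked with a short Python sketch. The function name `optimal_disclosure_level` is hypothetical; the coefficient values are the column (4) estimates quoted in the text.

```python
def optimal_disclosure_level(beta1: float, beta2: float) -> float:
    """Turning point of Coec = ... + beta1 * D + beta2 * D**2 with respect to D.

    Setting the derivative beta1 + 2 * beta2 * D to zero gives D = -beta1 / (2 * beta2).
    The turning point is a minimum (a U-shape) only when beta2 is positive.
    """
    if beta2 <= 0:
        raise ValueError("a U-shaped relation requires a positive quadratic coefficient")
    return -beta1 / (2.0 * beta2)


# Firm fixed effects estimates reported for Table 3, column (4).
odl_firm_fe = optimal_disclosure_level(beta1=-0.169, beta2=0.154)
```

With these inputs the function returns roughly 0.549, which rounds to the 55% reported for the firm fixed effects specification.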

Changes in the U-shaped Relation Over Time
The U.K. experienced material increases in the average level of disclosure over the sample period. This was largely caused by a number of regulatory reporting initiatives, outlined in Section 2.1., as well as firms responding to investor demands for information relating to a number of emerging themes such as key performance indicators, business model, strategy, governance, and remuneration.
To gain some insights into the effects of these changes on the U-shaped relation we estimate Equation (1) using three-year rolling windows. 18 This analysis allows us to observe time variation of the ODL. Table 4 reports the regression results.
The (absolute) size of the regression coefficients on Discindex and Discindex_squared decreases materially from the earlier to the later windows. Nevertheless, both coefficients remain statistically significant throughout. Of particular interest is the tendency for the ODL to increase materially from the earlier to the later windows. For example, the ODL increases from 50% in 2005 to 74% in 2012. 19 This increase in the ODL reflects either a rise in the net benefits of higher disclosure in annual report narratives over the years, or a drop in the annual report narrative disclosure threshold. Higher net benefits of disclosure could arise from firms' growing awareness over the years of the type of information that investors need in annual reports. The regulatory reporting initiatives of the FRC (outlined in Section 2.1.) targeted such awareness through revisions of the Code that responded to investor demands for more information on key drivers of performance and risk in annual reports (in Code revisions issued in 2003, 2006, 2008, 2012, 2014, and 2016). Later in our sample period regulators could also have contributed to lower preparation costs of annual report narratives by giving explicit guidance on what constitutes clutter (FRC, 2009, 2011), thereby accelerating firm-level learning effects. Through the lens of voluntary disclosure theories, an increase in the ODL could also reflect the gradual fall in disclosure thresholds from proprietary costs (Verrecchia, 1983) or disclosure thresholds allowed by investor uncertainty that managers are informed (Dye, 1985), a fall that is well anticipated in the presence of the FRC's enhanced frameworks for reporting on firm strategy and the business model, governance, and remuneration.
The final row of Table 4 reports the proportion of the three-year rolling window sub-samples with a Discindex value lower than the implied three-year ODL. The proportion of firms with Discindex below the ODL ranges from 41%, in the 2006-2008 rolling window, to 56%, in the 2009-2011 and the 2011-2013 rolling windows. The median proportion is 48%. There is a slight tendency for the proportion of firms with Discindex below the ODL to be greater the higher the value of the ODL. However, this association is far from perfect. The adjusted R-squared of a regression of the proportion of firms with Discindex lower than the ODL on the ODL is only 20.7%.
Table 4. Notes: The table shows the relation between a firm's overall disclosure index and the cost of equity capital with a linear term and a quadratic term for disclosure. We calculate the optima based on the coefficients on Discindex and Discindex_squared as described in the text and in Table 3. The regressions are based on a three-year rolling window and include industry fixed effects. We include our standard control variables (untabulated) (see Appendix 2 for definitions). The t-statistics are adjusted for firm error clustering; *, **, and *** denote significance at 10%, 5%, and 1%.
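The rolling-window bookkeeping behind this analysis can be sketched in Python as follows. The helper names `rolling_windows` and `share_below_odl`, the strict inequality used for "below", and the example values are all assumptions made for illustration; they are not taken from the paper's data or code.

```python
from typing import List, Sequence, Tuple


def rolling_windows(first_year: int, last_year: int, width: int = 3) -> List[Tuple[int, int]]:
    """Return (start_year, end_year) for every full window of `width` consecutive years."""
    return [(end - width + 1, end) for end in range(first_year + width - 1, last_year + 1)]


def share_below_odl(discindex_values: Sequence[float], odl: float) -> float:
    """Fraction of observations in a window whose Discindex lies strictly below the ODL."""
    if not discindex_values:
        raise ValueError("window contains no observations")
    below = sum(1 for d in discindex_values if d < odl)
    return below / len(discindex_values)


# The sample period 2003-2014 yields ten three-year windows, 2003-2005 to 2012-2014.
windows = rolling_windows(2003, 2014)

# Hypothetical Discindex values for one window and a hypothetical window ODL of 0.62.
example_share = share_below_odl([0.20, 0.45, 0.55, 0.70, 0.90], odl=0.62)
```

In a full replication, each window would first be used to re-estimate Equation (1) and derive that window's ODL, which would then be passed to `share_below_odl` together with the window's Discindex values.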

Firm Learning Effects
Given that disclosure is endogenous, it is possible that the non-linear U-shaped relation that we observe in the data is just an equilibrium 'illusion' from the cross-sectional distributions of different optimal associations (Larcker & Rusticus, 2007). This alternative interpretation of the core result assumes that firms know their optimum level of annual report narrative disclosure at every point in time and are able to costlessly adjust their levels of narrative disclosure to the optimum at every point in time. We believe that this assumption is unrealistic. We argue that firms face considerable difficulties in identifying their optimum level of narrative disclosure, partly because firm characteristics change over time, partly because accounting and disclosure regulations change over time, and partly because understanding how the market uses disclosures is not always an easy task. For these reasons, it is reasonable to view the level of narrative disclosure as being partially endogenous, as defined by Larcker and Rusticus (2007), when firms are unsuccessful in reaching optimal policies immediately. Under this view, firms learn from their mistakes over time: If they are below (above) the optimum level of narrative disclosure, they increase (decrease) their level of disclosure towards the optimum level. We next investigate these firm-level learning effects as they could also contribute to the non-linear U-shaped association between disclosure and the cost of equity capital.
To examine how firms learn to adjust their disclosure to optimum levels over time we adapt the Core and Guay (1999) modeling approach to the disclosure choice and cost of equity capital context. Core and Guay (1999) devise a two-stage approach to model changes in the level of stock-based compensation allowing for the possibility that firms adjust compensation packages over time in the direction implied by optimal portfolio incentives. In the first stage, they regress the level of stock-based compensation on a set of potential determinants of stock-based compensation such as firm size, idiosyncratic risk, book to market, a proxy for a potential free cash flow problem, and industry controls. In the second stage model, they relate the level of new stock-based compensation grants in year t, to the residual of the first stage model in year t − 1, controlling for firm size, book to market, net operating loss, cash flow shortfall, current and prior period stock returns, and industry effects.
In a similar fashion, we test for the possibility that firms adjust their volumes of narrative disclosure towards the ODL over time using a two-stage approach. 20 Our analysis of this learning effect requires the calculation of an ODL at the year level. To generate this, we estimate Equation (1) using a three-year rolling window, that is, we obtain coefficient estimates for Discindex and Discindex_squared each year using data from year t − 2 through to year t. 21 We then use these estimates to calculate the implied ODL for each year.
In the second stage, we predict that firms that had lower (higher) actual levels of disclosure, relative to the ODL, will increase (decrease) their level of disclosure in subsequent periods. Specifically, we use the results of the first-stage regression to define two measures of the distance of actual firm disclosures from the optimum (year-specific) disclosure levels: Lag_distance_overdisclosure is set equal to the actual disclosure level minus estimated ODL, if this value is positive, and zero otherwise, while Lag_distance_underdisclosure is set equal to the estimated ODL minus the actual disclosure level, if this value is positive, and zero otherwise.
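The two distance measures defined above can be expressed in a few lines of Python. The function names are hypothetical. The second helper applies only the two slope coefficients reported for Table 5, column (1), and leaves out the intercept and all control variables, so it shows the partial effect of the distance terms and is not a full prediction.

```python
from typing import Tuple


def disclosure_distances(actual: float, odl: float) -> Tuple[float, float]:
    """Return (over-disclosure, under-disclosure) relative to the estimated ODL.

    Each measure is the positive part of the gap in its direction, and zero otherwise,
    so at most one of the two is non-zero for any firm-year.
    """
    over = max(actual - odl, 0.0)
    under = max(odl - actual, 0.0)
    return over, under


def partial_adjustment(lag_over: float, lag_under: float,
                       coef_over: float = -0.383, coef_under: float = 0.101) -> float:
    """Contribution of the lagged distance terms to the change in Discindex.

    Assumption: intercept and controls are ignored; the default slopes are the
    Table 5, column (1) estimates quoted in the text.
    """
    return coef_over * lag_over + coef_under * lag_under


# Hypothetical firm that disclosed at 0.80 when the estimated ODL was 0.62.
over, under = disclosure_distances(actual=0.80, odl=0.62)
implied_change = partial_adjustment(over, under)
```

For this hypothetical over-discloser the distance terms imply a reduction in Discindex of about 0.07 in the following year, in the direction of the ODL.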
In our second stage we regress the current-year change in Discindex on one-year lagged over- and under-disclosure, along with disclosure model controls such as LogTA, MTBV, AbsAwca, ROA, Exantefinance, and SNetincome. In this regression, ΔDiscindex is the year-on-year change in Discindex, Lag_distance_overdisclosure and Lag_distance_underdisclosure are our measures of the prior year's over- and under-disclosure relative to the prior year's optimal disclosure, Exantefinance is an attempt to model the firm's need for external finance, and SNetincome is the standard deviation of net income over years t − 2 to t. We would expect disclosure to increase in external finance needs and earnings volatility. These controls follow Athanasakou, El-Haj, Rayson, Walker, and Young (2019). All variables are as defined in Appendix 2.
Table 5, column (1) reports the results of this test. As expected, the slope coefficient on Lag_distance_overdisclosure is significantly negative (−0.383, t-statistic = −14.2) while the slope coefficient on Lag_distance_underdisclosure is significantly positive (0.101, t-statistic = 5.81). These two coefficients suggest that previously over- and under-disclosing firms decrease and increase, respectively, their volume of annual report narrative disclosure in the following period, and that the extent of their adjustment depends on the degree to which they over- or under-disclosed in t − 1. Furthermore, these observations hold even after controlling for the negative serial autocorrelation that we observe in the data for ΔDiscindex: when we include the one-period lagged change in Discindex, Lag_ΔDiscindex, as a further control in column (3), the coefficients on Lag_distance_overdisclosure and Lag_distance_underdisclosure retain their predicted sign and remain significant. 22 Overall, we interpret the findings in Table 5 as evidence that firms learn about their optimum levels of disclosure over time and that they react and adjust their disclosure level towards their optima. 23
20 [...] endogenous variable, the disclosure level, differs from the optimum level in the first stage as the main variable in our second stage estimations.
21 We use a three-year window to ensure a reasonable number of observations for the first stage regression, taking into account the possibility that the optimum level can change gradually over time.
22 In an untabulated test, we also added the two-period lagged change in Discindex and found that the coefficients on Lag_distance_overdisclosure and Lag_distance_underdisclosure continued to show their predicted sign and remained significant (despite a further reduction in sample size to 1405 observations). Specifically, the two coefficients were −0.316 and 0.061, with t-statistics of −8.95 and 2.55. There was no evidence in the data of serial autocorrelation of third order. After controlling for first and second order autocorrelation, the coefficient on the three-period lagged change in Discindex is −0.005 with a t-statistic of −0.14.
Table 5. Notes: The table reports the year-on-year change in the disclosure index, Discindex, depending on whether the firm, in the prior year, was over- or under-disclosing relative to the sample-wide year-specific optimum level of disclosure in that year. Lag_distance_overdisclosure is set equal to the prior year actual disclosure level minus the prior year estimated optimal disclosure level if this value is positive, and zero otherwise. Lag_distance_underdisclosure is set equal to the prior year estimated optimal disclosure level minus the prior year actual disclosure level if this value is positive, and zero otherwise. The t-statistics are adjusted for firm error clustering; *, **, and *** denote significance at 10%, 5%, and 1%.

Regulatory Effects
An additional factor underlying the non-linear U-shaped relation between annual report disclosures and the cost of equity capital relates to the unintended consequences of regulatory initiatives for more information in narratives. As discussed in detail in Section 2.2, it is possible that, when faced with a rising regulatory demand for further narrative disclosures in the annual report, firms err on the side of caution and end up including unnecessary information. Peer disclosure dynamics, where firms mimic the increased disclosures of their peers (considering this as 'the socially acceptable level') instead of seeking their own firm-specific optimum, may further exacerbate this regulatory overload clutter. As such, regulation-induced clutter may also underlie firm-level learning effects by making it harder for firms to identify their optimum level of disclosure immediately. Over our sample period, two regulatory initiatives stand out: the 2006 company law mandate introducing the Business Review (BR) in annual reports and the 2010 revision of the Corporate Governance Code, which required firms to provide a clear explanation of the company strategy and business model in the annual report. In between these two regulatory initiatives the FRC published 'Louder than Words' (FRC, 2009), a set of principles for making corporate reports less complex and more relevant, and another set of 'cutting clutter' guidelines to encourage firms to remove unnecessary text and data from annual reports (FRC, 2011). Together these developments offer scope for investigating both the rise and the fall of clutter in annual report narratives.
To investigate this issue empirically, we regress our operational proxy for clutter, Distance_overdisclosure, on indicator variables for the two shifts: the 2006 Business Review (BR) introduction, Post_BR, and the subsequent focus on clarity culminating in the 2010 revision of the Corporate Governance Code, Post_CGREV. Post_BR identifies year-ends ending on or after 31 March 2006, while Post_CGREV identifies periods ending on or after 30 June 2011. Our proxy for clutter, Distance_overdisclosure, is the amount of over-disclosure, relative to optimal disclosure, and is defined in the same way as Lag_distance_overdisclosure in Section 5.3, but measured in period t, not in t − 1. We believe that the market-perceived optimum level of narrative disclosure, as implied by the cost of equity capital, is an important filter in identifying clutter because the FRC's clutter definition embeds a users' perspective, that is, rises in annual report narratives are not clutter unless they inhibit users' ability to identify relevant information.
[...]
We then assess the sign and significance of each scaled measure's slope coefficient to assess which features of the annual report the firm should extend (and which features it can possibly shorten). Note that, for this test, there is no theoretical prior for the impact of the individual components; the motivation is rather empirical: U.K. policy developments explicitly encouraged specific types of commentary related to performance analysis, strategy, corporate governance (including remuneration), and forward-looking information. So, it seems worth asking whether changes in the relevant proportions of these types of commentary were reflected in cost of capital improvements. Table 7, columns (1) to (5) initially report the results of regressing the cost of equity capital on the five measures individually.
The loadings on the five scaled measures in (1) to (5) are all negative suggesting that an increase in the relative volume of all five measures would reduce the cost of equity capital. However, while Scaled_Perf is highly significant, with a t-statistic of − 4.59, Scaled_Forward is only marginally significant, with a t-statistic of − 1.85.
Differences in significance between the five measures become even more pronounced when we include in column (9) all five measures in one regression, together with Fog, Log_Rearend, Log_Rearend_squared, Log_Restfront, and Log_Restfront_squared as further controls. Specifically, in this specification the loadings on Scaled_Perf and Scaled_Gov are negative and significant. Scaled_Strategy is also negative but not significant at conventional levels, and Scaled_Causal and Scaled_Forward are positive but not significant. Overall, this result suggests that firms could have reduced their cost of equity capital by increasing management performance commentary and governance and remuneration sections, while a reduction in Scaled_Strategy, Scaled_Causal, and Scaled_Forward would have had no significant effect on the cost of equity capital.
In terms of control variables, we note that the loading on Log_Restfront is negative, but insignificant, and the coefficient on Log_Restfront_squared is close to zero. Also, the loading on Log_Rearend is negative and significant, while the loading on Log_Rearend_squared is positive and significant. Thus, the results for Rearend are consistent with the presence of a U-shaped relation for the (absolute) volume of rear-end narratives, consistent with our earlier finding. In summary, the findings in Table 7 are consistent with the focus of U.K. disclosure policy on Perf and Gov (including remuneration).

Conclusion
The relation between the cost of equity capital and annual report disclosure is a long-debated research topic in accounting and finance. Prior literature has assumed a linear relation between disclosure and the cost of equity capital. We contribute to the literature by proposing and testing for a U-shaped relation between the cost of equity capital and disclosure. We use a word-count-based computer-generated index of U.K. annual report disclosures to provide new evidence for a panel of 5152 firm-year observations over the period 2003-2014. This makes it the largest sample used to date to address this type of research question.
Our approach to measuring U.K. annual report disclosure content reflects the institutional framework of the U.K., where the annual report is seen as a one-stop shop for information that extends and complements the information contained in the financial statements. It also pays particular attention to regulatory developments over the sample period that focused company attention on disclosures related to performance commentary, forward-looking and strategic information, and commentary related to corporate governance and directors' remuneration. In line with prior literature and related regulatory concerns, our approach to disclosure measurement also incorporates indicators of annual report length and readability.

In addition to finding a U-shaped relation between the cost of capital and our overall disclosure index we also present disaggregated findings which suggest that firms could reduce their cost of equity capital by increasing management performance commentary and governance and remuneration sections.
Taken together, our results show that annual report disclosures are useful to investors but that more disclosure is not always better, thereby reconciling the theoretical tension that exists on the controversial roles of both mandatory and voluntary disclosures. We verify empirically the existence of the growing issue of clutter within annual report narratives.

Supplemental Data and Research Materials
Supplemental data for this article can be accessed on the Taylor & Francis website, doi:10.1080/09638180.2019.1707102
Section A) Further details on the disclosure index, Discindex, and its components.
Section B) and Table IA.3 provide details on the estimation of the implied cost of equity capital.
Section C) Correlation table for main regression variables.
Section D) Robustness tests relating to the main regression model.
Section E) Results of using nonparametric regression to infer the functional form.