Information externalities, analyst research resource allocation, and stock pricing efficiency

ABSTRACT Information is key to decision-making and is a major determinant of investment performance. We hypothesise that as analysts are constrained by their research resources, they collect information with more externalities. Measuring information externalities as a stock’s fundamental correlations with other stocks in the same industry, we find that stocks with high information externalities get more analyst coverage, more (high-quality) analyst reports, and more site visits than those with low information externalities. At the analyst level, we find consistent evidence that individual analysts allocate more effort to firms with high information externalities in their stock portfolios. Consequently, we further demonstrate that stocks with high information externalities are associated with less information delay and are priced more efficiently.


Introduction
The efficient market hypothesis (Fama, 1970) states that information is always incorporated into stock prices in a timely, sufficient, and accurate manner. However, in practice, investors spend a lot of time and energy searching for and processing information; therefore, stock prices can be only partially efficient under the constraint of information acquisition costs (Grossman & Stiglitz, 1980). Thousands of stocks are listed and traded in today's stock markets; thus, it is important to understand how investors allocate scarce resources to research and acquire valuable information. In this study, we examine the effect of information externalities on the information production of sell-side analysts.
Economic theories point out that information is a special kind of good because it has a high acquisition cost while its replication cost is negligible (Romer, 1990; Veldkamp, 2006). Therefore, the value of information hinges on whether that information is useful for a set of stocks or just one stock. Clearly, the larger the set of stocks, the higher the value of information. In the context of stock trading, the value of the information obtained from one particular stock lies in two dimensions. The first is that information collected from a stock always reflects its own fundamentals. The second is that the information can be used to price other stocks, for example, other stocks in the same industry, geographic region, or supply chain. In this study, we refer to the value of information reflected in the second dimension as information externalities. As the replication cost of information is negligible compared with its acquisition cost, if that information can be applied to a sufficiently large number of stocks, then information externalities will be sufficiently high. Therefore, we assume that analysts will allocate more research resources to stocks with high information externalities.
Following the design of Hameed et al. (2015), we measure a firm's information externalities as the correlation between its fundamentals (return on assets or ROA) and the fundamentals of other firms within the same industry. If a firm has strong fundamental correlations with its industry peers, by devoting research resources to that firm, analysts can obtain information that is valuable not only for the firm itself but also for other firms in the same industry. However, if analysts allocate their research effort to firms with weak fundamental correlations, the information acquired from that firm can only be used to price just a few stocks.
Although analysts are not the only market participants who are concerned about the external value of information, they are the most suitable subjects for us to examine the relationship between information externalities and the allocation of limited research resources for the following reasons. First, as proposed in prior literature (Driskill et al., 2020; Harford et al., 2019), due to constraints on attention and cognitive capacity, analysts need to optimise their resource allocation to maximise their utility function. Second, analysts are an important information intermediary in stock markets. They provide research and investment recommendations to meet the information needs of investors, and their work has a substantial impact on stock prices. Third, the availability of analyst data allows us to quantify analysts' resource allocation in multiple aspects, such as stock coverage and the frequency, quality, and timeliness of forecasts. Furthermore, analysts make forecasts for multiple firms, which allows us to explore how individual analysts balance their research resources within their portfolios.
Using a large sample of A-share listed firms from 2007 to 2018, we find that firms with higher information externalities are covered by more analysts. Analysts also issue more reports, revise earnings forecasts more frequently, and conduct more site visits for these firms. We also explore research resource allocation at the individual analyst level. By adding analyst × year fixed effects to analyst-level regressions, we can estimate whether analysts are selective in their resource allocation using variations within their portfolios. We find that analysts produce more in-depth reports and revise earnings forecasts more frequently for firms with high information externalities than for those with low information externalities. We also find that analyst forecast revisions are timelier and more informative for such firms. Collectively, these findings suggest that analysts expend their research resources strategically and devote more effort to researching firms with higher information externalities.
It is well known that pricing efficiency is largely determined by investors' information acquisition incentives (Grossman & Stiglitz, 1980; Verrecchia, 1982), and the information production of sell-side analysts strongly contributes to firms' information environment (Derrien & Kecskés, 2013; Harford et al., 2019). Therefore, we further examine whether stocks with different levels of information externality have different pricing efficiency. Given that firms with higher information externalities receive more research resources from analysts, we expect industry information to be incorporated rapidly into the prices of stocks with high information externalities but only with a significant delay into the prices of stocks with low information externalities. This prediction is confirmed by the lead-lag effect in time series tests. We find that the lagged returns of stocks with high information externalities significantly predict the current returns of stocks with low information externalities, but the lagged returns of stocks with low information externalities have no explanatory power for the current returns of stocks with high information externalities. In addition, we find consistent evidence in cross-sectional multivariate regression tests, in which we regress the firm-specific information delay measures developed by Hou and Moskowitz (2005) on information externalities and find significant and negative coefficients.
Next, we use earnings announcements to examine whether investors pay more attention to stocks with high information externalities. If so, we expect the earnings information of high information externality stocks to also be useful in pricing the stocks of their industry peers. Consistent with our hypothesis, we find that the earnings announcements of high information externality stocks have greater spillover effects on other stocks within the same industry than those of low information externality stocks.
We find that information externalities also exist within geographic regions and supply chains. Using a province-level information externality measure, we replicate the main empirical tests and obtain similar findings. Furthermore, due to the close economic links within supply chains, we hypothesise that a stock's information externalities will increase significantly after the initial public offering (IPO) of any of its main customers or suppliers. Taking the IPOs of a firm's customers or suppliers as exogenous shocks and using a difference-in-differences approach, we find that firms receive significantly more research resources from analysts after the IPO of their customers or suppliers.
Finally, the importance of information externalities should be recognised by all stock market participants who trade a large number of stocks, not just by analysts. Although the behaviour of stock market participants other than analysts is usually unobservable, we manage to test it by using site visit data. We find that mutual funds and other institutional investors are more likely to conduct site visits to firms with high information externalities, which further supports the validity of our information externality theory for other market participants.
Our study makes several contributions to the literature. First, we investigate the research resource allocation of sell-side analysts in China's A-share market and present information externality as a new determinant of such allocation. In this way, our study extends the theoretical and empirical findings of Peng and Xiong (2006), Kacperczyk et al. (2014), Kacperczyk et al. (2016), and Harford et al. (2019).
Second, our study contributes to the literature on limited attention and rational inattention. It is generally accepted that investors with limited attention are prone to ignore valuable information. However, the literature ignores the possibility of investors strategically allocating their limited attention by choosing to be rationally inattentive to some stocks. Our study, together with those of Peng and Xiong (2006) and Van Nieuwerburgh and Veldkamp (2009), demonstrates that many market phenomena, such as portfolio choice, stock price synchronicity, and pricing efficiency, can be the endogenous result of investors' optimal resource allocation strategy under time, attention, and energy constraints.
Third, prior literature shows that pricing efficiency depends on the information acquisition incentives of investors, which in turn depend on information acquisition costs (Grossman & Stiglitz, 1980; Verrecchia, 1982). We add to this literature by demonstrating that information externalities are also an important determinant of information acquisition incentives, and thus advance our understanding of market efficiency.
The remainder of this paper is organised as follows. Section 2 reviews the related literature and develops the hypotheses. Section 3 describes the research design. Section 4 presents and discusses the main empirical results. Section 5 provides additional evidence and Section 6 concludes the paper.

Literature review and hypothesis development
Since the pioneering study of Kahneman (1973), it has been widely recognised that investors have limited attention; many studies in various settings provide compelling evidence for it. For example, investors' reaction to earnings announcements is weaker and the post-earnings announcement drift (PEAD) is stronger when investors are overloaded with many same-day earnings announcements made by different firms (Hirshleifer et al., 2009). As investors are more likely to be distracted on Fridays, PEAD is more pronounced for earnings announced on Fridays (DellaVigna & Pollet, 2009).
Although limited attention is often deemed to be a feature of retail investors (Barber & Odean, 2008), a growing literature provides evidence that professional investors are also constrained by it. Kempf et al. (2017) find that institutional investors are unable to act as corporate monitors when their attention is distracted by industries with extreme returns. Ben-Rephael et al. (2017) show that price drifts following earnings announcements and analyst recommendation changes are largely caused by the insufficient attention of institutional investors. Driskill et al. (2020) demonstrate that analysts update their earnings forecasts less frequently when multiple portfolio firms announce earnings on the same day.
The scarcity of research resources highlights the importance of allocation. Peng and Xiong (2006) propose that, due to limited attention, rational investors tend to process more market and industry information than firm-specific information. This type of allocation strategy generates categorical thinking and comovement in stock returns. Huang et al. (2019) use large jackpot lotteries that distract investor attention from the stock market as exogenous shocks, and find that stock price synchronicity is significantly higher on large jackpot days than on other days. Therefore, the findings of Huang et al. (2019) support the theory proposed by Peng and Xiong (2006).
The allocation of research resources is also related to portfolio choice. When choosing stocks to hold, investors are free to acquire information about stocks they are already well informed about (such as local stocks) or learn about unfamiliar stocks. Van Nieuwerburgh and Veldkamp (2009) consider investors' information acquisition and portfolio decisions jointly and find that investors strategically allocate their research resources to further specialise in what they know well. Their theory effectively explains why investors continue to hold undiversified portfolios tilting towards local stocks even when information is readily available in the modern information era. Kacperczyk et al. (2014) and Kacperczyk et al. (2016) examine how fund managers allocate research resources at different stages of economic cycles. They find that fund managers devote more research resources to timing macroeconomic trends during recessions and to stock picking during booms. Research resource allocation is also an important issue for analysts. Clement (1999) points out that an oversized portfolio impedes the quality of analyst earnings forecasts. Harford et al. (2019) examine how the career concerns of analysts affect the way they allocate research resources. They find that analysts allocate more research effort to portfolio stocks that are more important to their careers, for example, stocks that generate more commissions for brokerage houses.
Our study extends the literature by examining how analysts allocate their efforts from a different perspective: information externalities. As pointed out by Romer (1990) and Veldkamp (2006), the most prominent feature of information processing is the scale effect: information has high fixed costs and a near-zero cost of use. Therefore, all else being equal, the value of information is mainly determined by its range, which is the number of stocks in a sector (for example, an industry) where that information can be used. Analysts mainly provide information services to institutional investors. Although institutional investors value the skill of stock picking, they care much more about the performance of their entire portfolios. Therefore, to better meet the needs of their clients, the optimal choice for resource-constrained analysts is to acquire information that can guide the portfolio decisions of institutional investors. In other words, analysts should pay close attention to information externalities.
We visualise the above arguments in Figure 1. Suppose that an analyst needs to advise their client, an institutional investor, and has to decide how to allocate effort between stock A and stock Z. As shown in the graph, the information acquired from stock A is useful not only for stock A but also for a set of other stocks, including stocks B, C, D, E, and others; stock A therefore has high information externalities. Conversely, the information collected from stock Z can only be used for its own trading decision; thus, stock Z has low information externalities. Hence, the optimal strategy for the analyst is to spend more research effort on stock A to become a better advisor.
Based on the analysis above, we propose the first empirical hypothesis as follows.
Hypothesis 1: Ceteris paribus, firms with high information externalities receive more research resources from analysts than firms with low information externalities.
It is well known that stock pricing efficiency is largely influenced by investors' information acquisition incentives (Grossman & Stiglitz, 1980; Verrecchia, 1982), which in turn are a function of the costs and benefits of acquiring information from that stock. As mentioned above, analysts are likely to exert more effort on stocks with high information externalities; therefore, information externalities are likely to have implications for pricing efficiency.
As information externalities affect firms connected by economic links, such as the same industry or geographic proximity, we consider the diffusion of sector-wide information among stocks of the same sector and pricing efficiency accordingly. Prior literature (Brennan et al., 1993; Chordia & Swaminathan, 2000; Lo & MacKinlay, 1990) shows that some stocks respond to market information rapidly, while others exhibit significant information delay and adjust sluggishly. In other words, the returns of efficiently priced stocks lead the returns of inefficiently priced stocks (lead-lag effect). Hou (2007) finds that, within the same industry, big firms respond to industry information more rapidly than small firms, as big firms have lower information collection costs and more transparent information environments. Similarly, we argue that compared with stocks with low information externalities, stocks with high information externalities are priced more efficiently in the sense of information delay because such firms are associated with higher information collection benefits for analysts and other stock market participants. Therefore, we propose the second hypothesis as follows.
Hypothesis 2: Ceteris paribus, firms with high information externalities are associated with higher pricing efficiency than firms with low information externalities.

Sample and data
Our initial sample consists of all non-financial companies listed on China's A-share stock market. We require quarterly financial data for the last 5 consecutive years to construct InfoExtVal, our information externality measure. As A-share firms started disclosing quarterly financial statements from 2002 onwards, the sample period is from 2007 to 2018, with the firms in the sample having to be listed for no less than 5 years. Financial and site visit data are obtained from the China Stock Market and Accounting Research (CSMAR) database, while analyst data are retrieved from the Suntime database.

Key variable construction
Following Hameed et al. (2015), we measure InfoExtVal, a stock's information externalities, as the average correlation between the firm's fundamentals and the fundamentals of other firms in the same industry. Industries are classified according to the 2001 CSRC (China Securities Regulatory Commission) categories, and the manufacturing industry is further divided into second-level subcategories.
To calculate InfoExtVal for firm k, we first pair firm k with firm i in the same industry (k, i). We regress firm i's $ROA_{i,q}$ on $ROA_{M,q}$ (the market average ROA excluding that of firm i) over the previous 5-year window using quarterly financial data in equation (1) to obtain the $R^2$ of regression (1), $R^2_{i,\mathrm{excl},k}$. $R^2_{i,\mathrm{excl},k}$ measures the extent to which firm i's fundamentals can be explained by $ROA_{M,q}$.
Next, we use equation (2) to regress firm i's $ROA_{i,q}$ on $ROA_{M,q}$ and firm k's $ROA_{k,q}$ simultaneously over the same 5-year window to obtain $R^2_{i,\mathrm{incl},k}$, the $R^2$ of regression (2). Relative to $R^2_{i,\mathrm{excl},k}$, $R^2_{i,\mathrm{incl},k}$ reflects the incremental explanatory power of firm k on firm i's fundamentals.
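Equations (1) and (2) are not reproduced in this text. Based only on the verbal description above, they would plausibly take the following form; the coefficient symbols $\alpha$, $\beta$, $\gamma$ and the error term $\varepsilon$ are our notational assumptions, not necessarily those of the original equations.

```latex
\begin{align}
ROA_{i,q} &= \alpha_{i} + \beta_{i}\, ROA_{M,q} + \varepsilon_{i,q} \tag{1} \\
ROA_{i,q} &= \alpha_{i} + \beta_{i}\, ROA_{M,q} + \gamma_{i,k}\, ROA_{k,q} + \varepsilon_{i,q} \tag{2}
\end{align}
```

Both regressions are estimated on the same 5-year window of quarterly data, so $R^2_{i,\mathrm{incl},k} \geq R^2_{i,\mathrm{excl},k}$ holds by construction because model (1) is nested in model (2).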
Then, we extract firm k's incremental explanatory fraction on firm i's fundamentals using equation (3).
Next, we pair firm k with all other firms in the same industry and rerun equations (1), (2), and (3) for each pair. $PCORR\_ROA_{k}$ is the average of $PCORR\_ROA_{k,i}$ across all (k, i) pairs. If an industry has N firms, we obtain N − 1 firm pairs matched with firm k. Finally, $InfoExtVal_{k}$ is defined as the log-transformed value of $PCORR\_ROA_{k}$, as shown in equation (4). A higher $InfoExtVal_{k}$ means larger information externalities of firm k, as firm k's ROA becomes more informative for the fundamentals of other firms in the industry.
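The pairwise procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: equations (3) and (4) are not shown in the text, so the sketch assumes that the incremental fraction is (R²_incl − R²_excl) / (1 − R²_excl) and that the log transform is log(p / (1 − p)); the function names, the synthetic data, and the use of a single market series for every firm are likewise hypothetical.

```python
import numpy as np

def r_squared(y, regressors):
    """R-squared of an OLS regression of y on a constant and the regressors."""
    X = np.column_stack([np.ones(len(y)), regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def pcorr_roa(roa_i, roa_k, roa_m):
    """Incremental explanatory fraction of firm k for firm i (ASSUMED form of eq. 3)."""
    r2_excl = r_squared(roa_i, roa_m)                             # regression (1)
    r2_incl = r_squared(roa_i, np.column_stack([roa_m, roa_k]))   # regression (2)
    return (r2_incl - r2_excl) / (1.0 - r2_excl)

def info_ext_val(industry_roa, market_roa_excl, k):
    """Average the pairwise fractions over all peers i != k, then log-transform.

    industry_roa:    (N, T) quarterly ROA of the N firms in one industry
    market_roa_excl: (N, T) row i is the market-average ROA excluding firm i
    The logit-style transform is an ASSUMED form of eq. (4).
    """
    n_firms = industry_roa.shape[0]
    fractions = [
        pcorr_roa(industry_roa[i], industry_roa[k], market_roa_excl[i])
        for i in range(n_firms) if i != k
    ]
    p = float(np.mean(fractions))
    return np.log(p / (1.0 - p))

# Synthetic illustration: firms 0-3 share an industry factor, firm 4 is unrelated.
rng = np.random.default_rng(0)
T = 20                                    # 5 years of quarterly observations
market = rng.normal(size=T)
factor = rng.normal(size=T)
linked = [0.5 * market + factor + 0.1 * rng.normal(size=T) for _ in range(4)]
isolated = rng.normal(size=T)
industry_roa = np.vstack(linked + [isolated])
market_roa_excl = np.tile(market, (5, 1))  # simplification: same series for all firms

high = info_ext_val(industry_roa, market_roa_excl, k=0)
low = info_ext_val(industry_roa, market_roa_excl, k=4)
print(high, low)
```

In this toy industry, the firm that shares the common factor with its peers should receive the higher value, which is the sense in which the measure captures how informative one firm's ROA is for the others.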

Information externalities and research resource allocation
We run regression (5) to test whether analysts allocate their research effort according to information externalities.
In model (5), subscript i denotes the firm and subscript t denotes the year. We measure the dependent variable Allocation for firm i in year t in different ways: the number of analysts following ($Coverage_{i,t}$), the number of new analysts following ($NewCov_{i,t}$), the number of reports issued by analysts ($Rpt_{i,t}$), the number of earnings revisions made by analysts ($Rev_{i,t}$), and the number of site visits by analysts ($SiteVisit_{i,t}$). As the site visit data are only available for firms listed on the Shenzhen Stock Exchange (SZSE) after 2012, the corresponding sample where $SiteVisit_{i,t}$ is the dependent variable contains only SZSE-listed firms, and the sample period is from 2012 to 2018. In model (5), X represents a set of control variables. Following Lang and Lundholm (1996) and Hameed et al. (2015), we control for firm size (Mv), book-to-market ratio (Btm), return on equity (Roe), institutional investors' shareholding ratio (InsHold), stock turnover (Turnover), and stock volatility (RetStd). We also control for IndStatus, the average of the relative ranking of a firm's size, revenue, and ROA within the industry (from 1 to 5, in ascending order), to ensure that InfoExtVal measures a firm's information externalities beyond its position in the industry. We also include IND-YEAR, two-dimensional higher-order fixed effects, to ensure that all parameters are estimated using variations within intersections of industries and years. The standard errors of regression (5) are clustered by industry, and we expect β to be significantly positive.
We also conduct analyst-level analysis to further examine how individual analysts allocate their time and effort. In regression (6), subscripts i, j, and t denote firm i, analyst j, and year t, respectively. By controlling for high-dimensional fixed effects (ANALYST-YEAR), β estimates how individual analysts allocate their time and effort within their stock portfolios. As in regression (5), in regression (6) we measure analyst j's effort allocated to stock i in year t by the number of reports ($AnaRpt_{i,j,t}$) and the number of earnings revisions ($AnaRevision_{i,j,t}$) made by analyst j. In addition, we consider the number of in-depth reports issued by analyst j that forecast both earnings and revenue ($Thoroughness_{i,j,t}$) and analyst j's response timeliness to firm i's annual report ($ResponseDays_{i,j,t}$), that is, the number of days between firm i's annual earnings announcement and analyst j's update of their earnings forecasts. Finally, we use the magnitude of the market reaction to analysts' earnings forecast revisions (MktReaction) as an indirect measure of analyst effort allocation. Our logic here is that more effort leads to higher-quality reports and therefore greater price impacts on the stock market. We measure MktReaction as the cumulative excess return within the [−1, +1] analyst earnings forecast revision window. In regressions (5) and (6), both InfoExtVal and the control variables are lagged by 1 year relative to the dependent variable. To ensure the robustness of our empirical results, the standard errors of regression (6) are double-clustered by analyst and by firm. Furthermore, to mitigate the impact of outliers, all continuous variables used are winsorised at the 1% and 99% levels.
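To illustrate what estimating β "using variations within intersections of industries and years" means, the following sketch absorbs IND-YEAR fixed effects by demeaning every variable within each industry-year cell before running OLS (the standard within transformation). It is a hypothetical illustration on simulated data, not the estimation code behind regression (5): the function name, the variable names, and the simulated coefficients are assumptions, and the industry-clustered standard errors used in the paper are not computed here.

```python
import numpy as np

def within_ols(y, X, groups):
    """Slope estimates from OLS after demeaning y and X within each group.

    Demeaning within industry-year cells is equivalent to including a full set
    of IND-YEAR dummies, so slopes are identified from within-cell variation.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    _, codes = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(codes)

    def demean(v):
        return v - (np.bincount(codes, weights=v) / counts)[codes]

    y_dm = demean(y)
    X_dm = np.column_stack([demean(X[:, j]) for j in range(X.shape[1])])
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Simulated firm-year panel: 30 industry-year cells with 20 firms each.
rng = np.random.default_rng(0)
n_groups, per_group = 30, 20
labels = [f"IND{g % 6}-{2007 + g // 6}" for g in range(n_groups)]
groups = np.repeat(labels, per_group)
cell_effect = np.repeat(rng.normal(size=n_groups), per_group)
n_obs = n_groups * per_group

info_ext = cell_effect + rng.normal(size=n_obs)   # correlated with the cell effect
size_ctrl = rng.normal(size=n_obs)                # stands in for a control such as Mv
allocation = (0.4 * info_ext + 1.0 * size_ctrl
              + 2.0 * cell_effect + 0.5 * rng.normal(size=n_obs))

beta = within_ols(allocation, np.column_stack([info_ext, size_ctrl]), groups)
print(beta)  # first entry estimates the (simulated) coefficient on info_ext
```

Because the simulated cell effect is correlated with the regressor of interest, a pooled OLS without the within transformation would overstate the slope; the demeaned regression removes that common component.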
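Winsorising at the 1% and 99% levels replaces values beyond those percentiles with the percentile values themselves, rather than dropping the observations. A minimal sketch, assuming the percentiles are computed over the pooled sample (the text does not say whether winsorisation is done by year or pooled) and using a hypothetical helper name:

```python
import numpy as np

def winsorise(values, lower=1.0, upper=99.0):
    """Clip values to the [lower, upper] percentile range of the sample."""
    x = np.asarray(values, dtype=float)
    lo, hi = np.percentile(x, [lower, upper])
    return np.clip(x, lo, hi)

# Illustrative data: the integers 1..101, so the tails are easy to inspect.
raw = np.arange(1, 102, dtype=float)
clean = winsorise(raw)
print(clean.min(), clean.max())
```

The sample size is unchanged, which matters when the winsorised variables are later merged back into the firm-year or analyst-firm-year panel.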

Information externalities and pricing efficiency
Analysts and other participants in the stock market optimise their limited research resources by allocating more time and effort to stocks with higher information externalities, which in turn could have a significant effect on pricing efficiency. We expect industry information to be rapidly incorporated into the price of stocks with high information externalities but with a delay for stocks with low information externalities. To test the above inferences, we use time series and multivariate cross-sectional regression models.
Following Hou (2007), for each year and industry, we build a High_InfoExtVal portfolio containing stocks with InfoExtVal above the 70th percentile of the industry, and a Low_InfoExtVal portfolio containing stocks with InfoExtVal below the 30th percentile of the industry. We compute daily portfolio returns for the High_InfoExtVal and Low_InfoExtVal portfolios for each industry from 1 May 2007 to 30 April 2019. Then, we test whether there is a significant lead-lag effect between the High_InfoExtVal and Low_InfoExtVal portfolios using a vector autoregressive (VAR) model, as shown in equation system (7),
where H and L stand for the High_InfoExtVal and Low_InfoExtVal portfolios, respectively. The subscript IND stands for industry, and the K above the summation symbol represents the highest order of time lag. To improve robustness, we report the results for both the first-order and fifth-order VAR models. The time series model depicts the diffusion of industry information from one category of stocks to another. According to the information delay hypothesis, industry information should first be transmitted to the prices of stocks with high InfoExtVal and then, with a time lag, to the prices of stocks with low InfoExtVal. That is, in the VAR model of equation (7), the lagged returns of portfolio H, $RetH_{IND,t-k}$, can significantly predict the current return of portfolio L, $RetL_{IND,t}$. However, the lagged returns of portfolio L, $RetL_{IND,t-k}$, should not have significant predictive ability for the returns of portfolio H, that is, $c_1 + \dots + c_K > b_1 + \dots + b_K$. Similar to Hou (2007), we estimate the VAR models jointly across all industries.
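The lead-lag logic can be illustrated with a first-order VAR estimated equation by equation with OLS. This is a simplified sketch for a single industry with simulated returns: the paper estimates the system jointly across industries and also reports a fifth-order model, and the function name, the return volatility, and the simulated spillover of 0.5 are assumptions made only for illustration.

```python
import numpy as np

def var1_cross_effects(ret_h, ret_l):
    """Estimate a VAR(1) for (RetH, RetL) and return the two cross-lag coefficients."""
    h = np.asarray(ret_h, dtype=float)
    l = np.asarray(ret_l, dtype=float)
    # Regressors: constant, lagged RetH, lagged RetL.
    X = np.column_stack([np.ones(len(h) - 1), h[:-1], l[:-1]])
    coef_l, *_ = np.linalg.lstsq(X, l[1:], rcond=None)  # equation for RetL_t
    coef_h, *_ = np.linalg.lstsq(X, h[1:], rcond=None)  # equation for RetH_t
    return {
        "lagged_H_on_L": coef_l[1],  # does portfolio H lead portfolio L?
        "lagged_L_on_H": coef_h[2],  # does portfolio L lead portfolio H?
    }

# Simulated daily returns in which H leads L by one day.
rng = np.random.default_rng(0)
n_days = 2000
ret_h = rng.normal(0.0, 0.01, n_days)
ret_l = rng.normal(0.0, 0.01, n_days)
ret_l[1:] += 0.5 * ret_h[:-1]

effects = var1_cross_effects(ret_h, ret_l)
print(effects)
```

Under the information delay hypothesis, the coefficient on lagged H in the L equation should be clearly positive while the coefficient on lagged L in the H equation should be close to zero, which is the asymmetry the inequality in the text tests.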
Following Hou and Moskowitz (2005), Kong et al. (2013), and Hu et al. (2016), we also construct IndDelay, a series of information delay measures for individual stocks.
where $r_{i,t}$ is the daily stock return of firm i, $r_{M,t}$ is the market return, and $r_{IND,t}$ is the portfolio return of the industry to which firm i belongs. At the end of April of each year t + 1, we regress each stock's daily returns on the same-day and previous 5 days' returns of the market and of the corresponding industry portfolio, using data from May of year t to April of year t + 1.
The first individual stock delay measure, IndDelay1, is the fraction of returns explained by lagged industry portfolio returns. As shown in equation (9), IndDelay1 is 1 minus the ratio of the $R^2$ of a restricted regression, where $\delta_k$ is restricted to 0 for all $k \in [1, 5]$, to the $R^2$ of regression (8).
The second information delay measure, IndDelay2, is given in equation (10); it is the ratio of the five lagged coefficient estimates to the sum of the current and lagged estimates.
As shown in equation (11), relative to IndDelay2, the third delay measure, IndDelay3, further incorporates the precision of the estimates, where $se(\delta_k)$ denotes the standard error of $\delta_k$. IndDelay1, IndDelay2, and IndDelay3 are higher when the diffusion of industry information is slower.
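The first of these measures can be sketched as follows. Equation (8) is not reproduced in the text, so the sketch assumes that the unrestricted regression contains the contemporaneous and five lagged returns of both the market and the industry portfolio, and it computes IndDelay1 in the manner of Hou and Moskowitz (2005) as 1 minus the restricted R² over the unrestricted R². All function names and the simulated return processes are hypothetical.

```python
import numpy as np

def lagged_matrix(x, n_lags):
    """Columns are x_t, x_{t-1}, ..., x_{t-n_lags} for t = n_lags, ..., T-1."""
    T = len(x)
    return np.column_stack([x[n_lags - k: T - k] for k in range(n_lags + 1)])

def r_squared(y, regressors):
    """R-squared of an OLS regression of y on a constant and the regressors."""
    X = np.column_stack([np.ones(len(y)), regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def ind_delay1(r_stock, r_market, r_industry, n_lags=5):
    """1 - R2(restricted) / R2(unrestricted); lagged industry terms are dropped
    in the restricted model (ASSUMED specification of regression (8))."""
    r_stock = np.asarray(r_stock, dtype=float)
    y = r_stock[n_lags:]
    mkt = lagged_matrix(np.asarray(r_market, dtype=float), n_lags)
    ind = lagged_matrix(np.asarray(r_industry, dtype=float), n_lags)
    r2_full = r_squared(y, np.column_stack([mkt, ind]))
    r2_restricted = r_squared(y, np.column_stack([mkt, ind[:, 0]]))
    return 1.0 - r2_restricted / r2_full

# Simulated year of daily returns: one stock reacts on the same day,
# the other reacts to industry news with a one-day lag.
rng = np.random.default_rng(1)
T = 300
r_market = rng.normal(0.0, 0.01, T)
r_industry = r_market + rng.normal(0.0, 0.01, T)
prompt_stock = r_industry + rng.normal(0.0, 0.005, T)
delayed_stock = rng.normal(0.0, 0.005, T)
delayed_stock[1:] += r_industry[:-1]

delay_prompt = ind_delay1(prompt_stock, r_market, r_industry)
delay_delayed = ind_delay1(delayed_stock, r_market, r_industry)
print(delay_prompt, delay_delayed)
```

Because the restricted model is nested in the unrestricted one, the measure lies between 0 and 1, and it is larger for the stock whose price absorbs industry news with a lag.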
We test whether InfoExtVal is systematically associated with IndDelay in multivariate cross-sectional regression (12). Compared with the time series test, the advantage of the regression method is that we can control for multiple variables related to information delay simultaneously. If stocks with higher InfoExtVal are priced more efficiently with less information delay, the regression coefficient β is expected to be significantly negative.
Consistent with regression (5), we include high-dimensional IND-YEAR fixed effects, cluster standard errors by industry, and winsorise all continuous variables at the 1st and 99th percentiles. Table 1 reports the detailed definitions of all variables used in the empirical analysis.

Table 1. Variable definitions

InfoExtVal: Industry-level information externality measure following Hameed et al. (2015). The construction of InfoExtVal is described in detail in Section 3.2.
$ProInfoExtVal_{i,t-1}$: Geographic region-level information externality measure constructed like InfoExtVal.

Analyst Research Resource Allocation
$Coverage_{i,t}$: The natural logarithm of (1 + the number of analysts following firm i in year t).
$NewCov_{i,t}$: The natural logarithm of (1 + the number of new analysts following firm i in year t).
$Rpt_{i,t}$: The natural logarithm of (1 + the number of reports issued by analysts following firm i in year t).
$Rev_{i,t}$: The natural logarithm of (1 + the number of earnings forecast revisions issued by analysts following firm i in year t).
$SiteVisit_{i,t}$: The natural logarithm of (1 + the number of site visits by analysts for firm i in year t).
$AnaRpt_{i,j,t}$: The natural logarithm of (1 + the number of reports issued by analyst j for firm i in year t).
$AnaRevision_{i,j,t}$: The natural logarithm of (1 + the number of earnings forecast revisions issued by analyst j for firm i in year t).
$ResponseDays_{i,j,t}$: The natural logarithm of (1 + the number of days between the announcement of firm i's annual report and analyst j's update of their earnings forecast for firm i).
$Thoroughness_{i,j,t}$: The natural logarithm of (1 + the number of in-depth reports containing both revenue and earnings forecasts issued by analyst j for firm i in year t).
$MktReaction_{i,j,t}$: Market-adjusted cumulative abnormal returns over the [−1, +1] window around analyst j's forecast revision.
Frev: The difference between analyst j's updated earnings forecast and original earnings forecast for firm i in year t divided by the absolute value of analyst j's original earnings forecast.
RetH: The portfolio returns of stocks with information externalities above the 70th percentile of the industry.
RetL: The portfolio returns of stocks with information externalities below the 30th percentile of the industry.
IndDelay1: Industry information delay for individual stocks constructed using equations (8) and (9).
IndDelay2: Industry information delay for individual stocks constructed using equations (8) and (10).
IndDelay3: Industry information delay for individual stocks constructed using equations (8) and (11).
ProvDelay1: Province information delay for individual stocks, constructed like IndDelay1.
ProvDelay2: Province information delay for individual stocks, constructed like IndDelay2.
ProvDelay3: Province information delay for individual stocks, constructed like IndDelay3.
[…]: The natural logarithm of (1 + the number of site visits by other institutional investors to firm i in year t).

Supply Chains and Exogenous Shocks
SupplyChainIPO: A variable with a value of 1 for firms in the experimental group and 0 for firms in the control group. A firm is assigned to the experimental group if its customers or suppliers have gone public within the 6-year window.
PostIPO: A variable with a value of 1 for the experimental firm and its matched control firm after the IPO of suppliers or customers of the experimental firm and 0 before the IPO.
$ProStatus_{i,t-1}$: Firm i's geographic economic position in year t-1. For each year, the firms in each province are sorted into quintiles in ascending order according to their size, revenue, and ROA. $ProStatus_{i,t-1}$ is the average of the relative rankings of size, revenue, and ROA.

Descriptive statistics
[…] but the median of AnaVisit is 1, indicating that a few firms receive more attention from analysts. The descriptive statistics for the other variables are generally in line with expectations. The values reported in Table 2 are not log transformed.
Table 3 compares the characteristics of firms with different InfoExtVal. For each year, within each industry, we sort stocks into quintiles based on InfoExtVal. Quintile 5 represents stocks with the highest InfoExtVal, whereas Quintile 1 represents stocks with the lowest InfoExtVal. We calculate the mean and median values of Mv (in millions of RMB), Btm, and Roe for all stocks in each quintile, as well as the ranking of the above variables (from 5 to 1, in descending order). As shown in Table 3, firms with high InfoExtVal are generally larger and have higher book-to-market ratios and stronger profitability within the industry than firms with low InfoExtVal. Table 3 highlights the need to control for firm characteristics, such as Mv, Btm, and Roe, in the regression analysis.
Notes to the regression tables: each table reports the regression coefficients and corresponding t-statistics in parentheses. Standard errors are clustered by industry in the firm-level tests and by analyst and firm in the analyst-level tests. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% levels, respectively.

Firm-Level tests
Table 4 reports the results of the firm-level analysis. In columns (1) and (2), we find that both Coverage and NewCov are significantly and positively associated with InfoExtVal, implying that analysts are more attentive to stocks with high information externalities than to those with low information externalities. Columns (3) and (4) further illustrate that analysts release more reports and forecast revisions for firms with high InfoExtVal. Moreover, in column (5), we find that analysts are more likely to conduct site visits to firms with high InfoExtVal than to those with low InfoExtVal.
Collectively, the empirical results in Table 4 suggest that analysts allocate more research resources to stocks with high InfoExtVal, supporting Hypothesis 1.

Analyst-Level tests
Although Table 4 provides firm-level evidence consistent with analysts allocating research resources based on information externalities, it cannot show how an individual analyst allocates resources across the stocks in their portfolio. In Table 5, we use an analyst-firm-year sample with analyst-year fixed effects to examine individual analyst behaviour. Column (1) of Table 5 shows that analysts release more reports on stocks with high (versus low) information externalities; column (2) shows that analysts also pay more attention to fundamental changes in these stocks and revise their earnings forecasts for these stocks more frequently. Furthermore, analysts are more likely to issue in-depth reports that provide both earnings and revenue forecasts for stocks with high information externalities; as shown in column (3), InfoExtVal has a significantly positive correlation with Thoroughness. In column (4), we also demonstrate that analysts respond more quickly to the earnings announcements of stocks with high information externalities, as the time lag of forecast revisions after the publication of annual reports is significantly shorter for these stocks.
In addition, we investigate the market reactions to analyst forecast revisions. If analysts expend more resources on stocks with high information externalities, the corresponding forecast revisions should be more informative and have a greater price impact. As shown in column (5), the interaction term InfoExtVal × Frev is significantly positive, indicating that, for a given magnitude of forecast revision, investors react more strongly to revisions for stocks with high information externalities.
The findings of Table 5 clarify the micro-mechanisms underlying research resource allocation and further support our Hypothesis 2.
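To illustrate what the analyst-year fixed effects in Table 5 accomplish, the following minimal sketch removes each analyst-year's average before estimating the slope, so that only differences among stocks within the same analyst's portfolio in the same year identify the coefficient. The function name, the single regressor, and the toy data are hypothetical; the specification in Table 5 includes further controls and clustered standard errors that are not reproduced here.

```python
from collections import defaultdict


def within_slope(rows):
    """Slope of y on x after demeaning both within each group.

    rows: iterable of (group_key, x, y). With one regressor this is the
    fixed-effects (within) estimator.
    """
    groups = defaultdict(list)
    for key, x, y in rows:
        groups[key].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


# Hypothetical toy data: (analyst-year, InfoExtVal, number of reports).
# Analyst B writes more reports on every stock; the fixed effect absorbs that.
sample = [
    (("A", 2015), 0.2, 1.0),
    (("A", 2015), 0.8, 4.0),
    (("B", 2015), 0.1, 6.0),
    (("B", 2015), 0.5, 8.0),
]
slope = within_slope(sample)
```

In this toy sample, the estimated slope reflects only how each analyst tilts output towards the high-InfoExtVal stock in their own portfolio, not the fact that one analyst is more productive overall.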

Information externalities and pricing efficiency
In this section, we use both time series VAR tests and cross-sectional multivariate regressions to examine the pricing efficiency of stocks with different information externalities.
In regression (1) of Table 6, RetH and RetL are lagged by one period. The results show that lagged RetH (L1.RetH) can predict RetL, but lagged RetL (L1.RetL) has no explanatory power for RetH. The corresponding F-test shows that the above asymmetry in cross-autocorrelations is significant at the 5% level.
In regression (2) of Table 6, we use a fifth-order VAR test to conduct a robustness check. We find that four of the five lagged values of RetH are significantly positive when the current RetL is the dependent variable, but none of the five lagged values of RetL can predict current RetH. The F-test rejects ∑L.RetH = ∑L.RetL at the 1% significance level.
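The structure of the first-order VAR in regression (1) can be sketched as two least-squares equations that share the same lagged regressors. The code below is an illustration only: the helper names `ols` and `var1` and the simulated return series are hypothetical, and the sketch omits the t-statistics and the F-test reported in Table 6. The simulated data are built so that RetL follows RetH with a one-period lag.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def var1(ret_h, ret_l):
    """Fit both equations of a first-order VAR on (RetH, RetL).

    Each result is [constant, coefficient on L1.RetH, coefficient on L1.RetL].
    """
    lags = [[1.0, ret_h[t - 1], ret_l[t - 1]] for t in range(1, len(ret_h))]
    eq_l = ols(lags, ret_l[1:])  # RetL_t on lagged RetH and lagged RetL
    eq_h = ols(lags, ret_h[1:])  # RetH_t on lagged RetH and lagged RetL
    return eq_l, eq_h


# Simulated (hypothetical) returns: RetL_t = 0.5 * RetH_{t-1} by construction.
rng = random.Random(0)
ret_h = [rng.gauss(0.0, 0.01) for _ in range(200)]
ret_l = [0.0] + [0.5 * h for h in ret_h[:-1]]
eq_l, eq_h = var1(ret_h, ret_l)
```

A lead-lag pattern of the kind described above shows up as a sizeable coefficient on L1.RetH in the RetL equation (`eq_l[1]`) together with a negligible coefficient on L1.RetL in the RetH equation (`eq_h[2]`); formal inference would require the standard errors that this sketch leaves out.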
The above VAR test results show a significant lead-lag effect, indicating that industry information is incorporated into the price of stocks with high information externalities first and then spreads to the price of stocks with low information externalities.

Table 7 reports the results of multivariate cross-sectional regression tests. We control for other variables related to pricing efficiency and information delay, such as size (Mv; Hou, 2007; Lo & MacKinlay, 1990), institutional investor ownership (InsHold; Sias & Starks, 1997), analyst coverage (Coverage; Brennan et al., 1993), and stock trading volume (Turnover; Chordia & Swaminathan, 2000).
The results of the regression tests strongly corroborate the findings in Table 6. As we see in columns (1)-(3) of Table 7, all three industry information delay measures (IndDelay) are significantly and negatively related to InfoExtVal. In other words, stocks with high InfoExtVal are priced more efficiently, as they absorb industry information faster than stocks with low InfoExtVal.
The results of Tables 6 and 7 also deepen our understanding of pricing efficiency. In the classical framework of Grossman and Stiglitz (1980) and Verrecchia (1982), pricing efficiency is determined by investors' information acquisition incentives, which in turn are influenced by information costs. We distinguish the information externality channel that affects investors' information acquisition incentives from the traditional information cost channel. We posit that investors have greater information acquisition incentives for stocks with high information externalities. As investors expend more resources on such stocks, they are likely to obtain information that is valuable for many other stocks.

This table reports the regression coefficients and corresponding t-statistics in parentheses. Standard errors are clustered by province. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% levels, respectively.

Evidence from the spillover effects of earnings announcements
In this section, we test whether the market pays more attention to stocks with high information externalities by focusing on the earnings announcements of such firms. Foster (1981) and Wang (2014) point out that earnings announcements are important sources of industry information for investors, and the earnings announcements of bellwether firms usually have spillover effects on industry peers. In Table 8, the dependent variable AvgPeerRet i,t measures the average cumulative abnormal returns relative to the market returns of firm i's industry peers when firm i releases its annual report, and the independent variable UE i,t is earnings growth deflated by the market value of equity. Therefore, the coefficient of UE i,t estimates the spillover effect of firm i's earnings announcement on other firms within the same industry. We interact High InfoExtVal with UE and expect the coefficient of High InfoExtVal × UE to be positive, as the earnings information of firms with high information externalities is also useful for other stocks in the same industry. To rule out the possibility that stock returns mechanically comove with other stocks in the same industry, we also control for IndBeta, which is the regression coefficient of daily stock returns on daily industry portfolio returns (excluding firm i).
Consistent with the findings of Foster (1981) and Wang (2014), in column (1), the coefficient of UE is significantly positive, suggesting that earnings announcements contain industry information and have spillover effects on the stock prices of industry peers. The coefficient of High InfoExtVal × UE is also significantly positive, indicating that the spillover effects of stocks whose fundamentals are closely related to those of their industry peers are much stronger than those of stocks whose fundamentals are only weakly correlated with their peers. This finding validates the measurement of InfoExtVal and further suggests that the market values the information released by high InfoExtVal stocks and transfers that information to other stocks in the same industry.
In column (2), we use the abnormal returns of firm i during the [−1, +1] earnings announcement window (AdjRet i,t ) as an alternative measure of firm i's earnings information and find similar results.
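The role of the interaction term can be made concrete with a stripped-down version of the spillover regression. With a binary High InfoExtVal indicator and no other controls, the coefficient on High InfoExtVal × UE in the fully interacted regression equals the difference between the UE slopes estimated separately for the two groups. The sketch below relies on that equivalence; the function names and the toy observations are hypothetical, and controls such as IndBeta are omitted.

```python
def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def spillover_coefficients(obs):
    """obs: list of (high_flag, ue, avg_peer_ret) with high_flag in {0, 1}.

    Returns (b_ue, b_high_x_ue) for the regression
    AvgPeerRet = a + b_ue*UE + c*High + b_high_x_ue*High*UE.
    """
    low = [(ue, ret) for high, ue, ret in obs if high == 0]
    high = [(ue, ret) for high, ue, ret in obs if high == 1]
    b_low = slope(*zip(*low))    # spillover slope for low-InfoExtVal firms
    b_high = slope(*zip(*high))  # spillover slope for high-InfoExtVal firms
    return b_low, b_high - b_low


# Hypothetical toy data: peers respond three times as strongly to the
# earnings surprises of high-InfoExtVal announcers.
ue_values = [-0.02, 0.0, 0.02]
toy = [(0, ue, 0.1 * ue) for ue in ue_values]
toy += [(1, ue, 0.001 + 0.3 * ue) for ue in ue_values]
b_ue, b_interaction = spillover_coefficients(toy)
```

A positive `b_interaction` corresponds to the pattern described above: peer returns move more per unit of unexpected earnings when the announcing firm has high information externalities.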

Information externalities within geographic region
So far, we have empirically verified information externalities within industries; however, this does not mean that our theory is restricted to industries. If the information externality hypothesis is generalisable, we should obtain similar findings for stocks connected by other economic links. Coval and Moskowitz (1999) show that investors have strong geographic considerations and exhibit home bias in portfolio decisions. Parsons et al. (2020) suggest that there are close economic ties between firms within the same geographic region. Therefore, to test the external validity of our theory, we consider information externalities within geographic regions. 4 Following the construction of InfoExtVal, we define a geographic information externality measure, ProInfoExtVal, within provinces. Table 9 reports the results of firm-level tests. 5 In Table 9, the coefficients of ProInfoExtVal are significantly positive in all columns, which is highly consistent with the results in Table 4. The findings of Table 9 indicate that firms with higher geographic information externalities receive significantly more resources from analysts, which supports our theory from a different perspective.
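As a rough illustration of a correlation-based externality measure of this kind, the sketch below correlates one firm's ROA series with the average ROA of the other firms in the same group, where the group can be an industry (InfoExtVal) or a province (ProInfoExtVal). The leave-one-out peer average, the Pearson correlation, the function names, and the toy ROA figures are all assumptions made for illustration; the sketch is not a reproduction of the exact construction used in the tests.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def info_ext_val(roa_by_firm, firm):
    """Correlation of one firm's ROA with the mean ROA of its group peers.

    roa_by_firm: dict mapping firm id -> list of ROA values over the same periods.
    The firm itself is excluded from the peer average (an assumption).
    """
    own = roa_by_firm[firm]
    peers = [series for name, series in roa_by_firm.items() if name != firm]
    peer_mean = [sum(values) / len(values) for values in zip(*peers)]
    return pearson(own, peer_mean)


# Hypothetical ROA series for three firms in one province.
roa = {
    "F1": [0.01, 0.02, 0.03],
    "F2": [0.02, 0.04, 0.06],
    "F3": [0.06, 0.05, 0.04],
}
high_ext = info_ext_val(roa, "F1")  # moves with its peers
low_ext = info_ext_val(roa, "F3")   # moves against its peers
```

Under these assumptions, a firm whose fundamentals track those of its peers receives a value close to 1, while a firm whose fundamentals move independently of or against its peers receives a low or negative value.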
Tables 10 and 11 examine the pricing implications of information externalities in terms of geographic regions. In Table 10, we find lead-lag effects between stocks with high and low province information externalities similar to those in Table 6. RetH (stocks with high ProInfoExtVal) significantly leads RetL (stocks with low ProInfoExtVal); in other words, stocks with high ProInfoExtVal respond more rapidly to province information than stocks with low ProInfoExtVal.
The results of cross-sectional regression tests are reported in Table 11. We document significantly negative associations between ProInfoExtVal and ProvDelay after controlling for other factors that are likely to affect pricing efficiency, and the results are stable for all three measures of ProvDelay.

4 Due to the very high number of listed firms in Beijing, Shanghai, and Shenzhen, we exclude listed firms registered in Beijing, Shanghai, and Guangdong province in all geographic tests. Untabulated results show that our main results are not affected by this sample selection restriction.

5 The analyst profession is organised by industry sector, and it is very rare for analysts to specialise by location (Parsons et al., 2020). As the portfolio choice of an analyst is not based on geographic regions, it is not meaningful to conduct individual analyst-level tests.

Information externalities in supply chains
The supply chain is another type of economic link among firms (Cohen & Frazzini, 2008). If the main suppliers or customers of a firm (hereafter, the focal firm) go public, the focal firm's information externalities will significantly increase after the IPO, as investors can benefit from these externalities by trading the stocks of its suppliers or customers. In general, the focal firm does not determine the IPO decisions of its suppliers or customers; thus, the change in information externalities caused by these IPOs will be exogenous. Therefore, we consider IPOs in the supply chain as exogenous shocks to rule out endogeneity and provide further support for our hypotheses.
We collect data on the top five customers and suppliers of A-share listed firms and conduct difference-in-differences tests. The corresponding results are reported in Table 12. SupplyChainIPO is a dummy variable with a value of 1 for the treatment group and 0 for the control group. A listed firm is assigned to the treatment group if its suppliers or customers went public during the 6-year testing window. As IPOs in supply chains are relatively rare, for each firm in the treatment group and for the year before the IPO, we match it with a control firm with comparable characteristics, including size, book-to-market ratio, return on equity, institutional shareholding, turnover ratio, and industry dummies, using propensity score matching. We compare the differences between the treatment and control groups 3 years before and 3 years after the exogenous IPO events. PostIPO takes a value of 1 for the post-IPO period and 0 for the pre-IPO period. The empirical results show that the interaction term SupplyChainIPO × Post is significantly positive in columns (1), (2), and (3). In columns (4) and (5), the sign of SupplyChainIPO × Post is positive, and the corresponding t-statistics are close to the conventional 10% significance level. The findings in Table 12 demonstrate the significant impact of information externalities on analysts' research resource allocation through exogenous shocks and extend the validity of the information externality hypothesis to supply chains.
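The logic of the difference-in-differences comparison can be sketched with four cell means. Without controls, the coefficient on the interaction of the treatment and post-period dummies equals the pre-to-post change for treated firms minus the pre-to-post change for matched control firms. The function name and the toy outcomes below are hypothetical, and the sketch leaves out the propensity score matching, control variables, and clustered standard errors used in Table 12.

```python
def mean(values):
    return sum(values) / len(values)


def did_estimate(obs):
    """obs: list of (treated, post, outcome) with 0/1 flags.

    Returns the 2x2 difference-in-differences estimate, i.e. the coefficient
    on treated*post in a regression on treated, post, and their product.
    """
    cell = {
        (g, p): [y for t, q, y in obs if (t, q) == (g, p)]
        for g in (0, 1)
        for p in (0, 1)
    }
    treated_change = mean(cell[(1, 1)]) - mean(cell[(1, 0)])
    control_change = mean(cell[(0, 1)]) - mean(cell[(0, 0)])
    return treated_change - control_change


# Hypothetical toy outcomes (e.g. analyst coverage) around a supplier IPO.
toy = [
    (1, 0, 3.0), (1, 0, 5.0),   # treated firms, pre-IPO
    (1, 1, 8.0), (1, 1, 10.0),  # treated firms, post-IPO
    (0, 0, 4.0), (0, 0, 4.0),   # control firms, pre-IPO
    (0, 1, 5.0), (0, 1, 7.0),   # control firms, post-IPO
]
did = did_estimate(toy)
```

In the toy data, coverage of treated firms rises by 5 while coverage of control firms rises by 2, so the estimate attributes an increase of 3 to the supply-chain IPO.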

Information externalities and other stock market participants
As mentioned before, an essential prerequisite for making profits from information externalities is that investors have sufficient funds to trade a large number of stocks. Therefore, our information externality hypothesis should also apply to institutional investors, as they have strong research capabilities and abundant capital. However, unlike that of analysts, the allocation of resources by institutional investors is largely unobservable to researchers because of limited data. We use site visit information disclosed by SZSE-listed firms to examine whether our hypotheses hold for mutual funds and other institutional investors. Table 13 reports the corresponding results. We find that institutional investors are more likely to conduct site visits to firms with high InfoExtVal than to those with low InfoExtVal. Table 13 provides only partial insight into whether institutional investors allocate their research effort based on information externalities, as site visits alone are inadequate to paint a complete picture of institutional investor behaviour.

Conclusion
In stock markets, it is important to optimise the allocation of research resources to collect value-relevant information, as such resources are limited for both investors and financial intermediaries, such as analysts. Our study focuses on the importance of information externalities, as information that is useful for many stocks is more valuable than information specific to a single stock.
We use the correlation between a firm's financial fundamentals (ROA) and those of its industry peers as the measure of information externalities, and find that analysts are aware of the importance of information externalities and allocate research resources accordingly. Overall, we find that firms with high information externalities are associated with more analyst coverage, more reports, more forecast revisions, and more site visits than firms with low information externalities. From analyst-level tests, we also find that within stock portfolios, analysts devote more effort to stocks with high information externalities than to those with low information externalities, which is reflected in more (in-depth) analyst reports and forecast revisions for such stocks and a faster response to the earnings announcements of such stocks. We demonstrate that analyst forecast revisions are more informative and credible for stocks with high information externalities, which is also consistent with analysts' information externality-oriented research resource allocation strategy.
We further examine the pricing implications of information externalities, showing that stocks with different information externalities have different information transmission efficiency. Stocks with high information externalities transmit industry (geographic) information rapidly, while stocks with low information externalities adjust to that information with marked delays. The above results extend the theoretical framework of Grossman and Stiglitz (1980) and Verrecchia (1982) by demonstrating that information externalities constitute an influential factor in investors' information acquisition incentives.