An enhanced investor sentiment index

We propose an enhancement to the well-known market-based investor sentiment index by Baker, M., and J. Wurgler (2006. “Investor Sentiment and the Cross-Section of Stock Returns.” The Journal of Finance 61 (4): 1645–1680). The low forecasting power of that index for future aggregate stock market returns has long been a puzzle, and we demonstrate that its ability to empirically capture latent investor sentiment can be significantly improved by allowing the contributions of its individual components to vary over time, rather than being fixed. Our enhanced index represents an improved measure of investor sentiment: empirically, we find that it not only lends forecasting power to the original Baker and Wurgler (2006) index but also outperforms competitor measures in capturing unobservable sentiment. This superior empirical performance is demonstrated to be due to investors’ irrational expectations about future discount rates. Lastly, sentiment measured by our enhanced index contains useful and unique information about future market returns, as it outperforms fundamental economic predictors in forecasting, in both the statistical and the economic sense.


Introduction
Investor sentiment is well established in the finance literature as having a systematic impact on asset price fluctuations. De Long et al. (1990), for example, demonstrate analytically how prices can be driven away from their fundamental values by traders who base decisions on sentiment, or 'noise'. Furthermore, where such unpredictable traders are sufficiently influential, they deter arbitrage activities in the short run, leading to long-run price reversals (towards their fundamental values). Crucially, this implies that investor sentiment affects current stock prices and has predictive power for future stock returns, and thus, an accurate measure of investor sentiment is of great empirical importance for portfolio investors, regulators, and policy makers.
A key issue with the measurement of investor sentiment is that it is a latent (unobservable) factor in investor decisions. One popular solution to this is to use survey-based measures to directly quantify sentiment; however, these have been found to perform poorly empirically (e.g. Ferrer, Salaber, and Zalewska 2016; Otoo 1999). A second, more recent approach is to use textual (media- or search-based) methods to elicit sentiment from sources such as newspapers and online forums. The seminal paper by Tetlock (2007) introduced this approach, but recent evidence regarding the efficacy of such indexes is mixed, given that, analogously to survey-based measures, they are not based on actual investor market interactions. The third, market-based approach, pioneered by the seminal paper of Baker and Wurgler (2006, BW hereafter) and the one on which this paper focuses, is to extract a common underlying sentiment component from a set of observable market variables. Their index (S BW hereafter) combines currently five (originally six) market-based sentiment proxies, using constant (time-invariant) weights, into a single sentiment index. Other indexes, such as that proposed by Huang et al. (2015), follow this same basic principle that sentiment can be extracted from a defined set of observable, but noisy, market variables, though arguably the BW index is still the most widely used and acknowledged market-based sentiment index.
The BW sentiment index has been demonstrated to be partially effective in capturing latent sentiment, as it successfully predicts low (high) future returns for small, young, distressed, and extreme growth stocks when current sentiment is high (low), i.e. it possesses forecasting power for stock returns at the cross-sectional level. It could be expected that S BW will also perform well at the market level, i.e. in the time-series, aggregate market context, given that investor sentiment is a market-wide phenomenon (as argued by Baker and Wurgler 2006; 2007; Brown and Cliff 2004; De Long et al. 1990; Lee, Jiang, and Indro 2002; Stambaugh, Yu, and Yuan 2012, among others) and that S BW employs generalist market-based proxies. Yet, several studies demonstrate that S BW fails to exhibit strong statistically significant predictive power for future aggregate stock market returns in the time-series context (Arif and Lee 2014; Baker and Wurgler 2007; Huang et al. 2015). Interestingly, alternative sentiment proxies do display, even if not unanimously, considerable forecasting abilities for aggregate market returns (e.g. investor newsletter surveys in Brown and Cliff 2005; consumer confidence in Bathia and Bredin 2013 and Schmeling 2009; the FEARS index in Da, Engelberg, and Gao 2015; and textual-based sentiment measures in Das and Chen 2007 and Tetlock 2007). We agree with the literature that this largely absent ability of S BW to forecast aggregate market returns is highly puzzling, and attempt to empirically address and rectify this issue in this study (without sacrificing its cross-sectional performance).
The S BW index has also been widely employed as an investor sentiment measure in different financial applications, ranging from stock market anomalies (Antoniou, Doukas, and Subrahmanyam 2013; Stambaugh, Yu, and Yuan 2012; 2014), the mean-variance relation (Yu and Yuan 2011), and the pricing of macro-risk (Shen, Yu, and Zhao 2017) to the high-beta low-return puzzle, or downward-sloping security market line (Antoniou, Doukas, and Subrahmanyam 2016). All of the above studies, excluding Yu and Yuan (2011), focus primarily on cross-sectional stock returns and conclude that S BW successfully explains cross-sectional asset pricing puzzles.
The novelty of our approach is to propose that one possible reason for the failure of S BW to significantly forecast future aggregate stock returns is the fixed nature of the original index's component weights, an approach based on an implicit assumption that the ability of each component to proxy for the latent sentiment is time-invariant. Crucially, all these proxies are driven by both sentiment and fundamental factors. Thus, changes in these imperfect measures of investor sentiment could reflect a change in investor sentiment and/or in fundamental factors. For instance, equity issuance, which is included as a component of S BW, varies with investor sentiment, but also depends on the availability of investment opportunities, which is linked to prevailing macroeconomic fundamentals (Jung, Kim, and Stulz 1996). As such, the degree to which each proxy captures investor sentiment may vary over time, as the relative impact of fundamentals on that proxy also varies over time. In other words, not every proxy captures the unobserved investor sentiment to the same degree at all times, a supposition which casts doubt on the constant weight being applied to every component in the original S BW index.
An enhanced investor sentiment index, which would address this shortcoming and which could consequently establish the (a priori expected but empirically peculiarly ephemeral) time-series forecasting power of the BW approach, is therefore required. Furthermore, it is empirically more challenging to construct a sentiment proxy which will perform well in the time-series compared to the cross-sectional domain, given that Baker and Wurgler (2007) observe that the forecasting ability of investor sentiment for the aggregate stock market is harder to capture than for any subset of stocks. It is also the case that a good sentiment measure should not only produce an appropriate negative correlation with future long-run stock market returns (due to price reversals following initial sentiment-fuelled price over- or under-reactions), but also be able to produce lower forecast errors in the out-of-sample (OOS) validation, as compared to competitor sentiment measures.
This paper therefore seeks to improve the ability of the S BW index to capture latent investor sentiment. This is evaluated at two stages employing common benchmarking techniques for sentiment indices: firstly, we ask whether our newly constructed investor sentiment index is a good proxy for investor sentiment (i.e. high sentiment today predicts low future long-term returns, and vice versa), and secondly, if it is, how well the newly constructed index performs in forecasting aggregate stock market returns as compared to other well-known sentiment proxies.
This study contributes to the existing literature in several ways. First, we construct a new market-based investor sentiment index, S TV, expanding and improving on the work of Baker and Wurgler (2006); we propose an alternative approach to that of Huang et al. (2015), one which proves, for the most part, empirically superior. Our new index permits dynamic time-varying features of the sentiment components to be utilised and, unlike some other popular indexes, does not suffer from look-ahead bias. We distinguish our approach from previous studies (e.g. Baker and Wurgler 2006; Chen 2011; Chung, Hung, and Yeh 2012; García 2013) in that we model the dynamic feature of investor sentiment through the time-varying weights of index components prior to evaluating the predictive power of investor sentiment, thus avoiding look-ahead bias, in a time-varying sentiment-return relation framework. Our index also utilises only the most relevant information, unlike alternative approaches based on recursive estimations (e.g. Huang et al. 2015). Empirically, our index outperforms other sentiment measures in the time-series context, both in-sample and out-of-sample, whilst maintaining cross-sectional predictive performance. It also outperforms other known predictors of future market returns, and return forecasts produced by our index are economically valuable. Lastly, we show that sentiment affects stock prices through the discount rate rather than the cash flow channel (Cochrane 2008; 2011). Overall, this study proposes a superior measure of unobservable investor sentiment.
The remainder of this paper is organised as follows: Section 2 presents a focused review of the literature on measuring investor sentiment, while Section 3 describes the rationale behind and the construction of our time-varying weighted investor sentiment index (S TV). Section 4 discusses the data and descriptive statistics used for empirical analysis; Section 5 presents the methods employed to empirically evaluate the statistical performance of variables employed. Section 6 investigates the ability of the S TV index to empirically capture the latent sentiment and channels through which sentiment affects stock prices, while Section 7 examines whether our enhanced measure of sentiment contains unique information which can be utilised in stock market forecasting and investment decisions. Finally, Section 8 concludes.

Measuring investor sentiment
'By investor sentiment we mean beliefs held by some investors that cannot be rationally justified' (Morck et al. 1990, 157). Assuming this sentiment is unpredictable, De Long et al. (1990) present a seminal model which shows how sentiment, causing irrational investors to become more bullish or bearish, deters arbitrage activity, since the random nature of changes in sentiment generates a 'noise trader risk' to rational investors when actions of irrational investors are correlated. Hence, the mispricing induced by investor sentiment could have a systematic impact on stock returns.
One of the implications of such models of investor irrationality is that sentiment leads to contemporaneous asset mispricing, but in the long run prices will revert to their fundamental values, as arbitrage activities again dominate price movements. This proposition has been supported by empirical evidence that high (low) sentiment predicts low (high) future long-run returns, as the overpricing (underpricing) is eventually corrected (e.g. Ben-Rephael, Kandel, and Wohl 2012; Da, Engelberg, and Gao 2015; Tetlock 2007).
Survey-based measures of sentiment attempt to capture the optimistic or pessimistic view of market participants by gathering responses of agents regarding their expectations of future stock market developments and general economic conditions. Studies which have used survey-based measures of sentiment include Fisher and Statman (2000), Brown and Cliff (2004; 2005), De Bondt (1993), Greenwood and Shleifer (2014), and Lemmon and Portniaguina (2006). Most studies, including the aforementioned, are consistent in finding a negative relationship between sentiment and future returns. However, there has been some evidence to suggest that consumer confidence indexes are poor proxies for investor sentiment (see Ferrer, Salaber, and Zalewska 2016; Fisher and Statman 2003; Jansen and Nahuis 2003; Otoo 1999), since survey responses do not necessarily link to actual investment decisions (Binswanger and Salm 2017).
Market-based measures, on the other hand, rely on observable market statistics for which latent investor sentiment is an assumed underlying component. As this is market data, it represents information, and so sentiment, which has been acted upon by investors, unlike survey-based measures where there is no element of 'skin-in-the-game' of respondents. Market-based proxies commonly used to gauge sentiment include the Chicago Board Options Exchange's volatility index (VIX), as examined by, e.g. Ben-Rephael, Kandel, and Wohl (2012), Cheon and Lee (2018), Da, Engelberg, and Gao (2015), Kaplanski and Levy (2010) and Lutz (2016); the closed-end fund discount (Bathia and Bredin 2013; Doukas and Milonas 2004; Gemmill and Thomas 2002; Lee, Shleifer, and Thaler 1991; Neal and Wheatley 1998); IPO-related measures (Baker and Wurgler 2006; Brown and Cliff 2004); derivative variables (Bathia and Bredin 2013; Dennis and Mayhew 2002; Wang, Keswani, and Taylor 2006); share of equity issues (Baker and Wurgler 2000); and dividend premium (Baker and Wurgler 2004; 2007).
However, single market-based variables will individually likely be poor proxies for general investor sentiment, due to inherent idiosyncratic noise or random shocks which mask the underlying sentiment component in each. A solution to this shortcoming was proposed by Baker and Wurgler (2007, 139), who combine several single market-based proxies into a composite sentiment index, S BW, to 'iron out the remaining idiosyncracies' and, hence, extract the common sentiment component more accurately. Other studies have followed Baker and Wurgler (2006) and constructed alternative investor sentiment indexes using either a similar approach and/or proxies for other stock markets (see, for example, Chen, Chong, and Duan 2010, for Hong Kong; Finter, Niessen-Ruenzi, and Ruenzi 2012, for Germany; Chen, Chong, and She 2014a and Firth, Wang, and Wong 2015, for China; Baker, Wurgler, and Yuan 2012, for non-US developed stock markets).
One drawback of the Baker and Wurgler (2006) approach has been that empirical evidence for its performance in capturing sentiment for the aggregate stock market has been rather mixed, as epitomised by its weak forecasting power for future aggregate stock returns. Baker and Wurgler (2007) themselves observe that, when forecasting the aggregate market using S BW, 'the statistical significance is modest' (148). In an attempt to address this shortcoming, Huang et al. (2015) construct a new investor sentiment index using the same components as in S BW but employing a different methodological approach (partial least squares). They find that their aligned investor sentiment index (S PLS) significantly predicts short-term future aggregate stock market returns, where S BW has been found to have no predictive power in the same sample. However, except for the very short term, the performance of the Huang et al. (2015) index is poor for the aggregate stock market. Arif and Lee (2014) also confirm that S BW has weak or no predictive power over aggregate stock market returns.
Based on the above discussion, we agree with the conclusion arrived at in the literature that the modest, at best, ability of the S BW index to predict future aggregate market returns is puzzling: given that it utilises sentiment content from a set of individual sentiment proxies, one would expect it to demonstrate respectable forecasting performance. In what follows, we attempt to tackle this puzzle and propose an enhancement to the original Baker and Wurgler (2006) method; our enhanced index possesses superior ability to empirically capture unobservable investor sentiment, as evidenced by its ability to generate significant improvements in the forecasting power for future stock market returns (without sacrificing its predictive power in the cross-section).

Construction of the enhanced investor sentiment index (S TV )
This section describes the methodology underlying the S BW index and presents our time-varying weighted investor sentiment index S TV, designed to address some of the shortcomings of S BW. Baker and Wurgler (2006) construct S BW using principal component analysis (PCA) to extract the common sentiment component from originally six, latterly five, investor sentiment proxies: dividend premium (PDND), average first-day returns of IPOs (RIPO), the number of IPOs (NIPO), the closed-end fund discount (CEFD), and the share of equity issues (EQ), with market turnover being excluded as a proxy since March 2016. PCA identifies the latent common factor from the group of interrelated variables. It redefines the data set by transforming it into new variables, which are termed principal components (PCs), and the first principal component (PC1) is the linear combination of the variables that explains the maximum variation in the individual sentiment proxies. PC1 is consequently used as the aggregate sentiment index, S BW.
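As a minimal illustration of the extraction step just described, the sketch below computes PC1 from a panel of standardised proxies with NumPy. The function name `first_principal_component` and the simulated data are assumptions made purely for exposition; they are not taken from Baker and Wurgler (2006), and the sign of the returned loadings is arbitrary.

```python
import numpy as np


def first_principal_component(proxies):
    """Return PC1 loadings, PC1 scores, and the share of variance PC1 explains.

    `proxies` is a T x N array (T periods, N sentiment proxies).
    """
    x = np.asarray(proxies, dtype=float)
    # Standardise each proxy to zero mean and unit variance.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # The covariance of standardised data is the correlation matrix.
    corr = np.cov(z, rowvar=False)
    # eigh returns eigenvalues in ascending order, so PC1 is the last column.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    loadings = eigenvectors[:, -1]
    scores = z @ loadings  # linear combination of the proxies
    share = eigenvalues[-1] / eigenvalues.sum()
    return loadings, scores, share


# Simulated example (hypothetical data): five noisy proxies of one latent factor.
rng = np.random.default_rng(0)
latent = rng.standard_normal(240)
weights = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
simulated = latent[:, None] * weights + 0.4 * rng.standard_normal((240, 5))
loadings, scores, share = first_principal_component(simulated)
```

In this full-sample setting a single loading vector is applied to every period, which is the fixed-weight feature discussed in the text.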

S BW
Prior to employing PCA, fundamental components related to the business cycle are removed from the sentiment proxies, by regressing (orthogonalising) each proxy on a set of macroeconomic variables: growth of industrial production (INDPRO), real growth of durable consumption (CONSDUR), real growth of nondurable consumption (CONSNON), real growth of services consumption (CONSSERV), growth in employment (EMPLOY) and NBER-dated recessions (RECESS).
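The orthogonalisation step can be sketched as an ordinary least squares regression of each proxy on the macroeconomic variables, keeping the residual as the 'cleaned' proxy. The helper name `orthogonalise` and the random inputs below are illustrative assumptions; the actual data and any timing conventions used by Baker and Wurgler are not reproduced here.

```python
import numpy as np


def orthogonalise(proxy, macro):
    """Return the residual of an OLS regression of `proxy` on `macro`.

    `proxy` has length T; `macro` is a T x K array of macroeconomic variables.
    An intercept is added, so the residual also has zero mean.
    """
    y = np.asarray(proxy, dtype=float)
    regressors = np.column_stack([np.ones(len(y)), np.asarray(macro, dtype=float)])
    beta, *_ = np.linalg.lstsq(regressors, y, rcond=None)
    return y - regressors @ beta


# Hypothetical example: a proxy partly driven by six macro variables.
rng = np.random.default_rng(1)
macro = rng.standard_normal((120, 6))
proxy = macro @ np.array([0.5, -0.2, 0.1, 0.3, 0.0, -0.4]) + rng.standard_normal(120)
residual = orthogonalise(proxy, macro)
```

By construction the residual is uncorrelated, in sample, with every macroeconomic regressor, which is the sense in which the fundamental component is removed.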
There are two features of this process of constructing S BW which might affect its ability to capture sentiment. Firstly, as PC1 is extracted using the entire sample period, each component within PC1 is implicitly assumed to have a fixed impact (or weight) across all time periods in the sample. This is equivalent to an assumption that each component's relative ability to capture sentiment, i.e. as compared to the remaining components, is constant over time. As this assumption might not be correct, S BW may not optimally capture the dynamic contributions of those sentiment proxies to the aggregate sentiment index, given that each sentiment proxy might better capture the unobserved sentiment in some periods while being largely affected by fundamental factors in others. We propose to relax this implicit assumption and allow contributions of each index component to vary over time.
Secondly, as discussed by Chung, Hung, and Yeh (2012), S BW is constructed with a look-ahead bias, due to PC1 being extracted utilising data for the entire sample period. This poses an issue when evaluating its forecasting power, as forecasts formed at any time t should not rely on information which would have only become available in the future (t + 1 and beyond). For example, the value of sentiment in January 2000 should not be drawn from information available after that date, yet the value of S BW estimated using the entire sample period would be employing information which postdates January 2000. All sentiment values prior to the last period in the sample suffer, therefore, from this look-ahead bias. To address this issue, one should construct an investor sentiment index utilising only information available up to time t, in order to avoid any look-ahead bias in the return prediction implied by the sentiment model for time t + k.
Also related to this study, Huang et al. (2015) propose a new investor sentiment index (S PLS) that aims to improve the return predictive power of S BW. They argue that a more accurate sentiment index can be constructed by using the partial least squares (PLS) rather than the PCA approach: they suggest that the lack of forecasting power in the original S BW index stems from its inability to factor out common approximation errors present among the index's individual proxy variables, and that the PLS method is an appropriate approach to remove that common noise component. However, the Huang et al. (2015) approach of employing the PLS estimation methodology in the context of the original six sentiment proxies involves running regressions with only six observations in the cross-section, at each point in time. This is concerning given that Kelly and Pruitt (2015) demonstrate that large time- and cross-sectional dimensions are required to ensure that PLS produces consistent forecasts from the latent factor. Despite this issue, Huang et al. (2015) demonstrate that their index has an improved performance at both the aggregate time-series and cross-sectional levels compared to S BW for short-term, but not long-term, horizons. However, their index can also suffer from look-ahead bias: a solution they use is to estimate the index values recursively, but this suffers from potentially employing very outdated information, as compared to a rolling window estimation which only utilises the most recent values of relevant variables.
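The distinction between full-sample, recursive, and rolling-window estimation comes down to which observations enter the estimate of the index at time t. The following sketch, with a hypothetical helper `information_set` and a 36-observation window chosen purely for illustration, makes the three information sets explicit.

```python
def information_set(t, n_obs, scheme, window=36):
    """Return the indices of the observations used to estimate the index at time t.

    `t` is a zero-based period index and `n_obs` the total sample length.
    """
    if not 0 <= t < n_obs:
        raise ValueError("t must lie inside the sample")
    if scheme == "full":
        # Whole sample: includes observations after t, hence look-ahead bias.
        return range(0, n_obs)
    if scheme == "recursive":
        # Expanding window: everything up to t, however old.
        return range(0, t + 1)
    if scheme == "rolling":
        # Fixed-length window: only the most recent `window` observations.
        if t < window - 1:
            raise ValueError("not enough history for a full window")
        return range(t - window + 1, t + 1)
    raise ValueError(f"unknown scheme: {scheme}")


full = information_set(100, 552, "full")
recursive = information_set(100, 552, "recursive")
rolling = information_set(100, 552, "rolling")
```

Only the full-sample scheme reaches beyond t; the recursive and rolling schemes differ in how much old information they retain.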
While Huang et al. (2015) attribute the sub-optimal forecasting power of the original BW index to the existence of common approximation errors contained in sentiment proxies, this study investigates a different, somewhat complementary, proposition, namely the fluctuating relative ability of those proxies to capture unobservable sentiment and the resulting need to model their contributions to the investor sentiment index as time-varying. We also aim to avoid look-ahead bias and to mitigate the use of outdated, irrelevant information by estimating our index in a rolling window framework rather than recursively. This time-varying framework also allows us to avoid problems resulting from potential structural breaks in the behaviour of proxy variables as well as in their loadings onto the aggregate sentiment index. The findings of this study reveal that the issue we address here (i.e. the time-varying ability of index components to capture sentiment) seems to be more relevant empirically than that of common errors, given that our new time-varying investor sentiment index (S TV) outperforms S PLS, on average, in forecasting future stock market returns at short and long horizons.

Theoretical underpinnings of time variations in factor loadings
However, is it theoretically sound to expect that different sentiment proxies will vary in their ability to capture sentiment over time? We argue that it is, and that the empirical intuition presented above is not just a statistical artefact. Let us consider each BW sentiment proxy in turn. In general, the argument here is that each variable considered as a sentiment proxy is also driven by a number of other economic phenomena, hence an observed change in its value can be due to a change in sentiment, but also to changes in those other drivers; as a result, an increase in the value of a sentiment proxy could be a representation of rising investor sentiment, but it could also occur when sentiment is in decline, being driven predominantly by those other non-sentiment-related influences instead. Firstly, the share of equity issued (EQ) and the number of IPOs (NIPO) could be driven by a number of firms simultaneously reaching the stage in the pecking order where equity financing is their optimal decision, or could change in response to changes in information asymmetry between managers and owners as predicted by agency theory (maybe due to changes in financial reporting standards and their enforcement, or a new industry with significant asymmetries, such as the IT sector in the 1990s, dominating the market; Jung, Kim, and Stulz 1996); none of these drivers is linked to investor sentiment but each would affect the values of EQ and NIPO.
Similarly, the dividend premium (PDND), rather than always being a measure of irrational sentiment, could also be driven by regulatory changes or an increasing market share of institutional investors which need to pursue the 'prudent man' approach in their investment decisions, forcing them to invest in mature and stable, usually dividend-paying, firms; this effect is exaggerated by the trend of 'disappearing dividends' (Fama and French 2001). Dividend-paying stocks might also appeal as 'safe haven' assets in times of turmoil and investors' flight to safety, leading to an increasing dividend premium when times turn tough, which is the opposite of the sentiment-driven dividend premium explanation. In a similar vein, changes in the first-day IPO stock returns (RIPO) could also be driven by changes in non-irrational phenomena such as information asymmetry among investors (Rock 1986) or between investors and issuers (Beatty and Ritter 1986), illiquidity (Booth and Chua 1996), reputation of underwriters and auditors, etc. Likewise, the closed-end fund discount (CEFD) could vary over time due to variations in management fees, liquidity of CEF shares, asymmetry in information about managers' abilities, importance of capital gains timing for taxation purposes, market segmentation, provision of leverage and its sensitivity to changes in short-term interest rates, and occurrence of rational bubbles, to name just a few non-sentiment-related factors (Cherkes 2012; Jarrow and Protter 2019). Lastly, market turnover (TURN) can be driven by factors other than sentiment, such as arrivals of information combined with changes in asymmetry in information endowment, information precision, and interpretation of news among traders, or, as BW observe in the note to their online dataset, volume is increasingly driven by non-sentiment-driven algorithmic trading rather than sentiment-driven human investors.
Overall, there are sound theoretical reasons to argue that each of the components of the BW sentiment index can be affected by both sentiment and non-sentiment-related factors, and their relative impact on each component might change over time. This provides a theoretical rationale for our empirical enhancement of the BW index which aims to harvest that time-varying ability of individual sentiment proxies to capture latent investor sentiment.

Estimation of S TV
Similarly to Baker and Wurgler (2006; 2007), we employ PCA to extract the common investor sentiment component from an identical set of sentiment proxies (i.e. PDND, RIPO, NIPO, CEFD and EQ) in the construction of our sentiment index, S TV. To capture the time-varying contribution of each of these proxies, we differ from BW in that our index S TV is constructed on a rolling window basis. This approach allows us to utilise only the most up-to-date, hence the most relevant, information at each point in time, and has the additional benefit of avoiding any look-ahead bias in the construction of the index. Furthermore, following Baker, Wurgler, and Yuan (2012) and Finter, Niessen-Ruenzi, and Ruenzi (2012), we adopt contemporaneous proxies in the construction of S TV. The window length chosen for the rolling window is three years, with the first window running from January 1966 until December 1968 inclusive: the period of three years roughly corresponds to half of the average duration of an entire NBER-dated business cycle in the US during our sample period, hence the rationale for three-year-long windows is that sentiment's impact on the stock market varies across phases of the business cycle (see Chung, Hung, and Yeh 2012; García 2013; Huang et al. 2015). A longer window would confound those diverse and potentially opposite effects of sentiment on stock markets. It is also worth noting that the degree of backward-looking historical information being used to generate S TV does not relate to the short- or long-run forward-looking forecasting performance of our index. Thus, there is no reason to assume that a shorter window length, one that would capture only a more limited information set, would provide more accurate short-run forecasts either.
As in Baker and Wurgler (2006; 2007), each proxy is first orthogonalised against a set of macroeconomic variables to remove business-cycle-related (fundamental) components. The same set of fundamental macroeconomic variables as in Baker and Wurgler (2006) is used to orthogonalise the sentiment proxies, namely INDPRO, CONSDUR, CONSNON, CONSSERV and EMPLOY, excluding the recession dummy as this would result in perfect collinearity in certain windows. The resulting orthogonalised sentiment components are standardised before being applied in the principal component analysis. As with S BW, the value of S TV in each window is defined as the first principal component, PC1, as sentiment is assumed to be the dominant common feature across this set of proxies. More precisely, the summation of the products of investor sentiment proxies with their respective component loadings is taken to obtain PC1. Finally, the value of S TV is taken as the last observation in each window. As such, S TV can be viewed as a series of updated investor sentiment proxies utilising only the most relevant and updated information in every window.
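The rolling-window procedure described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: the function name `rolling_pc1_index` is hypothetical, the input is assumed to hold monthly proxies that have already been orthogonalised, and the sign of each window's PC1 is left as returned by the eigen-solver, since the sign convention is a separate matter taken up in the text that follows.

```python
import numpy as np


def rolling_pc1_index(proxies, window=36):
    """Rolling-window PC1 index: one PCA per window, keep the last observation.

    `proxies` is a T x N array; the first `window - 1` index values are NaN.
    Returns the index series and the T x N array of per-window loadings.
    """
    x = np.asarray(proxies, dtype=float)
    n_obs, n_proxies = x.shape
    index = np.full(n_obs, np.nan)
    loadings = np.full((n_obs, n_proxies), np.nan)
    for end in range(window - 1, n_obs):
        block = x[end - window + 1 : end + 1]  # only data up to time `end`
        # Standardise within the window, then extract PC1 of the correlation matrix.
        z = (block - block.mean(axis=0)) / block.std(axis=0, ddof=1)
        _, eigenvectors = np.linalg.eigh(np.cov(z, rowvar=False))
        loadings[end] = eigenvectors[:, -1]
        # Index value at `end` is the PC1 score of the window's last observation.
        index[end] = z[-1] @ loadings[end]
    return index, loadings
```

Because each value depends only on the 36 observations ending at that date, altering later observations leaves earlier index values unchanged, which is the sense in which the construction avoids look-ahead bias.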
It is a well-known fact that the signs generated by PCA are arbitrary and indeterminate for each PC vector and, hence, the signs of PC1 component loadings could be inconsistent with theory and provide no meaningful interpretation of the underlying data (Bro, Acar, and Kolda 2008; Fenn et al. 2011; Narsky and Porter 2013). Our solution to this issue is to follow the relevant literature and to flip the signs of PC1 component loadings, contained in the principal eigenvector, based on expectations derived from financial theory. Flipping is performed in every estimation window in which it is needed. As long as the sign of every component is flipped, their relative contributions and, hence, the optimal variance of the principal component are preserved. In view of this, we flip the sign of this eigenvector whenever, in a given window, the sign of the RIPO loading is inconsistent with theory.
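A minimal sketch of this sign convention is given below. The helper `align_sign`, the ordering of the proxy names, and the example loadings are hypothetical and serve only to illustrate the rule that the whole eigenvector is negated when the reference loading (here RIPO) is negative.

```python
PROXIES = ["PDND", "RIPO", "NIPO", "CEFD", "EQ"]


def align_sign(loadings, reference="RIPO", names=PROXIES):
    """Flip the sign of the whole loading vector if the reference loading is negative."""
    loadings = list(loadings)
    if loadings[names.index(reference)] < 0:
        # Negate every element so that relative contributions are preserved.
        return [-w for w in loadings]
    return loadings


# Made-up loadings for one window, with a negative RIPO loading.
raw = [0.40, -0.50, -0.45, 0.35, -0.52]
aligned = align_sign(raw)
```

Negating every element leaves the absolute loadings, and therefore the variance explained by PC1, unchanged; only the orientation of the index is fixed.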
RIPO is chosen as a theoretically sound benchmark, as previous literature argues that investor over-optimism generates high RIPO values and shows empirically a positive relation between investor sentiment and RIPO (e.g. Cornelli, Goldreich, and Ljungqvist 2006; Derrien 2005). 18 Most importantly, Chu, Du, and Tu (2017) reveal that RIPO is the least affected by macroeconomic factors among the sentiment proxies of S^BW, signifying its ability to reflect latent irrational investor sentiment to a larger extent than the other proxies. Therefore, the sign of the whole vector is flipped if RIPO displays a negative initial loading, which would be inconsistent with theoretical considerations and the predominant empirical evidence; otherwise, the signs of the component loadings remain unchanged. 19 In addition, there is a considerable theoretical rationale for the use of RIPO as a leading contemporaneous sentiment gauge here. For instance, both the number of IPOs and the resulting fraction of equity in firm financing in month t are determined by decisions taken months in advance, given the complex and lengthy process of equity issuing; hence, NIPO and EQ will tend to be driven by lagged, not current, sentiment, and by that of managers more than of investors. In contrast, RIPO is driven by contemporaneous investors' trading decisions and is hence more likely to capture present sentiment. The dividend premium (DPND) and the closed-end fund discount (CEFD) have been demonstrated to also be driven by forces other than sentiment (e.g. clientele effects, agency issues, signalling, taxation and issuance costs for the former, and agency costs, tax liabilities, market segmentation and liquidity for the latter), rendering them less reliable as contemporaneous sentiment indicators as compared to RIPO. 20
Figure 1 presents our investor sentiment index, S^TV, as well as those of BW, S^BW, and of Huang et al. (2015), S^PLS, from December 1968 to December 2014, split into three sub-periods for ease of exposition: December 1968 to November 1982, December 1982 to March 2001, and April 2001 to December 2014. It can be observed that S^TV evolves in line with the overall trend of S^BW and S^PLS. At first glance, S^TV is more volatile than S^BW and S^PLS, its movements corresponding to the peaks and troughs of the business cycle. 21 The surges in investor sentiment, as proxied by S^TV, coincide with speculative periods, for instance, the young growth stocks bubble in 1968, the speculative period of the late 1970s, the early 1980s biotech bubble, the technology boom of the late 1990s, and the housing-related bubble before 2007. Our index also clearly shows a drop in investor sentiment during the historic bear market periods, such as the 1968-1970 recession, the stock market crash of 1973-1974, the 1981-1982
To provide further evidence for our hypothesis that individual components of the BW index exhibit time-variability, we inspect the behaviour of the estimated PCA loadings of the index components, as shown in Figure 2.
Their visible fluctuations over time support our notion that the contribution of each sentiment proxy to the aggregate index should be modelled as time-varying, and are in line with our theoretical considerations, as outlined in section 3.2, that each component is driven by both sentiment and non-sentiment forces, and that the relative importance of these two aspects changes over time. To provide more stringent evidence, we also plot the 95% confidence interval for each loading, estimated within the 36-month moving window framework and assuming time-invariability as in BW. As can be seen in Figure 2, the confidence intervals of the time-varying and time-invariant estimates do not overlap for most of the time (the share of the sample period without overlap ranges from 52.62%, for CEFD, to 79.57%, for NIPO), indicating that the assumption of time-invariant loadings results in significantly incorrect estimates for most of the sample period. Indeed, a test comparing the mean of the time series of each estimated component loading to the fixed loading value obtained from the whole-sample PCA estimation (as in BW) rejects the null of equality at a very high significance level for each of the five variables employed (results not reported to conserve space). Overall, these results further support our approach of constructing the aggregate sentiment index based on time-varying, rather than time-invariant, PCA loadings. As can also be seen from the summary statistics of these PCA component weights shown in Table 1, there is substantial time-series variation in each component, as measured by the coefficient of variation (CV), with NIPO and EQ varying the most (RIPO exhibits the lowest volatility, presumably because it is the only component not subjected to flipping to preserve economically sensible component signs). This variability gives further credence to the hypothesis that these factors do not contribute to sentiment in a fixed manner, as is implicitly assumed in the construction of the BW index.
Notes (Table 1): This table reports the summary statistics for each proxy weight calculated from PC1 after this PCA vector has been flipped based on the sign of the RIPO. Obs denotes the number of observations. CV is the coefficient of variation, Min is the minimum value and Max is the maximum value.
We further explore how the relative importance of each of the five proxies varies over time. 23 The resulting relative importance of each component is presented in Figure 3. The figure again demonstrates that using a more traditional methodology, which does not allow these components to contribute differently during different states of the macroeconomy, for example during the Global Financial Crisis in 2007-2009, would yield a less volatile, but ultimately less accurate, index. 24

Data and descriptive statistics
This section describes the data used, which include investor sentiment indexes, the aggregate stock market return and economic predictors. The sample period of this study spans forty-nine full calendar years (in order to avoid biases potentially induced by calendar anomalies in variables, such as the January effect or the SAD effect), from January 1966 through December 2014. 25 As mentioned before, five investor sentiment proxies are employed: PDND, RIPO, NIPO, CEFD, and EQ, along with five macroeconomic variables: INDPRO, CONSDUR, CONSNON, CONSSERV and EMPLOY. These data are obtained from Jeffrey Wurgler's website. This study also employs another market-based sentiment measure, namely S^PLS, which is available from Guofu Zhou's website. Also used are the two most popular survey-based investor sentiment indexes: the University of Michigan Consumer Sentiment Index (MS), obtained directly from the Michigan Survey Research Center, and the Conference Board Consumer Confidence Index (CCI), retrieved from Bloomberg.
In line with other studies, 26 the excess market return (R_m), computed as the monthly stock market return minus the risk-free rate (i.e. the 3-month annualised Treasury-bill rate divided by 12), is used as a measure of the aggregate stock market return. The stock market return is the monthly value-weighted S&P 500 index return (inclusive of dividends) computed by the Center for Research in Security Prices (CRSP). The data for stock market returns and the variables used to compute the economic predictors employed by Welch and Goyal (2008), which are described fully in the next section, are obtained from Amit Goyal's website. Table 2 summarises the descriptive statistics for the abovementioned variables used in our forecasting analysis in panel A and presents the full-sample correlations among sentiment measures in panel B. It is noteworthy that the sentiment proxies and most economic variables display high degrees of persistence (last column of panel A), a feature which will be further discussed in the next section. Panel B shows that, in general, all sentiment proxies share a common component. Care should, however, be taken when comparing these correlations across pairs of variables, given the high degree of dissimilarity in the variances of the raw sentiment proxy data, which will artificially inflate the correlation coefficients of pairs of proxies with variances of different orders of magnitude. 27
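As a small worked illustration of the excess return definition above, the sketch below subtracts one-twelfth of the annualised Treasury-bill rate from the monthly market return. It assumes both inputs are expressed as decimals; the function name and the sample values are hypothetical.

```python
import numpy as np

def excess_market_return(market_return, tbill_annualised):
    """Monthly excess return: market return minus (annualised T-bill rate / 12)."""
    market_return = np.asarray(market_return, dtype=float)
    tbill_annualised = np.asarray(tbill_annualised, dtype=float)
    return market_return - tbill_annualised / 12.0

# Two illustrative months: returns of 2% and -1%, T-bill rates of 6% and 4.8% p.a.
r_m = excess_market_return([0.020, -0.010], [0.060, 0.048])
```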

Methodology
We assess the empirical performance of S^TV in capturing latent sentiment against other investor sentiment measures, in- and out-of-sample, as well as sentiment's statistical forecasting performance for stock returns relative to established economic predictors, using the following methods.

In-sample analysis
To assess how well various sentiment indexes capture the unobservable sentiment within our sample, we follow the literature (e.g. Baker and Wurgler 2006; 2007; Brown and Cliff 2004; 2005; Huang et al. 2015) and estimate the standard return predictive regression model, which is augmented to also control for key fundamental stock return predictors, the so-called 'kitchen sink' model (Welch and Goyal 2008). We include the following economic fundamental proxies suggested by Welch and Goyal (2008): dividend-price ratio (DP), dividend yield (DY), earnings-price ratio (EP), stock return variance (SVAR), book-to-market ratio (BM), net equity expansion (NTIS), treasury bill rate (TBL), long-term return (LTR), term spread (TMS), default yield spread (DFY), default return spread (DFR), lagged inflation (INFL), and consumption-wealth ratio (CAY). 28 This study also includes another two economic predictors, namely the lagged output gap (OG) as suggested by Cooper and Priestley (2009), and log surplus consumption (SCR) as proposed by Campbell and Cochrane (1999). 29 Hence, we estimate the predictive regression of equation (1), in which R_{m,t+h} denotes the h-month-ahead excess market return (stock market return minus the risk-free rate) and β is the slope coefficient on the sentiment index. 30 As discussed in Section 2, in general, current sentiment is negatively correlated with future stock returns, especially in the longer run (Brown and Cliff 2005); a good empirical proxy for the unobservable sentiment should demonstrate this characteristic. Thus, we test the null hypothesis of β = 0 against the alternative of β < 0. 31
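A hedged reconstruction of the predictive regression described above may help fix notation. Only R_{m,t+h}, β and the sentiment index are named in the text; the intercept α, the control coefficients γ_j, the generic predictor label X_{j,t} (standing for the 15 economic predictors listed) and the error term ε_{t+h} are notation assumed here for illustration, and need not match the authors' own symbols:

```latex
R_{m,t+h} = \alpha + \beta\, S_t + \sum_{j=1}^{15} \gamma_j X_{j,t} + \varepsilon_{t+h},
\qquad H_0\colon \beta = 0 \quad \text{vs.} \quad H_1\colon \beta < 0 .
```

Under this form, a negative and significant β for a given sentiment index S_t is what the one-sided test looks for.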

Out-of-sample tests
We generate the out-of-sample market return forecasts following the standard return predictive regression model, using as explanatory variables each investor sentiment index and each of the 15 economic predictors described above, along with two additional economic predictors: dividend-payout ratio (DE) and long-term yield (LTY). 32 Given the relatively large number of economic predictors (17), we also consider the diffusion index approach. 33 Specifically, we employ PCA to extract the common factor from the 17 economic predictors in every estimation window, the first principal component of these economic factors being labelled PC-ECON henceforth. The use of PCA not only avoids the over-parameterization issue, but has also been found to perform better empirically than the individual components. Although the standard predictive regression model (2) can be estimated using OLS in out-of-sample forecasting, it suffers from size distortions in this context when predictors are persistent and endogenous (Stambaugh 1999). To ensure that the out-of-sample results are reliable, this study employs the Feasible Generalised Least Squares (FGLS) framework introduced by Westerlund and Narayan (2012).
To perform the rolling regression forecasts, we use a fixed window length of 15 years, following Henkel, Martin, and Nardari (2011). The first estimation period, used to generate the first return forecast for December 1983, is December 1968 to November 1983. The estimation window is then rolled forward by one month to obtain the next forecast for January 1984, and so forth. The sample of forecasts from December 1983 to December 2014 is retained for forecast evaluation. Note that for S^BW, we re-estimate the index in each window when the new forecast of the excess market return is computed, thereby avoiding look-ahead bias. This approach allows for some time variability in the original BW index's component loadings; however, it differs from our index as the width of the relevant information window is 15 rather than 3 years. 34
Three popular out-of-sample evaluation tests are used in this study, namely the Campbell and Thompson (2008) out-of-sample R-squared statistic (R^2_OS), the Clark and West (2007) adjusted mean squared forecast error (MSFE-adjusted) statistic, and the Harvey, Leybourne, and Newbold (1998) forecast encompassing test (ENC). To compute the R^2_OS and the MSFE-adjusted statistics, the historical mean model (HMM) 35 is used as the benchmark model (e.g. Campbell and Thompson 2008; Neely et al. 2014; Welch and Goyal 2008). Care should be taken when interpreting the common R^2_OS in the context of nested predictive models such as these, as bias is induced which can cause R^2_OS to be negative. 36 In this context, therefore, R^2_OS should be used to compare magnitudes of forecasting power differences between the HMM and each sentiment proxy (and, by extension, between different sentiment proxies), rather than as an absolute measure of fit or forecasting performance per se. The MSFE-adjusted statistic does not suffer from this bias, and hence should be used as a more reliable indicator of forecasting superiority. For the ENC, we are interested in the forecasting performance of S^TV relative to other predictors. Therefore, the forecast evaluations based on the R^2_OS and MSFE-adjusted statistics present results for the nested models; the evaluation based on ENC shows the forecast performance of non-nested models. For the latter, the forecast encompassing test focuses on the difference in information content between individual predictors in the forecast combination, R_{c,t+h}, which can be expressed as follows:
$$R_{c,t+h} = (1-\lambda)\,R_{1,t+h} + \lambda\,R_{2,t+h} \qquad (3)$$
where R_{1,t+h} is a given forecast, R_{2,t+h} denotes the competing forecast and λ represents the optimal weight associated with the competing forecast. ENC tests the null hypothesis that the given forecast encompasses the competing forecast (λ = 0). Alternatively, the competing forecast does provide useful information to the combined forecast that is not already embodied in the given forecast if λ > 0. The usual t-statistic leads to over-rejection of the null hypothesis due to autocorrelation and conditional heteroscedasticity in the errors of the combined forecast. Therefore, Harvey, Leybourne, and Newbold (1998) propose a modified Diebold-Mariano (MDM) test statistic for long-horizon forecast evaluations, which we employ. When applying encompassing tests in the context of our research, there is no a priori reason as to which forecast (from our index or the competing predictor) should be treated as the given forecast and which as the competing forecast. Therefore, we perform the test in both directions; the possible outcomes are illustrated in Figure 4.
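The two nested-model statistics discussed above can be sketched as follows, using the standard textbook forms of the Campbell and Thompson (2008) R^2_OS and the Clark and West (2007) MSFE-adjusted statistic. This is an illustrative sketch under simplifying assumptions: the t-statistic uses a plain standard error with no correction for the overlap that arises at horizons beyond one month, and the function names and sample numbers are hypothetical.

```python
import numpy as np

def r2_os(actual, model_fc, bench_fc):
    """Out-of-sample R-squared: 1 - SSE(model) / SSE(benchmark)."""
    actual, model_fc, bench_fc = (np.asarray(x, dtype=float)
                                  for x in (actual, model_fc, bench_fc))
    sse_model = np.sum((actual - model_fc) ** 2)
    sse_bench = np.sum((actual - bench_fc) ** 2)
    return 1.0 - sse_model / sse_bench

def msfe_adjusted(actual, model_fc, bench_fc):
    """Clark-West statistic: t-stat of the mean of the adjusted loss differential."""
    actual, model_fc, bench_fc = (np.asarray(x, dtype=float)
                                  for x in (actual, model_fc, bench_fc))
    f = (actual - bench_fc) ** 2 - ((actual - model_fc) ** 2
                                    - (bench_fc - model_fc) ** 2)
    # Simplifying assumption: i.i.d. standard error (no HAC adjustment).
    return f.mean() / (f.std(ddof=1) / np.sqrt(f.size))

# Toy example: the benchmark is a constant historical-mean forecast.
actual = np.array([1.0, 2.0, 3.0, 4.0])
bench = np.full(4, 2.5)
model = np.array([1.5, 2.0, 3.0, 3.5])
r2 = r2_os(actual, model, bench)
cw = msfe_adjusted(actual, model, bench)
```

A positive `r2` means the predictive model has a lower MSFE than the benchmark; `cw` is compared with one-sided standard normal critical values.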
As Figure 4 illustrates, the forecast based on S^TV can be treated as the given forecast (termed test 1) or as the competing forecast (test 2); there are therefore four different possible outcomes which could be obtained by combining the results from tests 1 and 2. Outcome 1 indicates that the S^TV-generated forecast encompasses the forecast retrieved from the alternative predictor. Outcome 3 is observed when the MDM statistic reveals the opposite finding in both tests, namely that the alternative encompasses S^TV. Meanwhile, outcome 2 implies that both predictors could provide complementary information to the forecast combination, while outcome 4 indicates that the two forecasts encompass each other and, hence, are informationally equivalent. Therefore, outcomes 1 and 2 are in line with our hypothesis that the forecast based on S^TV captures unique information that is not already incorporated in the forecast by the alternative predictor. Outcome 3 implies that forecasts by the alternative predictor dominate those generated by S^TV, and outcome 4 suggests that the forecast based on S^TV is neither dominated by nor dominant over the alternative predictor's forecast, and hence is equivalent.
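The four-outcome decision rule above can be written compactly. The sketch below assumes that the significance of λ in each test has already been determined from the MDM statistic; the function and argument names are hypothetical.

```python
def encompassing_outcome(lambda1_significant: bool, lambda2_significant: bool) -> int:
    """Map the two encompassing tests to outcomes 1-4.

    Test 1: S^TV is the given forecast, the alternative is the competing one.
    Test 2: the alternative is the given forecast, S^TV is the competing one.
    """
    if not lambda1_significant and lambda2_significant:
        return 1  # S^TV encompasses the alternative
    if lambda1_significant and lambda2_significant:
        return 2  # complementary information in both forecasts
    if lambda1_significant and not lambda2_significant:
        return 3  # the alternative encompasses S^TV
    return 4      # forecasts encompass each other (informationally equivalent)

outcomes = [encompassing_outcome(a, b)
            for a in (False, True) for b in (False, True)]
```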

Empirical Results
This section presents our main results. The first part (6.1) reviews the in-sample predictive performance of S^TV against other investor sentiment indexes in order to establish whether our index is a good measure of investor sentiment, in both absolute (i.e. does it behave like a sentiment proxy?) and relative (i.e. is it superior to alternative sentiment proxies?) terms. The second part (6.2) concentrates on the out-of-sample forecasting performance of S^TV as compared to other sentiment proxies, which is arguably a more challenging test for index performance, whereas in the third part (6.3) we explore empirically the channels through which sentiment affects stock prices and which consequently give S^TV its forecasting ability for future stock returns.
Notes (Figure 4): The two forecasts compared are the alternative forecast (A) and the S^TV-based forecast. λ denotes the optimal weight associated with the competing forecast in the following forecast combination: $R_{c,t+h} = (1-\lambda)\,R_{1,t+h} + \lambda\,R_{2,t+h}$. The S^TV forecast is treated as the given forecast, R_{1,t+h}, in the first test and as the competing forecast, R_{2,t+h}, in the second test.

How good is S^TV in capturing unobservable investor sentiment in-sample?
As discussed previously, the ability of a sentiment measure to capture latent sentiment is typically assessed empirically in the literature by testing whether high values of that measure today predict lower stock returns in the medium- to long-run, and vice versa. To that end, we begin with a preliminary, in-sample analysis, while the results from the more powerful, out-of-sample analysis are discussed in subsequent sub-sections. Table 3 presents the parameter estimates and t-statistics for different investor sentiment indexes across different forecast horizons, estimated in-sample based on equation (1).
First, we note the power of S^TV to empirically capture latent sentiment across all forecast horizons, as the slope coefficient of S^TV remains negative and significant in the presence of economic predictors, the only exception being the 1-month horizon. The lack of predictability for the 1-month forecast horizon is consistent with Brown and Cliff (2005) and Yu and Yuan (2011), who argue that the investor sentiment effect may not be well reflected in next month's return, since sentiment could persist for a longer period due to the limits to arbitrage, so that the subsequent price reversals take longer to materialise. Thus, one should not expect the mispricing at the aggregate level to be completely eliminated within the next month. 37 Rather, longer-term price reversals, and therefore the ability of a sentiment measure to predict stock market returns over the intermediate and long-run periods, are more indicative of the sentiment effect. 38 In further support of this argument, Ding, Mazouz, and Wang (2019) show that the long-run component of the investor sentiment index is a strong contrarian predictor of stock returns.
Consistent with the literature, S^BW_t is unable to predict future aggregate stock market returns at any prediction horizon (Arif and Lee 2014; Huang et al. 2015), as shown by the insignificance of each β. Furthermore, for longer prediction horizons the sign of the S^BW coefficient, in being positive (yet still insignificant), is inconsistent with the theory that current investor sentiment should negatively influence future stock returns due to return reversals. This finding further supports our initial assertion that S^BW_t performs poorly in the time-series context. Results also show that S^PLS can predict future stock market returns for short horizons, including the next-month forecast horizon, which could be a reflection of the short-run transitory component of investor sentiment which that index predominantly captures: as shown by Ding, Mazouz, and Wang (2019), short-run sentiment is a temporary deviation of investor sentiment from its long-run trend. Hence, when the transitory deviation of sentiment reverts to its long-run trend within a short period, the mispricing it caused will disappear as well.
Table 3 shows that the predictive power of S^PLS over longer (over 12 months) horizons is weak to non-existent, whereas our index performs best in the medium-to-longer run, suggesting that while S^PLS predominantly captures the short-term subcomponent of sentiment, S^TV manages to empirically incorporate longer-term sentiment features as well. Although both the short- and long-run components of investor sentiment affect future stock returns, Ding, Mazouz, and Wang (2019) report that the effect of the long-run component is more significant than that of the short-run component.
In contrast, CCI and MS do not predict the excess market return across most horizons.The results for CCI and MS are consistent with Ferrer, Salaber, and Zalewska (2016) who argue that consumer confidence indicators, such as MS and CCI, are inferior measures of investor sentiment, since they reflect how consumers perceive future economic conditions rather than their predictions for the future of the stock market.
Overall, these in-sample results confirm that S^TV is a superior investor sentiment measure amongst the main competitors tested in this paper, since its high (low) values today predict low (high) future values of market returns, its predictive power is not driven by fundamental information, and it outperforms competitor indexes, including both the original BW index and the index proposed by Huang et al. (2015). 39, 40

The OOS forecasting performance of S^TV versus other sentiment indices
Given that S^TV outperforms competing variables in capturing the latent investor sentiment in-sample, and assuming that sentiment systematically affects stock prices, we would expect our index to have significant forecasting power for future stock returns and to outperform other sentiment proxies in this respect out-of-sample. Table 4 presents the out-of-sample forecasting performance of investor sentiment measures based on the R^2_OS and MSFE-adjusted statistics described in Section 5.2. The former statistic indicates the extent to which the MSFE of a predictive regression model is reduced as compared to the HMM, and the latter statistic helps to gauge the statistical significance of results, indicating whether the predictive regression model has a forecast error that is statistically lower than that generated by the HMM after adjusting for noise in the predictive model. The null hypothesis of the MSFE-adjusted statistic (i.e. that the MSFE generated by the HMM is less than or equal to that of the predictive regression model) can still be rejected even when R^2_OS is negative, owing to the negative bias associated with the predictive regression model (as discussed in Section 5.2). Thus, our inference is based largely on the results of the MSFE-adjusted test. 41
The results reported in Table 4 show that the MSFE-adjusted statistic for S^TV-generated forecasts is statistically significant at the 5% level for the 3-, 9-, 12- and 24-month forecasts, suggesting that our index has statistically superior out-of-sample forecasting power as compared to the HMM at these forecast horizons. As a further confirmation, the R^2_OS of S^TV is greater than zero from the 9-month to the 24-month forecast horizon, notwithstanding the negative bias associated with the R^2_OS measure. On the other hand, Table 4 also shows that S^BW generates negative R^2_OS values and insignificant negative MSFE-adjusted statistics across most forecast horizons, except for a positive R^2_OS value for the 1-month forecast. Nevertheless, its forecast error is not significantly lower than that of the HMM at the 1-month horizon. These results suggest that S^BW fails to outperform the HMM, since it produces greater forecast errors across most forecast horizons. Overall, for both in-sample and out-of-sample results, the original S^BW index has been shown to have poor predictive power for excess market returns, whereas our time-varying modification of the original BW approach improves its forecasting power considerably.
Consistent with Huang et al. (2015), S^PLS has strong predictive power over short horizons. The null hypothesis of the MSFE-adjusted test is strongly rejected at the 1% significance level for the 1- and 3-month forecast horizons. It also generates the highest R^2_OS values for 1- and 3-month predictions (i.e. 2% and 5.77%, respectively) among all predictive regression models. This superior performance, however, does not last beyond the 6-month forecast horizon. Indeed, the forecast performance is inferior for longer forecast horizons, with an R^2_OS value of more than −30%, and S^PLS has the worst 5-year forecast performance, with an R^2_OS of about −44%, among all investor sentiment indexes. Thus, unlike our time-varying BW index, S^PLS does not seem to produce consistent out-of-sample benefits across both short- and long-term forecast horizons. For the survey-based sentiment indexes, we find that MS does not outperform the HMM at any forecast horizon; the same holds for the CCI, except at the 36- and 60-month horizons, at which the R^2_OS values are more than 10%. However, Campbell and Thompson (2008) claim that such high values of R^2_OS are not economically credible, since everyone would become rich by just exploiting the information contained in such a model. Overall, the results of the R^2_OS and MSFE-adjusted statistics suggest that S^TV not only outperforms the HMM for most forecast horizons, it also has a superior forecasting performance as compared to other investor sentiment indexes.
Table 5 presents the results of forecast encompassing tests, which provide an insight into the information content of forecasts produced by different investor sentiment measures, each pitted against our index. The forecasting performance of S^TV against other sentiment measures is presented in the columns λ(1) and λ(2), with the null hypothesis that the given forecast encompasses the competing forecast (i.e. λ = 0) tested using the MDM statistic in test 1 and test 2, as described in Section 5.2. We report the λ value and its associated MDM statistic (in square brackets) for both tests. Significance of λ indicates that the weight of the competing forecast is greater than zero, and that the competing forecast contains information that is not already included in the given forecast but is useful for the optimal combination forecast. The outcomes for each pairing, following the decision rules summarised in Figure 4, are presented next to the λ(2) column for each forecast horizon. An outcome 1, which represents S^TV outperforming a competing sentiment measure, is obtained when the weight (λ) of the competing forecast in test 1 is insignificant and the weight (λ) of the S^TV forecast is significant in test 2.
The findings in Table 5 provide support for our hypothesis that S^TV is a superior sentiment measure, as it contains useful information which is not already included in the competing sentiment proxies. Outcome 1 is consistently observed beyond the 1-month forecast and up to the 36-month forecast horizon when we compare the forecasting performance of S^TV against S^BW, suggesting that S^TV forecasts strictly dominate S^BW forecasts across all forecast horizons, except for the 1-month. In addition, S^TV-based forecasts also dominate those based on S^PLS beyond the forecast horizon of 6 months, while S^PLS dominates S^TV only for the 1-month and 3-month forecast horizons (as expected from the previous discussion on short-term sentiment). At h = 6, outcome 2 is observed for the pairing between S^TV and S^PLS, implying that the forecasts of both predictors provide complementary information to the optimal forecast combination. As for the MS and CCI, we note that S^TV-based forecasts dominate the forecasts based on those variables for most forecast horizons except at very long-term horizons: we observe equivalence for MS at 60 months and for CCI at 36 months, and CCI is dominant at 60 months.
Counting the occurrences of outcome 1 leads us to the conclusion that all other sentiment measures can be excluded from the optimal combination forecast in 75% of all cases considered, since S^TV-based forecasts encompass forecasts of other sentiment measures significantly at the 10% level (i.e. S^TV-based forecasts deliver all the useful information); this proportion of outcome 1 is too high to be purely driven by chance at the 10% level. In contrast, only 4 out of 32 cases show forecasts based on other sentiment measures to encompass S^TV forecasts, as indicated by outcome 3. This is well depicted in Figure 5, where our index captures the dynamics of stock market returns well and moves along with the direction of the realised excess return even at longer forecast horizons. Overall, our sentiment index not only improves on the original BW approach in terms of time-series forecasting power, it also outperforms other competing measures of investor sentiment. 42, 43, 44, 45

Economic sources of the predictive power of S^TV
Baker and Wurgler (2007, 129) define investor sentiment as 'a belief about future cash flows and investment risks that is not justified by the facts at hand'. Therefore, since asset prices can be calculated as discounted expected future cash flows, the effect of investor sentiment on future stock market returns could pass through the cash flow and/or (investment risk-adjusted) discount rate channel. In this section, we empirically investigate through which of these channels sentiment affects stock prices, hence giving rise to the predictive power of the S^TV index documented above.
On the one hand, one could conjecture that the predictive power of investor sentiment is transmitted through the discount rate channel; this would be indicated by S^TV positively predicting future discount rates. Investors tend to form bullish return expectations and underestimate the riskiness of the stock market when they are optimistic (see Kaplanski et al. 2015; Vissing-Jorgensen 2004). Since Greenwood and Shleifer (2014) find that return expectations formed by investors are negatively correlated with expected returns (i.e. required return, or discount rate), 46 investors will discount the expected cash flows using a lower discount rate when they are optimistic, thus inflating current stock prices. As sentiment fades away, investors will correct their risk estimates and apply a higher discount rate to the expected future cash flows, resulting in lower stock returns in the future.
In view of this, we would expect S^TV to positively predict the discount rate, resulting in high S^TV values predicting lower future stock market returns, as documented above. Such a finding would allow us to conclude that sentiment's forecasting ability is due to its impact on expectations regarding discount rates, in support of the discount rate channel hypothesis. On the other hand, we also investigate the relevance of the cash flow channel by testing whether S^TV negatively predicts future cash flows, which would lead to lower future stock returns. Previous studies confirmed that overly optimistic (pessimistic) forecasts of cash flows are formed when investor sentiment is high (low) (e.g. Hong and Sraer 2016; Hribar and McInnis 2012; Kim, Ryu, and Seo 2014). Therefore, low (high) stock market returns following a high (low) sentiment period could be a result of the mispricing correction, once a lower cash flow is disclosed in the future (Huang et al. 2015). Sentiment's predictability of future cash flows would lead to a conclusion in support of the cash flow channel as the source of the observed forecasting ability of S^TV with respect to future stock returns.
To empirically differentiate between the cash flow and discount rate channels, we follow the literature and employ the log-linearization of returns proposed by Campbell and Shiller (1988a), which approximates one-period-ahead stock returns log-linearly around the average of the log dividend-price ratio:
$$r_{t+1} \approx k + DG_{t+1} - \rho\,DP_{t+1} + DP_t \qquad (4)$$
where r_{t+1} represents the log stock market return, DG denotes the log dividend-growth rate on the stock market, DP is the log dividend-price ratio on the stock market, ρ is the log-linearization parameter with a value slightly lower than 1 (Campbell, Polk, and Vuolteenaho 2010; Campbell and Shiller 1988a; Campbell and Vuolteenaho 2004), and k is a constant. The identity (4) suggests that stock market returns are predictable by contemporaneous cash flows and discount rates, which are represented by DG_{t+1} and DP_{t+1}, respectively, in addition to lagged discount rates, DP_t. Therefore, the predictive power of S^TV_t for next-period stock market returns, r_{t+1}, could enter through the cash flow (discount rate) channel if S^TV_t predicts DG_{t+1} (DP_{t+1}). To examine the channel through which S^TV predicts stock market returns, we regress either a cash flow or a discount rate proxy, represented by Y_{t+1} in equation (5), on lagged sentiment, S^TV_t, and the lagged dividend-price ratio, DP_t, employing data from 1968 to 2014.
The cash flow and discount rate proxies employed in this study follow the literature (e.g. Huang et al. 2015). The annual log dividend-price ratio on the S&P 500 index (DP_{t+1}) is used as the discount rate proxy, and the annual log dividend-growth rate computed from year t to t+1 on the S&P 500 index (DG_{t+1}) is employed as the main cash flow proxy. 47 Since dividends could be a poor proxy for cash flows (Ang and Bekaert 2007), the annual log earnings-growth rate computed over a year on the S&P 500 index (EG_{t+1}) and the annual log real GDP growth rate estimated from year t to t+1 (GDPG_{t+1}) are used as alternative cash flow proxies, following Huang et al. (2015). 48 S^TV_t is our time-varying weighted investor sentiment index at the end of year t. Annual data are used in this section following Cochrane (2008), Campbell and Shiller (1988a) and Huang et al. (2015), since dividends and earnings exhibit seasonal effects across the year.
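The channel regression described above, in which a proxy Y_{t+1} is regressed on lagged sentiment and the lagged dividend-price ratio, can be sketched with ordinary least squares. This is an illustration only: the coefficient names (alpha, beta, psi), the function name and the simulated annual data are assumptions, and the sketch omits the standard errors needed for inference.

```python
import numpy as np

def channel_regression(y_next, sentiment, dp):
    """OLS of Y_{t+1} on a constant, S_t and DP_t.

    Inputs must be aligned so that y_next[i] is the year t+1 value
    matching sentiment[i] and dp[i] observed at the end of year t.
    Returns [intercept, sentiment slope, DP slope].
    """
    y_next = np.asarray(y_next, dtype=float)
    X = np.column_stack([np.ones(len(y_next)), sentiment, dp])
    coef, *_ = np.linalg.lstsq(X, y_next, rcond=None)
    return coef

# Simulated annual data with known coefficients (noise-free for clarity).
rng = np.random.default_rng(1)
s = rng.normal(size=46)
dp = rng.normal(size=46)
y = 0.1 + 0.5 * s + 0.9 * dp
alpha, beta, psi = channel_regression(y, s, dp)
```

Running the same function with Y set in turn to the discount rate proxy and to each cash flow proxy mirrors the comparison across channels; the sign and significance of the sentiment slope is the quantity of interest.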
Table 6 presents the results generated for S TV (panel A), S BW (panel B) and S PLS (panel C) as alternative sentiment proxies. Panel A shows that regressing the cash flow proxies, i.e. $DG_{t+1}$, $EG_{t+1}$ and $GDPG_{t+1}$, on lagged S TV yields insignificant slope coefficients, although the predictive direction is consistent with the notion that investor sentiment negatively predicts future cash flows. However, the slope coefficient of S TV in predicting $DP_{t+1}$, i.e. 0.011, is positive and significant at the 5% level. These results indicate that S TV negatively predicts future stock market returns because sentiment primarily affects expectations about future discount rates rather than cash flows.
In line with our prior results that different sentiment proxies perform differently in capturing latent sentiment and, hence, predicting future returns, the economic sources driving those alternative indices appear to be different from those behind our index. The BW sentiment index appears to possess opposite characteristics to S TV, as it features a significant cash flow channel but an insignificant discount rate channel. The S PLS index appears to work through both the cash flow and the discount rate channel. Hence, as already indicated by the less-than-perfect correlations among them, different sentiment indices appear to capture different aspects of sentiment, or to be driven by different non-sentiment related forces. However, as supported by the overall superior return forecasting performance of our index vis-à-vis its competitors demonstrated throughout the paper, this section's result further suggests that variations in discount rate changes are the main driver behind stock return behaviour, in line with, e.g. Cochrane (2008; 2011).
In addition, Panel A of Table 6 shows that the dividend-price ratio is highly persistent, given that the slope coefficient of the lagged dividend-price ratio, $DP_t$, in the predictive regression of the future dividend-price ratio, $DP_{t+1}$, is 0.956 and statistically significant. In contrast, future cash flows are not affected by $DP_t$, since its slope coefficients for all cash flow proxies ($DG_{t+1}$, $EG_{t+1}$, $GDPG_{t+1}$) are not significantly different from zero. The adjusted $R^2$ becomes negative when $DP_t$ is included as a predictor in the predictive regression (results without $DP_t$ not shown to conserve space), again suggesting that changes in the dividend-price ratio do not explain the variations of the future dividend-growth rate. Similar results are seen in panels B and C. Hence, these results imply that it is the changes in the discount rates (or expected returns) rather than in the expected cash flows that cause the dividend-price ratio to vary over time, a finding consistent with Campbell (1991) and Cochrane (1992; 2008; 2011).
Overall, these results indicate that the empirical ability of our sentiment index, S TV, to explain future market returns is due to the phenomenon of irrational sentiment affecting investors' expectations about the future discount rates, in line with findings in the literature (e.g. Greenwood and Shleifer 2014; Kaplanski et al. 2015; Vissing-Jorgensen 2004). Our findings support the notion that investors irrationally apply lower discount rates that are not justifiable by the facts at hand when investor sentiment is high, leading to stock market overvaluations. Subsequently, once investor sentiment fades away, stock prices revert towards fundamental values as investors adjust upward the discount rate to properly reflect the risk levels of financial assets. This generates a positive relationship between current sentiment and the future discount rate and, therefore, a negative relationship between current sentiment and future stock returns, as documented elsewhere in this paper.

Applications
Having empirically established that S TV is a superior sentiment measure, we now turn our attention to the question of whether information contained in sentiment, as captured by our index, is unique and can be utilised to (i) improve forecasts of future stock returns in a statistical sense (section 7.1) and (ii) generate economic value added to investors (section 7.2), above and beyond what the popular economic predictors (as investigated in Welch and Goyal 2008) could accomplish. To the extent that we expect the US stock market to be fairly efficient and driven mostly by news about fundamentals, we do not necessarily expect our proxy of irrational sentiment to have higher forecasting power than every single economic variable considered; it might contain useful additional information about future stock returns only in comparison to a few economic predictors and still be considered a useful addition to the forecasters' toolbox. More importantly, any forecasts are useful if they are of economic value to the typically risk-averse investors.

The OOS forecasting performance of S TV versus economic predictors
Firstly, before directly comparing forecasts generated by sentiment with those stemming from economic predictors, we investigate whether sentiment and each of the economic variables employed contain, in a purely statistical sense, any information useful for forecasting future stock returns which would not be already embedded in the historical return mean (HMM). The forecasting performance of our index and of the economic predictors, based on $R^2_{OS}$ and MSFE-adjusted statistics, is presented in Table 7. For ease of comparison, we present the results of S TV in the first row, followed by the performance of different economic predictors from the second row downwards. These results demonstrate that S TV contains valuable additional predictive content, as it significantly beats the HMM at both short-term (3, 9 and 12 months) and long-term (24 months) horizons. When we further compare this result with the forecast accuracy of individual predictors (versus HMM), sentiment outperforms most economic predictors: these predictors tend to underperform HMM, as they produce negative $R^2_{OS}$ with large magnitudes and insignificant MSFE-adjusted statistics. This result is consistent with previous literature which finds that economic predictors have limited predictive power out-of-sample (Rapach, Ringgenberg, and Zhou 2016; Rapach, Strauss, and Zhou 2010; Welch and Goyal 2008). Some exceptions can be observed for DP, OG and CAY: while DP (with positive $R^2_{OS}$ values and significant MSFE-adjusted statistics from 1-month up to 6-month forecasts) and OG (up to 12 months) are good predictors over the short term, CAY also forecasts the excess market return well over longer horizons. However, it should be stressed that this performance is purely statistical and only against a naïve HMM alternative. The last row demonstrates that the aggregate index of all economic predictors (PC-ECON) outperforms HMM only over 3- and 6-month forecast horizons. Hence, investor sentiment proxied by S TV shows unique predictive content regarding future stock market performance, while most popular economic predictors do not.
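The two out-of-sample statistics referred to above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the out-of-sample R-squared is computed as one minus the ratio of the predictor's squared forecast errors to those of the historical-mean benchmark, and the MSFE-adjusted statistic is assumed here to be the Clark and West (2007) adjustment, computed with a plain (non-HAC) standard error. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def r2_os(actual, forecast, benchmark):
    """Out-of-sample R^2: positive when `forecast` has a lower MSFE than `benchmark`."""
    a, f, b = (np.asarray(x, dtype=float) for x in (actual, forecast, benchmark))
    return 1.0 - np.sum((a - f) ** 2) / np.sum((a - b) ** 2)

def msfe_adjusted(actual, forecast, benchmark):
    """Clark-West style MSFE-adjusted t-statistic (one-sided test of equal MSFE).

    Assumption of this sketch: non-overlapping forecasts, so an ordinary
    standard error is used; overlapping multi-period forecasts would call
    for a HAC standard error instead.
    """
    a, f, b = (np.asarray(x, dtype=float) for x in (actual, forecast, benchmark))
    adj = (a - b) ** 2 - ((a - f) ** 2 - (b - f) ** 2)
    return adj.mean() / (adj.std(ddof=1) / np.sqrt(len(adj)))

# Toy numbers for illustration only
actual = np.array([0.01, -0.02, 0.03, 0.00])
benchmark = np.zeros(4)        # stands in for the historical-mean forecast
forecast = 0.5 * actual        # a forecast that captures half of each realisation
r2 = r2_os(actual, forecast, benchmark)
stat = msfe_adjusted(actual, forecast, benchmark)
```

A negative value of `r2` corresponds to the case described in the text in which a predictor underperforms the HMM benchmark.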
Having examined the forecast accuracy of each predictor against HMM, we now turn our attention to the predictive performance of our investor sentiment index, S TV, against each of those economic predictors. We present the corresponding forecast encompassing results in Table 8. As before, the S TV forecast is treated as the given forecast in test 1 and as the competing forecast in test 2. Based on tests 1 and 2, the possible outcome is reported in Table 8 in the column denoted 'Outcome' for each forecast horizon. We are particularly interested in outcome 1, which implies that S TV dominates an economic predictor, for each pairing.
As can be seen in Table 8, many more pairings yield outcome 1 than outcome 3 for different economic predictors across different forecast horizons, suggesting that it is much more likely for an economic predictor to be dominated by S TV than vice versa, in terms of their ability to forecast future returns. Specifically, outcome 1 occurs in 72 cases out of 144 pairings (i.e. 50%) at the 10% significance level. Furthermore, the comparison between S TV and PC-ECON yields outcome 1 in more than half of the forecast horizons. On the other hand, the PC-ECON forecast encompasses the S TV forecast, represented by outcome 3, only at horizons h = 6 and h = 9. As PC-ECON has been found in previous literature to predict stock returns better than individual economic predictors do, the outperformance of S TV against PC-ECON is especially supportive of our expectation that S TV has incremental forecasting power beyond those economic predictors (fundamental information).
Overall, these results show that the S TV index is a strong predictor of excess market returns and that it contains a unique and incremental non-fundamental systematic component, given its outstanding performance against economic predictors. Our findings demonstrate that stock market returns are significantly driven by investor irrationality, and this irrational sentiment is well captured by the S TV index proposed here. Needless to say, not all market movements are driven by irrationality; hence, no proxy of irrational sentiment, even an ideal one, should be expected to fully outperform all possible fundamental predictors all of the time across all forecast horizons. However, the strong pattern observed here of S TV outperforming most fundamental predictors most of the time is an empirical testimony to this index's strong ability to capture an important factor driving the aggregate stock market.

The economic value of the S TV index
Table 7. Out-of-sample forecasting results: S TV index vs. economic predictors.
Given that S TV performs well statistically, the next question is whether it can be utilised by investors and add economic value in addition to what the traditional economic predictors can generate. The literature generally finds that one should not draw conclusions about the economic value of a predictor from its OOS statistical performance alone, as even small values of $R^2_{OS}$ can be associated with significant economic value to investors, and hence return forecasts can be economically worthwhile (Campbell and Thompson 2008). Equally, a good forecasting performance in statistical terms, such as displayed by DP, OG or CAY above, should not be assumed to equate to superior economic value of such forecasts, for instance due to investors being concerned with the riskiness of their holdings rather than exclusively with the average forecast error. Therefore, this section explores the economic value of the S TV-based forecasts in a realistic, out-of-sample framework. In line with the literature, we employ the utility gains, or certainty equivalent return (CER), which is the certain return an investor will receive on an investment that generates the same expected utility as a risky portfolio with uncertain returns (Campbell and Thompson 2008; Cenesizoglu and Timmermann 2012; Huang et al. 2015; Marquering and Verbeek 2004; Rapach, Ringgenberg, and Zhou 2016; Rapach, Strauss, and Zhou 2010). In contrast to the preceding statistical evaluation methods, this economic significance measure takes into account the risk faced by an investor, which makes it more relevant in the real world. Assuming a mean-variance investor who holds a portfolio consisting of equities and risk-free assets, one can determine the optimal equity allocation ($\omega^*_t$) based on the forecast of the excess market return ($\hat{R}_{m,t+h}$) produced by a given predictor at the end of month t:
$$\omega^*_t = \frac{1}{\gamma}\,\frac{\hat{R}_{m,t+h}}{\hat{\sigma}^2_{t+h}},$$
where $\gamma$ is the risk-aversion coefficient and $\hat{\sigma}^2_{t+h}$ is the forecasted variance of the excess market return computed using a ten-year moving window of past excess market returns (Rapach, Ringgenberg, and Zhou 2016). Following the literature, a constraint on $\omega^*_t$ values to be between 0 and 1.5 is imposed based on the assumptions of no short sales and leverage of no more than 50%. The CER is defined as the average utility of the portfolio over the forecasting period:
$$CER = \bar{R}_p - \frac{1}{2}\,\gamma\,\sigma^2_p,$$
where $\bar{R}_p$ is the average portfolio return and $\sigma^2_p$ is the portfolio variance. The CER gain is the difference in CER between a predictive regression using a specific predictor and HMM forecasts, and is expressed as an annualised term, representing the annual portfolio management fees an investor would be willing to pay to receive the information contained in the predictive regression forecast (rather than relying on HMM forecasts). The portfolio rebalancing frequency is similar to the forecast horizon (as in Rapach, Ringgenberg, and Zhou 2016). 49 As a second measure of economic performance we employ the Sharpe ratio.
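The allocation rule and the CER gain described above can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the portfolio return is assumed to be the risk-free rate plus the equity weight times the realised excess market return, and annualisation is assumed to be a simple multiplication by the number of periods per year.

```python
import numpy as np

def equity_weight(forecast, variance, gamma, lower=0.0, upper=1.5):
    """Mean-variance equity share, forecast / (gamma * variance), clipped to [0, 1.5]."""
    w = np.asarray(forecast, dtype=float) / (gamma * np.asarray(variance, dtype=float))
    return np.clip(w, lower, upper)

def cer(portfolio_returns, gamma):
    """Certainty equivalent return: mean return minus 0.5 * gamma * variance."""
    r = np.asarray(portfolio_returns, dtype=float)
    return r.mean() - 0.5 * gamma * r.var(ddof=1)

def cer_gain(excess_ret, rf, fc_model, fc_bench, variance, gamma, periods_per_year=12):
    """Annualised CER of the model-based portfolio minus that of the benchmark portfolio."""
    excess_ret = np.asarray(excess_ret, dtype=float)
    rp_model = rf + equity_weight(fc_model, variance, gamma) * excess_ret
    rp_bench = rf + equity_weight(fc_bench, variance, gamma) * excess_ret
    return periods_per_year * (cer(rp_model, gamma) - cer(rp_bench, gamma))

# Toy numbers for illustration only
w_high = equity_weight(0.02, 0.01, gamma=1)    # unconstrained weight 2.0, capped at 1.5
w_short = equity_weight(-0.01, 0.01, gamma=3)  # negative weight, floored at 0.0
w_mid = equity_weight(0.006, 0.01, gamma=3)    # interior solution of roughly 0.2
same = cer_gain([0.02, -0.01, 0.03], 0.001,
                [0.005] * 3, [0.005] * 3, 0.002, gamma=3)
```

When the model and benchmark forecasts coincide, as in the last call, both portfolios are identical and the CER gain is zero; a positive gain can be read as the annual fee an investor would pay for the model forecast.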
Results for the economic value of forecasts are presented in Table 9. For each forecast horizon, we present the CER gain and Sharpe ratio side-by-side and rank the economic performance of each predictor based on their CER gain, which is called CER ranking, at each forecast horizon. The average of the CER rankings for each predictor across different forecast horizons is denoted as the mean rank. We compute two mean ranks: 'mean rank (all)' is the average of CER rankings across all forecast horizons and 'mean rank (h ≤ 24)' is the average of CER rankings from 1- to 24-month forecast horizons. 50 Finally, we assign a final rank to each predictor according to the mean rank value.
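The ranking procedure just described (rank by CER gain within each horizon, average the ranks, then rank the averages) can be sketched as below. The function names, the three-predictor example and the choice of which columns count as short horizons are hypothetical, and ties are broken by array order rather than by any rule taken from the paper.

```python
import numpy as np

def cer_rankings(gains):
    """Rank predictors within each horizon; rank 1 is the highest CER gain.

    `gains` has shape (n_predictors, n_horizons).
    """
    g = np.asarray(gains, dtype=float)
    order = np.argsort(-g, axis=0)            # predictor indices, best first, per horizon
    ranks = np.empty_like(order)
    positions = np.arange(1, g.shape[0] + 1)
    for j in range(g.shape[1]):
        ranks[order[:, j], j] = positions
    return ranks

def final_rank(ranks, horizon_mask=None):
    """Mean rank across (optionally selected) horizons and the resulting final rank."""
    r = ranks if horizon_mask is None else ranks[:, horizon_mask]
    mean_rank = r.mean(axis=1)
    final = np.argsort(np.argsort(mean_rank)) + 1
    return mean_rank, final

# Toy CER gains: 3 hypothetical predictors (rows) by 3 horizons (columns)
gains = np.array([[1.0, 0.5, -0.2],
                  [0.3, 0.9, 0.4],
                  [-0.1, 0.1, 0.8]])
ranks = cer_rankings(gains)
mean_all, final_all = final_rank(ranks)
mean_sub, final_sub = final_rank(ranks, np.array([False, True, True]))
```

Passing a boolean mask plays the role of the restricted mean rank, where only a subset of forecast horizons enters the average.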
Panels A and B report the portfolio performance for an investor with a risk-aversion coefficient of 1 and 3, respectively. 51 Panel A demonstrates that S TV consistently generates sizeable CER gains, ranging from 0.15% to 1.14%, to investors from the 1-month up to the 24-month forecast horizon, except for the 9-month forecast horizon. This indicates that investors are willing to pay a portfolio management fee of up to 1.14% to exploit the information contained in an S TV-generated forecast. Meanwhile, S BW also produces a greater CER than the HMM model in five out of eight forecast horizons, as shown by the positive CER gain. Taking a closer look at the magnitudes of CER gains, however, reveals that S BW has lower CER gains than S TV for most prediction horizons. Hence, once again we find that constructing the sentiment index in a way which allows for time-varying contributions of its components, as in S TV, helps to improve the way the original approach by Baker and Wurgler (2006) captures the underlying sentiment.
At 1- and 3-month forecast horizons, S PLS delivers the highest CER gains, which are 2.96% and 3.52%, respectively. Further analysis, however, shows that the CER gains of S PLS deteriorate considerably beyond the 6-month forecast horizon and that S PLS underperforms S TV in terms of CER gains beyond the 3-month forecast horizon, except for h = 60. MS and CCI perform the worst among all sentiment measures since the portfolio return does not improve by using their forecasts in lieu of HMM forecasts. Overall, the comparison among investor sentiment measures shows that S TV performs better than most sentiment measures in delivering real benefits to a mean-variance investor with a risk-aversion coefficient of one.
Notes: This table reports the annualised certainty equivalent return (CER) gain (%) and the Sharpe ratio (SR) of portfolios formed based on excess market return forecasts constructed using different investor sentiment measures and economic predictors on a rolling window basis. Panels A and B report results for an investor with a mean-variance utility function with a coefficient of risk aversion of 1 and 3, respectively. Mean rank (all) represents the average ranking of the economic performance of each predictor across eight forecast horizons, whereas mean rank (h ≤ 24) is the average ranking of each predictor from 1-month up to 24-month forecast horizons. The final ranking of each predictor is determined based on their mean rank across different forecast horizons. The investor is assumed to rebalance the portfolio at a frequency similar to the forecast horizon. The proportion of wealth invested in equities is restricted to be between 0 and 1.50. The out-of-sample period stretches from January 1985 until December 2014.
As for the economic predictors, the majority provide little evidence of CER gains. OG appears to be the best return predictor as it consistently generates positive CER gains beyond the 1-month forecast horizon, other than at the 24-month forecast horizon.
When we assume a higher risk aversion coefficient of 3 (panel B), S TV stands out as the superior sentiment measure which consistently delivers positive CER gains up to 24 months. It is worth noting that the annual portfolio management fees an investor is willing to pay to gain access to the forecast based on S TV are higher in panel B (i.e. ranging from 0.41% to 1.92%), as compared to those in panel A where risk aversion was lower (i.e. ranging from 0.15% to 1.14%). Also, comparing with the results in panel A reveals that S TV experiences a greater improvement in CER gains for an investor with a risk aversion coefficient of 3 across most forecast horizons. These findings imply that the more risk-averse an investor is, the more they are willing to pay for access to forecasts generated by our sentiment index.
When risk aversion is higher (panel B), S PLS generates positive CER gains over short-term forecasts of up to the 6-month horizon only, even though it delivers the highest CER gains for 1- and 3-month forecast horizons (5.07% and 5.37%, respectively). In line with the results shown in panel A, although S BW produces positive CER gains for most forecast horizons, it generally has lower CER gains than S TV. MS and CCI are shown to provide positive CER gains to investors only occasionally and hence are ranked as the worst predictors among all investor sentiment measures.
Out of 17 individual economic predictors, only three of them, DE, NTIS and CAY, deliver positive CER gains over half of the forecast horizons. Surprisingly, the diffusion index of all economic predictors (PC-ECON) does not provide positive CER gains for most forecast horizons when γ = 3.
The final two columns, which represent the ranking of each predictor, demonstrate that S TV delivers consistent results across horizons in panels A and B. Specifically, S TV ranks first in both panels when 36- and 60-month forecasts have been excluded. Meanwhile, it ranks second in both panels when we take into account the CER gains across all forecast horizons. Other investor sentiment measures and economic predictors, however, do not produce consistent rankings across horizons in both panels. These results suggest that S TV can consistently deliver economic gains to mean-variance investors across different forecast horizons and magnitudes of risk aversion. 52
Regarding the Sharpe ratio, S TV consistently produces higher values relative to HMM across most forecast horizons, regardless of the degree of risk aversion. This finding is in line with the results for CER gains. MS and CCI generate poorer risk-adjusted returns than HMM for investors with a risk aversion coefficient γ of 1, but CCI has a higher Sharpe ratio than HMM where γ is equal to 3. The result for economic predictors is analogous to that of CCI: economic predictors rarely generate Sharpe ratios higher than that of HMM when γ = 1, but perform better when γ = 3.
Overall, the analysis of the economic value of forecasts indicates that our sentiment index outperforms competitor indices and economic predictors. It generates statistically significant value to investors, as knowledge of investor sentiment captured by our index enables a profit-generating rebalancing and optimisation of equity portfolios.

Summary and conclusions
In this study, we address a puzzle that the market-based investor sentiment index constructed by Baker and Wurgler (2006) does not have a strong time-series forecasting power for future aggregate market returns. This is puzzling, given that this index is constructed as a composite of numerous variables which individually have been widely employed in prior literature as effective sentiment proxies and, hence, combined into one index should be expected to have superior forecasting ability. We propose an enhancement to the original Baker and Wurgler (2006) method which grants the resulting market-based sentiment index the previously lacking forecasting power in the time-series, aggregate market level context, without sacrificing its cross-sectional predictive power. Our approach is based on relaxing the implicit assumption that each of its components' ability to empirically capture the unobservable investor sentiment is time-invariant. We present evidence that the ability of each component to capture sentiment may vary over time and we construct an enhanced investor sentiment index, S TV, which allows each of those components to have time-varying loadings. Our approach captures the dynamic contributions of investor sentiment proxies to the aggregate index while also avoiding any look-ahead bias, as present in the original index.
Empirically, we show that S TV is a superior market-based measure of latent long-run investor sentiment: in-sample, it consistently displays a negative and significant relationship with future long-term stock market returns, in line with the theoretical rationale behind a good sentiment proxy (Baker and Wurgler 2006; 2007; Huang et al. 2015; investor newsletter surveys in Brown and Cliff 2005; consumer confidence in Bathia and Bredin 2013 and Schmeling 2009; the FEARS index in Da, Engelberg, and Gao 2015; and textual-based sentiment measures in Das and Chen 2007; Tetlock 2007). Our index outperforms other sentiment measures in forecasting stock market returns across different forecast horizons. Out-of-sample findings also confirm that the enhanced market-based index is a superior sentiment measure, as it generates superior forecasts as compared to those based on alternative sentiment proxies. We demonstrate empirically that long-run sentiment, as captured by our new index, drives stock prices through the discount-rate rather than the cash-flow channel. Lastly, sentiment possesses unique information about future market movements not contained in most popular economic predictors. Therefore, it could be utilised to improve the quality of stock market forecasts, as it tends to generate both statistically and economically superior forecasts consistently outperforming competitor variables, especially for investors with higher levels of risk aversion.
Overall, our proposed enhancement to the original Baker and Wurgler (2006) index significantly improves its ability to empirically capture the latent investor sentiment. Our new market-based index should therefore be of value in future academic research where a good empirical proxy for sentiment is required, and to stock market investors, as demonstrated by our economic value analysis. For instance, portfolio managers would benefit from the availability of an empirical measure which captures a well-known risk factor (sentiment, or noise trader, risk) in asset pricing (e.g. Antoniou, Doukas, and Subrahmanyam 2016). 53 In addition, given that investor sentiment induces mispricing especially in environments lacking transparency and with weak corporate governance of firms (e.g. Firth, Wang, and Wong 2015), policy makers, regulators and accounting professionals would gain an improved instrument to gauge the severity of such distortions and to guide them towards appropriate reforms. Having a more accurate measure of latent sentiment would also facilitate further studies into what forces drive investor sentiment.

Notes
1. The impact of irrational traders acting on mood rather than fundamental news on stock prices could further be exacerbated by their tendency to flock together, i.e. to herd (Gavriilidis, Kallinterakis, and Tsalavoutas 2016).
2. De Bondt and Thaler (1985) and Chopra, Lakonishok, and Ritter (1992) demonstrate that stock prices experience reversals over 3- to 5-year horizons. Fama and French (1988) and Poterba and Summers (1988) also show that stock returns are negatively correlated over similar long horizons.
3. These original proxies are: dividend premium (PDND); average first-day returns of IPOs (RIPO); the number of IPOs (NIPO); the closed-end fund discount (CEFD); market turnover (TURN) and the share of equity issues (EQ). TURN has been subsequently excluded from updated versions of that index.
4. Recent applications of the BW index include: Akbas et al. (2022), Bhattacharya et al. (2022), Cenesizoglu (2022), Chang et al. (2022), Chen, Choy, and Tan (2022), Chu et al. (2022), Chue and Xu (2022), Guo, Li, and Li (2022), Han et al. (2022), Konstantinidi (2022), Lin et al. (2022), Lou and Polk (2022), to name but a few.
5. Baker and Wurgler (2006; 2007) and Huang et al. (2015), among others, support the notion that in the cross-section, stocks which are speculative and difficult to value and arbitrage are more susceptible to sentiment. In addition, given that those sentiment-driven stocks will typically constitute only a small fraction of the aggregate market, the sentiment effect is diluted by the remaining stocks, and it constitutes an additional challenge to find significant time-series predictive power of a sentiment proxy for the aggregate stock market.
6. For the sake of comparability of results, this study uses only the survey- and market-based sentiment measures. We do not consider the textual-based sentiment measures such as the search-, media- and internet-based sentiment measures (see, for example, Antweiler and Frank 2004; Chen et al. 2014b; Da, Engelberg, and Gao 2015; Das and Chen 2007; García 2013; Sun, Najand, and Shen 2016; Tetlock 2007; Zhang et al. 2021) as these textual-based measures (i) do not match the monthly data frequency of this study, (ii) are not based on revealed preferences (i.e. the actual trading behaviour) of investors, and (iii) are mostly available for much shorter time periods.
7. See, primarily, Huang et al. (2015), but also Arif and Lee (2014), Baker and Wurgler (2006; 2007), Brown and Cliff (2004), De Long et al. (1990), Lee, Jiang, and Indro (2002), Stambaugh, Yu, and Yuan (2012).
8. VIX and other sentiment proxies are excluded from the construction of the S BW index due to data not being available for the full sample period.
9. Which interchangeably can be viewed as 'macroeconomic variables' as termed by Kadilli (2015) and Lemmon and Portniaguina (2006), or 'rational factors' as termed by Brown and Cliff (2005).
10. A good short-term performance should not be surprising, given that the PLS method, as applied in Huang et al. (2015), requires running time series regressions linking each sentiment proxy to one-period-ahead (i.e. future) stock returns, hence effectively 'training' the resulting sentiment index in-sample towards short-term predictability (assuming reasonable levels of parameter stability). By contrast, the BW approach and our modification to it only utilise sentiment proxies when constructing the sentiment index, without exploiting the in-sample predictive link between individual proxies and future stock returns. That link is used to subsequently assess the validity of a sentiment proxy and should not be used in the construction of that proxy.
11. As Da, Engelberg, and Gao (2015) put it, '[...] market-based measures [...] have the disadvantage of being the equilibrium outcome of many economic forces other than investor sentiment' (p. 2).
12. In this study, we do not attempt to investigate empirically the economic reasons behind each component's ability to capture latent sentiment (and behind time variations in this ability). Rather, we let the data speak for itself, in line with the general approach adopted in this branch of the literature (including Baker and Wurgler 2006; 2007; Huang et al. 2015).
13. The use of contemporaneous rather than lagged values is more realistic when applied in the forecasting context since, at any point in time t prior to the sample's end, a forecaster would not possess information from the entire sample to determine the optimal in-sample lag order for each proxy.
14. Unreported results, available on request from the authors, support our supposition that a three-year rolling window for the PCA is optimal in allowing S TV to reflect the time-varying ability of each index component and, hence, to empirically capture unobservable investor sentiment, compared to using PCA windows of one, two, four or five years in the calculation of S TV.
15. The average proportion of variance explained by the first principal component across different windows is about 50%, which is higher than the proportion of variance explained by the first principal component from the entire sample (i.e. about 41%), further supporting our approach of time-varying estimations.
16. In fact, when we replicated the BW index using those authors' data, our resulting whole-sample estimated index displayed a perfectly negative correlation with the original BW index, clearly demonstrating the arbitrariness of sign assignment and the intrinsic sign indeterminacy in the PCA approach. Additionally, the sentiment data file maintained by Jeffrey Wurgler online explicitly warns its users that PCA could yield the 'negative of sentiment'.
17. BW also flip the signs of their entire index series if its December 2000 value is not positive, as indicated in the Stata code available on Jeffrey Wurgler's website.
18. In unreported analysis where we regress future returns on each of the BW sentiment proxies in sample, one at a time, RIPO is consistently the only significant predictor across all return horizons, yielding further justification for our choice of RIPO as a benchmark sentiment index in PCA.
19. When we estimate our time-varying index without sign adjustments/flipping (results not reported), its ability to empirically capture sentiment is still superior to the original Baker and Wurgler (2006; 2007) index, but inferior to the time-varying index constructed with sign flipping in accordance with the theoretical rationale regarding the RIPO component.
20. A further rationale for theory-based sign re-assignment is that this approach is employed in other branches of the finance literature, e.g. in the forecasting literature where restrictions are imposed on the signs of coefficients and return forecasts (Campbell and Thompson 2008).
21. It should be noted that this difference in volatility vis-à-vis S BW stems solely from the time-varying nature of the estimated component loadings, as the underlying data on variables entering the PCA are identical to those of BW.
22. While inspecting Figure 2, one can observe instances of more extreme values of the estimated weights, as pointed out by the Associate Editor and an anonymous reviewer. These extremes are partially spurious, as they are due to the boundaries of confidence intervals rather than the point estimates themselves; however, in unreported analysis we explored whether extreme values of weights are adding excessive noise or are the sole source of our index's predictive power. By winsorizing each weight by its full sample 25th and 75th percentile, the resulting sentiment index is highly correlated with our original S TV index (r = 99.1%) and its in- and out-of-sample forecasting performance is very similar as well. Hence, the estimated weights are not excessive in that they do not add noise, but also our S TV index does not rely on extreme weight realisations for its superior forecasting performance.
23. To this end, we firstly estimate the effect, in each window, of each component relative to the summed effect of all components. This is accomplished by taking the absolute value of the product of each component's weight and the corresponding component's value observed at the end of that window, and dividing it by the sum of such calculated (five) components' effects. The resulting time series are then smoothed using 3-year moving windows.
24. In unreported results, available on request from the authors, we investigate statistically the validity of allowing weights to vary over time. Structural breaks in the weights time series, the presence of which would invalidate a methodology employing a fixed weighting scheme, are found, both by employing structural break tests and also by investigating level shifts as a result of the IT bubble (2000) and the Global Financial Crisis (2007–2009). The robust evidence we find that structural breaks exist strongly rejects the assumption that weights should be fixed constant values.
in investor behaviour and informational efficiency of financial markets, both developed and emerging ones, in particular in topics such as how diverse factors affect investor behaviour and stock markets, including the rationality of motives to trade, investor herding, and sentiment. His further research interests include the behaviour of stock returns, return predictability and performance of trading rules, international financial integration and contagion risk, and corporate finance. Bartosz has published extensively in journals such as Journal of Banking and Finance, Journal of Empirical Finance, Economics Letters, Journal of Economic Behavior, International Review of Financial Analysis, and Journal of International Financial Markets, Institutions and Money.
Robert Anderson joined Newcastle University Business School, UK as a Lecturer in Economics in 2009 following completion of his PhD at the University of Manchester. He has used his background in applied time-series econometrics to research topics involving large datasets in finance and economics. In the former, he has developed a growing interest in behavioural finance, exploring issues of sentiment in stock markets, and the consumer inertia behind retail interest rate rigidities. In the area of finance more generally, he has published papers on stock market contagion and retail deposit pricing mechanisms. In economics, he has looked at consumer inflation expectations and developed econometric methods to analyse such data.

Figure 1. Investor sentiment indexes. Note: This figure depicts the investor sentiment indexes over three sub-periods: December 1968 to November 1982, December 1982 to March 2001, and April 2001 to December 2014. The blue line is the time-varying weighted investor sentiment index (S TV), the green line is the aligned investor sentiment index (S PLS) retrieved from Guofu Zhou's website, and the dotted line is the Baker and Wurgler investor sentiment index (S BW) retrieved from Jeffrey Wurgler's website. Orthogonal investor sentiment indexes are used in this figure. Shaded areas represent NBER-dated recessions.
recession, the burst of biotech bubble in mid-1983, the Black Monday 1987, post-dot-com bubble period of 2000-2002, and the subprime mortgage crisis of 2008-2009.

Figure 2. Principal component loadings of each investor sentiment proxy for S TV. Note: These figures show the weights of individual investor sentiment proxies: dividend premium (PDND), average first-day returns of IPOs (RIPO), number of IPOs (NIPO), closed-end fund discount (CEFD), and share of equity issues (EQ), as well as their 95% confidence intervals, for the time-varying weighted investor sentiment index (S TV) from December 1968 to December 2014. The weights of sentiment proxies in month t are retrieved from the first principal component on a rolling window basis with a window length of 36 months. Horizontal lines depict analogous time-invariant loadings and confidence intervals obtained from the whole-sample PCA. Shaded areas represent NBER-dated recessions.
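The rolling extraction of first-principal-component loadings described in the note to Figure 2 can be sketched roughly as below. This is a simplified illustration under several assumptions, not the authors' procedure: power iteration stands in for a library eigen-solver, the PCA is run on the correlation matrix of each window, and the sign is fixed by forcing a chosen anchor column (standing in for RIPO) to load positively. The function names and the simulated three-proxy data are hypothetical, and confidence intervals are not computed.

```python
import math
import random


def correlation_matrix(rows):
    """Sample correlation matrix of the columns of `rows` (list of lists)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
            for a in range(k)]


def first_pc_loadings(rows, iters=500):
    """Leading eigenvector of the correlation matrix via power iteration."""
    corr = correlation_matrix(rows)
    k = len(corr)
    v = [1.0 / (j + 1) for j in range(k)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def rolling_loadings(rows, window=36, anchor=0):
    """Unit-length loadings for each window, signed so `anchor` loads positively."""
    out = []
    for end in range(window, len(rows) + 1):
        v = first_pc_loadings(rows[end - window:end])
        if v[anchor] < 0:  # resolve the sign indeterminacy of PCA
            v = [-x for x in v]
        out.append(v)
    return out


# Simulated data: three hypothetical proxies driven by one common factor,
# the third with a negative exposure. Column 0 plays the role of the anchor.
random.seed(0)
rows = []
for _ in range(60):
    f = random.gauss(0, 1)
    rows.append([f + 0.3 * random.gauss(0, 1),
                 f + 0.3 * random.gauss(0, 1),
                 -f + 0.3 * random.gauss(0, 1)])
loadings = rolling_loadings(rows, window=36, anchor=0)
```

Without the sign step, consecutive windows could return a loading vector and its negative, which is the indeterminacy discussed in notes 16 to 19.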

Figure 3. Relative importance of individual sentiment proxies for S TV. Notes: This figure shows how the contribution of each component (RIPO, NIPO, PDND, CEFD and EQ) evolves over time, as a percentage of the total contribution of all components at each point in time.

Figure 5. The forecasts of excess market returns across different forecast horizons. Note: This figure illustrates the out-of-sample forecasts produced by the time-varying weighted investor sentiment index (S TV), the Baker and Wurgler investor sentiment index (S BW), and the aligned investor sentiment index (S PLS). The forecasts are produced on a rolling window basis with a fixed window length of 15 years, and are compared against the realised return (6-month moving average of excess market return). Shaded areas represent NBER-dated recessions.

Table 1.
Descriptive statistics for weights of individual PCA components (after flipping).
Notes: This table reports the summary statistics of data (panel A) and the correlation of S TV and other sentiment measures (panel B). SD denotes standard deviation, Min is the minimum value, Max is the maximum value and ρ(1) is the first-order autocorrelation. The description of each predictor variable is given in the text. The sample period in panel A spans 588 months, from January 1966 until December 2014; the sample period in panel B spans 553 months, from December 1968 until December 2014. The sentiment values prior to December 1968 are excluded from the correlation analysis in panel B since constant loadings are assigned to each component of S TV prior to this date. *, ** and *** indicate statistical significance at 10%, 5% and 1% levels, respectively.
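[…] $\mathrm{VIX}_t, \mathrm{MS}_t, \mathrm{CCI}_t\}$

$\mathrm{Econ}_{j,t} = \{\mathrm{DP}_t, \mathrm{DY}_t, \mathrm{EP}_t, \mathrm{SVAR}_t, \mathrm{BM}_t, \mathrm{NTIS}_t, \mathrm{TBL}_t, \mathrm{LTR}_t, \mathrm{TMS}_t, \mathrm{DFY}_t, \mathrm{DFR}_t, \mathrm{INFL}_{t-1}, \mathrm{CAY}_t, \mathrm{OG}_{t-1}, \mathrm{SCR}_t\}$ (1)

Neely et al. 2014; Rapach, Strauss, and Zhou 2010). Comparing the forecasting performance of S TV against PC-ECON in this way therefore gives us an extra layer of robustness to the results. The model estimated is thus:

$\mathrm{VIX}_t, \mathrm{MS}_t, \mathrm{CCI}_t, \mathrm{DP}_t, \mathrm{DY}_t, \mathrm{EP}_t, \mathrm{SVAR}_t, \mathrm{BM}_t, \mathrm{NTIS}_t, \mathrm{TBL}_t, \mathrm{LTR}_t, \mathrm{TMS}_t, \mathrm{DFY}_t, \mathrm{DFR}_t, \mathrm{INFL}_{t-1}, \mathrm{CAY}_t, \mathrm{OG}_{t-1}, \mathrm{SCR}_t, \text{PC-ECON}_t$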

Table 3.
Predictive performance of investor sentiment indexes. This table reports the estimates obtained from equation (1) for the time-varying weighted investor sentiment index (S TV), the Baker and Wurgler investor sentiment index (S BW), the aligned investor sentiment index (S PLS), the University of Michigan Consumer Sentiment Index (MS) and the Conference Board Consumer Confidence Index (CCI) across different prediction horizons. The Newey-West (automatic bandwidth selection) t-statistics are shown in brackets. *, ** and *** indicate statistical significance at 10%, 5% and 1% levels, respectively. The sample period ranges from December 1968 to December 2014.

Table 4.
Out-of-sample forecasting results: S TV vs. other investor sentiment measures.
Notes: This table presents the Campbell and Thompson (2008) $R^2_{OS}$ (in percentage) and the Clark and West (2007) MSFE-adjusted statistic of various investor sentiment measures: the time-varying weighted investor sentiment index (S TV), the Baker and Wurgler investor sentiment index (S BW), the aligned investor sentiment index (S PLS), the University of Michigan Consumer Sentiment Index (MS) and the Conference Board Consumer Confidence Index (CCI). *, ** and *** indicate statistical significance at 10%, 5% and 1% levels, respectively, based on the Newey-West t-statistic for the MSFE-adjusted test. The in-sample period ranges from December 1968 to November 1983 and the out-of-sample period from December 1983 to December 2014.
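For readers unfamiliar with the two out-of-sample statistics named in the notes to Table 4, a minimal sketch of their usual definitions follows. It rests on assumptions: the benchmark forecast is taken to be supplied by the caller (typically the historical mean return), the t-statistic uses an ordinary standard error rather than the Newey-West correction mentioned in the notes, and the function names and toy numbers are hypothetical.

```python
import math


def r2_os(actual, model_fc, bench_fc):
    """Out-of-sample R^2 (in percent): 1 - SSE(model) / SSE(benchmark)."""
    sse_model = sum((a - m) ** 2 for a, m in zip(actual, model_fc))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, bench_fc))
    return 100.0 * (1.0 - sse_model / sse_bench)


def msfe_adjusted(actual, model_fc, bench_fc):
    """MSFE-adjusted statistic: t-statistic of the mean of
    f_t = (r - bench)^2 - [(r - model)^2 - (bench - model)^2].
    Uses a plain (not Newey-West) standard error, as a simplification."""
    f = [(a - b) ** 2 - ((a - m) ** 2 - (b - m) ** 2)
         for a, m, b in zip(actual, model_fc, bench_fc)]
    n = len(f)
    mean = sum(f) / n
    var = sum((x - mean) ** 2 for x in f) / (n - 1)
    return mean / math.sqrt(var / n)


# Made-up numbers purely to exercise the functions.
actual = [1.0, -1.0, 2.0, 0.0]
bench = [0.5, 0.5, 0.5, 0.5]
model = [0.8, -0.5, 1.5, 0.2]
r2 = r2_os(actual, model, bench)
cw = msfe_adjusted(actual, model, bench)
```

A positive `r2` means the model's squared forecast errors are smaller than the benchmark's, and the MSFE-adjusted statistic is conventionally compared against one-sided normal critical values.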

Table 5.
Forecast encompassing tests: S TV vs. other investor sentiment measures.

Table 6.
The forecasting channel of investor sentiment. Notes: This table reports the estimates obtained from equation (5) for the time-varying weighted investor sentiment index (S TV) in panel A, the Baker and Wurgler investor sentiment index (S BW) in panel B, and the aligned investor sentiment index (S PLS) in panel C. DP is the log dividend-price ratio on the S&P 500 index measured in annual terms, DG is the annual log dividend-growth rate of the S&P 500 index computed from year t to t + 1 (in percentage), EG is the annual log earnings-growth rate of the S&P 500 index computed over a year (in percentage), and GDPG is the annual log real GDP growth rate (in percentage). The Newey-West (automatic bandwidth selection) t-statistics are shown in brackets. *, ** and *** indicate statistical significance at 10%, 5% and 1% levels, respectively. The sample period ranges from 1968 to 2014.
This table presents the Campbell and Thompson (2008) $R^2_{OS}$ (in percentage) and the Clark and West (2007) MSFE-adjusted statistic of the time-varying weighted investor sentiment index (S TV) and the economic predictors listed in Section 5. *, ** and *** indicate statistical significance at 10%, 5% and 1% levels, respectively, based on the Newey-West t-statistic for the MSFE-adjusted test. The in-sample period stretches from December 1968 to November 1983 and the out-of-sample period from December 1983 to December 2014.

Table 8.
Forecast encompassing tests: S TV vs. economic predictors.

Table 9.
Out-of-sample CER gains and Sharpe ratios for a mean-variance investor.
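As a rough guide to the quantities named in the Table 9 caption, the sketch below uses the textbook definitions of the certainty equivalent return (CER) for a mean-variance investor and of the Sharpe ratio. The risk-aversion value, the function names, and the toy return series are assumptions made here for illustration. The settings used in the paper (risk aversion, portfolio weight constraints, annualisation) are not given in this section.

```python
import statistics


def cer(portfolio_returns, gamma=3.0):
    """Certainty equivalent return: mean - (gamma / 2) * variance.
    gamma is the relative risk aversion (the value 3.0 is an assumption)."""
    mu = statistics.fmean(portfolio_returns)
    var = statistics.variance(portfolio_returns)
    return mu - 0.5 * gamma * var


def cer_gain(model_returns, bench_returns, gamma=3.0):
    """CER of the model-based portfolio minus that of the benchmark portfolio."""
    return cer(model_returns, gamma) - cer(bench_returns, gamma)


def sharpe_ratio(excess_returns):
    """Mean excess return divided by its sample standard deviation."""
    return statistics.fmean(excess_returns) / statistics.stdev(excess_returns)


# Made-up portfolio excess returns, purely illustrative.
model_port = [0.01, 0.03, 0.02]
bench_port = [0.00, 0.02, 0.01]
gain = cer_gain(model_port, bench_port)
```

Because the two toy series have the same variance, `gain` here simply equals the difference in mean returns.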