Influences on mutual fund performance: comparing US and Europe using qualitative comparative analysis

Abstract
This study examines the conditions that lead mutual funds to underperform or outperform competitors. Using fuzzy-set qualitative comparative analysis (fsQCA), we draw upon extensive research on fund returns to affirm and extend earlier discoveries. Fund performance (Morningstar ratings), features of the funds themselves, and characteristics of the fund managers are considered. Positive Morningstar star and analyst ratings are necessary conditions for funds to generate value (measured by Jensen’s alpha). Funds with low management fees and low ongoing fees have attractive Sharpe ratios and high returns. Likewise, large funds with good Morningstar ratings have good Sharpe ratios and returns, often when fund managers have short tenures.


Introduction
In the social sciences, the research-consuming public repeatedly fails to act upon widely published findings. This study affirms and extends earlier findings concerning mutual fund returns and risks, doing so using results from fuzzy-set qualitative comparative analysis (fsQCA), a method that scholars rarely employ.
Financial economists have used variants of ordinary least squares (OLS) to examine mutual fund returns (Kamstra, Kramer, Levi, & Wermers, 2017; Muñoz, Vicente, & Ferruz, 2015). The goal is usually to discern whether professional managers add value for investors. We offer further support to these studies' principal findings, namely that few mutual funds outperform tracker indexes or justify their high fees.
This paper compares mutual funds in Europe and the United States. FsQCA overcomes the assumption of linear relationships when explaining fund returns in terms of other variables. It provides a tool for making empirical and theoretical inroads into the debate on factors that influence mutual fund performance and adds value by jointly considering the pillars that are used to assess the functioning of mutual funds but have not yet been analysed in the economic literature.
Certain studies have shown that positive, abnormal returns are relatively small and often fail to exceed the funds' expenses (Droms & Walker, 1995; Jensen, 1968; Malkiel, 1995; Sharpe, 1966). Paradoxically, the perceived value of active management dominates the reality: Returns justifying active management are in fact rare. Despite this paradox, scholars conclude that finding successful funds ex ante is extremely difficult (Cuthbertson, Nitzsche, & O'Sullivan, 2016). Surprisingly, the industry has continued to grow.
So why do people invest in managed funds? Ostensibly, such funds add value through asymmetric information advantages and the fund manager's skills, thereby providing positive returns (Agarwal, Mullally, Tang, & Yang, 2015). Daniel, Grinblatt, Titman, and Wermers (1997) showed that mutual funds exhibit some selectivity ability. Future fund performance is influenced by fund features and fund manager characteristics.
Fund ratings offer a way of evaluating funds. For ratings to be useful and valid, performance should reflect these ratings (Chen, Wang, & Yu, 2014). Some studies have examined the predictive capacity of certain well-established mutual fund rating systems for the US market (Blake & Morey, 2000; Sharpe, 1998). Morningstar's qualitative and quantitative ratings are a common investment ratings instrument amongst investors and managers (Bolster & Trahan, 2013).
Aware of the importance of fund ratings, we assess fund performance as a function of Morningstar ratings, fund features, the fund manager's experience, and fund fees. Methodologically, the use of fsQCA in this study contributes to the literature on factors that affect mutual fund performance. Comparative methods, which are based on set theory, seek causal configurations within an empirical data set (Rihoux & Ragin, 2009). Such methods are unhindered by the assumption that causal conditions are linear-additive. Net effects must also be analysed. Multiple regression analysis (MRA) effectively identifies symmetrical relationships, but empirical observations often conceal other kinds of relationships (Fiss, 2011). Insightful combinations of conditions usually have asymmetrical relationships with an outcome.
FsQCA focuses on asymmetrical relationships to identify sufficient or necessary conditions for a certain outcome. FsQCA performs well when causation is complex and when different conditions yield identical results. Regression coefficients offer little insight when a high value of any one variable is not always associated with a high value of the dependent variable (Woodside, 2012a).
In fsQCA, the outcome occurs only when necessary conditions are present. In contrast, sufficient conditions always lead to the outcome (Fiss, 2007). FsQCA provides a novel tool for making empirical and theoretical inroads into the debate on factors that influence mutual fund performance. The paper has the following structure. Section 2 discusses the literature and frames mutual fund performance in terms of Morningstar ratings, fund features, and fund manager characteristics. Section 3 describes the data and analysis method. Section 4 presents the results. Section 5 provides the conclusions, limitations, and managerial implications. Finally, research opportunities are outlined.

Literature review
Extensive literature describes mutual fund performance in terms of a wide array of factors. Our paper extends this body of research, exploring the ability of Morningstar ratings and other fund features to predict fund performance. These features are fund size and fund age, the manager's experience, and fund fees.

Morningstar star rating
The Morningstar star rating was introduced in 1985 by Morningstar, Inc. Its one- to five-star scale lets investors easily evaluate funds based on risk-adjusted returns and distinguish between funds with similar investment strategies (Del Guercio & Tkac, 2008). Sharpe (1998) explored the properties of the rating system, underscoring its importance.
Star ratings are based on expected utility theory. The assumptions are (1) that investors are more sensitive to poor outcomes than unexpectedly high returns and (2) that they are willing to forgo part of the expected return in exchange for greater certainty of returns. These assumptions offer a prosaic reflection of the classic risk/return trade-off. Blake and Morey (2000), Blume (1998), and Morey (2002) describe the Morningstar rating methodology in detail. The rating quantitatively assesses a fund's past performance over the past 3, 5, and 10 years. In certain scenarios, establishing a rating proves impossible. For example, funds that have overhauled their investment strategy have insufficient data to be awarded a rating. The rankings order funds by their Morningstar risk-adjusted return scores. The top 10% of funds receive five stars, the next 22.5% receive four stars, the next 35% receive three stars, the next 22.5% receive two stars, and the bottom 10% of funds receive one star.
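The percentile bands described above can be illustrated with a minimal Python sketch. It is not Morningstar's implementation: the function name `assign_stars`, the handling of ties, and the rounding at band edges are assumptions made for illustration, and the input scores are taken as already computed risk-adjusted returns.

```python
def assign_stars(scores):
    """Map risk-adjusted return scores to 1-5 stars by percentile rank.

    Bands follow the text: top 10% -> 5 stars, next 22.5% -> 4,
    next 35% -> 3, next 22.5% -> 2, bottom 10% -> 1.
    Tie handling and rounding at band edges are assumptions of this sketch.
    """
    n = len(scores)
    # Rank funds from best (position 0) to worst (position n - 1).
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Cumulative share of funds at the lower edge of each band.
    bands = [(0.10, 5), (0.325, 4), (0.675, 3), (0.90, 2), (1.00, 1)]
    stars = [0] * n
    for position, fund in enumerate(order):
        share = (position + 1) / n  # fraction of funds ranked at or above this one
        for cutoff, star in bands:
            if share <= cutoff + 1e-12:  # small tolerance for float comparison
                stars[fund] = star
                break
    return stars


# Hypothetical scores for 40 funds within one category.
example = assign_stars([float(s) for s in range(40)])
print(example.count(5), example.count(4), example.count(3))
```

Because the bands are relative, a fund's star rating depends on its peers in the same category rather than on any absolute return level.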
The rating overlooks basic mutual fund data such as analyst opinions, fund manager performance, and expense ratios. Accordingly, it assesses the fund's management quality but does not predict future performance. Carhart (1997) criticises Morningstar ratings, noting that the star rating is a backward-looking measure. As such, it is of limited use to investors because past performance is typically a poor predictor of future performance. The rating alone provides insufficient data for investment decisions. Other characteristics that the rating does not cover are key considerations when choosing a fund. The rating nevertheless flags funds that deserve potential investors' attention.
But do Morningstar ratings in some way predict performance? Studies have documented the importance of star ratings in investor allocation decisions (Del Guercio & Tkac, 2008; Sirri & Tufano, 1998). New cash flows from investors flock to funds with good past performance ratings. Babalos, Doumpos, Philippas, and Zopounidis (2015) indicate that Morningstar ratings are closely related to the efficiency of the funds. Jain and Wu (2000) reveal that despite attracting extra cash inflows, funds that advertise above-the-benchmark returns fail to provide superior returns in the following period. Morningstar affirms in its prospectuses that star ratings are not predictors of future performance, advocating caution from investors who are keen to extrapolate a high fund rating to superior future performance. Nevertheless, highly rated funds experience cash inflows that far exceed the cash outflows experienced by poorly rated funds (Sirri & Tufano, 1998). Examining performance across funds shows whether these cash flows are justified by subsequent fund performance (Blake & Morey, 2000).
Using common performance metrics, studies have assessed how well Morningstar star ratings predict returns. Even before Morningstar's ratings methodology changed in 2002, several studies had already shown that high star ratings were poor predictors of future superior performance (Blake & Morey, 2000; Morey, 2005). After 2002, Gottesman and Morey (2006) found that the rating system had got better at predicting future performance. Kräussl and Sandelowsky (2007) refuted these findings, reporting that predictions based on star ratings could not beat a random walk. Gerrans (2006) and Füss, Hille, Rindler, Schmidt, and Schmidt (2010) also failed to find support for the ratings' predictive power.

Morningstar analyst ratings
Investors and advisors operating in the current financial environment need objective, independent, transparent reports on mutual funds. Qualitative analysis of funds considers overall investment quality and helps investors understand how a particular fund can contribute to and complement their existing investment portfolios. In 2011, analyst ratings began to provide overall mutual fund ratings. Based on analysts' opinions regarding whether funds will outperform benchmarks over the market cycle, analyst ratings are forward-looking qualitative and quantitative analyses of mutual funds' competitive advantages (or shortcomings). A report on each qualified fund accompanies the Morningstar analyst rating. Analyst ratings identify funds that are appropriate for particular investor portfolios and risk tolerances (Haslem, 2014).
Morningstar is an independent, well-known, credible source of investment advice, so the market highly values Morningstar analyst ratings (Bolster & Trahan, 2013). Ratings may affect investors' decisions, leading to a fund flow response to ratings. The impact of analyst ratings on future fund flows may incentivise fund managers to improve in key areas that undermine their analyst ratings.
Morningstar analysts evaluate five pillars that have proven critical to a fund's long-term risk-adjusted performance. Analysts rate each pillar as positive, neutral, or negative. The fund's overall rating depends on the following five pillars (Armstrong, Genc, & Verbeek, 2017):
Parent: The parent company is evaluated in key areas, including shareholder structure, incentive compensation system, stability of management teams, and corporate culture. The parent company is important for long-term investment.
People: Many people contribute to the fund's investment process. The quality assessment of the management significantly influences the rating. Team experience, manager temperament, manager workload, analytical support, incentive structure, and information flow are amongst the factors that are analysed.
Performance: Does the manager add value? Lucky and talented managers often resemble one another. Analysts focus on the sources of performance, the risks the manager has taken, and the performance of the same managers at the helm of other accounts.
Price: Fees are good predictors of future performance. Understanding whether the fund offers good value versus similar funds is important.
Process: Strategy is one of the keys to the fund's success, so analysts evaluate the fund's investment strategy and its implementation. They assess whether the strategy suits the manager's skillset and the fund's resources. They also evaluate the risks attached to the strategy.
Studies suggest that investors consider Morningstar ratings when picking well-performing funds. The result is higher inflow to higher rated funds. Armstrong et al. (2017) examined investors' responses to analyst ratings. Overall, ratings influence investor allocation decisions. But the lack of clarity regarding whether ratings provide valuable information on long-term performance is a key motivator of the present study.
Positive ratings have three levels: Gold, Silver, or Bronze. Neutral and Negative are the alternative ratings. Negative ratings have only one level, so positive ratings are more nuanced. By awarding positive ratings to certain funds, Morningstar analysts express the belief that these funds will outperform competitors in the long term. Through positive ratings, analysts convey their beliefs regarding funds' strengths or weaknesses. Each analyst rating is now described. The ratings appear in descending order of expected performance.
Positive ratings: Gold funds are outstanding. Based on the five assessment pillars, these funds have earned the analysts' full conviction that their performance will be excellent. Silver funds have notable advantages in several but not all of the five pillars. These strengths give analysts a high level of conviction. The advantages of Bronze funds outweigh the disadvantages. The analysts' conviction is sufficient for the fund to warrant a positive rating.
Neutral ratings: Although the returns of these funds are not excellent, their behaviour is no worse than average.
Negative ratings: These funds have at least one flaw that analysts believe will significantly affect future performance. They are considered inferior to their peers.

Performance and fund characteristics
Numerous fund performance studies provide evidence of the relation between performance and fund features such as size, manager tenure, and fees (Cuthbertson et al., 2016; Golec, 1996). Others, like Agarwal et al. (2015), have affirmed that the performance decline is more pronounced with greater information asymmetry. We now discuss these features.

Fund age
The effect of fund age on performance is unclear (Cremers & Petajisto, 2009; Cuthbertson et al., 2016; Golec, 1996; Otten & Bams, 2002). Peterson, Petranico, Riepe, and Xu (2001) and Prather, Bertin, and Henker (2004) showed that younger funds perform better than older funds or that fund age and performance bear little relation. Younger funds may focus more on management, but this advantage is countered by higher start-up costs. Gregory, Matatko, and Luther (1997) suggest that younger mutual funds' performance might be affected by an 'investment learning period', reporting that younger funds tend to be smaller than older funds. That younger funds sometimes outperform older funds is interesting, given that survivorship bias would favour the older funds.

Fund size
Fund size (net assets) influences fund management. The relationship between fund size and fund performance remains unclear. Numerous investigations of the potential effect of size on fund performance are inconclusive. Droms and Walker (1995) actually reported a negative relationship between fund size and performance, explaining this negative relationship by citing larger funds' more diversified portfolios, which have lower risk and consequently lower returns. Babalos, Mamatzakis, and Matousek (2015) revealed that investors should be aware that larger funds have reduced flexibility, which offers inferior performance when markets are turbulent. Chen, Hong, Huang, and Kubik (2004) reported that larger funds offer substantial savings because of their scale but that fund size actually erodes performance. Basso and Funari (2017) studied a set of European equity mutual funds and found that there is no relationship between size and performance but that there is a positive size effect for large funds in comparison with smaller ones.
Other research also implies that performance deteriorates with fund size (Kacperczyk & Seru, 2007; Pástor, Stambaugh, & Taylor, 2015; Pollet & Wilson, 2008). Ding, Zheng, and Zhu (2015) provide evidence of an inverted U-shaped relationship between fund size and performance: As fund size increases, fund performance first improves but later declines. Some studies have linked fund size to fund fees. For example, Khorana, Servaes, and Tufano (2009) noted that larger funds often charge lower fees. The hypothesis positing the benefits of economies of scale receives little support from the literature.

Manager tenure
Manager tenure is a proxy for management experience and may influence mutual fund performance. Here, we use the number of years the manager has been managing the fund to explore links between experience and fund performance. Nevertheless, most studies have shown that manager tenure has no significant effect on performance (Costa, Jakob, & Porter, 2006; Costa & Porter, 2003; Switzer & Huang, 2007). Peterson et al. (2001) suggested that managers who run funds for shorter periods are usually more alert and have greater incentives to perform. Asal (2016) noted that it is not possible to determine the managers who will be successful in the future because investors need to invest ex ante, not ex post. In contrast, Filbeck and Tompkins (2004) and Golec (1996) have argued that managers with longer tenures perform better. Consequently, investors would prefer to invest in funds run by experienced managers. The same researchers reported a significant positive relationship between management tenure and performance, concluding that managers with more experience analyse and process information more efficiently, so manager tenure may also be linked to lower fees. Fund investors learn to avoid excess sales expenses (Ban, 2015). Financial education plays a relevant role in the retail financial market. Andreu, Sarto, and Serrano (2015) found that younger managers or managers with shorter fund management careers outperform older managers. The reason for this finding is that older managers are more risk averse, whereas younger managers take greater risks, which leads to greater fund returns. Not all scholars share this opinion, though. For example, Alibakhshi, Reza, and Moghadam (2016) affirmed that an amateur investor can make similar investments to those of professionals. Other scholars have cited good versus bad luck as a key factor (Ayadi, Ben-Ameur, & Kryzanowski, 2016; Ayadi, Chaibi, & Kryzanowski, 2016; Blake, Caulfield, Ioannidis, & Tonks, 2017).
Andreu and Puetz (2017) affirmed that individuals with university degrees can be found at either extreme: outstanding returns or devastating losses. In contrast, individuals with two university degrees are more cautious investors and take fewer risks, so they achieve more modest returns. Finally, Andreu, Gargallo, Salvador, and Sarto (2015) argued that everything depends on the emotional and rational factors that lie behind decisions, so the traditional model based on the laws of the market (i.e., prices offer a source of information) ceases to be efficient.

Fund fees
Typically, mutual fund cost is measured by the ongoing charge figure, which is the sum of all fees charged by the fund that are deducted from the fund's net asset value. The mutual fund literature commonly treats the management fee separately to analyse the relationship between management fees and performance. The way that mutual fund performance and fund fees relate to one another tests the value of active management. Mutual fund investors pay for the benefits associated with the costs of this investment. The manager's skill should be reflected in better performance, which justifies a higher management fee. In some cases, high management fees signify superior investment skill, which leads to better performance (Golec, 1996). In fact, the manager's skill may shape the relationship between fees and fund performance. Berkowitz and Kotowitz (2002) report a positive relationship between fees and performance by funds managed by good managers. In contrast, for low-quality managers, the relationship between fees and performance is negative.
The literature on the relationship between fees and performance is inconclusive. Droms and Walker (1996) reported a significant positive relationship between fund returns and fees. Overall, however, scarce empirical evidence supports the expected positive relationship between performance and fees. Gruber (1996) reported that fees for the top performing funds are no higher than they are for other funds, also arguing that investors are motivated to buy actively managed funds and pay their fees because past performance can partially predict future performance. Ban (2015) confirmed that higher sales expenses are not related to better performance. Anagol, Marisetty, Sane, and Venugopal (2017) found that reducing commissions would slow the growth of the mutual funds market. Carhart (1997) and Golec (1996) reported that higher fees are actually associated with lower investment performance. Otten and Bams (2002) listed different types of fees and reported a negative relationship between European mutual fund performance and these fees.

Empirical data
Morningstar classifies mutual funds according to asset investment policy. Funds are classified according to their current management style rather than the management regulation. This classification adheres more to the actual policy of the fund than to any statement by the manager. In September 2016, we collected mutual fund data from the Morningstar mutual fund database. The sample comprised funds that invested in large-cap US equities and large-cap Eurozone equities only. Sampled funds had to have Morningstar ratings and Morningstar analyst ratings. Data were available for 224 mutual funds, 60 of which invested in the Eurozone and 164 of which invested in the US.
All funds had their registered offices in Luxembourg. Luxembourg is the largest fund domicile in Europe and the second largest worldwide (after the United States). Luxembourg offers managers operational ease, favourable fiscal conditions, and a low level of bureaucracy. Registering funds is simpler in Luxembourg than it is elsewhere, reducing registration costs. The Association of the Luxembourg Fund Industry (ALFI) promotes Luxembourg's status as a leading international fund hub that investors, policymakers, and industry representatives consider open, reliable, and innovative. According to the ALFI Annual Report 2015-2016, net assets under management at the end of 2015 were €3.5 trillion, and 45.75% of all net sales in Europe were attributed to Luxembourg-domiciled funds.
The Morningstar mutual fund database provides data on fund size (net assets), age (years since creation), and fees. Management fees and ongoing fees were considered in this study. The management fee is the amount the fund manager charges for managing the fund. The fee is subject to a legal cap. It is automatically deducted daily from the net asset value of the fund. The ongoing fee consists of all fund fees that are deducted from the fund's net asset value. It is calculated as total charges divided by net assets. Morningstar takes these data from managers' reports. We also gathered data on the fund manager to assess the influence of the manager's experience on fund performance. The Morningstar database provided the number of years the manager had been managing the fund. Thus, the set of causal conditions leading to the outcome of high (or low) performance comprised two Morningstar ratings and five features of mutual funds and their managers.
To quantify managed mutual fund performance, the average annual return over three years and two traditional portfolio performance metrics (the Sharpe ratio and Jensen's alpha) were used. Whereas the Sharpe ratio considers the fund's total risk, Jensen's alpha considers only the fund's systematic risk (Morningstar, Inc, 2009). The performance measures (return and risk-adjusted return) were based on three years of monthly returns. For each Morningstar category, Morningstar uses standard benchmarks and the index from the fund's prospectus. The standard benchmark for Eurozone mutual funds is the MSCI EMU NR EUR index. Standard benchmarks for US mutual funds are the Russell 1000 TR USD indices.
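As an illustration of the two risk-adjusted metrics, the following Python sketch computes a Sharpe ratio and Jensen's alpha from monthly return series using their textbook definitions. It does not reproduce Morningstar's exact conventions (for example annualisation or the choice of risk-free rate), and the function names and the sample data are hypothetical.

```python
from statistics import mean, stdev


def sharpe_ratio(fund, risk_free):
    """Mean excess return divided by its sample standard deviation (total risk)."""
    excess = [r - f for r, f in zip(fund, risk_free)]
    return mean(excess) / stdev(excess)


def jensens_alpha(fund, benchmark, risk_free):
    """Intercept (alpha) and slope (beta) of fund excess returns regressed
    on benchmark excess returns, so only systematic risk is priced."""
    y = [r - f for r, f in zip(fund, risk_free)]
    x = [b - f for b, f in zip(benchmark, risk_free)]
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    beta = cov / var  # exposure to benchmark (systematic) risk
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical monthly returns; a real analysis would use 36 observations.
benchmark = [0.01, -0.02, 0.03, 0.00, 0.015, -0.01]
risk_free = [0.0] * len(benchmark)
fund = [0.001 + 1.2 * b for b in benchmark]
alpha, beta = jensens_alpha(fund, benchmark, risk_free)
print(round(alpha, 6), round(beta, 6), sharpe_ratio(fund, risk_free) > 0)
```

The sketch makes the contrast in the text concrete: the Sharpe ratio divides by the volatility of the fund itself, whereas alpha is what remains after the benchmark-related component of return has been removed.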

Method
Conditions and combinations of conditions are studied using the configurational comparative method (Meyer, Tsui, & Hinings, 1993). Configurational approaches are built on the concept of equifinality, which treats configurations as different types of causes that lead to an outcome. Studying the combination of causes or their absence has current real-world applicability (Ragin, 2008).
The sample limitations in social science research also make the configurational comparative method a good technique for reaching reliable conclusions when working with a small number of cases (Collier, 1993; Fiss, 2007; Vassinen, 2012). Its use can nonetheless be extended to larger samples (Ragin, 2006b). For Rihoux and Ragin (2009), the number of causal conditions should be between three and eight; Crilly (2011) confirmed that up to seven causal conditions could be studied.
Such methods are used to examine how conditions (or variables) combine to create outcomes. Regression analysis, including linear regression, is not applicable in turbulent economic environments such as the current one because this form of analysis holds the values of all but one of the variables in the equation constant to explain variations in some outcome. In fsQCA, conditions may be present or absent (Ragin, 2000), as explained below.
As a configurational comparative method, fsQCA builds on set theory to investigate causal claims (Rihoux & Ragin, 2009). Configurational theory has a long history as a data analysis method. Roig-Tierno, Gonzalez-Cruz, and Llopis-Martinez (2017) conducted a bibliometric analysis of the three variants of QCA (csQCA, fsQCA, and mvQCA), reporting a positive trend in fsQCA use. Finance research based on configurational theory has appeared in top journals. Recently, there has been a surge in the use of comparative methodologies in management research (e.g., Crespo-Hervás, Calabuig-Moreno, Prado-Gascó, Añó-Sanz, & Núñez-Pomar, 2019). Methodological innovations in QCA create new opportunities for analysis and enrich methodological pluralism in research (Kornelakis, 2018).
In this research, fsQCA is used to identify the conditions associated with mutual funds that underperform or outperform competitors. FsQCA is useful when cases are best understood as combinations (or configurations) of attributes that potentially lead to an outcome.

Why use a fuzzy-set approach in this study?
We contribute methodologically to research on what influences mutual fund performance by testing our conceptual model using a configurational comparative technique, namely fsQCA. This approach identifies causal configurations from a data set of empirical cases (Rihoux & Ragin, 2009). FsQCA deals with complex causal perspectives (Greckhamer, Misangyi, Elms, & Lacey, 2008) and focuses on asymmetric relationships to report causal conditions that are sufficient to cause an outcome (similar to the concept of the dependent variable in other analyses). FsQCA has advantages for studying complex causation and identical outcomes (Mills, van de Bunt, & De Bruijn, 2006). Regression coefficients show the impact of variables; they do not indicate which individual variables are sufficient or necessary for an outcome (Woodside, Ko, & Huan, 2012). Necessary and sufficient conditions are analysed under a complex causal approach. Necessary conditions imply that the outcome only occurs when the causal condition is present (or absent). The analysis also considers sufficient conditions. Sufficient conditions indicate that a causal condition always leads to the outcome (Braumoeller & Goertz, 2000; Fiss, 2007).
The within-case analysis (Yin, 1994) was important to identify the relationships that we later tested using fs/QCA software. Use of configurational theory as a data analysis method is a recent phenomenon. Using this method, scholars have published research in top international management or social science journals (Roig-Tierno et al., 2017; Urueña, Arenas, & Hidalgo, 2018). The method is based on algorithms and Boolean algebra.

Procedure
We use fs/QCA Version 3.0 software, available at www.fsQCA.com (Ragin & Davey, 2016). Qualitative comparative analysis proceeds in several steps (Rihoux & Ragin, 2009). The first step is to calibrate set membership. The aim is to group cases (Ragin, 2008). Degree of membership of observations in fuzzy sets is evaluated for each condition. Set membership scores are not probabilities (Woodside, 2012a). Independent and dependent measures are transformed into sets. For many variables, binary values of 0 and 1 would be appropriate, and crisp sets could be used. Other variables are more complex and require ordinal or continuous values. For these variables, fuzzy sets should be based on substantive knowledge (Vis, 2012). Here, the researcher decides on three breakpoints: when a case is 'fully in' the set (1.00) or full membership, 'fully out' of the set (0.00) or full non-membership, and 'neither in nor out' of the set (0.50) or the cross-over point. The cross-over point is the point of maximum (membership) ambiguity in the assessment of whether a case is more in or out of a set (Ragin, 2008). These breakpoints enable the calibration of values into membership values; they are based on theoretical knowledge of previous situations. This technique is a combination of qualitative and quantitative methods.
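The calibration step can be sketched in Python as follows. The sketch assumes the commonly described log-odds form of Ragin's direct method, in which the three anchors map to memberships of roughly 0.95, 0.50, and 0.05 rather than exactly 1.00 and 0.00. The function name `calibrate` and the anchor values in the example are hypothetical and are not the anchors of Table 1.

```python
import math


def calibrate(value, full_out, crossover, full_in):
    """Transform a raw value into a fuzzy membership score in (0, 1).

    Assumed form of the direct method: deviations from the cross-over
    point are rescaled so that the full-membership anchor has log-odds +3
    and the full non-membership anchor has log-odds -3.
    Works for both orientations (full_in above or below full_out).
    """
    deviation = value - crossover
    if deviation * (full_in - crossover) >= 0:
        # Value lies on the 'in' side of the cross-over point.
        log_odds = 3.0 * deviation / (full_in - crossover)
    else:
        # Value lies on the 'out' side of the cross-over point.
        log_odds = -3.0 * deviation / (full_out - crossover)
    log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow for extreme values
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical anchors: a 'large fund' set calibrated on net assets (millions).
print(round(calibrate(500.0, full_out=100.0, crossover=500.0, full_in=2000.0), 3))
# Hypothetical anchors: a 'low fee' set, where lower raw values mean membership.
print(round(calibrate(0.5, full_out=2.5, crossover=1.5, full_in=0.5), 3))
```

Passing the anchors explicitly keeps the substantive decision (where the breakpoints sit) separate from the mechanical transformation.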
The second step is to build a data matrix, which is known as the truth table (Schneider, Schulze-Bentrop, & Paunescu, 2010; Fiss, 2011; Schneider & Wagemann, 2012). Researchers should keep configurations with at least one observation for further analysis. Kenworth and Hicks (2008) have recommended a consistency threshold of 0.95, although this threshold should not be applied mechanically. Ragin (2008) argued that the analysis should capture at least 75% to 80% of cases. In the process of obtaining the solution, researchers should select the prime implicants with the most significance in managerial decision making.
In the third step, Boolean algebra is used to derive combinations of causal conditions that produce the outcome. These conditions are minimally sufficient. The fs/QCA software uses the Quine-McCluskey algorithm (Quine, 1955). The truth table algorithm provides three solutions: complex, parsimonious, and intermediate (Ragin, 2008). The complex solution is the most conservative solution, although it provides little insight into causal configurations. Complex solutions do not consider logical remainders, so researchers generally do not use this solution in their analyses. The parsimonious solution includes all simplifying assumptions. The intermediate solution includes simplifying assumptions and restricts logical remainders to only those that are most plausible. The parsimonious and intermediate solutions should be reported in most analyses.
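A truth table of the kind described in this step can be sketched in Python as below. This is an illustrative reconstruction, not the fs/QCA software's code: the function name `truth_table`, the condition labels, and the three cases are hypothetical. It uses the standard fuzzy-set operations (minimum for logical AND, one minus membership for negation) and assigns each case to the configuration in which its membership exceeds 0.5.

```python
from itertools import product


def truth_table(cases, outcome, conditions, min_cases=1):
    """List configurations with at least `min_cases` cases (min_cases >= 1).

    cases      : list of dicts mapping condition name -> fuzzy membership
    outcome    : list of fuzzy memberships in the outcome, one per case
    conditions : ordered list of condition names
    """
    rows = []
    for corner in product([1, 0], repeat=len(conditions)):
        # Membership of each case in this configuration.
        member = [
            min(c[name] if bit else 1.0 - c[name]
                for name, bit in zip(conditions, corner))
            for c in cases
        ]
        n = sum(m > 0.5 for m in member)  # cases that belong to this row
        if n < min_cases:
            continue  # logical remainder or too few cases
        # Degree to which membership in the row is a subset of the outcome.
        consistency = sum(min(m, y) for m, y in zip(member, outcome)) / sum(member)
        rows.append({"config": dict(zip(conditions, corner)),
                     "n": n, "consistency": consistency})
    return rows


# Hypothetical calibrated data for three funds and two conditions.
cases = [{"rating": 0.9, "low_fee": 0.8},
         {"rating": 0.2, "low_fee": 0.7},
         {"rating": 0.8, "low_fee": 0.9}]
outcome = [0.9, 0.3, 0.7]
for row in truth_table(cases, outcome, ["rating", "low_fee"]):
    print(row["config"], row["n"], round(row["consistency"], 3))
```

Rows whose consistency meets the chosen cut-off would then be coded as leading to the outcome before minimisation.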
Researchers must decide between the parsimonious and intermediate solutions, using the coverage and consistency (of each configuration) to reach this decision. Woodside (2012b) showed that the coverage index is analogous to the coefficient of determination and that the consistency index is analogous to correlation. Ragin (2008, p. 44) noted that coverage is 'the degree to which a cause or causal combination [accounts for] instances of an outcome', and consistency is 'the degree to which instances of the outcome agree in displaying the causal condition thought to be necessary'. Consistency can range from 0 to 1 (Ragin, 2006a). Ragin (2008) has recommended a minimum measure of consistency of 0.8 (preferably 0.85 for macro-level data). Table 1 shows the conditions and outcomes. The literature review justifies their relevance. Their measurement is explained in the empirical data section. Table 1 shows the anchors used to calibrate each condition and outcome following the direct method proposed by Ragin (2008). The consistency cut-off was 0.80 (Rihoux & Ragin, 2009).
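For reference, the two indices are usually computed from fuzzy membership scores as shown below. The paper does not state these formulas explicitly; they are the standard set-theoretic definitions for a sufficiency relation, where x_i is the membership of case i in the configuration X and y_i is its membership in the outcome Y.

```latex
\text{Consistency}(X \subseteq Y) = \frac{\sum_i \min(x_i, y_i)}{\sum_i x_i},
\qquad
\text{Coverage}(X \subseteq Y) = \frac{\sum_i \min(x_i, y_i)}{\sum_i y_i}
```

Under these definitions, a consistency of 0.80 means that 80% of the total membership in the configuration also lies within the outcome.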

Results
We analysed the models presented in Table 2.

Necessary conditions
The analysis of necessary conditions identifies causal conditions that must occur for the outcome to occur. Only fs_morn and fs_anal were necessary for the outcomes 'sharpe', 'alpha', and 'profit' in models 7, 8, and 9 (Table 3). No other individual condition was necessary for any other outcome. Conditions are necessary only if their consistency exceeds 0.9 (Schneider et al., 2010).
The results show that relationships of necessity emerge for Europe but not for the US. Furthermore, these relationships emerge only for the presence of three outcomes (Sharpe, Alpha, and Profit) and only for two conditions, Morningstar rating and Analyst rating.
First, for the outcome Sharpe, coverage was 0.542 (consistency 0.975) for Morningstar rating and 0.588 (consistency 0.991) for Analyst rating. Second, for the outcome Alpha, coverage was 0.548 (consistency 0.970) for Morningstar rating and 0.595 (consistency 0.987) for Analyst rating. Finally, for the outcome Profit, coverage was 0.521 (consistency 0.971) for Morningstar rating and 0.564 (consistency 0.984) for Analyst rating.
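For reference, necessity measures of this kind follow the standard set-theoretic definitions: the consistency of a necessary condition is the sum of min(x, y) divided by the sum of y, and its coverage is the same numerator divided by the sum of x. The Python sketch below applies the 0.9 threshold to hypothetical membership scores; the data and function names are illustrative and do not reproduce Table 3.

```python
def necessity_consistency(x, y):
    """Degree to which the outcome Y is a subset of the condition X."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)


def necessity_coverage(x, y):
    """Relevance of X as a necessary condition: how much of X overlaps Y."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)


def is_necessary(x, y, threshold=0.9):
    """Apply the necessity threshold to the consistency score."""
    return necessity_consistency(x, y) >= threshold


# Hypothetical membership scores for four funds (not the study's data).
outcome = [0.8, 0.7, 0.3, 0.1]
rating = [0.9, 0.8, 0.9, 0.6]   # condition that envelops the outcome
tenure = [0.2, 0.9, 0.1, 0.3]   # condition that does not

nec_cons = necessity_consistency(rating, outcome)
nec_cov = necessity_coverage(rating, outcome)
print(nec_cons, nec_cov, is_necessary(rating, outcome), is_necessary(tenure, outcome))
```

High consistency combined with moderate coverage, as in this toy example, indicates a condition that is almost always present when the outcome occurs but is also present in many cases without the outcome.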

Sufficient conditions for the outcomes
We compared the results for the US with those for Europe only when the models for both regions met Ragin's (2008) and Woodside's (2012) consistency criterion, which states that consistency must be greater than 0.8. We could only compare the results for the outcomes 'absence of Sharpe ratio' and 'profit'. Table 4 shows the sufficient conditions for the absence of Sharpe ratio. Consistency was 0.767 for the US model and 0.882 for the European model.

Sufficient conditions for the absence of Sharpe ratio
For US funds, according to Configuration 4, 28.2% of cases imply that a poor Sharpe ratio is assigned when funds do not have good Morningstar ratings, do not have good analyst ratings, do not have low management fees, and do not have low ongoing fees, but have existed for many years. This finding was confirmed with a consistency of 0.898. Accordingly, for this configuration, Ragin (2008) and Woodside's (2012) criterion was met.
For European funds, according to Configuration 1, 54.1% of cases imply that a poor Sharpe ratio is assigned when funds are not large, but fund managers have a long tenure. This finding was confirmed with a consistency of 0.866. Accordingly, for this configuration, Ragin (2008) and Woodside's (2012) criterion was met.
Comparing US and European funds reveals that the primary difference is that funds in both scenarios have poor Sharpe ratios when fund managers have a long tenure but US funds do not have low ongoing fees and are not large. Table 5 shows the sufficient conditions for the outcome Profit. Consistency was 0.783 for the US model and 0.935 for the European model.

Sufficient conditions for profit
For US funds, according to Configuration 3, 26.2% of cases imply that funds achieve a good Profit when they have good Morningstar ratings, good analyst ratings, a large size, low management fees, and low ongoing fees. This finding was confirmed by a consistency of 0.923. Accordingly, for this configuration, Ragin (2008) and Woodside's (2012) criterion was met.
For European funds, according to Configuration 2, 34.1% of cases imply that funds achieve a good Profit when they have good Morningstar ratings, good analyst ratings, a large size, managers with long tenures, low management fees, and low ongoing fees. This finding was confirmed by a consistency value of 0.943. Accordingly, for this configuration, Ragin (2008) and Woodside's (2012) criterion was met.
Comparing US and European funds reveals that the primary difference is that funds in both scenarios achieve good profits under the same conditions, except that European funds must also have managers with long tenures.

Conclusions
Mutual fund performance is discussed at length in the finance literature. There has been extensive research into the relationships between fund ratings, specific fund features, and performance. Numerous studies have investigated the features of mutual funds and characteristics of fund managers that influence fund performance. Despite the overall finding that fund fees are generally unjustified by subsequent performance, scholars have nonetheless failed to reach a consensus regarding the positive or negative impacts of different factors. These factors invite further investigation into areas such as the clustering of fund managers depending on whether they base their decisions on publicly available information or data that are less easily accessible to the general public (Abdesaken, 2015).
Methodologically, this paper contributes to economics and finance research through its application of fsQCA. Using complexity theory, researchers can test models where no single condition is responsible for the outcome. Instead, we analysed several conditions to observe how they combine to contribute to the outcome. We used fsQCA to identify the combinations of factors that lead to our outcome of choice (Ragin, 2008), instead of isolating the net and independent effects of single factors on a particular outcome.
Complexity theory helps provide answers when certain conditions cause an outcome (Feurer, Baumbach, & Woodside, 2016). No simple conditions are the cause of an outcome of interest. In our study, this outcome is the influence on mutual fund performance. As Kostova and Zaheer (1999) stated, international business is a causally complex phenomenon. Pratt (2009) reported that the number of interviews or observations that should be conducted in a qualitative research project depends on the question that researchers seek to answer.
One of the relevant points of regression analysis is that researchers do not require data calibration or prior causal knowledge. In addition, researchers do not need knowledge or previous theories for the regression analysis. In contrast, set membership is determined by substantive knowledge in fsQCA (rather than using the sample mean).
Unlike in standard econometric methods, sample representativeness is less of an issue in fuzzy-set methods. FsQCA does not rest on assumptions that data are drawn from a certain probability distribution (Woodside, 2012b). The possible combinations of individual and group attributes may be infinite, but only a finite number of coherent configurations are prevalent in the real world.
We examined the performance of Luxembourg-domiciled mutual funds that invest in large-cap US and Eurozone equities. A configurational comparative method, fsQCA, identified the combinations of conditions that lead to mutual funds' underperformance or outperformance of competitors.
A poor Sharpe ratio for US funds is assigned when the funds do not have good Morningstar ratings, do not have good analyst ratings, do not have low management fees, and do not have low ongoing fees, but have existed for many years. A poor Sharpe ratio for European funds is assigned when the funds are not large, but fund managers have a long tenure. Comparing US and European funds reveals that the main difference is that both US and European funds have poor Sharpe ratios when fund managers have a long tenure but US funds do not have low ongoing fees and are not large.
US funds achieve good Profit when they have good Morningstar ratings, good analyst ratings, large size, low management fees, and low ongoing fees. European funds achieve good Profit when they have good Morningstar ratings, good analyst ratings, large size, low management fees, low ongoing fees, and managers with long tenures. Comparing US and European funds reveals that the main difference is that both US and European funds achieve good profit under the same conditions, except that European funds must also have managers with long tenures.
All stakeholders in the performance of mutual funds (i.e., government regulators, investors, and investment managers) have an interest in better understanding mutual fund performance. They may find the results of our research helpful in their deliberations about mutual funds with criteria of good performance.
From a regulator's perspective, the purpose of knowing the conditions that affect the performance of funds is to evaluate the relevance of including information on these conditions in the advertising of managers. For investors, the mutual fund market is large and offers a wide variety of assets in which to invest. Therefore, investors will consider the values of these conditions for potentially selectable funds as key aspects in their fund selection decisions. This research verifies the importance of Morningstar ratings and analyst ratings in mutual funds to avoid poor performance. The achievement of good ratings constitutes an incentive for fund managers to improve their advertising. Managers should spare no effort in carrying out policies and actions to improve the ratings assigned to the mutual funds they manage.
Like all studies, this study has certain limitations. The primary limitation relates to the sample, which comprised firms from a specific asset class. In addition, the study examined data for a short period. Future fsQCA studies could cover other mutual fund categories in other investment areas. Verifying whether our results can be confirmed for broader horizons would also be of interest.
A promising line of future research would be to explore whether any additional conditions may contradict our findings. Recent research applied to other sectors has linked performance to areas such as innovation (Ho, Nguyen, Adhikari, Miles, & Bonney, 2018), business intelligence capacities (Caseiro & Coelho, 2019), decision-making style (Abubakar, Elrehail, Alatailat, & Elçi, 2019), and employee relations climate (Ali, Lei, & Wei, 2018). As with any study of this nature, the robustness of the results invites further inspection.

Disclosure statement
No potential conflict of interest was reported by the authors.