Super-efficiency of education institutions: an application to economics departments

ABSTRACT This paper investigates the efficiency of 188 economics departments around the world using data from RePEc. We go beyond the heavily used data envelopment analysis and utilize partial frontier analysis – specifically order-α and order-m – which addresses some of the drawbacks of the standard efficiency frontier analysis and allows for so-called super-efficient departments. We examine the particularities of these approaches and find that the super-efficient departments are not only the ‘usual suspects’. Furthermore, standard output rankings are not well correlated with our estimated efficiency rankings, which themselves are rather similar.

The lion's share of the efficiency literature uses either data envelopment analysis (DEA) 1 or stochastic frontier analysis (SFA), although both methods have severe caveats that have been pointed out many times. The parametric SFA has been criticized for relying on restrictive assumptions concerning the functional form and the distribution of the error term. The non-parametric DEA suffers from being highly vulnerable to potential outliers and measurement error, because every unit is compared to the most efficient units. This problem is illustrated in Figure 1, which depicts a scatter plot of the main factor of RePEc rankings as output variable on the vertical axis and full-time equivalents as input variable on the horizontal axis. 2 There is one obvious outlier (Harvard University) dominating the bulk of economics departments, making them inefficient in a DEA. Most (negatively) affected is the London School of Economics (LSE, red circle), which would be efficient if we left out Harvard University and recalculated the efficiency hull. The problem is less severe, but remains, if the free disposal hull (FDH) approach, which is less restrictive in defining the efficiency frontier, is used.
A simple idea to correct for potential outliers was proposed by Andersen and Petersen (1993). They propose to run the DEA leaving out one unit at a time, which potentially leads to super-efficient units with scores larger than one. A further improvement, partial frontier analysis (PFA), which rests upon the FDH approach, allows for a more general approach to investigating the efficiency of (higher) education institutions. The different PFA methods address the outlier problem by generalizing FDH and allowing for super-efficient units, which lie beyond the estimated efficiency frontier. Two prominent examples of PFA are the order-m method by Cazals, Florens, and Simar (2002) and the order-α method by Aragon, Daouia, and Thomas-Agnan (2005). The former adds randomness to the FDH approach by calculating each efficiency score multiple times with a random subset of units, while the latter utilizes only a certain percentile of comparison units when calculating each efficiency score.
This study makes four contributions to the literature. First, we add to the small literature using PFA with respect to education institutions. Bonaccorsi, Daraio, and Simar (2006) and Bonaccorsi et al. (2007) apply an order-m approach to study 45 universities in Italy and 261 universities across four European countries, respectively. De Witte et al. (2013) use DEA as well as PFA to study the performance of 155 professors working at a Business & Administration department of a Brussels university college. Bruffaerts, Rock, and Dehon (2013) use FDH and PFA to study the efficiency of 124 U.S. universities. Finally, Bornmann and Wohlrabe (2018) study the super-efficiency of 50 U.S. universities with respect to top-cited papers. Secondly, we are the first to apply PFA to economics departments. So far, there are only a few papers investigating their productivity or efficiency. Johnes (1988) uses self-created performance indicators to construct a ranking of the economics departments of 40 British universities. Johnes and Johnes (1992) conduct a DEA for 36 British and Madden, Savage, and Kemp (1997) for 24 Australian economics departments. Kocher, Luptácik, and Sutter (2006) compare country-level economic research efficiency using DEA. Conroy, Dusansky, and Kildegaard (1995), Cherchye and Abeele (2005) and Perianes-Rodríguez and Ruiz-Castillo (2014) employ other input-output measures not belonging to the frontier efficiency class. Macri and Sinha (2006) provide an overview of international rankings, which are partly based on productivity considerations. This paper draws on Friedrich and Wohlrabe (2017), who conduct a DEA for 206 economics departments worldwide. Thirdly, we investigate whether there is a relationship between the RePEc ranking, which can be considered a proxy for the reputation or influence of an economics department, and efficiency rankings.
Fourthly, we are the first to compare five efficiency approaches in an empirical setup: DEA, the super-efficient DEA approach by Andersen and Petersen (1993) (SDEA), FDH, order-m and order-α. This allows us to take a differentiated look at efficiency measures. Krüger (2012) examines four of our approaches, but not SDEA, as well as parametric ones in a Monte Carlo study. He finds that SFA and standard DEA outperform the PFA methods in terms of mean-squared errors. Therefore, his recommendation, which we follow, is to cross-validate PFA results with other methods. 3 We mainly focus on ranking comparisons and want to know if the ordinal rankings differ across efficiency approaches.
For our analysis, we use a data set of 188 economics departments from RePEc. As the only input variable, we use full-time equivalents for authors affiliated with the respective department. Our output variables comprise the condensed information from 32 indicators, which represent both quantitative (scientific output, such as published work) and qualitative (scientific impact, such as number of citations) bibliometric information as well as readership (downloads and abstract views).
We proceed as follows. Section 2 explains the different approaches to measuring efficiency which we employ. Section 3 presents and describes our data. Section 4 contains the results of our efficiency analyses, including some robustness checks. Section 5 draws conclusions from our findings.

Methodology
In this section, we outline the basic idea of the five approaches to efficiency measurement which we employ. In all approaches we opt for output orientation and assume variable returns to scale. We denote the input and output of department i with x_i and y_i, respectively. The efficiency score is denoted ê_i.
Being aware of the caveats discussed earlier, we begin with the simple non-parametric estimation of the educational efficiency frontier for later comparisons. We outline the standard data envelopment analysis (DEA) introduced by Charnes, Cooper, and Rhodes (1978). See Cooper, Seiford, and Zhu (2004) for a comprehensive discussion of DEA. We then state the basic idea of the super-efficient DEA approach by Andersen and Petersen (1993). Next, we explain the free disposal hull (FDH) approach which is somewhat less prone to outliers. The FDH approach was introduced by Deprins, Simar, and Tulkens (1984).
Both DEA and FDH are non-stochastic methods, as they assume all deviations from the frontier to be the result of inefficiencies. See Tauchmann (2012) for an illustrative example of both approaches. We continue by outlining the two partial frontier approaches, order-m and order-α. These techniques are generalizations of the FDH approach, which allow for super-efficient units, i.e. efficiency scores larger than one. Finally, we give a simple example to illustrate all five approaches.

Data envelopment analysis
We illustrate the output-oriented approach with variable returns to scale which was proposed by Banker, Charnes, and Cooper (1984). The linear programming approach envelopes the data in a piecewise linear convex hull. The DEA efficiency score ê_i^DEA solves the following optimization problem (Cordero-Ferrera, Pedraja-Chaparro, and Santín-González 2010):

max  e^DEA + ε ( Σ_i s_i^- + Σ_r s_r^+ )

subject to  Σ_j λ_j x_ij + s_i^- = x_i0 for every input i,  Σ_j λ_j y_rj − s_r^+ = e^DEA y_r0 for every output r,  Σ_j λ_j = 1  and  λ_j ≥ 0,

where the subscript 0 denotes the department under evaluation, i and r index the different inputs and outputs, λ is a weighting parameter that maximizes the productivity, s_i^- and s_r^+ are the input and output slacks, respectively, and ε is some small positive number. The efficiency score is given by ê^DEA = 1/e^DEA.
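The paper gives no implementation details, so the following minimal Python sketch is ours. It illustrates the output-oriented, variable-returns-to-scale DEA for the didactic one-input/one-output case only, where the linear program reduces to evaluating the concave output envelope (with one input and one output, an optimal basic solution has at most two positive λ weights, so a brute-force search over vertices and edges is exact). The paper's two-output setting would require a general LP solver; the function name is our own.

```python
def dea_output_vrs_scores(x, y):
    """Output-oriented, variable-returns-to-scale DEA scores for the
    one-input/one-output case. For each unit i, find the maximal output
    attainable at input x[i] on the convex, free-disposal hull; the
    efficiency score is y[i] divided by that best output (= 1 / phi)."""
    n = len(x)
    scores = []
    for i in range(n):
        # candidate benchmarks: single peers with no more input ...
        best = max(y[j] for j in range(n) if x[j] <= x[i])
        # ... and convex combinations of two peers spanning x[i]
        for j in range(n):
            for k in range(n):
                if x[j] <= x[i] <= x[k] and x[j] < x[k]:
                    t = (x[i] - x[j]) / (x[k] - x[j])
                    best = max(best, (1 - t) * y[j] + t * y[k])
        scores.append(y[i] / best)  # = 1 / phi, in (0, 1]
    return scores
```

In the artificial data below, the unit (2, 1) lies below the frontier spanned by (1, 1) and (2, 2) and receives a score of 0.5.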

Super-efficient DEA
The approach by Andersen and Petersen (1993) builds upon the basic DEA by leaving out one department i at a time. A department is considered super-efficient, i.e. has an efficiency score larger than 1, if it could increase its input vector proportionally while preserving efficiency. The efficiency score reflects the radial distance from the department i under evaluation to the production frontier estimated with that department excluded from the sample, i.e. the maximum proportional increase in inputs that preserves efficiency. This approach is especially useful when DEA delivers many efficient units, as it allows for further differentiation and thus a more precise ranking.
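The leave-one-out idea can be sketched directly; this hedged Python illustration (our own naming, again restricted to the didactic one-input/one-output, output-oriented case) scores each unit against the frontier spanned by all other units, so scores above 1 can occur:

```python
def sdea_output_vrs_scores(x, y):
    """Andersen-Petersen super-efficiency sketch for the one-input/
    one-output, output-oriented VRS case. Scores above 1 mark
    super-efficient units; float('inf') flags the well-known
    infeasibility of output-oriented super-efficient DEA under VRS,
    which occurs when no remaining peer uses as little input."""
    n = len(x)
    scores = []
    for i in range(n):
        best = 0.0
        for j in range(n):              # reference vertices, excluding i
            if j != i and x[j] <= x[i]:
                best = max(best, y[j])
        for j in range(n):              # reference edges, excluding i
            for k in range(n):
                if i in (j, k):
                    continue
                if x[j] <= x[i] <= x[k] and x[j] < x[k]:
                    t = (x[i] - x[j]) / (x[k] - x[j])
                    best = max(best, (1 - t) * y[j] + t * y[k])
        scores.append(y[i] / best if best > 0 else float('inf'))
    return scores
```

With units (1, 1), (2, 2) and (4, 2), the middle unit scores 1.5 once it is removed from its own reference set, i.e. it is super-efficient.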

Free disposal hull
The FDH approach compares each department i with all other departments j = 1, . . . , N in the data set. The set of peer departments that satisfies the condition y_lj ≥ y_li ∀l is denoted by B_i. Among the peer departments, the one that exhibits minimum input usage serves as reference to i, and ê_i^FDH is calculated as the relative input use

ê_i^FDH = min_{j ∈ B_i} ( x_j / x_i ).

Departments that, for a given output, exhibit minimal input usage among all their peers serve as their own reference. For these units the efficiency score ê_i^FDH equals 1.
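The peer-set construction and the minimum-input comparison translate almost literally into code; this hedged sketch (our own naming, single input as in the paper, any number of outputs) mirrors the definition above:

```python
def fdh_scores(x, Y):
    """Input-ratio FDH scores: x[i] is department i's single input,
    Y[i] its tuple of outputs. B_i contains every department whose
    outputs weakly dominate i's; since i is always its own peer,
    scores never exceed 1."""
    scores = []
    for i in range(len(x)):
        peers = [j for j in range(len(x))
                 if all(Y[j][l] >= Y[i][l] for l in range(len(Y[i])))]
        scores.append(min(x[j] for j in peers) / x[i])
    return scores
```

For example, a department using input 4 to produce the same output as a peer using input 2 receives a score of 0.5.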

Order-m efficiency
In case of order-m efficiency, the partiality aspect comes in by departing from the assumption of benchmarking with the best-performing peer in the sample at hand. Instead, efficiency with respect to a sub-sample of m peers is considered. Daraio and Simar (2007) propose the following four-step procedure: (1) Draw from B i a random sample of m peers (departments) with replacement.
(2) A pseudo-FDH efficiency score ê^FDH_mi,d is calculated using the randomly drawn data.
(3) Steps (1) and (2) are repeated for d = 1, . . . , D replications.
(4) Order-m efficiency is calculated as the average of the pseudo-FDH scores: ê^OM_mi = (1/D) Σ_d ê^FDH_mi,d. A result of this procedure is that order-m efficiency scores may exceed the value of one. This is because in each replication d, department i may or may not serve as its own peer. As a consequence, there may be super-efficient departments (ê^OM_mi > 1) located beyond the estimated production possibility frontier. There are two parameters that need to be determined beforehand: m and D. The latter is just a matter of accuracy: the higher D, the more clear-cut the result of whether a department is super-efficient or not. Of course, a higher D increases the computation time. The choice of m is more critical. The smaller the value of m, the larger the share of super-efficient departments. For m → ∞, the order-m approach converges to FDH, but even for m = N it is still possible that super-efficient departments occur.
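The four steps above can be sketched as follows (a hedged Python illustration with our own names; the resampling makes clear why scores above one can occur, since department i itself may be missing from a draw):

```python
import random

def order_m_scores(x, Y, m, D, seed=0):
    """Order-m: average of D pseudo-FDH scores, each based on m peers
    drawn with replacement from B_i. If a draw contains no department
    using at least as little input as i, that replication's ratio
    exceeds 1, which can push the average above 1."""
    rng = random.Random(seed)
    scores = []
    for i in range(len(x)):
        peers = [j for j in range(len(x))
                 if all(Y[j][l] >= Y[i][l] for l in range(len(Y[i])))]
        total = 0.0
        for _ in range(D):                       # steps (1)-(3)
            draw = [rng.choice(peers) for _ in range(m)]
            total += min(x[j] for j in draw) / x[i]
        scores.append(total / D)                 # step (4)
    return scores
```

With m = 1 the smallest, cheapest department is regularly super-efficient, while for large m the scores approach their FDH counterparts, illustrating the convergence noted above.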

Order-α efficiency
The order-α approach generalizes FDH in a different way. Instead of using the minimum input at a given output among the available peers as benchmark, order-α uses the αth percentile of the input distribution among the peers in B_i (with peers ordered by decreasing input, so that α = 100 corresponds to the minimum input). When α = 100, order-α thus reduces to FDH. For α < 100, some departments may be classified as super-efficient. Similar to m in the previous approach, α can be considered a tuning parameter: the smaller α, the larger the share of super-efficient departments.
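Under the same single-input setup, the order-α score simply replaces the minimum peer input ratio by a percentile. The sketch below uses one possible convention, the (100 − α)th percentile of the sorted input ratios, so that α = 100 reproduces FDH; actual implementations differ in their interpolation details, and the naming is ours:

```python
def order_alpha_scores(x, Y, alpha):
    """Order-alpha sketch: take the (100 - alpha)th percentile of the
    peer input ratios instead of their minimum. alpha = 100 gives FDH;
    smaller alpha classifies more departments as super-efficient."""
    scores = []
    for i in range(len(x)):
        ratios = sorted(x[j] / x[i] for j in range(len(x))
                        if all(Y[j][l] >= Y[i][l] for l in range(len(Y[i]))))
        # percentile index, rounded down and capped at the largest ratio
        k = min(int(len(ratios) * (100.0 - alpha) / 100.0), len(ratios) - 1)
        scores.append(ratios[k])
    return scores
```

Lowering α from 100 moves the benchmark away from the cheapest peer, so some units obtain ratios, and hence scores, above 1.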

An illustrative example
In Figure 2, we display the outlined full and partial frontier analysis approaches in one graph. We plotted input-output combinations for various artificial departments. The DEA with variable returns to scale is given by the solid line. The educational production frontier is defined by the departments A, B and E. These three have an efficiency score of 1, i.e. an efficient input-output combination. All other departments to the right of or below the frontier are considered inefficient. Applying SDEA, we leave department B out and recalculate the efficiency curve. It turns out that C is now efficient and B potentially super-efficient; this depends on the resulting scores when all other departments are left out one at a time. In case of FDH, the outer hull is spanned more tightly around the data, by also considering points which are not on the DEA curve. In our example, departments C and D are now also efficient. The frontier has shifted to the right and efficiency scores for all other departments increase somewhat, as the distance to the frontier is smaller than for DEA. Applying the partial frontier approaches order-m or order-α, we get a different picture. Now, only departments C and D are just-efficient with a corresponding score of 1, whereas departments A, B and E are super-efficient with a score larger than 1. Of course, the two approaches need not yield identical results, even though Figure 2 might suggest so.

Data
Our data is taken from the Research Papers in Economics (RePEc) website 4 and refers to February 2018. RePEc provides a large collection of bibliometric information and contains (meta) data both for bibliometric items (working papers, articles, chapters and books) and authors. Using gathered citations, RePEc calculates various impact measures for journals and working paper series. Those are the basis for ranking authors and economic institutions. RePEc has become quite a success: as of February 2018, it listed 2.2 million pieces of research from 2,800 journals and 4,500 working paper series. Additionally, more than 50,000 authors and 14,000 institutions are listed on the website. For more details on RePEc see Zimmermann (2013).
We consider one input for any department: accumulated author shares. Each author sets a share by which he or she is affiliated with a department. 5 In case of no self-setting, RePEc calculates shares based on the other affiliated members of the institutions. In the following, we call these accumulated author shares full-time equivalents (FTE), as an institution with 10 authors, who all identify themselves with 80%, would have the same accumulated author share (or FTE) as an institution with 8 authors who all identify with 100%.
Using the information on bibliometric items and registered authors, RePEc currently (February 2018) provides 34 rankings for institutions, which could potentially serve as output indicators. Table 1 provides an overview of these measures. There are five main categories: number of (published) works, citations, citing authors, journal pages, and RePEc access statistics. Each of these main categories can be combined with different weighting schemes: simple or recursive impact factors, number of authors, and combinations of them. In the category 'distinct number of works' different versions of a paper are counted only once. Published work is counted only if, first, the publisher provides the meta data to RePEc and, second, the author assigns the work to his/her account. Table 1 reveals that there is a focus on citations, both direct and indirect. In 14 rankings, citations are counted with quality and time adjustments. Moreover, citations matter indirectly through the different impact factors. Zimmermann (2013) provides a detailed account of the methodology of RePEc.
We downloaded 32 publicly available rankings from the RePEc website, where the data refers to the January 2018 ranking. 6 For these rankings only the top 5% of world-wide institutions are shown. We selected all departments that were listed in all rankings. We excluded economic research institutions (e.g. ifo Institute), central banks and research networks (e.g. NBER). Therefore, only economics departments remain, which makes the data homogeneous to a certain extent. Some units are sub-identities of larger organizational units, e.g. the School of International and Public Affairs and the Finance and Economics Department are both sub-units of Columbia University. We end up with a total of 188 units. A correlation analysis shows that the 32 rankings are highly correlated: 55% of all 496 bivariate correlations are larger than 0.9 and 75% are larger than 0.8. This finding, which is in line with Seiler and Wohlrabe (2012) and Zimmermann (2013), indicates a high degree of similarity. In order to avoid an ad hoc choice, we follow Seiler and Wohlrabe (2012) and define research performance as a latent process. Each of the 32 indicators can be regarded as an observed representation of this process. We run a principal component analysis (PCA) on our standardized data to extract the main components. This method has been used in the literature before to classify determinants of research productivity; see for instance Ramesh Babu and Singh (1998), Costas and Bordons (2007), Franceschet (2009), Docampo (2011), or Ortega, Lopez-Romero, and Fernandez (2011). Cordero-Ferrera, Pedraja-Chaparro, and Santín-González (2010) use PCA in a DEA framework to condense information. In our analysis, we extract two factors which serve as our outputs. The first one accounts for approximately 85% of the variation in the data, the second one for 9%. The remaining factors are negligible. In order to avoid counter-intuitive negative factor values, we rescale them to an interval from 0 to 1.
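The condensation step described above is standard PCA followed by a min-max rescaling. A hedged sketch (assuming NumPy is available; the function name and the exact [0, 1] rescaling convention are ours, as the paper does not spell them out):

```python
import numpy as np

def two_factor_outputs(R):
    """Condense correlated ranking indicators (rows = departments,
    columns = indicators) into two output factors: standardize the
    columns, project onto the first two principal components, and
    rescale each factor to [0, 1] to avoid negative factor values."""
    Z = (R - R.mean(axis=0)) / R.std(axis=0)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    F = Z @ Vt[:2].T                        # scores on the first two PCs
    F = (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0))
    explained = S**2 / np.sum(S**2)         # variance share per component
    return F, explained[:2]
```

On strongly correlated indicators, as in the RePEc data, the first component absorbs most of the variance, which is what motivates using only two factors as outputs.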
The nature of the data refers to the stock approach, i.e. a publication is assigned to the current affiliation of a researcher (partially in case of multiple affiliations). In contrast to this, one could adopt the flow approach where a work is credited to the institution that the author was affiliated with at the time of publication. Although the flow approach is preferable to the stock approach, it cannot be realized with RePEc data.
The RePEc data is consistent across departments for two reasons. First, authors can give weights to their affiliations (if they have more than one), which eliminates the need for arbitrary weighting that would have to be applied if an outsider performed this task. Second, RePEc enables authors to manage and verify their publication and citation list. In sum, the result is a largely consistent data set which allows for international comparisons.

Basic results
Following the methodology outlined in Section 2, we estimate five efficiency scores for each economics department in our data set and obtain five efficiency rankings based on them. For later comparisons, we also calculate a RePEc ranking, which is based on the overall RePEc score from January 2018. The RePEc ranking can be considered as a proxy for the reputation of a department, because it measures output and influence, but not efficiency.
All efficiency score calculations use full-time equivalents as the single input and the two extracted main factors as outputs. We assume output orientation with variable returns to scale. In Table 2 we provide some descriptive statistics for each approach plus the number of (super)-efficient departments. The average efficiency in case of DEA is 0.712 with a standard deviation of 0.12. For the other approaches the mean efficiency is higher, since the respective departments are shifted closer to the efficiency frontier. In addition, the standard deviation of the scores rises. Table 3 displays the ten most efficient economics departments and their respective efficiency score for DEA, FDH, SDEA, order-α and order-m. 7 Panel 1 displays the results of the DEA. We find that 8 departments attain the efficiency frontier with the maximum efficiency score of 1. Among these are many of the 'usual suspects', 8 such as Harvard University, Yale University or the Massachusetts Institute of Technology. Panel 2 presents the results for the FDH approach. By construction, this approach finds at least as many efficient departments as the DEA, which already found 8; in fact, we find 47 efficient departments, many of which are 'usual suspects'. Naturally, having 47 departments share the same efficiency rank (1) is rather unhelpful, which already foreshadows a key advantage of methods that allow for super-efficient departments. Panel 3 of Table 3 displays the results of the SDEA. We find 7 super-efficient departments, led by Harvard with an efficiency score of 1.631, followed by the London School of Economics (1.345) and the Paris School of Economics (1.120). At rank 8, Yale is the just-efficient department.
The previously employed methods (DEA, FDH, SDEA) do not require the specification of any parameters, making the efficiency score calculation straightforward. In contrast, PFA requires the specification of parameters, which eventually influence the number of super-efficient departments. The order-α approach requires α, the percentile of the set of peer departments used as benchmark, and order-m requires m, the number of peer departments drawn randomly from the initial set of peer departments; both must be set ex ante. Figure 3 displays the number of super-efficient departments as functions of α and m, respectively. A similar exercise can be found in Cazals, Florens, and Simar (2002). Unless we let m → ∞ or set α = 100, which is when the partial frontier approaches converge to FDH, we find super-efficient departments by construction. We set α = 99% and m = 130 and get 15 (order-α) and 32 (order-m) super-efficient departments, respectively. The latter value is chosen because the number of super-efficient departments converges to 32 as m grows. The former is chosen to obtain a low number of super-efficient units; setting α to 98%, we would have 44 super-efficient departments, which we consider too many in our empirical application. Our calibration is somewhat arbitrary, but as we focus on efficiency ranks rather than scores, the number of super-efficient departments, and thus the calibration, has a limited impact. Panel 4 presents the results obtained using the order-α PFA. The ranking is led by the London School of Economics with an efficiency score of 1.781, followed by Harvard (1.661) and SciencesPo (1.423). Notably, 17 departments are just-efficient using this approach. Panel 5 displays the results of the order-m PFA. Interestingly, the first nine departments are also among the first nine departments using the order-α PFA, while the exact ranking differs.
The London School of Economics (1.270) still leads the ranking, but is now followed by the Paris School of Economics (1.255) and Harvard (1.214).
Looking at the five efficiency rankings from a broader perspective, it is noteworthy that, next to the renowned U.S. universities, a number of European universities also appear regularly. Among these are the London School of Economics (in the top ten of all 5 rankings), the Paris School of Economics (5), SciencesPo (4), Tilburg University (3), Oxford University (3), Barcelona GSE (3), and Bocconi University (3). Among the super-efficient departments are also some less well-known departments, like the Crawford School of Public Policy at the Australian National University and the economics department in Groningen (Netherlands).
Having so far only looked at the top-ranked departments, we now turn to a comparison of the complete rankings. Figure 4 displays scatter plots for all ranking combinations, including the RePEc ranking. Table 4 shows the corresponding rank correlation coefficients (Spearman's rho). Panel 2 (counting from left to right, top to bottom) shows that DEA and SDEA produce almost identical results, the rank correlation coefficient being 1.00. The SDEA additionally allows us to discriminate among the departments found efficient by the DEA; the other scores remain the same for both approaches. The other panels show that the three super-efficient methods also produce similar results, the correlations being 0.966, 0.792 and 0.778. In contrast, the scatter plots comparing super-efficient methods with DEA or FDH look slightly more dispersed. Nevertheless, they are decently correlated, with rank correlation coefficients above 0.75 throughout. Figure 4 also reveals that the PFA classifies many departments as just-efficient with a score of 1.000.
However, the most interesting result is displayed in Panels 11-15 (bottom row), which show that there is only a weak link between the RePEc ranking and the efficiency rankings. The PFA approaches are correlated slightly more strongly with the RePEc ranking, but the correlation never surpasses 0.612. The DEA and FDH rankings even have correlations of less than 0.4 with the RePEc ranking. We conclude that efficiency is not well correlated with reputation. Of course, in a strict sense, this low correlation should not be surprising, as RePEc and the efficiency scores effectively measure different things.
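For completeness, the rank correlation used in these comparisons can be computed directly. A minimal sketch of Spearman's rho for two full rankings (our own naming, without the tie correction that a full implementation would include):

```python
def spearman_rho(a, b):
    """Spearman rank correlation for two equally long sequences with
    distinct values: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Identical rankings give rho = 1, fully reversed rankings give rho = -1, mirroring the interpretation of the values reported in Table 4.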

Robustness
In order to test whether the medium-sized correlation between the RePEc ranking and the efficiency rankings is robust, we modify all five approaches as follows: (1) We assume input orientation in contrast to output orientation.
(2) Instead of two factors, all 32 RePEc rankings serve simultaneously as outputs.
(3) The number of authors, in contrast to full-time equivalents, is used as the input.
In Table 5 we document the corresponding rank correlations with the overall RePEc ranking. The correlation is somewhat higher when input orientation is used. This is also the case when all 32 rankings serve as outputs; the reason might be the loss of information that naturally occurs when condensation approaches are applied. Unsurprisingly, using the coarser input measure of authors shrinks the correlation. However, the overall picture of a weak relationship between output and efficiency rankings remains.

Conclusion
The results of our analysis allow for three conclusions. First, using the standard DEA and FDH approaches to measure efficiency, we find many well-known economics departments of renowned universities among the top ten, such as Harvard University, Princeton University, the Massachusetts Institute of Technology and Yale University. However, using the more robust order-α and order-m approaches, we also find smaller and less-known departments, such as those of the University of Groningen and the Crawford School of Public Policy at the Australian National University, atop the list. The reason for this difference is technical. In a DEA- or FDH-based efficiency measurement, departments with the highest output are automatically atop the list, as they receive the highest possible score of 1. Using methods that allow for super-efficient units, smaller departments with mediocre output but also very small input can achieve an efficiency score larger than 1. We conclude that the well-known economics departments of renowned universities are not necessarily also the most efficient ones, as there might be some smaller departments which make better use of their limited resources. Secondly, based on Figure 4 and Table 4, we conclude that renowned departments are not necessarily efficient and efficient departments not necessarily renowned, as the correlation between any efficiency ranking and the RePEc ranking, which is used as a proxy for reputation, is rather small. This finding is consistent with Friedrich and Wohlrabe (2017) and holds for other choices of the parameters α and m.
Thirdly, we see from our comparison of five efficiency measures that order-α and order-m produce very similar results. Both also produce results similar to FDH, on which they are based, when the number of super-efficient units is small. In contrast, standard DEA produces somewhat different results, as Figure 4 displays. We conclude that a thorough efficiency analysis should feature multiple methods, as these diverge under certain circumstances. The lack of a dominant method suggests that there remains room for methodological advances and refinement of the applied measures.
There is a technical caveat to our analysis, due to details of PFA, that deserves mentioning. It would seem natural to ask whether there are super-efficient economics departments before evaluating which ones are super-efficient, as we did in Section 4. We motivated the existence of super-efficient departments with the mere look of the data in Figure 1. It is important to mention that the former question cannot be answered using PFA, because the number of super-efficient departments depends crucially on our choice of the parameters α and m. Figure 3 visualizes this relationship. This issue is of limited significance for our analysis, because the parameters mainly affect the scores, but not the ranking of departments. However, further studies might discuss this question in more detail.

8. Of course, we mean the term 'usual suspects' in a very positive way. These are the institutions that would typically be expected at the top of these rankings, due to their excellent reputation.