Introduction: absorbing the four methodological disruptions in democratization research?

ABSTRACT This article introduces the special issue on methodological trends in democratization research by taking stock of the overall development of methods practices and situating the findings of the individual article contributions within the broader developments. As has the broader discipline, democratization research has experienced four methodological “disruptions” over the past 60 years: the behavioural revolution of statistical methodology; the introduction of formal theory; the sophistication of qualitative, set-theoretic and multi-method research; and the increasing use of experimental methods. Surveying the methods practices in the past quarter century, we find that quantitative and multi-method research have been growth areas in recent years, but that the bulk of research is still done in comparative or single case studies. Formal theory as well as set-theoretic methods have gained a foothold in the field, but it is still a small one. In sum, democratization research is, methodologically speaking, still rather traditional. Moreover, the individual contributions to this special issue show that much of the empirical literature underutilizes the best available advice about how to develop and test theory, including standards on causal inference, case-selection, and generalization. We conclude with a plea for more transparency, humility, and collaboration within and across methodological traditions.

other formal methods from mathematics and economics, introducing a rigorous way to develop theory and derive hypotheses. In the 1980s and 1990s, some social scientists emphasized the agency of political actors, contingency and historical processes, leading to increasingly sophisticated thinking about the methodological foundations of qualitative and, in the past two decades, multi-method research. Most recently, the social sciences have been disrupted by the "potential outcomes revolution" and the widespread use of experimental methods.
The first three disruptions initially divided political scientists into opposing camps, giving rise to polarized debates about what constitutes good research. 1 By now, however, a pluralistic mainstream has emerged that recognizes that qualitative and quantitative methods and formal theorizing can all make essential contributions to our collective research endeavours and that no method can claim a monopoly on rigorous research. 2 We are now in the early phase of the fourth disruption, in which we still experience some mutual incomprehension, extreme claims, professional insecurity and resentment. Some believe that the findings of research designs that are not close approximations of experiments are hopelessly confounded and therefore meaningless. Others argue that experiments lack external validity and are therefore irrelevant for understanding the messy political world outside the laboratory.
However, there is reason to believe that political science will absorb the new potential outcomes disruption much as it has the three previous ones. We therefore expect that the pluralistic mainstream will be broadened to accommodate new and old insights. Certainly, the new focus on rigorously eliminating confounders makes a valuable contribution. Until recently, we have, as a field, been too willing to attribute causal force to factors that may be irrelevant when other factors are taken into account, that are endogenous to other factors, only spuriously related to the effects, or that may be effects rather than causes of the phenomena they were intended to explain. This insight does not imply that older approaches or findings have no value at all, but it does force us to re-examine the older approaches to distinguish between the advantages they claimed to have, which may have been exaggerated, and those that are still valid. By the same token, the "older" methods still can help us judge where the weaknesses of the potential-outcomes approach are. Do the concepts used in an experimental design match the ones we use to understand specific cases? How can we rigorously study processes that unfold iteratively over time? How far can we generalize findings derived from rigorously controlled tests?
Such reflections on the potential of different methods to improve our understanding of one of the most relevant and popular fields of political science, the study of democratization and political regimes, were the starting points that motivated this special issue of Democratization. We argue that methodological trends in democratization research lag behind the most recent developments in political science. Our analysis shows that while research on democratization uses diverse methods, most still employ case studies. Statistical methods are also staples of democratization research, even though such studies are outnumbered by studies with fewer than six cases. There is also some evidence of a recent trend towards multi-method analysis combining statistical and case study research. In comparison, set-theoretic methods, formal theory and experimental research are not used much in the field. Moreover, and rather troubling, all empirical methods practised in the field often ignore the increasingly sophisticated advice proposed by the methodological community in political science, for example about theory development, case selection and sampling, generalization, and making causal claims. Finally, while the practice of formal theorizing is exemplary in terms of its substantive contribution and methodological rigour, it is too infrequently used to make much of a difference.
Together, these findings suggest a serious disjuncture between methodologists and practitioners. This is troubling because empirical research depends crucially on methodological rigour, and it does not bode well for the field's ability to generate robust findings. The foremost example of these tendencies is the decades-long debate about the relationship between development and democratization. Is democratization best understood as one of many correlated attributes of modernization? 3 Does economic modernization cause democratization? 4 Does income sustain democracies but not cause transitions? 5 Does income do both, but act more powerfully to sustain than to promote transitions? 6 Is economic development mediated by culture? 7 Are all of these relationships spurious? 8 No consensus has formed about the answers to these central questions.
In order to survey these trends over the past 25 years, we commissioned five articles, each one focused on a different methodological approach. We asked the authors to describe what contributions each approach has made to our understanding of democratization and to assess its distinctive strengths and weaknesses. We are proud to have recruited top scholars who can address these topics with the authority of being experts on both the substantive issue of democratization and the methodology they are discussing. Jason Seawright, who has written widely about both methodology and Latin American comparative politics, assesses statistical research from the perspective of experimental logic. Milan Svolik, who discusses theory development, has proposed and tested some of the most influential formal models of regime change. Matthijs Bogaards writes on case studies and small-N comparisons from his experience working on democracy, elections, and ethnicity, especially in Africa. Jørgen Møller and Svend-Erik Skaaning, who have published on a wide range of substantive topics, including democratization, state formation, and conflict, and have contributed significantly to the development of qualitative and set-theoretic methodology, address the potential contributions of set theoretic methods. Amel Ahmed, a prominent voice on mixed methods, who has written about institutional choice during long-term democratization in Europe and elsewhere, assesses a unique contribution of multi-method research. The special issue ends with an article by Paul Friesen and Lars Pelke that introduces their original Democratization Articles Dataset and presents a number of trends in democratization research.
The Friesen and Pelke dataset classifies methodological practices of articles published between 1990 and 2016 in the two leading international journals focussing on democratization (the Journal of Democracy and Democratization) and the three comparative politics journals with the highest impact factors: Comparative Politics, Comparative Political Studies, and World Politics. Our introduction draws on this dataset as well as on the findings of the five methods articles to evaluate whether practices in democratization research correspond to broader methodological developments in the discipline. We discuss the first and fourth methodological disruptions (the behavioural revolution and the potential-outcomes approach, which relies on the logic of experiments) together and then proceed to the second and third: formal theory and qualitative and multi-method research. We conclude with a brief summary of our main insights and conclusions and suggest how to better align methodological theory and practice in democratization research.

The first and fourth disruptions of statistical and experimental research
The behavioural revolution introduced statistical methods to the study of democratization starting in the late 1950s. 9 The early analyses in this tradition were purely correlational and cross-sectional, largely due to the scarcity of data and the rudimentary methods then known to political scientists. Over the next forty years, however, data became more abundant and statistical expertise deepened and spread. By the late 1990s, quantitative work on democratization demanded large panel datasets, extensive controls, and aggressive efforts to wrestle with autocorrelation, multicollinearity, measurement bias, interactions, and often endogeneity. 10 This approach has come to comprise nearly 20% of the articles in the Democratization Articles Dataset (Table 1). It excels at finding and documenting general patterns, but usually by sacrificing conceptual richness and theoretical integration. It has identified a handful of robust tendencies, such as that high-income countries tend to be democratic and that democracy is strongly clustered historically and geographically; and dozens of other relationships that are not as well established. 11 However, quantitative research has also fed debates about why exactly we observe these tendencies and whether they are really causal. Quantitative researchers have rarely attempted to delve deeply into specific aspects of democracy simply because until recently there were insufficient data measuring concepts less general than "democracy" or "freedom". 12 To the extent that analysis of related concepts has been possible (for example, research regarding human rights), it has also tended to document general patterns with limited conceptual richness and theoretical integration.
[Table note: *N = 500 (based on the subsample of all empirical articles that use statistical methods). **N = 391 (only subset of statistical studies that explicate sampling rules). Percentages may not sum to 100 due to rounding or missing data. The data are drawn from the "primary research interest", "causal claim", "number of cases", "sample selection", "sampling rules 1", and "generalization" variables of the Democratization Articles Dataset (covering the years 1990-2016).]
The newest disruption, variously termed "experimental," "causal identification," or "potential outcomes", springs from the quantitative approach. It employs statistical methods but seeks to correct their limitations. The most familiar quantitative democratization research typically proceeds by regressing some measure of democracy on some predictors of interest and some control variables, using the largest panel dataset available (see Table 2). Notable examples include Przeworski et al.'s Democracy and Development and Teorell's Determinants of Democratization. 13 Recently, attention has increasingly focused on whether the relationships found in this approach can be interpreted as causal, or whether they are only correlational: just descriptions of patterns in the data that may or may not have resulted from causal processes. In his contribution, Seawright stresses that a key issue in identifying causal relationships is whether the variable of interest (the "treatment," in an experimental framework) was randomly assigned. Random assignment provides reassurance that omitted variables (i.e. alternative explanations that have not been controlled for, also known as "confounders") are uncorrelated with the variable of interest or the outcome, and therefore can be safely ignored. If the treatment was not randomly (or, in natural experiments, at least "as-if" randomly) assigned, then there is no guarantee that alternative explanations do not affect the conclusion, so the conclusion cannot be trusted.
Seawright argues that regression analysis in effect assumes that the conditions for natural experiments have been met: that all confounders have been eliminated. He argues that these conditions are never met and as a result, the findings of typical regression analyses using observational data are inconclusive. This poses a profound challenge to mainstream quantitative research. One implication is that we must go back and question the conclusions of older quantitative research. Going further, Seawright concludes that "current research practices cannot realistically meet the goals we assign to them"; at best, they yield "descriptions of joint distributions between democracy and other variables" rather than any sort of causal inference.
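The logic of this argument can be illustrated with a small simulation. The sketch below is not drawn from Seawright's article: the data-generating process, the effect sizes, and the function names (`slope`, `simulate`) are assumptions made purely for illustration. It compares a bivariate regression estimate when an unobserved confounder drives selection into treatment with the estimate obtained when treatment is assigned by coin flip.

```python
import random
import statistics


def slope(x, y):
    """Bivariate OLS slope of y on x: cov(x, y) / var(x)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def simulate(randomized, n=20_000, true_effect=1.0, seed=42):
    """Estimate the effect of a binary treatment on an outcome.

    A hypothetical unobserved confounder u raises the outcome directly.
    When randomized is False, u also makes treatment more likely.
    """
    rng = random.Random(seed)
    treatment, outcome = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # unobserved confounder
        if randomized:
            # Coin flip: assignment ignores u entirely.
            t = 1.0 if rng.random() < 0.5 else 0.0
        else:
            # Self-selection: units with high u are more often treated.
            t = 1.0 if u + rng.gauss(0, 1) > 0 else 0.0
        y = true_effect * t + 2.0 * u + rng.gauss(0, 1)
        treatment.append(t)
        outcome.append(y)
    return slope(treatment, outcome)


observational = simulate(randomized=False)
experimental = simulate(randomized=True)
print("true effect:            1.00")
print(f"observational estimate: {observational:.2f}")
print(f"randomized estimate:    {experimental:.2f}")
```

Under these assumed parameters, the observational estimate absorbs the confounder's influence and overstates the effect, while the randomized estimate stays close to the true value. The size of the bias depends entirely on the invented parameters and says nothing about any particular published study.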
We draw less damning conclusions than Seawright. Certainly it is progress on one front to be able to draw stronger, less-confounded inferences by using experiments, natural experiments, or quasi-experiments. Seawright is correct that most past regression analyses have paid too little attention to possible confounders and their conclusions cannot be fully trusted until all the confounders can be ruled out. It is difficult to argue that any analysis using observational, i.e. non-experimental, data ever achieves such perfection. However, not even the best laboratory experiment (on politics, at least) can rule out absolutely all confounders. All experiments are conducted in highly constrained environments, and the characteristics of those environments (the location, the time period, the population from which subjects were recruited) are omitted from the protocol. For example, in one of the seminal field experiments about democratization, Susan Hyde randomized monitoring of voting stations in Armenia's 2012 election. 14 While the study was very well designed, there are good reasons to question whether the same results could be replicated in another country, or even in the next election in Armenia. This is why lab experiments tend to lack external validity, which means that the findings must be presumed to be confounded with the conditions in which the experiment was conducted. Therefore, we see the difference in inferential rigour between experiments and observational regressions as one of degree rather than a categorical difference between perfectly trustworthy methods that reveal the truth and hopelessly flawed ones that reveal nothing. Comparativists (in fact, all scientists) have to be comfortable living with provisional and probabilistic knowledge. Nevertheless, experimental and observational analyses offer two distinct routes towards overcoming the inherent trade-off between causal identification and generalization.
One way would be for experimentalists to replicate small-scale studies in many different contexts until it becomes clear that the finding is generally true. In the process, they would inevitably have to add scope conditions in order to take contextual variables into account. Those working with observational data would prefer to generalize top-down, starting with general tests of sweeping hypotheses, and then gradually qualify them by testing for conditional relationships in smaller samples. We see no need to prefer one over the other. Both approaches would eventually converge on the same conclusions.

The second disruption and formal methods in democratization research
[Table note: The data are drawn from the "experiments", "primary research interest", "causal claim", "number of cases", "sample selection", "sampling rules 1", and "generalization" variables of the Democratization Articles Dataset (covering the years 1990-2016).]
Formal models (those that aim to build theory from explicit and clear assumptions about how rational actors make choices in the presence of constraints and payoffs) are sometimes ignored in discussions of research methods because they are methods of theory development, not empirical testing. This is a mistake, because causal inference requires more than establishing that an empirical pattern exists. We must also have a good reason to expect the relationship to exist. This is what formal theorists bring to the table, as well as the promise of a logical framework that is able to connect many propositions into a single integrated body of theory and that provides clear-cut and, in principle, testable micro-foundations for the relationship between independent and dependent variables. Consequently, formal theorizing, either in its "hard" mathematical form or in a softer, more intuitive approach, has contributed much to our understanding of crucial aspects of democratic transition and consolidation. A few works have had an influence on theory that is far greater than the proportion of research done in a formal theoretical vein. One such example is Adam Przeworski's re-formulation of O'Donnell and Schmitter's intuitive argument that the Latin American and Southern European transitions from authoritarian rule in the 1970s and 1980s were due to splits between hard- and soft-liners in the regime leadership and the making of implicit or explicit pacts between the soft-liners and the opposition. 15 By transforming verbal arguments into a more formal language of game theory, Przeworski not only explicated several underlying assumptions, which in itself was a significant theoretical improvement; 16 he also showed that under normal conditions rational soft-liners would never defect from the authoritarian regime and, consequently, that transitions are best understood as the result of miscalculations of parts of the regime elite. 17 Acemoglu and Robinson give a more recent and more mathematically sophisticated answer to the question of why transitions to democracy occur.
They argue that dictators may introduce democratic institutions to solve commitment problems created by economic inequalities and to stave off their violent dismantling by a military coup or a popular revolt. 18 Other influential formal work has theorized the relationships among inequality, redistribution and democratization; the survival of democratic and authoritarian regimes; and the inner workings of authoritarian regimes and their impact on the transition and consolidation of democracy. 19 These works have become much-cited landmarks without necessarily presenting incontrovertible empirical corroboration of their theoretical arguments. Rather, it is their theoretical value (i.e. the clear-cut specification of the underlying theoretical assumptions and micro-foundations, the mathematical precision of all steps in the argumentation, and the often counter-intuitive predictions of these models) that often inspires a whole avalanche of new theoretical alternatives, both formal and informal, and empirical tests using a wide spectrum of research methods. 20 A look at the Democratization Articles Dataset might suggest that formal theory does not play a major role in democratization research, as not even 1% of the surveyed articles include a formal model (see Table 3). 21 However, Milan Svolik's in-depth substantive survey in this special issue not only reiterates that there is a solid and growing body of formal literature on democratization, but also underscores the substantive and methodological contributions of this literature to knowledge accumulation in the field. Substantively, he argues that formal theorists have introduced the notion of "democracy as equilibrium", which stresses that the emergence and persistence of democratic institutions must be explained as the result of optimal strategic behaviour of rational actors.
Methodologically, Svolik reiterates that formal theorizing enforces the specification of complete and coherent theoretical arguments that entail explicit statements on the micro-foundations of macro-political outcomes; and highlights the discipline, rigour and transparency of mathematical modelling, which facilitates the reproducibility of theoretical conclusions and the empirical testing of these models' observable implications.
Concerning empirical testing, however, Svolik also reminds us that formal theory has not fully exploited its potential to strengthen causal inferences, especially in relation to the combination of formal theorizing with experimental research. Here, Svolik sees the role of formal models in highlighting identification problems before the empirical evaluation and in providing a framework for evaluating the external validity of experimental results. Indeed, among the articles surveyed by Friesen and Pelke, not one combines formal theorizing and experimental research. Also, external validity seems not to be the paramount concern of empirical evaluations of formal theories, as most formal articles do not address the issue of generalization explicitly. Instead, the data summarized in Table 3 suggest that the empirical evaluation of formal models is the domain of quantitative and mixed method research, which reflects the dominant patterns in the broader discipline. 22 Purely qualitative work accounts for less than 20% of empirical tests of formal models in the field. This suggests that the Analytic Narratives project, which aimed at combining formal theory with in-depth process-tracing case studies, 23 has not taken hold in democratization research. It also means that democratization researchers have yet to fully capitalize on the considerable leverage of within-case analyses in the empirical evaluation of formal models. This includes empirical testing of the theoretical assumptions underlying the formal models. In-depth case analyses would also permit empirical tests of the assumed sequencing of actors' decisions in formal models, which is usually crucial for the generation of the model's theoretical propositions. 24 In Gandhi's influential formal model of coalition-building and co-optation in authoritarian regimes, for instance, the results are driven mainly by the assumption that it is always the dictator who initiates bargaining with the opposition. 25 This is a plausible simplifying assumption, but it is neither theoretically nor empirically substantiated. Case studies could contribute to corroborating the model's explanatory potential by providing empirical evidence for actual processes of co-optation in selected authoritarian regimes.
[Table note: *N = 3093, including empirical and theoretical articles. The data are drawn from the "formal.modelling" variable of the Friesen and Pelke Dataset. **N = 24, including only empirical articles. Percentages may not sum to 100 due to rounding or missing data. The data are drawn from the "formal modelling", "empirical method", "number of cases", and "generalization" variables of the Democratization Articles Dataset (covering the years 1990-2016).]

The third disruption and case studies in democratization research
Case study methodology has long been a staple of empirical work in political science in general, and the sub-discipline of comparative politics in particular. 26 The debate on case study methodology received a major boost with King, Keohane and Verba's Designing Social Inquiry, which presented in commendable clarity a number of prescriptions for descriptive and causal inference in qualitative research. 27 It also generated an extremely fruitful wave of reactions by qualitative researchers, who found the epistemological and methodological foundations of their empirical approaches at odds with King et al.'s transferring the logic of variable-oriented, statistical methods to qualitative work. 28 Together, this new body of research has suggested a number of important concepts, general principles and guidelines to maximize the inferential potential of case studies. First, there is much agreement that case studies can be employed for different goals. On the one hand, even critics of the method agree that case studies are particularly well-suited for in-depth description and the empirical evaluation of rich concepts. 29 This is well reflected in democratization research, as 77.1% of all purely descriptive studies in the Friesen and Pelke dataset are case studies. In terms of conceptual validity, however, Bogaards suggests in his contribution to this special issue that case studies of democratization have failed to realize their full potential and led to excessive conceptual innovation: authors tend to coin new concepts for each case, which few other researchers adopt, leading to the lack of a common language for understanding cases in comparative perspective. On the other hand, case studies are also considered useful tools for theory development, especially the generation and modification of existing theories, and causal inference, especially the testing of causal mechanisms.
However, only about one in five case studies in Friesen and Pelke's dataset (22.0%) also aims at some theoretical contribution (see Table 4). Beyond these general patterns, Bogaards argues that theoretical arguments in case studies tend to be tailored to each case, so it is often unclear how each case might contribute to more general theory.
Second, case studies in the field are also problematic from the perspective of research design. In general, there are two basic approaches to causal inference in case study research: cross-case comparison of causal effects and within-case tracing of historical processes to uncover causal mechanisms. Small-n comparison, whether based on Mill's methods or on some informal comparative scheme, is riddled with well-understood problems and threats to inference, including causal indeterminacy, a tendency to assume causal determinism and simplistic causal relationships, and the unrealistic requirements of finding most similar or most different cases. 30 Still, more than one in five of all case studies that do explicate their sampling rules do so in reference to Mill's methods. At the same time, Bogaards suggests in his article, the field has mostly ignored the developments and formalization of within-case process tracing into a coherent and reliable method of causal inference. 31 This deprives case study research of its most powerful tool of causal inference and its most important comparative advantage vis-à-vis large-n research: the identification and comparative evaluation of causal mechanisms. One reason for Bogaards' conclusion might be that good process tracing is not published in journal articles but requires book-length expositions. And, indeed, some of the best case study work in the field has been published in monographs, such as Greitens' study on the emergence of repressive apparatuses in three East Asian dictatorships. 32 Nonetheless, high-quality process tracing has also been published in journal articles, including Weyland's convincing analysis of the impact of cognitive shortcuts to explain the diffusion of mass protests in nineteenth-century Europe. 33 Third, methodologists agree that case selection plays a crucial role in the method's ability to fulfil these goals. 34 However, Table 4 shows that 66.1% of case studies in the dataset fail to specify their sampling procedures. Moreover, of those case studies that do justify the selection of cases most do so based on the inherent importance of the cases and not on methodological grounds (41.5%). This is even true for those case studies that aim at making an explicit theoretical contribution: only slightly more than half (51.9%) of these case studies justify their case selection rules, and only 31.9% do so based on systematic, methodologically founded criteria. 35 As Bogaards stresses in his analysis, this undermines the evaluation of a given study's theoretical contribution and generalizability.
[Table excerpt: Explicit discussion of generalization: No, 1587 (86.0%); Yes, 251 (13.6%). *N = 1845 (based on the subsample of all empirical articles that use case study). **N = 574 (only subset of case studies that explicate sampling rules). Note: Percentages may not sum to 100 due to rounding or missing data. The data are drawn from the "primary research interest", "causal claim", "number of cases", "sample selection", "sampling rules 1", and "generalization" variables of the Democratization Articles Dataset (covering the years 1990-2016).]
This is closely related to the finding that only 13.6% of the case studies surveyed in the Friesen and Pelke dataset explicitly discuss generalization. This may be because of the well-known limits of case studies in producing generalizable knowledge. Since all cases are instances of a larger, conceptually defined population, 36 however, representativeness and generalizability are always relevant, especially for case studies that aim at contributing to theory development. But only 31.4% of the theory-oriented case studies in the dataset explicitly address generalization. Of course, the claimed generalizability of case analyses mainly depends on case selection. Case study methodologists suggest that in order to maximize the generalizability and theoretical contribution of case studies, cases should be chosen that are either "most likely" or "least likely" to corroborate a given theory. 37 However, these case selection rules are explicitly used in only 12.0% of all case studies, and of those only 34.8% discuss matters of generalization in detail. In his article, Bogaards stresses that this lack of attention to questions of external validity and case selection is particularly damning for theory-generating or -testing case studies. Without specifying the theory's scope conditions, the theory's range and applicability cannot be assessed, theoretically or empirically. 38

The third disruption and set-theoretic methods in democratization research
The second element of the third methodological disruption is the development and formalization of set theoretic methods of social inquiry such as typological theory and the different forms of Qualitative Comparative Analysis (QCA). 39 While these methods have been around at least since the late 1980s, they received increased attention in the qualitative "backlash" to King et al.'s suggestion that social inquiry should be based on a single, essentially quantitative, logic of inference. Most recently, political science has even seen the proposal, most vocally brought forth by Gary Goertz and James Mahoney, that qualitative research per se is based on set-theoretic logic. 40 According to the Democratization Articles Dataset, however, democratization research practice has thus far made little use of QCA and other set-theoretic methods, as only 16 articles employ these methods. Of the published set-theoretic work, 62.5% is aimed at theory development through empirical analysis, and 93.8% of articles are trying to uncover causal relationships. For that, they typically rely on a medium to large number of empirical cases, with 62.5% of the studies having case numbers in the range between 6 and 30. In selecting the cases, QCA studies invariably justify their sampling rules and typically (56.2%) claim to include the complete population in the analysis. Despite the fact that 62.5% of QCA articles do not explicitly address the issue of generalization, this overall suggests that existing QCA research might make an important contribution to the accumulation of knowledge in democratization research (see Table 5).
As Møller and Skaaning show in their detailed survey of set-theoretic methods in democratization research, however, their inferential potential has been insufficiently realized due to three major limitations. First, despite the epistemological foundations of QCA in set theory, many authors employ the method in an attempt to uncover linear-additive, symmetrical relationships, which are typically better captured by statistical procedures. Second, Møller and Skaaning warn that too many studies too easily interpret the set-relational findings between (combinations of) conditions and the outcome as causal relationships. Just as correlation does not equal causation, a set relation does not mean that a given condition is actually causally necessary or sufficient for the outcome. Since temporal precedence is a defining element of many concepts of causation, the causal interpretation of QCA results is particularly tenuous given the method's well-known difficulty in adequately capturing time and historical sequences. 41 Finally, Møller and Skaaning bemoan the fact that many applications of set-theoretic methods do not follow well-established technical standards, especially the recommendation to run robustness checks for measurement error and for the fragility of results to small changes in the "calibration" of conditions. 42 In sum, such malpractices lead Møller and Skaaning to the gloomy conclusion that QCA has not yet made any lasting contribution to the field. To that, we add that the failure to follow best practices not only weakens any method's contribution to the accumulation of knowledge, but also makes wholesale criticism of the method per se too easy. 43 Moreover, given the inherent challenges of causal inference based on the identification of cross-case regularities (summarized in our discussion of the first and fourth disruptions above), it is unlikely that these problems will be solved by the refinement of set-theoretic methods alone.
Consequently, the best research strategy for causal inference consists of taking seriously the suggestions of the earliest proponents of the method, and closely connecting set-theoretic cross-case techniques with the inferential potential of process tracing of systematically selected cases.

Explicit discussion of generalization*: No, 10 (62.5%); Yes, 6 (37.5%).
*N = 16 (based on the subsample of all empirical articles that use set-theoretic methods).
Note: The data are drawn from the "primary research interest", "causal claim", "number of cases", "sample selection", "sampling rules 1", and "generalization" variables of the Democratization Articles Dataset (covering the years 1990-2016).
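To make the point about calibration robustness concrete, the following is a minimal sketch rather than a reproduction of any study discussed here. It uses the standard set-theoretic consistency measure for sufficiency (the sum of the case-wise minimum of condition and outcome membership, divided by the sum of condition membership) and recomputes it under different crisp-set calibration thresholds. The data, the threshold values, and the function names are hypothetical and chosen only for illustration.

```python
def sufficiency_consistency(condition, outcome):
    """Consistency of 'condition is sufficient for outcome':
    sum(min(x, y)) / sum(x), for crisp or fuzzy membership scores."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    total = sum(condition)
    return overlap / total if total else float("nan")


def calibrate_crisp(raw_values, threshold):
    """Crisp-set calibration: a case is 'in' the set if raw >= threshold."""
    return [1 if value >= threshold else 0 for value in raw_values]


# Hypothetical raw scores on one condition for six cases, and a
# hypothetical crisp outcome (1 = outcome present).
raw_condition = [2, 4, 5, 6, 7, 9]
outcome = [0, 0, 1, 0, 1, 1]

# Robustness check: does the sufficiency claim survive small changes
# in the calibration threshold?
results = {
    threshold: sufficiency_consistency(
        calibrate_crisp(raw_condition, threshold), outcome
    )
    for threshold in (4, 5, 6, 7)
}

for threshold, consistency in results.items():
    print(f"threshold {threshold}: consistency {consistency:.2f}")
```

In this toy example the condition looks perfectly sufficient at one threshold and far from it at a neighbouring one, which is the kind of fragility a robustness check is meant to expose. Even a stable, perfect consistency score would remain a set relation and not, by itself, evidence of causation.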

The third disruption and multi-method research in democratization research
After an initial burst of enthusiasm for multi-method research, roughly coinciding with the founding of the American Political Science Association's Organized Section on Qualitative and Multi-method Research in 2003 and Brady and Collier's reply to King et al.'s plea for a unified logic of inference, 44 scholars realized that there are difficult problems of incommensurability. 45 Different methods that seem to be complementary often in fact employ different concepts, different logics, and different samples that end up not addressing exactly the same research question. In this special issue, Ahmed approvingly cites Seawright's distinction between mere "triangulation" and "integrated" multi-method research designs, which leverage the different strengths of each method in ways that actually do promote complementarity, such as Lieberman's "nested analysis". 46 In integrated designs, qualitative methods complement quantitative and experimental methods in order to learn about measurement error, causal mechanisms, sources of causal heterogeneity, and confounding variables; and to check for violations of random assignment in experiments. We would add that there are innovative Bayesian tools for combining findings from different methods into a single inference. 47 These deserve greater attention even though it is difficult to defend the assignment of priors that they require. Nonetheless, Bayesian methods show that passing several tests using different methods, even if they are individually relatively un-rigorous, can sometimes justify greater confidence in a causal inference than passing one more rigorous test using a single method.
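The Bayesian intuition in the last sentence can be illustrated with a small worked sketch. It assumes that the tests are conditionally independent given the hypothesis, so that their likelihood ratios multiply. The prior and the pass probabilities are made-up numbers for illustration, not estimates from the literature, and the function name is hypothetical.

```python
from math import prod


def posterior_after_passing(prior, tests):
    """Posterior probability of a hypothesis after it passes every test.

    `tests` is a list of (p_pass_if_true, p_pass_if_false) pairs.
    Assumes the tests are conditionally independent given the hypothesis.
    """
    prior_odds = prior / (1 - prior)
    likelihood_ratio = prod(p_true / p_false for p_true, p_false in tests)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# One rigorous test: rarely passed when the hypothesis is false.
one_rigorous = posterior_after_passing(0.5, [(0.9, 0.1)])

# Weaker tests from different methods: each is passed fairly often
# even when the hypothesis is false.
two_weak = posterior_after_passing(0.5, [(0.7, 0.3)] * 2)
three_weak = posterior_after_passing(0.5, [(0.7, 0.3)] * 3)

print(f"one rigorous test: {one_rigorous:.3f}")
print(f"two weak tests:    {two_weak:.3f}")
print(f"three weak tests:  {three_weak:.3f}")
```

Under these assumed numbers, three weak tests yield a posterior of roughly 0.93 against roughly 0.90 for the single rigorous test, while two weak tests fall short of it (about 0.84). This is why the claim holds only "sometimes": it depends on how many tests are passed, how discriminating each one is, and whether the independence assumption is defensible.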
In her contribution to this issue, Ahmed surveys the literature and finds that true multi-method research on democratization is still rather rare. It also tends to favour one method, with the other serving in a supporting role. She also finds that some methods, especially formal theory and interpretive methods, tend not to be included in multi-method studies. The Friesen and Pelke data concur: just 6.4% of the articles they classified qualify as multi-method. However, multi-method authors appear to be more self-consciously methodological: all do empirical work, 92% make causal claims, and the majority study more than one case and define explicit sampling rules. Even so, only a quarter discuss the generalizability of their findings (Table 6).
Nonetheless, democratization research has produced a number of landmark studies on various substantive questions that exemplify the systematic combination of multiple methods to strengthen causal inference. Most of these publications combine statistical analyses with case studies to trace causal mechanisms, such as Norris's work on the impact of power-sharing institutions on democratic stability; Teorell's comparative analysis of the determinants of democratization; or Haggard and Kaufman's study of the role of elites and the mass public in transitions to democracy. 48 Recently, and reflecting the fourth methodological disruption, multi-method researchers have also started to combine (natural) experiments with observational methods, such as Dunning's research on the impact of natural resource wealth on regime development. 49
Ahmed also recognizes the fundamental obstacle: the difficulty of making different methods speak meaningfully to the same questions. If a case study of the relationship between economic growth and democratization is deeply embedded in the complex institutional context, history, and culture of one country, how is it fair to isolate one finding to test in a large-N analysis? Such questions are even more challenging when, as usually happens, quantitative researchers reduce the rich concepts used in case studies to much simpler, easy-to-measure variables. And when a body of quantitative research repeatedly demonstrates that on average some condition, such as per capita income or regional democracy levels, is associated with democratization, is there really a good reason to expect such general tendencies to hold true in most specific cases? What sense are we to make of it when they do not? For that matter, what sense are we to make of it when they do?
Ahmed's novel contribution in this issue is an empirical demonstration that multi-method research provides a distinctive benefit to the discipline beyond these issues of causal identification and generalization: getting practitioners of different methods more acquainted with research outside their orbits. This is an important argument, and she employs extensive citation data and revealing network graphs to make her case. Cross-method dialogue may not conquer the challenges of making all the different methods complementary and leading us to converge on shared truths. However, it is a necessary first step towards that goal.

*N = 173 (based on the subsample of all empirical articles that use multi-method research).
**N = 100 (only subset of multi-method studies that explicate sampling rules).
Note: Percentages may not sum to 100 due to rounding or missing data. The data are drawn from the "primary research interest", "causal claim", "number of cases", "sample selection", "sampling rules 1", and "generalization" variables of the Democratization Articles Dataset (covering the years 1990-2016).

Conclusion
We want to begin our conclusion with a caveat on the scope of this special issue. A complete survey and evaluation of methods practices in democratization research and its strengths and limits would require a much more comprehensive (and most assuredly multi-method) approach than what is possible within the confines of this introduction, or even the whole of the special issue. Questions we cannot or can only partially address include, for example, whether methods practices vary across different substantive research questions in democratization research, or over time (which Friesen and Pelke address in their contribution). Similarly, we cannot say anything substantial about the reasons why methods practices in the field are the way they are. Nonetheless, we believe that our overview of the past 25 years of democratization research, both in this introduction and the individual contributions to this special issue, allows us to paint a portrait, albeit in broad strokes, of a research programme that has been cautious in absorbing the major methodological disruptions in political science and sociology. Quantitative and multi-method research (which tends in practice to be limited to regression plus case studies) have been growth areas, together reaching about a quarter of the articles in this time period. Case studies, either comparative or of single cases, still comprise about two thirds of the articles. Formal theory has gained a foothold in the journals, but it is still a small one, although books using this approach have undoubtedly been very influential. 50 Set-theoretic applications remain rare; their foothold in this literature is tenuous. For all the heated attention methodological debates have stirred up in political science and comparative politics, we would have expected to see alternatives to case studies being more prominently represented in the democratization literature; yet it is still surprisingly traditional. We do not mean to imply that the continuing prominence of case studies is a bad thing. Without case studies, we would be fumbling in the dark!
Description must precede explanation, and the demands of merely describing all of this recent history are daunting. There are nearly 200 countries in the world and they are all constantly evolving. Producing just one article on democratization per country per year (an admittedly crude but probably low standard, given the need to include competing methods and theories) over the past 25 years would have required five thousand articles, so the nearly three thousand articles surveyed for this special issue were not keeping up. Perhaps this scholarly community needs such a high proportion of case studies just to have enough material to work with when developing and testing theory. And of course, 88% of the case studies also make causal claims (Table 4).
Having said that, much of the empirical literature apparently underutilizes the best available advice about how to develop and test theory. If we, as a community of researchers with a shared interest in understanding democratization, wish to make progress towards clear, cumulative, robust, tested and confirmed theory, we recommend the virtues of transparency, humility, and collaboration. Transparency is essential for understanding. What were the real-world cases that inspired a formal theory? How reliable are the measures we use? Which specification decisions led to different findings? How were cases chosen for comparison? Being transparent about these matters leads naturally to humility. It is human nature to be excited about one's latest research, but it is important to remind ourselves that no method leads unfailingly to the truth. Rather, each method reveals different facets of the truth, and it is only by sifting through many streams of evidence, as individuals or as a collective, that we earn the right to claim new understanding. Collaboration helps in this process, because it is practically impossible for a lone researcher to master all of the many methods now at our disposal, not to mention all of the deep case knowledge that is required to know whether to trust large-N generalizations. Furthermore, the benefits of collaboration are greater when collaborators have complementary strengths. This is therefore a call for more multi-method research (and for reading across the methodological divides, which, as Ahmed suggests, multi-method research promotes), but especially for collaborative multi-method research, in which each collaborator compensates for the weaknesses of the others.