Research Design and Case Selection

Taken from Guy Peters, Comparative Politics: Theory and Methods, New York, New York University Press, 1998, pp. 36-41.

In the best of all worlds, we might be able to do experimental research to test the propositions advanced in comparative politics. This is not, however, the best of all worlds, and there are a huge number of practical and ethical limitations on the capacity to experiment on people and governments. Therefore, we are left with attempting to find the best possible substitute for the rigorous controls provided by the experimental method. The questions of case selection and research design are presented here as the substitute we have for being able to manipulate variables and randomise case assignments. Any non-experimental design is subject to a number of threats to validity (see below), and therefore political science researchers are in the position of merely attempting to do the best they can, given the circumstances, to prevent contamination of their research findings.

The above solutions for the problem of controlling extraneous variance are ways of skirting the principal issue, which is how to cope with this problem through the more direct means of research design and case selection. We will talk about how many cases and the problem of small Ns in the following chapter. Here, the principal question is not how many cases but which ones, although these issues cannot be separated entirely. Again, if we return to the fundamental differences between a comparative design and a statistical design, for the comparative design we must deal with the issues of controlling the sources of variance in the ex ante selection of the cases, rather than through ex post manipulations of data. Comparative design tends to rely upon fewer cases, but ones that are selected purposefully rather than at random.

The most basic question is, What makes cases comparable? (See Lijphart, 1975b). It is difficult to attend any academic meeting on comparative politics without hearing at least once the phrase, 'But those cases really are not comparable.' What does that statement mean, and what criteria should be used for determining if political systems are indeed comparable? Some social scientists (Kalleberg, 1966) have been enamoured of the criteria assumed to apply in the natural sciences, and have argued for very strict standards of comparability. In this view, cases could not be compared adequately unless they shared a substantial number of common properties. This argument was an attack on the structural-functionalists, such as Gabriel Almond, who sought to include virtually all national cases beneath their comparative umbrella (Almond and Coleman, 1960; Almond and Powell, 1966). Structural-functionalism used concepts that were so broad that almost any political system could be compared, but sceptics wondered if that truly was comparison, or simply putting essentially incomparable cases into a common, and extremely nebulous, framework.

For the purposes of comparative politics, these criteria from the natural sciences (if indeed they are actually operative there) are almost certainly too restrictive (DeFelice, 1980). First, we may well want to compare cases that display a certain property with those that do not. What factors appear to separate democratic from non-democratic political systems (Lipset, 1959; Przeworski, 1995a, b), or countries that experience revolutions from those that do not (Wickham-Crowley, 1991)? Comparative politics involves the development of theories explaining behaviour within groups of countries that are essentially similar (see pp. 18-19). It is also about contrasting cases that are different in any number of ways. Either focus of comparison - explaining similarities or differences - can tell the researcher a great deal about the way in which governments function.

Most Similar and Most Different Systems

One crucial question in the selection of cases has been advanced by Adam Przeworski and Henry Teune (1970). This is the difference between most similar and most different systems designs. The question here is how to select the cases for comparative analysis, given that most comparative work does involve purposeful, rather than random, selection of the cases. Does one select cases that are apparently the most similar, or should the researcher attempt to select cases that are the most different? Further, like much of the other logic of comparative analysis, this logic can be applied to both quantitative and qualitative work. Theda Skocpol (1979: 40-1), for example, argued in essence for a most different systems design in her historical analysis of revolutions in France, Russia and China. These systems all generated major revolutions, albeit arising within apparently very different political, economic and social systems. The question for Skocpol then became: What was sufficiently common among those systems to produce political events that were essentially similar?

Most similar systems design is the usual method that researchers in comparative politics undertake. They take a range of countries that appear to be similar in as many ways as possible in order to control for 'concomitant variation'. Wickham-Crowley (1991: II) refers to this strategy as the 'parallel demonstration of theory'. Any number of studies have been done of the Anglo-American democracies (Alford, 1963; Aucoin, 1995; Sharman, 1994), for example, or of the Scandinavian countries (Elder et al., 1988), or of the 'little tigers' in Asia (Evans, 1995; Alten, 1995; Clifford, 1994). The assumption is that extraneous variance questions have been dealt with by the selection of the cases. If a relationship between an independent variable X and a dependent variable Y is discovered, then the factors that are held constant through the selection of cases cannot be said to be alternative sources of that relationship. The most similar systems design has been argued (Faure, 1994) to be the comparative design, given that it is the design that attempts to manipulate the independent variables through case selection and to control extraneous variance by the same means.

Another important implication also arises from selection of the most similar systems designs. Any variable that does differentiate the systems is equally likely to be the source of the observed variation among them. There may be a hypothesis being tested, but there can be a large number of other competitors that would be equally plausible. Thus, any set of findings from this research design may be over-determined, with a large number of possible and plausible explanations, none of which can be ruled out. The most similar systems design may eliminate a number of possible explanations, but it also admits and fails to address a large number. Further, as argued above, the major problem in isolating extraneous variance may be that it is not possible to identify all the relevant factors that can produce differences among systems. If we look at the Anglo-American democracies (Table 2.1) - usually taken to constitute a reasonably homogeneous grouping of countries for analytic purposes - we can easily identify a number of social and economic factors that may have political relevance, all of which vary significantly. This does not even begin to get to the political factors (Table 2.2) which also vary in some significant ways.
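The logic of the most similar systems design described above can be illustrated with a minimal sketch. The three countries, their attributes and the function name `partition_factors` are all hypothetical and invented for illustration; none of them comes from the tables or studies cited in the text. The sketch separates factors that are held constant by case selection (and so cannot explain variation in the outcome) from factors that still differ (and so remain competing explanations).

```python
def partition_factors(cases, outcome_key):
    """Split case attributes into those held constant and those that vary."""
    first = next(iter(cases.values()))
    factors = [name for name in first if name != outcome_key]
    constant, varying = [], []
    for factor in factors:
        # Collect the distinct values this factor takes across all cases.
        observed = {attributes[factor] for attributes in cases.values()}
        if len(observed) == 1:
            constant.append(factor)   # controlled by case selection
        else:
            varying.append(factor)    # still a rival explanation
    return constant, varying


# Hypothetical data: three invented "most similar" systems.
cases = {
    "Country A": {"colonial_heritage": "British", "regime": "democracy",
                  "party_system": "two-party", "federal": True,
                  "policy_outcome": "high"},
    "Country B": {"colonial_heritage": "British", "regime": "democracy",
                  "party_system": "multi-party", "federal": True,
                  "policy_outcome": "low"},
    "Country C": {"colonial_heritage": "British", "regime": "democracy",
                  "party_system": "two-party", "federal": False,
                  "policy_outcome": "high"},
}

constant, varying = partition_factors(cases, "policy_outcome")
print("Held constant (ruled out):", constant)
print("Still varying (rival explanations):", varying)
```

Even in this toy example two factors remain in the varying list, which is the over-determination problem in miniature: the design rules out what is shared, but every remaining difference is a candidate cause.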




Table 2.1

Economic Characteristics of Anglo-American Democracies

[Table values not recoverable from the source. Surviving column headings: New Zealand, United Kingdom. Row headings: GNP per capita ($ US), with one surviving value of 17,720; Rate (a); Labour force in agriculture; Tertiary education (b).]

a Increase in consumer price index

b Percentage of adult population with some form of tertiary education

Table 2.2

Political Characteristics of Anglo-American Democracies

[Table values not recoverable from the source. Surviving column headings: New Zealand, United Kingdom. Row headings: Party system; Turnout (recent election); Partisan control (2/1998); Tax as % of GNP.]

The alternative strategy identified by Przeworski and Teune is the most different systems design. Here the logic of selecting cases, and indeed the whole logic of research, is exactly opposite that of the most similar systems design. In the first place, the most different design strategy begins with an assumption that the phenomenon being explained resides at a lower, sub-systemic level. This means that often this strategy is looking at individual level behaviour and attempting to explain relationships among variables in samples of individuals. The most different systems design is attempting to determine how robust any relationship among these variables may be: does it hold up in a large number of varied places, as if the observations were drawn from the same population of individuals? If it does, then we have some greater confidence that there is a true relationship, not one produced by some unmeasured third or fourth or fifth variables that exist in all relatively similar systems.
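As a rough sketch of that robustness check, one can compute the same individual-level correlation separately within each of several very different systems and ask whether it keeps its sign and a minimum strength everywhere. The three systems, the data points, the function names and the threshold of 0.3 are all assumptions made for illustration; the text itself prescribes no particular statistic or cut-off.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def relationship_is_robust(samples, threshold=0.3):
    """Check that the correlation has one sign and |r| >= threshold everywhere.

    The threshold is an arbitrary illustrative cut-off, not a standard value.
    """
    correlations = {name: pearson(xs, ys) for name, (xs, ys) in samples.items()}
    same_sign = len({r > 0 for r in correlations.values()}) == 1
    strong = all(abs(r) >= threshold for r in correlations.values())
    return same_sign and strong, correlations


# Hypothetical individual-level samples (e.g. education, participation)
# drawn from three invented and very different systems.
samples = {
    "System 1": ([1, 2, 3, 4, 5], [2, 3, 5, 4, 6]),
    "System 2": ([2, 4, 6, 8], [1, 2, 2, 4]),
    "System 3": ([10, 20, 30], [5, 7, 12]),
}

robust, correlations = relationship_is_robust(samples)
for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
print("Holds across all systems:", robust)
```

A single system in which the relationship reverses or vanishes is enough to return a negative answer, which mirrors the design's emphasis on surviving transport across varied settings.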

Further, the basic logic of the most different systems is falsification, very much in the tradition of Popperian philosophy of science (Popper, 1959). The basic argument is that science progresses by eliminating possible causes for observed phenomena rather than by finding positive relationships. As noted above, there is no shortage of positive correlations in the social sciences; what there is sometimes a shortage of is research that dismisses one or another plausible cause for that phenomenon. By setting up tests in a wide range of settings, the most different systems design attempts to do just that, while the most similar systems design can identify many possible causes but can eliminate none. This problem can be seen in part in a study of the Mediterranean democracies (Lijphart et al., 1988). These systems were thought to be similar, yet once they were analysed differences in their transitions to democracy did emerge. Unfortunately, they were sufficiently similar for it to be impossible to identify effectively the root cause of those differences.

The logic of this approach is therefore fundamentally different from the most similar systems design. Whereas the most similar design dealt with control through careful selection of matched cases, this design deals with the control issue by virtually ignoring it. In the most different case, there may still be unmeasured extraneous sources of variance, but they will have to be very generic in order to survive in the range of social settings in which the research may be conducted. This strategy is, however, also dangerous, given that it can create yet another false sense of security in the strength of the findings. Indeed, the findings may be generalisable to a wide range of political and social systems, but the underlying causal process assumed to exist may not, even though it may appear from Berlin to Bombay to Bogota.

The most similar and most different designs therefore do very different things. The former deals more directly with countries as a unit of analysis. It attempts to control for extraneous sources of variance by selecting cases in which this is not likely to be a major problem, although the researcher can rarely if ever know how big a problem really does exist. On the other hand, the most different design is not particularly interested in countries; this is more variable-based research, and is in many ways closer to a statistical design than to the true comparative design. The principal task in this design is to find relationships among variables that can survive being transported across a range of very different countries. Given the statistical nature of the thinking here, controls for extraneous variance can be imposed by the usual statistical techniques.

The most similar and most different research design strategies appear very reasonable in theory, but are perhaps more difficult to implement in practice. Gisèle De Meur and Dirk Berg-Schlosser (1996) have demonstrated more clearly how the approach to analysis could be used. They were interested in the success or failure of democratic regimes during the post-war periods in Europe. They first divided the countries into groups of similarity based on the persistence or breakdown of democracy, and then looked at the most different countries in each group, based on distance measures calculated from a large number of social and political variables. Thus, they could look at 'most different, similar outcome' cases to see what variables were in common to explain these outcomes.
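The two-step procedure just described (group cases by outcome, then find the most different cases within each group) can be sketched as follows. This is only an illustration under stated assumptions: the cases P to T, their dichotomised conditions and the use of a simple Hamming distance are all hypothetical stand-ins, and the text does not say which distance measure or variables the authors actually used.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of conditions on which two cases differ."""
    return sum(a[key] != b[key] for key in a)


def most_different_pair(group):
    """Pair of case names with the largest distance within one outcome group."""
    return max(combinations(sorted(group), 2),
               key=lambda pair: hamming(group[pair[0]], group[pair[1]]))


def shared_conditions(a, b):
    """Conditions on which two cases take the same value."""
    return {key: a[key] for key in a if a[key] == b[key]}


# Hypothetical cases: (dichotomised conditions, outcome). All invented.
cases = {
    "P": ({"industrialised": 1, "strong_middle_class": 1,
           "ethnic_cleavage": 0, "military_role": 0}, "survived"),
    "Q": ({"industrialised": 0, "strong_middle_class": 1,
           "ethnic_cleavage": 1, "military_role": 1}, "survived"),
    "R": ({"industrialised": 1, "strong_middle_class": 1,
           "ethnic_cleavage": 0, "military_role": 1}, "survived"),
    "S": ({"industrialised": 1, "strong_middle_class": 0,
           "ethnic_cleavage": 1, "military_role": 1}, "breakdown"),
    "T": ({"industrialised": 0, "strong_middle_class": 0,
           "ethnic_cleavage": 0, "military_role": 1}, "breakdown"),
}

# Step 1: group the cases by outcome.
groups = defaultdict(dict)
for name, (conditions, outcome) in cases.items():
    groups[outcome][name] = conditions

# Step 2: within each group, find the most different pair and what it shares.
results = {}
for outcome, group in groups.items():
    first, second = most_different_pair(group)
    results[outcome] = ((first, second),
                        shared_conditions(group[first], group[second]))

for outcome, (pair, shared) in results.items():
    print(outcome, pair, shared)
```

Whatever the most different pair in a group still has in common is a candidate explanation for their shared outcome; in a real application the set of variables would be far larger and the distance measure would need substantive justification.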