Angus S. Deaton (2009). NBER Working Paper 14690, January.
Many development economists advocate the use of randomized controlled trials by development institutions. Deaton takes a more sceptical view, arguing that there are several drawbacks to randomized evaluations.
Randomized controlled trials have gained popularity among development economists over the past five years. As in medical drug trials, randomly allocating a development treatment (such as well-building) and comparing outcomes to a control group that does not receive the treatment yields an unbiased estimate of the average treatment effect. Unlike many econometric models, this approach does not require unrealistic assumptions. It has been advocated by prominent academic institutions such as the MIT Poverty Action Lab and the Bureau for Research and Economic Analysis of Development (BREAD) network, who see it as a scientific basis for policy recommendations. More broadly, the movement has arisen as a response to “scepticism about econometrics,” “doubts about the usefulness of structural models in economics,” “the endless wrangling over identification and instrumental variables,” and “frustration with the World Bank’s apparent failure to learn from its own projects, and its inability to provide a convincing argument that its past activities have enhanced economic growth and poverty reduction” (23). Economists dissatisfied with development policy as a “succession of fads” have moved towards randomized evaluations, which are seen as “generating gold standard evidence that is superior to econometric evidence, and that is immune to the methodological criticisms that have been characteristic of econometric analyses” (24). One example is “the flagship study of the new movement in development economics,” Ted Miguel and Michael Kremer’s 2004 paper, which showed that deworming medication administered in randomly selected schools was more effective in reducing absenteeism than medication administered outside schools, because of children’s propensity to infect one another.
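The core logic here, that random assignment makes a simple difference in means an unbiased estimate of the average treatment effect, can be sketched with simulated data. The numbers below are hypothetical, not taken from any study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: an outcome with a true treatment effect of 2.0
# plus idiosyncratic noise, across 10,000 units.
n = 10_000
true_effect = 2.0
treated = rng.integers(0, 2, size=n)          # random assignment: a coin flip
noise = rng.normal(0, 1, size=n)
outcome = 5.0 + true_effect * treated + noise

# Because assignment is random, treated and control units are comparable on
# average, so the simple difference in means estimates the treatment effect.
ate_hat = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(round(ate_hat, 2))  # close to 2.0
```

No model of how the outcome is generated is needed, which is exactly the appeal: the randomization itself does the identifying work.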
In this article, economist Angus Deaton takes a dissenting view. He acknowledges that “in ideal circumstances, randomized evaluations of projects are useful for obtaining a convincing estimate of the average effect of a program or project.” However, he argues that “The price for this success is a focus that is too narrow to tell us ‘what works’ in development, to design policy, or to advance scientific knowledge about development processes” (3). Even though the best development economists already propose generalizable mechanisms that “explain why and in what contexts projects can be expected to work,” he notes, “there would be much to be said for doing so more openly” (4).
The first problem with randomized evaluations concerns the understanding of exogeneity. The dictionary definition of “exogenous” is “caused by factors or an agent from outside the organism or system.” The econometric definition of exogeneity requires that the instrumental variable be orthogonal to the error term. Deaton argues that the difference between these two definitions, both used casually, “has caused, and continues to cause, endless confusion in the applied development (and other) literatures” (12).
When natural or randomized experiments are not possible, economists often use instruments to try to isolate the causal impact of one variable on another. An instrument is a variable that is correlated with the independent variable but uncorrelated with the error term. For example, an economist might want to estimate the impact of aid on growth. There are obvious problems with looking at the simple relationship between the two. There might be feedback: if a hurricane strikes a country, growth will fall but aid will rise, making it look as if aid depresses growth when in fact no such relationship holds. Economists turn to instruments in such cases. One commonly used instrument is francophone African status: francophone African countries receive more aid from France because of their colonial history. Deaton explains that “By comparing these countries with countries not so favored… we can observe a kind of variation in the share of aid in GDP that is unaffected by the negative feedback from poor growth to compensatory aid. In effect, we are using the variation… as a natural experiment to reveal the effects of aid” (18).
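The aid-and-growth example can be made concrete with a small simulation (all coefficients below are hypothetical, chosen only for illustration): the naive regression slope is contaminated by the feedback from growth shocks to compensatory aid, while the instrumental-variables (Wald) estimate uses only the variation in aid driven by the instrument.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

z = (rng.random(n) < 0.3).astype(float)   # instrument: francophone status
e = rng.normal(0, 1, n)                   # growth shocks (e.g. hurricanes)
u = rng.normal(0, 1, n)
aid = 1.0 * z - 0.8 * e + u               # aid rises when growth shocks are bad
growth = 0.1 * aid + e                    # true effect of aid on growth: +0.1

# Naive OLS slope: biased because aid is correlated with the error term e.
beta_ols = np.cov(aid, growth)[0, 1] / np.var(aid)

# IV (Wald) estimate: ratio of covariances with the instrument, which is
# uncorrelated with e, so the feedback drops out.
beta_iv = np.cov(z, growth)[0, 1] / np.cov(z, aid)[0, 1]

print(round(beta_ols, 2), round(beta_iv, 2))  # OLS is negative; IV is near 0.1
```

In this simulated world the naive estimate is negative, the “aid depresses growth” artifact described above, while the instrument recovers the true positive effect.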
Instruments are not without problems of their own. An instrument that is even slightly correlated with the error term can produce severely biased results, and heterogeneity in treatment effects is a concern as well. Appropriately used, however, instruments can support important policy conclusions: in one famous 1999 study, Joshua Angrist and Victor Lavy found that children in small classes in Israel performed better than those in large classes.
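The worry about slight correlation with the error term can also be illustrated. Asymptotically, the IV estimate is off by cov(z, e)/cov(z, x), so when the first stage cov(z, x) is weak, even a small violation of exogeneity is greatly magnified. A hypothetical simulation:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta = 1.0                                  # true causal effect (hypothetical)

e = rng.normal(0, 1, n)                     # structural error term
zeta = rng.normal(0, 1, n)
z = zeta + 0.05 * e                         # instrument slightly correlated with e
u = rng.normal(0, 1, n)

def iv_estimate(first_stage_strength):
    # x depends on the instrument with the given first-stage strength.
    x = first_stage_strength * z + u
    y = beta * x + e
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

strong = iv_estimate(1.0)    # strong first stage: the bias stays small
weak = iv_estimate(0.02)     # weak first stage: the same slight invalidity
                             # is magnified into a wildly skewed estimate
print(round(strong, 2), round(weak, 2))
```

With a strong first stage the estimate sits close to the true value of 1.0; with a weak one, the identical small correlation between instrument and error inflates the estimate several-fold.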
Deaton notes that although randomized evaluations have become an important basis for policy recommendations because of their scientific character, their results should be generalized only with caution. For example, “an educational protocol that was successful when randomized across villages in India holds many things constant that would not be constant if the program were transported to Guatemala or Vietnam” (43).
Deaton concludes with a call to return to theory-driven experiments. He writes: “In the end, there is no substitute for careful evaluation of the chain of evidence and reasoning by people who have the experience and expertise in the field. The demand that experiments be theory-driven is, of course, no guarantee of success, though the lack of it is close to a guarantee of failure” (45).