Evaluation

Social frictions to knowledge sharing in rural India

There seems to be much enthusiasm today for efforts to improve access to information about poor people’s rights and entitlements. In a much debated recent example, Facebook’s “Free Basics” platform provides free access to a selected slice of the internet (including, of course, Facebook). In arguing for Free Basics, Mark Zuckerberg says that “everyone … deserves access to the tools and information that can help them to achieve all those other public services, and all their fundamental social and economic rights.” I think we would all agree; less obvious is whether Free Basics will help do that. Critics argue that it is a “walled garden” approach—indeed, a threat to net neutrality. There have been proposals for other options using subsidized internet data packs, as in the proposal for India made recently by Nandan Nilekani and Viral Shah.

Neither the Facebook proposal nor that of Nilekani and Shah includes explicit pro-poor targeting. Is that needed? It might be argued that it is likely to be the poor who are least connected now, so the gains will automatically be greater for them. Against this, those who have the hardware and are currently connected are less likely to be poor and will probably be in the best position to benefit from these initiatives, including enjoying any new subsidies.

Before we decide on Free Basics versus subsidized data packs, or some other option, we should see how well information spreads at present. There is already lots of “public information” out there relevant to poor people in India, and there are various dissemination channels. While there may well be frictions in knowledge diffusion, associated with illiteracy and caste-based social exclusion, how important are they? Are the poor still sufficiently well connected socially to tap into the flow of knowledge, or does poverty come with social exclusion, including exclusion from information about programs designed to help poor people? Is a more explicitly targeted approach called for? An understanding of the sources of current inequality in information access is a precondition for thinking seriously about policies.

Using edutainment to learn about knowledge diffusion

The use of entertaining media—“edutainment” as Eliana La Ferrara dubs it in her paper “Mass Media and Social Change”—is attracting attention as a means of both directly informing poor people of their rights and entitlements and changing preferences and how existing communities operate. Such interventions can also provide a lens on existing processes of knowledge diffusion.

In a new paper, “Social Frictions to Knowledge Diffusion,” with Arthur Alik-Lagrange, I have used an edutainment intervention to identify key aspects of how knowledge is shared within villages in rural Bihar (a relatively poor state of about 100 million people in eastern India). We show how an information campaign can throw light on the extent to which information is shared within villages. The campaign we studied used an entertaining fictional movie to teach people their rights under India’s National Rural Employment Guarantee Act (NREGA) (a motivating example used by Nilekani and Shah). NREGA created a justiciable “right-to-work” for all rural households in India. The most direct and obvious way NREGA tries to reduce poverty is by providing extra employment in rural areas on demand. This requires an explicit effort to empower poor people, who must take deliberate unilateral actions to demand work on the scheme from local officials.

In a book I wrote with Puja Dutta, Rinku Murgai and Dominique van de Walle, “Right to Work?,” we found that most men and three-quarters of women had heard about NREGA, but that most were unaware of their rights and entitlements under the scheme. Given that about half the adults in rural Bihar are illiterate, a movie made sense as an information intervention. The setting and movie are described in Right to Work? and you can see the movie (audio in Hindi) on my website, economicsandpoverty.com.

The movie was tailored to Bihar’s specific context. Professional actors performed an entertaining and emotionally engaging story whose purpose was to provide information on how the scheme works, who can participate and how to go about participating. The story line centered on a temporary migrant worker returning to his village from the city to see his wife and baby daughter. He learns that there is NREGA work available in the village, even though it is the lean season, so he can stay there with his family and friends rather than return to the city to find work. The intention was that the audience would identify strongly with the central characters.

With the aim of promoting better knowledge about NREGA in this setting, the movie was randomly assigned to sampled villages, with a control group not receiving the movie. Knowledge about NREGA was assessed in both treatment and control villages. Residents were encouraged to watch the movie, but not (of course) compelled to do so. Some watched it and some did not. The new paper studies the impacts on knowledge, and the channel of that impact—notably whether it was purely through the direct effect of watching the movie or whether it was through knowledge sharing within villages.

There is a methodological challenge here, namely how to identify the knowledge gains (if any) for those in the assigned villages who did not actually watch the movie. We postulate a latent process of knowledge diffusion among households within the village. An individual’s knowledge reflects both this process and a latent individual effect representing that person’s “connectedness.” The latter is assumed to be time invariant, since it depends on long-standing networks of association, reflecting how each individual fits within the village social structure (including caste position) and his or her ability to process new information. Having two observations within each household allows us to obtain an estimate that is robust to latent heterogeneity in household factors, while exploiting differences over time makes the method robust to latent individual effects as well.
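As a stylized sketch of the kind of setup this implies (the notation below is my own simplification for illustration, not the paper’s exact specification):

```latex
% K_{ihvt}: measured knowledge of individual i in household h, village v, time t
% T_{ihvt}: direct exposure (i watched the movie)
% \bar{T}_{vt}: village-level exposure, carrying within-village spillovers
K_{ihvt} \;=\; \alpha_{hv} \;+\; \mu_{ihv} \;+\; \beta\, T_{ihvt}
         \;+\; \gamma\, \bar{T}_{vt} \;+\; \varepsilon_{ihvt}
```

In a model of this form, differencing between the two individuals observed within a household sweeps out the household effect, and differencing over time sweeps out the time-invariant connectedness term.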

Socially differentiated knowledge spillovers

We find robust evidence of spillover effects, which account for about one third of the average impact of the movie on knowledge about NREGA’s key wage and employment provisions. While knowledge sharing is evident, poorer people, by various criteria, appear to be less well connected, and so benefit less from the spillover effect—relying more on direct exposure to the intervention.
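As a simple illustrative accounting (my notation, not the paper’s estimator), the average knowledge impact in a treated village can be split between those who watched and those who did not, where any impact on non-watchers must come through within-village knowledge sharing:

```latex
% s_v: share of village v that watched the movie; the impact on
% non-watchers arises only through within-village spillovers
\overline{\Delta K}_{v} \;=\; s_v\, \overline{\Delta K}^{\,\text{watchers}}
                        \;+\; (1 - s_v)\, \overline{\Delta K}^{\,\text{non-watchers}}
```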

Our key finding is that the knowledge diffusion process is far weaker for disadvantaged groups, defined in terms of caste, landholding, literacy, or consumption poverty. For poor people, it appears that the direct effect of watching the movie is all that really matters to learning about NREGA. There is also some indication of negative spillover effects for illiterate and landless households, suggesting the strategic spread of misinformation.

More effective pro-poor knowledge diffusion does not, of course, assure an effective public response on the service supply side. Another paper, “Empowering Poor People through Public Information?,” shows that the (direct and indirect) knowledge gains from the movie did rather little to assure a more responsive program. Right to Work? documents a number of specific, fixable deficiencies in the responsiveness of NREGA in Bihar to the needs of poor people.

These research findings confirm that efforts are needed to improve the access of poor people to knowledge about public services that can help them, and that edutainment can work. The research also suggests that such efforts need to be directly targeted to poor groups, rather than relying on prevailing processes of knowledge diffusion, which may simply reflect, and reinforce, existing inequities.

(First posted on the World Bank’s Development Impact blog; 1/19/2016.)

The ethics of evaluation

More thought has been given to the validity of the conclusions drawn from development impact evaluations than to the ethical validity of how the evaluations were done. This is not an issue for all evaluations. Sometimes an impact evaluation is built into an existing program such that nothing changes about how the program works. The evaluation takes as given the way the program assigns its benefits. So if the program is deemed to be ethically acceptable then this can be presumed to also hold for the method of evaluation. (I leave aside ethical issues in how evaluations are reported and publication biases.) We can dub these “ethically benign evaluations.”

Another type of evaluation deliberately alters the program’s (known or likely) assignment mechanism—who gets the program and who does not—for the purpose of the evaluation. Then the ethical acceptability of the intervention does not imply that the evaluation is ethically acceptable. Call these “ethically contestable evaluations.” The main examples in practice are randomized control trials (RCTs). Scaled-up programs almost never use randomized assignment, so the RCT has a different assignment mechanism, and this may be contested ethically even when the full program is fine.

A debate has emerged about the ethical validity of RCTs. This has been brewing for some time, but there has been a recent flurry of attention to the issue, stimulated by a New York Times post last week by Casey Mulligan and various comments, including an extended reply by Jessica Goldberg. Mulligan essentially dismisses RCTs as ethically unacceptable on the grounds that some of those to whom a program is assigned for the purpose of evaluation—the “treatment group”—will almost certainly not need it, or will benefit little, while some in the control group will. As an example, he endorses Jeff Sachs’s arguments as to why the Millennium Villages project was not set up as an RCT. Goldberg defends the ethical validity of RCTs against Mulligan’s critique, arguing, first, that randomization can be defended as ethically fair given limited resources and, second, that even if one still objects, the gains from new knowledge can outweigh the objections.

I have worried about the ethical validity of some RCTs, and I don’t think development specialists have given the ethical issues enough attention. But nor do I think the issues are straightforward. So this post is my effort to make sense of the debate.

Ethics is a poor excuse for lack of evaluative effort. For one thing, there are ethically benign evaluations. But even focusing on RCTs, I doubt if there are many “deontological purists” out there who would argue that good ends can never justify bad means and so side with Mulligan, Sachs and others in rejecting all RCTs on ethical grounds. That is surely a rather extreme position (and not one often associated with economists). It is ethically defensible to judge processes in part by their outcomes; indeed, there is a long tradition of doing so in moral philosophy, with utilitarianism as the leading example. It is not inherently “unethical” to do a pilot intervention that knowingly withholds a treatment from some people in genuine need, and gives it to some people who are not, as long as this is deemed to be justified by the expected welfare benefits from new knowledge.

Far more problematic is either of the following:

  • Any presumption that an RCT is the only way we can reliably learn. That is plainly not the case, as anyone familiar with the full range of (quantitative and qualitative) tools available for evaluation will know.
  • Any evaluation for which the expected gains from new knowledge cannot reasonably justify an ethically-contestable methodology.

The latter situation is clearly objectionable if it is seen to hold. But it is often hard to verify in development settings. Ethics has been much discussed in medical research. In that context, the principle of equipoise requires that there be no decisive prior case for believing that the treatment has an impact sufficient to justify its cost. (This is David McKenzie’s sensible modification of clinical equipoise to fit the types of programs in discussion here.) By this reasoning, only if we are sufficiently ignorant about the likely gains relative to costs should we evaluate further. Implementing such an ethical principle may not be easy, however. In the context of antipoverty or other public programs, a priori (theoretical and/or empirical) arguments can often be made both for and against believing ex ante that impact is likely. A clever researcher can often create a convincing straw man to suggest that some form of equipoise holds and that the evaluation is worth doing. While this cannot be prevented, we should at least demand that the case is made and that it stands up to scholarly public scrutiny. That is clearly not the norm at present.

It has often been argued that whenever rationing is required—when there is not enough money to cover everyone—randomized assignment is a fair solution. (Goldberg makes this claim, though I have heard it often. Indeed, I have made this argument a few times with government counterparts in attempting to convince them of the merits of randomization.) In practice, this is clearly not the main reason that randomistas randomize. But should it convince the non-believers? It can be accepted when information is very poor, or when allocative processes are skewed against those in need. In some development applications we may know very little ex ante about how best to assign participation to maximize impact. But when alternative allocations are feasible (and if randomization is possible then that condition is evidently met) and one does have information about who is likely to benefit, then surely it is fairer to use that information and not randomize, at least not unconditionally.

Conditional randomization can help relieve ethical concerns. One first selects eligible types of participants based on prior knowledge about likely gains, and only then randomly assigns the intervention, given that not all can be covered. For example, if one is evaluating a training program, or a program that requires skills for maximum impact, one would reasonably assume (backed up by some evidence) that prior education and/or experience will enhance impact, and design the evaluation accordingly. This has ethical advantages over simple randomization when there are priors about likely impacts.
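As a minimal sketch of what such conditional randomization might look like (the variable names and the eligibility rule here are hypothetical illustrations, not taken from any actual evaluation):

```python
import random

# Hypothetical sketch: screen candidates on prior knowledge about likely
# gains (here, a made-up schooling threshold), then randomize only within
# the eligible pool, since not all of them can be covered.

def conditional_randomize(units, eligible, n_treat, seed=0):
    pool = [u for u in units if eligible(u)]   # step 1: select on priors
    rng = random.Random(seed)
    treated_ids = set(rng.sample([u["id"] for u in pool],
                                 min(n_treat, len(pool))))
    return {u["id"]: "treatment" if u["id"] in treated_ids else "control"
            for u in pool}                     # step 2: randomize within pool

# e.g., a training program expected to work best for those with schooling:
candidates = [{"id": i, "years_schooling": s}
              for i, s in enumerate([4, 9, 12, 2, 10, 8])]
assignment = conditional_randomize(candidates,
                                   lambda u: u["years_schooling"] >= 8,
                                   n_treat=2)
print(assignment)
```

The point of the design is visible in the code: the randomization is fair conditional on eligibility, while the eligibility screen itself uses whatever prior information exists about likely impact.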

But there is a catch. The set of things observable to the evaluator is typically only a subset of what is observable on the ground (such information asymmetry is, after all, the reason for randomizing in the first place). At the local level, there will typically be more information—revealing that the program is being assigned to some who do not need it, and withheld from some who do. The RCT may be ethically unacceptable at (say) the village level. But then whose information should decide the matter? It may be seen as quite lame for the evaluator to plead “I did not know” when others do in fact know very well who is in need and who is not.

Goldberg reminds us of another defense often heard, namely that RCTs can use what are called “encouragement designs.” The idea here is that nobody is prevented from accessing the primary service of interest (such as schooling); instead, the experiment randomizes access to some form of incentive or information. This may help relieve ethical concerns for some observers, but it clearly does not remove them—it merely displaces them from the primary service of interest to a secondary space. Ethical validity still looms as a concern when any “encouragement” is being deliberately withheld from some people who would benefit and given to some who would not.

While ethical validity is a legitimate concern in its own right, it also holds implications for other aspects of evaluation validity. The ethical acceptability of RCTs varies from one setting to another. One can get away with an RCT more easily with NGOs than with governments, and with small interventions, preferably in out-of-the-way places. (By contrast, imagine a government trying to justify why some of its under-served rural citizens were randomly chosen not to get new roads or grid connections on the grounds that this will allow it to figure out the benefits to those who do get them.) An exclusive reliance on randomization for identifying impacts will likely create a bias in our knowledge in favor of the settings and types of interventions for which randomization is feasible; we will know nothing about the wide range of development interventions for which randomization is not an option. (I discuss this bias in inferences about development impact further in “Should the Randomistas Rule?”.) Given that evaluations are supposed to fill our knowledge gaps, this must be a concern even for those who think that consequences trump concerns about processes.

If evaluators take ethical validity seriously there will be implications for RCTs. Some RCTs may have to be ruled out as simply unacceptable. For example, I surely cannot be the only person who is troubled on ethical grounds by the (innovative) study done in Delhi, India, by Marianne Bertrand et al. that randomized an encouragement to obtain a driver’s license quickly, on the explicit presumption that this would entail the payment of a bribe to obtain a license without knowing how to drive. (The study was conducted and funded by the World Bank’s International Finance Corporation, and it was published in a prestigious economics journal.) The study confirmed that the process of testing and licensing was not working well even for the control group. But the RCT put even more drivers on Delhi roads who did not know how to drive, adding to the risk of accidents. The gain from doing so was a clean verification of the claim that corruption is possible in India and has real effects, though I was not aware of any prior doubt about the truth of that claim.

There may well be design changes to many RCTs that could assure their ethical validity, as judged (say) by review boards. One might randomly withhold the option of treatment for some period of time, after which it would become available; but this would need to be known by all in advance, and one might reasonably argue that some form of compensation would be justified by the delay. Adaptive randomizations are getting serious attention in biomedical research; for example, one might adapt the assignment to treatment of new arrivals along the way, in the light of evidence collected on covariates of impact. (The U.S. Food and Drug Administration issued guidelines a few years ago.)
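One simple response-adaptive rule, sketched below purely for illustration (it is not the FDA’s guidance nor any actual trial protocol), tilts the assignment of new arrivals toward whichever arm appears to be performing better as outcomes accrue:

```python
import random

# Illustrative response-adaptive assignment for a binary outcome: the
# probability of assigning a new arrival to treatment rises with the
# treatment arm's observed success rate relative to control, while a
# floor keeps both arms in play so the comparison never collapses.

def adaptive_assign(history, rng, floor=0.2):
    def success_rate(arm):
        outcomes = [y for a, y in history if a == arm]
        return (sum(outcomes) + 1) / (len(outcomes) + 2)  # add-one smoothing
    p_treat = success_rate("T") / (success_rate("T") + success_rate("C"))
    p_treat = min(max(p_treat, floor), 1 - floor)         # keep both arms alive
    return "T" if rng.random() < p_treat else "C"

rng = random.Random(1)
history = []                          # (arm, outcome) pairs as they accrue
for _ in range(50):                   # simulate 50 sequential arrivals
    arm = adaptive_assign(history, rng)
    outcome = int(rng.random() < (0.6 if arm == "T" else 0.4))  # fake response
    history.append((arm, outcome))
```

The ethical appeal is that fewer participants end up assigned to the worse arm as evidence accumulates; the statistical price, as noted below, is that assignment is no longer independent of outcomes.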

The experiment might not then be as clean as in the classic RCT—the prized internal validity of the RCT in large samples may be compromised. But if that is always judged to be too high a price then the evaluator is probably not taking ethical validity seriously.

Martin Ravallion

(First posted on the World Bank’s Development Impact blog.)
