There's another reason some people might not like RCTs: they're remarkably good at calling the bluff of useless interventions! Imagine a government minister launching their pet anti-poverty initiative, or an ed-tech startup trying to get more of its custom tablets and apps into school classrooms. The "problem" with RCTs is that they can come back with "the confidence interval doesn't separate from zero and the effect size is so small you need to zoom in on the graph to see it", or in plain English: that money was wasted. It's the same reason homeopaths, astrologers, and faith healers go on about how RCTs are bad; their methods keep showing up as placebo at best whenever someone actually runs a trial.
To steelman the criticism of the main example study: enforcement was, by default, spotty. If a study comes out saying enforcement is effective, that might inspire more enforcement, causing more people to have their water shut off.
I think some of the anger at RCTs is because we humans get used to whatever has been happening. "Africans going without water" is old news, usually not worth paying attention to, while "Africans going without water while an economist seemingly approves of this happening" is dramatic and newsworthy.
And normal people barely understand RCTs, as you mentioned. They also think the world is far richer than it is, and that economists are rich. So they might think the economists are choosing not to help the Africans when they have the means to do so, and that not only don't they help directly, they say they look forward to watching them be miserable!
(The economists don't; they look forward to seeing whether there is a way to help, or what can be learned, but people don't get this unless you explain it.)
This, combined with rational ignorance, could explain a lot.
Excellent article.
Indeed, although I fear it's mostly preaching to the choir.
I love it when things are good actually. "X is good actually" is one of my favorite genres. And development RCTs are one of the most good-actually things.
I agree with this - it is kinda annoying that some people claim effect presence and sizes are obvious - they clearly aren't! But I do find myself swayed by Lant Pritchett in some (other) ways. It's a complicated picture.
The measurement bias stuff is pretty obvious; I think we can agree on that. Relatedly, but more interestingly, I wonder if there is a scientism/institutional-legitimacy problem? One reason I like GiveWell is that - contrary to certain critics - it doesn't (usually, I think) have this problem. The "assign numbers, quantify uncertainty" approach gets maligned because, when examined, the numbers don't carry the credibility they project. GiveWell recognizes that quantifying nevertheless works and is actually needed for honest truth-seeking.
But maybe RCTs sometimes make it too easy to credibility-launder? To be fair, grand-theory-based interventions may have this issue too, but at least they are already in a bind to justify themselves. Plus, it's alleged that dev econ is overrun with RCTs: that they are being conducted on everything and demanded to the exclusion of everything else. If this really *is* the case, it seems like a problem. Waste, in part, because studies then get demanded not because they are useful but because the credibility is needed. But also a "here's our RCT" problem: singular RCTs displacing all the complicated thinking that really ought to occur. "Quicker feedback loops" is one thing RCTs may hurt, for instance in actually updating on new circumstances - e.g. malaria vaccines, GiveDirectly's fraud problem, certain legislative actions.
I'm not an economist, so I have no real insight into the contextual/experiential side of this, and can't really evaluate whether these problems-in-theory have been problems in practice.
For the RNG: picking names out of a hat may not be particularly scalable, BUT, given RCTs have a limited number of groups, you could create the groups before assigning their roles, and assign the roles randomly in an obvious way. So everybody is randomly assigned a number between 1 and 4, for the 4 trial groups, and then you have a child roll a fair die in front of everybody (or use some other public, unimpeachable method) to determine whether group 1 is the control, test A, test B, or test C - and then keep rolling to label the remaining groups.
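Here's a minimal sketch of that two-stage scheme in Python, assuming a 6-sided die and 4 groups (the names and the helper functions are purely illustrative, not from any actual trial protocol): stage 1 privately gives each participant a group number, and stage 2 maps numbers to arms using publicly witnessed die rolls, re-rolling anything that doesn't land on a still-unlabeled group.

```python
import random

def assign_numbers(participants, n_groups=4, seed=None):
    """Stage 1: privately give every participant a random group number 1..n_groups."""
    rng = random.Random(seed)
    groups = {g: [] for g in range(1, n_groups + 1)}
    for p in participants:
        groups[rng.randint(1, n_groups)].append(p)
    return groups

def label_groups(groups, die_rolls):
    """Stage 2: map group numbers to trial arms using publicly witnessed d6 rolls.

    die_rolls is the sequence of rolls made in front of everyone. A roll of
    5 or 6, or of a number already assigned, is skipped -- i.e. "re-roll".
    Assumes enough rolls are supplied to label every group.
    """
    labels = ["control", "test A", "test B", "test C"]
    unlabeled = set(groups)           # group numbers still awaiting a label
    assignment = {}
    rolls = iter(die_rolls)
    for label in labels:
        roll = next(rolls)
        while roll not in unlabeled:  # skip 5s, 6s, and repeats
            roll = next(rolls)
        assignment[label] = groups[roll]
        unlabeled.remove(roll)
    return assignment

# Example: 6 is re-rolled, then 2 -> control, 4 -> test A, 1 -> test B, 3 -> test C.
groups = assign_numbers(["Asha", "Ben", "Chen", "Dara", "Eli", "Femi", "Gus", "Hana"])
print(label_groups(groups, die_rolls=[6, 2, 4, 1, 3]))
```

The nice property is that the sensitive step - which group gets which treatment - is the one done in public, so stage 1 can use any ordinary RNG without anyone needing to trust it.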