
Tuesday, September 20, 2011

Maybe we should put rats in charge of foreign aid research

The usual statistical procedures are designed to keep this possibility small. The convention is that we believe a result if there is only a 1 in 20 chance that the result arose at random. So if a researcher does a study that finds a positive effect of aid on growth and it passes this “1 in 20” test (referred to as a “statistically significant” result), we are fine, right?

Alas, not so fast. A researcher is very eager to find a result, and such eagerness usually involves running many statistical exercises (known as “regressions”). But the 1 in 20 safeguard applies only if you ran ONE regression. What if you ran 20? Even if there is no relationship between growth and aid whatsoever, on average you will get one “significant result” out of 20 by design. Suppose you report only the one significant result and don’t mention the other 19 unsuccessful attempts. You can run twenty different regressions by varying the definition of aid, the time periods, and the control variables. In aid research, the aid variable has been tried, among other ways, as aid per capita, logarithm of aid per capita, aid/GDP, logarithm of aid/GDP, aid/GDP squared, [log(aid/GDP) - aid loan repayments], aid/GDP*[average of indexes of budget deficit/GDP, inflation, and free trade], aid/GDP squared*[average of indexes of budget deficit/GDP, inflation, and free trade], aid/GDP*[quality of institutions], etc. Time periods have varied from averages over 24 years to 12 years to 8 years to 4 years. The list of possible control variables is endless. One of the most exotic I ever saw was: the probability that two individuals in a country belonged to different ethnic groups TIMES the number of political assassinations in that country. So it’s not so hard to run many different aid and growth regressions and report only the one that is “significant.”
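
The arithmetic behind the “one out of 20” claim can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that is not in the text above: the 20 regressions are treated as independent tests, each with a 1 in 20 (0.05) chance of a spurious “significant” result. The function names are made up for this example.

```python
def expected_false_positives(num_regressions, alpha=0.05):
    """Average number of spuriously 'significant' results when aid has no effect."""
    return num_regressions * alpha


def prob_at_least_one_false_positive(num_regressions, alpha=0.05):
    """Chance that at least one regression passes the test by luck alone.

    Assumes the regressions are independent, which real specifications
    that reuse the same data are not; treat the number as a rough guide.
    """
    return 1.0 - (1.0 - alpha) ** num_regressions


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, expected_false_positives(n), prob_at_least_one_false_positive(n))
```

Under that independence assumption, 20 regressions give one false positive on average, and the chance of getting at least one is 1 - 0.95^20, or roughly 64 percent.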

This practice is known as “data mining.” It is NOT acceptable practice, but this is very hard to enforce since nobody is watching when a researcher runs multiple regressions. It is seldom intentional dishonesty on the researcher’s part. Because of our non-rat-like propensity to see patterns everywhere, it is easy for researchers to convince themselves that the failed exercises were just done incorrectly, and that they finally found the “real result” when they get the “significant” one. Even more insidious, the 20 regressions could be spread across 20 different researchers. Each obediently runs only one pre-specified regression; 19 of them do not publish a paper because they found no significant result, but the 20th does publish a spuriously “significant” finding (this is known as “publication bias”).
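
The publication bias story can be simulated directly. The sketch below is an illustration, not anyone’s actual research design: it invents 50 countries of pure-noise “growth” data, lets 20 hypothetical researchers each test one pure-noise “aid” variable against it, and counts how many would have a “significant” result to publish. The sample size, the random seed, and the use of a large-sample Fisher z approximation for the p-value are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist


def p_value_no_relationship(x, y):
    """Two-sided p-value for the correlation of x and y (Fisher z approximation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def published_findings(rng, num_researchers=20, num_countries=50, alpha=0.05):
    """Count researchers who find a 'significant' aid effect in pure noise."""
    growth = [rng.gauss(0, 1) for _ in range(num_countries)]
    published = 0
    for _ in range(num_researchers):
        # Each researcher's aid variable is random and unrelated to growth.
        aid = [rng.gauss(0, 1) for _ in range(num_countries)]
        if p_value_no_relationship(aid, growth) < alpha:
            published += 1
    return published


def simulate(num_literatures=1000, seed=0):
    """Repeat the 20-researcher experiment many times and summarize."""
    rng = random.Random(seed)
    counts = [published_findings(rng) for _ in range(num_literatures)]
    share_with_a_paper = sum(c > 0 for c in counts) / num_literatures
    average_papers = sum(counts) / num_literatures
    return share_with_a_paper, average_papers


if __name__ == "__main__":
    share, average = simulate()
    print("share of literatures with a spurious paper:", share)
    print("average spurious papers per 20 researchers:", average)
```

Every published finding in this simulation is spurious by construction, since aid and growth are generated independently. A reader who sees only the published papers has no way to tell that from the papers alone.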
