Saturday, 5 February 2011

Instrumental Variables

Marginal Revolution recently linked to this paper, which claims that overuse of Instrumental Variables (IVs) in econometric studies has made them less useful. So far as I can tell, the argument given in the paper is exactly backwards... what am I missing?

For the uninitiated, an IV is a variable which is correlated with the thing we want to study but is unequivocally not caused by it. Say we want to study the effect of x on y: an instrumental variable z is useful if it is correlated with x and its only effect on y is through its effect on x. To use the language of Pearl's causal graphs, this is equivalent to saying that every path in the causal graph from z to y passes through x.

For example, suppose we want to measure the effect of a change in the price of apples on the demand for apples. It may seem difficult to do this directly, as an increase in demand is likely to lead to an increase in supply, and so price is itself affected by demand. A potential instrumental variable is the weather. If the weather is favourable for growing apples then more apples will end up being grown, and this is presumably independent of demand - the good weather produces a predictable increase in supply (and therefore a change in price), and we can measure the effect of that change on demand.
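To make the apple example concrete, here is a small Python simulation. Everything in it is invented for illustration: the coefficients, the function names, and the assumption that an unobserved "taste" shock pushes up both price and quantity. It sketches the simplest IV estimator (the ratio of two covariances, often called the Wald estimator), and is not taken from the paper.

```python
import numpy as np


def simulate_apples(n=100_000, true_effect=-2.0, seed=0):
    """Toy apple market. All coefficients are made-up assumptions."""
    rng = np.random.default_rng(seed)
    weather = rng.normal(size=n)  # instrument z: good weather, more apples
    taste = rng.normal(size=n)    # unobserved demand shock (the confounder)
    # Good weather lowers the price; a taste for apples raises it.
    price = -1.0 * weather + 1.0 * taste + rng.normal(size=n)
    # Quantity responds to price, and also directly to taste.
    quantity = true_effect * price + 3.0 * taste + rng.normal(size=n)
    return weather, price, quantity


def ols_slope(x, y):
    """Naive regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(z, x, y):
    """IV (Wald) estimate of the effect of x on y, using z as instrument."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


if __name__ == "__main__":
    weather, price, quantity = simulate_apples()
    print("true effect: -2.0")
    print("naive OLS  :", ols_slope(price, quantity))
    print("IV estimate:", iv_slope(weather, price, quantity))
```

Under these assumptions the naive regression is pulled away from the true effect, because taste moves price and quantity together, while the IV estimate should land close to -2: weather only reaches quantity by way of price.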

Now, the authors of this paper claim that:

A Tragedy of the Commons has led to overuse of instrumental variables and a depletion of the actual stock of valid instruments for all econometricians. Each time an instrumental variable is shown to work in one study, that result automatically generates a latent variable problem in every other study that has used or will use the same instrumental variables, or another correlated with it, in a similar context. We see no solution to this. Useful instrumental variables are, we fear, going the way of Atlantic Cod.

As I said, I think this is exactly backwards. New papers which use these instrumental variables in a new context do not introduce the latent variable problem: they reveal a latent variable problem which already existed. The previous studies were already invalid. The new studies just reveal the fact.

For example, imagine that people buy more apples when they have high levels of vitamin D in their blood (because apples are a substitute for fish oil). Then you have to correct for the effect of the sun on vitamin D levels when you're using the weather as an instrument for the price of apples. The problem, though, is that the weather was never a good IV for the apple study in the first place. The fact that a new study appears to demonstrate this is not a "Tragedy of the Commons" in any meaningful sense - the original study has not been made any worse, we've just found out that it was bad.
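A quick simulation makes the same point. This is only a sketch under assumptions I've invented (the coefficients, the names, and a hypothetical direct_sun_effect parameter for the vitamin D channel). What it shows is that the bias depends on how the data were generated, not on whether anyone has published the vitamin D study yet.

```python
import numpy as np


def wald_iv(z, x, y):
    """IV (Wald) estimate of the effect of x on y, using z as instrument."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def simulate(direct_sun_effect, n=200_000, true_effect=-2.0, seed=1):
    """Toy apple market in which sunshine may also reach demand via vitamin D.

    All coefficients are illustrative assumptions, not estimates.
    """
    rng = np.random.default_rng(seed)
    sun = rng.normal(size=n)              # the would-be instrument
    taste = rng.normal(size=n)            # unobserved demand shock
    vitamin_d = sun + rng.normal(size=n)  # sunshine raises vitamin D
    price = -1.0 * sun + taste + rng.normal(size=n)
    quantity = (true_effect * price
                + direct_sun_effect * vitamin_d  # second path from sun to demand
                + 3.0 * taste
                + rng.normal(size=n))
    return sun, price, quantity


if __name__ == "__main__":
    # World A: sunshine only matters through price, so the instrument is valid.
    print("no vitamin D path  :", wald_iv(*simulate(direct_sun_effect=0.0)))
    # World B: the vitamin D path exists, whether or not anyone has written it up.
    print("with vitamin D path:", wald_iv(*simulate(direct_sun_effect=0.5)))
```

In the second world the estimate is off from the true effect of -2 by roughly the size of the direct path, and it is off by that amount from the moment the data are collected. Publishing the vitamin D study changes nothing about the number; it only changes whether we know to distrust it.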

On the other hand, Alex Tabarrok, who is more of an economist than I am, appears to be taking the paper vaguely seriously, and not to have noticed this. As I said before: am I missing something?

