Some Stats on Obesity vs Vegetable Oil
Alright, so yesterday I included a really quick regression of obesity rate on vegetable oil consumption. Today I had time to do more, and it doesn’t look too good for our theory here.
If you recall, here is our naive initial regression.
The most obvious confounder that I can think of is GDP per capita. I used the PPP-adjusted figures.
I think this is a pretty strong showing for GDP, and obviously not so much for vegetable oil consumption, as an explanation for obesity.
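To make the confounding point concrete, here is a minimal sketch in Python on made-up data (not the dataset linked in this post). It assumes a world where GDP drives both oil consumption and obesity and oil has no direct effect at all. The variable names, coefficients, and sample size are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # arbitrary synthetic sample size, not the number of countries in the post

# ASSUMPTION: synthetic data where GDP causes both variables
# and vegetable oil has zero direct effect on obesity.
log_gdp = rng.normal(9.0, 1.0, n)
veg_oil = 2.0 * log_gdp + rng.normal(0.0, 1.0, n)
obesity = 3.0 * log_gdp + rng.normal(0.0, 1.0, n)

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slope_1, ...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

naive = ols(obesity, veg_oil)                # obesity ~ oil
controlled = ols(obesity, veg_oil, log_gdp)  # obesity ~ oil + GDP

print("naive oil slope:    ", round(naive[1], 3))
print("oil slope with GDP: ", round(controlled[1], 3))
print("GDP slope:          ", round(controlled[2], 3))
```

In this toy setup the naive oil slope comes out clearly positive even though oil does nothing, and it shrinks toward zero once GDP is in the regression. That is the same qualitative pattern as above, though it says nothing about whether the real data work this way.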
I think that alone should be enough for us to say that there is no real chance of proving anything here. So the following is probably only really useful as an exercise in bad stats.
I still messed around though. I recalled using instrumental variables in my econometrics class, so I gave that a whirl.
Basically, if you have
Y_Obesity = B * X_VeggieConsumption + e
and you say there is a third variable that causes X, is related to Y only through X, and is not correlated with the error term, you might try using it as an instrument. I think this is really fast and loose, but I did it anyway. Here’s a good source for learning about this. GDP actually fits the bill here, surprisingly.
“These are the requirements of an IV: 1) they can’t correlate with the error (exogeneity), and 2) they do correlate with X (education).”
So GDP explains vegetable oil consumption very significantly, and:
GDP is not significantly correlated with the residuals of Obesity = B * VeggieConsumption + e
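Here is a rough sketch of those two checks in Python, again on synthetic data rather than the real dataset. The names `z`, `x`, `y`, and `u` are hypothetical stand-ins for the instrument, the regressor, the outcome, and an unobserved confounder, and the data are built so that `z` is a valid instrument by construction. One caveat worth knowing: the covariance between the instrument and the OLS residuals is algebraically equal to cov(z, x) times the gap between the IV and OLS slopes, so the residual check is really telling you how much IV and OLS disagree. It is an informal check, not a test of exogeneity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# ASSUMPTION: synthetic data where z is exogenous by construction.
z = rng.normal(size=n)                      # instrument (think: log GDP per capita)
u = rng.normal(size=n)                      # unobserved confounder
x = 1.5 * z + u + rng.normal(size=n)        # endogenous regressor
y = 0.5 * x + 2.0 * u + rng.normal(size=n)  # outcome

def fit(target, regressor):
    """OLS of target on a constant and one regressor; returns (coefficients, residuals)."""
    X = np.column_stack([np.ones(len(target)), regressor])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta, target - X @ beta

# Check 1 (relevance): first-stage regression of x on z.
first_stage_beta, first_stage_resid = fit(x, z)
r2 = 1.0 - first_stage_resid.var() / x.var()
f_stat = r2 / (1.0 - r2) * (n - 2)  # F statistic for a single regressor

# Check 2 (informal): correlation of z with the OLS residuals of y on x.
ols_beta, ols_resid = fit(y, x)
resid_corr = np.corrcoef(z, ols_resid)[0, 1]

# Identity: cov(z, OLS residuals) = cov(z, x) * (b_iv - b_ols).
cov_zx = np.cov(z, x)[0, 1]
cov_zy = np.cov(z, y)[0, 1]
b_iv = cov_zy / cov_zx
gap = cov_zx * (b_iv - ols_beta[1])

print("first-stage F:", round(f_stat, 1))
print("corr(z, OLS residuals):", round(resid_corr, 3))
print("cov(z, resid) vs identity:", np.cov(z, ols_resid)[0, 1], gap)
```

Note that in this toy example the instrument is valid and still shows up as correlated with the OLS residuals, precisely because OLS is biased there. So a near-zero correlation mostly means the IV and OLS slopes will land close together.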
When you use GDP as an instrument, you get this:
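For anyone who wants to try the same kind of thing, two-stage least squares can be done by hand with plain NumPy. This is only a sketch on synthetic data, not the regression from this post. It assumes the instrument `z` affects `y` only through `x`, which is exactly the assumption that is shaky for GDP and obesity. All names and coefficients are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# ASSUMPTION: synthetic data with a valid instrument and a true effect of 0.5.
z = rng.normal(size=n)                      # instrument
u = rng.normal(size=n)                      # unobserved confounder
x = 1.5 * z + u + rng.normal(size=n)        # endogenous regressor
y = 0.5 * x + 2.0 * u + rng.normal(size=n)  # outcome

def design(v):
    """Design matrix with an intercept column and one regressor."""
    return np.column_stack([np.ones(len(v)), v])

def lstsq(X, target):
    """Least-squares coefficients for target ~ X."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta

# Stage 1: regress x on the instrument and keep the fitted values.
x_hat = design(z) @ lstsq(design(z), x)

# Stage 2: regress y on the fitted values from stage 1.
beta_2sls = lstsq(design(x_hat), y)

# Plain OLS for comparison; biased here because u drives both x and y.
beta_ols = lstsq(design(x), y)

print("OLS slope: ", round(beta_ols[1], 3))
print("2SLS slope:", round(beta_2sls[1], 3))
# Standard errors taken from a naive second-stage regression would be wrong;
# a dedicated IV routine corrects them.
```

With a valid instrument, the 2SLS slope should land near the true 0.5 while OLS overshoots. If the instrument also affects the outcome directly, 2SLS is biased too.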
I still highly doubt that there’s anything here, for a ton of reasons, but mainly because GDP on its own did best out of everything I tried.
It still seemed worth trying though. I’d say I’d definitely revise down my estimates of the chances of vegetable oil consumption causing obesity after doing all this.
Here’s a link to the data, and here it is in picture form because I have no clue if Google Drive will cooperate.