Nate Silver, confusion between prediction and causation

A huge problem I often see in health research is the inability to formulate a clear research question. If you start a project without a clear question, it’s very easy to apply the wrong methods and misinterpret the results. A very common error I see is confusing prediction with causation. A telltale sign is when people adjust their prediction model for confounders.

Nate Silver apparently has the same problem:

The way you run and think about analyses is ENTIRELY different depending on whether your goal is prediction or causation. Read “A Second Chance to Get Causal Inference Right: A Classification of Data Science Tasks” if you want more clarity on this issue.
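To make the distinction concrete, here is a minimal simulated sketch. It is not taken from any analysis discussed in this post: the variables `x`, `y`, `z`, the helper `ols`, and all the numbers are made up for illustration. A confounder `z` drives both `x` and `y`, and the true causal effect of `x` on `y` is set to 1.0 by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (illustrative numbers only):
# z confounds the x -> y relationship; the true causal effect of x on y is 1.0.
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = 1.0 * x + 3.0 * z + rng.normal(size=n)


def ols(columns, target):
    """Least-squares coefficients with an intercept (intercept comes first)."""
    design = np.column_stack([np.ones(len(target)), *columns])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


# Model without the confounder: y ~ x
slope_unadjusted = ols([x], y)[1]

# Model adjusted for the confounder: y ~ x + z
slope_adjusted = ols([x, z], y)[1]

print(f"slope of x, unadjusted: {slope_unadjusted:.2f}")
print(f"slope of x, adjusted:   {slope_adjusted:.2f}")
```

In this setup the unadjusted slope lands near 2.2 and the adjusted slope near 1.0. If the goal is prediction, the unadjusted slope is not "wrong": it is simply the best linear summary of how `y` varies with `x`. If the goal is causation, only the adjusted slope answers the question. The same regression code serves two different questions, which is why the question has to come first.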

Another recent example:

Here Silver is equating importance with p-values. Rookie mistake. Even p-value advocates (which I am not) will be quick to point out this error. It’s hard to tell exactly which variable is the most important here because we’d need to know the scale of the variables. Going by the coefficients, though, reopening is equivalent to a $\frac{7.27}{0.46}=15.8$ difference in Trump’s margin of victory in a state. It is also equivalent to a $\frac{7.27}{0.81}=9.0$ swing in temperature (between winter and summer). It’s clear that reopening likely does play a large role, but seeing that requires the ability to interpret statistics properly. (Disclaimer: this model is ridiculously simplistic, and to be able to say any of these things would require a much MUCH more careful analysis.)
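The two ratios above are just the reopening coefficient divided by each of the other coefficients. A short sketch of that arithmetic follows; the coefficient values are the ones quoted in the text, while the variable names are my own labels and not the names used in Silver's model.

```python
# Coefficients as quoted in the text; variable names are illustrative labels.
coef_reopening = 7.27
coef_trump_margin = 0.46
coef_temperature = 0.81

# How many units of each other variable match the effect of reopening
# in this (very simplistic) linear model.
margin_equivalent = coef_reopening / coef_trump_margin
temperature_equivalent = coef_reopening / coef_temperature

print(f"Trump margin equivalent: {margin_equivalent:.1f}")      # 15.8
print(f"Temperature equivalent:  {temperature_equivalent:.1f}")  # 9.0
```

Neither ratio uses a p-value. Comparing coefficients on a common scale is what speaks to importance here, and that is only meaningful once the units of each variable are known.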

Jeremy A. Labrecque
Assistant professor, Epidemiology and causal inference

My research is on how we know what we know.