Inspired by Riley et al. and their BMJ article from last year, On the 12th Day of Christmas, a Statistician Sent to Me …, I’m going to try to come up with my own list of things to do when estimating causal effects. As I write this the page below is blank, so let’s see if I make it to twelve or if I will have to pretend there are fewer days of Christmas. Who needs lords-a-leaping anyway…
- 1st day of Christmas: Make sure your question is causal to begin with. Are you interested in what would happen when you intervene on the world or are you more interested in prediction or description?
- 2nd day of Christmas: Copying Riley et al. here. Make sure your question is clear. Think of your question before you even look at or think about the data. Write it out in counterfactual notation. Don’t be afraid to be clear that your question is causal! A causal question is also much more than just PICO. Think about how you would actually go about changing the exposure or treatment you’re studying. You might even want to use something like…
- 3rd day of Christmas: Target trial emulation. It won’t always save you. It won’t solve issues such as confounding. But it can help clarify your question and I have seen many, many examples where, if a study had used it, they would have avoided many self-inflicted errors in their analysis. See here for an example.
- 4th day of Christmas: Sticking with the theme of good questions, think about the consistency assumption. What does it mean to set your exposure to a specific level? How would you do that in practice? If there’s more than one way to do it and those ways would lead to different outcomes, the consistency assumption is violated and you should consider what that might mean.
- 5th day of Christmas: Ok. Apparently it takes 4 days of Christmas just to get past the question asking part. Now that we have our question, let’s see if we can answer it. On Day 5, think about how you can use observed data to answer your counterfactual question. This is called “identification”. It’s the set of assumptions under which you can say that your estimate should be equal to the causal effect you’re trying to find. Most epidemiologists achieve this through adjusting for confounders but here is a great list of other ways you can do that.
- 6th day of Christmas: If you’re going the confounder control route you need to decide what you’re adjusting for. Use a directed acyclic graph informed by subject matter expertise to draw the causal structure surrounding the question you’re answering. The graph can tell you what you should and should not adjust for.
- 7th day of Christmas: Now that you know what you plan to adjust for, you should check the positivity assumption. Basically, you don’t want any combination of your confounders to perfectly predict who is exposed or unexposed. There are some relatively easy ways to check this.
- 8th day of Christmas: Measurement error. You’ve got it. We all got it. Take it seriously.
- 9th day of Christmas: Even if you’re using confounder control, there’s no reason to rely only on an outcome model. Use inverse probability of treatment weighting. Use standardization. Use TMLE to use both a treatment and outcome model. If they all give you the same answer, you can be more confident that your choice of model doesn’t affect your results.
- 10th day of Christmas: Remember when we talked about identification? Why limit yourself to one identification strategy when you can use more than one? This is called causal triangulation. It’s tricky, but if you can use different methods and get the same answer, then you might be more convinced of your answer.
- 11th day of Christmas: Exchangeability assumptions? Yeah, we never believe those. You can either do what everyone does and write the standard sentence “there’s always a possibility of unmeasured confounding” in your discussion. Or you can do something about it. People have come up with so many different types of bias analyses that let you use your subject matter knowledge to try and make more quantitative statements about how worried your readers should be about potential bias.
- 12th day of Christmas: Name the causal assumptions you are relying on in your manuscript. Show your readers you know what causal assumptions your analysis relies on (which will depend on your identification strategy) and show them that you’ve thought about them deeply. The number of papers that even mention assumptions like consistency and positivity
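
To make Days 2 and 5 a little more concrete, here is one way the question and its identification might be written out. This is only a sketch under assumed notation that the list above does not define: a binary exposure $A$, an outcome $Y$, a set of discrete measured confounders $L$, and $Y^{a}$ for the outcome that would be seen if exposure were set to $a$. The second line holds only under consistency, conditional exchangeability given $L$, and positivity, which is the confounder-control route and not the only one.

$$
\begin{aligned}
\text{Question (Day 2):}\quad & E[Y^{a=1}] - E[Y^{a=0}] \\
\text{Identification (Day 5):}\quad & E[Y^{a}] = \sum_{l} E[Y \mid A = a, L = l]\, P(L = l)
\end{aligned}
$$

The left-hand side of the second line is counterfactual and the right-hand side uses only observed data, which is the whole point of identification.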
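
The positivity check from Day 7 can be sketched in a few lines. This is a minimal illustration, assuming a small number of categorical confounders so that every stratum can simply be tabulated. The function name, the column names, and the toy data are all made up for the example. With continuous or many confounders you would more likely look at the distribution of estimated propensity scores instead.

```python
from collections import defaultdict


def positivity_violations(rows, exposure, confounders):
    """Return confounder strata where everyone is exposed or everyone is unexposed."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n unexposed, n exposed]
    for row in rows:
        stratum = tuple(row[c] for c in confounders)
        counts[stratum][1 if row[exposure] else 0] += 1
    # A zero in either cell means the stratum perfectly predicts exposure status
    return {s: tuple(n) for s, n in counts.items() if 0 in n}


# Hypothetical toy data: nobody who is both 65+ and a smoker is treated
data = [
    {"age65": 0, "smoker": 0, "treated": 1},
    {"age65": 0, "smoker": 0, "treated": 0},
    {"age65": 0, "smoker": 1, "treated": 1},
    {"age65": 0, "smoker": 1, "treated": 0},
    {"age65": 1, "smoker": 0, "treated": 1},
    {"age65": 1, "smoker": 0, "treated": 0},
    {"age65": 1, "smoker": 1, "treated": 0},
    {"age65": 1, "smoker": 1, "treated": 0},
]

violations = positivity_violations(data, "treated", ["age65", "smoker"])
print(violations)  # strata keyed by (age65, smoker), values are (n unexposed, n exposed)
```

On this toy data the only flagged stratum is the older smokers, where two people are unexposed and none are exposed.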
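
For Day 9, here is a rough sketch of comparing standardization with inverse probability of treatment weighting. It assumes a single binary confounder and fully nonparametric (saturated) models, in which case the two estimators agree exactly by construction. The data are invented and the function names are hypothetical. With real data and parametric models the estimates can differ, and that disagreement is what the comparison is meant to surface.

```python
from collections import Counter


def standardized_mean(rows, a):
    """Standardization: average stratum-specific outcome means over the confounder distribution."""
    n = len(rows)
    l_counts = Counter(l for l, _, _ in rows)
    total = 0.0
    for l, n_l in l_counts.items():
        ys = [y for (li, ai, y) in rows if li == l and ai == a]
        total += (n_l / n) * (sum(ys) / len(ys))
    return total


def iptw_mean(rows, a):
    """IPTW: weight each person with exposure a by 1 / P(A = a | L), then take a weighted mean."""
    l_counts = Counter(l for l, _, _ in rows)
    la_counts = Counter((l, ai) for l, ai, _ in rows)
    num = den = 0.0
    for l, ai, y in rows:
        if ai == a:
            w = l_counts[l] / la_counts[(l, ai)]  # inverse of the observed P(A = a | L = l)
            num += w * y
            den += w
    return num / den


def expand(l, a, n, n_events):
    """Build n rows of (confounder, exposure, outcome) with n_events outcomes equal to 1."""
    return [(l, a, 1)] * n_events + [(l, a, 0)] * (n - n_events)


# Hypothetical toy data: exposure is more common when L = 1, and so is the outcome
rows = expand(0, 1, 4, 2) + expand(0, 0, 4, 1) + expand(1, 1, 8, 6) + expand(1, 0, 4, 2)

rd_std = standardized_mean(rows, 1) - standardized_mean(rows, 0)
rd_iptw = iptw_mean(rows, 1) - iptw_mean(rows, 0)
print(round(rd_std, 3), round(rd_iptw, 3))
```

Both risk differences come out at 0.25 here, compared with a crude (unadjusted) difference of about 0.29.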
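
As one small example of the bias analyses mentioned on Day 11, the E-value of VanderWeele and Ding asks how strongly an unmeasured confounder would need to be associated with both exposure and outcome, on the risk ratio scale, to fully explain away an observed risk ratio. It is only one of many options and not necessarily the one the list above has in mind. The function name is hypothetical and the sketch handles a point estimate only.

```python
import math


def e_value(rr):
    """E-value for an observed risk ratio point estimate."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # for protective estimates, work with the reciprocal
    return rr + math.sqrt(rr * (rr - 1))


print(round(e_value(2.0), 2))
```

An observed risk ratio of 2 gives an E-value of about 3.41, and a null risk ratio of 1 gives an E-value of 1.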