Causal Inference and Variable Control

May 9, 2025

Course Description

The poor social scientist […] is rescued as to the (Hypothesis → Observation) problem by the statistician. Comforted by these “objective” inferential tools (formulas and tables), the social scientist easily forgets about the far more serious, and less tractable, (Theory → Hypothesis) problem, which the statistics text does not address.

– Paul Meehl (1990)

Many scientific theories are causal; they describe the direction and functional form of causal effects between two or more variables. In many fields, researchers study causal theories with observational data and try to infer the causes behind the correlations they observe. In such cases, you may have learned that it is important to control for “confounding variables”. However, it is often not clear from statistical texts what it means for a variable to be a “confounder”, nor is it clear how such variables should be selected. In other words, it is often not clear how we should move from a substantive theory about causes to a statistical hypothesis about observed correlations, and vice versa (the “Theory → Hypothesis” problem in Meehl, 1990).

In theory, there does not need to be a problem. There is a small set of formal rules that we can always use to decide which variables to control for, given our causal beliefs. However, many researchers are not familiar with these rules and instead choose variables to control for based on correlations in past research or model fit indexes, which can very quickly lead to erroneous causal inferences.

In this workshop, we will learn how to create strong derivation chains from a causal theory to a statistical hypothesis, and back. We will first practice how to formalize our causal beliefs into simple formal models called DAGs (directed acyclic graphs). We will then learn how to derive statistical hypotheses from a DAG, and we will learn which variables in the DAG should (and should not) be controlled for to test a specific causal relationship. Finally, we will learn what we can infer about a causal relationship from a significant hypothesis test, and we will discuss why causal inference based on observational data is hard even when we have a DAG and know how to test it. The workshop will include hands-on exercises to help you practice drawing DAGs, identifying statistical hypotheses implied by different DAGs, and setting up regression analyses in R to test the implied hypotheses.
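
To make the point about which variables should (and should not) be controlled for a little more concrete, here is a minimal simulation sketch. It is not part of the workshop materials: the workshop exercises use R, while this sketch uses plain Python so that it runs without extra packages. The two DAGs, the variable names, the effect sizes, and the seed are all assumptions chosen for illustration. In both scenarios the true causal effect of X on Y is zero.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def adjusted_slope(x, y, control):
    """Slope of y on x when `control` is also in the regression.

    Uses the Frisch-Waugh-Lovell result: residualize x and y on the
    control variable, then regress the residuals on each other.
    """
    return slope(residuals(control, x), residuals(control, y))

rng = random.Random(2025)  # fixed seed (arbitrary) for reproducibility
n = 5000

# Scenario 1 (assumed DAG): Z -> X and Z -> Y, no arrow from X to Y.
# Z is a confounder, so it SHOULD be controlled for.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]
naive_confounded = slope(x, y)
adjusted_confounded = adjusted_slope(x, y, z)

# Scenario 2 (assumed DAG): X -> C <- Y, with X and Y independent.
# C is a collider, so it should NOT be controlled for.
x2 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [rng.gauss(0, 1) for _ in range(n)]
c = [a + b + rng.gauss(0, 1) for a, b in zip(x2, y2)]
naive_collider = slope(x2, y2)
adjusted_collider = adjusted_slope(x2, y2, c)

print("confounder: naive =", round(naive_confounded, 2),
      "| adjusted for Z =", round(adjusted_confounded, 2))
print("collider:   naive =", round(naive_collider, 2),
      "| adjusted for C =", round(adjusted_collider, 2))
```

Under these assumptions, the unadjusted slope in the confounder scenario should land near 0.5 and move close to zero once Z is controlled for, whereas in the collider scenario the unadjusted slope should be close to zero and controlling for C should push it toward about -0.5. The same statistical step (adding a control variable) removes bias in one DAG and creates it in the other, which is why the choice has to follow from the causal model rather than from observed correlations or model fit.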


Prerequisites

  • A basic understanding of regression/linear model analysis is required for this workshop.
  • A basic understanding of R is preferable for this workshop.


Reading materials

Required

Optional

  • Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. Basic Books.

  • Hernán, M. A., & Robins, J. M. (2020). Causal inference: What if. Chapman & Hall/CRC.


Capacity

This course has a maximum capacity of 40 participants.


Time and Location

This workshop will be held on-site only at Eindhoven University of Technology on May 9, 2025. Details will be provided to all attendees by email after registration for the workshop.

The workshop runs from 9:30 to 16:30, with a lunch break from 12:30 to 13:30. Lunch will not be provided but can be purchased at the university canteen or the on-campus supermarket.


Registration

To register for this workshop, please complete the following form by April 22nd. Your registration will be considered finalized only after you receive a confirmation email. The registration link will remain open after this date if spots are still available. Registration Form


Instructors

Dr. Peder Isager

Peder Isager is an Associate Professor at Oslo New University College in Norway. He specializes in research methods, statistics, and replication studies. He is also a board member of the Methodology & Data Analysis committee at the Psychological Science Accelerator.