Bleatorial on blocks, causal analysis and statistical thinking

Stephen Senn

Published Nov 10, 2021

Background

Judea Pearl drew attention on Twitter to a preprint by Abhishek Umrawal that applied directed acyclic graphs (DAGs) to blocking in experimental designs. Although I disagree with some of what the preprint says, it has been useful in clarifying some matters for me. This blog represents the resulting tweetorial and might be called a bleatorial.

Confession and motivation

I am a complete beginner when it comes to causal analysis as developed by Judea Pearl. However, I am also an admirer of what he has achieved. My inspiration here is that I think that what the Rothamsted School achieved in experimental analysis is worth close examination and that there might be some aspects that could make causal analysis even better. I also cannot claim to be an expert on the extremely deep and beautiful theory of experimental design. I have been a user of it for many years but that is not the same thing.

The tweetorial

1. Thanks to Judea Pearl @yudapearl for drawing attention to this interesting paper, https://arxiv.org/pdf/2111.02306.pdf which I have found helpful in understanding some causal thinking. I shall now illustrate a particular problem related to fixed and random effects that it raises.

2. Have a look at Figures 1 and 2 in the paper. They suggest that if we can block for everything that is a direct ancestor of the outcome Y, we do not need to block for ancestors of the ancestors. This seems like a perfectly reasonable causal principle.

3. Now consider Appendix C, where we are invited to consider blocking for sex, age, weight, blood pressure and cholesterol, “factors” with “levels” (in stats speak) 2, 4, 5, 5, 5 respectively.

4. Suppose I can design a trial in which I can block for these and thousands of other factors known and unknown at least as successfully as blocking them individually.

5. What is this blocking factor? It’s “patient”. In cross-over trials, of which I have made a particular study (http://senns.uk/cticr2.html), every patient is their own control. You are your own perfect twin.

6. I will treat you on one occasion with one drug and on another occasion with another, thus controlling for all genetic factors and your total life history up to (but not beyond) the start of the trial.

7. How many levels does this factor “patient” have? As many levels as there are patients. However, I shall have two outcome observations per patient (one on each of two treatments, say A and B) so that does not matter.

8. So are your genes an “ancestor” of you or are you an “ancestor” of your genes? This depends on context. Is it really helpful to think of things this way or not? Maybe, maybe not. The important thing, however, is that as a blocking factor it doesn’t matter.

9. If I put “patient” in the model as an effect it accounts for all these things. In stats terms, I can, if I wish, put patients in the model as a “fixed” effect or a “random” one. With one A observation and one B observation per patient it does not matter.

10. Suppose that I am now informed that I can have only one period in which to observe patients. I will now have to run a parallel group trial. Patients get either A or B. I can’t put patient in the model as a factor with levels equal to the number of patients. (It’s “confounded”.)

11. What do I do? I declare it “random”. In terms of causal modelling, I treat it as an exogenous U variable. (I think.) But previously, it was an endogenous V variable. (The statistical distinction is not quite the same but that’s another tweetorial.)

12. OK, you may say, models depend on context and assumptions. So what? Fine, I answer. But what changed? Nothing at the level of deep mechanistic causation. What has changed is the design. I can only make progress by treating the patient effect as random.

13. That’s enough for now. In a follow-up tweetorial I shall consider incomplete blocks. That shows that simple examples, although useful, are far from covering what is necessary.
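
The contrast drawn in tweets 7 to 12 can be sketched in a few lines of Python. This is an illustration only, not anything taken from the preprint: the function names and the four-patient data sets are invented, and the "fixed effect" fit is done by sweeping out patient means, which is assumed here as a stand-in for fitting one dummy variable per patient.

```python
from statistics import mean


def within_patient_estimate(records):
    """Estimate the B - A effect with "patient" fitted as a fixed factor.

    records: list of (patient_id, treatment, outcome), treatment "A" or "B".
    Centring outcome and treatment indicator on each patient's own mean
    is equivalent to including one dummy per patient.
    Returns None when treatment is confounded with patient.
    """
    by_patient = {}
    for pid, trt, y in records:
        by_patient.setdefault(pid, []).append((1.0 if trt == "B" else 0.0, y))

    sxy = sxx = 0.0
    for obs in by_patient.values():
        x_bar = mean(x for x, _ in obs)
        y_bar = mean(y for _, y in obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    if sxx == 0.0:
        # No within-patient variation in treatment: nothing to estimate from.
        return None
    return sxy / sxx


def between_group_estimate(records):
    """Simple difference of treatment means (patient left to the error term)."""
    a = [y for _, trt, y in records if trt == "A"]
    b = [y for _, trt, y in records if trt == "B"]
    return mean(b) - mean(a)


# Hypothetical cross-over: every patient observed once on A and once on B.
crossover = [
    (1, "A", 10.0), (1, "B", 12.0),
    (2, "A", 20.0), (2, "B", 23.0),
    (3, "A", 15.0), (3, "B", 16.0),
    (4, "A", 30.0), (4, "B", 32.0),
]

# Hypothetical parallel group: one period only, so one treatment per patient.
parallel = [
    (1, "A", 10.0), (2, "A", 20.0),
    (3, "B", 16.0), (4, "B", 32.0),
]

print(within_patient_estimate(crossover), between_group_estimate(crossover))
print(within_patient_estimate(parallel), between_group_estimate(parallel))
```

In the balanced cross-over both routes give the same answer, the mean of the within-patient differences, which is the sense in which the fixed or random choice does not matter there. In the parallel group the within-patient route returns nothing, because treatment and patient are confounded, and the only comparison left is between patients, with the patient effect consigned to the random part of the model.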
