Die, Dichotomy
We have studied 21 435 unique randomized controlled trials (RCTs) from the Cochrane Database of Systematic Reviews (CDSR). Of these trials, 7224 (34%) have a continuous (numerical) outcome and 14 211 (66%) have a binary outcome. We find that trials with a binary outcome have larger sample sizes on average, but also larger standard errors and fewer statistically significant results. [1]
Cutting remarks
Year in, year out, for a length of time that is awarded only to statistical survivors (no, this is not about immortal time bias), I have been banging on about the stupidity, the criminal vandalism, the wanton destruction of information involved in dichotomisation. It not only inflates standard errors and increases necessary sample sizes, thereby blurring inferences while bloating budgets, delaying development and obliterating other opportunities, but it also rots brains, causing causal confusion via the number needed to trick.
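The inflation of standard errors is easy to demonstrate for oneself. A minimal simulation sketch (not taken from the paper; the sample size, effect size and median split are my illustrative assumptions) compares the z-statistic from analysing a normal outcome directly with that from a "responder" analysis after dichotomising at the pooled median:

```python
# Illustrative simulation (assumed values, not from the paper): a two-arm
# trial with a standard-normal outcome and a standardized effect of 0.3.
# We compare the z-statistic of the continuous analysis with that of the
# dichotomized ("responder") analysis, splitting at the pooled median.
import numpy as np

rng = np.random.default_rng(42)
n, delta, reps = 500, 0.3, 2000  # per-arm size, true effect, simulations
z_cont, z_bin = [], []
for _ in range(reps):
    ctrl = rng.normal(0.0, 1.0, n)
    trt = rng.normal(delta, 1.0, n)
    # Continuous analysis: difference in means over its standard error
    se_c = np.sqrt(ctrl.var(ddof=1) / n + trt.var(ddof=1) / n)
    z_cont.append((trt.mean() - ctrl.mean()) / se_c)
    # Dichotomized analysis: "responder" = outcome above the pooled median
    cut = np.median(np.concatenate([ctrl, trt]))
    p_c, p_t = (ctrl > cut).mean(), (trt > cut).mean()
    se_b = np.sqrt(p_c * (1 - p_c) / n + p_t * (1 - p_t) / n)
    z_bin.append((p_t - p_c) / se_b)

# Theory says the ratio of z-statistics should be about sqrt(2/pi) ~ 0.80
# for a median split, i.e. an asymptotic relative efficiency of 2/pi ~ 0.64.
print(np.mean(z_bin) / np.mean(z_cont))
```

The simulated ratio lands close to the classical value sqrt(2/pi), meaning the dichotomised trial would need roughly pi/2 (about 1.57) times as many patients to recover the lost precision.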
That dichotomisation of continuous measurement scales is silly is something, surely, on which Bayesians and frequentists can agree. Indeed, Frank Harrell, the well-known Bayesian statistician, has been exposing the folly of this habit for as long as I have.
A dram of data beats a peck of pontification
However, Erik van Zwet had a much better idea than Frank and me. Why not quantify the effect? Thanks to his ingenuity, perseverance and insight we now have a paper with Erik as lead author and with Frank and me as co-authors [1]. Furthermore, Erik has also created a Shiny app that statisticians can use to help discuss outcomes with life-scientist collaborators.
In Conclusion
This is what we say in the paper:
We would like to see dichotomization abandoned, but we are realists and instead offer two concrete suggestions. First, we provide a method to assess the sample size requirements with and without dichotomization (Section 3.4). Since no additional information is needed beyond the hypothesized “responder” proportions, this can be done with little effort. We hope that this will make some researchers reconsider their intention to dichotomize during the planning phase of the trial. Second, we provide a method that enables researchers to assess the loss of information (and its consequences in terms of sample size) as a result of dichotomization.
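To see how a planning comparison of this kind can work, here is a hedged sketch (my own back-of-the-envelope version, not the paper's formulas or Erik's app): given hypothesized responder proportions, it computes the per-arm sample size for the binary responder analysis with the standard two-proportion formula, and for the continuous analysis by converting the proportions to a standardized effect on an assumed underlying normal scale.

```python
# Back-of-the-envelope sketch (not the paper's method): per-arm sample
# sizes for a two-sided 5% test at 80% power, comparing (a) a binary
# "responder" analysis with hypothesized proportions p1 and p2 against
# (b) analysing the underlying outcome, assumed normal, whose
# standardized effect is then Phi^-1(p2) - Phi^-1(p1).
from statistics import NormalDist

nd = NormalDist()  # standard normal

def per_arm_n(p1, p2, alpha=0.05, power=0.80):
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    # (a) two-proportion comparison of responder rates
    n_bin = z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2
    # (b) two-sample comparison on the assumed latent normal scale
    delta = nd.inv_cdf(p2) - nd.inv_cdf(p1)
    n_cont = 2 * z**2 / delta**2
    return n_bin, n_cont

n_bin, n_cont = per_arm_n(0.30, 0.50)  # illustrative responder proportions
print(round(n_bin), round(n_cont))
```

With these illustrative proportions the binary design needs noticeably more patients per arm than the continuous one, which is exactly the kind of contrast the planning-phase check is meant to put in front of researchers.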
Good luck everybody and keep up the good work.
Reference
1. van Zwet EW, Harrell Jr FE and Senn SJ. An Empirical Assessment of the Cost of Dichotomization of the Outcome of Clinical Trials. Statistics in Medicine 2026; 45: e70402. DOI: https://doi.org/10.1002/sim.70402.
The disease has spread well outside medicine. I had a slide on the topic in my lecture on statistical fallacies in psychology and neuroscience.
Fantastic to see this work published. Will serve as another great reference for anyone trying to avoid the efficiency loss that comes with dichotomisation.
If only the FDA wouldn't ask for responder endpoints as primary, with response defined as whether an individual patient meets a clinically meaningful improvement of a continuous endpoint from baseline, together with the need to define what that improvement is.
Many thanks for another really informative post and paper! I think the very sensible recommended approach is similar to what Soussa proposed in 1991 with, unfortunately, very little subsequent take-up, hence the ongoing problems. These can be made even worse by some researchers actually inferring the presence of treatment response heterogeneity from all this: https://www.jclinepi.com/article/0895-4356(91)90035-8/abstract