Statistics on Statistics
Occasionally, as a statistician, one receives pushback from a clinician. I remember one instance when a clinician wanted shift tables for laboratory values. I like shift tables in general, but this study was in healthy subjects. By definition, all subjects should be normal at baseline, so all a shift table would show is a lot of zeros, since no one is abnormal at baseline. I explained this to the clinician and pointed out that the planned frequency tables were just as informative and took up less space. I am not sure whether they did not understand or whether this was a power play on their part (that can happen), but they insisted. I responded that, no, we would not be doing this. They took it to their management, who took it to mine, and I was chewed out for not being helpful.
In this case the logic was clear. It should show you, however, that even when the logic is obvious, a statistician cannot unilaterally act without consequence. Regardless, a clinician does not always understand the best analyses, and they often insist on what they know. A couple of simple examples I have faced are ANOVA versus ANCOVA and a mean versus a geometric mean. I remember once proposing that the analysis be ANCOVA, and the clinician informed me that they did not understand ANCOVA and thus did not trust it. I drew my favorite picture to show them how it reduces variability. They saw the point, but they asked the question that I could not answer: in general, how much would the variability be reduced? And if it is not much, why do this? Well, I knew it was better, and I could show them mathematically that there is a direct relationship between the baseline-endpoint correlation and the reduction in variability. But for this particular individual, even my clearest explanation was not convincing. They eventually deferred to my assurance, but they remained uncomfortable.
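That relationship is easy to illustrate with a quick simulation (my sketch, not from the original discussion). With baseline and endpoint both having variance 1 and correlation ρ, the variance the analysis has to contend with is 1 for ANOVA on the endpoint alone, 2(1 − ρ) for change from baseline, and 1 − ρ² for the ANCOVA residual. The correlation ρ = 0.6 below is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(42)
rho = 0.6                  # illustrative baseline-endpoint correlation
n = 200_000                # large n so empirical variances are stable

# Simulate a correlated (baseline, endpoint) pair, each with variance 1
cov = [[1.0, rho], [rho, 1.0]]
base, end = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

var_anova = end.var()                   # endpoint alone: theory says 1
var_change = (end - base).var()         # change from baseline: 2 * (1 - rho)
beta = np.cov(base, end)[0, 1] / base.var()
var_ancova = (end - beta * base).var()  # ANCOVA residual: 1 - rho**2

print(var_anova, var_change, var_ancova)
```

At ρ = 0.6 the three variances come out near 1, 0.8, and 0.64, so ANCOVA beats change from baseline, which beats ignoring baseline altogether. Note, though, that the ordering of the last two flips when the correlation is high: 2(1 − ρ) drops below 1 − ρ² once ρ exceeds about 0.5, and change from baseline only beats ANCOVA in the limit ρ → 1.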
So, I got a summer intern, Maria DeYoreo, and we did the following. We found 100 continuous, positive data sets and ran analyses with ANCOVA, ANOVA, and change from baseline. We also ran analyses assuming log-normality versus normality. We compared the variability for each instance and translated it into cost through the sample size needed. The results basically showed that, on average, assuming log-normality required 20% fewer subjects than assuming normality, and ANCOVA on average required 20% fewer subjects than change from baseline.
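The translation from variability to cost is direct, because the sample size for a two-group comparison scales linearly with the variance. A minimal sketch using the standard normal-approximation formula (my illustration; the effect size and variances here are arbitrary, not numbers from the paper):

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(sigma2, delta, alpha=0.05, power=0.90):
    """Per-arm sample size for a two-sample z-test (normal approximation)."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma2 / delta ** 2)


n_full = n_per_arm(sigma2=1.0, delta=0.5)   # analysis with the larger variance
n_small = n_per_arm(sigma2=0.8, delta=0.5)  # same effect, 20% smaller variance

# n_small / n_full is about 0.8: a 20% reduction in variance is,
# up to rounding, a 20% reduction in the subjects you must enroll.
print(n_full, n_small)
```

This is why comparing variances across 100 real data sets could be translated straight into an average savings in subjects.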
I thought this was perfect. This is what every statistician needs to convince the clinician; it is exactly the type of evidence that convinces clinicians that treatment A is better than treatment B. We did statistics on statistics. We wrote the manuscript (Maria did most of the writing) and then had a devil of a time getting it published.
It is crazy to me that statistical reviewers had three reactions. First, "everyone knows this." Obviously that is not true, since not every clinician knows it. And does everyone know that on average the sample size is reduced by 20%? I do not think so. Second, "where is the mathematical proof?" I did not need a mathematical proof; the mathematics is well known. I was using statistics to show it. Is it not ironic that statisticians would find it problematic to use statistics to answer this question? Third, "where are the simulations?" A simulation only tells you about some particular truth; here we had answered the question in general. After about a year of rejections, we found a home. But honestly, the statistical community has largely ignored the paper. I regret that Maria worked her tail off based on my belief in its importance; looking back, I suspect she may think I was just a bit too optimistic. Really, this is such a disappointment, because this paper should have been seen as truly helpful.
So, I am either crazy or ahead of my time. The hardest thing about this project was getting the data together. I had to sign so many documents to get access to data from that many studies, and Maria then had to go through each data set to understand naming conventions and the like. We live in a world today where there is at least lip service to centralizing data for use in clinical research. Those who suggest this are thinking about advancing clinical knowledge, and I totally support that. But centralized data can also give more complete variance estimates for sample size calculations. One last thing: with a little imagination, such data can be used to prove points about statistics through using statistics. There are whole avenues of useful questions that can be asked and answered this way that cannot be approached with mathematics. Who knows what we would find?
We could find which methods work best on average in practice, and quantify it. We could identify study or endpoint characteristics that indicate when method A is better than method B for analysis. We could answer these questions in a format akin to clinical research, making the conclusions accessible to clinicians. Wow. There is just so much ripe fruit to be plucked; I envy all of you who actually take advantage of the modernization of data and use it to answer questions about statistics using statistics.
Read other articles I have written: Blog 3 — StartersGateBioQuantAdvisory
Never miss a weekly post again. I’ve started an email list for my blog. Besides sending my blog to your e-mail, I’m also kicking off an “Ask Brian” column. If you have a general question about drug development or quantitative work, send it in. I’ll answer what I can, depending on how many come through. You’ll also receive the occasional Starter's Gate update, which will be low-frequency and not spammy. Join the list at E-mail Sign Up — Starter's Gate BioQuant Advisory
Don’t even get me started on reviewers… my most influential papers were dismissed as irrelevant, but they were the foundation for much of practical data science activity: good documentation, enabling tools, and reliable fast computation with proper parallel random numbers.
Brian…thanks for posting this story and the “backstory”. You are an expert, and I can relate to the disconnect between expertise and non-expertise. It is frustrating indeed. In response to your comment “I am either crazy or ahead of my time…”, I respond it’s likely both! A broken clock is right twice a day! I am glad you actually published this work…despite the questionable feedback from the reviewers.