A Significant Post
I often joke with my friends who are bilingual or more that my language skills are so bad that I am close to null-lingual. That is, language is not my thing. (And I assume, if you have read some of my blogs, you see that!!) So you may think the topic I am about to write on is an odd one.
Whenever I hear the term statistically significant, it grates on my nerves. Let me explain why. A small p-value is a metric. The smaller it is, the more evidence there is against the null. If there are two treatments, that can be read as more evidence of a difference between them.
It appears that R.A. Fisher, who in my mind may be the greatest scientist of the 20th century, started referring to a finding as significant if it had a small p-value. He later appears to have coined the term statistically significant. Later this morphed into a cut point determined by a two-sided p-value being less than 0.05. This is based on some things he wrote, but it certainly took on a life of its own, which I doubt he could have anticipated. I am not necessarily commenting on the cut point notion here, although I have a problem with that as well. I am going to comment on the term statistical significance.
Let us consider the meaning of these words individually. Statistical means of, relating to, based on, or employing the principles of statistics. Significant means having meaning, or having or likely to have influence or effect. So, if one shows there is evidence of a difference, does it have “statistical” meaning, or is it likely to have “statistical” influence? Certainly not. That depends not on the amount of evidence of a difference but on what the things that differ actually are. Certainly, if I did the study, I would find a small p-value for the difference in average shoe size between men and women. Important? Noteworthy? Have influence? As a statistician I would answer no to each of these questions, so why would I call it statistically significant? Is that not implying that I, as a statistician, find this important?
One of the worst parts of statistical significance is that it gave birth to the term clinically significant. The reason is that in scientific publications the use of the word significant on its own has basically been eliminated. I once participated in a publication in which the physician wrote in the conclusions about the significance of the research in their area, and they were asked to provide a p-value. What is something that is clinically significant? Something that has meaning to a clinician. Then it is not clinically significant; it is just significant.
The last thing that the term statistical significance does is hide the true nature of the metric that is being used. I personally try never to use the term statistically significant. I tend to use the following terms.
1) There is no evidence of a positive treatment difference (one-sided p-value > 0.5)
2) There is a little evidence of a positive treatment difference (one-sided p-value between 0.2 and 0.5)
3) There is some evidence of a positive treatment difference (one-sided p-value between 0.05 and 0.2)
4) There is evidence of a positive treatment difference (one-sided p-value between 0.005 and 0.05)
5) There is strong evidence of a positive treatment difference (one-sided p-value between 0.0001 and 0.005)
6) There is formidable evidence of a positive treatment difference (one-sided p-value < 0.0001)
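For anyone who wants to apply this wording consistently, here is a minimal Python sketch of the scale above. The function name evidence_phrase is my own invention for illustration. The list does not say which term applies when a p-value lands exactly on a boundary such as 0.2 or 0.05, so the sketch assumes a boundary value falls into the stronger category; that choice is an assumption, not part of the scale itself.

```python
def evidence_phrase(p_one_sided: float) -> str:
    """Map a one-sided p-value to the evidence wording in the list above.

    Assumption (not stated in the list): a p-value exactly equal to an
    interior boundary (0.2, 0.05, 0.005) is placed in the stronger category.
    """
    if not 0.0 <= p_one_sided <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")

    if p_one_sided > 0.5:
        strength = "no evidence"
    elif p_one_sided > 0.2:
        strength = "a little evidence"
    elif p_one_sided > 0.05:
        strength = "some evidence"
    elif p_one_sided > 0.005:
        strength = "evidence"
    elif p_one_sided >= 0.0001:
        strength = "strong evidence"
    else:
        strength = "formidable evidence"

    return f"There is {strength} of a positive treatment difference"


if __name__ == "__main__":
    # Hypothetical p-values, chosen only to exercise each category.
    for p in (0.7, 0.3, 0.1, 0.02, 0.001, 0.00001):
        print(f"p = {p}: {evidence_phrase(p)}")
```

The wording is returned as a plain sentence so that a statement about the magnitude of the effect can be appended to it.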
This then needs to be paired with a statement about the magnitude of the effect, as seen in the confidence interval, so that statements such as the following are added to the first one.
1) But it does not appear large enough to have a clinical benefit (if the upper edge of the CI is too small)
2) Could have a clinical benefit (if the upper edge of the CI is large enough)
3) Seems likely to have clinical benefit (if the lower edge of the CI is large enough)
(These statements could be even further nuanced.)
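The confidence interval statements can be sketched the same way. This is only one possible reading: it assumes that "large enough" can be expressed as a single threshold, which I call min_benefit here (a hypothetical smallest effect a clinician would consider worthwhile), and that larger effects are better. The name benefit_phrase and the threshold are illustrative assumptions, not something defined above.

```python
def benefit_phrase(ci_lower: float, ci_upper: float, min_benefit: float) -> str:
    """Choose a magnitude statement from the edges of a confidence interval.

    Assumptions (not stated in the post): "large enough" means at or above
    a single threshold, min_benefit, and larger effects are better.
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")

    if ci_lower >= min_benefit:
        # Even the lower edge of the CI clears the threshold.
        return "seems likely to have clinical benefit"
    if ci_upper >= min_benefit:
        # Only the upper edge of the CI clears the threshold.
        return "could have a clinical benefit"
    # The whole CI sits below the threshold.
    return "but it does not appear large enough to have a clinical benefit"


if __name__ == "__main__":
    # Hypothetical intervals against a hypothetical threshold of 2.0.
    print(benefit_phrase(0.1, 1.5, min_benefit=2.0))
    print(benefit_phrase(1.0, 4.0, min_benefit=2.0))
    print(benefit_phrase(2.5, 6.0, min_benefit=2.0))
```

Where the threshold sits is a clinical judgment, so it is passed in as an argument and not fixed in the code.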
Let us leave the word significant to the clinician who evaluates all the evidence and lets the reader know if they think it is important.
Read other articles I have written: Blog 3 — StartersGateBioQuantAdvisory
Never miss a weekly post again. I’ve started an email list for my blog. Besides sending my blog to your e-mail, I’m also kicking off an “Ask Brian” column. If you have a general question about drug development or quantitative work, send it in. I’ll answer what I can, depending on how many come through. You’ll also receive the occasional Starter's Gate update, which will be low-frequency and not spammy. Join the list at E-mail Sign Up — Starter's Gate BioQuant Advisory
Suppose the scientific community adopts terms 1-6 from tonight. I foresee that ten years later they will become a new “statistical significance” headache, and the same fight against “statistical significance” will start over again.
Yes, it seems very popular to interpret results as statistically significant, even for a secondary endpoint result with no adjustment of the significance level.