Machine Learning: The Cost of Getting it Wrong?
The Proteomics Winter

“When you worship technology, you stop questioning it.”

The Ovarian Cancer Story

In the early 2000s, proteomics promised to revolutionise cancer diagnostics. Instead, it generated a textbook example of Statistical Debt in biomedical research - costing upwards of $1 billion, setting the field back a decade, precipitating the Proteomics Winter, and eroding public trust in cancer diagnostics and science.

The Proteomics Gold Rush

At the turn of the century, Correlogic developed an early diagnostic test, OvaCheck, for ovarian cancer. Using mass spectrometry patterns from tiny patient cohorts and sophisticated machine learning algorithms, they developed exciting and seductive models. On paper, those models were highly accurate, promising hope and early intervention for patients with life-threatening ovarian cancers.

But the reality was simpler and harsher.

The models had no biological grounding, were not predictive, and collapsed upon independent validation. They were overfitting to noise and batch variability. We learned the hard way that you can't validate a model using the data used to build that model.

“The models didn’t find biology - they found batch differences.”
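
To see how easily this happens, here is a minimal sketch (illustrative only, built on simulated noise rather than the original OvaCheck data): a toy classifier fitted to pure noise, with far more spectral "peaks" than patients, looks flawless when "validated" on the data used to build it, and collapses on an independent cohort.

```python
# Minimal sketch: pure-noise "spectra" with far more features than patients.
# Evaluating on the training data itself gives seductive accuracy;
# an independent cohort reveals the model is no better than a coin flip.
import numpy as np

rng = np.random.default_rng(0)
n_per_class, n_features = 10, 5000   # tiny cohort, high-dimensional mass-spec-like data

def simulate():
    """Pure noise: the 'cancer' and 'control' groups differ only by chance."""
    X = rng.standard_normal((2 * n_per_class, n_features))
    y = np.repeat([0, 1], n_per_class)
    return X, y

def fit_centroids(X, y):
    """A deliberately simple classifier: one mean profile per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    """Assign each sample to the nearest class centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

X_train, y_train = simulate()     # data used to build the model
X_indep, y_indep = simulate()     # genuinely independent cohort

centroids = fit_centroids(X_train, y_train)
resub_acc = (predict(centroids, X_train) == y_train).mean()
indep_acc = (predict(centroids, X_indep) == y_indep).mean()

print(f"Accuracy on the data used to build the model: {resub_acc:.0%}")  # ~100%
print(f"Accuracy on an independent cohort:            {indep_acc:.0%}")  # ~50%
```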

Correlogic wasn’t alone. Dozens - if not hundreds - of academic labs and several startups joined the gold rush, claiming biomarkers for ovarian, prostate, pancreatic cancers, and anything else that moved.

Venture capital followed the hype.

Money was raised and lost.

Hopes were lifted, then dashed.

Counting the Cost

Let’s conservatively estimate the tangible costs of this misadventure.

  • Academic & Early-stage Research: Around 50 significant programmes were launched, each burning roughly $500k in expensive proteomics equipment and personnel. Say $25 million.
  • Commercialisation & Clinical Trials: Correlogic raised an estimated $30-50 million before failure. Two major companies went as far as Phase III trials, each spending around $75 million. Say $150 million.
  • Clean-up & Validation: Public agencies like the NCI spent heavily on multi-site validation efforts, including the Clinical Proteomic Technology Assessment for Cancer consortium, designed to check for exactly these flaws. Say $25 million.

Total direct costs? Say $200 million.

But, the real damage came later.

The Proteomics Winter

An entire generation of postdocs and PIs had built their careers on these flawed foundations. When the house of cards collapsed, careers were damaged and intellectual effort was wasted. Estimated lost human capital: at least $50 million.

Investor scepticism triggered a Proteomics Winter. Venture capital dried up. Legitimate proteomics companies struggled for funding. Meanwhile, the entire field’s scientific credibility took a hit - good work became harder to publish, reviewers and editors having been burned by hype. Crucially, this fiasco delayed the development of robust, quantitative proteomics (such as modern SWATH/DIA-MS) by 5-10 years.

The real cost of the Proteomics Winter lies in the opportunities it stole. Quantifying this isn’t straightforward, but we can triangulate it in three ways: by looking at lost investment, lost time, and lost impact.

Lost Investment - The Counterfactual Pipeline

In the early 2000s, diagnostics platforms that caught investor and public funding waves - such as next-generation sequencing - typically attracted multiple, large-scale R&D programmes. Had proteomics retained investor confidence, it’s reasonable to expect that 5-10 legitimate diagnostic development programmes, each with budgets of $100-150 million, would have progressed through the pipeline over the following decade. This puts the foregone R&D investment in the $0.5-1 billion range, over and above the $200 million already burned on the flawed wave.

Lost Time - The Decade That Wasn’t

Investor scepticism and reputational damage delayed serious investment in proteomics by at least 5-10 years. Oncology diagnostics typically attract $100-200 million in annual global R&D spending when a platform is considered promising. Multiplying that by a lost decade gives a temporal opportunity cost of roughly $0.5-2 billion in delayed or diverted innovation. This is how economists often assess the cost of infrastructure delays - and the same principle applies here. A decade of cold investment meant slower technology maturation, fewer candidates in the pipeline, and ultimately, delayed patient benefit.
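
As a back-of-the-envelope check, using only the figures quoted above (nothing new), the arithmetic looks like this:

```python
# Back-of-envelope sketch of the "lost decade" estimate, using the ranges quoted above.
annual_rd_spend = (100e6, 200e6)   # annual global R&D spend on a promising platform ($)
years_delayed = (5, 10)            # estimated delay caused by the Proteomics Winter

low = annual_rd_spend[0] * years_delayed[0]
high = annual_rd_spend[1] * years_delayed[1]
print(f"Temporal opportunity cost: ${low / 1e9:.1f}-{high / 1e9:.1f} billion")
# -> Temporal opportunity cost: $0.5-2.0 billion
```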

Lost Impact - Health and Societal Costs

Finally, there is the human dimension. Even modest improvements in early ovarian cancer detection can translate into hundreds or thousands of lives saved annually. We can value these savings through standard QALY (Quality Adjusted Life Year) or Value of Statistical Life metrics - typically $100,000-150,000 per life-year in the US. These societal opportunity costs run comfortably into the billions. This doesn’t require heroic assumptions: accelerating the availability of just one moderately effective diagnostic test by five years would have had major clinical and economic impact.
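
For a sense of scale, here is a deliberately rough sketch. Only the $100,000-150,000 per life-year range comes from the paragraph above; the lives-saved, life-years-gained, and delay figures are hypothetical placeholders.

```python
# Rough QALY-style valuation. Only value_per_life_year reflects the range quoted
# in the text; the other inputs are hypothetical placeholders for illustration.
lives_saved_per_year = 1_000               # "hundreds or thousands" - assume a midpoint
life_years_gained = 10                     # assumed life-years gained per earlier detection
value_per_life_year = (100_000, 150_000)   # standard US QALY / VSL range ($ per life-year)
years_of_delay = 5                         # assume the test is delayed by five years

low = lives_saved_per_year * life_years_gained * value_per_life_year[0] * years_of_delay
high = lives_saved_per_year * life_years_gained * value_per_life_year[1] * years_of_delay
print(f"Societal opportunity cost: ${low / 1e9:.1f}-{high / 1e9:.1f} billion")
# -> Societal opportunity cost: $5.0-7.5 billion
```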

Taken together, these estimates give us a conservative figure of somewhere between $0.5 and $2+ billion in opportunity costs. For argument's sake, let's say that we're talking about $1 billion in round figures.

That's $1 billion in total.

This is the true cost of the Proteomics Winter. The financial burn was bad enough, but the compounded “interest” on the statistical debt was far, far greater - measured not just in dollars, but in lost innovation, delayed diagnostics, and avoidable deaths.

In addition, although OvaCheck never reached the market, the hype around it raised false hopes among patients. Its collapse contributed to public distrust in early detection tests and the erosion of trust in science itself. And trust, once lost, is expensive to rebuild.

This was not a one-off, single bad paper.

This was Glitter Blindness. A system-wide failure: seductive technology + machine learning + compelling clinical need, combined with a lack of statistical and biological rigour. Statistical Debt may start small - a tiny advance on some “statistical nicety”. But left unchecked, it compounds. And interest is paid in wasted resources, misdirected research, delayed discovery, and lost human capital.

Statistical debt is real, and it is accruing in a lab near you, right now.

There was nothing wrong per se with the various supervised machine learning classifiers. Genetic algorithms, decision trees, and support vector machines are all useful tools. The core problem wasn’t the algorithm itself, but the lack of statistical discipline in how it was trained, tested, and interpreted.
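
To make that discipline concrete, here is a small sketch on simulated noise (assumed toy data, not the original workflow). It contrasts a "leaky" protocol, where the most discriminatory peaks are selected using the whole dataset before cross-validation, with a disciplined one where selection is repeated inside each training fold:

```python
# Feature-selection leakage on pure noise: 40 "patients", 5000 noise "peaks".
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5000))   # no real signal anywhere
y = np.repeat([0, 1], 20)             # arbitrary case/control labels

# Leaky protocol: pick the 20 most class-correlated peaks using ALL samples,
# then cross-validate - the test folds have already leaked into the selection.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_acc = cross_val_score(LogisticRegression(max_iter=1000), X_leaky, y, cv=5).mean()

# Disciplined protocol: feature selection is re-fitted inside every training fold.
honest = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
honest_acc = cross_val_score(honest, X, y, cv=5).mean()

print(f"Leaky cross-validated accuracy:       {leaky_acc:.0%}")   # typically well above chance
print(f"Disciplined cross-validated accuracy: {honest_acc:.0%}")  # hovers around 50%
```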

Statistical rigour isn’t a bureaucratic nicety. It’s an economic and strategic imperative.


The Proteomics Winter of Discontent

Reference: Ransohoff DF (2005). Ovarian cancer screening and serum proteomics. J Natl Cancer Inst, 97(4):315-319. DOI: 10.1093/jnci/dji054

David Ransohoff’s paper is a landmark critique of early claims that serum proteomic patterns could detect ovarian cancer with extraordinary sensitivity and specificity. He shows that the impressive results reported in early studies (notably the OvaCheck test) were artefacts of flawed study design, not genuine biological breakthroughs.

Good design matters more than algorithmic cleverness.

His critique punctured the hype surrounding proteomic diagnostics for ovarian cancer and became a turning point in recognising the dangers of overfitting, lack of replication, and systematic bias in high-dimensional biomarker research. It remains one of the clearest, most influential statements of why rigorous study design matters more than algorithmic cleverness.


More Statistical Tails of the Unexpected

Statistical Tails of the Unexpected
"This isn’t a stats textbook. It’s a demolition job on bad science."

https://www.amazon.co.uk/Apes-Anoraks-Statistical-Tails-Unexpected-ebook/dp/B0DVZQH3FR


💪🏻 "Good design matters more than algorithmic cleverness." Love that quote. When you create a high-quality and representative dataset and train a statistical model with discipline and good practice, the results are a lot more convincing and solid.

Glitter Blindness!!! “The core problem wasn’t the algorithm itself, but the lack of statistical discipline in how it was trained, tested, and interpreted.”

Thanks again for another enlightening post Dennis Lendrem. This reminds me of one of the quotes attributed to George Box: "All models are wrong, but some are useful". In particular, don't trust a model until it has been verified with data that was not used to develop it.

There is a precursor story to this which relates to the current fad of "digital twins", Systems Biology. A company called Merrimack Pharmaceuticals, which in the end probably spent close to a billion dollars, failed miserably. (If you look at the old financial documents you see roughly a billion dollars of investor money taken in, and in the end only a third was ever recovered, so a huge loss.) At the end, all the company had to show for its endeavours was a reformulation of irinotecan, which clearly has little to do with systems biology! It's fascinating that there is so little literature on such failures.
