Binding Affinity Prediction Techniques


Summary

Binding affinity prediction techniques are computational methods used to estimate how strongly a small molecule (like a drug) will bind to a target protein, which is a crucial aspect in drug discovery and development. Modern approaches rely on deep learning and advanced modeling to predict these interactions quickly and accurately, often using protein sequences, 3D structural data, and molecular dynamics simulations.

  • Explore deep learning: Consider using sequence-based and graph-based deep learning models to predict binding affinities, even when full structural data isn't available.
  • Clean your data: Always ensure your training and evaluation datasets are curated to avoid overlapping or duplicate examples, which helps build models that truly generalize to new molecules.
  • Leverage quick screening: Take advantage of production-ready AI tools to screen millions of molecular pairs rapidly, helping accelerate the search for promising new drug candidates.
Summarized by AI based on LinkedIn member posts
  • Ken Wasserman

    Assistant Professor at Georgetown University School of Medicine

    4,549 followers

    Introducing "Ligand-Transformer": "To predict the binding affinity between proteins and small molecules, we introduce Ligand-Transformer, a #DL method based on the transformer architecture. Ligand-Transformer implements a sequence-based approach, where the inputs are the amino acid sequence of the target protein and the topology of the small molecule to enable the prediction of the conformational space explored by the complex between the two. We apply Ligand-Transformer to screen and experimentally validate inhibitors targeting the mutant EGFR^LTC kinase, identifying compounds with low nanomolar potency. We then use this approach to predict the conformational population shifts induced by known ABL kinase inhibitors, showing that sequence-based predictions enable the characterisation of the population shift upon binding."

    "We described a sequence-based virtual screening method of predicting the conformational space of a target protein and a ligand in their complex state, thus overcoming the limitations of relying on the structures of the binding partners in their free states. In this way, this approach provides the binding affinity and the corresponding binding mode, represented as distance matrices between the target protein and the ligand."

    "Through comparisons with baseline models and ablation experiments, we observed that Ligand-Transformer performs well in affinity predictions (Table S1, Supplementary Discussion). This result can be attributed to the protein and molecular representations provided by pre-trained AlphaFold2 and GraphMVP, as well as the structural information learned from the distance matrices of protein-ligand complexes during training."

    https://lnkd.in/eqPacDAD
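The sequence-based setup quoted above takes only the target's amino-acid sequence and the ligand's 2D topology as input, with no 3D structure required. A minimal sketch of those two input representations (the token scheme, example sequence, and bond list are illustrative, not taken from the paper):

```python
# Toy sketch of the two inputs a sequence-based model like Ligand-Transformer
# consumes: the target's amino-acid sequence and the ligand's topology.
# Token ids and the example bond list are illustrative only.

AA_VOCAB = {aa: i + 1 for i, aa in enumerate("ACDEFGHIKLMNPQRSTVWY")}  # 0 = padding

def tokenize_protein(seq: str) -> list[int]:
    """Map each residue to an integer token for the sequence branch."""
    return [AA_VOCAB[aa] for aa in seq]

def ligand_adjacency(n_atoms: int, bonds: list[tuple[int, int]]) -> list[list[int]]:
    """Encode the ligand topology as a symmetric adjacency matrix."""
    adj = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i][j] = adj[j][i] = 1
    return adj

tokens = tokenize_protein("MKTAYIAK")            # sequence branch input
adj = ligand_adjacency(3, [(0, 1), (1, 2)])      # topology branch input (3-atom chain)
```

In the real model these two representations feed a transformer that predicts protein-ligand distance matrices and affinity; here they only illustrate why no crystal structure is needed at inference time.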

  • Jorge Bravo Abad

    AI/ML for Science & DeepTech | Prof. of Physics at UAM | Author of “IA y Física” & “Ciencia 5.0”

    29,003 followers

    Toward truly generalizable binding affinity prediction

    Accurately predicting protein–ligand binding affinity is a cornerstone of structure-based drug design. Deep learning models have made major progress—but benchmarking them reliably is harder than it seems. Overlaps between commonly used training and test sets (such as PDBbind and CASF) can make models appear to generalize better than they truly do.

    David Graber and coauthors take an important step forward with PDBbind CleanSplit, a carefully curated dataset that removes structural overlaps using protein 3D similarity, ligand Tanimoto scores, and pocket-aligned ligand RMSD. The result is a cleaner separation between training and evaluation data, enabling a more realistic measure of model generalization.

    They also introduce GEMS, a sparse graph neural network that integrates protein–ligand interaction graphs with embeddings from large protein and chemistry language models. Trained on CleanSplit, GEMS maintains strong accuracy on CASF and independent test sets, even without benefiting from overlapping examples—showing genuine understanding of molecular interactions.

    Why this matters: as generative methods like AlphaFold3, RFdiffusion, and DiffSBDD begin creating massive libraries of new protein–ligand complexes, the field needs scoring functions that can assess novel structures with confidence. CleanSplit and GEMS together provide a foundation for the next generation of robust, data-leakage-free affinity prediction.

    Paper: https://lnkd.in/diPZ3QPD

    #AIforScience #DrugDiscovery #MachineLearning #DeepLearning #StructuralBiology #ComputationalChemistry #Bioinformatics #GraphNeuralNetworks #ProteinLigandInteractions #StructureBasedDesign #GenerativeAI #DataCuration #MolecularModeling #ArtificialIntelligence
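The ligand-similarity half of this kind of leakage filtering can be sketched with Tanimoto similarity over fingerprint bit sets. The fingerprints and the 0.9 cutoff below are made up for illustration; the actual CleanSplit curation also uses protein 3D similarity and pocket-aligned ligand RMSD:

```python
# Sketch of a CleanSplit-style curation step: drop test ligands whose
# fingerprint Tanimoto similarity to any training ligand exceeds a cutoff.
# Fingerprints are toy bit sets; real ones would come from e.g. ECFP.

def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto similarity of two fingerprint bit sets: |A∩B| / |A∪B|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def filter_test_set(train_fps, test_fps, cutoff=0.9):
    """Keep only test ligands sufficiently dissimilar to every training ligand."""
    return [fp for fp in test_fps
            if all(tanimoto(fp, tr) < cutoff for tr in train_fps)]

train = [{1, 2, 3, 4}, {5, 6, 7}]
test = [{1, 2, 3, 4}, {8, 9, 10}]       # first entry duplicates a training ligand
clean = filter_test_set(train, test)    # only the novel ligand survives
```

The point of the exercise: without such a filter, the duplicated ligand would inflate test accuracy without measuring generalization.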

  • Centre of Bioinformatics Research and Technology (CBIRT)

    Democratizing Bioinformatics for a Smarter, Healthier Future!

    50,402 followers

    Scientists at Tsinghua University and Westlake University introduced #Dynaformer, a revolutionary graph-based deep learning model for predicting protein-ligand binding affinities. Unlike previous methods, Dynaformer leverages molecular dynamics simulations to capture the dynamic nature of protein-ligand interactions.

    🎯 Dynaformer demonstrates state-of-the-art performance on the CASF-2016 benchmark, outperforming existing methods. The model learns from a curated dataset of 3,218 protein-ligand complexes, offering unprecedented accuracy in binding affinity prediction.

    🔬 In a real-world test, Dynaformer identified 12 hit compounds (including 2 submicromolar hits) for HSP90 through virtual screening. This success, coupled with novel scaffold discoveries, showcases Dynaformer's potential to accelerate early-stage drug discovery.

    Quick Read: https://lnkd.in/gRWKt-V8

    #Bioinformatics #MolecularDynamics #DeepLearning #StructuralBiology #AIinDrugDiscovery #ComputationalChemistry #DrugDesign #ScienceNews
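One way MD simulations feed a graph model like this is by turning trajectory frames into interaction graphs: contacts present in only some frames get fractional edge weights instead of being frozen into a single static structure. A hedged sketch of that idea (the coordinates and the 4 Å cutoff are illustrative, not Dynaformer's actual featurization):

```python
# Build contact edges from 3D coordinates with a distance cutoff, then
# average edge occupancy over MD frames so transient contacts are captured.

import math

def contacts(frame, cutoff=4.0):
    """Return the set of (i, j) atom pairs within `cutoff` Å in one frame."""
    pairs = set()
    for i in range(len(frame)):
        for j in range(i + 1, len(frame)):
            if math.dist(frame[i], frame[j]) <= cutoff:
                pairs.add((i, j))
    return pairs

def edge_occupancy(frames, cutoff=4.0):
    """Fraction of frames in which each atom pair is in contact."""
    counts = {}
    for frame in frames:
        for pair in contacts(frame, cutoff):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n / len(frames) for pair, n in counts.items()}

frames = [
    [(0, 0, 0), (3, 0, 0), (8, 0, 0)],    # frame 1: only atoms 0-1 in contact
    [(0, 0, 0), (3, 0, 0), (3.5, 0, 0)],  # frame 2: all three pairs in contact
]
occ = edge_occupancy(frames)  # persistent edge (0,1) scores 1.0, transient ones 0.5
```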

  • Kristin Gleitsman

    CSO at Eigen Bio | AI x Bio Advisor | Scaling Systems for Diagnostics & Discovery | Fellow, Fellows Fund VC | ex VCYT, GH, PACB

    8,167 followers

    Boltz-2: How much can 3D structure really tell us about molecular binding energetics?

    This week’s AI ∩ Bio: Reading the Revolution series covers Boltz-2, a new structural biology foundation model that exhibits strong performance for both structure and affinity prediction. To put this work in context, let’s start with the classic protein modeling pipeline logic:

    🧬 Sequence → 🧱 Structure → 🎯 Function

    AlphaFold revolutionized the first step, grounded in the premise that function follows from structure. Boltz-2 puts that premise to the test. It starts at the middle of the pipeline — with the 3D structure of a protein–ligand complex — and asks:

    👉 Can we predict binding affinity using only geometry?

    Key insight: structure is signal. Boltz-2 is a deep learning model that predicts binding affinity directly from 3D geometry — no sequence, no docking scores, no molecular dynamics. It learns by:

    > Using real 3D snapshots of protein–ligand complexes from experiments (via the PDBbind database) as “correct” examples
    > Comparing them to incorrect or nonbinding versions (decoys)
    > Teaching itself to distinguish between the two by assigning higher scores to the true binders — a method called contrastive learning
    > Viewing each complex from multiple angles and modeling how atoms interact using cross-attention between the ligand and protein

    The result? Accuracy approaching Free Energy Perturbation (FEP) — a gold-standard physics-based method — at a fraction of the computational cost.

    So: IF you have the correct structure, you can get binding affinity. But that’s the tradeoff. Boltz-2 doesn’t predict binding sites. It doesn’t model flexible loops or conformational dynamics. It assumes the structure is already known — and that it’s accurate.

    But we know that:
    📎 Crystallography can trap proteins in inactive states
    📎 Ligand poses may not reflect behavior in solution
    📎 Flexibility is collapsed into a single static frame

    Still, Boltz-2 shows how much signal is embedded in structure — when that structure is right.

    🌱 Reflection for early-career scientists: What happens when you flip the framing? Instead of building up from sequence to structure to function, Boltz-2 works from the middle, assuming structure is known, and asking how far that alone can take you. As a result, Boltz-2 sharpens the boundary of what structure can predict — and what it can’t. In other words, Boltz-2 is a boundary marker: a way to measure what’s possible if geometry is complete and correct.
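The contrastive signal described above — score the true complex higher than its decoys — can be expressed with a hinge (margin ranking) loss. This is a generic illustration of the training objective, not Boltz-2's exact loss, and the scores are made up:

```python
# Toy contrastive objective: penalize any decoy whose score comes within
# `margin` of the true binder's score. Zero loss means clean separation.

def contrastive_loss(true_score: float, decoy_scores: list[float],
                     margin: float = 1.0) -> float:
    """Sum of hinge penalties for decoys scored too close to the true complex."""
    return sum(max(0.0, margin - (true_score - d)) for d in decoy_scores)

good = contrastive_loss(5.0, [2.0, 3.0])   # both decoys well below: loss 0
bad = contrastive_loss(5.0, [4.8])         # decoy inside the margin: penalized
```

Minimizing this pushes the model's scores for experimental complexes up and decoy scores down, which is the separation the post describes.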

  • John Guanjing Zhang

    Chairman & CEO | AI Health @ We3 | DePin @ Blockchain | Crypto | 2000+IP @ AI Healthcare | H&J CRO | USEV Tech | Ex Huawei Global Market Leader | Alumni @Tsinghua,UIBE & HBS |Multi-Tech Entrepreneur |Edu Only,No Advice|

    16,836 followers

    1. MIT has developed Boltz-2, a next-generation AI biomolecular model for drug discovery, in collaboration with Recursion.
    2. Boltz-2 integrates complex-structure prediction and binding-affinity estimation, allowing researchers to screen vast molecular libraries in hours instead of weeks.
    3. The model approaches the accuracy of physics-based free-energy perturbation (FEP) simulations but operates up to 1,000 times faster.
    4. On Recursion’s BioHive-2 supercomputer, Boltz-2 processes millions of ligand-protein pairs in parallel, providing binding results in about 20 seconds each.
    5. Boltz-2 outperforms traditional docking methods and previous machine-learning approaches, doubling average precision in high-throughput screens.
    6. The model's performance is enhanced by engineering optimizations, including custom kernels that reduce training and inference costs by up to 3x.
    7. Boltz-2 is available as a production-ready microservice, Boltz-2 NIM, which supports biopharma companies with higher throughput and reduced computing costs.
    8. The model was released as open-source under an MIT license on June 6, allowing for retraining and deployment for both academic and commercial use.

  • Pallavi Nanda

    AI Product Leader | AI Growth Strategist | Ex-AI Research Engineer | Helping AI & Healthtech products get discovered

    6,260 followers

    The most valuable problem in drug discovery just got cracked in "18 SECONDS". And the team behind it didn’t patent it... they open-sourced it.

    Last week, MIT and Recursion released Boltz-2. An AI that predicts how tightly a drug binds to its protein target… in just 18 seconds.

    - No $100 FEP simulations.
    - No clusters.
    - No 48-hour jobs.

    Just a PROTEIN TARGET. Just a MOLECULE. Just 18 seconds on a consumer GPU.

    The old world:
    - Run Free Energy Perturbation (FEP)
    - Wait hours or days
    - Pay $100+ per molecule
    - Screen dozens, not thousands
    - Timeline: months of compute

    The new world:
    - Run Boltz-2
    - Physics-level accuracy
    - Predictions in 18 seconds
    - Cost approaches zero
    - Timeline: screen millions of molecules in DAYS

    And the benchmarks are hard to ignore:
    - CASP16 affinity challenge → first place across 140 complexes
    - FEP+ benchmark → Pearson correlation of 0.62, matching the physics gold standard
    - Speed → nearly 1000x faster than classical FEP

    This isn’t incremental progress. It’s a different solution class altogether.

    Why does Boltz-2 matter?
    - AlphaFold2 predicted structures.
    - AlphaFold3 mapped interactions.
    - Boltz-1 made prediction open-source.
    - Boltz-2 adds binding affinity, i.e. the missing piece for computational drug design.

    It predicts:
    • Structure
    • Binding affinity
    • Protein dynamics (B-factors)
    • Physical plausibility (99.9%)

    All in a SINGLE FORWARD PASS.

    This unlocks:
    → Millions of molecules screened per week
    → Hit-to-lead cycles collapsing from months to weeks
    → Startups competing with Big Pharma on equal footing
    → Academic labs getting supercomputer-class modeling on a laptop
    → A genuine collapse in drug discovery cost curves

    And yes... it’s fully open-source under the MIT license, including weights and training pipeline.

    Reality check:
    - These are computational predictions, not clinical validation.
    - Wet-lab experiments, safety studies, and human trials still decide outcomes.

    But the bottleneck has moved permanently: from searching for candidates → to optimizing them.

    The new timeline:
    - AlphaFold2 predicted nature.
    - AlphaFold3 explained interactions.
    - Boltz-2 predicts which small molecules actually bind at supercomputer speed.

    MIT + Recursion + NVIDIA’s BioHive-2. Five million training measurements. Used by thousands of researchers already.

    The question isn’t whether it works. It’s who builds the fastest end-to-end pipeline on top of it.

    Is binding affinity the final frontier for AI in drug discovery? Or are we still missing critical pieces of the puzzle?

    #AI #DrugDiscovery #Biotech #MachineLearning #OpenSource #ComputationalBiology #ProteinEngineering #MIT #Healthcare #Innovation #DeepLearning
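The FEP+ benchmark figure quoted above is a Pearson correlation between predicted and experimental affinities. For readers unfamiliar with the metric, a plain-Python sketch, with made-up numbers standing in for real predictions:

```python
# Pearson correlation coefficient between predicted and measured affinities.
# r near 1 means predictions track experiment; 0.62 is the FEP+ figure cited.

import math

def pearson_r(x: list[float], y: list[float]) -> float:
    """Covariance of x and y divided by the product of their spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy example: four molecules whose predictions track experiment closely.
r = pearson_r([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Note that Pearson r only measures rank-and-scale agreement, not absolute error, which is why benchmarks usually report it alongside RMSE.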

  • Tom Knight

    Research scientist (MChem, MRSC). Digital chemistry, cheminformatics, and machine learning to enhance chemical research.

    1,939 followers

    🚀 New tutorial live! I’ve just published a step-by-step Google Colab + Python tutorial on using Boltz-2 for structure-based drug design – all in the cloud, no local installs needed.

    🧠 What is Boltz-2? Boltz-2 is an open-source biomolecular foundation model that jointly predicts 3D protein–ligand structures and binding affinities, approaching the accuracy of physics-based FEP methods while being orders of magnitude faster – making it a powerful tool for modern in silico drug discovery.

    🎥 Watch the tutorial here: https://lnkd.in/e3uwwMjE

    In the video, I cover:
    🔹 Setting up a reproducible Boltz-2 environment in Colab
    🔹 Running Boltz-2 with Python for protein–ligand systems
    🔹 Using predictions in a drug design / virtual screening workflow
    🔹 How to adapt the notebook to your own targets and ligands

    Ideal for:
    🧪 Computational & medicinal chemists
    🧠 ML researchers exploring AI for drug discovery
    🎓 Students who want a ready-to-run Colab workflow

    If you watch it, I’d really appreciate any feedback, questions, or ideas for follow-up videos – drop a comment or message me!

    #DrugDiscovery #Cheminformatics #GenerativeAI #AIforScience #Python #GoogleColab #Boltz2 #ComputationalChemistry

  • Abeeb Abiodun Yekeen

    Protein Engineer (PhD) • Synthetic & Computational Structural Biologist • Postdoc @ UTSW Medical Center • I developed CHAPERONg, PASCAR, & Subtimizer • I created BioMoDes • TWAS Alumnus • 2024 DE Shaw Research Fellow

    7,013 followers

    TrustAffinity: a sequence-based method for "trustworthy" protein-ligand binding affinity prediction with out-of-distribution generalization

    TrustAffinity attempts to address the following challenges in binding affinity prediction:
    ▪ out-of-distribution (OOD) generalization for the prediction of protein-ligand interactions involving understudied proteins and ligands with novel scaffolds
    ▪ quantification of uncertainty in predictions
    ▪ scalability to large-scale predictions

    Using a protein sequence and a chemical SMILES string, TrustAffinity predicts binding affinity, along with an estimate of the associated uncertainty, for a protein-ligand pair in ~0.003 seconds.

    Paper -> https://lnkd.in/gJHS9z9W
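Pairing each prediction with an uncertainty estimate is the distinctive feature here. One generic way to obtain such an estimate — not necessarily TrustAffinity's exact method — is a deep ensemble: run several independently trained models and report the mean prediction with the spread as uncertainty. The ensemble members below are stand-in functions returning made-up pKd values:

```python
# Ensemble-style uncertainty sketch: mean prediction plus standard deviation
# across ensemble members. Real members would be trained networks taking a
# protein sequence and a SMILES string, as TrustAffinity does.

import statistics

def ensemble_predict(models, protein_seq: str, smiles: str):
    """Return (mean prediction, standard deviation) across an ensemble."""
    preds = [m(protein_seq, smiles) for m in models]
    return statistics.mean(preds), statistics.stdev(preds)

# Stand-in members that mimic three slightly disagreeing trained models.
models = [lambda p, s: 6.1, lambda p, s: 6.3, lambda p, s: 5.9]
mean, sigma = ensemble_predict(models, "MKT", "CCO")
```

A large sigma flags an out-of-distribution pair whose point prediction should not be trusted, which is exactly the failure mode the post highlights.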

  • Anima Anandkumar
    227,558 followers

    We are proud to present our latest paper on physics-informed AI for drug design, appearing in the PNAS special issue on machine learning in chemistry.

    Standard data-driven AI does not work well on examples that are significantly different from the training data, which can result in unphysical predictions that are clearly wrong. To limit this type of unphysical result in drug design, we introduced a new machine learning model called NucleusDiff, which incorporates a simple physical idea into its training, greatly improving the algorithm's performance.

    NucleusDiff ensures that atoms stay at an appropriate distance from one another, accounting for physical concepts such as the repellent forces that prevent atoms from overlapping or colliding. Rather than accounting for the distance between every single pair of atoms in a molecule, which would be expensive, NucleusDiff estimates a manifold and, on that manifold, establishes main anchoring points to watch, making sure that the atoms never get too close to one another.

    We predicted binding affinities for a target that was not included in the training dataset: the COVID-19 therapeutic target 3CL protease. NucleusDiff showed increased accuracy and a reduction in atomic collisions of up to two-thirds compared with other leading models.
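The physical constraint described above can be illustrated with a toy penalty: any pair of anchor points closer than a minimum separation contributes to a loss, steering a generative step away from atom-collision geometries. The coordinates and the 1.2 Å floor are illustrative, not NucleusDiff's actual values:

```python
# Toy minimum-separation penalty: sum of squared violations of a pairwise
# distance floor over anchor points. Zero penalty means no near-collisions.

import math

def collision_penalty(anchors, min_dist=1.2):
    """Sum of squared violations of the minimum pairwise distance."""
    penalty = 0.0
    for i in range(len(anchors)):
        for j in range(i + 1, len(anchors)):
            d = math.dist(anchors[i], anchors[j])
            if d < min_dist:
                penalty += (min_dist - d) ** 2
    return penalty

ok = collision_penalty([(0, 0, 0), (2, 0, 0)])       # well separated: no penalty
clash = collision_penalty([(0, 0, 0), (0.2, 0, 0)])  # overlapping: penalized
```

Evaluating this over a few anchor points rather than every atom pair is the cost-saving idea the post describes.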
