Bioinformatics for Drug Discovery

  • Today OpenMed released a dataset the likes of which I have never seen before. They just put over 1 billion rows of psychiatric genetics data on Hugging Face.

    ADHD. Depression. Schizophrenia. Bipolar disorder. PTSD. OCD. Autism. Anxiety. Tourette syndrome. Eating disorders. 12 conditions. 52 landmark studies. Every genome-wide association study (GWAS) ever published by the Psychiatric Genomics Consortium, but now standardized and accessible in one place.

    Previously, accessing this data meant tracking down dozens of files scattered across FTP servers, wrestling with inconsistent formats, and spending more time debugging download scripts than doing actual science. Now it's one line of Python!

    Each row represents a statistical test: how strongly is this specific point in the human genome associated with this psychiatric condition? In more depth, each row is a single variant-phenotype association test from a GWAS meta-analysis. For every SNP, you get:

    • Variant ID (e.g. rs6702460) and genomic location (CHR/POS)
    • Effect allele and reference allele (A1/A2)
    • Effect size (BETA or OR) and its standard error (SE)
    • P-value, imputation quality (INFO), allele frequency (FRQ/MAF)
    • Sample sizes: total, cases, and controls (N/Nca/Nco)

    A typical single GWAS tests 7–15 million variants. We have 52 of them, many with multiple ancestry groups and sub-analyses. That's how you get to 1.14 billion rows.

    Hopefully, this will mean:

    🧠 Earlier identification of people at genetic risk for psychiatric conditions (even before symptoms emerge!)
    💊 Better drug targets: pinpointing the genes and biological pathways causally involved across multiple disorders
    🔗 Understanding why conditions co-occur (e.g., depression and anxiety, ADHD and autism) through shared genetic architecture
    🌍 More equitable research: ancestry-stratified data means findings that apply beyond predominantly European study populations

    Mental health research has long been underpowered relative to the scale of the problem. Open, accessible data is one way to change that 🤗
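
To make the row schema described above concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the dataset's actual loader: the rows are invented, only the column names (CHR, POS, A1, A2, BETA, SE) follow the post, and the helper names are hypothetical. It recomputes a two-sided p-value from BETA and SE under a normal approximation and keeps variants below the conventional genome-wide significance threshold of 5e-8.

```python
import math

# Invented example rows; only the column names follow the post's description.
ROWS = [
    {"SNP": "rs_example_1", "CHR": 1, "POS": 1000, "A1": "A", "A2": "G", "BETA": 0.060, "SE": 0.010},
    {"SNP": "rs_example_2", "CHR": 2, "POS": 2000, "A1": "T", "A2": "C", "BETA": 0.010, "SE": 0.010},
    {"SNP": "rs_example_3", "CHR": 3, "POS": 3000, "A1": "C", "A2": "T", "BETA": -0.080, "SE": 0.012},
]

# Conventional genome-wide significance threshold.
GENOME_WIDE_P = 5e-8


def beta_from_odds_ratio(odds_ratio):
    """Convert an odds ratio (OR) to a log-odds effect size (BETA)."""
    return math.log(odds_ratio)


def two_sided_p(beta, se):
    """Two-sided p-value for z = beta / se under a standard normal approximation."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def significant_hits(rows, threshold=GENOME_WIDE_P):
    """Return the rows whose recomputed p-value falls below the threshold."""
    hits = []
    for row in rows:
        p_value = two_sided_p(row["BETA"], row["SE"])
        if p_value < threshold:
            hits.append({**row, "Z": row["BETA"] / row["SE"], "P": p_value})
    return hits


if __name__ == "__main__":
    for hit in significant_hits(ROWS):
        print(hit["SNP"], hit["CHR"], hit["POS"], f"z={hit['Z']:.2f}", f"p={hit['P']:.2e}")
```

In a real file the P column is already provided, so recomputing it from BETA and SE mainly serves as a sanity check on how the columns relate to one another.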

  • View profile for Lavinia Ionita

    Medical doctor and founder @Sorcova Health | I prevent chronic stress and burnout through biological testing and AI | Preventive and functional medicine | Addiction medicine

    14,971 followers

    Half a million genomes. 1.5 billion variants. One breakthrough: we are all truly unique.

    Twenty years ago, the Human Genome Project took 13 years and $2.7B to sequence a single genome. Today? We can sequence a genome in less than 24 hours for under $1,000. Last week, UK Biobank released 490,640 whole genomes, the largest genetic dataset ever (Nature, 2025).

    What did we learn?
    • Each person carries 4–5 million variants
    • 76% appear in fewer than 10 people: your genome is almost entirely yours
    • 1 in 10 carries clinically actionable mutations where doctors can intervene today (e.g., BRCA1/2 for cancer, LDLR for heart disease)

    Why it matters:
    • Previous genetic tests captured ~6% of human variation. This dataset reveals 40× more
    • In non-coding regions (the biological switches controlling genes), researchers found 63 new disease associations
    • Adding 31,785 non-European genomes uncovered 82 disease links invisible in Eurocentric studies

    From genetics to health impact
    This transforms medicine today:
    • Prevention: polygenic risk scores flag disease decades before symptoms
    • Diagnosis: rare disease patients waiting years for answers finally find them
    • Treatment: pharmacogenomics matches the right drug, right dose, to your genome

    The next frontier: genetics + everything else
    Genetics is the hardware. Health is the software running in real time. Your DNA is fixed, but biology is dynamic, shaped by:
    • Epigenetics: how environment and lifestyle switch genes on/off
    • Proteomics & metabolomics: molecular signals revealing your current health state
    • Digital biomarkers: continuous data from stress, sleep, glucose, heart rate
    • Stress biology & neuroendocrine signaling: how cortisol and brain-body responses reshape your health trajectory

    Layer these dynamic signals onto genetic foundations, power them with AI, and you create living health models, not just predicting disease, but understanding when, why, and how it manifests in YOU.

    The critical question?
    We've spent decades treating the "average patient", who doesn't exist. Now we can better see each person as they truly are: biologically unique, dynamically changing, infinitely complex. The healthcare winners of the next decade won't just collect data: they'll integrate genetics, epigenetics, molecular and phenotypic tests, lifestyle, stress biology, and digital signals to deliver truly personalized, preventive care at scale.

    There is no "normal" genome, only 8 billion unique experiments in being human. And we just decoded the first half million.

    👉 Which excites you more: knowing your genetic blueprint, or understanding how your daily choices rewrite it?

  • View profile for Claudia Barth, PhD

    Biologist| Neuroscientist | Professor | Leader FemHealth Project | ERC StG MappingPerimenopause | Women’s Brain/Mental Health Adviser & Science Liaison in Clinical Settings

    3,186 followers

    🚀 Excited to share our new publication in Nature Communications, spearheaded by the amazing Hannah Oppenheimer!

    Sex steroids such as estradiol are often proposed as key contributors to why depression and Alzheimer’s disease disproportionately affect females. Yet, most existing evidence comes from observational studies, where confounding and reverse causation make it hard to draw firm conclusions.

    To address this, we used Mendelian Randomization (MR): a method that leverages genetic variants as natural experiments, similar to randomised controlled trials. Because genes are thought to be randomly assigned at conception, MR helps us estimate whether a biological factor (exposure) causally influences an outcome, offering stronger evidence than observational associations.

    In our study, we combined openly available and newly run and annotated genome-wide association studies (GWAS) with MR. We tested whether genetically predicted estradiol exposure across multiple traits (e.g., estradiol levels pre- and post-menopause, reproductive span, age at menarche/menopause, number of childbirths, and more) causally influences:
    🧠 Brain age gap (a machine-learning derived proxy of brain health)
    🧬 Alzheimer’s disease risk
    💭 Depression risk
    We also replicated relevant analyses in males.

    🔍 Our key finding: Across all robust methods and samples, we found no evidence that genetically predicted (i.e., "absolute"/constant) estradiol levels causally affect brain aging, depression risk, or Alzheimer’s disease risk.

    💡 What this suggests: Our findings strengthen the idea that "absolute" estradiol levels may not be the key driver. Instead, an individual’s sensitivity to hormonal fluctuations could be crucial in understanding hormone-brain interactions. This work highlights the importance of moving beyond average hormone levels toward more nuanced, dynamic measures - an exciting direction for future research.

    A huge thank you to all co-authors: Dennis van der Meer, Louise Schindler, Arielle Crestol, Alexey Shadrin, Ole Andreassen, Lars T. Westlye, Ann-Marie de Lange

    📄 Read the full article here: https://lnkd.in/dyydRXYu
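
The MR reasoning in the post above can be illustrated with a small Python sketch. This is not the authors' pipeline. It assumes one textbook estimator, the fixed-effect inverse-variance weighted (IVW) combination of per-variant Wald ratios, and applies it to made-up summary statistics. All function names and numbers are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate: outcome effect divided by exposure effect.

    The standard error uses the common first-order approximation that ignores
    uncertainty in the variant-exposure estimate.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(variants):
    """Fixed-effect inverse-variance weighted average of the Wald ratios.

    `variants` is an iterable of (beta_exposure, beta_outcome, se_outcome).
    Returns (estimate, standard_error).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for beta_x, beta_y, se_y in variants:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        weight_total += weight
    return weighted_sum / weight_total, math.sqrt(1.0 / weight_total)


# Made-up instruments: (effect on exposure, effect on outcome, SE of outcome effect).
VARIANTS = [
    (0.10, 0.002, 0.010),
    (0.20, -0.004, 0.010),
    (0.15, 0.001, 0.012),
]

if __name__ == "__main__":
    estimate, se = ivw_estimate(VARIANTS)
    low, high = estimate - 1.96 * se, estimate + 1.96 * se
    print(f"IVW estimate {estimate:.4f}, 95% CI [{low:.4f}, {high:.4f}]")
```

An estimate whose confidence interval comfortably spans zero would be read as no evidence of a causal effect, which is the kind of result the post reports. Published MR analyses typically add sensitivity methods that are more robust to invalid instruments, which this sketch omits.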

  • View profile for Francesco Rugolo, PhD

    Molecular Biologist & Biochemist | Oncology & Immunology Research | Bioinformatics & Data-Driven Scientist | Published Scientist & Mentor

    5,431 followers

    Curious about bioinformatics but not into coding? You’re not the only one. 🧬

    The good news: you can still explore high-impact biological datasets with zero programming skills. Here are 3 no-code platforms that bring advanced bioinformatics into reach – right from your browser. 👇

    🧫 1. Cellenics
    Designed for single-cell transcriptomics, Cellenics lets you upload your raw data (like 10X Genomics) and performs the entire scRNA-seq analysis pipeline for you. From clustering and UMAPs to marker gene identification – no command line needed.
    🔗 Great for: Single-cell RNA-seq | Cell type annotation | Exploratory analysis
    🌐 https://cellenics.org

    🧬 2. GEPIA (Gene Expression Profiling Interactive Analysis)
    GEPIA lets you dive into RNA-seq expression data from TCGA and GTEx across thousands of tumors and normal samples. Quickly generate survival plots, boxplots, differential expression, and more – all with a clean, user-friendly interface.
    🔗 Great for: Tumor vs. normal expression | Survival analysis | Biomarker validation
    🌐 http://gepia.cancer-pku.cn

    🧠 3. GeneMANIA
    Input your gene of interest and get a network of co-expressed, co-localized, or functionally similar genes. GeneMANIA helps you generate hypotheses and explore gene functions visually.
    🔗 Great for: Functional genomics | Network biology | Hypothesis generation
    🌐 https://genemania.org

    💡 Whether you're in a wet lab or just getting started with data, these tools can empower your research - without writing a single line of code.

    ✨ Want more no-code tools for your next project? Let me know in the comments and I’ll drop another set.

    --
    #BioinformaticsForAll #PhDLife #PostdocLife #NoCodeScience #SingleCell #CancerResearch #FunctionalGenomics #LifeScienceTools #WetLabToWebApp #ResearchSimplified

  • View profile for Azeem Azhar

    Making sense of the Exponential Age

    430,790 followers

    GENERATIVE BIOLOGY

    AI just wrote genetic instructions that cells actually followed, a breakthrough that turns biology into a programming language.

    For the first time ever, researchers at the Center for Genomic Regulation created AI-generated DNA sequences that successfully controlled gene expression in healthy mammalian cells. Think of it as writing software, but for living organisms.

    Why this matters:
    → The AI can design custom 250-letter DNA fragments with specific instructions like "activate this gene in stem cells becoming red blood cells but not platelets"
    → These synthetic enhancers worked EXACTLY as predicted when tested in mouse blood cells
    → Unlike previous efforts focused on cancer cells, this team worked with healthy cells, uncovering subtle mechanisms that shape our immune system
    → The researchers built a library of 64,000+ synthetic enhancers tested across seven stages of blood cell development

    Most fascinating was discovering "negative synergy" - where two factors that individually activate genes can completely shut them down when combined. This unlocks precision we never had before.

    The implications are enormous for gene therapy. Instead of being limited to DNA sequences evolution produced, we can now design ultra-selective gene switches customized to specific cells and tissues - potentially making treatments more effective with fewer side effects.

    Full paper: https://lnkd.in/en3bGZP9

    Follow-up with @EricTopol's post about curing rare diseases with the existing genomic technology stack https://lnkd.in/eGCYMjGJ

  • View profile for Alexey Navolokin

    FOLLOW ME for breaking tech news & content • helping usher in tech 2.0 • at AMD for a reason w/ purpose • LinkedIn persona •

    778,881 followers

    AMD, UNSW Sydney & Pawsey: Redefining Real-Time Genomics with Slorado

    A major milestone for open science and high-performance genomics. AMD, UNSW Sydney, and the Pawsey Supercomputing Research Centre have introduced Slorado, the world’s first fully open-source, real-time nanopore DNA basecaller designed for AMD GPUs and powered by the ROCm open software platform. This breakthrough removes long-standing vendor lock-in and dramatically accelerates genomic workflows, empowering researchers with speed, scale, and flexibility.

    🔬 What Slorado Enables
    + Fully open-source basecalling pipeline for nanopore sequencing
    + Runs on AMD GPUs via ROCm and supports hybrid GPU environments
    + Scales across multi-GPU and HPC infrastructures
    + Delivers performance parity with proprietary alternatives while improving accessibility

    ⚡ Performance Highlights on Pawsey’s Setonix Supercomputer
    Powered by AMD Instinct GPUs:
    + Full human genome decoded in:
      + 2.3 hours on MI250X GPUs
      + Just 0.8 hours on next-gen MI300X GPUs
    + High-accuracy models (HAC & SUP) also show significant acceleration without compromising data quality

    This level of performance transforms what once took days into hours, or even minutes, enabling faster research cycles, real-time pathogen surveillance, and scalable population genomics.

    🌍 Why This Matters
    ✅ Democratizes access to high-performance genomics
    ✅ Accelerates discovery and clinical research
    ✅ Strengthens reproducibility through open-source transparency
    ✅ Expands AMD’s role as a trusted platform for scientific computing and AI
    ✅ Bridges HPC, AI, and bioinformatics into a unified ecosystem

    Slorado is more than a tool: it’s a signal of where the future of genomics is heading: open, accelerated, and accessible at global scale. AMD continues to push the boundaries of what’s possible in scientific computing – from AI to genomics and beyond.

    🔗 Explore more: https://lnkd.in/ghSRHX7S

    #AMD #Genomics #OpenScience #HPC #AIinHealthcare #ROCm #InstinctGPUs #Supercomputing #Innovation #Bioinformatics #FutureOfScience #AMDBrandAmbassador

  • View profile for Abhijeet Satani

    Research Scientist | Inventor of Cognitively Operated Systems 🧠 | Neuroscience | Brain Computer Interface (BCI) | Published Author with a BCI patent and several other Patents (mentioned below🔻) and IPRs

    8,873 followers

    We’re getting better at reading genes. Now we’re learning how to read them in 3D.

    A new study introduces a method to resolve signal overlap in spatial transcriptomics data, one of the biggest technical bottlenecks in mapping gene expression inside intact tissue. In dense biological samples, transcripts from neighboring cells often overlap, making it difficult to accurately assign signals to the correct cellular source. This blurring limits how precisely we can reconstruct tissue architecture.

    By improving how overlapping signals are separated computationally in three-dimensional space, researchers can generate far more accurate maps of how cells are organized in situ. This doesn’t just refine the data, it changes the reliability of downstream biological interpretation.

    For neuroscience, this is particularly significant. The brain is a tightly packed 3D network of gradients, microenvironments and dynamic cellular interactions. Circuit function, disease progression and developmental processes all depend on spatial context. If our spatial resolution is compromised, our models of brain function are incomplete.

    As biology moves from bulk averages toward high-resolution spatial systems, segmentation accuracy becomes foundational infrastructure, not a minor technical upgrade. Precision in three dimensions is what enables precision in understanding.

    Source: Nature Biotechnology, 2026: “Identifying 3D signal overlaps in spatial transcriptomics data with ovrlpy.”

    #Neuroscience #SpatialTranscriptomics #SystemsBiology #Genomics #BrainResearch #Biotechnology #Innovation #Research

  • View profile for Segun Fatumo

    Professor and Chair of Genomic Diversity, Queen Mary University of London; Head of NCD Genomics at MRC/UVRI and LSHTM Uganda Entebbe

    8,950 followers

    I’m excited to share our latest work, published in #Nature Communications.

    This study, delivered through the KidneyGenAfrica Consortium, represents the largest genome-wide association study (GWAS) of kidney function in continental Africans to date. We analysed genomic data from:
    • ~26,000 individuals across Eastern, Western, and Southern Africa
    • ~81,000 individuals of African ancestry in the diaspora, expanding previous African GWAS efforts by more than eightfold.

    Key insights from our study
    • We identified several genetic variants that are common in African populations but absent in European and Asian datasets, highlighting the limitations of extrapolating findings across populations.
    • Polygenic risk scores derived from genetically similar populations outperformed those built from larger but genetically distant datasets.
    • Variants in the APOL1 gene are known to increase kidney disease risk ~3-fold in African Americans. However, our findings show lower frequencies of these high-risk variants in continental Africa.
    • This suggests that the genetic architecture of kidney disease may differ between African Americans and continental African populations, a crucial insight for risk prediction and treatment strategies.

    The bigger picture
    This work reinforces a simple but powerful idea: You cannot build equitable genomic medicine without studying diverse populations at scale. By generating large, high-quality genomic datasets directly from Africa, we move closer to more accurate disease prediction, better-targeted therapies, and truly inclusive precision medicine.

    A huge thank you to all the analysts, collaborators, participants, and teams who made this possible, most especially Abram Kamiza, Tinashe Chikowore, Sola Ojewunmi, PhD, June Fabian, Michele Ramsay, Brandenburg Jean-Tristan, Precision Healthcare University Research Institute (PHURI), Genomic Diversity Research Group, MRC/UVRI & LSHTM Uganda Research Unit

    Link to publication: https://lnkd.in/e5fTZQmz
    Link to the press release: https://lnkd.in/etRgfkT2

    #Genomics #PrecisionMedicine #HealthEquity #GWAS #KidneyDisease #KidneyGenAfrica #AfricanGenomics #NatureCommunications #DataScience #GlobalHealth
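
For readers unfamiliar with polygenic risk scores, a minimal Python sketch follows. It is not the consortium's method. It assumes the common additive form, in which the score is the sum over variants of the GWAS effect size times the number of effect alleles a person carries. The variant names, weights, and genotypes are made up, and the function name is hypothetical.

```python
def polygenic_score(weights, dosages):
    """Additive polygenic score.

    weights: dict mapping variant ID -> per-allele effect size from a GWAS.
    dosages: dict mapping variant ID -> number of effect alleles carried (0, 1, or 2).
    Variants missing from the person's genotype data are skipped and counted.
    Returns (score, number_of_variants_skipped).
    """
    score = 0.0
    skipped = 0
    for variant, beta in weights.items():
        dosage = dosages.get(variant)
        if dosage is None:
            skipped += 1
            continue
        score += beta * dosage
    return score, skipped


# Made-up weights and one made-up individual.
WEIGHTS = {"var_a": 0.20, "var_b": -0.10, "var_c": 0.05}
PERSON = {"var_a": 2, "var_b": 1}  # var_c was not genotyped for this person

if __name__ == "__main__":
    score, skipped = polygenic_score(WEIGHTS, PERSON)
    print(f"score={score:.2f}, variants skipped={skipped}")
```

The skipped-variant count hints at one reason scores may transfer poorly between populations: weights estimated in one group can rest on variants that are rare or absent in another, so part of the score simply drops out.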

  • View profile for Ganna Posternak

    Drug Discovery Scientist | Translating Complex Research Into Strategic Insight & Business Value for Biotech | AI & Biotech | Scientific Strategy & Narrative | 15+ Years Experience

    5,973 followers

    Machine Learning in Preclinical Drug Discovery 🧬💊

    Machine learning (ML) is increasingly integrated into preclinical drug discovery, offering promising advancements across hit identification, mechanism-of-action elucidation, and translational investigations. A recent paper in Nature Chemical Biology, "Machine Learning in Preclinical Drug Discovery", provides a thorough analysis of how ML is being utilized to enhance efficiency in early-stage drug development.

    🔬 Key Insights from the Paper

    1️⃣ Hit Identification & Virtual Screening
    Traditionally, high-throughput screening (HTS) has been the gold standard for identifying potential drug candidates. However, it is resource-intensive and slow. ML-based virtual screening, powered by deep learning models and molecular featurization techniques, is enabling rapid exploration of chemical libraries far beyond what traditional HTS can achieve. The paper highlights the impact of message-passing neural networks (MPNNs) and Deep Docking as effective methods for prioritizing hit compounds.

    2️⃣ Mechanism-of-Action (MOA) Elucidation
    Understanding how a compound interacts with biological targets is critical for drug development. ML is now playing a pivotal role in MOA elucidation through:
    • AlphaFold and RoseTTAFold: AI-driven protein structure prediction is accelerating target identification and binding site analysis.
    • Generative models: Variational autoencoders (VAEs) and diffusion models are not only aiding in de novo drug design but also helping predict chemical interactions with biological systems.

    3️⃣ Translational Investigations & ADMET Predictions
    Many promising compounds fail in later stages due to poor pharmacokinetics and toxicity profiles. ML is being leveraged to enhance ADMET predictions, improving the likelihood of clinical success. The paper discusses advancements in:
    • Solubility and Lipophilicity Predictions: ML-driven models now outperform traditional log(P) estimations, increasing the reliability of early-stage compound selection.
    • Toxicity Screening: AI-powered tools are improving predictions of hERG binding and organ toxicity, reducing late-stage failures.

    🚀 The Future of AI in Drug Discovery
    While ML is proving to be a game-changer, challenges remain, including data quality, interpretability of AI models, and integration with experimental validation. The paper underscores the importance of open-source datasets, AI transparency, and active learning strategies to enhance model accuracy.

    🔗 Read the full paper here: https://lnkd.in/gMtXHrHi

    AI is reshaping the landscape of drug discovery. As these technologies evolve, collaboration between computational scientists, biologists, and chemists will be critical to unlocking their full potential.

    #AI #MachineLearning #DrugDiscovery #Pharma #Biotech #ArtificialIntelligence #ComputationalBiology #NatureChemicalBiology
