🌍 Improving Climate Model Accuracy Using Machine Learning: A Multi-Model Ensemble Approach

📢 Just wrapped up an exciting project where I used Bayesian Optimization + XGBoost to compute a Multi-Model Ensemble (MME) of Global Climate Models (GCMs).

🧠 The Goal: Climate models vary widely. Instead of relying on a single GCM, I combined outputs from multiple models (CESM2-WACCM, INM-CM4-8, and EC-Earth3) to better match the observed record.

🔧 The Process:
✅ Data Preprocessing
✔ Cleaned + normalized GCM & observed data
✔ Filled missing values and ensured time-consistent splits
✅ Bayesian Optimization: Used scikit-optimize to find optimal hyperparameters for an XGBoost model, accelerating convergence with smart probabilistic search.
✅ Grid Search Refinement: Fine-tuned the best Bayesian result using a local Grid Search for extra precision.
✅ Evaluation
📊 Metrics: R², RMSE, and NSE
📈 Visuals: Time series comparison + residual analysis

🔍 Why It Matters: MMEs are crucial for reducing uncertainty in climate predictions. By integrating machine learning with GCM outputs, we can boost reliability for real-world decision-making, from water resource management to climate adaptation strategies.

🚀 YouTube video link 🌱 https://lnkd.in/d5k3wFrx

#ClimateChange #MachineLearning #XGBoost #BayesianOptimization #GCM #EnvironmentalScience #AI4Climate #Hydrology #DataScience #ClimateModeling #Python #TimeSeries #MME
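The three evaluation metrics named in the post can be sketched in a few lines of plain Python. This is a minimal illustration, not the author's code: the function names and the sample values are hypothetical, and R² is assumed here to mean the squared Pearson correlation (a common convention in hydrology), which differs from NSE (Nash-Sutcliffe Efficiency) because it ignores systematic bias.

```python
import math


def rmse(obs, sim):
    """Root-mean-square error between observed and simulated series."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r2(obs, sim):
    """R² taken as the squared Pearson correlation (an assumption, see lead-in)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    # Hypothetical values: the ensemble tracks the shape of the observations
    # but carries a constant offset of +1.
    observed = [1.0, 2.0, 3.0, 4.0]
    ensemble = [2.0, 3.0, 4.0, 5.0]
    print("RMSE:", rmse(observed, ensemble))
    print("NSE: ", nse(observed, ensemble))
    print("R²:  ", r2(observed, ensemble))
```

With a constant offset like this, the correlation-based R² stays at its maximum while NSE drops, which is one reason to report both alongside RMSE.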
Building accurate climate models with multiple data sources
Summary
Building accurate climate models with multiple data sources means combining information from satellites, weather stations, and other technologies to create detailed predictions about climate patterns. By using machine learning and integrating varied datasets, scientists can make climate forecasts that are more reliable and useful for planning and managing environmental risks.
- Combine diverse data: Gather weather observations, satellite imagery, and climate records from multiple sources to build a comprehensive model of current and future climate conditions.
- Apply machine learning: Use AI tools to analyze different datasets, fill in missing information, and improve the precision of climate predictions over time.
- Streamline data access: Develop systems that make it easier for researchers to find, organize, and use climate datasets, allowing for wider participation and faster scientific progress.
You might have seen news from our Google DeepMind colleagues lately on GenCast, which is changing the game of weather forecasting by building state-of-the-art weather models using AI. Some of our teams started to wonder – can we apply similar techniques to the notoriously compute-intensive challenge of climate modeling?

General circulation models (GCMs) are a critical part of climate modeling, focused on the physical aspects of the climate system, such as temperature, pressure, wind, and ocean currents. Traditional GCMs, while powerful, can struggle with precipitation – and our teams wanted to see if AI could help.

Our team released a paper and data on our AI-based GCM, building on our Nature paper from last year; specifically, it now predicts precipitation with greater accuracy than the prior state of the art. The new paper on NeuralGCM introduces 𝗺𝗼𝗱𝗲𝗹𝘀 𝘁𝗵𝗮𝘁 𝗹𝗲𝗮𝗿𝗻 𝗳𝗿𝗼𝗺 𝘀𝗮𝘁𝗲𝗹𝗹𝗶𝘁𝗲 𝗱𝗮𝘁𝗮 𝘁𝗼 𝗽𝗿𝗼𝗱𝘂𝗰𝗲 𝗺𝗼𝗿𝗲 𝗿𝗲𝗮𝗹𝗶𝘀𝘁𝗶𝗰 𝗿𝗮𝗶𝗻 𝗽𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻𝘀. Kudos to Janni Yuval, Ian Langmore, Dmitrii Kochkov, and Stephan Hoyer!

Here's why this is a big deal:

𝗟𝗲𝘀𝘀 𝗕𝗶𝗮𝘀, 𝗠𝗼𝗿𝗲 𝗔𝗰𝗰𝘂𝗿𝗮𝗰𝘆: These new models have less bias, meaning they align more closely with actual observations – and we see this both for forecasts up to 15 days and for 20-year projections (in which sea surface temperatures and sea ice were fixed at historical values, since we don’t yet have an ocean model). NeuralGCM forecasts perform particularly well around extremes, which are especially important in understanding climate anomalies, and can predict rain patterns throughout the day with better precision.

𝗖𝗼𝗺𝗯𝗶𝗻𝗶𝗻𝗴 𝗔𝗜, 𝗦𝗮𝘁𝗲𝗹𝗹𝗶𝘁𝗲 𝗜𝗺𝗮𝗴𝗲𝗿𝘆, 𝗮𝗻𝗱 𝗣𝗵𝘆𝘀𝗶𝗰𝘀: The model combines a learned physics model with a differentiable dynamical core to leverage both physics and AI methods, with the model trained directly on satellite-based precipitation observations.

𝗢𝗽𝗲𝗻 𝗔𝗰𝗰𝗲𝘀𝘀 𝗳𝗼𝗿 𝗘𝘃𝗲𝗿𝘆𝗼𝗻𝗲! This is perhaps the most exciting news! The team has made their pre-trained NeuralGCM model checkpoints (including their awesome new precipitation models) available under a CC BY-SA 4.0 license. Anyone can use and build upon this cutting-edge technology! https://lnkd.in/gfmAx_Ju

𝗪𝗵𝘆 𝗧𝗵𝗶𝘀 𝗠𝗮𝘁𝘁𝗲𝗿𝘀: Accurate predictions of precipitation are crucial for everything from water resource management and flood mitigation to understanding the impacts of climate change on agriculture and ecosystems.

Check out the paper to learn more: https://lnkd.in/geqaNTRP
-
🌾 New Dataset Out 🌾

🌾 How do we make crop monitoring truly climate-aware at scale? In many EO/ML pipelines, we can model #crop dynamics reasonably well, but linking them consistently with #weather variability, drought, and #climate extremes across large geographies is still difficult. A major reason is simple:
👉 the community still lacks large-scale, multimodal, ML-ready datasets that unify satellites + climate signals + agricultural outcomes.

So my PhD student Adrian Höhl (Technical University of Munich) built one.

📢 Very excited to share our new #ScientificData paper introducing #CropClimateX: a large-scale, multi-task, multi-sensory dataset for climate-aware crop monitoring in the contiguous US (2018–2022).

🔍 What makes CropClimateX different?
✅ 15,500 “minicubes” (each 12×12 km) spanning 1,527 counties
✅ Multi-source EO inputs (including Sentinel-1/2, Landsat-8, MODIS)
✅ Climate + extremes context (e.g., Daymet, U.S. Drought Monitor, heat/cold wave indicators)
✅ Support for multi-task learning targets such as crop yield and broader crop monitoring applications

To keep the dataset representative yet scalable, we use an optimized sampling strategy (Sliding Grid + Genetic Algorithm), reducing redundancy while retaining broad cropland coverage.

🚀 Why we hope this helps
CropClimateX is designed to support research on:
🌱 climate-aware crop modeling
🛰️ multi-sensor fusion & spatiotemporal learning
🌍 generalizable EO foundation models for agriculture

If you’re working on crop monitoring, climate resilience, or geospatial ML, take a look at CropClimateX.
🔗 Link to paper: https://lnkd.in/d3W3mnFZ
🔗 Link to dataset: https://lnkd.in/dVp4s-Mh
🔗 Link to GitHub: https://lnkd.in/d8DvYkMD

This is a collaboration with Stella Ofori-Ampofo, Miguel Ángel Fernández Torres (Universidad Carlos III de Madrid), and Rıdvan Salih (German Aerospace Center (DLR)). The project is funded by the Deutsche Raumfahrtagentur im DLR in the framework of #ML4Earth (project page: ml4earth.de).

#RemoteSensing #EarthObservation #GeospatialAI #ClimateAI #AgTech #CropMonitoring #Datasets #MachineLearning
International Future AI4EO Lab, TUM School of Engineering and Design (ED)
-
🌍 Climate scientists often face a trade-off: Global Climate Models (GCMs) are essential for long-term climate projections, but they operate at coarse spatial resolution, making them too crude for regional or local decision-making. To get fine-scale data, researchers use Regional Climate Models (RCMs). These add crucial spatial detail but come at a very high computational cost, often requiring supercomputers to run for months.

➡️ A new paper introduces EnScale, a machine learning framework that offers an efficient and accurate alternative to running full RCM simulations. Instead of solving the complex physics from scratch, EnScale "learns" the relationship between GCMs and RCMs by training on existing paired datasets. It then generates high-resolution, realistic, and diverse regional climate fields directly from GCM inputs.

What makes EnScale stand out?
✅ It uses a generative ML model trained with a statistically principled loss (energy score), enabling probabilistic outputs that reflect natural variability and uncertainty
✅ It is multivariate: it learns to generate temperature, precipitation, radiation, and wind jointly, preserving spatial and cross-variable coherence
✅ It is computationally lightweight: training and inference are up to 10–20× faster than state-of-the-art generative approaches
✅ It includes an extension (EnScale-t) for generating temporally consistent time series, a must for studying events like heatwaves or prolonged droughts

This approach opens the door to faster, more flexible generation of regional climate scenarios, essential for risk assessment, infrastructure planning, and climate adaptation, especially where computational resources are limited.

📄 Read the full paper: EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules ---> https://lnkd.in/dQr5rmWU (code: https://lnkd.in/dQk_Jv8g)

👏 Congrats to the authors: a strong step forward for ML-based climate modeling!

#climateAI #downscaling #generativeAI #machinelearning #climatescience #EnScale #RCM #GCM #ETHZurich #climatescenarios
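For readers unfamiliar with the energy score mentioned above, here is a minimal sketch of the standard sample-based estimator, ES = mean‖xᵢ − y‖ − (1 / 2m²) ΣᵢΣⱼ‖xᵢ − xⱼ‖, where the xᵢ are m generated samples and y is the observed field. It is not taken from the EnScale code; the function names and the toy values are hypothetical, and training implementations often use a variant that normalizes the second term by m(m − 1) instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def energy_score(samples, obs):
    """Sample-based energy score (lower is better).

    samples: list of m generated vectors (e.g. flattened fields)
    obs:     the observed vector of the same length
    """
    m = len(samples)
    # First term rewards samples that lie close to the observation.
    accuracy = sum(euclidean(x, obs) for x in samples) / m
    # Second term rewards spread among samples, so the score does not
    # favor collapsing every sample onto a single point.
    spread = sum(euclidean(xi, xj) for xi in samples for xj in samples) / (2 * m * m)
    return accuracy - spread


if __name__ == "__main__":
    # Hypothetical one-dimensional example: two samples bracketing the truth.
    print(energy_score([[0.0], [2.0]], [1.0]))
```

Because the score is proper, a generator minimizing it in expectation is pushed toward the true conditional distribution rather than toward the conditional mean, which is why the post describes the outputs as reflecting variability and uncertainty.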
-
🧵 Real Stories of Generative AI in Action (Feature 54 of a multi-part series; you can access the full series at #AWSGenAIinAction)

🌍 Columbia University's LEAP-STC (Learning the Earth with AI & Physics Science and Technology Center) was founded in 2021 with a mission to revolutionize climate modeling by merging traditional physics-based approaches with machine learning to improve near-term climate projections and train the next generation of climate data scientists.

🔬 The Challenge: LEAP faced a fundamental barrier in their mission: their researchers were spending countless hours navigating fragmented data portals across NASA, NCAR, and other institutions, writing complex retrieval scripts, and manually harmonizing inconsistent data formats. These technical hurdles limited participation in critical climate modeling work and slowed the pace of discovery needed for improved near-term climate projections.

☁️ The Solution: Working with the AWS Generative AI Innovation Center (#GenAIIC), LEAP built AutoClimDS on a sophisticated multi-agent architecture where specialized AI agents work together like a human research team. The orchestrator agent interprets research objectives and delegates tasks to specialized agents handling data discovery, acquisition, analytics, and verification, all coordinated around a curated knowledge graph organizing climate datasets, tools, and workflows. The system runs on #AmazonBedrock for foundation model access and generative AI capabilities, Amazon Neptune with Neptune Analytics for the knowledge graph database supporting semantic querying and vector-based similarity search, AWS Lambda for serverless data transformation, Amazon S3 for scalable climate dataset storage, and Amazon Textract for document data extraction. A fine-tuned ClimateBERT transformer classifier achieves the semantic linking that bridges raw observational data with structured climate modeling formats.

📊 Impact: AutoClimDS achieves 99.17% semantic accuracy in linking observational metadata to standardized Earth System Model variables and successfully reproduces published climate research workflows from natural language instructions alone. The open-source, modular design democratizes access to climate data science while maintaining the transparency and rigor required for peer-reviewed research, enabling researchers without specialized coding expertise to participate in addressing climate challenges.

This breakthrough demonstrates how agentic AI systems can transform scientific research by removing technical barriers and accelerating discovery in fields where accessibility directly impacts our ability to address humanity's most urgent challenges.

#AWS #ClimateScience #GenerativeAI #MachineLearning #AIforGood #CloudComputing #ScientificResearch

🔗 https://lnkd.in/gHhNCgJb
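To give a feel for what "vector-based similarity search" means when linking observational metadata to standardized model variables, here is a toy sketch using cosine similarity. It is not AutoClimDS code: the post describes a fine-tuned ClimateBERT classifier and Neptune Analytics, whereas the three-dimensional vectors, the catalog, and the function names below are hypothetical stand-ins for real embeddings. The variable names `tas`, `pr`, and `sfcWind` follow CMIP conventions for air temperature, precipitation, and near-surface wind speed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query_vec, catalog):
    """Return the catalog key whose vector is most similar to the query."""
    return max(catalog, key=lambda name: cosine_similarity(query_vec, catalog[name]))


if __name__ == "__main__":
    # Hypothetical embeddings for three standardized variables.
    catalog = {
        "tas": [0.9, 0.1, 0.0],
        "pr": [0.1, 0.9, 0.1],
        "sfcWind": [0.0, 0.2, 0.9],
    }
    # Hypothetical embedding of an observational field described as
    # "2 m air temperature, daily mean".
    query = [0.8, 0.2, 0.1]
    print(best_match(query, catalog))
```

In a production system the brute-force `max` would typically be replaced by an indexed nearest-neighbor query, but the ranking principle is the same.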