Plot GISTEMP Climate Data with Python

Intended audience: Early Career/Students; Educators; Science communicators

By Vasco Mantas (Oct 2023)

🎯 Learning objectives:

  1. Raise Awareness of Important Climate Datasets: The first learning objective is to increase awareness about the significance of climate datasets like GISTEMP, their role in climate research, and their contribution to our understanding of global temperature trends.
  2. Master Data Access and Visualization: The second objective is to empower learners with the skills to access and visualize climate data effectively using Python. This includes loading data, handling missing values, and creating informative visualizations.
  3. Analyze Climate Trends: The third objective is to enable participants to analyze climate trends and detect patterns within the GISTEMP dataset. This involves learning how to perform statistical analyses, such as linear regression, to identify long-term temperature trends and fluctuations.

Motivation

As the world grapples with the far-reaching consequences of climate change, the importance of understanding and monitoring global temperature trends has never been more critical. António Guterres, the Secretary-General of the United Nations, poignantly stated, "Our planet has just endured a season of simmering — the hottest summer on record. Climate breakdown has begun." These words serve as a stark reminder of the urgent need to address the challenges of our changing climate.

At the time of writing, we have just emerged from the warmest September on record ("by a large margin"), a fact underscored by organizations like the Copernicus Climate Change Service and other reputable sources. This alarming trend highlights the pressing nature of climate change and its undeniable impact on our planet.

In this tutorial, we embark on a journey to explore one of the pivotal climate datasets, GISTEMP (GISS Surface Temperature Analysis), which offers invaluable insights into global temperature anomalies. By gaining the skills to access, analyze, and visualize this data, we take a step towards equipping ourselves to comprehend the complex realities of climate change, facilitating informed discussions and actions to mitigate its effects.

Furthermore, it is crucial to share this data with a broad spectrum of audiences, including students, citizens, and decision-makers. In doing so, we not only raise awareness but also empower individuals to make informed choices and influence policies that can address the global challenges posed by climate breakdown.

Let's delve into the world of climate data, uncover trends, and contribute to the global effort to address the challenges of climate breakdown.

Background on the dataset used

GISTEMP v4, short for the GISS Surface Temperature Analysis version 4, is a pivotal climate dataset created by NASA's Goddard Institute for Space Studies. Drawing from meteorological stations (NOAA GHCN v4) and sea surface temperature records (ERSST v5), it serves as a robust tool for estimating changes in global surface temperatures. What makes GISTEMP such an important dataset is its ability to provide a monthly-resolution timeline of temperature records, spanning from 1880 to the present day. An important attribute of the dataset is the careful quantification of uncertainty. Employing interpolation methods to fill data gaps, GISTEMP offers a comprehensive perspective on global and regional temperature fluctuations, making it an indispensable resource for climate researchers.


Target audience: This tutorial is designed for those at the introductory level, making it particularly valuable for students or early career professionals seeking to learn about the dataset and quick methods for plotting its data. Additionally, educators and science communicators with similar objectives will find this tutorial a valuable resource.

The tutorial is organized in two parts: 1) Access the data and create a Google Colab (Python) notebook; 2) Load the GISTEMP data and plot the annual data and trends (full dataset and post-2000).


PART 1: Download data and set up the environment

Step 1: Download GISTEMP Data

The first crucial step in your journey to analyzing climate data with GISS Surface Temperature Analysis version 4 (GISTEMP) in Python is to obtain the dataset. You can access this data from the official GISTEMP website provided by NASA. Here's how to do it:

  1. Go to the GISTEMP website: https://data.giss.nasa.gov/gistemp/.
  2. Scroll down until you find the section titled "Global-mean monthly, seasonal, and annual means, 1880-present, updated through most recent month":

This section contains links to the GISTEMP data in various formats, including TXT and CSV. For this tutorial, we'll focus on downloading the data in CSV format.

The name of the file is: GLB.Ts+dSST.csv


Important note:

To ensure that everyone can follow this tutorial seamlessly, regardless of Python installations and other technical details, we'll be using the free Google Colab tool.

Google Colab is a cloud-based platform that provides free access to Jupyter notebooks. It's an ideal choice for this tutorial as it requires no setup and allows for collaborative and interactive coding. Google Colab provides a Python runtime environment and integrates with Google Drive, making it an excellent choice for data analysis and machine learning projects without the need for local installations.


Step 2: Set Up a Google Drive Folder, Create a Google Colab Notebook, and Upload the CSV File

In this step, we'll organize your Google Drive by creating a dedicated folder, generating a Google Colab notebook within that folder, and then proceeding to upload the GISTEMP CSV file. This method streamlines access to your data and helps keep your analysis organized.

  1. Create a Google Drive Folder: a) Open your Google Drive in a web browser and log in to your Google account if you haven't already; b) Click on the "+ New" button on the left-hand side to create a new folder. Name it appropriately, e.g., "GISTEMP Analysis."
  2. Create a Google Colab Notebook: a) Inside the "GISTEMP Analysis" folder, click on the "+ New" button once more, but this time select "More" and then "Google Colaboratory." A new Google Colab notebook will be created and automatically placed within the folder.
  3. Upload the CSV File: a) Open the "GISTEMP Analysis" folder by double-clicking on it; b) In your newly created Google Colab notebook, you can upload the GISTEMP CSV file directly. To do this, click on the "File" menu in the notebook and select "Upload"; c) Locate the GISTEMP CSV file that you downloaded in Step 1 on your computer, select it, and click "Open." The CSV file will be uploaded to your Google Colab notebook and placed within your "GISTEMP Analysis" folder in Google Drive.

You now have a dedicated folder on your Google Drive named "GISTEMP Analysis," a Google Colab notebook for your analysis, and the GISTEMP CSV file, all neatly organized within the same location. This setup ensures easy access to your dataset and maintains a tidy workspace for your analysis.


PART 2: Now, let's start coding!

Step 3: Mount your Google Drive folder

Now, we'll proceed to code the processing and analysis of the GISTEMP data within the Google Colab environment.

# Mount Google Drive so that files stored there are visible to this notebook
from google.colab import drive
drive.mount('/content/drive')

In this code snippet, you are using Google Colab, a cloud-based Jupyter Notebook environment, to mount your Google Drive into your Colab session.
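
Once the drive is mounted, it can help to confirm that the CSV file is where you expect it before trying to load it. The snippet below is a minimal sketch, not part of the original workflow: `find_csv_files` is a hypothetical helper, and the folder path assumes the "GISTEMP Analysis" folder name suggested in Step 2.

```python
from pathlib import Path


def find_csv_files(folder):
    """Return the sorted names of CSV files directly inside `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        # Folder is missing (for example, the drive is not mounted yet)
        return []
    return sorted(p.name for p in folder.glob("*.csv"))


# Assumed folder path; adjust it to match your own Drive folder name.
drive_folder = "/content/drive/My Drive/GISTEMP Analysis"
print(find_csv_files(drive_folder))
```

If the printed list is empty, the folder path is probably different from the one assumed here, or the file was uploaded somewhere else.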


Step 4: Load and plot GISTEMP data

The following code essentially loads, preprocesses, visualizes, and analyzes the GISTEMP data to identify and visualize temperature anomalies and their overall trend over time. The trend line helps to reveal long-term climate trends.

The selected CSV file contains monthly temperature anomaly data. Each row represents a different year, and the columns store temperature anomaly values for each month of the year (January to December), as well as annual and seasonal values (e.g., "J-D" for the January-December annual mean). The temperature anomalies are measured in degrees Celsius (°C) and represent deviations from a reference period. Missing values are denoted as '***', and the data spans many years, allowing for the analysis of long-term temperature trends.
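
To make the idea of an anomaly concrete, here is a small sketch of how a deviation from a reference period is computed. The temperatures are made up for illustration, and the four-year baseline is an assumption chosen to keep the example short (GISTEMP itself uses a much longer base period, 1951-1980).

```python
import pandas as pd

# Made-up absolute temperatures (°C) for illustration only; not real GISTEMP values.
temps = pd.Series(
    [14.0, 14.1, 13.9, 14.0, 14.6, 14.8],
    index=[1951, 1952, 1953, 1954, 2019, 2020],
    name="temperature",
)

# Assumed reference period for this toy example: 1951-1954.
baseline = temps.loc[1951:1954].mean()

# An anomaly is the departure of each value from the baseline mean.
anomalies = (temps - baseline).round(2)
print(anomalies)
```

Positive anomalies mean a year was warmer than the reference period, and negative anomalies mean it was cooler.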

We will narrow our focus to annual trends in our analysis. This means that we will examine temperature anomalies and trends on a yearly basis, which can provide a more comprehensive and discernible view of how global temperatures have evolved over time. By breaking down the data into yearly increments, we can identify patterns and fluctuations more clearly, making it easier to draw meaningful conclusions from the plot.

Step 5: Read the GISTEMP data

filePath = 'REPLACE WITH YOUR FILE PATH'

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import linregress
import numpy as np        

Explaining the code above:

  • In the first part, the code sets the file path filePath to the location of your GISTEMP data CSV file. It then imports the necessary libraries, including Pandas for data handling, Matplotlib for plotting, Seaborn for data visualization enhancements, Scipy's linregress for linear regression analysis, and NumPy for numerical operations.
  • The file path might look something like:

'/content/drive/My Drive/GISTEMP Analysis/GLB.Ts+dSST.csv'

data = pd.read_csv(filePath, skiprows=1, na_values='***')

Explaining the code above:

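  • This code reads the CSV file using Pandas. skiprows=1 skips the first line of the file (a title line that comes before the column names), so the second line is used as the header. na_values='***' tells Pandas to treat any '***' entry as NaN (Not a Number). The result is loaded into a Pandas DataFrame called data.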
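
To see what skiprows and na_values do without touching the real file, you can try them on a few lines of text in the same layout. This is a sketch under stated assumptions: the title line, the truncated set of columns, and all numbers below are made up for illustration.

```python
import io

import pandas as pd

# Synthetic text mimicking the file layout: one title line, one header line,
# then one row per year. Values are invented and most columns are omitted.
sample = io.StringIO(
    "Land-Ocean: Global Means\n"
    "Year,Jan,Feb,J-D\n"
    "2021,.81,.64,.85\n"
    "2022,.91,.89,.89\n"
    "2023,.87,.98,***\n"
)

df = pd.read_csv(sample, skiprows=1, na_values="***")
print(df)
print(df.dtypes)
```

Because '***' is converted to NaN while reading, the "J-D" column comes out as a numeric column rather than as text.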

data = data[["Year", "J-D"]]        

Explaining the code above:

  • Here, the code selects only two columns from the DataFrame: "Year" and "J-D," which correspond to the year and the annual temperature anomaly values.

data = data.replace("***", np.nan)        

Explaining the code above:

  • This line replaces any remaining '***' values in the selected columns with NaN.

data["Year"] = pd.to_numeric(data["Year"])
data["J-D"] = pd.to_numeric(data["J-D"])

data = data.dropna()        

Explaining the code above:

  • These lines convert the "Year" and "J-D" columns to numeric data types, ensuring that the data is suitable for plotting and analysis. This step may be redundant, because Pandas usually infers numeric types once '***' is treated as missing, but it is a harmless safeguard.
  • Rows with missing (NaN) values are removed from the dataset to prepare it for visualization and analysis.
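
The cleaning steps above can be tried on a tiny table to see their effect. This is a sketch with made-up values; it also uses errors="coerce", an optional pd.to_numeric argument not used in the tutorial code, which turns any non-numeric entry into NaN instead of raising an error.

```python
import pandas as pd

# Made-up values; the strings mimic a column that was read in as text.
raw = pd.DataFrame({"Year": ["2021", "2022", "2023"],
                    "J-D": ["0.85", "0.89", "***"]})

clean = raw.copy()
# errors="coerce" converts anything non-numeric (such as '***') into NaN.
clean["Year"] = pd.to_numeric(clean["Year"], errors="coerce")
clean["J-D"] = pd.to_numeric(clean["J-D"], errors="coerce")

# Drop rows that still contain a missing value, then renumber the rows.
clean = clean.dropna().reset_index(drop=True)

print(clean)
print(f"Dropped {len(raw) - len(clean)} incomplete row(s)")
```

In the real file, the row most likely to be dropped is the current year, whose annual mean is not available until the year is complete.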

Step 6: Start plotting the data and calculate trends

plt.figure(figsize=(12, 6))
sns.lineplot(x='Year', y='J-D', data=data)
plt.title('Global Temperature Anomalies Over Time (Annual Anomaly)')
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly (°C)')        

Explaining the code above:

  • This code sets up a figure for the plot, and uses Seaborn to create a line plot of "Year" on the x-axis and "J-D" (annual temperature anomaly) on the y-axis. It also adds a title and labels to the plot for clarity.

Next, we will calculate the trend and add it to the figure:

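# Fit a straight line to the annual anomalies; regressing on the year itself
# means the slope is expressed in °C per year.
slope, intercept, _, _, _ = linregress(data['Year'], data['J-D'])
trend_line = intercept + slope * data['Year']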

Explaining the code above:

  • The code estimates the long-term trend with an ordinary least-squares linear regression, using linregress() from SciPy. The function returns the slope and intercept of the best-fit line (the remaining outputs are discarded with underscores), and trend_line holds the fitted value for each row of data.
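
A quick way to build confidence in the regression step is to run it on a series whose trend you already know. The sketch below uses synthetic data (a warming rate of 0.02 °C per year plus random noise, both chosen arbitrarily) and reads the results by attribute name rather than by unpacking.

```python
import numpy as np
from scipy.stats import linregress

# Synthetic series: a known warming rate of 0.02 °C per year plus small noise.
rng = np.random.default_rng(0)
years = np.arange(1980, 2021)
anomaly = 0.02 * (years - 1980) + rng.normal(0.0, 0.05, size=years.size)

# Regress the anomaly on the year, so the slope is in °C per year.
result = linregress(years, anomaly)
trend_line = result.intercept + result.slope * years

print(f"Trend: {result.slope * 10:.3f} °C per decade")
print(f"R^2: {result.rvalue ** 2:.3f}, p-value: {result.pvalue:.2g}")
```

The recovered slope should be close to the 0.02 °C per year that was built into the data. Multiplying by 10 expresses it per decade, a common way to report climate trends.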

plt.plot(data['Year'], trend_line, label='Trend Line', color='red')
plt.legend()
plt.show()        

Explaining the code above:

  • The trend line is created based on the slope and intercept values and is plotted in red. The legend is added to the plot to label the trend line.
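
If you want to share the figure, or you are working outside Colab, it can be convenient to wrap the plotting steps in a function that saves the result to an image file. This is a minimal sketch rather than the tutorial's own code: `plot_anomalies` and the output file name are hypothetical, it uses plain Matplotlib instead of Seaborn, and the demo numbers are made up.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import linregress


def plot_anomalies(years, anomalies, out_path="gistemp_trend.png"):
    """Plot annual anomalies with a fitted trend line and save the figure."""
    years = np.asarray(years, dtype=float)
    anomalies = np.asarray(anomalies, dtype=float)
    fit = linregress(years, anomalies)

    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(years, anomalies, label="Annual anomaly")
    ax.plot(years, fit.intercept + fit.slope * years, color="red",
            label=f"Trend ({fit.slope * 10:.2f} °C/decade)")
    ax.set_xlabel("Year")
    ax.set_ylabel("Temperature Anomaly (°C)")
    ax.legend()
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)  # free the figure once it has been written to disk
    return out_path


# Made-up numbers for illustration; pass data["Year"] and data["J-D"] instead.
demo_years = np.arange(2000, 2010)
demo_values = np.linspace(0.4, 0.6, demo_years.size)
print(plot_anomalies(demo_years, demo_values))
```

Putting the slope in the legend label lets readers see the size of the trend without needing the code.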

Congratulations! You reached the end of this tutorial. This is what you should get (approximately) after running the code described above:

Figure 1. GISTEMP data (blue) and the trend (red) are plotted using Python.

In conclusion, this tutorial has guided you through the process of accessing, processing, visualizing, and analyzing GISTEMP temperature anomaly data in Python, clearly highlighting the warming trend. With these newly acquired skills, you are well-equipped to explore climate data from GISTEMP and other sources, gain a deeper understanding of our planet's changing climate, and share it with others.

What if you want to compare trends over different periods within your dataset?

That will be the topic of an upcoming tutorial!
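
In the meantime, here is a minimal sketch of the idea: filter the table to a range of years, then fit the same linear regression to that subset. `trend_per_decade` is a hypothetical helper, and the demo series is made up so that its warming rate doubles after 2000. With the real data you would pass the cleaned data DataFrame instead.

```python
import pandas as pd
from scipy.stats import linregress


def trend_per_decade(df, start=None, end=None):
    """Linear trend of 'J-D' in °C per decade for Year within [start, end]."""
    subset = df
    if start is not None:
        subset = subset[subset["Year"] >= start]
    if end is not None:
        subset = subset[subset["Year"] <= end]
    return linregress(subset["Year"], subset["J-D"]).slope * 10


# Made-up series whose warming rate doubles after 2000 (not real GISTEMP values).
years = list(range(1980, 2021))
values = [0.01 * (y - 1980) if y <= 2000 else 0.2 + 0.02 * (y - 2000) for y in years]
demo = pd.DataFrame({"Year": years, "J-D": values})

print(f"Full record: {trend_per_decade(demo):.2f} °C/decade")
print(f"2000 onward: {trend_per_decade(demo, start=2000):.2f} °C/decade")
```

Keep in mind that trends fitted over short periods are more sensitive to year-to-year variability than trends fitted over the full record.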


📖 Read more about GISTEMP:

Hansen, J., R. Ruedy, M. Sato, and K. Lo, 2010: Global surface temperature change. Rev. Geophys., 48, RG4004, doi:10.1029/2010RG000345.

Lenssen, N., G. Schmidt, J. Hansen, M. Menne, A. Persin, R. Ruedy, and D. Zyss, 2019: Improvements in the GISTEMP uncertainty model. J. Geophys. Res. Atmos., 124, no. 12, 6307-6326, doi:10.1029/2018JD029522.


📖 Read more about Google Colab:

https://colab.research.google.com/?utm_source=scs-index

https://colab.google/


📖 GISTEMP citation:
