Plot the climate stripes using NASA GISTEMP and Python.

Intended audience: Early Career/Students; Educators; Science communicators

By Vasco Mantas (Feb 2024)

🎯 Learning objectives:

  1. Understanding Climate Data Processing: Learn how to process and manipulate climate data, specifically GISTEMP data, using Google Colab and Python libraries such as Pandas. Understand concepts such as anomaly calculation and data normalization to prepare the data for visualization.
  2. Visualization Techniques with Python and Matplotlib: Explore visualization techniques using Matplotlib to create the iconic climate stripes. Learn how to use colormaps to represent temperature anomalies, customize bar plots for displaying yearly data, and incorporate colorbars to provide a visual reference for temperature variations.
  3. Interpreting Climate Patterns: Gain insight into interpreting climate patterns and trends from climate stripes. Understand how to identify long-term temperature trends, anomalies, and variability from the visualization. Learn to communicate climate change effectively by understanding the visual cues provided by climate stripes.

This continuation of our GISTEMP tutorial (find it here) guides you through the needed steps of constructing climate stripes graphs using Python libraries. We'll leverage the foundation you established in Part 1, building upon your data acquisition and manipulation skills to transform numbers into visually striking stripes that paint a clear picture of our warming planet.

But what are the Climate Stripes?

Climate stripes are a visualization technique representing temperature anomalies over time using colored stripes. Each stripe represents a year, with colors indicating temperature anomalies (cooler to warmer). They're effective for conveying long-term temperature trends, communicating climate change, and highlighting temperature variations in a visually intuitive way.

So, are you ready to unveil the hidden narratives within GISTEMP data? Let's dive into the world of climate stripes and empower ourselves to become effective communicators of climate science!


Target audience: This tutorial is designed for those at the introductory level, making it particularly valuable for students or early career professionals seeking to learn about the dataset and quick methods for plotting its data. Additionally, educators and science communicators with similar objectives will find this tutorial a valuable resource.

The tutorial is organized in 2 parts: 1) Access the data and create a Google Colab notebook (Python); 2) Load GISTEMP data and plot the anomalies in the form of climate stripes, similar to those created by Dr. Ed Hawkins.


For a background on GISTEMP and an explanation on our motivation to write the tutorial, please read the previous article here.

To make sure readers can follow the tutorial even if they didn't read the previous article, all needed code is included (even if it becomes redundant to those who completed the first part).


PART 1: Download data and set up the environment (skip Part 1 if you completed the previous tutorial).

Step 1: Download GISTEMP Data

The first crucial step in your journey to analyzing climate data with GISS Surface Temperature Analysis version 4 (GISTEMP) in Python is to obtain the dataset. You can access this data from the official GISTEMP website provided by NASA. Here's how to do it:


  1. Go to the GISTEMP website: https://data.giss.nasa.gov/gistemp/.
  2. Scroll down until you find the section titled "Global-mean monthly, seasonal, and annual means, 1880-present, updated through most recent month":


This section contains links to the GISTEMP data in various formats, including TXT and CSV. For this tutorial, we'll focus on downloading the data in CSV format.

The name of the file is: GLB.Ts+dSST.csv


Important note:

To ensure that everyone can follow this tutorial seamlessly, regardless of Python installations and other technical details, we'll be using the free Google Colab tool.

Google Colab is a cloud-based platform that provides free access to Jupyter notebooks. It's an ideal choice for this tutorial as it requires no setup and allows for collaborative and interactive coding. Google Colab provides a Python runtime environment and integrates with Google Drive, making it an excellent choice for data analysis and machine learning projects without the need for local installations.

Step 2: Set Up a Google Drive Folder, Create a Google Colab Notebook, and Upload the CSV File

In this step, we'll organize your Google Drive by creating a dedicated folder, generating a Google Colab notebook within that folder, and then proceeding to upload the GISTEMP CSV file. This method streamlines access to your data and helps keep your analysis organized.

  1. Create a Google Drive Folder: a) Open your Google Drive in a web browser and log in to your Google account if you haven't already; b) Click on the "+ New" button on the left-hand side to create a new folder. Name it appropriately, e.g., "GISTEMP Analysis."
  2. Create a Google Colab Notebook: a) Inside the "GISTEMP Analysis" folder, click on the "+ New" button once more, but this time select "More" and then "Google Colaboratory." A new Google Colab notebook will be created and automatically placed within the folder.
  3. Upload the CSV File: a) Open the "GISTEMP Analysis" folder by double-clicking on it; b) In your newly created Google Colab notebook, you can upload the GISTEMP CSV file directly. To do this, click on the "File" menu in the notebook and select "Upload."; c) Locate the GISTEMP CSV file that you downloaded in Step 1 on your computer, select it, and click "Open." The CSV file will be uploaded to your Google Colab notebook and placed within your "GISTEMP Analysis" folder in Google Drive.


You now have a dedicated folder on your Google Drive named "GISTEMP Analysis," a Google Colab notebook for your analysis, and the GISTEMP CSV file, all neatly organized within the same location. This setup ensures easy access to your dataset and maintains a tidy workspace for your analysis.


Part 2: Load GISTEMP data and plot the anomalies in the form of climate stripes

Step 3. Mount your Google Drive folder

Now, we'll proceed to code the processing and analysis of the GISTEMP data within the Google Colab environment.

# Mount the google drive
from google.colab import drive
drive.mount('/content/drive')        

In this code snippet, you are using Google Colab, a cloud-based Jupyter Notebook environment, to mount your Google Drive into your Colab session.
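Once the drive is mounted, it can be worth confirming that the CSV file is really where you expect it before going further. The sketch below is one possible way to do that. Note that `find_csv` is a hypothetical helper written for this example (it is not part of Colab or of the original tutorial), and the folder path in the usage comment assumes you named your folder "GISTEMP Analysis" as in Step 2.

```python
from pathlib import Path

def find_csv(folder, filename="GLB.Ts+dSST.csv"):
    """Return the full path to `filename` inside `folder`, or None if it is missing.

    `find_csv` is a hypothetical helper written for this sketch.
    """
    candidate = Path(folder) / filename
    return str(candidate) if candidate.is_file() else None

# Example usage in Colab (assumed folder name; adjust to your own Drive layout):
# filePath = find_csv('/content/drive/My Drive/GISTEMP Analysis')
```

If the function returns None, double-check the folder name and that the upload in Step 2 finished.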


Step 4. Load and plot GISTEMP data

The following code loads, preprocesses, and visualizes the GISTEMP data so that temperature anomalies and their overall trend over time stand out. Unlike the previous tutorial, no trend line is fitted here; the progression of colors across the stripes is what reveals the long-term climate trend.

The selected CSV file contains monthly temperature anomaly data. Each row represents a different year, and the columns store temperature anomaly values for each month of the year (January to December), as well as annual and seasonal values (e.g., "J-D" for the January to December annual mean). The temperature anomalies are measured in degrees Celsius (°C) and represent deviations from a reference period. Missing values are denoted as '***', and the data spans multiple years, allowing for the analysis of long-term temperature trends.

We will narrow our focus to annual trends in our analysis. This means that we will examine temperature anomalies and trends on a yearly basis, which can provide a more comprehensive and discernible view of how global temperatures have evolved over time. By breaking down the data into yearly increments, we can identify patterns and fluctuations more clearly, making it easier to draw meaningful conclusions from the plot.

Step 5. Read the GISTEMP data

filePath = 'REPLACE WITH YOUR FILE PATH'

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors  # provides Normalize, used in Step 6
import seaborn as sns
from scipy.stats import linregress
import numpy as np

Explaining the code above:

  • In the first part, the code sets the file path filePath to the location of your GISTEMP data CSV file. It then imports the necessary libraries: Pandas for data handling, Matplotlib for plotting (including its colors module, imported as mcolors, which Step 6 uses for normalization), Seaborn for data visualization enhancements, SciPy's linregress for linear regression analysis, and NumPy for numerical operations. Seaborn and linregress are carried over from the previous tutorial and are not used in the code below.
  • The file path might look something like:

'/content/drive/My Drive/GISTEMP/GLB.Ts+dSST.csv'

data = pd.read_csv(filePath, skiprows=1, na_values='***')

Explaining the code above:

  • This code reads the CSV file using Pandas, skipping the first row (a title line that sits above the column names) and replacing any '***' values in the dataset with NaN (Not a Number). It loads the data into a Pandas DataFrame called data.
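To see what skiprows=1 and na_values='***' do without touching the real file, here is a small sketch that parses a made-up snippet from memory. The values and the shortened column list are invented for illustration, and the title line is an assumption about how the real file begins (one title line followed by the header row); the real file has all twelve months plus the seasonal columns.

```python
import io
import pandas as pd

# A tiny stand-in for GLB.Ts+dSST.csv. The values are made up; only the
# layout (a title line, then a header row) is meant to resemble the real file.
sample_csv = """Land-Ocean: Global Means
Year,Jan,Feb,J-D
2000,0.10,0.20,0.15
2001,0.30,***,***
"""

# skiprows=1 drops the title line so that "Year,Jan,..." becomes the header.
# na_values='***' turns the placeholder strings into NaN.
sample = pd.read_csv(io.StringIO(sample_csv), skiprows=1, na_values='***')

print(sample)
print(sample.dtypes)
```

Because the '***' placeholders become NaN at read time, the affected columns are parsed as floating-point numbers rather than as text.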

data = data[["Year", "J-D"]]        

Explaining the code above:

  • Here, the code selects only two columns from the DataFrame: "Year" and "J-D," which correspond to the year and the annual temperature anomaly values.

data = data.replace("***", np.nan)        

Explaining the code above:

  • This line replaces any remaining '***' values in the selected columns with NaN.

data["Year"] = pd.to_numeric(data["Year"])
data["J-D"] = pd.to_numeric(data["J-D"]) 
data = data.dropna()        

Explaining the code above:

  • These lines convert the "Year" and "J-D" columns to numeric data types, ensuring that the data is suitable for plotting and analysis. It may be redundant, but still worth it to include.
  • Rows with missing (NaN) values are removed from the dataset to prepare it for visualization and analysis.
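The cleaning steps above can be tried on a tiny hand-made table. This is only a sketch with invented values, where the string "***" stands in for a missing annual mean, so you can see each step change the data.

```python
import numpy as np
import pandas as pd

# Made-up values; the string "***" stands in for a missing annual mean.
demo = pd.DataFrame({"Year": ["2000", "2001", "2002"],
                     "J-D": ["0.15", "***", "0.40"]})

demo = demo.replace("***", np.nan)           # placeholder -> NaN
demo["Year"] = pd.to_numeric(demo["Year"])   # text -> integers
demo["J-D"] = pd.to_numeric(demo["J-D"])     # text -> floats (NaN is kept)
demo = demo.dropna()                         # drop the row with the missing value

print(demo)
```

Only the rows for 2000 and 2002 survive, which is the behavior you want before plotting: a stripe cannot be colored without a value.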

From here on, the code dives into the thrilling realm of climate stripes visualization! While the code so far covered the data loading and cleaning skills already introduced in the previous tutorial, this section introduces fresh code specifically tailored to craft these impactful representations of global temperature trends. So, buckle up and get ready to transform numbers into visually captivating stripes that tell a powerful and sad story about our warming planet.

Step 6. Plot the climate stripes using GISTEMP data.

cmap = plt.get_cmap('coolwarm')        

Explaining the code above:

  • cmap = plt.get_cmap('coolwarm'): The code defines a colormap using the plt.get_cmap() function from the Matplotlib library. The chosen colormap is 'coolwarm', which represents a range of colors from cool tones (like blue) to warm tones (like red).
  • Colormaps are used in visualizations to assign colors to data points based on their values, helping to convey information effectively through color variations.

Now, we need to determine the minimum and maximum anomaly values, and normalize to a range that is easier to plot.

anomaly_min = data['J-D'].min()
anomaly_max = data['J-D'].max()
normalize = mcolors.Normalize(vmin=anomaly_min, vmax=anomaly_max)        

Explaining the code above:

  • The code calculates the minimum anomaly value from the 'J-D' column of the data variable and assigns it to anomaly_min. It also calculates the maximum anomaly value from the same column and assigns it to anomaly_max.
  • These minimum and maximum values are then used to normalize the anomaly values to the range [0, 1].
  • The mcolors.Normalize() function from Matplotlib's colors module is utilized for this purpose, with vmin set to anomaly_min and vmax set to anomaly_max. This ensures that the anomaly values are scaled proportionally to fit within the [0, 1] range, which is often necessary for proper colormap visualization.
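To make the scaling concrete, here is a minimal sketch with made-up anomaly values. Normalize applies (value - vmin) / (vmax - vmin), so the smallest value maps to 0 and the largest to 1.

```python
import matplotlib.colors as mcolors

# Made-up anomaly values in °C, used only to show the scaling.
anomalies = [-0.5, 0.0, 0.25, 1.0]

normalize = mcolors.Normalize(vmin=min(anomalies), vmax=max(anomalies))

for value in anomalies:
    # Each value is rescaled as (value - vmin) / (vmax - vmin).
    print(value, "->", float(normalize(value)))
```

Notice that with these numbers an anomaly of 0.0 does not land at 0.5: the midpoint of the scale sits halfway between the minimum and the maximum, wherever those happen to be.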

Next, we'll create a list of colors based on the range of anomaly values.

colors = [cmap(normalize(val)) for val in data['J-D']]        

Explaining the code above:

  • It utilizes a list comprehension to iterate over each anomaly value val in the 'J-D' column of the data variable. For each val, it normalizes the value using the normalize object created previously. This ensures that the anomaly value is mapped to a value within the [0, 1] range.
  • The normalized value is then passed to the colormap (cmap) using cmap(normalize(val)). This returns the corresponding color from the colormap based on the normalized value.
  • The resulting list contains colors corresponding to each anomaly value, allowing for visualization of the data using colors that reflect the magnitude of anomalies.
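If you are curious what these colors actually look like as data, the sketch below (with made-up anomaly values) builds the list in the same way and also shows a vectorized alternative, since both the normalizer and the colormap accept whole arrays. The names colors_list and colors_array are just illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

cmap = plt.get_cmap('coolwarm')
values = np.array([-0.5, 0.0, 1.0])  # made-up anomalies in °C
normalize = mcolors.Normalize(vmin=values.min(), vmax=values.max())

# Same approach as in the tutorial: one RGBA tuple per value.
colors_list = [cmap(normalize(val)) for val in values]

# Vectorized alternative: one call returns an array with one RGBA row per value.
colors_array = cmap(normalize(values))

print(colors_list[0])      # RGBA tuple for the coolest value
print(colors_array.shape)  # (number of values, 4)
```

Each color is a set of four numbers between 0 and 1 (red, green, blue, alpha). Either form can be passed to the color argument of plt.bar().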

The following code chunk is responsible for creating a bar graph visualization of global temperature anomalies over time.

bar_width = 1.0  # assumed value: one year per bar, so neighboring stripes touch

plt.style.use('dark_background')
plt.figure(figsize=(12, 6))
plt.bar(data['Year'], [1] * len(data), color=colors, width=bar_width)
# ax is passed explicitly because the ScalarMappable is not attached to any plot
plt.colorbar(plt.cm.ScalarMappable(norm=normalize, cmap=cmap), ax=plt.gca(), label='Temperature Anomaly (°C)')
plt.title('Global Temperature Anomalies Over Time (Bar Graph)')
plt.xlabel('Year')
plt.ylabel('Bars (Equal Heights)')

Explaining the code above:

  • bar_width = 1.0: Defines the width of each bar. The value is an assumption; a width of one year makes adjacent bars touch, so the stripes have no gaps between them.
  • plt.style.use('dark_background'): Switches to Matplotlib's built-in dark theme, which makes the colors stand out.
  • plt.figure(figsize=(12, 6)): This line creates a new figure for the plot with a specified size of 12 inches in width and 6 inches in height.
  • plt.bar(data['Year'], [1] * len(data), color=colors, width=bar_width): This line creates a bar plot where each bar represents a year. The height of each bar is set to 1, and the width of the bars is defined by the bar_width variable. The color parameter is set to the list of colors generated earlier based on the anomaly values.
  • plt.colorbar(plt.cm.ScalarMappable(norm=normalize, cmap=cmap), ax=plt.gca(), label='Temperature Anomaly (°C)'): This line adds a colorbar to the plot. The colorbar represents the range of temperature anomalies, with colors mapped according to the colormap (cmap) and normalized using the normalize object. The ax argument tells Matplotlib which axes to attach the colorbar to; newer Matplotlib releases raise an error without it when the mappable is not tied to a plot. The label parameter sets the label for the colorbar, indicating the unit of temperature anomaly in degrees Celsius.
  • plt.title('Global Temperature Anomalies Over Time (Bar Graph)'): Sets the title of the plot as "Global Temperature Anomalies Over Time (Bar Graph)".
  • plt.xlabel('Year'): Sets the label for the x-axis as "Year".
  • plt.ylabel('Bars (Equal Heights)'): Sets the label for the y-axis as "Bars (Equal Heights)". The temperature information is carried by the colors and the colorbar, not by the bar heights.

Finally, let's apply some adjustments to the plot's axes limits before displaying the plot.

plt.ylim(0, 1)
plt.xlim(1880, 2022)

plt.show()        

Explaining the code above:

  • plt.ylim(0, 1): Sets the limits for the y-axis to the range from 0 to 1. Because every bar has a height of 1, the stripes fill the full height of the plot.
  • plt.xlim(1880, 2022): Sets the limits for the x-axis, restricting the range from the year 1880 to the year 2022. This focuses the plot on the specified time period.
  • plt.show(): Finally, this command displays the plot with all the specified settings and adjustments.
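For readers who prefer to keep everything in one place, the plotting steps can also be wrapped in a single function. The following is a minimal sketch, not the original author's code: `plot_stripes` is a hypothetical helper, the bar width of 1.0 is an assumption, and the data used to try it out is synthetic rather than real GISTEMP values. It derives the x-axis limits from the data instead of hard-coding the years.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

def plot_stripes(df, year_col="Year", value_col="J-D", cmap_name="coolwarm"):
    """Draw one equal-height colored bar per year and return (fig, ax).

    `plot_stripes` is a hypothetical helper written for this sketch.
    """
    cmap = plt.get_cmap(cmap_name)
    norm = mcolors.Normalize(vmin=df[value_col].min(), vmax=df[value_col].max())
    colors = cmap(norm(df[value_col].to_numpy()))

    fig, ax = plt.subplots(figsize=(12, 6))
    # width=1.0 (assumed) makes neighboring bars touch, so there are no gaps.
    ax.bar(df[year_col], 1, color=colors, width=1.0)
    # Bars are centered on each year, so pad the limits by half a bar.
    ax.set_xlim(df[year_col].min() - 0.5, df[year_col].max() + 0.5)
    ax.set_ylim(0, 1)
    ax.set_yticks([])  # bar height carries no information
    ax.set_xlabel("Year")
    fig.colorbar(plt.cm.ScalarMappable(norm=norm, cmap=cmap), ax=ax,
                 label="Temperature Anomaly (°C)")
    return fig, ax

# Synthetic data for illustration only (not real GISTEMP values).
years = np.arange(1880, 1890)
demo = pd.DataFrame({"Year": years, "J-D": np.linspace(-0.3, 0.6, len(years))})
fig, ax = plot_stripes(demo)
# In a notebook, call plt.show() to display the figure.
```

With your own data, you would call plot_stripes(data) after the cleaning steps in Step 5.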

Congratulations! You reached the end of this tutorial. This is what you should get (approximately) after running the code described above:

[Figure: climate stripes plot produced by the code above]

In a climate stripes plot, viewers can interpret various aspects of temperature patterns over time. The color gradient, ranging from cooler blue tones to warmer red tones, allows for immediate visual identification of temperature anomalies. Blue stripes indicate lower-than-average temperatures, typically observed earlier in the analysis period, while red stripes signify higher-than-average temperatures, often occurring later in the timeline. By observing the overall pattern of colors, viewers can discern long-term temperature trends, identifying periods of stability, fluctuations, or significant changes in temperature over time. A concerning trend highlighted by climate stripes is the gradual transition from predominantly cooler blue stripes to warmer red stripes, indicating a consistent increase in global temperatures over the analysis period. This visual representation effectively communicates the phenomenon of climate change, emphasizing the urgency of addressing its impacts on our planet's climate system.
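One caveat when reading the colors: with the min-max normalization used in Step 6, the neutral midpoint of 'coolwarm' falls halfway between the smallest and largest anomaly, which is not necessarily 0 °C. If you want blue to mean strictly below the reference period and red strictly above it, one option is Matplotlib's TwoSlopeNorm. This is a sketch of an alternative, not what the tutorial code above does, and the limits are made up; it requires the minimum to be below zero and the maximum above zero.

```python
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

cmap = plt.get_cmap('coolwarm')

# Made-up limits in °C; in the tutorial these would be anomaly_min and anomaly_max.
vmin, vmax = -0.5, 1.0

# TwoSlopeNorm maps vmin -> 0, vcenter -> 0.5 and vmax -> 1,
# scaling each side of the center separately.
centered = mcolors.TwoSlopeNorm(vmin=vmin, vcenter=0.0, vmax=vmax)

for value in (vmin, 0.0, vmax):
    print(value, "->", float(centered(value)))
```

Passing centered in place of normalize when building the colors and the colorbar would pin the neutral color to a zero anomaly.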


In conclusion, climate stripes offer a clear visualization of temperature anomalies over time, revealing trends such as the gradual warming of our planet. Stay tuned for upcoming tutorials that will further explore the use of GISTEMP and other datasets, providing insights into plotting and interpreting climate data to deepen your understanding of climate change dynamics.


📖 Read more about GISTEMP and the climate stripes:

Hansen, J., R. Ruedy, M. Sato, and K. Lo, 2010: Global surface temperature change. Rev. Geophys., 48, RG4004, doi:10.1029/2010RG000345.

Lenssen, N., G. Schmidt, J. Hansen, M. Menne, A. Persin, R. Ruedy, and D. Zyss, 2019: Improvements in the GISTEMP uncertainty model. J. Geophys. Res. Atmos., 124, no. 12, 6307-6326, doi:10.1029/2018JD029522.

Website on the climate stripes by the University of Reading.


📖 Read more about Google Colab:

https://colab.research.google.com/?utm_source=scs-index

https://colab.google/


📖 GISTEMP citation:



