Pythonizing Business Efficiency: Bioenergy Supply Chain Optimization
Photo by Ronan Furuta on Unsplash


Quest for Sustainable Energy

I recently participated in my very first Shell.ai Hackathon (Waste to Energy - Shell.ai Hackathon 2023). This year’s problem is one that most businesses, if not all, face today: supply chain optimization. Irrespective of the industry, the million-dollar question is: how can we increase efficiency while reducing cost? In this case study, we are dealing with the bioenergy industry, something I am excited about as the world cracks down on fossil fuels.

I will now provide an overview of the problem. As promising as alternative fuel sources are, they come with caveats that limit widespread adoption.

  1. High cost: First and foremost, implementing these innovative options can be expensive. Making them a viable choice compared to traditional gasoline, diesel, and natural gas takes ambitious government goals and substantial financial support in the form of subsidies. Even with these goals and subsidies, other obstacles remain.
  2. Biomass-energy conversion ratio: Around 100 tons of crude oil can provide roughly 75 tons of gasoline and diesel combined. However, generating the same 75 tons of biofuel takes 375 tons of biomass in a biorefinery, a staggering amount.
  3. Management of complex logistics: Unlike oil in a well, biomass is seldom concentrated in a single, easily accessible location. Instead, it is distributed across numerous agricultural lands spanning various geographical areas, which makes gathering it from diverse sources logistically intricate.
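
To put the conversion ratio in perspective, here is a minimal sketch of the arithmetic using only the figures quoted above. The variable names are my own and purely illustrative.

```python
# Figures quoted in the list above (all in tons).
CRUDE_OIL_TONS = 100   # crude oil input
FUEL_TONS = 75         # gasoline + diesel obtained, and the biofuel target
BIOMASS_TONS = 375     # biomass needed to produce the same 75 tons of biofuel

# Feedstock required per ton of fuel produced.
crude_per_ton_fuel = CRUDE_OIL_TONS / FUEL_TONS
biomass_per_ton_fuel = BIOMASS_TONS / FUEL_TONS

# How much more feedstock (by weight) biomass needs compared to crude oil.
feedstock_multiplier = BIOMASS_TONS / CRUDE_OIL_TONS

print(f"Crude oil per ton of fuel: {crude_per_ton_fuel:.2f} t")
print(f"Biomass per ton of biofuel: {biomass_per_ton_fuel:.2f} t")
print(f"Biomass needs {feedstock_multiplier:.2f}x the feedstock weight")
```

In other words, every ton of biofuel requires five tons of biomass to be collected and hauled, which is why the logistics matter so much.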

Now, let us envision the setup of a biorefinery in a geographic region. To do this effectively, one needs a deep understanding of the region’s current and future biomass production. The collection and transportation of this biomass to intermediate depots become crucial steps in the process. It is a complex puzzle that requires careful planning and coordination.

In a nutshell, while the potential for cleaner alternatives is immense, there are hurdles to jump. However, with determination, collaboration, and innovation, we can clear these hurdles, leading us towards a more sustainable and eco-friendly future for all.


Data is Everything

You may refer to the original competition’s problem statement here, but I encourage you to read along to better understand it in “natural language”.

The State of Gujarat, India (Google Earth)

If you look over the state of Gujarat, India, on Google Maps, you’ll find agricultural fields spread all over it. These are the fields where the biomass is generated and harvested in variable quantities, i.e., each site may produce a different amount of biomass, which could vary over time (in this case, years). This fluctuation could be due to several factors, such as the size of the agricultural field, its productivity, type of crop, and climatic conditions.

Furthermore, a road network interconnects the fields. Because of curves and corners in the network, the road leading from field A to field B is, in some cases, not the same road that leads from field B to field A. Thus, the distance between two fields can differ depending on the direction of travel. Since the biomass is sparsely distributed across the region, it makes sense to have intermediate locations, or depots, where biomass from nearby fields is gathered and preprocessed (i.e., dehydrated and densified into pellets) before being shipped to the nearest refinery.

Since we do not have infinite resources, we would have to consider some constraints in solving this problem.

  • There are a total of 2418 biomass harvest sites.
  • We can build a maximum of 25 depots and 5 refineries.
  • Each depot has a maximum capacity of 20,000, and each refinery has a maximum capacity of 100,000.
  • We must process at least 80% of the total biomass forecasted for the year.
  • Finally, and somewhat obviously, for each harvest site in each year, we cannot transport a negative amount of biomass, nor can we transport more than was forecasted. Likewise, the total biomass (collected from several harvest sites) shipped to a depot, and the total pellets shipped to a refinery, must not exceed that facility’s maximum capacity.
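
The constraints above can be sketched as a small feasibility check. This is only an illustration and not the competition’s official scorer. The function name `check_plan`, the dictionary layout, and the choice to measure the 80% target on what reaches the refineries are all my own assumptions.

```python
# Limits taken from the constraint list above.
MAX_DEPOTS = 25
MAX_REFINERIES = 5
DEPOT_CAPACITY = 20_000
REFINERY_CAPACITY = 100_000
MIN_PROCESSED_SHARE = 0.80


def check_plan(forecast, site_to_depot, depot_to_refinery):
    """Return a list of violated constraints (an empty list means feasible).

    forecast:          {site: forecasted biomass for one year}
    site_to_depot:     {(site, depot): biomass shipped}
    depot_to_refinery: {(depot, refinery): pellets shipped}
    """
    problems = []

    # Count the distinct facilities that the plan actually uses.
    depots = {depot for _, depot in site_to_depot}
    refineries = {refinery for _, refinery in depot_to_refinery}
    if len(depots) > MAX_DEPOTS:
        problems.append("too many depots")
    if len(refineries) > MAX_REFINERIES:
        problems.append("too many refineries")

    # Site-level and depot-level totals.
    sent_by_site, received_by_depot = {}, {}
    for (site, depot), qty in site_to_depot.items():
        if qty < 0:
            problems.append(f"negative shipment from site {site}")
        sent_by_site[site] = sent_by_site.get(site, 0) + qty
        received_by_depot[depot] = received_by_depot.get(depot, 0) + qty

    for site, qty in sent_by_site.items():
        if qty > forecast.get(site, 0):
            problems.append(f"site {site} ships more than its forecast")
    for depot, qty in received_by_depot.items():
        if qty > DEPOT_CAPACITY:
            problems.append(f"depot {depot} is over capacity")

    # Refinery-level totals.
    received_by_refinery = {}
    for (depot, refinery), qty in depot_to_refinery.items():
        if qty < 0:
            problems.append(f"negative shipment from depot {depot}")
        received_by_refinery[refinery] = received_by_refinery.get(refinery, 0) + qty
    for refinery, qty in received_by_refinery.items():
        if qty > REFINERY_CAPACITY:
            problems.append(f"refinery {refinery} is over capacity")

    # Assumption: "processed" means the quantity that reaches a refinery.
    processed = sum(received_by_refinery.values())
    if processed < MIN_PROCESSED_SHARE * sum(forecast.values()):
        problems.append("less than 80% of the forecast is processed")

    return problems


# Toy example with two sites, one depot, and one refinery.
forecast = {"s1": 100, "s2": 50}
good = check_plan(forecast, {("s1", "d1"): 100, ("s2", "d1"): 30}, {("d1", "r1"): 130})
bad = check_plan(forecast, {("s1", "d1"): 130}, {("d1", "r1"): 130})
print(good)
print(bad)
```

The first plan passes every check, while the second one ships more from site s1 than was forecasted for it.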

To solve this problem, we need data. Thankfully, we don’t have to go scavenging for it. Here is the dataset we’ll be working with.

Biomass_History.csv: A time series of biomass production history in Gujarat from 2010 to 2017. The arable land is depicted as a map comprising 2418 uniformly sized grid blocks, each representing a distinct harvesting site. Each site comes with its location index, latitude, and longitude.

Distance_Matrix.csv: The travel distance from each source grid block to each destination grid block, stored as a 2418 by 2418 matrix. As hinted above, this matrix is not symmetric: U-turns and one-way routes make the distance from source to destination differ from the distance of the return trip.
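
To make the asymmetry concrete, here is a minimal sketch of how one could quantify it. It runs on a made-up 3 by 3 matrix rather than the real file, and the helper name `asymmetry_report` is hypothetical.

```python
import numpy as np


def asymmetry_report(dist):
    """Summarize how far a square distance matrix is from being symmetric."""
    dist = np.asarray(dist, dtype=float)
    gap = np.abs(dist - dist.T)          # |d(i, j) - d(j, i)| for every pair
    n = dist.shape[0]
    # Look only above the diagonal so each unordered pair is counted once.
    asymmetric_pairs = int(np.count_nonzero(np.triu(gap, k=1) > 1e-9))
    return {
        "is_symmetric": bool(np.allclose(dist, dist.T)),
        "asymmetric_pairs": asymmetric_pairs,
        "total_pairs": n * (n - 1) // 2,
        "max_gap": float(gap.max()),
    }


# Toy matrix: row = source, column = destination (an assumed orientation).
# Going from site 0 to site 1 is 4.0, but the return trip is 4.6.
toy = np.array([
    [0.0, 4.0, 7.5],
    [4.6, 0.0, 3.0],
    [7.5, 3.0, 0.0],
])
report = asymmetry_report(toy)
print(report)
```

The same function could be applied to the numeric part of the real distance matrix once it is loaded. The practical consequence is that any cost calculation has to look up the distance in the direction of travel, from the harvest site to the depot, and not the other way around.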

sample_submission.csv: Contains the sample format for solution submissions.

You can find the full dataset here.

Given the prerequisites above, our task is to:

First, forecast the volume of biomass at each biomass site for the years 2018 and 2019.
Second, determine the optimal number of depots and refineries needed to process the forecasted biomass. Together, they should be capable of processing at least 80% of the total forecasted biomass each year.
Third, determine the optimal locations for the depots, considering the amount of biomass produced at each site and the distances between sites. Depots must be located at sites covered by the distance matrix. My first thought was to place the depots at the harvest sites that produce the most biomass, but that idea considers only the amount and not the distance the biomass has to travel. As you’ll soon see on the map, some of the most productive biomass sites cluster in a few areas. We could build all 25 depots there, but that would not be optimal, because biomass from the remaining sites would have to travel longer distances for preprocessing, defeating the purpose of having depots in the first place. The same goes for the refineries.
Fourth, allocate which sites ship to which depot and how much biomass each one hauls. Subsequently, allocate which depots ship to which refinery and how many pellets each one hauls.
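
Two ideas from the task list above can be sketched in a few lines. The first is a lower bound on the number of facilities implied by the capacities and the 80% target. The second scores a candidate depot site by biomass-weighted distance, which shows why the most productive site is not automatically the best location. This is an illustration on made-up numbers, not the solution used in the competition. The function names are hypothetical, and I assume that all processed biomass passes through a depot, that pellets weigh the same as the biomass they came from, and that matrix rows are sources and columns are destinations.

```python
import math

import numpy as np

DEPOT_CAPACITY = 20_000
REFINERY_CAPACITY = 100_000
MIN_SHARE = 0.80


def min_facilities(total_forecast):
    """Fewest depots and refineries that can hold 80% of the forecast."""
    must_process = MIN_SHARE * total_forecast
    depots = math.ceil(must_process / DEPOT_CAPACITY)
    refineries = math.ceil(must_process / REFINERY_CAPACITY)
    return depots, refineries


def weighted_haul_cost(dist, biomass, candidate):
    """Sum over sites of biomass[i] * distance from site i to the candidate."""
    return float(np.dot(biomass, dist[:, candidate]))


# A made-up total forecast of 410,000 needs at least 17 depots and 4 refineries.
print(min_facilities(410_000))

# Toy region: site 0 produces the most but sits far away from sites 1 to 3.
biomass = np.array([50.0, 30.0, 30.0, 30.0])
dist = np.array([
    [0.0, 10.0, 10.0, 10.0],
    [10.0, 0.0, 1.0, 2.0],
    [11.0, 1.0, 0.0, 1.0],
    [10.0, 2.0, 1.0, 0.0],
])
costs = [weighted_haul_cost(dist, biomass, c) for c in range(len(biomass))]
best_site = int(np.argmin(costs))
most_productive_site = int(np.argmax(biomass))
print(costs, best_site, most_productive_site)
```

With a single depot, the cheapest location in this toy region is site 2, in the middle of the cluster, and not site 0, the top producer. A real solution would also have to respect capacities and place many depots at once, which is where a proper optimization model comes in.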

Without further ado, let’s jump right in.


Step 1: Exploratory Data Analysis (EDA)

We begin by importing several Python packages.

import os
import geopandas as gpd
import pandas as pd

# Show up to 200 columns when displaying wide DataFrames
pd.set_option('display.max_columns', 200)

Next, we load our datasets.

# Kaggle input directory that holds the competition files
DATA_PATH = "/kaggle/input/shell-ai-waste-to-energy-dataset"

bh_dataset = pd.read_csv(os.path.join(DATA_PATH, "Biomass_History.csv"))
dm_dataset = pd.read_csv(os.path.join(DATA_PATH, "Distance_Matrix.csv"))
ss_dataset = pd.read_csv(os.path.join(DATA_PATH, "sample_submission.csv"))

# Preview the first five rows of the biomass history
bh_dataset.head()
Table of biomass production in 5 harvest sites from the year 2010 to 2017
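
With a history table shaped like this one, a naive baseline for the 2018 and 2019 forecasts could look like the sketch below. It fits one straight line per site and extrapolates it. This is only a baseline on a made-up three-site table, not the forecasting method of this series. The function name is hypothetical, and I assume the year columns are named "2010" to "2017".

```python
import numpy as np
import pandas as pd


def linear_trend_forecast(history, target_years):
    """Fit a straight line per site (row) and extrapolate, clipped at zero."""
    years = np.array([int(col) for col in history.columns], dtype=float)
    center = years.mean()  # centering the years keeps the fit well conditioned
    # With a 2D y (one column per site), np.polyfit fits every site at once.
    slope, intercept = np.polyfit(
        years - center, history.to_numpy(dtype=float).T, deg=1
    )
    predictions = {
        str(year): np.clip(slope * (year - center) + intercept, 0, None)
        for year in target_years
    }
    return pd.DataFrame(predictions, index=history.index)


# Made-up history: one growing site, one flat site, one shrinking site.
year_cols = [str(y) for y in range(2010, 2018)]
history = pd.DataFrame(
    [
        [10, 12, 14, 16, 18, 20, 22, 24],
        [5, 5, 5, 5, 5, 5, 5, 5],
        [7, 6, 5, 4, 3, 2, 1, 0],
    ],
    index=["A", "B", "C"],
    columns=year_cols,
)
forecast = linear_trend_forecast(history, [2018, 2019])
print(forecast)
```

Clipping at zero matters because a falling trend would otherwise predict negative biomass, which the constraints do not allow. With only eight data points per site, a simple baseline like this is mostly useful as a yardstick for anything more elaborate.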

Next, we combine the longitudes and latitudes to create point geometries for plotting.

# Create point geometries (x = longitude, y = latitude)
geometry = gpd.points_from_xy(bh_dataset.Longitude, bh_dataset.Latitude)
geo_df = gpd.GeoDataFrame(bh_dataset, geometry=geometry)
# Plot biomass production distribution for the year 2017
geo_df.plot(column="2017", legend=True)
Plot of biomass production distribution for the year 2017

To keep the length of the post in check, I will be publishing the solution in chunks over several manageable posts. Please leave a like (clap) if you enjoyed this post. See you soon!




More articles by Collins Patrick O.
