Decline Curve Analysis with Python

Oil and gas fields need to be forecasted. Before reservoir engineering, and in particular Decline Curve Analysis (DCA), became established knowledge and practice, people had no real basis for treating oil and gas reservoirs as predictable. That changed when J. J. Arps published his "Analysis of Decline Curves" with AIME in 1945; since then, it has been the duty of every reservoir engineer to forecast the future production rates of the wells in their field.

Before Arps classified the three decline curve types, engineers had been analyzing decline curves since as early as 1908. In 1924, Cutler identified hyperbolic decline on log-log paper. One year later, Larkey proposed using least squares to extrapolate decline curves. Both contributions laid the groundwork for Arps.

DCA is now standard knowledge and practice for forecasting future production. Moreover, the rise of scientific computing lets engineers carry out rigorous analysis much faster with programs. The objective of this article is to share a tutorial on doing DCA with Python.

Three Decline Curve Types of Arps

Arps (1945) formulated the general decline function for oil and gas wells as follows:

q(t) = qi / (1 + b * di * t)^(1/b)

where q is the production rate (STB/day or SCF/day), t is the time (days), qi is the initial production rate (same unit as q), di is the initial decline rate (1/day), and b is the dimensionless decline exponent. This decline function is known as the hyperbolic decline function.

Based on the value of b in the function, he then classified the decline curves into three types:

  1. The exponential decline has b equal to 0.
  2. The harmonic decline has b equal to 1.
  3. The hyperbolic decline has b between 0 and 1 (exclusive).

In other words, the hyperbolic form is the general decline curve. Plotted together (see below), the harmonic decline falls off most slowly and the exponential decline most rapidly.
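
To make the classification concrete, the three cases can be written as small Python functions (a minimal sketch; the symbols follow the definitions above):

```python
import numpy as np

def exponential(t, qi, di):
  # b = 0: q = qi * exp(-di * t)
  return qi * np.exp(-di * t)

def harmonic(t, qi, di):
  # b = 1: q = qi / (1 + di * t)
  return qi / (1 + di * t)

def hyperbolic(t, qi, di, b):
  # 0 < b < 1: q = qi / (1 + b * di * t)^(1/b)
  return qi / (1 + b * di * t) ** (1 / b)

# with the same qi and di, the exponential curve declines fastest
# and the harmonic curve slowest
t, qi, di = 1000.0, 2000.0, 0.001
print(exponential(t, qi, di), hyperbolic(t, qi, di, 0.5), harmonic(t, qi, di))
```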

[Figure: the three Arps decline curves (exponential, hyperbolic, and harmonic) plotted together]

Given data containing the production rate (q) over time (t), we need to find the three unknowns that best match the data: qi, di, and b. This is the essence of DCA. Non-linear regression is the technique most widely used to estimate these three variables.

In some applications, we may find a b value larger than 1. Although such a result falls outside the formal Arps framework, there are several interpretations of why b can exceed 1. This will be briefly discussed in the last section, on the limitations of Arps' curves.

Now let us start the fun part.

Python Tutorial for Decline Curve Analysis

In this tutorial, I will take a sample from the production data of the Norne Field in the North Sea. The sample comes from one well and records the gas production rate between 2004 and 2006. The Norne Field production dataset is licensed by the Norwegian Petroleum Directorate for research purposes at universities and institutions. The original dataset contains more than one well, with recorded gas, oil, and water production from 1997 to 2006. The sample data is available in my repository.

As usual, first import the three libraries.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Load the data into a dataframe using pandas, convert the date column to datetime format, and extract both the time and the production rate.

filepath = './norne_production_rate_sample.csv'

# load data into a dataframe
df = pd.read_csv(filepath)

# convert date strings to pandas datetime format
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')

t = df['Date']
q = df['Rate (SCF/d)']

# display the data
df

Here we need to be careful. In production data ready for DCA, the time axis should be expressed in elapsed days (from 0 to N days), not as dates. The following script does the conversion.

import datetime

# subtract each datetime from the next to get the intervals
timedelta = [j - i for i, j in zip(t[:-1], t[1:])]
timedelta = np.array(timedelta)
timedelta = timedelta / datetime.timedelta(days=1)

# take the cumulative sum over the intervals, starting from day 0
t = np.cumsum(timedelta)
t = np.append(0, t)
t = t.astype(float)
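
As an aside, the same conversion can be written more concisely with pandas' datetime accessor (an equivalent alternative, shown here on stand-in dates rather than the Norne sample):

```python
import pandas as pd

# stand-in dates for df['Date'] (hypothetical, for illustration)
t_series = pd.Series(pd.to_datetime(['2004-04-01', '2004-05-01', '2004-06-01']))

# elapsed days since the first record, as a float array
t_days = (t_series - t_series.iloc[0]).dt.days.to_numpy(dtype=float)
print(t_days)  # [ 0. 30. 61.]
```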

Next, we plot the data.

plt.plot(t, q, '.', color='red')
plt.title('Production Rate from 01/04/2004 to 01/12/2006', size=13, pad=15)
plt.xlabel('Days')
plt.ylabel('Rate (SCF/d)')
plt.xlim(left=0); plt.ylim(bottom=0)

plt.show()

The gas production rate from 1 April 2004 to 1 December 2006 looks like the following.

[Figure: gas production rate (SCF/d) vs. time in days]

On this data, we will now do decline curve analysis using the non-linear regression (curve fitting) routine available in SciPy. Before fitting, it is important to normalize the data by dividing the time and rate values by their maximum values, so that both range from 0 to 1; this helps the optimization converge.

# normalize the time and rate data
t_normalized = t / max(t)
q_normalized = q / max(q)

Now comes the main part: the non-linear regression. We define the hyperbolic function, import SciPy's curve_fit, and fit the function to the normalized data to obtain the three parameters: qi, di, and b.

# function for hyperbolic decline
def hyperbolic(t, qi, di, b):
  return qi / (np.abs((1 + b * di * t)) ** (1 / b))

# fit the normalized data with the hyperbolic function
from scipy.optimize import curve_fit
popt, pcov = curve_fit(hyperbolic, t_normalized, q_normalized)
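
Besides popt, curve_fit also returns the covariance matrix pcov, and one-standard-deviation uncertainties on the fitted parameters can be read off its diagonal. Here is a self-contained sketch on synthetic normalized data (the Norne sample is not bundled, so the numbers are illustrative only):

```python
import numpy as np
from scipy.optimize import curve_fit

# same hyperbolic model as above, redefined so this sketch is self-contained
def hyperbolic(t, qi, di, b):
  return qi / np.abs(1 + b * di * t) ** (1 / b)

# synthetic normalized data with known parameters and a little noise
rng = np.random.default_rng(0)
t_norm = np.linspace(0, 1, 100)
q_norm = hyperbolic(t_norm, 1.0, 5.0, 0.5) + rng.normal(0, 0.005, 100)

popt, pcov = curve_fit(hyperbolic, t_norm, q_norm, p0=[1, 1, 0.5], bounds=(0, np.inf))
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainty for qi, di, b
print('fitted:', popt)
print('uncertainty:', perr)
```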
  

It takes less than a second. Now we need to pay attention to one thing: we must "de-normalize" the resulting parameters, since the regression was done on the normalized data. After a simple mathematical operation, the "de-normalized" version of the hyperbolic function looks like this.

q(t) = (qi * qmax) / (1 + b * (di / tmax) * t)^(1/b), where qi and di are the values fitted on the normalized data

Hence, we multiply the fitted qi by the maximum value of q (qmax) and divide the fitted di by the maximum value of t (tmax). After that, we print the fitted parameters.

qi, di, b = popt

# de-normalize qi and di
qi = qi * max(q)
di = di / max(t)

print('Initial production rate:', np.round(qi, 3), 'SCF/D')
print('Initial decline rate:', np.round(di, 3), '1/D')
print('Decline exponent:', np.round(b, 3))

Here is our result:

Initial production rate: 2866266.453 SCF/D
Initial decline rate: 0.007 1/D
Decline exponent: 0.5

We obtain an initial production rate of about 2.87 MMSCF per day, an initial decline rate of 0.007 per day, and, most importantly, a decline exponent of 0.5. We are usually also interested in forecasting the production rate and cumulative production out to, say, 1,500 days. So that is our next task.

Qp = (qi^b / ((1 - b) * di)) * (qi^(1-b) - q^(1-b))

The equation above calculates the cumulative production (Qp). We create the function first.

# function for hyperbolic cumulative production
def cumpro(q_forecast, qi, di, b):
  return (((qi**b) / ((1 - b) * di)) * ((qi ** (1 - b)) - (q_forecast ** (1 - b))))  
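
Note that this closed form assumes 0 < b < 1. As a quick sanity check, for b approaching 0 it should reproduce the exponential cumulative production Qp = (qi - q) / di (the numbers below are illustrative only, not from the Norne data):

```python
# same cumulative-production function as above
def cumpro(q_forecast, qi, di, b):
  return (qi ** b / ((1 - b) * di)) * (qi ** (1 - b) - q_forecast ** (1 - b))

# illustrative values only
qi, di = 2000.0, 0.001
q_now = 800.0

Qp_small_b = cumpro(q_now, qi, di, b=1e-6)  # nearly exponential
Qp_expon = (qi - q_now) / di                # exact exponential (b = 0) result
print(Qp_small_b, Qp_expon)
```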

Then use the function to calculate the cumulative production with time extended to 1,500 days.

# forecast gas rate until 1,500 days
t_forecast = np.arange(1501)
q_forecast = hyperbolic(t_forecast, qi, di, b)

# forecast cumulative production until 1,500 days
Qp_forecast = cumpro(q_forecast, qi, di, b)
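
A common follow-up, not part of the original workflow, is to estimate when the rate falls below an economic limit by inverting the hyperbolic rate equation for t, which gives t = ((qi / q_limit)^b - 1) / (b * di). A sketch using the fitted values from above (the 100,000 SCF/d limit is a hypothetical choice):

```python
def time_to_rate(q_limit, qi, di, b):
  # invert q = qi / (1 + b * di * t)^(1/b) for t
  return ((qi / q_limit) ** b - 1) / (b * di)

# fitted values from the tutorial; the economic limit is hypothetical
qi, di, b = 2.86e6, 0.007, 0.5
t_limit = time_to_rate(1e5, qi, di, b)
print(t_limit)  # days until the rate falls to 100,000 SCF/d
```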

All done. We plot the production data vs. time (with the fitted hyperbolic curve) and the cumulative production.

# plot the production data with the forecasts (rate and cum. production)
plt.figure(figsize=(15,5))

plt.subplot(1,2,1)
plt.plot(t, q, '.', color='red', label='Production Data')
plt.plot(t_forecast, q_forecast, label='Forecast')
plt.title('Gas Production Rate Result of DCA', size=13, pad=15)
plt.xlabel('Days')
plt.ylabel('Rate (SCF/d)')
plt.xlim(left=0); plt.ylim(bottom=0)
plt.legend()

plt.subplot(1,2,2)
plt.plot(t_forecast, Qp_forecast)
plt.title('Gas Cumulative Production Result of DCA', size=13, pad=15)
plt.xlabel('Days')
plt.ylabel('Production (SCF)')
plt.xlim(left=0); plt.ylim(bottom=0)

plt.show()

Here are our final plots of the decline curve analysis.

[Figure: production rate with the fitted decline curve (left) and cumulative production forecast (right)]

With this representation, we can forecast the well's gas production rate and cumulative production based on the decline curve.

Limitations of Arps' Decline Curve

Arps formulated his three type curves based on evidence from the late stage of field life. Recalling the types of flow behavior in a reservoir, flow at this stage is boundary-dominated, often called pseudo-steady-state flow, which occurs once the pressure disturbance has reached the reservoir boundary. In the early life of the field, flow in the reservoir is transient. This also suggests why we can get b values larger than 1: the flow may still be behaving transiently.

In other words, Arps' decline curves give acceptable forecasts only in the later life of a field. Decades later, Fetkovich (1980) and Carter (1985) proposed type curves that extend Arps' approach to the analysis of the early life of a field. According to Lee and Wattenbarger (1996), the main difference is that Arps' decline curves are empirical, while those of Fetkovich and Carter rest on a theoretical basis. Both methods are therefore more generally applicable.

References

Lee, J., and R. A. Wattenbarger (1996). Gas Reservoir Engineering. SPE Textbook Series Vol. 5. Society of Petroleum Engineers.

Towler, B. F. (2002). Fundamental Principles of Reservoir Engineering. SPE Textbook Series Vol. 8. Society of Petroleum Engineers.

Norwegian Petroleum Directorate website, Norne Field information.
