Decline Curve Analysis with Python
Oil and gas fields need to be forecasted. Before the subject of reservoir engineering (mainly the Decline Curve Analysis, or DCA) came as knowledge and practice, people still did not have the notion that oil and gas reservoirs could be made predictable. No longer until J. J. Arps published his "Analysis of Decline Curves" to AIME in 1945, it is now the duty of every reservoir engineer to be able to forecast the future production rate of their wells in the field.
Before Arps classified three decline curve types, actually there were many engineers as early as in 1908 who were analyzing decline curves. In 1924, Cutler discovered the hyperbolic decline on a log-log paper for his analysis. One year later, Larkey proposed the use of least squares to extrapolate decline curves. Both contributed the basis to Arps.
DCA has been general knowledge and practice nowadays to forecast future production. Additionally, the popularity of scientific computing gives more promises to engineers so that we can do rigorous analysis much faster using programs. This objective of this article is to share a tutorial on doing DCA using Python.
Three Decline Curve Types of Arps
Arps (1945) formulated the general decline function for oil and gas wells as follows:
Where q is the production rate (STB/day or SCF/day), t is the time (days), qi is the initial production rate (same unit as q), di is the initial decline rate (same unit as q), and b is the decline exponent. This decline function is known as the hyperbolic decline function.
Based on the value of b in the function, he then classified the decline curves into three types:
- The exponential decline has b equals 0.
- The harmonic decline has b equals 1.
- The hyperbolic decline has b ranges between 0 and 1.
In other words, the hyperbolic decline is the general decline curve. If presented graphically (see below), the harmonic decline has the least decline rate and the exponential decline has the most decline rate.
Given the data that contains the production rate (q) over time (t), we need to find out the three unknown variables that best match the data, which are qi, di, and b. This is the essential of DCA. Non-linear regression is the widely used technique to estimate these three variables.
In some applications, we may find out that b value is larger than 1. Although this result is formally not established in Arps decline curves, there are lots of interpretations why b can be larger than 1. This will be briefly discussed in the last section about Arps limitation.
Now let us start the fun part.
Python Tutorial for Decline Curve Analysis
In this tutorial, I will take a sample from production data of Norne Field in the North Sea. The sample is from one well and recorded gas production rate between 2004 to 2006. The Norne Field production dataset is licensed by The Norwegian Petroleum Directorate for research purposes in universities and institutions. The original dataset contains more than one wells with a recorded gas, oil, and water production data from 1997 to 2006. The sample data is available in my repository: Here.
As usual, firstly, import the three libraries.
import numpy as np import matplotlib.pyplot as plt import pandas as pd
Load the data, create into dataframe using Pandas, convert the date column into datetime format, and specify both the time and the production rate.
filepath = './norne_production_rate_sample.csv' # load data df = pd.read_csv(filepath) # convert date string to Panda datetime format df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d') t = df['Date'] q = df['Rate (SCF/d)'] # display the data df
Here, we need to take concern. In a production data (that will be ready for DCA analysis), the time axis should be displayed as days (from 0 to N days), not as dates. The following scripts do the conversion.
import datetime # subtract one datetime to another datetime timedelta = [j-i for i, j in zip(t[:-1], t[1:])] timedelta = np.array(timedelta) timedelta = timedelta / datetime.timedelta(days=1) # take cumulative sum over timedeltas t = np.cumsum(timedelta) t = np.append(0, t) t = t.astype(float)
Next, we plot the data.
plt.plot(t, q, '.', color='red')
plt.title('Production Rate from 01/12/1990 to 01/11/1992', size=13, pad=15)
plt.xlabel('Days')
plt.ylabel('Rate (SCF/d)')
plt.xlim(xmin=0); plt.ylim(ymin=0)
plt.show()
The gas production rate from 1 April 2004 to 1 December 2006 looks like the following.
On this data, we will now do a decline curve analysis using the non-linear regression (curve fitting) technique, that is available in the Scipy package. Before doing curve fitting, it's important to normalize our data, by dividing each time and rate values with their maximum values, so that values distribute from 0 to 1.
# normalize the time and rate data t_normalized = t / max(t) q_normalized = q / max(q)
Now, is our main part of the non-linear regression. We make the hyperbolic function, import the Scipy package, then we fit the function to the normalized data using the curve_fit method to produce the result of the three parameters: qi, di, and b.
# function for hyperbolic decline def hyperbolic(t, qi, di, b): return qi / (np.abs((1 + b * di * t))**(1/b)) # fitting the data with the hyperbolic function from scipy.optimize import curve_fit popt, pcov = curve_fit(hyperbolic, t_normalized, q_normalized)
It takes less than a second. Now, we need to pay attention to. What we are going to do is to "de-normalize" our resulted parameters, since we do the regression on the normalized data. After a simple mathematical operation, the "de-normalized" version of the hyperbolic function looks like this.
Hence, we need to multiply the resulted qi with the maximum value of q (or qmax), and to divide the resulted di with the maximum value of t (or tmax). After that, we print all the fitted parameters.
qi, di, b = popt
# de-normalize qi and di
qi = qi * max(q)
di = di / max(t)
print('Initial production rate:', np.round(qi, 3), 'MMSCF')
print('Initial decline rate:', np.round(di, 3), 'SCF/D')
print('Decline coefficient:', np.round(b, 3))
Here is our result:
Initial production rate: 2866266.453 SCF/D Initial decline rate: 0.007 SCF/D Decline coefficient: 0.5
We obtain the initial production rate equals 2.86 MMSCF per day, the initial decline rate equals 0.07 SCF per day, and most importantly the decline exponent equals 0.5. We are always interested to forecast how our production rate and cumulative production until 1,500 days will be. So next, this will be our task.
The above equation is to calculate cumulative production (Qp). We create the function first.
# function for hyperbolic cumulative production def cumpro(q_forecast, qi, di, b): return (((qi**b) / ((1 - b) * di)) * ((qi ** (1 - b)) - (q_forecast ** (1 - b))))
Then use the function to calculate the cumulative production with time extended to 1,500 days.
# forecast gas rate until 1,500 days t_forecast = np.arange(1501) q_forecast = hyperbolic(t_forecast, qi, di, b) # forecast cumulative production until 1,500 days Qp_forecast = cumpro(q_forecast, qi, di, b)
All done. We plot the production date vs. time (with the fitted hyperbolic curve) and the cumulative production.
# plot the production data with the forecasts (rate and cum. production)
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.plot(t, q, '.', color='red', label='Production Data')
plt.plot(t_forecast, q_forecast, label='Forecast')
plt.title('Gas Production Rate Result of DCA', size=13, pad=15)
plt.xlabel('Days')
plt.ylabel('Rate (SCF/d)')
plt.xlim(xmin=0); plt.ylim(ymin=0)
plt.legend()
plt.subplot(1,2,2)
plt.plot(t_forecast, Qp_forecast)
plt.title('Gas Cumulative Production Result of DCA', size=13, pad=15)
plt.xlabel('Days')
plt.ylabel('Production (SCF)')
plt.xlim(xmin=0); plt.ylim(ymin=0)
plt.show()
Here are our final plots of the decline curve analysis.
With this representation, we could forecast our production rate and cumulative production of gas from the well, based on the decline curve.
Limitations of Arps' Decline Curve
Arps formulated his three type-curves based on evidence in the late-stage of the field life. If we recall about types of flow behavior in a reservoir, the flow in this stage behaves in a boundary-dominated condition, or often called as the pseudo-steady flow at a time when the reservoir approaches the boundary. In the early life of the field, the flow in the reservoir behaves in a transient condition. If we recall why we could get the b value larger than 1, this could mean that the flow inside still behaves transiently.
In other words, Arps' decline curves give more acceptable forecasts only in the later life of the field. 30 years later, Fetkovich (1980) and Carter (1985) proposed other type curves that extend the Arps' type to be capable in the analysis of the early life of the field. According to Lee and Wattenbarger (1996), the main difference between Arps' and both of the proposed type curves is that Arps' decline is based on empirical evidence, unlike both Fetkovich and Carter are based on theoretical. Therefore, both methods are generally more applicable.
References
Lee, J., R. A. Wattenbarger. (1996). Gas Reservoir Engineering: SPE Textbook Series Vol. 5. Society of Petroleum Engineers
Towler, B. F. (2002). Fundamental Principles of Reservoir Engineering: SPE Textbook Series Vol. 8. Society of Petroleum Engineers
Norwegian Petroleum Directorate website about Norne Field information
Pretty good, except that in a DCA for an oil/gas well, I would like to keep the first parameter (qi) fixed, since it represents the max production. And this code does not do that. So, how do we fix qi with curve_fit?
I was creating a module for decline curves and was puzzled with fitting method (was returning NaN values for some flow rates), turns out the solution to the problem was normalization. This post was very helpful.
Hi Yohanes! Congratulation on your amazing work! I have a question, you briefly explain why might get a b coefficient greater than 1. But what about negative values (less than 0)?
Field Operations Manager with 10 Years of Experience
4yAs Petroleum Engineer and Data Science Enthusiast, I have been looking this tutorial. Thank You!!