Simplistic Automation of Waterflood Forecasting

Dimitri Moler

Published Oct 13, 2024

Background

This Python script was initially written for a friend who was facing the challenge of generating basic production forecasts for dozens of oil-producing wells under waterflood, using very limited data. Tackling this manually with "your tool of choice" would have been time-consuming and tedious. To address this, I suggested automating part of the process by writing a simple Python script. I’m now sharing this with the broader community, as I believe others may face similar challenges.

Disclaimer

This script was originally developed in a Jupyter Notebook on a quiet Sunday afternoon. No effort was made to optimize the code or adhere to advanced Pythonic principles, and the comments are minimal. The functionality and the reservoir engineering principles applied are basic; this was intentional, given the scope of the task.

Problem Context and Assumptions

Large number of oil producers under waterflood.
Limited input data, consisting only of initial water rate (Qw) and Water-Oil Ratio (WOR).
Fixed total liquid rate for each well.
Voidage replacement ratio is assumed to be 1, and oil rate decline is attributed solely to hydrocarbon sweep and water cut development.
The decline rate follows a simple Arps exponential decline, though it would be straightforward to adapt the code for hyperbolic decline if needed.
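
To make the last assumption concrete, here is a minimal sketch of the Arps rate equation with an optional hyperbolic exponent. The function name `arps_rate` and the parameter `b` are my own illustration and are not part of the script, which uses the exponential form only. The rates and decline values are made up.

```python
import math


def arps_rate(qi, di, t_days, b=0.0):
    """Oil rate after t_days of Arps decline.

    qi     : initial oil rate, STB/D
    di     : nominal decline rate, 1/day
    t_days : elapsed time, days
    b      : Arps exponent; 0 gives exponential decline,
             0 < b <= 1 gives hyperbolic (b = 1 is harmonic)
    """
    if b == 0.0:
        return qi * math.exp(-di * t_days)
    return qi / (1.0 + b * di * t_days) ** (1.0 / b)


# Illustrative values only, not taken from any real well.
q_exp = arps_rate(qi=500.0, di=0.001, t_days=365.0)
q_hyp = arps_rate(qi=500.0, di=0.001, t_days=365.0, b=0.5)
print(f"exponential: {q_exp:.1f} STB/D, hyperbolic (b=0.5): {q_hyp:.1f} STB/D")
```

For the same initial rate and initial decline, the hyperbolic curve sits above the exponential one at later times, so switching forms would also require re-fitting the decline parameters.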

From the above, it's clear why I've referred to this as a "simplistic" solution.
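
The fixed-liquid-rate assumption is what ties water production to the oil decline: whatever is not oil is water. A small sketch of that balance follows; the helper names `water_rate_and_wor` and `water_cut` are hypothetical and only mirror the arithmetic the script does column by column.

```python
def water_rate_and_wor(q_liq, q_oil):
    """Water rate (STB/D) and WOR for a well producing at a fixed liquid rate."""
    if q_oil <= 0:
        raise ValueError("oil rate must be positive to compute WOR")
    q_water = q_liq - q_oil
    return q_water, q_water / q_oil


def water_cut(wor):
    """Water cut (fraction of total liquid) from WOR."""
    return wor / (1.0 + wor)


# Illustrative numbers: 1000 STB/D of liquid, 250 STB/D of which is oil.
q_water, wor = water_rate_and_wor(q_liq=1000.0, q_oil=250.0)
print(q_water, wor, water_cut(wor))
```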

Input

The script reads data from a .csv file, which I’ve simply named "input.csv." Below is a template for the file:
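
As a rough illustration of the layout, the sketch below builds a one-well input file in memory and reads it the way the script does. Only the column order and the short column names come from the script; the header labels and every value in `SAMPLE_CSV` are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical file contents: header labels and all values are invented.
# Dates are written day-first (dd/mm/yyyy), matching dayfirst=True below.
SAMPLE_CSV = """Well,Start,End,Qoi_low,Qoi_mid,Qoi_high,Di_low,Di_mid,Di_high,Qliq,Qo_cutoff,WOR_cutoff
W-01,01/01/2025,01/12/2026,180,200,220,0.0012,0.0010,0.0008,1000,20,30
"""

# Short column names used by the script; the file's own header row is skipped.
COLUMNS = ['Well', 'StartDate', 'EndDate', 'Qoil', 'Qoim', 'Qoih',
           'Dil', 'Dim', 'Dih', 'Qliq', 'Qcut', 'WORcut']

sample = pd.read_csv(io.StringIO(SAMPLE_CSV), skiprows=1, names=COLUMNS,
                     parse_dates=['StartDate', 'EndDate'], dayfirst=True)
print(sample.dtypes)
```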

I believe the column names are self-explanatory, but for clarity, here’s a brief description of each:

Column A: Well names. The number of rows corresponds to the number of wells you want to forecast.
Column B: Start date of the forecast for each well.
Column C: End date of the forecast for each well.
Columns D/E/F: Initial oil rate in STB/D for Low, Mid, and High cases. If only one base case is required, these values should be identical.
Columns G/H/I: Arps nominal decline rates for Low, Mid, and High cases, in 1/day (the script multiplies them by elapsed days).
Column J: Total liquid rate, assumed constant throughout the forecast period.
Columns K and L: Forecast termination criteria, namely the oil rate cutoff and the WOR limit. The forecast for each case ends when the specified end date is reached, when the oil rate drops below the cutoff, or when the WOR exceeds the limit, whichever comes first.
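
Because the liquid rate is fixed, the WOR limit can be restated as an oil rate: WOR = (Qliq - Qo)/Qo exceeds the limit exactly when Qo falls below Qliq/(1 + WOR limit). The sketch below uses that to estimate when an exponential decline would hit either cutoff. The function names are hypothetical, the values are illustrative, and the end date from column C still caps the forecast independently.

```python
import math


def effective_oil_cutoff(q_liq, q_cut, wor_cut):
    """Oil rate (STB/D) at which the first of the two rate-based criteria triggers.

    WOR = (q_liq - q_o) / q_o > wor_cut  is equivalent to
    q_o < q_liq / (1 + wor_cut), so the binding cutoff is the larger rate.
    """
    return max(q_cut, q_liq / (1.0 + wor_cut))


def days_to_cutoff(qi, di, q_abandon):
    """Days until an exponential decline qi*exp(-di*t) reaches q_abandon."""
    if qi <= q_abandon:
        return 0.0
    return math.log(qi / q_abandon) / di


# Illustrative values only.
q_ab = effective_oil_cutoff(q_liq=1000.0, q_cut=20.0, wor_cut=30.0)
print(f"binding oil cutoff: {q_ab:.2f} STB/D")
print(f"days to cutoff: {days_to_cutoff(qi=200.0, di=0.001, q_abandon=q_ab):.0f}")
```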

The output of the script is a .csv file generated for each well, containing the relevant forecast data such as oil rates, water rates, WOR, and cumulative production for the Low, Mid, and High cases.
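
A note on the cumulative volumes: the script approximates each month's production as the start-of-month rate times a number of days. For exponential decline there is also a closed form, Np = (qi - q)/D, which makes a handy sanity check. The sketch below compares the two under assumed inputs; the function names and the numbers are mine, not the script's.

```python
import math

# Days in each month of a non-leap year, used for the stepwise sum.
DAYS_2025 = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def cumulative_oil_exponential(qi, di, t_days):
    """Exact cumulative oil (STB) for exponential decline: (qi - q(t)) / di."""
    return qi * (1.0 - math.exp(-di * t_days)) / di


def cumulative_oil_stepwise(qi, di, days_in_months):
    """Monthly approximation: start-of-month rate times days in that month."""
    total, elapsed = 0.0, 0
    for days in days_in_months:
        total += qi * math.exp(-di * elapsed) * days
        elapsed += days
    return total


exact = cumulative_oil_exponential(qi=200.0, di=0.001, t_days=sum(DAYS_2025))
step = cumulative_oil_stepwise(qi=200.0, di=0.001, days_in_months=DAYS_2025)
print(f"exact: {exact:.0f} STB, stepwise: {step:.0f} STB")
```

The stepwise sum always comes out slightly high, because the rate keeps declining within each month; for small monthly declines the difference is minor.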

The script also saves two plots per well, in case anyone finds them useful: WOR versus cumulative oil production (Np), and oil rate versus date, each showing the Low, Mid, and High cases.

Python Code

Below is the script itself. Feel free to use or modify it as needed. I haven’t uploaded it to GitHub yet for proper cloning, so for now, you can use the good old copy-and-paste method from here.

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import os

matplotlib.rcParams['figure.figsize'] = (16.0, 12.0)
matplotlib.style.use('ggplot')

# Enter the actual path to the working folder here!
# An "Out" folder with all output data will be created inside it.
working_dir = r'C:\Users\molok\Scripts'

outdir = os.path.join(working_dir, 'Out')

# Create the output folder if it does not exist yet
os.makedirs(outdir, exist_ok=True)

# The header row of input.csv is skipped and replaced by the short names below
df = pd.read_csv(os.path.join(working_dir, 'input.csv'), skiprows=1,
                 names=['Well', 'StartDate', 'EndDate', 'Qoil', 'Qoim', 'Qoih', 'Dil', 'Dim', 'Dih', 'Qliq', 'Qcut', 'WORcut'],
                 parse_dates=['StartDate', 'EndDate'], dayfirst=True, engine='python')

for idx, row in df.iterrows():
    buffer = pd.DataFrame({'Date':pd.date_range(start=row['StartDate'], 
                                            end=row['EndDate'], freq='MS')})

    buffer['DaysMonth']=buffer['Date'].dt.daysinmonth
    buffer['CumProdDays']=buffer['Date'].dt.daysinmonth.cumsum().shift().fillna(0)
    buffer['Well'] = row['Well']
    buffer['Qol'] = row['Qoil']*np.exp(-row['Dil']*buffer['CumProdDays'])
    buffer['Qom'] = row['Qoim']*np.exp(-row['Dim']*buffer['CumProdDays'])
    buffer['Qoh'] = row['Qoih']*np.exp(-row['Dih']*buffer['CumProdDays'])
    
    # Monthly oil volume: start-of-month rate times the days in that same month
    buffer['MonthProdLow'] = buffer['Qol']*buffer['DaysMonth']
    buffer['MonthProdMid'] = buffer['Qom']*buffer['DaysMonth']
    buffer['MonthProdHigh'] = buffer['Qoh']*buffer['DaysMonth']

    # Water rate is whatever remains of the fixed liquid rate
    buffer['Qwl'] = row['Qliq'] - buffer['Qol']
    buffer['Qwm'] = row['Qliq'] - buffer['Qom']
    buffer['Qwh'] = row['Qliq'] - buffer['Qoh']

    # Monthly water volume, same convention as the oil volumes
    buffer['MonthWatProdLow'] = buffer['Qwl']*buffer['DaysMonth']
    buffer['MonthWatProdMid'] = buffer['Qwm']*buffer['DaysMonth']
    buffer['MonthWatProdHigh'] = buffer['Qwh']*buffer['DaysMonth']
    
    buffer['WOR_low'] = buffer['Qwl']/buffer['Qol']
    buffer['WOR_mid'] = buffer['Qwm']/buffer['Qom']
    buffer['WOR_high'] = buffer['Qwh']/buffer['Qoh']
    
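    # Blank out rates, WOR and monthly volumes once a cutoff is hit, so that
    # cumulative production stops accumulating past the end of the forecast.
    # np.nan is used because the np.NaN alias was removed in NumPy 2.0.
    low_cols = ['Qol', 'Qwl', 'WOR_low', 'MonthProdLow', 'MonthWatProdLow']
    buffer.loc[buffer['Qol'] < row['Qcut'], low_cols] = np.nan
    buffer.loc[buffer['WOR_low'] > row['WORcut'], low_cols] = np.nan

    mid_cols = ['Qom', 'Qwm', 'WOR_mid', 'MonthProdMid', 'MonthWatProdMid']
    buffer.loc[buffer['Qom'] < row['Qcut'], mid_cols] = np.nan
    buffer.loc[buffer['WOR_mid'] > row['WORcut'], mid_cols] = np.nan

    high_cols = ['Qoh', 'Qwh', 'WOR_high', 'MonthProdHigh', 'MonthWatProdHigh']
    buffer.loc[buffer['Qoh'] < row['Qcut'], high_cols] = np.nan
    buffer.loc[buffer['WOR_high'] > row['WORcut'], high_cols] = np.nan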
    
    buffer['Np_low']=buffer['MonthProdLow'].cumsum()
    buffer['Np_mid']=buffer['MonthProdMid'].cumsum()
    buffer['Np_high']=buffer['MonthProdHigh'].cumsum()
    
    fullname = os.path.join(outdir, row['Well'])
    buffer.to_csv(fullname+'.csv')
    
    plt.figure(figsize = (14, 6))

    plt.scatter(buffer['Np_low'], buffer['WOR_low'], color='red')
    plt.scatter(buffer['Np_mid'], buffer['WOR_mid'], color='green')
    plt.scatter(buffer['Np_high'], buffer['WOR_high'], color='blue')
    
    plt.title(row['Well'])
    plt.xlabel('Np, STB', fontsize=10)
    plt.ylabel('WOR', fontsize=10)

    #Save the figure
    plt.savefig(fullname+".png", dpi = 300, bbox_inches = "tight")
    plt.close()  # release the figure so open figures do not pile up across wells
    
    plt.figure(figsize = (14, 6))

    plt.plot(buffer['Date'], buffer['Qol'], color='red')
    plt.plot(buffer['Date'], buffer['Qom'], color='green')
    plt.plot(buffer['Date'], buffer['Qoh'], color='blue')
    
    plt.title(row['Well'])
    plt.xlabel('Date', fontsize=10)
    plt.ylabel('Qoil, STB/D', fontsize=10)

    #Save the figure
    plt.savefig(fullname+"_rate.png", dpi = 300, bbox_inches = "tight")
    plt.close()  # release the figure so open figures do not pile up across wells