Use Joblib to Speed up Loops in Python

Recently, I carried out a heat stress analysis for some Pacific islands using hourly meteorological data. Such analyses commonly require derived quantities computed from the observed variables, and many of these can only be obtained with iterative algorithms.

For example, the wet bulb temperature (Twb) is typically computed from temperature, dew point, and air pressure using an iterative procedure, applied to each hour of data, like the following code.

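import math
import numpy as np

def CAL_TW(T, Td, P, num_iters_max=100, precision=0.01):

    ......

    # Bisection on the bracket between TM0 and TM1: halve it until TOTALM matches TOTAL0
    for I in np.arange(num_iters_max):
        TM = (TM0 + TM1)/2.
        LQ = (2500000.0-2368*TM)*6.112*10**(7.5*TM/(237.3+TM))
        TOTALM = TM + 0.622*LQ/PK/CPD
        if math.fabs(TOTALM-TOTAL0) > precision:
            # Not converged yet: replace one end of the bracket with the midpoint
            if TOTALM > TOTAL0:
                TM0 = TM
            else:
                TM1 = TM
        else:
            # Converged within the requested precision
            TW = TM
            break

    return TW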
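The loop above is a bisection search: it repeatedly halves the interval between TM0 and TM1 until the computed quantity is within `precision` of the target. Since the setup part of CAL_TW is elided here, the following is only a minimal, generic sketch of that pattern and not the actual Twb routine. The names `bisect_solve` and `f`, and the toy equation itself, are hypothetical and chosen purely for illustration; the sketch also assumes the function is increasing on the interval.

```python
import math

def bisect_solve(f, target, lo, hi, num_iters_max=100, precision=0.01):
    """Find x in [lo, hi] with f(x) close to target, assuming f is increasing."""
    mid = (lo + hi) / 2.0
    for _ in range(num_iters_max):
        mid = (lo + hi) / 2.0
        value = f(mid)
        if math.fabs(value - target) <= precision:
            break          # converged within the requested precision
        if value > target:
            hi = mid       # the solution lies in the lower half
        else:
            lo = mid       # the solution lies in the upper half
    # If the iterations run out, the last midpoint is returned
    return mid

# Toy equation (hypothetical, not a physical formula): x + 0.5*exp(0.06*x) = 25
def f(x):
    return x + 0.5 * math.exp(0.06 * x)

x = bisect_solve(f, 25.0, 0.0, 40.0, num_iters_max=100, precision=0.001)
print(x, f(x))
```

Returning the last midpoint when the iteration limit is reached keeps the return value always defined; whether that is the right behavior for a real Twb routine is a design choice.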

When the hourly record is long (e.g., 30 years, about 365*24*30 = 262,800 hours), running this computation hour by hour in a plain loop takes a lot of time.

Joblib provides a simple helper class for writing parallel for loops using multiprocessing. The core idea is to write the code to be executed as a generator expression and let Joblib run it in parallel.

from joblib import Parallel, delayed

data = df[['Temperature', 'Td', 'Pressure']].values

wbs = Parallel(n_jobs=-1)(delayed(CAL_TW)(row[0], 
                                          row[1], 
                                          row[2], 
                                          num_iters_max=1000, 
                                          precision=0.001) for row in data)
df['Twb'] = wbs
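The same pattern can be tried without the meteorological data. The sketch below is a minimal, self-contained version, assuming Joblib is installed; `toy_metric` and the three sample rows are hypothetical stand-ins for CAL_TW and the real DataFrame, not a physical formula or real observations. It shows the serial loop and its parallel counterpart side by side.

```python
from joblib import Parallel, delayed

def toy_metric(t, td, p, weight=0.5):
    """Hypothetical stand-in for CAL_TW; p is accepted but unused."""
    return weight * t + (1.0 - weight) * td

# Each row mimics (temperature, dew point, pressure); the values are made up
rows = [(30.0, 24.0, 1010.0), (28.0, 22.0, 1008.0), (32.0, 26.0, 1012.0)]

# Serial version: an ordinary list comprehension
serial = [toy_metric(t, td, p, weight=0.25) for t, td, p in rows]

# Parallel version: wrap the function in delayed() and pass the
# generator expression to Parallel; results come back in input order
parallel = Parallel(n_jobs=2)(
    delayed(toy_metric)(t, td, p, weight=0.25) for t, td, p in rows
)

print(serial)
print(parallel)
```

Because the results are returned in the same order as the inputs, the list can be assigned straight back to a DataFrame column, as done with `df['Twb']` above.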

It took less than 10 minutes to finish this computation for the whole 30-year hourly dataset. For comparison, computing Twb with the method provided by the MetPy package took more than 3 hours. Applying Joblib to speed up such loops makes a substantial difference.
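To check what Joblib gains on a given machine, the serial and parallel runs can be timed with `time.perf_counter`. The sketch below assumes Joblib is installed and uses a hypothetical CPU-bound function, `busy_work`, in place of CAL_TW; it does not reproduce the timings reported above, and the numbers it prints will vary with the hardware.

```python
import time
from joblib import Parallel, delayed

def busy_work(n):
    """Hypothetical CPU-bound task standing in for one call to CAL_TW."""
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / (i * i)
    return total

inputs = [20000] * 200

# Time the plain serial loop
t0 = time.perf_counter()
serial = [busy_work(n) for n in inputs]
t_serial = time.perf_counter() - t0

# Time the same loop run through Joblib on all available cores
t0 = time.perf_counter()
parallel = Parallel(n_jobs=-1)(delayed(busy_work)(n) for n in inputs)
t_parallel = time.perf_counter() - t0

print(f"serial:   {t_serial:.2f} s")
print(f"parallel: {t_parallel:.2f} s")
```

For a small workload like this one, the cost of starting worker processes can outweigh the gain, so the parallel run is not guaranteed to be faster; the benefit grows as each task does more work.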

Of course, many other parallel computing methods are available. However, Joblib is quite straightforward to understand and apply in practice. For more about Joblib, see https://joblib.readthedocs.io/en/latest/parallel.html.




More articles by Chonghua Yin
