Machine Learning the Fed Reaction Function
To capture the Federal Open Market Committee's reaction function for setting the Federal Funds Rate, I fit a random forest model to selected economic data. A recent data science blog post suggested that it is a good idea to share when you get a model to work. One of my goals is to use machine learning and artificial intelligence to augment my portfolio management decisions, and I believe this application of the random forest algorithm (my first) provides some insight. Hopefully, by sharing this model, I will either help someone else with a similar idea or get help improving my own work.
Goal: Get a quantitative sense of the Fed’s pace of rate hikes and the terminal Fed Funds rate.
Idea: The Fed Funds rate is a function of inflation, employment, and financial conditions.
Data: The time period is January 2000 through December 2017. To capture inflation, I use CPI year-over-year and core PCE year-over-year. To capture employment, I use the U6 and U3 unemployment rates, the Atlanta Fed median wage growth tracker, and a variable "Gap" equal to NAIRU minus the U3 unemployment rate. To capture financial conditions, I use the Bloomberg financial conditions index. Sources: Bloomberg, St. Louis Fed.
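For readers who want to reproduce the setup, here is a minimal sketch of assembling such a frame with the engineered Gap feature, in Python/pandas rather than the R used below. The column names mirror the post's variables, but the example values are placeholders for illustration, not the actual Bloomberg/FRED pull; Gap is computed per the stated definition (NAIRU minus U3).

```python
import pandas as pd

# Hypothetical monthly observations; column names mirror the post's
# variables, values are made up for illustration only.
fed = pd.DataFrame({
    "Nairu":    [4.74, 4.74],
    "U6":       [6.8, 7.0],
    "U3":       [3.7, 3.9],
    "CPI":      [2.4, 2.2],
    "PCEcore":  [2.0, 1.9],
    "FCI":      [0.829, 0.5],
    "Wage":     [3.5, 3.3],
    "Fedfunds": [2.5, 2.25],
})

# The engineered slack feature: Gap = NAIRU - U3.
fed["Gap"] = fed["Nairu"] - fed["U3"]
print(fed[["U3", "Gap"]])
```

With the real data, this frame (minus any identifier columns) is what would be passed to the model-fitting call.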
Results: Before learning any machine learning, I was limited to multiple regression, i.e., the linear model. It is worth showing that model's results graphically below, both to motivate doing better and to see how much a new tool in the toolkit helps.
R code: fit_lm <- lm(Fedfunds ~ ., data = fed)
The red line is the actual Fed Funds target rate (upper bound) and the black circles are the model predictions. The linear model clearly does a bad job, so we move on to the random forest.
R code: library(randomForest); r_model <- randomForest(Fedfunds ~ ., data = fed)
Again, the red line is the actual Fed Funds rate (upper bound) and the black circles are the model predictions. The fit is much better.
Discussion: While I think we can get usable information from this model, we must be mindful of its faults. The biggest issue is probably overfitting. Another is the Fed balance sheet: it wasn't a policy factor pre-2008, it is now, and it's not in the model.
The model lists U6, the employment Gap, and wages as the variables of highest importance. This is useful information, as these variables are mentioned in FOMC speeches and are hitting levels usually seen well into a tightening cycle.
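For anyone curious how such an importance ranking is produced, here is a minimal sketch using scikit-learn's RandomForestRegressor on synthetic data standing in for the real panel (the post's model was fit in R with randomForest, where importance(r_model) gives the analogous output; the data below is simulated, so the ranking only illustrates the mechanics):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the 2000-2017 monthly macro panel; features
# ordered as in the post. Real data came from Bloomberg / FRED.
rng = np.random.default_rng(0)
n = 216  # Jan 2000 - Dec 2017, monthly
cols = ["U6", "U3", "Gap", "CPI", "PCEcore", "FCI", "Wage"]
X = rng.normal(size=(n, len(cols)))
# Target loosely driven by slack and inflation, as the post hypothesizes.
y = 2.0 - 0.5 * X[:, 0] + 0.8 * X[:, 2] + 0.3 * X[:, 3] \
    + rng.normal(scale=0.1, size=n)

rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)

# Mean-decrease-in-impurity importance per feature, highest first.
for name, imp in sorted(zip(cols, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:8s} {imp:.3f}")
```

On the simulated data, the features that actually drive the target rise to the top, which is the same sanity check one would apply to the real model's ranking.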
If I use data I think is consistent with peak economic activity:
NAIRU   U6    U3    Gap     CPI   PCEcore   FCI     Wage
4.74    6.8   3.7   -1.04   2.4   2.0       0.829   3.5
The model's prediction is 3.30%. I think the model is probably overestimating the Fed Funds rate, given that it is calibrated on the pre-2008 experience. Nevertheless, it leads me to a 2.75-3.00% conclusion, which is consistent with the Fed's projections. The value is that I can throw different economic scenarios at the model and get a feel for the response.
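Since the payoff is scenario analysis, here is a toy sketch of the "throw scenarios at the model" step. In R this would simply be predict(r_model, newdata = scenario); the Python version below refits a stand-in model on random data purely so the snippet is self-contained, and the scenario rows are illustrative, not the post's actual inputs:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in model trained on random data so the snippet runs on its own;
# the real model is the 2000-2017 random forest from the post.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.3, 0.2])
rf = RandomForestRegressor(n_estimators=300, random_state=1).fit(X, y)

# "Throw scenarios at the model": each row is a hypothetical economy.
scenarios = np.array([
    [0.5, -1.0, 0.3],    # e.g. a strong-activity scenario
    [-1.0, 1.0, -0.5],   # e.g. a slack-economy scenario
])
preds = rf.predict(scenarios)
for s, p in zip(scenarios, preds):
    print(s, "->", round(float(p), 2))
```

The useful habit is comparing predictions across scenarios rather than trusting any single point estimate, which matches how the post uses the 3.30% figure.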
Conclusion: The model, coupled with Fed statements and actions, suggests the Fed will raise rates every other meeting until either 1) Fed Funds hits somewhere close to 3% or 2) something breaks in the real economy. As a fixed income portfolio manager, my decision is to be patient and to add duration on the short end only once the hikes are priced in. In addition, the June, September, and December Eurodollar futures are probably too high in price (too low in yield) given economic projections as of mid-March.
Any suggestions to improve are appreciated.
Thanks. Interesting application of ML. Can you share your code?
1. How did you split your data between training and test sets? If you didn't split the data, then you're definitely overfitting. RFs are very good at fitting the training set, so you'll often get very high accuracy if you test against the training data. That may be why your results appear so accurate.
2. What were the R-squared and RMSE against the test set? It's a little tricky with time series because you have to respect the temporal aspect of the data. You may have to produce multiple train/test splits and calculate accuracy metrics for each, or use walk-forward validation. You can learn more here: https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/
3. What hyperparameters did you use? You could be overfitting if you have a leaf size of 1 sample and unlimited splits.
4. Were your features just one-period lags of the economic variables? What periodicity did you use for the inputs and outputs?
5. Did you try any other regression models?
6. Did you scale or normalize any of your data? In general, RFs don't require scaling, but I think with temporal data it might be helpful, as the scales change over time for some variables.
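On the walk-forward point in item 2, here is a minimal sketch of what that validation could look like, using scikit-learn's TimeSeriesSplit on synthetic data in place of the real panel (the number of splits, model settings, and data are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

# Synthetic monthly series standing in for the 2000-2017 panel.
rng = np.random.default_rng(2)
X = rng.normal(size=(216, 7))
y = 0.5 * X[:, 0] + rng.normal(scale=0.2, size=216)

# Walk-forward validation: each split trains only on observations that
# precede the test window, respecting the temporal ordering.
rmses = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X[train_idx], y[train_idx])
    pred = rf.predict(X[test_idx])
    rmses.append(mean_squared_error(y[test_idx], pred) ** 0.5)
print([round(r, 3) for r in rmses])
```

Reporting the out-of-sample RMSE per fold (or its average) gives a far more honest picture of fit quality than the in-sample plots in the post.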
Insightful way to use AI to augment investment decisions Jim. Very cool!