Predict Bitcoin Prices Using Deep Learning
Time series forecasting using Long Short Term Memory (LSTM) network
Trillions in central bank stimulus have some investors betting that Bitcoin can be used as a hedge against future inflation. As noted by Richard Galvin of Digital Asset Capital Management, "It’s not just the U.S. story, but more or less every major government is doing that to a similar magnitude.” However, in 2020, Bitcoin fared worse than traditional safe havens such as gold (up 11%) and U.S. 10-year Treasuries. During the depths of the pandemic on March 12, when Bitcoin crashed 40% in its worst single-day decline since 2013, other so-called safe havens proved far more resilient. Given the price volatility and unexpected trends, this article investigates whether the LSTM variant of Recurrent Neural Networks (RNN) can accurately predict Bitcoin prices. It contains a gentle introduction to the theoretical framework of LSTMs and a Python implementation of the algorithm.
Recurrent Neural Networks (RNN) are chain-like neural networks that pass data from one chunk of the network to the next, allowing information to persist. For example, a chunk A of a neural network receives an input x_t and outputs a value h_t, with chunk A looping information back to itself as shown in the diagram below. When chunk A is unrolled, the RNN reveals itself to be equivalent to multiple copies of chunk A, each with its own input x and output h and each passing data on to its successor. In this way, the recurrent loop allows information to be passed from one step of the network to the next.
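To make the recurrence concrete, here is a minimal NumPy sketch (purely illustrative; the actual model later in this article is built with Keras): the same cell is applied at every step, and its hidden state h is fed back in alongside the next input.
import numpy as np
# One "unrolled" application of the cell per input: the hidden state h carries
# information from earlier steps forward to later ones
def rnn_forward(x_seq, W_x, W_h, b):
    h = np.zeros(W_h.shape[0])  # initial hidden state
    outputs = []
    for x_t in x_seq:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        outputs.append(h)
    return outputs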
By connecting previous information to the next node or chunk, RNNs perform very well with sequential data such as time series, e.g., Bitcoin daily close prices. However, as the gap between inputs x and outputs h grows, information from a previous input/output combination gets lost. Consider the eye-popping run-up to $20,000 for Bitcoin in 2017. That event or dependency is "forgotten" by an RNN as the network receives more recent data from 2020. To avoid this memory loss, parameters can be selected to retain certain data. Long Short Term Memory (LSTM) networks avoid the long-term dependency problem by remembering data, such as an unusual run-up in price, using a series of activation layers in each cell that function as a gating mechanism. Because the focus of this article is on the code, we forego the math, but essentially, the gating functions "decide" which data to keep and which to forget. Data that is weighted as more important in a prior cell state is passed on, or adjusted by the function to be retained by subsequent cells; less important data is discarded. A detailed explanation can be found here.
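For readers who want a peek under the hood anyway, the sketch below shows a single LSTM step in NumPy (purely illustrative; the weight matrices and biases are assumed to be already initialised, and the Keras layer used later handles all of this internally). Each sigmoid gate outputs values between 0 and 1 that control how much of the old cell state is kept, how much new information is written, and how much of the cell state is exposed as output.
import numpy as np
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W_f @ z + b_f)        # forget gate: what to discard from the cell state
    i = sigmoid(W_i @ z + b_i)        # input gate: what new information to store
    c_tilde = np.tanh(W_c @ z + b_c)  # candidate values for the cell state
    c = f * c_prev + i * c_tilde      # updated cell state keeps the "important" data
    o = sigmoid(W_o @ z + b_o)        # output gate: which part of the cell state to expose
    h = o * np.tanh(c)                # new hidden state
    return h, c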
Getting Data
To train the LSTM model, we need data. Historical Bitcoin prices can be obtained from Yahoo Finance. The daily close prices represent our predicted value. As our input feature, we will use the Crypto Fear & Greed Index (FNG). The FNG is a multi-factorial market sentiment analysis for Bitcoin. The index factors in price volatility, market momentum and volume, social media sentiment, surveys, and trend data. As the index's authors themselves note, the Bitcoin market tends to react violently relative to the broader stock and bond markets. Bitcoin participants tend to irrationally bid up prices based on a Fear of Missing Out (FOMO), and then overreact with aggressive selling when prices drop. Some have argued that Bitcoin price volatility could be attributed to the relative inexperience of the typical Bitcoin millennial investor. Interestingly, the recent run-up in the stock market during the COVID-19 pandemic has also been attributed to new day traders placing risky bets enabled by another millennial favorite, the Robinhood trading platform. Regardless of the source, we will use the numerical FNG index ranging from 0 to 100, where zero represents "Extreme Fear" and 100 represents "Extreme Greed."
Python Environment
To run the code, a Python SciPy environment must be installed, preferably on Python 3. You must have Keras (2.0 or higher) installed with TensorFlow, along with scikit-learn, pandas, numpy, and matplotlib. Getting your environment up and running can be frustrating, but thankfully, many resources are available on the internet. One such guide is found here: How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda.
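Once everything is installed, a quick sanity check such as the following (an optional step; exact versions will vary by installation) confirms the core packages are importable:
# Confirm the required packages are available and print their versions
import tensorflow, sklearn, pandas, numpy, matplotlib
print("tensorflow:", tensorflow.__version__)
print("scikit-learn:", sklearn.__version__)
print("pandas:", pandas.__version__)
print("numpy:", numpy.__version__)
print("matplotlib:", matplotlib.__version__)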
Data
As of the date of this publication, June 20, 2020, the datasets are up to date. All the code and data can be found on my GitHub: https://github.com/rhahn28/predict_BTC_price
The code below imports the necessary libraries and can be used to query the FNG index via an API.
import numpy as np
import pandas as pd
import hvplot.pandas
# this code can be used to obtain FNG data
get_fng_index = "https://api.alternative.me/fng/"
url = get_fng_index + "?limit=2000&format=csv&date_format=us"
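# Optionally, fetch the data directly and save it locally (this assumes the
# `requests` package is installed and that the endpoint returns plain CSV
# when format=csv is requested; adjust the parsing if it does not)
import requests
response = requests.get(url)
with open('btc_fng.csv', 'w') as f:
    f.write(response.text)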
# Running the following code loads the datasets into a pandas dataframe.
btc_fng_df = pd.read_csv('btc_fng.csv')
btc_fng_df
The next lines of code make the date column the datetime index and drop the fng_classification column. The fng_value is an integer that can be used by the algorithm, whereas the string in fng_classification cannot.
df = pd.read_csv('btc_fng.csv', index_col="date", infer_datetime_format=True, parse_dates=True)
df = df.drop(columns="fng_classification")
df.tail()
Next, the daily close prices, held in a second dataframe, df2, and the FNG index are inner joined into a single dataframe.
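The close prices can be loaded much like the FNG data; the snippet below is one possible way to do it (the file name btc_close.csv is only a placeholder; see the GitHub repository above for the actual data files).
# Load the daily BTC close prices (e.g., exported from Yahoo Finance);
# 'btc_close.csv' is a placeholder file name
df2 = pd.read_csv('btc_close.csv', index_col="date", infer_datetime_format=True, parse_dates=True)
df2 = df2[["Close"]]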
df = df.join(df2, how="inner")
Missing values are dropped using the dropna method and the dataframe is reversed so that it runs in chronological order.
df.dropna(inplace=True)
df = df[::-1]
Create a function that accepts the dataframe, a window size, and the column numbers for the features (X), which in this case will be the FNG index, and the target (y), which is the closing price.
def window_data(df, window, feature_col_number, target_col_number):
    X = []
    y = []
    for i in range(len(df) - window - 1):
        # each sample is a window of consecutive feature values...
        features = df.iloc[i:(i + window), feature_col_number]
        # ...and the target is the value immediately following the window
        target = df.iloc[(i + window), target_col_number]
        X.append(features)
        y.append(target)
    return np.array(X), np.array(y).reshape(-1, 1)
Try a window size of 10
# Try a window size from 1 to 10 and see how the model performance changes
window_size = 10
# Column index 1 is the `Close` column
feature_column = 0
target_column = 1
X, y = window_data(df, window_size, feature_column, target_column)
Segment the data using a standard 70/30 split where 70% of the data is used to train the algorithm and 30% is used to test the model.
# X split
split = int(.7 * len(X))
X_train = X[:split - 1]
X_test = X[split:]
# y split
y_train = y[:split - 1]
y_test = y[split:]
Use MinMaxScaler from sklearn to scale the data from 0 to 1. This estimator scales and translates each feature individually such that it falls within the given range on the training set, e.g. between zero and one. Then reshape the feature arrays into the three-dimensional (samples, timesteps, features) shape expected by the LSTM layer.
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(X)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
scaler.fit(y)
y_train = scaler.transform(y_train)
y_test = scaler.transform(y_test)

# Reshape the features for the model
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
Build and Train the LSTM RNN
Using two lines of code, we can import the Sequential model and the LSTM, Dense, and Dropout layers used to build the model.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
# Build the LSTM model
model = Sequential()
model.add(LSTM(
units=30, return_sequences=True,
input_shape=(X_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=30, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=30))
model.add(Dropout(0.2))
model.add(Dense(1))
Train the model. Use at least 10 epochs, but experiment using any number as computing resources permit. In the snippet below, 100 epochs were used.
# Train the model
model.fit(X_train, y_train, epochs=100, shuffle=False, batch_size=1, verbose=1)
Make predictions, recover the original prices from the scaled versions, and create a dataframe of real and predicted values. As can be seen, at instance 0 the prediction is off since the window has not yet begun.
predicted = model.predict(X_test)
predicted_prices = scaler.inverse_transform(predicted)
real_prices = scaler.inverse_transform(y_test.reshape(-1, 1))
btc_price_predictions = pd.DataFrame({
"Real": real_prices.ravel(),
"Predicted": predicted_prices.ravel()
})
btc_price_predictions.head()
Now, plot the real versus predicted values as a line chart using hvplot.
btc_price_predictions.head(100).hvplot()
The plot shows a reasonable prediction of Bitcoin prices. Additional parameters can be tuned to provide a more accurate result; one quick way to put a number on the error is sketched below.
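As a simple check, the root-mean-square error (RMSE) between the real and predicted prices can be computed directly from the arrays created above:
# Root-mean-square error between the real and predicted prices
rmse = np.sqrt(np.mean((real_prices - predicted_prices) ** 2))
print(f"RMSE: {rmse:.2f}")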
Conclusion
Whether you consider Bitcoin a safe haven, an inflation hedge, or nothing more than a casino, deep learning algorithms such as LSTM, a type of Recurrent Neural Network (RNN), can be used to predict prices with reasonable accuracy. Tuning hyperparameters and experimenting with other data sources may improve predictions. In the next post, we will delve into tuning and evaluating the results. Yay, confusion matrix!