Data Analysis Project on Cryptocurrency Market Based on Machine Learning Algorithm
Hi! In this project, we are going to analyze a dataset of the cryptocurrency market.
The task is to predict the closing price as a True/False label. We convert the closing price into True if close > 2.1 and False otherwise, and then train a model on the other price columns to predict that label.
Download the CryptoCurrency Dataset
Now import the libraries we need to analyze the data and predict the closing-price label
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
In this model, we use logistic regression. Despite its name it is a classification algorithm, so it predicts the True/False label rather than the price itself.
Now we are going to read the cryptocurrency dataset for analysis
df = pd.read_csv("crypto-markets.csv")
After reading the CSV file, the first five rows of our dataset look like this
      slug symbol     name        date  ranknow    open    high     low   close  volume
0  bitcoin    BTC  Bitcoin  2013-04-28        1  135.30  135.98  132.10  134.21     0.0
1  bitcoin    BTC  Bitcoin  2013-04-29        1  134.44  147.49  134.00  144.54     0.0
2  bitcoin    BTC  Bitcoin  2013-04-30        1  144.00  146.93  134.05  139.00     0.0
3  bitcoin    BTC  Bitcoin  2013-05-01        1  139.00  139.89  107.72  116.99     0.0
4  bitcoin    BTC  Bitcoin  2013-05-02        1  116.38  125.60   92.28  105.21     0.0
To find the total number of rows and columns
df.shape
To list the names of all columns
df.keys()
We don't need every column to predict the closing price, so we keep only the 'open', 'high', 'low', and 'close' columns in our data frame
df = df[[ "open", "high", "low", "close"]]
Many rows have a close value of 1 or less, so we exclude those rows
df = df[df.close > 1]
Now all the values of close in my data frame are greater than 1
     open    high     low   close
0  135.30  135.98  132.10  134.21
1  134.44  147.49  134.00  144.54
2  144.00  146.93  134.05  139.00
3  139.00  139.89  107.72  116.99
4  116.38  125.60   92.28  105.21
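The load, select, and filter steps above can be tried without downloading the full file. The following is a minimal sketch that rebuilds the five sample rows shown earlier as an in-memory CSV. It assumes the real file is comma-separated and uses these column names; the actual file may contain more columns. The names sample_csv, sample, and prices are made up for this illustration.

```python
import io

import pandas as pd

# The five sample rows shown earlier, written out as CSV text.
# Assumption: the real file is comma-separated with these column names.
sample_csv = """slug,symbol,name,date,ranknow,open,high,low,close,volume
bitcoin,BTC,Bitcoin,2013-04-28,1,135.30,135.98,132.10,134.21,0.0
bitcoin,BTC,Bitcoin,2013-04-29,1,134.44,147.49,134.00,144.54,0.0
bitcoin,BTC,Bitcoin,2013-04-30,1,144.00,146.93,134.05,139.00,0.0
bitcoin,BTC,Bitcoin,2013-05-01,1,139.00,139.89,107.72,116.99,0.0
bitcoin,BTC,Bitcoin,2013-05-02,1,116.38,125.60,92.28,105.21,0.0
"""

sample = pd.read_csv(io.StringIO(sample_csv))

# Keep only the four price columns, then drop rows whose close is 1 or less.
prices = sample[["open", "high", "low", "close"]]
prices = prices[prices["close"] > 1]

print(prices.shape)
print(prices)
```

All five sample rows have a close far above 1, so the filter keeps every one of them here. On the full dataset the same filter removes the low-priced rows.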
Next we label each row by whether its close price is greater than 2.1. The comparison returns True when close is greater than 2.1 and False otherwise
TF = df.close > 2.1
Now let's see how many values are greater than 2.1 and how many are not
TF.value_counts()
Now we get the output
True     86056
False    38972
Name: close, dtype: int64
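To see exactly how the comparison behaves, here is a small sketch on a handful of made-up close values (the numbers and the names closes, labels, and counts are only for illustration and do not come from the dataset). Note that the comparison is strict, so a close of exactly 2.1 is labeled False.

```python
import pandas as pd

# Hypothetical close values, chosen to sit on both sides of the threshold.
closes = pd.Series([1.5, 2.1, 2.5, 134.21, 1.05], name="close")

# Element-wise comparison gives a boolean Series, one label per row.
labels = closes > 2.1

# value_counts() tallies how many rows fall into each class.
counts = labels.value_counts()

print(labels.tolist())
print(counts)
```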
Now we prepare the inputs for the model. The features x are the 'open', 'high', and 'low' columns, and the target y is the True/False label
df["TF"] = TF.values
x = df.drop(["close", "TF"], axis=1).values
y = df.TF.values
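Now we train the model using sklearn. We hold out 20% of the rows as a test set, fit logistic regression on the remaining rows, and predict the labels for the test set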
trainX, testX, trainY, testY = train_test_split(x, y, test_size=0.2)
Lr = LogisticRegression()
Lr = Lr.fit(trainX, trainY)
pred = Lr.predict(testX)
So our model is trained now
Displaying pred shows the predicted labels for the test set
array([False, True, False, ..., True, True, False])
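Now we are going to find out the accuracy score on the training data and on the test data
# pred holds predictions for testX, so it can only be compared with testY.
# For the training score we predict on trainX separately.
accuracy_score(trainY, Lr.predict(trainX))
accuracy_score(testY, pred)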
This gives us the accuracy on the training and test data
Accuracy_Score for Training Data is: 0.9830237347783488
Accuracy_Score for Test Data is: 0.9830840598256418
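The split, fit, and scoring steps can also be run end to end without the dataset. The sketch below uses synthetic prices generated with NumPy, so its scores will not match the numbers above. It also makes three choices that are assumptions and not part of the original walkthrough: random_state makes the split repeatable, stratify=y keeps the True/False ratio similar in both splits, and max_iter=1000 gives the solver more iterations to converge.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the price columns (not the real dataset).
rng = np.random.default_rng(0)
close = rng.uniform(1.0, 5.0, size=1000)
open_ = close + rng.normal(0.0, 0.1, size=1000)
high = np.maximum(open_, close) + 0.05
low = np.minimum(open_, close) - 0.05

x = np.column_stack([open_, high, low])  # features: open, high, low
y = close > 2.1                          # target: True/False label

# Assumption: a fixed seed and a stratified split, for repeatable results.
trainX, testX, trainY, testY = train_test_split(
    x, y, test_size=0.2, random_state=42, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(trainX, trainY)

# Score each split against predictions made on that same split.
train_acc = accuracy_score(trainY, model.predict(trainX))
test_acc = accuracy_score(testY, model.predict(testX))

print(f"Accuracy on training data: {train_acc:.3f}")
print(f"Accuracy on test data:     {test_acc:.3f}")
```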
So the accuracy scores for the training and test data are nearly equal (about 0.983 each), which suggests the model is not overfitting
Therefore our model is ready to analyze the data.
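One way to put the 0.983 figure in context is to compare it with a majority-class baseline, that is, the accuracy of always answering True. The short sketch below computes it from the class counts reported earlier. This is only a rough comparison: those counts come from the whole filtered dataset, and the random test split may contain a slightly different share of True rows.

```python
# Class counts reported by TF.value_counts() earlier in this walkthrough.
n_true = 86056
n_false = 38972

total = n_true + n_false

# Accuracy of a "model" that always predicts the more common class.
baseline = max(n_true, n_false) / total

# Test accuracy reported above.
model_test_accuracy = 0.9830840598256418

print(f"Rows after filtering:      {total}")
print(f"Majority-class baseline:   {baseline:.4f}")
print(f"Improvement over baseline: {model_test_accuracy - baseline:.4f}")
```

The baseline is roughly 0.69, so the logistic regression model is well above what guessing the most common label would achieve.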