Analysis of Car Performance in R
An Analysis of a Car Dataset in R
Dublin Business School
Continuous Assessment 2 - R Language
Lecturer: Thomas Fitzsimons
Student: Colm Dougan
Student ID: 10205174
Assignment:
Based on this course please create an example use case based on some data that you have. Use the R Graphics library to visualize the data.
Solution:
I will base my analysis on Data concerning High Performance Cars that I have downloaded from the Internet.
I used the dplyr library to process the Data and the ggplot2 library to create the Graphics.
I downloaded Six Data Files on the following characteristics:
Horsepower, Torque, Acceleration Times, Engine Size, Max Speed, and Power to Weight Ratio.
The Data covers a range of Cars from the 1930s to 2012.
Then I loaded the Data Files into RStudio.
I searched for Duplicate Entries in the files and removed them using the following commands on each data set.
> df.(dataset_name) %>% group_by(car_full_nm) %>% summarise(count=n()) %>% filter(count != 1)
> df.(dataset_name) <- distinct(df.(dataset_name), car_full_nm, .keep_all = TRUE)
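As a minimal sketch of the two steps above, here they are on a small toy data frame. The car names, the values, and the column name horsepower_bhp are invented for illustration; only car_full_nm comes from the commands above.

```r
library(dplyr)

# Toy data: names and values are made up for illustration only.
df.example <- data.frame(
  car_full_nm = c("Car A 2010", "Car B 2011", "Car A 2010"),
  horsepower_bhp = c(300, 450, 300),
  stringsAsFactors = FALSE
)

# Step 1: list every car name that occurs more than once.
dupes <- df.example %>%
  group_by(car_full_nm) %>%
  summarise(count = n()) %>%
  filter(count != 1)

# Step 2: keep the first row for each car name.
# .keep_all = TRUE retains the other columns as well.
df.example_clean <- distinct(df.example, car_full_nm, .keep_all = TRUE)
```

After the second step, rerunning the first step on df.example_clean should return no rows.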
The datasets were then joined together using a Left Join.
> df.car_spec_data <- left_join(df.car_horsepower, df.car_torque, by="car_full_nm")
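The same join can be chained to bring in further data sets. The sketch below uses three toy data frames; their names, columns (other than car_full_nm) and values are hypothetical and not taken from the downloaded files.

```r
library(dplyr)

# Toy data sets: names and values are invented for illustration only.
df.hp_example <- data.frame(
  car_full_nm = c("Car A", "Car B", "Car C"),
  horsepower_bhp = c(300, 450, 200),
  stringsAsFactors = FALSE
)
df.torque_example <- data.frame(
  car_full_nm = c("Car A", "Car B"),
  torque_lb_ft = c(280, 400),
  stringsAsFactors = FALSE
)
df.speed_example <- data.frame(
  car_full_nm = c("Car A", "Car C"),
  top_speed_mph = c(155, 130),
  stringsAsFactors = FALSE
)

# A left join keeps every row of the left-hand table.
# Cars missing from a right-hand table get NA in its columns.
df.joined_example <- df.hp_example %>%
  left_join(df.torque_example, by = "car_full_nm") %>%
  left_join(df.speed_example, by = "car_full_nm")
```

Because the horsepower table is on the left, every car in it survives the joins, which is why removing duplicates beforehand matters: a duplicated car_full_nm on the right-hand side would multiply rows.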
Variables were then added for the year, decade, make, weight and torque.
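One way such variables could be derived is with mutate(). This is only a sketch under stated assumptions: it assumes car_full_nm starts with the make and ends with a four-digit year, and the columns weight_kg and torque_lb_ft are hypothetical names. The actual files may be laid out differently.

```r
library(dplyr)

# Toy data: the name format, column names and values are assumptions.
df.vars_example <- data.frame(
  car_full_nm = c("Makeone Alpha 1967", "Maketwo Beta 2011"),
  weight_kg = c(1200, 1500),
  torque_lb_ft = c(300, 450),
  stringsAsFactors = FALSE
)

df.vars_example <- df.vars_example %>%
  mutate(
    # Last four digits of the name, assumed to be the model year.
    year = as.integer(sub(".*([0-9]{4})$", "\\1", car_full_nm)),
    # Round the year down to the start of its decade, e.g. 1967 -> "1960s".
    decade = paste0(floor(year / 10) * 10, "s"),
    # First word of the name, assumed to be the make.
    make = sub(" .*$", "", car_full_nm),
    # Torque divided by weight in tonnes (1 tonne = 1000 kg).
    torque_per_ton = torque_lb_ft / (weight_kg / 1000)
  )
```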
A Scatterplot was then made of Horsepower v Top Speed for all the Cars.
Here we can see a correlation between speed and horsepower.
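A correlation read from a scatterplot can also be checked numerically with the base R function cor(). The sketch below uses invented values and hypothetical column names, so the result says nothing about the real data set.

```r
# Toy data: values and column names are invented for illustration only.
df.cor_example <- data.frame(
  horsepower_bhp = c(150, 300, 450, 600, NA),
  top_speed_mph  = c(120, 155, 185, 205, 170)
)

# Pearson correlation coefficient, between -1 and 1.
# use = "complete.obs" skips rows where either value is NA,
# which a left join can easily produce.
r <- cor(
  df.cor_example$horsepower_bhp,
  df.cor_example$top_speed_mph,
  use = "complete.obs"
)
```

A value close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.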
A Scatterplot was then made of Speed by Year.
There is no strong Correlation between Top Speed and Year.
A plot was then made of fastest car by year.
Here we see a correlation between Top Speed and Year.
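The fastest car for each year could be picked out before plotting along the following lines. This is a sketch on toy data; the column names year and top_speed_mph and all values are assumptions.

```r
library(dplyr)

# Toy data: names and values are invented for illustration only.
df.speed_year_example <- data.frame(
  car_full_nm = c("Car A", "Car B", "Car C", "Car D"),
  year = c(1990, 1990, 2005, 2005),
  top_speed_mph = c(150, 170, 190, 180),
  stringsAsFactors = FALSE
)

# Within each year, keep only the row(s) with the highest top speed.
df.fastest_example <- df.speed_year_example %>%
  group_by(year) %>%
  filter(top_speed_mph == max(top_speed_mph, na.rm = TRUE)) %>%
  ungroup()
```

Keeping only the yearly maximum removes the slower cars from each year, which may explain why a trend over time shows up here but not in the plot of all cars.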
A Plot of Acceleration v Horsepower was then made.
Here we see a Negative Correlation between Acceleration Time and Horsepower.
A plot was then made of Horsepower by Weight.
Another Negative Correlation is seen here.
Here is the Code used to Generate the Torque-per-Tonne Graph:
> # 0-60 time against torque per tonne; columns are referenced by name inside aes()
> ggplot(data=df.car_spec_data, aes(x=torque_per_ton, y=car_0_60_time_seconds)) +
    geom_point(size=4, alpha=.5, color="#880011", position="jitter") +
    stat_smooth(method="auto", size=1.5) +
    ggtitle("Torque-per-Tonne") +
    labs(x="Torque-per-tonne", y="0-60 time \n seconds") +
    theme.car_chart_SCATTER
Similar Code was used to create all the other charts.
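Sharing one look across several charts can be done by storing a ggplot2 theme in a variable and adding it to each plot. The definition of theme.car_chart_SCATTER is not shown in this report, so the sketch below uses a hypothetical stand-in called theme.example_scatter, with settings and toy data chosen only for illustration.

```r
library(ggplot2)

# Hypothetical theme object; the real theme.car_chart_SCATTER may differ.
theme.example_scatter <- theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    axis.title = element_text(size = 12)
  )

# Toy data: values and column names are invented for illustration only.
df.plot_example <- data.frame(
  horsepower_bhp = c(150, 300, 450, 600),
  top_speed_mph = c(120, 155, 185, 205)
)

# The stored theme is added to the plot like any other layer.
p <- ggplot(data = df.plot_example, aes(x = horsepower_bhp, y = top_speed_mph)) +
  geom_point(size = 4, alpha = .5, color = "#880011") +
  ggtitle("Horsepower v Top Speed") +
  labs(x = "Horsepower", y = "Top speed") +
  theme.example_scatter
```

Calling print(p) draws the chart. Reusing the same theme variable in each ggplot() call keeps fonts and spacing consistent between charts.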
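Finally, a plot was made of 0-60 Time against Torque per Tonne.
A negative, roughly exponential relationship is seen here: 0-60 times fall quickly at first as Torque per Tonne rises, then level off.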
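One simple way to test whether a relationship is roughly exponential is to fit a straight line to the logarithm of the response. The sketch below does this with lm() on toy values generated from an exact exponential curve. The numbers are invented and are not results from the car data.

```r
# Toy data generated from y = 12 * exp(-0.004 * x); purely illustrative.
df.exp_example <- data.frame(
  torque_per_ton = c(100, 200, 300, 400, 500, 600)
)
df.exp_example$car_0_60_time_seconds <-
  12 * exp(-0.004 * df.exp_example$torque_per_ton)

# If y = a * exp(b * x), then log(y) = log(a) + b * x,
# so a linear fit on log(y) recovers log(a) and b.
fit <- lm(log(car_0_60_time_seconds) ~ torque_per_ton, data = df.exp_example)

a_hat <- exp(coef(fit)[["(Intercept)"]])   # estimated starting value a
b_hat <- coef(fit)[["torque_per_ton"]]     # estimated decay rate b
```

A negative b_hat means the 0-60 time shrinks by a constant percentage for each extra unit of torque per tonne. On real, noisy data the fit would not be exact, and a plot of the residuals would show whether the exponential shape is a reasonable description.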
Conclusion:
I would recommend RStudio as an Excellent Tool for Analysing and Visualizing Car Data in a Production Setting. It has a wide range of built-in tools and many great external packages that could be used.
It could also be used by a Motoring Magazine to include Graphs and Charts in Articles and to compare the Performance of Cars.
In addition, R can be helpful when buying a second-hand car: a page scraper can collect car data from automotive web pages. Once we have the data we are looking for, we can load it into a Data Frame, process it, and plot it using ggplot2.
R can also be used for analysing performance data while a new car is being developed.
The Output could be Graphs similar to those above, as well as tables of data in Reports.
This data would come from many sensors attached to the car in question.
R could be a very valuable tool for the Automotive Industry.