COVID-19 ANALYSIS USING SQL AND VISUALIZATION USING TABLEAU

COVID-19 ANALYSIS USING SQL AND VISUALIZATION USING TABLEAU



The Covid-19 pandemic was a really serious outbreak that cost lives and affected the economy negatively, thus, public communication is needed for the audience and the government to make impactful decisions on this pandemic. The Covid-19 dataset is updated daily and can be found on various websites, but I got the daily updated dataset on this website; https://bit.ly/3LmfAAG . The dataset used for this project recorded cases from 24th February 2020 to 2nd May 2022.

I explored this using MSSQL and made dashboards and visualizations using Tableau. The method and processes will be explored in this article.

The dataset contains details of columns on cases, deaths, and vaccinations. For simplicity, I decided to split the dataset into two forms with Excel: CovidDeaths, and CovidVaccinations. The CovidDeaths dataset contained columns and records on cases, and deaths while the CovidVaccinations dataset contains only columns and rows regarding vaccinations and tests. This whole SQL project was executed on the Microsoft SQL Server Management Studio, Below is a snapshot of a few of the columns and entries in the CovidDeaths, CovidVaccinations dataset, the SQL query file will be available on my Github Repo: https://github.com/Savepeter2/SQL-Covid-19-Portfolio-Project.


No alt text provided for this image
No alt text provided for this image


Case Scenarios Explored


1.      Daily Total Cases, Deaths, PercentageDeathRate which indicates the likelihood of a person infected dying. I explored Nigeria only in this particular case scenario.

N.B; This scenario isn't limited to Nigeria only, I decided to explore Nigeria, information for other countries can be explored in this scenario.

No alt text provided for this image
No alt text provided for this image
No alt text provided for this image

This shows the query and the results, the first table represents the results for the first 15 days since the covid-19 outbreak in the country, we can see that no deaths were recorded and there were just a total of two cases. The second table represents results for the last 15 days prior to May 2ND, 2022, when the cases already blew up over 255,000 cases and 3000 deaths.


2.  CASE 2 SCENARIO: Explored Countries with their Total Cases, Total Deaths, Percentage Of the Country Population Infected, and Percentage Of the Country’s population dead. The result is sorted in descending order by the Percentage Of the Country’s population Infected.

No alt text provided for this image


No alt text provided for this image

Above shows the query and snapshots of the top rows from the result, It can be deduced that Faeroe Islands had the Highest Percentage of Population that got infected with covid-19, scrolling down the result shows the results of other countries, and the columns explored in descending order of its percentage of the population infected.

N.B; the above exploration can be sorted in descending order of its Total Cases, Total Deaths e.t.c., It depends on whatever information you’re interested in exploring; For instance, if we sort the above case scenario in descending order of its total cases, the query and result using Microsoft SQL Server Studio is given below;

No alt text provided for this image
No alt text provided for this image


The snapshot of the result above shows the top 15 countries with the highest total cases, United States emerges as the country with the highest sum of cases of COVID-19, with over an 81million cases and recorded almost a million total deaths, followed by India recording over 43 million cases and so on in descending order of Total Cases. Scrolling down the results on MSSQL studio shows more countries with their corresponding Cases, Deaths, populations e.t.c.


3.   CASE 3 SCENARIO: It explores an alternative form of exploring countries with their total cases, the sum of deaths, the Percentage of Population Infected, and the Percentage of their population Dead.

It is regarded here as the country’s highest infection count and highest death count, they imply similar meaning as the country’s total cases and its total deaths because of columns in the dataset record the cumulative cases and deaths. Thus, the highest infection count and death count are simply the maximum value of these corresponding columns. The query and result are given below:


No alt text provided for this image
No alt text provided for this image

Thus, we can deduce it gave the same results as the former case scenario. It is an alternative solution to the former scenario.

4.      CASE 4 SCENARIO:

Case 4 scenario gives information about global numbers: world’s total cases, total deaths, total population, death percentage, and case percentage.

No alt text provided for this image
No alt text provided for this image

The result shows the World’s recorded COVID-19 total cases as of 2nd, May 2022 to be 513,180,614, estimated roughly to be over 500 million cases. The total number of deaths recorded across the world was 6,194,913. The world’s total population result was also over 6 trillion with a death percentage and case percentage of 1.2% and 0.008% respectively.


CASE 5 SCENARIO:

 This explores the country's recorded daily new cases, new deaths, cumulative cases, cumulative deaths, new vaccinations, and cumulative vaccinations. Visualizations were made for this covid-19 case study analysis in order for us to have a better view of these insights and relate to this information better. The Tableau link to the visualizations is attached at the top header.

Here, the CovidVaccinations dataset was joined with the CovidDeaths dataset to achieve this goal. Below shows the query and some snapshots of the query result;


No alt text provided for this image
No alt text provided for this image
No alt text provided for this image
No alt text provided for this image

The result gives the country's daily recorded population, cases, the cumulative sum of daily cases, deaths, the cumulative sum of daily deaths, vaccinations, and the cumulative sum of the daily vaccinations. Insights from these results will be understood more clearly through the visualizations.

CASE 6 SCENARIO:

Here, we look into continents and their confirmed cases, deaths, and vaccinations. We want to know the order in which these continents are ordered in their confirmed total cases, deaths, and vaccinations.

No alt text provided for this image
No alt text provided for this image

Here, we can see the continents and their respective total cases and total deaths, This was sorted by the total cases in descending order, Thus Europe had the highest sum of cases, followed by Asia, NA, SA, Africa, and Oceania.

Then, we will explore the continents sorted in descending order of their total deaths

No alt text provided for this image
No alt text provided for this image

SUMMARY

This is an SQL COVID-19 exploration project which focuses on cases, deaths, and vaccinations. I believe more explorations can be done as the purpose of this project was to apply basic knowledge of SQL in exploring a real-life scenario(COVID-19 Outbreak). The queries cover basic SQL functions, window functions, and CTEs. The case scenarios covered in this project were visualized using Tableau. The Tableau link; https://tabsoft.co/3lf7csg .

No alt text provided for this image
No alt text provided for this image


 I’ll highly appreciate questions, suggestions, and feedback, feel free to reach me on Linkedin and you can also check out my other projects at https://github.com/savepeter2 .

How do get it on tableau, I did the queries in Microsoft sql but I am using tableau public and can’t connect to sql server.

Like
Reply

Thank you so much for sharing. It was very usefull to learn SQL 😀 .

Like
Reply

To view or add a comment, sign in

Others also viewed

Explore content categories