Visualizing UN Population Data
The Final Dashboard
Motivation & Objectives:
The United Nations is an international organization whose objectives are to promote international peace, develop friendly relations among nations, achieve international collaboration in solving international problems, and be a center for harmonizing the actions of nations.
The United Nations strongly believes in the power of data to make the world a better place. This is evident in its efforts to make its data publicly available and to encourage people to interact with it, for example by organizing hackathons and challenges for aspiring data analysts.
As prospective business analysts, Shanti, Hans, and I felt that it would be an interesting challenge to contribute to the UN’s data initiatives. To do this, we decided to use population data from UNdata (http://data.un.org) and create a dashboard that would make the comprehension and visualization of this information easier for the general public.
Most of the data files on UNdata are split into various subsections. Information from each subsection is available in both csv and pdf format. However, there was neither a single sheet containing all the information by country nor an easy way to visualize the data. The objective of this project was to solve both problems.
Image 1: Screenshot of UNdata site
Image 2: Part of the pdf document for the population, surface area and density section
Data Preparation:
We decided to focus on the population data in this project. The principles we applied in creating this dashboard can also be applied to other datasets found on the UNdata site such as ‘Price and Production Indices’ and ‘Education’. There are four tables listed under the UN population data: ‘Population, surface area and density,’ ‘International migrants and refugees,’ ‘Population growth, fertility, life expectancy and mortality,’ and ‘Population in the capital city, urban and rural areas’.
One of the issues we faced was that the csv data file contained multiple rows for the same country and year, one for each measure.
Image 3: Part of the csv file from the UN statistics site
Using the dcast function from the reshape2 package in R, we first reshaped each table from long to wide format, spreading the measures into separate columns so that each row held a single country and year (X3), as shown below.
Image 4: Part of the csv file after data manipulation
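The reshaping step can be illustrated outside of R as well. The following is a minimal Python sketch of the same long-to-wide idea, not the dcast call we actually ran. The row layout, the function name long_to_wide, and the sample values are all made up for illustration and do not come from UNdata.

```python
from collections import defaultdict

# Hypothetical long-format rows: (country, year, measure, value).
# The values are invented for illustration only.
long_rows = [
    ("Country A", 2010, "Population density", 10.0),
    ("Country A", 2010, "Surface area", 500.0),
    ("Country B", 2010, "Population density", 25.0),
]


def long_to_wide(rows):
    """Return one dict per (country, year), with each measure as its own key."""
    wide = defaultdict(dict)
    for country, year, measure, value in rows:
        wide[(country, year)][measure] = value
    # Sort by (country, year) so the output order is predictable.
    return [
        {"country": country, "year": year, **measures}
        for (country, year), measures in sorted(wide.items())
    ]


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

In this sketch, a measure that is missing for a given country and year is simply absent from that row's dictionary, whereas a wide table in R would show an NA in that cell.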
We then used inner_join in the dplyr package to merge the four tables together.
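To show what an inner join does to tables like these, here is a small Python sketch. It is not dplyr's implementation; it assumes each table is a list of dictionaries keyed by country and year, and the function name inner_join, the column names, and the numbers are hypothetical.

```python
from functools import reduce


def inner_join(left, right, keys=("country", "year")):
    """Keep only rows whose key values appear in both tables, merging their columns."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


# Hypothetical wide tables with invented values.
density = [
    {"country": "Country A", "year": 2010, "Population density": 10.0},
    {"country": "Country B", "year": 2010, "Population density": 25.0},
]
migrants = [
    {"country": "Country A", "year": 2010, "International migrant stock": 1000.0},
]

# reduce() chains the join across any number of tables (four in our project).
merged = reduce(inner_join, [density, migrants])
print(merged)
```

Note that an inner join drops any country and year that is missing from even one of the tables, which is why Country B does not appear in the merged result here.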
After importing the resulting csv file into Tableau, we excluded rows which covered multiple countries such as ‘West Africa’ and ‘Australia and New Zealand’, and grouped all the countries into larger geographic regions.
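We did this filtering and grouping inside Tableau, but the logic is simple enough to sketch in code. The Python below is only an illustration under assumptions: the set of aggregate names contains just the two examples mentioned above, and the country-to-region mapping is a tiny hypothetical stand-in for the full grouping we built.

```python
# Multi-country rows to exclude (only the two examples named in the text).
AGGREGATE_ROWS = {"West Africa", "Australia and New Zealand"}

# Hypothetical, deliberately incomplete country-to-region mapping.
REGION_OF = {"Kenya": "Africa", "France": "Europe", "Japan": "Asia"}


def keep_countries(rows):
    """Drop aggregate rows and tag each remaining row with its region."""
    kept = []
    for row in rows:
        if row["country"] in AGGREGATE_ROWS:
            continue
        region = REGION_OF.get(row["country"], "Unassigned")
        kept.append({**row, "region": region})
    return kept


rows = [
    {"country": "Kenya", "year": 2010},
    {"country": "West Africa", "year": 2010},
    {"country": "France", "year": 2010},
]
cleaned = keep_countries(rows)
print(cleaned)
```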
We also found that when we imported data, multiple dimensions and measures were wrongly classified. For example, we had to reclassify columns Asylum seekers (number) and Infant mortality rate as measures and Year as a dimension.
Our final population data consisted of 3 dimensions and 16 measures:
Dashboard Creation:
One of our objectives was to provide a more interactive platform for people to understand the data better. A map is one of the most recognizable chart types and is much more understandable to the general population than, for example, a treemap or a bullet chart. Using a map allows users to find and filter by a certain country without scrolling through a long alphabetized list. If, however, they are unsure of a country's location, there is also a drop-down menu to choose from.
Another important principle of visualization is not to overload the audience with too many charts, so we limited the dashboard to the map and three other charts, all of which are relatively simple to understand. The dynamic bar chart at the bottom allows users to compare countries on a chosen measure. Users can also select multiple countries to compare using the filter beside the map.
The bar chart and pie chart on the right-hand side of the dashboard encompass information we thought was better shown separately: male and female life expectancy and age distribution. The median was used to display life expectancy values so as to limit the impact of extreme data points. Both these charts change dynamically according to the country or region that is chosen on the map.
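A quick way to see why the median is less sensitive to extreme data points than the mean is to compare the two on a small set of numbers. The life expectancy values in this Python sketch are invented for illustration and are not taken from the UN data.

```python
from statistics import mean, median

# Hypothetical life expectancy values (years) for four countries in a region;
# the last one is an extreme low value.
life_expectancy = [71.0, 73.0, 75.0, 40.0]

region_mean = mean(life_expectancy)      # pulled down by the extreme value
region_median = median(life_expectancy)  # stays close to the typical values

print(region_mean, region_median)
```

With these made-up numbers the single low value drags the mean well below the typical range, while the median stays close to the other three values.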
Our final dashboard allows users to access all the information on one page and adds functionality that was not previously available on the UNdata site: the ability to easily compare metrics between specific countries.
Link to The Final Dashboard
Created together with Shanti Marcus and Hans Jogoo