DATA VISUALIZATION
In the era of big data, data visualization plays a key role in data analysis. It helps convey the information in data effectively and can be applied in any domain. It is one of the most sought-after skills for business intelligence professionals.
What is Data Visualization?
Data visualization is the presentation of raw data in a pictorial or graphical format. This makes the data more natural for the human mind to understand, which makes it easier to identify trends, patterns, and outliers within large data sets. It also helps in remembering business trends and patterns, since the human brain is widely said to process images far faster than text (a frequently quoted figure is 60,000 times).
Choosing the Right Visualization Type
Depending on the data and the intended purpose, a variety of graphs and tables can be used to build a simple, informative dashboard. Choosing the right visualization type helps extract the maximum insight from the data, which stakeholders, business owners, and other decision-makers need in order to make the right decisions.
This series discusses the most widely used analysis types and the chart types that match them best. It also includes a list of techniques and rules of thumb that can take one’s visualization skills from ordinary to wow.
Showing Data Over Time
One of the most common ways to analyze data is to track a trend over time. Watching how data changes over a time period helps in identifying drifts, sudden peaks and dips, and any seasonal patterns.
Picking the wrong visualization type for data leads to ambiguous results. Thus, choosing the right chart type is a must.
Some of the best visualizations for showing trends over time are:
1. Line Chart
2. Area Chart
3. Stacked Bar Chart
4. Candlestick/Stock Chart
Let us take a look at the data given below and understand the pros & cons of each chart type.
We have been supplied with the sales data of ABC Store for the year 2020, along with the daily price variation. The data dictionary is as below:
1. Order ID: Unique order id for every transaction
2. Order Date: The Date when order is placed
3. Customer ID: Unique ID of every customer
4. Product Name: 3 products (Chair, White Bulbs, Cisco Setup)
5. Selling Price: Selling Price of the product (Price varies for every transaction)
6. Qty Purchased: Quantity purchased in a single transaction
7. Highest Price of the Day: The product’s highest price on a given day. (Price varies for every transaction)
8. Lowest Price of the Day: The product’s lowest price on a given day. (Price varies for every transaction)
9. 1st Transaction of the Day: Price of the product in the first transaction on a given day
10. Last Transaction of the Day: Price of the product in the last transaction on a given day
11. Sales: Actual bill amount (Qty Purchased * Selling Price)
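As a minimal sketch, one transaction from this data dictionary could be modeled as below. The field names, the ID formats, and the sample values are assumptions made for illustration; the per-day fields (highest, lowest, first, and last price) are left out because they describe a whole day and not a single transaction.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    """One row of the sales data (hypothetical field names)."""
    order_id: str
    order_date: date
    customer_id: str
    product_name: str
    selling_price: float
    qty_purchased: int

    @property
    def sales(self) -> float:
        # Sales = Qty Purchased * Selling Price, as in the data dictionary
        return self.qty_purchased * self.selling_price


# Invented example row, not taken from the ABC Store data
row = Transaction("O-1001", date(2020, 3, 2), "C-17", "Chair", 45.0, 3)
print(row.sales)  # 135.0
```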
Let us analyze the above data:
First, let’s look at the line chart. We put months on the X-axis, sales on the Y-axis, and distinguish the products by color.
The chart shows that all products follow the same sales trend over time. Each product has a spike in sales in March, September, and November, and a dip in April, October, and December.
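The aggregation behind such a line chart can be sketched as follows: sales are summed per product and per month, giving one series (one line) per product. The rows and the function name `monthly_sales_by_product` are invented for illustration and do not come from the ABC Store data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (order_date, product_name, sales) rows; all values are invented.
rows = [
    (date(2020, 3, 2), "Chair", 135.0),
    (date(2020, 3, 9), "Chair", 90.0),
    (date(2020, 3, 9), "White Bulbs", 20.0),
    (date(2020, 4, 1), "Chair", 45.0),
    (date(2020, 4, 3), "White Bulbs", 30.0),
]


def monthly_sales_by_product(rows):
    """Return {product: {month: total sales}}, one line-chart series per product."""
    series = defaultdict(lambda: defaultdict(float))
    for order_date, product, sales in rows:
        series[product][order_date.month] += sales
    # Sort months so each series can be plotted left to right
    return {product: dict(sorted(months.items())) for product, months in series.items()}


series = monthly_sales_by_product(rows)
print(series["Chair"])        # {3: 225.0, 4: 45.0}
print(series["White Bulbs"])  # {3: 20.0, 4: 30.0}
```

Each inner dictionary maps a month number to that product's total, which is what a charting tool would draw as one colored line.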
But what about the overall sales trend? Can the total sales across all products in March, or at any other point in time, be read from this chart?
The answer is a big no.
A line chart with one line per product does not answer these questions directly. However, an area chart or a stacked bar chart shows the overall sales trend while still allowing the individual products to be compared.
What can be analyzed in the above area chart?
Both the overall sales trend and the contribution of each product can be read for every period.
In an area chart, the sales totals of the individual products are stacked on top of each other, with the lines plotted one at a time. The sales of White Bulbs are plotted first and serve as a moving baseline for the sales of Cisco Setup, and so on for each further product. The height of the topmost line therefore corresponds to the total sales across all products.
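The stacking described here amounts to a running sum across products. The sketch below computes the lower and upper edge of each product's band; the monthly values are invented, and placing Chair last in the stacking order is an assumption, since only the order of White Bulbs and Cisco Setup is stated.

```python
def stack_series(series_in_order):
    """Return (name, lower, upper) bands for a stacked area chart.

    Each product's lower edge is the upper edge of the product plotted before it.
    """
    n_points = len(series_in_order[0][1])
    baseline = [0.0] * n_points
    bands = []
    for name, values in series_in_order:
        upper = [b + v for b, v in zip(baseline, values)]
        bands.append((name, baseline, upper))
        baseline = upper  # moving baseline for the next product
    return bands


# Invented monthly sales for three consecutive months
monthly = [
    ("White Bulbs", [20.0, 30.0, 25.0]),
    ("Cisco Setup", [100.0, 80.0, 120.0]),
    ("Chair", [225.0, 45.0, 60.0]),  # assumed to be stacked last
]

bands = stack_series(monthly)
top_line = bands[-1][2]
print(top_line)  # [345.0, 155.0, 205.0]
```

The topmost line equals the total sales across all products for each month, which is exactly the overall trend that separate lines per product do not show.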
The stacked bar chart above shows the same KPIs. The only difference is that the area chart treats each product as a single pattern across time, while the bar chart treats each month as a single pattern.
The last one is the candlestick (or stock) chart, most commonly used to describe daily price movements in the stock market. A one-month chart, for example, shows one candlestick per trading day.
The same example also contains daily price variation, since the price differs for every transaction. So let us look at the daily price distribution of a product for March 2020.
The above visualization shows the price variation of the product Chair for March 2020. The chart is read as follows:
The thin black line shows the day’s price range from high to low. The wider real body spans the first and last transaction prices of the day, so its height is the difference between them. Green means the first transaction price of the day was lower than the last, and red means the last transaction price was lower than the first.
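The reading rules above can be sketched as a small function that turns one day's transaction prices into the values a candlestick needs. The function name `daily_candle` and the prices are invented, and the "neutral" label for equal first and last prices is an assumption, since that case is not covered above.

```python
def daily_candle(prices_in_order):
    """Summarize one day's transaction prices (in time order) as a candlestick."""
    first, last = prices_in_order[0], prices_in_order[-1]
    if last > first:
        color = "green"    # first transaction price lower than the last
    elif last < first:
        color = "red"      # last transaction price lower than the first
    else:
        color = "neutral"  # assumption: equal prices are not covered in the text
    return {
        "high": max(prices_in_order),  # top of the thin line
        "low": min(prices_in_order),   # bottom of the thin line
        "first": first,                # one edge of the real body
        "last": last,                  # other edge of the real body
        "color": color,
    }


# Invented prices for one product on one day
candle = daily_candle([44.0, 47.5, 43.0, 46.0])
print(candle)
# {'high': 47.5, 'low': 43.0, 'first': 44.0, 'last': 46.0, 'color': 'green'}
```

The high and low returned here correspond to the Highest Price of the Day and Lowest Price of the Day fields in the data dictionary, and first and last to the 1st and Last Transaction of the Day fields.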
Conclusion
Revealing trends in data over time gets easier when the right chart type is chosen. In summary, line charts, area charts, and stacked bar charts are mainly used to show trends over time, whereas a candlestick or stock chart is used to compare price variation on a daily basis.
Above all, always look at the data first and keep your goals in mind.
By Kanav Taneja