Pivot Table in Python
Template By @Raining Leads on Canva

Pivot Table in Python

Drag & Drop Data Analysis in Python


Introduction

Exploratory Data Analysis in Python was typically done by using the Pandas library. If you're using Python for data analysis you’ll know that analyzing data with different situations wasn’t accessible, because you’ve to write separate code for various conditions (filters, aggregations).

In contrast with analyzing data in Tableau, we can just drag and drop columns into rows & columns to create different levels of aggregation as well as use the filters card to create drop-down menus to interactively filter the values in our table.


We all know how useful python is to clean data, write calculations and conduct analysis but sometimes, especially at the beginning of the analysis, we just want to quickly run through different levels of aggregations and filter conditions to answer a few business questions (understand the data in business perspective).

Therefore, I would usually start exploring the data with Tableau and Python interchangeably so that I could benefit from both tools.

With that said, it makes me think that what if Python could have a similar feature (drag & drop analysis) with Tableau? Then I came across a random post from my connections on LinkedIn sharing about this python library called ipypivot and pivottablejs.


Install Library

If you haven’t already, you’ll need to install the library.

No alt text provided for this image
install pivottablejs library
No alt text provided for this image
install ipypivot library

Data & Resources

Data: GDrive Link

Medium: medium.com/@foocheechuan

Youtube: Chee-Chuan

Read Data

import pandas as pd
df = pd.read_excel(f'{filepath}/Global Superstore.xlsx')
df        

Pivot Table (pivottablejs)

from pivottablejs import pivot_ui
pivot_ui(df)        
No alt text provided for this image
pivot table called by pivot_ui(df)

Drag & Drop pivot table

No alt text provided for this image
Creating pivot table
No alt text provided for this image
Pivot table outcome

If you’ve used Tableau (cross table) or Excel (pivot table) before, the idea is pretty similar. We need a Dimension (Categorical field) and a Measure (Numerical field).

We’ll need to drag a Dimension into the view and then select the aggregation method and measure field in the drop-down menu. Unlike the usual drag & drop analytics, that drags both dimension and measure into the view. If we drag Sales into the view, it will be treated as a dimension.

No alt text provided for this image
Categorical sales values when dragged into the horizontal container

Problem with (pivottablejs)

There is one problem with the pivottablejs library. There are unexpected null values in my table when the raw data wasn’t supposed to have null values.

No alt text provided for this image
Unexpected null values

I changed the aggregation to count to understand what happened.

No alt text provided for this image
Aggregation method = Count

There are 51,2941 null values and 51,290 non-null values which means it's a duplication of my raw data (inclusive of column name).

I solved this problem thanks to the solution stated on this page.

To put it in my code, I modified the solution a little bit.


# We already defined df before this
  # If you're starting new define your df first
def pivot_ui(df, **kwargs):
    import pivottablejs
    class _DataFrame(pd.DataFrame):
        def to_csv(self, **kwargs):
            return super().to_csv(**kwargs).replace("\r\n", "\n")
    return pivottablejs.pivot_ui(_DataFrame(df), **kwargs)

pivot_ui(df)        

No alt text provided for this image
No more null values

Problem Solving

Now let’s answer some basic data exploration questions with this pivot table.

Question 1:

Which Sub-Categories have the top 3 highest total profit? For these Sub-Categories, are they in the same Category? Which Market in these Sub-Categories has the highest total profit?

No alt text provided for this image
Question 1

Q1 Solution:

Drag Category into the vertical container followed by Sub-Category below it.
Drag Market into the horizontal container
Aggregation method changed to Sum
Measure choose Profit
Sort descending order vertically (Arrow beside aggregation drop-down menu)

Q1 Answer:

Copiers, Phones, and Bookcases have the top 3 highest total profits.
Copiers and Phones are in the Technology Category while Bookcases are in the Furniture Category.
Copiers and Phones have the highest total profit in the APAC market while bookcases have the highest total profit in the EMEA market.

Question 2:

Which market has the highest total profit in the “Furniture” category?

No alt text provided for this image
Question 2

Q2 Solution:

Using the same pivot table removes the Sub-Category from the vertical container (drag it back to the left container).
Drag Market from the horizontal container to the vertical container.
Filter Category to “Furniture” -> Click on the arrow beside Category -> uncheck all check boxes except “Furniture”.

Q2 Answer:

The APAC market has the highest total profit for the furniture Category.

Video Tutorial

Final Words

Data analysis used to be a code-only profession. With technology improving day by day data analysis has become easier. Many see this opportunity and offer code-free saas tools for drag & drop analysis but charge a premium for it. This makes new learners hard to start their first project with limited resources. So I’ll try my best to provide free tutorials to enable new learners to learn easier and faster.

With that said, kindly leave your comment down below on what kind of tutorials you’d like me to do. I’ll keep creating content like this on my medium and youtube channel so follow me on these platforms.

Thank you for reading till the end. See you at the next one. :)

No alt text provided for this image
Photo by Thought Catalog on Unsplash

To view or add a comment, sign in

More articles by Chee-Chuan Foo

Explore content categories