Unlocking HR Insights with Graphs Using Python (with Example Codes)

Gayane Haçik

Published Sep 27, 2023

Data, data, data! In human resources, as in every other sector, data-driven decisions are becoming more critical than ever. To harness the power of HR data, it’s essential to visualize it effectively. In this article, you’ll explore five types of graphs in Python that work well for HR data analysis. Each graph has its own advantages and offers insight into a different aspect of HR.

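The data I generated as a source for the examples is a simple CSV file with ten employee records and the following columns: EmployeeID, Gender, Department, RecruitmentSource, HiringOutcome, Team, EngagementScore, Productivity, EmployeeSatisfaction, Tenure, and PerformanceRating.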

Example code:

# Import libraries
import csv
import os

# Define the data
data = [
    ['EmployeeID', 'Gender', 'Department', 'RecruitmentSource', 'HiringOutcome', 'Team', 'EngagementScore', 'Productivity', 'EmployeeSatisfaction', 'Tenure', 'PerformanceRating'],
    [1, 'Male', 'HR', 'LinkedIn', 'Accepted', 'Team A', 75, 80, 4.5, 3, 'Excellent'],
    [2, 'Female', 'Finance', 'Indeed', 'Accepted', 'Team B', 80, 85, 4.8, 4, 'Outstanding'],
    [3, 'Male', 'Engineering', 'LinkedIn', 'Rejected', 'Team C', 90, 92, 4.2, 5, 'Outstanding'],
    [4, 'Female', 'Marketing', 'Indeed', 'Accepted', 'Team A', 72, 78, 4.0, 2, 'Average'],
    [5, 'Male', 'Engineering', 'LinkedIn', 'Accepted', 'Team D', 88, 90, 4.9, 6, 'Outstanding'],
    [6, 'Male', 'HR', 'Referral', 'Accepted', 'Team B', 79, 82, 4.6, 2, 'Excellent'],
    [7, 'Female', 'Finance', 'LinkedIn', 'Accepted', 'Team C', 82, 88, 4.7, 3, 'Excellent'],
    [8, 'Female', 'Engineering', 'Indeed', 'Rejected', 'Team D', 91, 94, 4.4, 7, 'Outstanding'],
    [9, 'Male', 'Marketing', 'Referral', 'Accepted', 'Team A', 70, 75, 4.1, 2, 'Good'],
    [10, 'Female', 'Engineering', 'LinkedIn', 'Accepted', 'Team B', 86, 89, 4.7, 4, 'Excellent']
]

# Get the desktop directory path
desktop_path = os.path.join(os.path.expanduser("~"), "Desktop")

# Specify the file path on the desktop
file_path = os.path.join(desktop_path, 'employee_data.csv')

# Create and open the CSV file in write mode
with open(file_path, 'w', newline='') as csv_file:
    # Create a CSV writer
    csv_writer = csv.writer(csv_file)
    
    # Write the data to the CSV file
    csv_writer.writerows(data)

print(f'CSV file "{file_path}" has been created on your desktop successfully.')

This code generates a CSV file and saves it to your desktop. You can generate different data or use data you already have, but for this article I’ll use the data generated above. So, if you’re all set, let’s dive in!
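One thing to watch: the snippets in this article call pd.read_csv('employee_data.csv'), which looks in the current working directory, while the script above saves the file to the desktop. Here is a minimal sketch of loading the file from the same desktop path and checking what came back. It assumes pandas is installed and that the desktop folder is called "Desktop", as in the script above.

```python
import os
import pandas as pd

# Same location the generator script writes to (assumes a "Desktop" folder in the home directory)
file_path = os.path.join(os.path.expanduser("~"), "Desktop", "employee_data.csv")

if os.path.exists(file_path):
    employee_data = pd.read_csv(file_path)
    # Quick sanity check: number of rows/columns and the type pandas inferred for each column
    print(employee_data.shape)
    print(employee_data.dtypes)
else:
    print(f'No file found at "{file_path}". Run the generator script first or adjust the path.')
```

If you prefer to keep the shorter file name used in the examples, copy the CSV into the folder you run your scripts from.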

1. Mosaic Plot (Marimekko Chart):

- Why it’s used: Mosaic plots are ideal for visualizing categorical data, especially when you want to explore the relationship between two or more categorical variables.

- HR data needed: Use it to analyze employee demographics, such as gender and department, or compare recruitment sources and hiring outcomes.

- Benefits: Mosaic plots provide a clear representation of how categories within different variables intersect, making it easy to identify patterns and trends.

Here’s the example code:

# Example code for creating a mosaic plot using Python
# Import libraries
import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic
import matplotlib.pyplot as plt

# Load the dataset (replace 'employee_data.csv' with your actual file path)
employee_data = pd.read_csv('employee_data.csv')

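# Create the mosaic plot on an explicit axes so the figure size is applied
# (without ax=..., mosaic() opens its own figure and plt.figure() leaves an empty one behind)
fig, ax = plt.subplots(figsize=(10, 6))
mosaic(employee_data, ['Gender', 'Department'], ax=ax, title='Mosaic Plot: Gender vs. Department')
plt.show()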

This code will create a mosaic plot that visualizes the relationship between gender and department in your employee dataset. You can adjust the column names in the mosaic function to explore other relationships or variables as needed.

Figure: A mosaic plot generated with the example data
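The tile areas in a mosaic plot are proportional to the counts in a cross-tabulation of the two variables, so it can help to look at that table directly. The following is a small sketch using pd.crosstab; the two columns are copied by hand from the example data so the snippet runs on its own.

```python
import pandas as pd

# Gender and Department columns copied from the example data above
employee_data = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male",
               "Male", "Female", "Female", "Male", "Female"],
    "Department": ["HR", "Finance", "Engineering", "Marketing", "Engineering",
                   "HR", "Finance", "Engineering", "Marketing", "Engineering"],
})

# Count employees for every Gender x Department combination
counts = pd.crosstab(employee_data["Gender"], employee_data["Department"])
print(counts)
```

Combinations that never occur in the data (for example, no women in HR in this sample) show up as zeros in the table.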

2. Treemap:

- Why it’s used: Treemaps are excellent for displaying hierarchical data structures, making them suitable for visualizing HR organizational hierarchies and reporting structures.

- HR data needed: Explore the breakdown of employees by department, teams, or hierarchical levels.

- Benefits: Treemaps provide a hierarchical view of data, allowing HR professionals to understand the distribution and relationships within the organization.

Here’s the example code:

# Example code for creating a treemap using Python
# Import libraries
import pandas as pd
import squarify
import matplotlib.pyplot as plt

# Load the dataset (replace 'employee_data.csv' with your actual file path)
employee_data = pd.read_csv('employee_data.csv')

# Calculate the department sizes
department_sizes = employee_data['Department'].value_counts()

# Create labels for each department
labels = department_sizes.index

# Create treemap
plt.figure(figsize=(10, 6))
ax = squarify.plot(sizes=department_sizes, label=labels, alpha=0.7)
# Annotate each rectangle with the number of employees
for i, label in enumerate(labels):
    x, y, dx, dy = ax.patches[i].get_bbox().bounds
    # .iloc gives positional access; the Series itself is indexed by department name
    plt.text(x + dx / 2, y + dy / 3, f'{department_sizes.iloc[i]}', va='center', ha='center', fontsize=12, fontweight='bold')

plt.axis('off')
plt.title('Employee Breakdown by Department (Treemap)')
plt.show()

This code creates a treemap in which each rectangle shows the department name and, just below it, the number of employees in that department.
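The example above has a single level (departments). If you want to go one level down, for instance teams within departments, one possible approach is to count each Department and Team pair and build one label per pair. This is only a sketch of the data preparation step; the label format is my own choice, and the resulting sizes and labels could then be passed to squarify.plot in the same way as above.

```python
import pandas as pd

# Department and Team columns copied from the example data above
employee_data = pd.DataFrame({
    "Department": ["HR", "Finance", "Engineering", "Marketing", "Engineering",
                   "HR", "Finance", "Engineering", "Marketing", "Engineering"],
    "Team": ["Team A", "Team B", "Team C", "Team A", "Team D",
             "Team B", "Team C", "Team D", "Team A", "Team B"],
})

# Number of employees in each (Department, Team) pair
team_sizes = employee_data.groupby(["Department", "Team"]).size()

# One label per rectangle: "Department / Team" with the head count on a second line
labels = [f"{dept} / {team}\n{n}" for (dept, team), n in team_sizes.items()]

print(team_sizes)
print(labels)
```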

3. Heatmap:

- Why it’s used: Heatmaps are effective for visualizing correlations between variables, making them valuable for identifying relationships within HR data.

- HR data needed: Analyze correlations between employee performance metrics, such as engagement scores and productivity.

- Benefits: Heatmaps make it easy to spot trends, outliers, and areas of concern in HR data, facilitating data-driven decision-making.

Here’s the example code:

# Example code for creating a heatmap using Python
# Import libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset (replace 'employee_data.csv' with your actual file path)
employee_data = pd.read_csv('employee_data.csv')

# Select the columns for analysis
columns_to_analyze = ['EmployeeSatisfaction', 'Tenure']

# Create a correlation matrix
correlation_data = employee_data[columns_to_analyze].corr()

# Create a heatmap
plt.figure(figsize=(10, 6))
sns.heatmap(correlation_data, annot=True, cmap='YlGnBu', fmt='.2f', cbar=True)
plt.title('Correlation Heatmap of Employee Metrics')
plt.show()

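This heatmap shows the correlation between the selected variables, here employee satisfaction and tenure. To compare more metrics, such as engagement scores and productivity, add their column names (EngagementScore, Productivity) to columns_to_analyze.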
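Each cell in the heatmap is a Pearson correlation coefficient, which is the default method of pandas' .corr(). To make that number less of a black box, here is a minimal sketch that computes it by hand for the same two columns. The helper name pearson_r is my own, and the values are copied from the example data.

```python
import math

# EmployeeSatisfaction and Tenure columns copied from the example data above
satisfaction = [4.5, 4.8, 4.2, 4.0, 4.9, 4.6, 4.7, 4.4, 4.1, 4.7]
tenure = [3, 4, 5, 2, 6, 2, 3, 7, 2, 4]

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y divided by the product of their spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / math.sqrt(spread_x * spread_y)

r = pearson_r(satisfaction, tenure)
print(f"Correlation between satisfaction and tenure: {r:.2f}")
```

Values close to 1 or -1 indicate a strong linear relationship, and values close to 0 a weak one. With only ten rows, treat any coefficient as a rough indication rather than a finding.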

4. Box Plot (Box-and-Whisker Plot):

- Why it’s used: Box plots are a valuable tool for visualizing and summarizing the distribution of a single continuous or numerical variable. They are especially useful when you need to understand the spread, central tendency, and presence of outliers within the data.

- HR data needed: Box plots can be applied to HR data to gain insights into various aspects, such as employee salary distributions, performance rating variations, or tenure across different departments or teams.

- Benefits: Box plots provide a clear summary of key statistics, including the median (central tendency), quartiles (spread), and potential outliers, making it easy to grasp the overall distribution of the variable.

Here’s the example code:

# Example code for creating a box plot using Python
# Import libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset (replace 'employee_data.csv' with your actual file path)
employee_data = pd.read_csv('employee_data.csv')

# Create a box plot of employee engagement scores by department
plt.figure(figsize=(12, 6))
sns.boxplot(data=employee_data, x='Department', y='EngagementScore', palette='Set2')
plt.xlabel('Department')
plt.ylabel('Engagement Score')
plt.title('Engagement Score by Department (Box Plot)')
plt.xticks(rotation=45)
plt.show()

This plot allows you to detect outliers and understand the central tendency and spread of engagement scores in each department.
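If you want the numbers behind the boxes, the sketch below computes the median per department and flags outliers with the 1.5 × IQR rule, which is the default whisker rule in matplotlib and seaborn. The IsOutlier column name is my own, and the two columns are copied from the example data.

```python
import pandas as pd

# Department and EngagementScore columns copied from the example data above
employee_data = pd.DataFrame({
    "Department": ["HR", "Finance", "Engineering", "Marketing", "Engineering",
                   "HR", "Finance", "Engineering", "Marketing", "Engineering"],
    "EngagementScore": [75, 80, 90, 72, 88, 79, 82, 91, 70, 86],
})

grouped = employee_data.groupby("Department")["EngagementScore"]

# Median per department: the line drawn inside each box
medians = grouped.median()

# Quartiles broadcast back to every row of the same department
q1 = grouped.transform(lambda s: s.quantile(0.25))
q3 = grouped.transform(lambda s: s.quantile(0.75))
iqr = q3 - q1

# A score is flagged when it lies more than 1.5 * IQR outside the box
score = employee_data["EngagementScore"]
employee_data["IsOutlier"] = (score < q1 - 1.5 * iqr) | (score > q3 + 1.5 * iqr)

print(medians)
print(employee_data)
```

In this small sample no score is flagged, which is expected when each department has only two to four employees.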

5. Stacked Bar Chart:

- Why it’s used: Stacked bar charts are effective for visualizing categorical data and comparing the composition of categories within a variable across different groups or categories.

- HR data needed: Use stacked bar charts to represent how different categories, such as employee genders or recruitment sources, are distributed within various departments or teams.

- Benefits: Stacked bar charts provide a clear visual comparison of category distributions across different groups, allowing HR professionals to easily identify disparities or trends.

Here’s the example code:

# Example code for creating a stacked bar chart using Python
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset (replace 'employee_data.csv' with your actual file path)
employee_data = pd.read_csv('employee_data.csv')

# Create a stacked bar chart to visualize recruitment source distribution by department
recruitment_data = employee_data.groupby(['Department', 'RecruitmentSource']).size().unstack(fill_value=0)
recruitment_data.plot(kind='bar', stacked=True, figsize=(10, 6), colormap='Paired')
plt.xlabel('Department')
plt.ylabel('Count')
plt.title('Recruitment Source Distribution by Department (Stacked Bar Chart)')
plt.xticks(rotation=45)
plt.legend(title='Recruitment Source', loc='upper right')
plt.show()

This chart helps you understand where the organization’s talent is primarily sourced from in each department, which can be valuable for recruitment and hiring strategy decisions.
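Raw counts can be hard to compare when departments differ in size. One option is to convert each department's counts into shares, which gives the data for a 100% stacked bar chart. The following is a small sketch using pd.crosstab with normalize='index'; the columns are copied from the example data.

```python
import pandas as pd

# Department and RecruitmentSource columns copied from the example data above
employee_data = pd.DataFrame({
    "Department": ["HR", "Finance", "Engineering", "Marketing", "Engineering",
                   "HR", "Finance", "Engineering", "Marketing", "Engineering"],
    "RecruitmentSource": ["LinkedIn", "Indeed", "LinkedIn", "Indeed", "LinkedIn",
                          "Referral", "LinkedIn", "Indeed", "Referral", "LinkedIn"],
})

# Share of each recruitment source within a department (every row sums to 1)
source_shares = pd.crosstab(
    employee_data["Department"],
    employee_data["RecruitmentSource"],
    normalize="index",
)
print(source_shares)
```

Calling source_shares.plot(kind='bar', stacked=True) on this table would draw the chart with every bar at the same height, so the mix of sources is easier to compare across departments.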

So here you have it! I hope these examples help you through your analysis journey. I’ll continue to share more graphs, code snippets, and deeper analysis to enhance our understanding of HR data.

Cheers,

Gayane
