📊 Day 20: Creating Cumulative Frequency Tables in Python

🔍 What is Cumulative Frequency?

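Cumulative frequency is the running total of frequencies in a dataset. Instead of showing only how often each value appears, it adds each frequency to the sum of all the frequencies before it, so you can read off how many observations fall at or below a given value. That makes trends and percentile ranks easy to analyze.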
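
To make the "running total" idea concrete, here is a minimal sketch using only the standard library. The score bands and counts are made-up values for illustration, not data from this article.

```python
from itertools import accumulate

# Hypothetical frequencies: number of students in each score band
bands = ["0-59", "60-69", "70-79", "80-89", "90-100"]
frequencies = [2, 5, 8, 4, 1]

# Running total: each entry is the current frequency plus all earlier ones
cumulative = list(accumulate(frequencies))

for band, freq, cum in zip(bands, frequencies, cumulative):
    print(f"{band:>7}  frequency={freq:<2}  cumulative={cum}")
```

The last cumulative value always equals the total number of observations, which is a quick sanity check for any cumulative frequency table.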


🤔 Why is Cumulative Frequency Important?

✔ Helps in percentile analysis (e.g., "What percentage of students scored below 80?")

✔ Useful for identifying trends (e.g., "How many customers spend below $100?")

✔ Makes data visualization easier (it is the basis of the ogive, the cumulative frequency graph)


📌 Where is it Used?

🔹 Education – Finding students below a certain grade level 🎓

🔹 Finance – Tracking cumulative revenue 💰

🔹 E-commerce – Analyzing spending patterns 🛒

🔹 Healthcare – Patient recovery progress 📈


📖 Scenario: Analyzing Customer Purchases in a Store 🏪

Imagine you run a supermarket, and you want to analyze how much customers typically spend in a single visit.

🔍 Question: "What percentage of customers spend below $100?"

Let's create a Cumulative Frequency Table for customer spending.

import pandas as pd
import numpy as np

# Sample data: Spending amounts of 20 customers
np.random.seed(42)  # Fixed seed so the sample is reproducible
spending = np.random.randint(10, 200, 20)  # Random integer amounts from $10 to $199 (the upper bound 200 is exclusive)

# Convert to DataFrame
df = pd.DataFrame(spending, columns=['Spending'])

# Create frequency table
# Note: pd.cut bins are right-inclusive by default, so exactly $100 falls in "$50-$100"
df['Spending Range'] = pd.cut(df['Spending'], bins=[0, 50, 100, 150, 200], labels=["$0-$50", "$50-$100", "$100-$150", "$150-$200"])
freq_table = df['Spending Range'].value_counts().sort_index()

# Calculate cumulative frequency
cumulative_freq = freq_table.cumsum()

# Create cumulative frequency table
cumulative_table = pd.DataFrame({'Frequency': freq_table, 'Cumulative Frequency': cumulative_freq})

# Display table
print(cumulative_table)


💡 Insights:

7 out of 20 customers (35%) spent $100 or less.

80% of customers (16 out of 20) spent $150 or less.

✔ This helps in targeting discounts & promotions at the spending ranges where most customers fall!
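
The percentages in the insights come from dividing each cumulative frequency by the total number of customers. Here is a minimal sketch that adds a "Cumulative %" column, assuming the same seeded sample and bins as the code above. The column name and the `share_up_to_100` variable are illustrative choices.

```python
import numpy as np
import pandas as pd

# Same seeded sample and bins as the scenario above
np.random.seed(42)
spending = np.random.randint(10, 200, 20)
bins = [0, 50, 100, 150, 200]
labels = ["$0-$50", "$50-$100", "$100-$150", "$150-$200"]

# Frequency per range, in bin order
ranges = pd.cut(pd.Series(spending), bins=bins, labels=labels)
freq = ranges.value_counts().sort_index()

table = pd.DataFrame({"Frequency": freq, "Cumulative Frequency": freq.cumsum()})

# Cumulative percentage: running total as a share of all customers
table["Cumulative %"] = table["Cumulative Frequency"] * 100 / table["Frequency"].sum()
print(table)

# Share of customers who spent $100 or less (bins are right-inclusive)
share_up_to_100 = table.loc["$50-$100", "Cumulative %"]
print(f"Spent $100 or less: {share_up_to_100:.0f}%")
```

Reading the "Cumulative %" value on the "$50-$100" row answers the scenario's question directly.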


📈 Visualizing Cumulative Frequency (Ogive Plot)

To make it more insightful, let's plot a cumulative frequency graph.

import matplotlib.pyplot as plt
import seaborn as sns

# Plot cumulative frequency
plt.figure(figsize=(8,5))
sns.lineplot(x=cumulative_freq.index, y=cumulative_freq.values, marker='o', linestyle='-', color='b')
plt.xlabel('Spending Range')
plt.ylabel('Cumulative Frequency')
plt.title('Cumulative Frequency Distribution of Customer Spending')
plt.grid(True)
plt.show()
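
A textbook ogive is usually drawn against the upper class boundaries rather than the range labels, and it starts from zero at the lowest boundary. Here is a minimal sketch of that variant, assuming the same seeded sample and bins as above. The `Agg` backend and the `ogive.png` filename are assumptions so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless run; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same seeded sample and bins as the scenario above
np.random.seed(42)
spending = pd.Series(np.random.randint(10, 200, 20))
bins = [0, 50, 100, 150, 200]

# Count customers per bin; pd.cut intervals are right-inclusive, e.g. (50, 100]
freq = pd.cut(spending, bins=bins).value_counts().sort_index()

# Ogive points: 0 at the lowest boundary, then the running total at each upper boundary
x_points = bins
y_points = [0] + freq.cumsum().tolist()

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x_points, y_points, marker="o")
ax.set_xlabel("Spending (class boundary, $)")
ax.set_ylabel("Cumulative Frequency")
ax.set_title("Ogive of Customer Spending")
ax.grid(True)
fig.savefig("ogive.png")  # Hypothetical output filename
```

With a numeric x-axis, the slope of each segment reflects how many customers fall in that range, which the category-label plot cannot show when bins have unequal widths.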


🎯 What Can We Learn from This?

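  • If the curve rises quickly and flattens early, most customers fall in the lower spending ranges.
  • If the curve keeps rising steeply into the higher ranges, a large share of customers spend more.
  • The steepest segment marks the most common spending range, because that is where the running total grows fastest.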
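
The "steepest part" can also be found numerically: the jump between consecutive cumulative values is the frequency of that range, so the largest jump is the most common range. Here is a minimal sketch, using made-up cumulative counts (not the article's output) purely for illustration.

```python
import pandas as pd

# Hypothetical cumulative frequencies per spending range
cumulative = pd.Series(
    [4, 10, 17, 20],
    index=["$0-$50", "$50-$100", "$100-$150", "$150-$200"],
)

# Step between consecutive cumulative values = frequency of each range;
# the first range has no predecessor, so its step is its own cumulative value
increments = cumulative.diff().fillna(cumulative.iloc[0])

# The range with the largest step is where the curve is steepest
steepest_range = increments.idxmax()

print(increments)
print("Steepest segment:", steepest_range)
```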


💡 Fun Fact

Did You Know? Cumulative frequency is the basis of percentile calculations. Your percentile rank on an exam is simply the cumulative share of students who scored at or below your score! 🎓📊
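
Here is a minimal sketch of that idea, assuming the "at or below" definition of percentile rank (other conventions exist, such as counting only scores strictly below). The scores and the `percentile_rank` helper are hypothetical, written for illustration.

```python
import numpy as np

# Hypothetical exam scores for ten students
scores = np.array([55, 62, 68, 71, 74, 78, 80, 85, 90, 96])

def percentile_rank(values, score):
    """Percentage of values at or below the given score (illustrative helper)."""
    at_or_below = np.count_nonzero(values <= score)
    return at_or_below * 100 / len(values)

rank_of_80 = percentile_rank(scores, 80)
print(f"A score of 80 has a percentile rank of {rank_of_80:.0f}")
```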


🚀 Key Takeaways

Cumulative frequency helps in trend analysis and decision-making.

It is widely used in business, finance, education, and healthcare.

Python makes it easy to generate and visualize cumulative frequency tables.


Happy learning!
