DESCRIPTIVE STATISTICS

Descriptive statistics is the foundation of any data analysis. Exploratory Data Analysis (EDA) is performed in almost all data science projects to understand the nature of the data. In this article, I would like to discuss two underlying concepts involved in descriptive statistics:

  1. Estimate of Location: This describes where most of the data is clustered, that is, a typical or central value.
  2. Estimate of Variability (Deviation): This describes how dispersed the values are, that is, how tightly or loosely they are spread around the mean or median.

Now let's take a deeper look at the individual concepts behind the Estimate of Location and the Estimate of Variability.

ESTIMATE OF LOCATION:

  1. Mean: This is the average value of the dataset. It is calculated by summing the values in a field and dividing the total by the number of rows.

For instance: 2 + 4 + 4 + 2 = 12

We have 4 records and the total sum is 12, hence the mean value is 12 / 4 = 3.
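
The same calculation can be sketched in Python. This is only a minimal illustration using the four values from the example; the variable names are my own, and `statistics.mean` from the standard library is shown purely as a cross-check.

```python
from statistics import mean

# The four records from the example above
values = [2, 4, 4, 2]

total = sum(values)        # 2 + 4 + 4 + 2 = 12
count = len(values)        # 4 records
average = total / count    # 12 / 4 = 3.0

# The standard library gives the same result
library_average = mean(values)
```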

  2. Median: This is the middle value of a dataset once the values are sorted. It is more robust than the mean because it is far less affected by outliers in the dataset.
  3. Mode: This is the value that occurs most often in the dataset.

For instance: 1, 2, 2, 4, 5, 6

2 is the mode in the above example because it appears more often than any other value (twice).
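
As a rough sketch, the median and mode of the example values can be computed with Python's standard `statistics` module. The extra value of 100 is a hypothetical outlier that I added only to illustrate why the median is considered more robust than the mean.

```python
from statistics import mean, median, mode

# Values from the example above
scores = [1, 2, 2, 4, 5, 6]

middle = median(scores)      # even count, so the average of 2 and 4
most_common = mode(scores)   # 2 appears twice, every other value once

# Hypothetical outlier, added only for illustration
with_outlier = scores + [100]

mean_before = mean(scores)           # 20 / 6, about 3.33
mean_after = mean(with_outlier)      # 120 / 7, about 17.14
median_after = median(with_outlier)  # 4, the middle of the 7 sorted values
```

The single outlier pulls the mean from about 3.33 to about 17.14, while the median only moves from 3 to 4.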

ESTIMATE OF VARIABILITY:

  1. Range: This is the difference between the maximum value and the minimum value.

For instance: if the maximum value of a given variable is 10 and the minimum value is 2, the range is 8.
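
In Python this is a one-line calculation. The list below is hypothetical data, chosen only so that the maximum is 10 and the minimum is 2 as in the example.

```python
# Hypothetical readings with a maximum of 10 and a minimum of 2
readings = [2, 5, 7, 10, 3]

value_range = max(readings) - min(readings)  # 10 - 2 = 8
```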

  2. Variance: This is a measure of how far the data spreads from the mean. It is calculated as the average of the squared deviations from the mean.
  3. Deviation Score: This is the difference between an observed value and the estimate of location (mean or median).
  4. Standard Deviation: This is simply the square root of the variance, which puts the measure back in the same units as the data.

For instance: if the variance is 34.8, then the standard deviation is approximately 5.90 (the square root of 34.8).
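
The three measures build on one another, which a short Python sketch can make concrete. It reuses the values 2, 4, 4, 2 from the mean example; the variable names are my own. Note one assumption worth stating: dividing by n gives the population variance, while dividing by n - 1 gives the sample variance, and the article does not say which one it has in mind.

```python
import math
from statistics import mean

values = [2, 4, 4, 2]
center = mean(values)  # 3

# Deviation score: each observed value minus the mean
deviations = [v - center for v in values]  # [-1, 1, 1, -1]

# Variance: average of the squared deviations
squared = [d ** 2 for d in deviations]
population_variance = sum(squared) / len(values)    # divide by n
sample_variance = sum(squared) / (len(values) - 1)  # divide by n - 1

# Standard deviation: square root of the variance
population_std = math.sqrt(population_variance)

# The example from the text: a variance of 34.8
example_std = math.sqrt(34.8)  # about 5.90
```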

  5. Percentile: This is the relative position of a value in a given dataset: the pth percentile is the value below which p% of the observations fall. (This is also a way of ranking observations.)

For instance: Mr. X scored in the 75th percentile, which means 75% of the people scored less than Mr. X.
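
A percentile rank of this kind can be sketched as the share of scores that fall strictly below a given score. The helper `percentile_rank` and the exam scores below are hypothetical, made up only to reproduce the Mr. X example; other definitions of percentile rank (for example, counting ties differently) also exist.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are strictly below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical exam scores; Mr. X is assumed to have scored 80
exam_scores = [40, 55, 60, 65, 70, 75, 80, 90]
mr_x_score = 80

rank = percentile_rank(exam_scores, mr_x_score)  # 6 of 8 scores are lower
```

Here six of the eight scores are below 80, so Mr. X sits at the 75th percentile.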

  6. Interquartile Range: This is the difference between the 75th and 25th percentiles. It describes the spread of the middle 50% of the data.
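
As a minimal sketch, the interquartile range can be computed with `statistics.quantiles` from the Python standard library (available since Python 3.8). The data is hypothetical, and the choice of `method="inclusive"` is my assumption; different quantile methods can give slightly different quartiles on small datasets.

```python
from statistics import quantiles

# Hypothetical data, already sorted for readability
data = [2, 4, 6, 8, 10, 12, 14, 16, 18]

# n=4 splits the data into quartiles and returns the three cut points
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # spread of the middle 50% of the data
```

For this data the quartiles are 6, 10, and 14, so the interquartile range is 8.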
