Evaluation Metrics in Classification Problem


Classification is a supervised machine learning task in which a model learns from labeled training data and then assigns new observations to one of a set of classes. Choosing the correct evaluation metric for a classification problem is important, as the right choice varies from problem to problem. Let's walk through the common classification evaluation metrics.

1. Accuracy

Accuracy simply measures how often the model predicts correctly. It is the number of correct predictions divided by the total number of predictions.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

TP: Number of positive labels that are also predicted as positive. 

FP: Number of negative labels that are predicted as positive.

TN: Number of negative labels that are also predicted as negative.

FN: Number of positive labels that are predicted as negative.

But accuracy only takes the correctly classified predictions into account, not how the errors are distributed. Let's see an example. Suppose there are 99 positive labels and 1 negative label. A classifier that always predicts positive achieves 99% accuracy (TP = 99, TN = 0, FP = 1, FN = 0), which looks like a good score, but the model is poor, as it never predicts the negative label; it is biased toward the majority class. To tackle this situation, we need some other metrics. Accuracy is only useful when the target classes are well balanced.
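Here is a minimal sketch of this pitfall, assuming scikit-learn is installed and using made-up labels:

from sklearn.metrics import accuracy_score, confusion_matrix

# 99 positive labels (1) and 1 negative label (0)
y_true = [1] * 99 + [0]
# A degenerate "model" that always predicts positive
y_pred = [1] * 100

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, tn, fp, fn)                  # 99 0 1 0
print(accuracy_score(y_true, y_pred))  # 0.99 -- looks great, yet the negative class is never found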

2. Precision

Precision is defined as the number of true positives divided by the number of positive predictions (TP + FP). It tells us how many of all positive predictions are actually correct. It is useful when a false positive (FP) is a higher concern than a false negative (FN).

Precision = TP / (TP + FP)
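As a minimal sketch, assuming scikit-learn and a small made-up label set:

from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]

# TP = 2, FP = 1, so precision = 2 / (2 + 1)
print(precision_score(y_true, y_pred))  # 0.666...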

3. Recall

Recall is defined as the number of true positives divided by the number of actual positives (TP + FN). It tells us how many of all actual positive labels are correctly predicted. It is useful when a false negative (FN) is a higher concern than a false positive (FP). For example, in cancer detection a false positive is tolerable, but an actual positive case should not go undetected.

Recall = TP / (TP + FN)
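Using the same made-up labels as in the precision sketch (again assuming scikit-learn):

from sklearn.metrics import recall_score

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]

# TP = 2, FN = 1, so recall = 2 / (2 + 1)
print(recall_score(y_true, y_pred))  # 0.666...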

We want both precision and recall to be high, but there is a trade-off between them. Hence we have the F1 score.

4. F1 Score

The F1 score combines precision and recall: it is calculated as their harmonic mean. It is useful when FN and FP are equally important. It is maximized when precision and recall are equal.

F1 Score = 2 × Precision × Recall / (Precision + Recall)
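A minimal sketch, assuming scikit-learn, showing that f1_score matches the harmonic mean computed by hand:

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]

p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)
print(2 * p * r / (p + r))       # harmonic mean by hand
print(f1_score(y_true, y_pred))  # same value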

5. AUC-ROC

The ROC (Receiver Operating Characteristic) curve is a graph showing the performance of a classification model at various threshold values. It plots TPR vs. FPR. The True Positive Rate (TPR) is nothing but recall. The False Positive Rate (FPR) is the ratio of FP to the number of actual negatives (FP + TN).
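A minimal sketch, assuming scikit-learn and hypothetical predicted probabilities, showing the (FPR, TPR) points that make up the ROC curve:

from sklearn.metrics import roc_curve

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical positive-class probabilities

fpr, tpr, thresholds = roc_curve(y_true, scores)
print(fpr)  # x-axis: FP / (FP + TN) at each threshold
print(tpr)  # y-axis: TP / (TP + FN), i.e. recall, at each threshold
print(thresholds)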

AUC is the Area Under the Curve of the ROC plot. The larger the area, the better the classification model. When AUC is 1, the classifier is able to perfectly distinguish between the positive and negative classes.

[Figure: ROC curve plotting TPR against FPR at varying thresholds; the area under the curve is the AUC. Source: GeeksforGeeks, auc-roc-curve]

We can compare the performance of multiple models and choose the model with the highest AUC value, which makes it a good metric for comparing two or more models.
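A minimal sketch of this kind of comparison, assuming scikit-learn and hypothetical scores from two models:

from sklearn.metrics import roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1]
scores_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]  # hypothetical model A probabilities
scores_b = [0.5, 0.6, 0.30, 0.7, 0.4, 0.6]  # hypothetical model B probabilities

# The model with the higher AUC ranks positives above negatives more often.
print(roc_auc_score(y_true, scores_a))
print(roc_auc_score(y_true, scores_b))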

These are the most commonly used classification evaluation metrics; they should be chosen according to the given problem.

End Notes

Thanks for reading! I hope this has given you a basic understanding of evaluation metrics for classification problems. I am always open to your questions and suggestions. Do connect with me on LinkedIn.

