Remove Outliers with the IQR Method in Python

🚀 Removing Outliers using the IQR Method in Python

Outliers can seriously distort your data analysis and model performance. Instead of ignoring them, it's important to detect and handle them properly.

📊 One of the most widely used techniques is the Interquartile Range (IQR) method.

📌 How it works:
Calculate Q1 (25th percentile) and Q3 (75th percentile)
Compute IQR = Q3 − Q1
Define boundaries:
Lower Fence = Q1 − 1.5 × IQR
Upper Fence = Q3 + 1.5 × IQR

Any value outside these boundaries is considered an outlier.

import numpy as np

def detect_outliers(data, k=1.5):
    # Work on a sorted copy so the caller's list is not modified
    arr = np.sort(np.array(data, dtype=float))
    # Quartiles via linear interpolation (NumPy's default method)
    Q1 = np.percentile(arr, 25, method='linear')
    Q3 = np.percentile(arr, 75, method='linear')
    IQR = Q3 - Q1
    # Fences: anything further than k × IQR beyond the quartiles is flagged
    lower = Q1 - k * IQR
    upper = Q3 + k * IQR
    mask = (arr >= lower) & (arr <= upper)
    return {
        "outliers": arr[~mask].tolist(),
        "clean_data": arr[mask].tolist()
    }

student_score = [10, 12, 45, 34, 20, 33, 35, 40, 55, 44, 48, 53, 90, 98]
print(detect_outliers(student_score))

📈 Output Insight:
Q1 = 33.25, Q3 = 51.75, IQR = 18.5 → fences at 5.5 and 79.5
Outliers detected → [90.0, 98.0]
Clean data → the remaining 12 values, all within range

🎯 Why use IQR?
✅ Robust to extreme values (the quartiles barely shift when a few extreme points are present)
✅ Easy to implement
✅ Works well for many real-world datasets

⚠️ Tip: Don't blindly remove outliers. Sometimes they carry valuable insights!

💬 Good data preprocessing leads to better models.

#DataScience #Python #MachineLearning #DataAnalytics #Statistics #Pandas #AI #Learning
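🧪 As a quick cross-check of the fences for the student_score example, here is a minimal sketch that uses only Python's standard library. It assumes statistics.quantiles with method="inclusive" is an acceptable stand-in for NumPy's linear interpolation (both interpolate linearly between data points, and they agree on this dataset). The helper name iqr_fences is hypothetical and not part of the original function.

```python
from statistics import quantiles


def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences for a sequence of numbers.

    Hypothetical helper for illustration only.
    """
    # quantiles() sorts a copy internally, so the input is left untouched.
    # method="inclusive" interpolates linearly between data points.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


student_score = [10, 12, 45, 34, 20, 33, 35, 40, 55, 44, 48, 53, 90, 98]

lower, upper = iqr_fences(student_score)
outliers = [x for x in student_score if x < lower or x > upper]

print(lower, upper)  # 5.5 79.5
print(outliers)      # [90, 98]
```

Raising k widens the fences: with k=3.0 they move to -22.25 and 107.25, and none of the scores is flagged.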
