Data Cleaning in Python for Data Analysts

Day 4 of My Data Analyst Journey – Data Cleaning in Python Today, I practiced data cleaning techniques using Python, focusing on handling real-world messy text data. Problem Statement: I had a dataset of customer feedback containing: • Extra spaces • Mixed casing (UPPER/lower) • Punctuation (., !, ?) Objective: Clean and standardize the feedback text for better analysis. What I implemented: Removed punctuation using .replace() Converted text to lowercase Removed leading & trailing spaces using .strip() Handled lists inside a dictionary Python Code: import string feedback_data = { 'S_No': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'Name': ['Ravi', 'Meera', 'Sam', 'Anu', 'Raj', 'Divya', 'Arjun', 'Kiran', 'Leela', 'Nisha'], 'Feedback': [ ' Very GOOD Service!!!', 'poor support, not happy ', 'GREAT experience! will come again.', 'okay okay...', ' not BAD', 'Excellent care, excellent staff!', 'good food and good ambience!', 'Poor response and poor handling of issue', 'Satisfied. But could be better.', 'Good support... quick service.' ], 'Rating': [5, 2, 5, 3, 2, 5, 4, 1, 3, 4] } punctuation = ".,!?" cleaned_feedbackdata = {} for key, value in feedback_data.items(): if isinstance(value, list): new_list = [] for item in value: if isinstance(item, str): item = item.strip().lower() for p in punctuation: item = item.replace(p, "") new_list.append(item) cleaned_feedbackdata[key] = new_list else: cleaned_feedbackdata[key] = value print(cleaned_feedbackdata) Outcome: Cleaned and structured feedback data ready for analysis like sentiment detection, keyword extraction, and insights generation. Key Learning: Data cleaning is one of the most important steps in data analysis—clean data = better insights! #Python #DataCleaning #DataAnalytics #LearningJourney #BeginnerToPro #CodingPractice #100DaysOfCode

To view or add a comment, sign in

Explore content categories