Handling Missing Values in Python Datasets

📊 Handling Missing Values in Python: The First Real Data Problem You'll Face

You've loaded your dataset. Everything looks fine… until you notice that some values are missing: blank cells, NaN values, incomplete records.

And here's the truth: almost every real-world dataset has missing data.

🔹 What Are Missing Values?
Missing values are gaps in your dataset, places where data should exist but doesn't. In Python, they usually appear as:
    NaN    # "Not a Number", the most common marker in pandas
    None   # Python's built-in null object

🔹 Why Do Missing Values Matter?
Because they can silently distort your analysis:
❌ Wrong averages
❌ Incorrect insights
❌ Errors in calculations
❌ Poor model performance
👉 Ignoring missing data = trusting wrong results

🔹 Simple Real-Life Example
Imagine you're analyzing employee salaries and some entries are missing. pandas skips NaN by default when it computes a mean, so the average describes only the employees whose salary was recorded. If the gaps are not random (say, one department rarely reports salaries), that average is misleading. Once you handle missing values deliberately, you know what your numbers actually represent.

🔹 How to Detect Missing Values
In pandas, it's very simple:
    df.isnull().sum()
👉 This shows how many values are missing in each column.

🔹 How to Handle Missing Values
There is no one right way; it depends on the situation. Commonly, analysts use:
✔ Remove rows with missing data
    df = df.dropna()
✔ Fill with mean (for numerical data)
    df['salary'] = df['salary'].fillna(df['salary'].mean())
✔ Fill with mode (for categorical data)
    df['city'] = df['city'].fillna(df['city'].mode()[0])
✔ Forward fill (for time-based data)
    df = df.ffill()   # fillna(method='ffill') is deprecated in recent pandas versions

🔹 One Rule of Thumb to Remember
| Missing %     | What to Do                                   |
| Less than 5%  | Dropping the rows is usually safe            |
| 5% to 30%     | Fill with mean, median or mode               |
| More than 30% | Investigate: the column may be unreliable    |

🔹 When Should You Handle Missing Values?
Always:
✔ Right after loading your dataset
✔ Before doing calculations
✔ Before building any model
👉 Cleaning comes before analysis.

🚀 Final Thought
Dirty data is not the problem. Not knowing how to clean it is.
Every professional dataset has missing values. What separates a good analyst from a great one is knowing exactly how to handle them. 💡

#DataAnalytics #Python #MissingValues #DataCleaning #pandas #DataAnalyst #LearningInPublic #PythonForData #AnalyticsJourney #DataScience
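The detect-then-handle steps from the post can be sketched end to end. This is only an illustration: the sample values are made up, the column names `salary` and `city` follow the post, and the median is used instead of the mean as one reasonable choice for skewed salary data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names mirror the ones used in the post.
df = pd.DataFrame({
    "salary": [50000.0, np.nan, 62000.0, 58000.0, np.nan],
    "city": ["Pune", "Mumbai", None, "Mumbai", "Delhi"],
})

# Count and percentage of missing values per column.
missing_count = df.isna().sum()
missing_pct = df.isna().mean() * 100

# Fill numeric gaps with the median (less sensitive to outliers than the mean)
# and categorical gaps with the most frequent value.
cleaned = df.copy()
cleaned["salary"] = cleaned["salary"].fillna(cleaned["salary"].median())
cleaned["city"] = cleaned["city"].fillna(cleaned["city"].mode()[0])

print(missing_pct)
print(cleaned)
```

Looking at the percentage rather than the raw count makes it easier to apply a threshold rule like the one in the post.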


More Relevant Posts

Manish Mohapatra
3w
🐍 Data Types & Type Casting in Python (Small Concept, Big Impact)

When working with data in Python, one mistake beginners often make is ignoring data types. And trust me, this small thing can break your entire analysis.

When you load a dataset, pandas doesn't always read your data the way you expect. A column full of numbers might be stored as text. A date column might be treated as a plain string. A true/false column might come in as object. If you don't fix this early, your analysis will give you wrong results.

🔹 So What Are Data Types?
Every value in Python has a type. It tells Python what kind of data this is and what you can do with it. The most common ones in data analysis:
int → Whole numbers → 25, 100, -5
float → Decimal numbers → 3.14, 99.9, -0.5
str → Text → "John", "Mumbai", "Yes"
bool → True or False → True, False
datetime → Dates & times → 2024-01-15
👉 Think of data types as the language your data speaks. If you misunderstand it, your analysis goes wrong.

🔹 Why Data Types Matter in Data Analysis
Because Python behaves differently based on data types. Example:
👉 "100" + "20" → "10020" (string concatenation)
👉 100 + 20 → 120 (numeric addition)
Same-looking values. Different result.

🔹 A Simple Real-Life Example
Imagine a salary column in your dataset. You try to calculate the average:
    df['salary'].mean()
But pandas raises an error. You check the data type and see that salary is stored as object (string), not a number. Python can't do math on it. That's where type casting comes in.

🔹 What is Type Casting?
Type casting means converting one data type into another. Your salary column is stored as "50000" (a string), so every calculation you run will give wrong results or fail completely.

After type casting:
    # Convert salary column to number
    df['salary'] = df['salary'].astype(float)
    # Now calculate average salary
    df['salary'].mean()   # works
    # Convert joining date to datetime
    df['join_date'] = pd.to_datetime(df['join_date'])
    # Convert employment status to boolean.
    # astype(bool) would turn every non-empty string, including "False", into True,
    # so map the text values explicitly (assuming they are stored as "True"/"False").
    df['is_active'] = df['is_active'].map({'True': True, 'False': False})

Now pandas understands your data, and you can calculate average salaries, find top earners, compare departments, and build models correctly.

🔹 Why This Matters in Real Projects
Wrong data types silently break your analysis.
- Calculations fail on string columns
- Sorting dates goes wrong if stored as text
- Visualizations won't plot numeric data stored as objects
- Machine learning models reject incorrect types

Checking and fixing data types is not optional. It is one of the first things a professional analyst does.

🔹 When Should You Always Check Data Types?
✔ Right after loading your dataset
✔ Before doing any salary calculations
✔ During data cleaning
    df.dtypes   # check all column types at once

One wrong data type = one wrong insight. And in salary analysis, one wrong insight can mislead an entire business decision.

#DataAnalytics #Python #DataTypes #TypeCasting #pandas
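Real columns are rarely clean enough for a plain `astype(float)`. A minimal sketch of a more forgiving conversion, assuming made-up values and that booleans arrive as the strings "True" and "False":

```python
import pandas as pd

# Hypothetical raw data as it might arrive from a CSV: everything is text.
df = pd.DataFrame({
    "salary": ["50000", "62,000", "n/a"],
    "is_active": ["True", "False", "True"],
})

# Strip thousands separators, then coerce anything unparseable to NaN
# instead of raising, so bad entries can be inspected afterwards.
df["salary"] = pd.to_numeric(df["salary"].str.replace(",", ""), errors="coerce")

# astype(bool) would turn every non-empty string (including "False") into True,
# so map the text values explicitly.
df["is_active"] = df["is_active"].map({"True": True, "False": False})

print(df.dtypes)
print(df)
```

With `errors="coerce"`, the "n/a" entry becomes NaN, which can then be handled like any other missing value.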
anuj chhetri
3w
My Data Science Journey — Python Tuple, Set, Dictionary & the Collections Library Today’s focus was on Python’s core data structures — Tuples, Sets, and Dictionaries — along with the powerful collections module that enhances their functionality for real-world use cases. 𝐖𝐡𝐚𝐭 𝐈 𝐋𝐞𝐚𝐫𝐧𝐞𝐝: Tuple – Ordered, immutable, allows duplicates – Single element tuples require a trailing comma → ("cat",) – Supports packing and unpacking → x, y = 10, 30 – Cannot be modified after creation (TypeError by design) – Faster than lists in certain operations – Used in scenarios like geographic coordinates and fixed records – Can be used as dictionary keys (unlike lists) Set – Unordered, mutable, stores unique elements only – No indexing or slicing support – Empty set must be created using set() ({} creates a dict) – .remove() raises KeyError if element not found – .discard() removes safely without error – Supports operations like union, intersection, difference, symmetric_difference – Methods like issubset(), issuperset(), isdisjoint() help in set comparisons – frozenset provides an immutable version of a set – Offers O(1) average time complexity for membership checks Dictionary – Key-value pair structure, ordered, mutable, and keys must be unique – Built on hash tables for fast lookups – user["key"] → raises KeyError if missing – user.get("key", default) → safe access with fallback – Methods: keys(), values(), items() for iteration – pop(), popitem(), update(), clear(), del for modifications – Widely used in real-world data like APIs and JSON responses – Common pattern: list of dictionaries for structured datasets Collections Library – namedtuple → tuple with named fields for better readability – deque → efficient queue with O(1) operations on both ends – ChainMap → combines multiple dictionaries without merging copies – OrderedDict → maintains order with additional utilities like move_to_end() – UserDict, UserList, UserString → useful for customizing built-in behaviors with validation 
and extensions Performance Insight – List → O(n) – Tuple → O(n) – Set → O(1) (average lookup) – Dictionary → O(1) (average lookup) 𝐊𝐞𝐲 𝐈𝐧𝐬𝐢𝐠𝐡𝐭: Understanding when to use each data structure — and how collections enhances them — is crucial for writing efficient, scalable, and clean Python code. Read the full breakdown with examples on Medium 👇 https://lnkd.in/gvv5ZBDM #DataScienceJourney #Python #Tuple #Set #Dictionary #Collections #Programming #DataStructures
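Three of the collections types listed above, in a short sketch. The coordinates and settings are illustrative values, not taken from the linked article.

```python
from collections import ChainMap, deque, namedtuple

# namedtuple: a tuple whose fields can be read by name.
Point = namedtuple("Point", ["lat", "lon"])
office = Point(lat=27.7, lon=85.3)

# deque: O(1) appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
last = queue.pop()

# ChainMap: looks keys up in the first mapping, then falls back to the next.
defaults = {"theme": "light", "lang": "en"}
overrides = {"theme": "dark"}
settings = ChainMap(overrides, defaults)

print(office.lat, list(queue), last, settings["theme"], settings["lang"])
```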

Python — Tuple, Set, Dictionary & the collections Library: A Complete Guide medium.com
Nischal Karki
2w
Day 2 of Learning Python – And I Just Built My First Real Data Audit System 📊🐍

Today I didn't just "learn Python"… I used it to analyze structured company-style audit data and built a Mistake Scoring System that automatically evaluates performance. Honestly, it felt like stepping into real business intelligence work.

💡 What I built today:
Using pandas, I processed an audit dataset and generated insights like:
📌 Total deals per responsible person
📌 Pipeline distribution per team member
📌 Mistake scoring based on missing actions (follow-ups, updates, documents)
📌 Final performance summary ranking everyone by errors

⚙️ The idea behind the system:
Instead of manually checking performance, I created a logic-based scoring system where:
Missing documents = +1 error
No follow-up = +1 error
No comment update = +1 error
Unresolved status = +3 heavy penalty
This turns raw data into actionable performance insights.

💻 Code I used:
    import pandas as pd

    file_path = r"Insert your Excel file path here"

Note: The r before the file path makes it a raw string, so Python does not treat backslashes in the path as escape characters. There must be no space between the r and the opening quote. Also, either run the script from the folder that contains the Excel file or provide the full file path.

    df = pd.read_excel(file_path)

    # CLEAN DATA
    df.columns = df.columns.str.strip()
    df = df.fillna("No")

    # MISTAKE SCORE SYSTEM
    df["Mistake Score"] = 0
    df.loc[df["Document/RF Request"] == "No", "Mistake Score"] += 1
    df.loc[df["Comment Updates"] == "No", "Mistake Score"] += 1
    df.loc[df["Follow up"] == "No", "Mistake Score"] += 1
    df.loc[df["Status"].str.lower() == "unresolved", "Mistake Score"] += 3

    # ANALYSIS
    print(df["Responsible"].value_counts())
    print(df.groupby(["Responsible", "Pipeline"]).size())

    mistakes = df.groupby("Responsible")["Mistake Score"].sum().sort_values(ascending=False)
    print(mistakes)

    summary = df.groupby("Responsible").agg(
        Total_Deals=("Responsible", "count"),
        Total_Mistakes=("Mistake Score", "sum")
    )
    print(summary.sort_values("Total_Mistakes", ascending=False))

🚀 Key takeaway:
Even simple Python + Excel data can be transformed into a decision-making system that highlights performance gaps instantly.

Day 2 of learning, and I'm already seeing how powerful data can be in real business environments. Can't wait to build dashboards and automate even more next 🔥

#Python #DataAnalysis #Pandas #LearningInPublic #DataScience #Automation #BusinessIntelligence #CareerGrowth
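The scoring rules can be tried without an Excel file. The sketch below uses a small in-memory table as a stand-in: the names and rows are hypothetical, the column names follow the post, and the score is computed in one expression rather than with repeated `df.loc` updates.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet; column names follow the post.
df = pd.DataFrame({
    "Responsible": ["Asha", "Asha", "Bikram"],
    "Document/RF Request": ["Yes", None, "Yes"],
    "Comment Updates": ["Yes", "Yes", None],
    "Follow up": [None, None, "Yes"],
    "Status": ["Resolved", "Unresolved", "Resolved"],
}).fillna("No")

# Each comparison is True where a rule is broken; cast to int so it can be summed.
df["Mistake Score"] = (
    (df["Document/RF Request"] == "No").astype(int)
    + (df["Comment Updates"] == "No").astype(int)
    + (df["Follow up"] == "No").astype(int)
    + 3 * (df["Status"].str.lower() == "unresolved").astype(int)
)

summary = df.groupby("Responsible")["Mistake Score"].sum()
print(summary)
```

Note that `fillna("No")` treats every blank cell as a missed action, so a blank that really means "not applicable" would be penalised as well.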
Shravan Kharade
4w
🚀 Day 7 of My Python Learning Journey | String Methods | Business Analyst Aspirant Continuing my Python journey to strengthen my skills for a Business Analyst role 📊 Today, I worked on String Methods in Python, which are extremely useful for data cleaning, transformation, and preprocessing — key tasks in real-world analytics. 💻 Topic: String Methods in Python # Remove spaces text1 = " hello python learners " print("Clean text:", text1.strip()) # Upper & Lower case print("Upper:", text1.upper().strip()) print("Lower:", text1.lower().strip()) # Replace text print("Replace:", text1.replace("python", "SQL").strip()) # Count occurrences print("Count of 'o':", text1.count("o")) # Check start print("Starts with hello:", text1.strip().startswith("hello")) # Check numeric mobile = "9876543210" print("Is numeric:", mobile.isnumeric()) # Split & Join msg = "Welcome to python Course" words = msg.split() print("Words list:", words) joined_text = "_".join(words) print("Joined text:", joined_text) # Find position print("Index of 'p':", msg.find("p")) # Extract domain email = "student@example.com" domain = email[email.find("@") + 1:] print("Domain:", domain) # Data Cleaning Example (Price) price_text = "Price : ₹3500/-" clean_price = price_text.replace("Price :", "")\ .replace("₹", "")\ .replace("/-", "")\ .strip() print("Clean price:", clean_price) 💡 Key Learnings: Cleaned raw text data using strip() and replace() Transformed text using upper(), lower(), split(), and join() Extracted useful information (like email domain) Practiced real-world data cleaning (price formatting) 📌 These skills are directly applicable in: ✔ Data Cleaning ✔ Excel / SQL transformations ✔ Power BI datasets I’m learning Python through Satish Dhawale sir course (SkillCourse) and practicing daily 💻 🔥 Next step: Applying these concepts on real datasets and analytics projects Let’s connect if you're also learning Python or Data Analytics 🤝 #Python #StringMethods #DataCleaning #BusinessAnalyst #DataAnalytics 
#LearningJourney #SkillDevelopment #SatishDhawale #SkillCourse #UpGrad
Ishant Bhardwaj
1w Edited
🚀 Built-in vs External Packages in Python #Day26

If you're learning Python for data analytics, automation, or development, understanding packages is a game changer.

🔹 What Are Python Packages?
Python packages are collections of modules (code files) that help you perform specific tasks without writing everything from scratch. Think of them as ready-made tools 🧰 that save your time and effort.

🔸 Built-in Packages (Standard Library) 🏗️
These packages come pre-installed with Python. No need to install anything: just import and use!
✅ Key Features:
✔ Already available
✔ No installation required
✔ Optimized & reliable
✔ Covers common tasks
📌 Examples:
math ➝ Mathematical operations (square root, factorial, etc.)
datetime ➝ Work with dates & time ⏰
os ➝ Interact with operating system 💻
random ➝ Generate random numbers 🎲
sys ➝ System-level operations
👉 Example:
    import math
    print(math.sqrt(16))   # Output: 4.0

🔸 External Packages (Third-party Libraries) 📦
These are packages created by developers and shared online. You need to install them using pip.
✅ Key Features:
✔ Not included by default
✔ Installed when needed
✔ Huge variety of tools
✔ Used for advanced tasks
📌 Examples:
numpy ➝ Numerical computing 🔢
pandas ➝ Data analysis 📊
matplotlib ➝ Data visualization 📈
requests ➝ API calls 🌐
tensorflow ➝ Machine learning 🤖
👉 Installation:
    pip install pandas
👉 Example:
    import pandas as pd
    data = pd.DataFrame({"A": [1, 2, 3]})
    print(data)

⚖️ Built-in vs External Packages (Quick Comparison)
| Feature      | Built-in Packages 🏗️ | External Packages 📦 |
| Availability | Pre-installed         | Need installation     |
| Usage        | Basic tasks           | Advanced tasks        |
| Setup        | No setup needed       | Use pip               |
| Examples     | math, os              | pandas, numpy         |

🎯 When Should You Use What?
👉 Use Built-in Packages when:
You need simple functionality
You want faster and lightweight solutions
👉 Use External Packages when:
You are working on real-world projects
You need advanced features (data science, ML, APIs, etc.)

💡 Pro Tip
A good Python developer knows when NOT to install a package 😉 Sometimes, built-in modules can do the job perfectly!

💬 Which external Python package do you use the most, and why? Let's learn together! 🚀

#Python #DataAnalytics #Programming #Coding #PythonLearning #Developers #Tech #AI #DataAnalysts #DataAnalysis #DataCleaning #DataCollection #PowerBI #Excel #MicrosoftExcel #MicrosoftPowerBI #PythonProgramming #LearningJourney #SQL #CodeWithHarry

Ishant Bhardwaj
1w Edited
🚀 Variables & Data Types in Python #Day27 If you're starting your Python journey, understanding Variables and Data Types is your first big step toward becoming a pro 💡 Let’s break it down in a simple and practical way 👇 🔹What is a Variable? A variable is like a container 🧺 that stores data values which you can use and update in your program. 👉 Example: name = "Ishu" age = 20 Here, name stores text and age stores a number. 🔹Rules for Naming Variables 📝 ✔ Start with a letter or underscore _ ✔ Cannot start with a number ❌ ✔ Use only letters, numbers, and underscores ✔ Case-sensitive (age ≠ Age) 👉 Best Practice: student_name = "Rahul" total_marks = 95 🔹Dynamic Typing in Python 🔄 Python automatically understands the type of data — no need to declare it! x = 10 # Integer x = "Hello" # Now String 🔥 This flexibility makes Python beginner-friendly. 🔹What are Data Types? 📊 Data types define the type of value a variable can store. 🔸1. Numeric Types 🔢 Used for numbers: a = 10 # int b = 3.14 # float c = 2 + 3j # complex 🔸2. String (str) 🔤 Used for text: name = "Python" msg = 'Hello World' 🔸3. Boolean (bool) ⚖️ Represents True or False: is_logged_in = True 🔸4. List 📋 (Ordered & Mutable) Can store multiple values: fruits = ["apple", "banana", "mango"] ✔ Changeable (mutable) 🔸5. Tuple 📦 (Ordered & Immutable) Similar to list but cannot be changed: point = (10, 20) ✔ Faster than lists ❌ Cannot modify 🔸6. Set 🔗 (Unordered & Unique) Stores unique values only: numbers = {1, 2, 3, 3} 👉 Output: {1, 2, 3} 🔸7. 
Dictionary 📚 (Key-Value Pair) Stores data in pairs: student = {"name": "Ishu", "age": 20} 🔹Type Checking 🔍 Check the type of any variable: print(type(name)) 🔹Type Conversion 🔄 Convert one data type to another: x = int("10") y = str(20) z = float(5) 🔹Multiple Variable Assignment ⚡ a, b, c = 1, 2, 3 x = y = z = 0 🔹Constants in Python 🔒 Python doesn’t have fixed constants, but we follow naming convention: PI = 3.14 MAX_LIMIT = 100 💡Pro Tips: ✔ Use meaningful variable names ✔ Follow snake_case naming style ✔ Keep your code readable and clean ✔ Understand data types deeply to avoid bugs 🔥Conclusion Variables store your data, and data types define what kind of data you're working with. Master these fundamentals, and you’ll unlock the true power of Python 💪 💬 What’s your favorite Python data type? #DataAnalytics #DataAnalysts #DataAnalysis #Excel #PowerBI #PythonProgramming #Python #MicrosoftExcel #MicrosoftPowerBI #DataCleaning #DataCollection #DataVisualization #SQL #CodeWithHarry #Variables #DataTypes #learningJourney #Learning #Consistency

Vivek Bhave
2w Edited
𝗖𝗮𝗻 𝗦𝗤𝗟 𝗱𝗼 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗮𝗻𝗮𝗹𝘆𝘀𝗶𝘀?

We usually do feature analysis in Python, but what if we cannot load millions of rows into Python? Can we do it with SQL?

To figure this out, I took the problem of customer churn and tried to understand why customers are leaving and what we can do about it. For this, I looked at the behavior of churned customers across the different groups of each feature. For example, does a high number of support calls lead to churn?

To study customer behavior, I calculated the churn rate for each group of each feature using AVG() in SQL. I used churn rate because it allows comparison irrespective of group size.

For numerical features like payment delay, I first looked at the churn rate per value and identified where it changes suddenly between two neighboring values. Those points became the thresholds of behavioral change. I then labeled the groups with a CASE expression and aggregated them with GROUP BY. For categorical features, the churn rate can be calculated directly per category.

To decide which feature is important, I used these criteria:
1. The churn rate of at least one group must differ significantly from the others. This suggests that the threshold marks a breaking point in customer behavior.
2. The pattern should be stable, to avoid random noise.
3. Group sizes should be comparable.

Example: Issue Level (Support Calls)
+-------------+------------+
| Issue Level | Churn Rate |
+-------------+------------+
| Low         | 0.10       |
| Medium      | 0.25       |
| High        | 0.80       |
+-------------+------------+

Churn rate stays comparatively low across the low and medium levels but increases sharply at the high issue level. Customers tolerate problems while support calls remain in the low and medium range. Once the threshold is crossed, 80% of the customers leave. That means one should resolve support issues before a customer reaches the high issue level; otherwise, the customer is likely to leave.

In the customer churn dataset, the columns are: Age, Gender, Tenure, Usage Frequency, Support Calls, Payment Delay, Subscription Type, Contract Length, Total Spend, Last Interaction, and Churn.

For more detailed analysis, check out the GitHub repo (Notebooks/SQL_Analysis folder): https://lnkd.in/gUx9vgyE

#SQL #FeatureAnalysis #CustomerChurn #DataAnalytics #DataScience #SQLAnalytics #ChurnAnalysis #DataEngineering #BehavioralAnalysis #AnalyticsEngineering #BigData #DataCommunity
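The CASE, GROUP BY and AVG() combination described in the post can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name, the rows, and the thresholds 3 and 7 are invented for illustration and do not come from the original analysis or its repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (support_calls INTEGER, churn INTEGER)")

# Hypothetical rows: (support_calls, churn flag where 1 means the customer left)
rows = [(1, 0), (2, 0), (2, 1), (5, 0), (6, 1), (9, 1), (10, 1)]
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# The thresholds 3 and 7 are illustrative, not taken from the original analysis.
query = """
SELECT
    CASE
        WHEN support_calls <= 3 THEN 'Low'
        WHEN support_calls <= 7 THEN 'Medium'
        ELSE 'High'
    END AS issue_level,
    COUNT(*) AS customers,
    ROUND(AVG(churn), 2) AS churn_rate
FROM customers
GROUP BY issue_level
ORDER BY churn_rate
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
conn.close()
```

Because churn is stored as 0 or 1, AVG(churn) is the share of churned customers in each group. Including COUNT(*) alongside it makes it easy to check the "comparable group sizes" criterion.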
Awini Maxwell
3d
If anyone is interested in developing their skills in object-oriented languages, here is a quick thought based on my experience that might be helpful. 💬

Here are some tips for developing this skill:

1. Think in terms of behavior, not just data
Python style: Use properties to protect attributes when needed, but don't write trivial getters/setters.
    # Bad (Java-style)
    class Person:
        def get_name(self):
            return self._name

        def set_name(self, name):
            self._name = name

    # Good (Pythonic OOP)
    class Person:
        def __init__(self, name):
            self.name = name   # plain attribute

        @property
        def formal_name(self):
            return f"Ms./Mr. {self.name}"   # behavior

2. Tell, Don't Ask
Python style: Pass messages, avoid peeking inside other objects.
    # Avoid
    if cart.items_count() > 0:
        apply_shipping(cart)

    # Prefer
    cart.checkout()   # cart decides if shipping applies

3. Polymorphism over conditionals
Python example: use subclasses with a common interface.
    # Instead of:
    def play(sound_file):
        if sound_file.ext == 'mp3':
            play_mp3(sound_file)
        elif sound_file.ext == 'wav':
            play_wav(sound_file)

    # Do:
    class AudioFile: ...

    class MP3File(AudioFile):
        def play(self): ...

    class WAVFile(AudioFile):
        def play(self): ...

    # Then:
    sound_file.play()
Python's duck typing also allows polymorphism without formal inheritance: just implement the same method names.

4. Study small, well-designed OOP systems in Python
Look at:
collections.abc (abstract base classes: MutableSequence, Iterable)
datetime module (e.g., datetime, timedelta, tzinfo)
pathlib.Path, an excellent example of OOP design with operator overloading (/ for joining paths)

5. Refactor a procedural script into OOP
Take a script that reads a CSV, processes rows, and outputs a report. Create classes like:
DataReader (reads, yields rows)
ReportGenerator (processes, formats)
FileSystemHandler (writes output)

6. SOLID principles in Python
Example of Single Responsibility:
    # Bad (reporting + saving)
    class Report:
        def generate(self): ...
        def save_to_db(self): ...

    # Good
    class Report:
        def generate(self): ...

    class ReportSaver:
        def save(self, report): ...
Open-Closed via subclassing or using abc.ABC and @abstractmethod.

7. Unit tests drive design
Use unittest or pytest. Write tests early to force clean interfaces:
    def test_bank_account_deposit():
        acc = BankAccount()
        acc.deposit(100)
        assert acc.balance == 100
This prevents you from making deposit depend on some hidden global state.

8. Get feedback with Python-specific examples
Post a small class (e.g., a ShoppingCart with add_item, total, apply_discount) to r/learnpython or Stack Overflow. Ask: "How would you reduce coupling here? Should I use composition or inheritance?"
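Tip 3 mentions that duck typing gives polymorphism without inheritance. A minimal runnable sketch of that idea follows; the class names follow the post, while the `name` attribute, the returned messages, and the `play_all` helper are assumptions made for illustration.

```python
class MP3File:
    def __init__(self, name):
        self.name = name

    def play(self):
        return f"Decoding MP3 stream: {self.name}"


class WAVFile:
    def __init__(self, name):
        self.name = name

    def play(self):
        return f"Reading raw WAV samples: {self.name}"


def play_all(files):
    # No isinstance or extension checks: any object with a play() method works.
    return [f.play() for f in files]


messages = play_all([MP3File("intro.mp3"), WAVFile("beep.wav")])
print(messages)
```

Adding a new format means adding one class with a `play()` method; `play_all` does not change.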
Ankit Kshirsagar
6d
🐍 𝐇𝐨𝐰 𝐃𝐨 𝐘𝐨𝐮 𝐌𝐚𝐤𝐞 𝐚 𝐂𝐥𝐚𝐬𝐬 𝐉𝐒𝐎𝐍 𝐒𝐞𝐫𝐢𝐚𝐥𝐢𝐳𝐚𝐛𝐥𝐞 𝐢𝐧 𝐏𝐲𝐭𝐡𝐨𝐧?

If you've worked with APIs or data storage in Python, you've likely encountered this error:
👉 "𝘖𝘣𝘫𝘦𝘤𝘵 𝘰𝘧 𝘵𝘺𝘱𝘦 𝘟 𝘪𝘴 𝘯𝘰𝘵 𝘑𝘚𝘖𝘕 𝘴𝘦𝘳𝘪𝘢𝘭𝘪𝘻𝘢𝘣𝘭𝘦"
It's a common challenge, and an important concept to understand for real-world development. Let's break it down in a simple and practical way.

⚙️ 𝐖𝐡𝐲 𝐓𝐡𝐢𝐬 𝐏𝐫𝐨𝐛𝐥𝐞𝐦 𝐎𝐜𝐜𝐮𝐫𝐬
Python's built-in json module can only serialize basic data types such as:
- dict
- list
- str, int, float, bool
👉 Custom class objects? Not directly supported. That's why you need to convert your object into a serializable format.

🚀 𝐌𝐞𝐭𝐡𝐨𝐝 1: 𝐂𝐨𝐧𝐯𝐞𝐫𝐭 𝐎𝐛𝐣𝐞𝐜𝐭 𝐭𝐨 𝐃𝐢𝐜𝐭𝐢𝐨𝐧𝐚𝐫𝐲
The simplest approach is to convert your class object into a dictionary.
    import json

    class User:
        def __init__(self, name, age):
            self.name = name
            self.age = age

    user = User("John", 22)
    json_data = json.dumps(user.__dict__)
✔ Quick and easy
✔ Works well for simple classes whose attributes are themselves basic types

🧠 𝐌𝐞𝐭𝐡𝐨𝐝 2: 𝐂𝐫𝐞𝐚𝐭𝐞 𝐚 𝐂𝐮𝐬𝐭𝐨𝐦 𝐒𝐞𝐫𝐢𝐚𝐥𝐢𝐳𝐞𝐫
You can define a function to handle serialization:
    def serialize(obj):
        return obj.__dict__

    json_data = json.dumps(user, default=serialize)
✔ Flexible for multiple classes
✔ Cleaner and reusable

🧩 𝐌𝐞𝐭𝐡𝐨𝐝 3: 𝐔𝐬𝐞 𝐚 𝐂𝐮𝐬𝐭𝐨𝐦 𝐉𝐒𝐎𝐍𝐄𝐧𝐜𝐨𝐝𝐞𝐫
For more control, extend Python's encoder:
    import json

    class UserEncoder(json.JSONEncoder):
        def default(self, obj):
            return obj.__dict__

    json_data = json.dumps(user, cls=UserEncoder)
✔ Ideal for larger applications
✔ Centralized control over serialization

⚡ 𝐌𝐞𝐭𝐡𝐨𝐝 4: 𝐔𝐬𝐞 𝐃𝐚𝐭𝐚𝐜𝐥𝐚𝐬𝐬𝐞𝐬 (𝐌𝐨𝐝𝐞𝐫𝐧 𝐀𝐩𝐩𝐫𝐨𝐚𝐜𝐡)
Python's dataclasses make serialization cleaner:
    from dataclasses import dataclass, asdict
    import json

    @dataclass
    class User:
        name: str
        age: int

    user = User("John", 22)
    json_data = json.dumps(asdict(user))
✔ Cleaner syntax
✔ Built-in support for conversion

🔥 𝐁𝐨𝐧𝐮𝐬: 𝐓𝐡𝐢𝐫𝐝-𝐏𝐚𝐫𝐭𝐲 𝐋𝐢𝐛𝐫𝐚𝐫𝐢𝐞𝐬
Libraries like:
- pydantic
- marshmallow
provide:
✔ Validation + serialization
✔ Better structure for APIs
✔ Production-ready solutions

⚖️ 𝐂𝐨𝐦𝐦𝐨𝐧 𝐌𝐢𝐬𝐭𝐚𝐤𝐞
👉 Trying to directly serialize complex objects without conversion.
This leads to:
- Errors
- Broken APIs
- Debugging frustration

💡 𝐁𝐞𝐬𝐭 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞
✔ Always convert objects to dictionaries
✔ Use dataclasses for clean design
✔ Use libraries for scalable applications

🧠 𝐅𝐢𝐧𝐚𝐥 𝐓𝐡𝐨𝐮𝐠𝐡𝐭𝐬
Serialization is not just about converting data. It's about making your application communicate effectively. Mastering this concept will:
✔ Improve your API development
✔ Make your code more robust
✔ Help you handle real-world data efficiently

#Python #JSON #Serialization #BackendDevelopment #SoftwareEngineering #Programming #Developers #API #DataEngineering #TechTips #LearningToCode #FullStackDeveloper #CareerGrowth #LinkedInTech
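An encoder that returns `obj.__dict__` for everything breaks on objects that have no `__dict__`, such as dates. One possible refinement of the custom-encoder approach is sketched below; the `AppEncoder` name and the `joined` field are hypothetical and chosen only for this example.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import date


@dataclass
class User:
    name: str
    joined: date


class AppEncoder(json.JSONEncoder):
    def default(self, obj):
        # Handle only the types we know how to convert.
        if is_dataclass(obj) and not isinstance(obj, type):
            return asdict(obj)
        if isinstance(obj, date):
            return obj.isoformat()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(obj)


user = User("John", date(2024, 1, 15))
json_data = json.dumps(user, cls=AppEncoder)
print(json_data)
```

Delegating unknown types to `super().default()` keeps the usual "not JSON serializable" error for anything the encoder was not written to handle, instead of failing with an unrelated AttributeError.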
Obiageli Innocent
3w
Day 11/30 - Python Dictionaries

Today I learned one of the most powerful data structures in Python. And honestly, it changed how I think about storing data.

What is a Dictionary?
A dictionary is an ordered, mutable collection of key-value pairs defined using curly braces {}. Unlike lists, which use index numbers, dictionaries use keys (meaningful labels) to access each value. Think of it like a real dictionary: you look up a word (key) to get its definition (value).

Three core traits:
Ordered: from Python 3.7+, dictionaries remember insertion order
Mutable: you can add, update, and remove pairs after creation
No duplicate keys: if you add the same key twice, the second value overwrites the first

Syntax Breakdown
    my_dict = {"key1": value1, "key2": value2}
"key" -> the label used to look up a value; must be unique and immutable
value -> the data stored; can be any type: string, int, list, even another dict
{ } -> curly braces wrap the whole dictionary; pairs are separated by commas

Accessing Values
dict["key"] → returns the value, or raises KeyError if the key is missing
dict.get("key") → returns None safely if the key is missing
dict.get("key", "default") → returns your fallback value instead of None
Rule: Use dict["key"] when you're sure the key exists. Use dict.get() when you're not.

Adding & Updating Items
Add new key: dict["new_key"] = value
Update existing key: dict["key"] = new_value
Update multiple: dict.update({"key1": val, "key2": val})
Remove a key: dict.pop("key")

Code Example
    student = {
        "name": "Obiageli",
        "course": "Machine learning",
        "year": 2024,
        "gpa": 4.5
    }
    print(student["name"])      # Obiageli
    print(student.get("gpa"))   # 4.5
    student["age"] = 22         # add a new key
    student["gpa"] = 4.7        # update an existing key

Key Learnings
☑ A dictionary stores data as key-value pairs: keys are labels, values are the data
☑ Use dict.get("key") over dict["key"] when unsure a key exists
☑ Keys must be unique and immutable; values can be any data type
☑ Use .keys(), .values(), .items() to loop through dictionaries effectively
☑ Dictionaries map directly onto JSON objects, the format most web APIs use to send data

Why It Matters
Every time an app stores a user profile, an API sends data, or a program reads a config file, that data is very often in a dictionary-like format. Mastering dictionaries means you can work with real-world data right now.

My Takeaway
Lists store things in a line; dictionaries store things with meaning. Once I started thinking in key-value pairs, data started making a lot more sense. It's not just storage, it's structured storage.

#30DaysOfPython #Python #LearnToCode #CodingJourney #WomenInTech
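The access and update methods from the post, put together in one short runnable sketch. It reuses the post's `student` example with slightly different values.

```python
student = {"name": "Obiageli", "course": "Machine learning", "gpa": 4.5}

# Direct access raises KeyError for a missing key; get() returns a fallback.
age = student.get("age", "unknown")

# update() adds new keys and overwrites existing ones in one call.
student.update({"gpa": 4.7, "year": 2024})

# pop() removes a key and returns its value.
course = student.pop("course")

# items() yields (key, value) pairs in insertion order.
pairs = list(student.items())
print(age, course, pairs)
```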