Python for Developers | Step 3 — Data Structures (Q&A Series)

Real development work rarely deals with variables holding single values; it deals with collections of data. What looked simple in the course — lists, dictionaries, sets, tuples — starts behaving differently once you rely on it in real scenarios. Not because the structures change, but because their internal behavior starts to matter. This post is not a recap. It's a breakdown of the parts that were easy to miss, or simply not emphasized.

List — more than just a container

At first glance, when you create a list like my_list = [], it appears to be a simple ordered collection of values, indexed from 0. In reality, it's a dynamic array, and that detail changes how you should use it.

What does that mean?
A list stores its element references in a contiguous block of memory. This is why:
- Access by index is fast → O(1)
- But inserting in the middle is expensive → O(n)

Why does .append() feel fast, but .insert() doesn't?
Because:
- .append() adds at the end → no shifting → efficient (amortized O(1))
- .insert(i, x) shifts all elements after index i → costly (O(n))
So two operations that look similar in syntax behave very differently in performance.

Do lists store values or something else?
They store references, not copies. Meaning:
- When you add an object to a list, you're storing a pointer to it
- Not duplicating it
This leads to behavior that can be unexpected: if the same object is referenced multiple times, modifying it affects every appearance.

Is slicing just "accessing part of the list"?
No.
new_list = my_list[1:3]
This creates a new list (a shallow copy), not a view. Why it matters:
- Extra memory is used
- Time complexity is proportional to the slice size

When does a list stop being the right choice?
- When you insert/delete frequently in the middle
- When memory copies become costly
- When you assume each element is independent, but multiple elements actually reference the same object

Lists are simple to use, but not always simple in behavior.

Looks basic? Try this and think twice — what do you think the output will be, and why?

rows = [[0]*3]*3
rows[0][0] = 1
print(rows)

Now compare it with:

rows = [[0]*3 for _ in range(3)]
rows[0][0] = 1
print(rows)

What changed between the two, and why did it affect the result? Surely no one knows this better than the DataCamp tutor herself, Jasmin Ludolf. I'd love to hear your perspective: do you agree with this explanation, and how would you approach the same example?
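To check your answer to the puzzle, here is a minimal runnable sketch of both cases; the `is` comparisons are an addition to make the reference sharing explicit:

```
# One inner list, referenced three times: mutating it shows up everywhere
rows_shared = [[0] * 3] * 3
rows_shared[0][0] = 1
print(rows_shared)        # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

# Three distinct inner lists: only the first row changes
rows_independent = [[0] * 3 for _ in range(3)]
rows_independent[0][0] = 1
print(rows_independent)   # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]

# Identity checks make the difference explicit
print(rows_shared[0] is rows_shared[1])            # True  — same object
print(rows_independent[0] is rows_independent[1])  # False — separate objects
```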
-
Python for Developers | Step 3 — Data Structures (Q&A Series)

After lists and dictionaries, the remaining structures look simpler. In practice, they enforce stricter rules — and that's where their value comes from.

Sets

Sets are not just collections — they are uniqueness + hashing. They store:
- unique elements only
- no duplicates
If you add the same value twice, it doesn't raise an error. It simply ignores the duplicate.

Why are sets fast?
Because they use hashing, similar to dictionaries. Instead of scanning elements one by one (like lists → O(n)), Python uses the hash to locate elements directly → O(1) on average.

Why can't sets contain mutable elements?
Sets themselves are mutable — you can add and remove elements — but their elements must be immutable (hashable).
- Allowed: int, str, tuple
- Not allowed: list, dict, set

Example:
s = {1, [2, 3]} # TypeError: unhashable type: 'list'

Why this restriction exists
Hashing requires that:
- the value does not change
- the hash remains consistent
If a mutable object were allowed:
- its value could change
- its hash would change
- Python would lose track of where it is stored
- Result → broken lookups

Other constraints
- No indexing → no positional access
- No slicing
- Can be sorted, but sorted() returns a new list, not a set

When sets make sense
- Tracking unique elements
- Fast membership checks
- Removing duplicates from data

Tuples

Tuples look like lists but enforce a key constraint: immutability.
t = (1, 2, 3)

What makes tuples different?
- Ordered (like lists)
- Immutable (unlike lists)

Why immutability matters
Because tuples:
- cannot be modified after creation
- can be used as dictionary keys or set elements
This is not possible with lists.

Internal implications
Since tuples don't change, they are:
- safer to share
- slightly more memory-efficient
- predictable in behavior

Hidden detail
A tuple can still contain mutable objects:
t = ([1, 2], 3)
t[0].append(4)
The tuple is unchanged, but its content is not.

Final Take
Sets → enforce uniqueness using hashing
Tuples → enforce immutability for stability

Looks simple? Consider this: if tuples are immutable, how is it possible to store a list inside a tuple and still modify that list? What exactly is "immutable" here?
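A minimal sketch of the closing question: the tuple's immutability applies to the slots (references) it holds, not to the objects behind them. The `id` check is an addition for illustration:

```
t = ([1, 2], 3)
inner_id_before = id(t[0])

t[0].append(4)                       # mutate the list *inside* the tuple — allowed
print(t)                             # ([1, 2, 4], 3)
print(id(t[0]) == inner_id_before)   # True — still the same list object

# Rebinding a slot is what immutability forbids:
try:
    t[0] = [9, 9]
except TypeError as e:
    print(e)                         # 'tuple' object does not support item assignment
```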
-
My Data Science Journey — Python Tuple, Set, Dictionary & the Collections Library

Today's focus was on Python's core data structures — Tuples, Sets, and Dictionaries — along with the powerful collections module that enhances their functionality for real-world use cases.

𝐖𝐡𝐚𝐭 𝐈 𝐋𝐞𝐚𝐫𝐧𝐞𝐝:

Tuple
– Ordered, immutable, allows duplicates
– Single-element tuples require a trailing comma → ("cat",)
– Supports packing and unpacking → x, y = 10, 30
– Cannot be modified after creation (TypeError by design)
– Faster than lists in certain operations
– Used in scenarios like geographic coordinates and fixed records
– Can be used as dictionary keys (unlike lists)

Set
– Unordered, mutable, stores unique elements only
– No indexing or slicing support
– An empty set must be created with set() ({} creates a dict)
– .remove() raises KeyError if the element is not found
– .discard() removes safely without error
– Supports union, intersection, difference, symmetric_difference
– issubset(), issuperset(), isdisjoint() help with set comparisons
– frozenset provides an immutable version of a set
– O(1) average time complexity for membership checks

Dictionary
– Key-value pair structure: ordered (by insertion), mutable, keys must be unique
– Built on hash tables for fast lookups
– user["key"] → raises KeyError if missing
– user.get("key", default) → safe access with fallback
– keys(), values(), items() for iteration
– pop(), popitem(), update(), clear(), del for modifications
– Widely used for real-world data like APIs and JSON responses
– Common pattern: a list of dictionaries for structured datasets

Collections Library
– namedtuple → tuple with named fields for better readability
– deque → efficient queue with O(1) operations on both ends
– ChainMap → combines multiple dictionaries without merging copies
– OrderedDict → maintains order with additional utilities like move_to_end()
– UserDict, UserList, UserString → useful for customizing built-in behavior with validation and extensions

Performance Insight (membership check)
– List → O(n)
– Tuple → O(n)
– Set → O(1) average
– Dictionary → O(1) average (key lookup)

𝐊𝐞𝐲 𝐈𝐧𝐬𝐢𝐠𝐡𝐭: Understanding when to use each data structure — and how collections enhances them — is crucial for writing efficient, scalable, and clean Python code.

Read the full breakdown with examples on Medium 👇
https://lnkd.in/gvv5ZBDM

#DataScienceJourney #Python #Tuple #Set #Dictionary #Collections #Programming #DataStructures
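A small runnable sketch of three of the collections tools listed above; the names (`Point`, the config dicts) are illustrative, not from the original post:

```
from collections import namedtuple, deque, ChainMap

# namedtuple: a tuple with named fields
Point = namedtuple("Point", ["x", "y"])
p = Point(x=10, y=30)
print(p.x, p.y)                 # 10 30

# deque: O(1) appends/pops on both ends
dq = deque([1, 2, 3])
dq.appendleft(0)
dq.append(4)
print(dq)                       # deque([0, 1, 2, 3, 4])

# ChainMap: combine dicts without copying or merging
defaults = {"theme": "light", "lang": "en"}
user_prefs = {"theme": "dark"}
config = ChainMap(user_prefs, defaults)   # first mapping wins on lookup
print(config["theme"], config["lang"])    # dark en
```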
-
🏭 Alternative Constructors in Python — Creating Objects in Different Ways!

Just built an alternative constructor using a class method — perfect for handling different input formats! 🚀

🔍 What's Happening?

| Constructor | Format | Example |
|---|---|---|
| Regular (`__init__`) | Separate arguments | `Student('Hamza', 94)` |
| Alternative (class method) | String with delimiter | `Student.from_string('Ali-88')` |

💡 Why Alternative Constructors Are Useful:
✅ Multiple Input Formats – Handle CSV, JSON, strings, dictionaries
✅ Cleaner Code – Parsing logic lives inside the class, not scattered around
✅ Flexibility – Create objects from different data sources
✅ Self-Documenting – The method name describes the input format

📌 Real-World Applications:

| Class | Alternative Constructor | Input Format |
|---|---|---|
| `Employee` | `from_csv_row(row)` | "John,30,50000" |
| `Date` | `from_string('2024-12-25')` | "YYYY-MM-DD" |
| `Product` | `from_dict({'name': 'Laptop', 'price': 1000})` | Dictionary |
| `User` | `from_json(json_string)` | JSON data |

📌 More Examples:

```
class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    @classmethod
    def from_string(cls, date_str):
        # "2024-12-25" → year=2024, month=12, day=25
        year, month, day = map(int, date_str.split('-'))
        return cls(year, month, day)

    @classmethod
    def today(cls):
        # Returns today's date
        from datetime import date
        today = date.today()
        return cls(today.year, today.month, today.day)


# Usage
d1 = Date(2024, 12, 25)               # Regular
d2 = Date.from_string('2024-12-25')   # From string
d3 = Date.today()                     # Today's date
```

📌 Key Takeaway:
> Class methods = multiple ways to create objects!
> `__init__` is one constructor.
> Class methods can be many — one for each input format.

#Python #OOP #ClassMethods #AlternativeConstructors #Coding #Programming #LearnPython #Developer #Tech #FactoryMethods #PythonTips #CodingLife #SoftwareDevelopment #CleanCode #Day62
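The `Employee` row from the applications table might look like this — a minimal sketch; the field names and types are assumptions, not taken from the original post:

```
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    @classmethod
    def from_csv_row(cls, row):
        # "John,30,50000" → name="John", age=30, salary=50000.0
        name, age, salary = row.split(",")
        return cls(name, int(age), float(salary))


e = Employee.from_csv_row("John,30,50000")
print(e.name, e.age, e.salary)   # John 30 50000.0
```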
-
Python for Developers | Step 3 — Data Structures (Q&A Series)

Dictionaries — not just "key-value pairs"

At first, a dictionary looks like a simple mapping:
my_dict = {"Mahmoud": 100}
But internally, it behaves very differently from a list, and that difference directly affects performance, correctness, and even bugs.

What is a dictionary really?
What: a dictionary is a hash table, not just a collection of pairs.
Why: instead of searching linearly, Python:
- computes hash(key)
- maps it to an index in memory
- stores or retrieves the value directly
Consequence:
- lookup (d[key]) is O(1) on average
- performance depends on hashing, not position

Why must keys be immutable?
What: keys must be hashable (effectively immutable).
Why: the hash of a key determines where it is stored. If the key changes → the hash changes → the stored location becomes invalid.
Consequence:
d = {[1, 2]: 10} # TypeError
- mutable objects (like lists) are rejected
- this prevents silent data corruption

What happens with duplicate keys?
d = {"a": 1, "a": 2}
What: only one entry exists.
Why: keys must be unique; the second insertion overwrites the first.
Consequence: {"a": 2}
- no error is raised
- the earlier value is discarded immediately

Why is lookup "fast", and when is it not?
What: dictionary operations are O(1) on average.
Why: direct index access via hashing.
Consequence: fast lookups — until collisions happen.

What is a hash collision?
What: two different keys map to the same index.
Why: the hash space is finite, so collisions are unavoidable.
Consequence: Python must resolve them → extra work → slower operations.

How does Python resolve collisions?
What: with probing (open addressing).
Why: if a slot is occupied, Python searches for another one.
Consequence:
- a lookup may require multiple steps
- too many collisions → performance degrades toward O(n)

Why do dictionaries resize?
What: a dictionary expands when it becomes too full.
Why: a high load factor means more collisions; more space is needed to keep O(1) behavior.
Consequence: a temporary cost (rehashing all keys) that restores performance.

Do dictionaries store values directly?
What: they store references to objects, not copies.
Why: this is consistent with Python's memory model.
Consequence:
a = {"x": []}
b = a.copy()
b["x"].append(1)
- both dictionaries change
- the inner object is shared (shallow copy)

What do .keys(), .values(), .items() return?
What: they return view objects, not lists.
Why: to avoid copying data and to provide real-time access.
Consequence:
k = d.keys()
d["new"] = 1
- k updates automatically
- but it cannot be modified directly

Views are not independent
k = d.keys()
d.clear()
Consequence:
- k becomes empty
- a view reflects the source, not a snapshot

Final Question
If dictionaries are "O(1)", but collisions and probing exist: at what point does a dictionary stop behaving like O(1), and what kind of key patterns could cause that degradation in real systems?
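The shallow-copy and view behaviors described above are easy to verify; a minimal runnable sketch:

```
# Shallow copy: the inner list is shared between both dicts
a = {"x": []}
b = a.copy()
b["x"].append(1)
print(a, b)          # {'x': [1]} {'x': [1]} — one shared list

# Views reflect the dict in real time
d = {"a": 1}
k = d.keys()
d["new"] = 2
print(list(k))       # ['a', 'new'] — the view updated automatically

d.clear()
print(list(k))       # [] — the view is a window, not a snapshot
```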
-
Day 12/30 - Nested Data Structures in Python

Today everything clicked. Lists, dicts, tuples. They don't live separately. Real data nests them together.

What is Nesting?
Nesting means placing one data structure inside another. A list can contain dictionaries. A dictionary can contain lists. A dictionary can even contain other dictionaries. This is how Python represents complex, real-world data - the same structure used in JSON APIs, databases, and config files.

Four Common Nesting Patterns
List inside Dict -> a dictionary key holds a list as its value, e.g. a student's list of scores
Dict inside List -> a list contains multiple dictionaries, e.g. a list of student records
Dict inside Dict -> a key holds another dictionary, e.g. a user with a nested address object
List inside List -> a list contains other lists, e.g. rows and columns in a grid or table

How to Access Nested Data
You access nested data by chaining brackets, one for each level you go deeper:
data["student"]["scores"][0]  -> open the dict, go to the "scores" key, grab index 0
Rule: count the levels of nesting, then use that many brackets to reach the value.

Looping Through Nested Structures
When your data is a list of dictionaries, use a for loop to go through each dictionary, then use bracket notation to pull out values. This is the most common real-world pattern - reading records from an API or database. (A sketch of this pattern follows the post.)

Code Example 1: List Inside a Dict

student = {
    "name": "Obiageli",
    "scores": [88, 92, 75, 95],
    "passed": True
}
print(student["scores"])      # [88, 92, 75, 95]
print(student["scores"][0])   # 88
print(student["scores"][-1])  # 95

Key Learnings
☑ Nesting = placing one data structure inside another
☑ Access nested data by chaining brackets, one bracket per level
☑ A list of dictionaries is the most common pattern - it's how API and database data looks
☑ Use a for loop to go through a list of dicts and pull values from each record
☑ Nested structures are the foundation of JSON - master this and real-world data won't feel foreign

My Takeaway
Nested data structures are where all the previous days connect. Lists, tuples, sets, dictionaries - they don't live in isolation. Real data combines all of them. Today I started seeing data the way Python sees it.

#30DaysOfPython #Python #LearnToCode #CodingJourney #WomenInTech
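A minimal sketch of the list-of-dicts looping pattern described in the post; the records are illustrative:

```
students = [
    {"name": "Obiageli", "scores": [88, 92, 75, 95]},
    {"name": "Amara",    "scores": [70, 85, 90, 60]},
]

# Loop over each dict, then use bracket notation per level
for record in students:
    best = max(record["scores"])
    print(record["name"], "best score:", best)
# Obiageli best score: 95
# Amara best score: 90
```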
-
Here's my Ultimate Python Commands Cheatsheet for Data Analytics:
(Save this post - you will use it more than you think)

SQL gets you in the door. Python gets you promoted.

Every analyst I have hired who moved up fastest had one thing in common - they did not just know Python, they knew which commands actually matter for real analytics work.

This cheatsheet covers the 20 commands you will use every single day:
-- Loading and exploring data with head(), tail(), info(), describe()
-- Filtering and selecting with loc, iloc, query(), and conditions
-- Aggregating with groupby(), pivot_table(), and value_counts()
-- Combining datasets with merge() and join()
-- Handling missing data with fillna(), dropna(), and drop()
-- Date operations with to_datetime() and date filtering
-- Advanced calculations with rolling(), cumsum(), and apply()
-- Exporting clean data with to_csv()

Here is the honest truth: you do not need to know everything. You need to know these commands deeply and know when to use each one. Real Python skill is knowing which 5 commands to chain together to go from raw, messy data to a clean business insight in under 30 minutes.

5 Free YouTube Resources to Learn Python for Data Analytics:
-- Alex Freberg - https://lnkd.in/gJ75EQZE
-- Keith Galli - youtube.com/@KeithGalli
-- Corey Schafer - youtube.com/@coreyms
-- freeCodeCamp - youtube.com/@freecodecamp
-- Luke Barousse - youtube.com/@LukeBarousse

Save this. Build something with it. Watching videos without practicing is not learning.

Which command do you use most in your daily work?

♻️ Repost to help someone learning Python for data analytics
💭 Tag someone who keeps saying they will learn Python "next month"
📩 Get my full data analytics guide: https://lnkd.in/gjUqmQ5H
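As a rough illustration of the chaining idea, here is a minimal sketch that strings a few of these commands together; the file name and column names are hypothetical:

```
import pandas as pd

# Hypothetical file and columns, purely for illustration
df = pd.read_csv("sales.csv")

summary = (
    df.dropna(subset=["region", "revenue"])           # handle missing data
      .query("revenue > 0")                           # filter rows
      .groupby("region", as_index=False)["revenue"]   # aggregate per region
      .sum()
      .sort_values("revenue", ascending=False)
)
summary.to_csv("revenue_by_region.csv", index=False)  # export clean data
```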
-
Day 2 of Learning Python – And I Just Built My First Real Data Audit System 📊🐍

Today I didn't just "learn Python"… I used it to analyze structured company-style audit data and built a Mistake Scoring System that automatically evaluates performance. Honestly, it felt like stepping into real business intelligence work.

💡 What I built today:
Using Pandas, I processed an audit dataset and generated insights like:
📌 Total deals per responsible person
📌 Pipeline distribution per team member
📌 Mistake scoring based on missing actions (follow-ups, updates, documents)
📌 A final performance summary ranking everyone by errors

⚙️ The idea behind the system:
Instead of manually checking performance, I created a logic-based scoring system where:
Missing documents = +1 error
No follow-up = +1 error
No comment update = +1 error
Unresolved status = +3 heavy penalty
This turns raw data into actionable performance insights.

💻 Code I used:

import pandas as pd

file_path = r"insert_your_excel_file_path_here.xlsx"
# The r prefix makes this a raw string, so Python reads the path correctly
# without treating backslashes as escape characters. Make sure the Excel file
# is in the same folder as your script, or provide the full file path.
df = pd.read_excel(file_path)

# CLEAN DATA
df.columns = df.columns.str.strip()
df = df.fillna("No")

# MISTAKE SCORE SYSTEM
df["Mistake Score"] = 0
df.loc[df["Document/RF Request"] == "No", "Mistake Score"] += 1
df.loc[df["Comment Updates"] == "No", "Mistake Score"] += 1
df.loc[df["Follow up"] == "No", "Mistake Score"] += 1
df.loc[df["Status"].str.lower() == "unresolved", "Mistake Score"] += 3

# ANALYSIS
print(df["Responsible"].value_counts())
print(df.groupby(["Responsible", "Pipeline"]).size())

mistakes = df.groupby("Responsible")["Mistake Score"].sum().sort_values(ascending=False)
print(mistakes)

summary = df.groupby("Responsible").agg(
    Total_Deals=("Responsible", "count"),
    Total_Mistakes=("Mistake Score", "sum")
)
print(summary.sort_values("Total_Mistakes", ascending=False))

🚀 Key takeaway:
Even simple Python + Excel data can be transformed into a decision-making system that highlights performance gaps instantly. Day 2 of learning — and I'm already seeing how powerful data can be in real business environments. Can't wait to build dashboards and automate even more next 🔥

#Python #DataAnalysis #Pandas #LearningInPublic #DataScience #Automation #BusinessIntelligence #CareerGrowth
-
🚀 Mastering Python Strings: Methods, Operations & Real-World Examples

Strings are one of the most fundamental yet powerful data types in Python. Whether you're building data pipelines, scraping websites, or creating user-friendly applications, mastering string operations is essential for every aspiring developer and data analyst. Let's break it down 👇

🔹 1. What is a String?
A string is simply a sequence of characters enclosed in quotes.
Example: "Hello, World!"

🔹 2. Essential String Methods
Python provides built-in methods that make string manipulation easy:
✔ lower() & upper() → Case conversion
✔ strip() → Remove unwanted spaces
✔ replace() → Substitute text
✔ split() → Break into a list
✔ find() → Locate substrings

💡 Example:
text = " Data Analytics "
clean_text = text.strip().lower()
print(clean_text) # data analytics

🔹 3. String Operations You Should Know
✔ Concatenation (+) → Combine strings
✔ Repetition (*) → Repeat text
✔ Indexing & Slicing → Extract parts of a string

💡 Example:
name = "Python"
print(name[0:3]) # Pyt

🔹 4. Real-World Applications
📊 Data Cleaning – Removing unwanted spaces or symbols
🌐 Web Scraping – Extracting useful text from HTML
📧 Email Automation – Formatting messages dynamically
📁 File Handling – Processing text data efficiently

🔹 5. Pro Tip: Use f-Strings for Clean Code
name = "Sai"
role = "Data Analyst"
print(f"My name is {name} and I am a {role}")

✨ Why It Matters
Strong string-handling skills significantly improve your coding efficiency and are crucial for real-world projects, especially in Data Analytics and Software Development at Innomatics Research Labs.

💬 What's your favorite Python string method? Let's discuss in the comments!

#Python #Programming #DataAnalytics #Coding #Learning #TechSkills #CareerGrowth #Developers #Innomatics
Innomatics Research Labs
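Putting several of these methods together for the data-cleaning use case — a minimal sketch with an illustrative input string:

```
raw = "  Data Analytics,  Python ; SQL  "

# Chain replace/split/strip/lower to normalize a messy field
tokens = [part.strip().lower() for part in raw.replace(";", ",").split(",")]
print(tokens)   # ['data analytics', 'python', 'sql']

# f-strings keep the output formatting readable
name, role = "Sai", "Data Analyst"
print(f"My name is {name} and I am a {role}")
```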
-
𝗖𝗮𝗻 𝗦𝗤𝗟 𝗱𝗼 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗮𝗻𝗮𝗹𝘆𝘀𝗶𝘀?

We usually do feature analysis in Python, but what if we cannot load millions of rows into Python? Can we do it with SQL?

To figure this out, I took the problem of customer churn and tried to understand why customers are leaving and what we can do about it. For this, I studied the behavior of churned customers across the different groups of each feature. For example, does a high number of support calls lead to churning?

To study customer behavior, I calculated the churn rate across the groups of each feature using AVG() in SQL. I used churn rate because it allows comparison irrespective of group size.

For numerical features like payment delay, I first divided the feature into groups using GROUP BY in SQL, looking for sudden jumps in churn rate between adjacent values. From these I identified the thresholds of behavioral change and labeled the groups using a CASE conditional statement. For categorical features, the churn rate can be calculated directly.

To decide which features are important, I used these criteria:
1. The churn rate difference must be significant for at least one group compared to the others. This suggests the threshold is the breaking point of customer behavior.
2. The pattern should be stable, to avoid random noise.
3. Group sizes should be comparable.

Example: Issue Level (Support Calls)
+-------------+------------+
| Issue Level | Churn Rate |
+-------------+------------+
| Low         | 0.10       |
| Medium      | 0.25       |
| High        | 0.80       |
+-------------+------------+

The churn rate stays relatively stable across low and medium issue levels but increases sharply at the high level. Customers wait patiently while support calls remain at a low or medium issue level; once that threshold is crossed, 80% of customers leave. That means one should respond to support calls before they reach the high issue level; otherwise, the customer will leave.

In this customer churn dataset, the features are: Age, Gender, Tenure, Usage Frequency, Support Calls, Payment Delay, Subscription Type, Contract Length, Total Spend, Last Interaction, and Churn.

For a more detailed analysis, check out the GitHub repo (Notebooks/SQL_Analysis folder): https://lnkd.in/gUx9vgyE

#SQL #FeatureAnalysis #CustomerChurn #DataAnalytics #DataScience #SQLAnalytics #ChurnAnalysis #DataEngineering #BehavioralAnalysis #AnalyticsEngineering #BigData #DataCommunity