📅 Day 73 of #100DaysOfCode — and today the data told a story I didn't expect!

Today's focus: data visualization with Matplotlib using real StackOverflow data on programming language popularity from 2008 to 2020.

Here's what I worked through today:
🔧 Renamed DataFrame columns using the names parameter in read_csv() for cleaner, more readable data
📅 Converted messy datetime strings into proper pandas datetime objects — a crucial data cleaning step before any time series analysis
🔍 Used groupby() + sum() + idxmax() to identify the most popular programming language of all time by total posts (spoiler: JavaScript 👑)
📊 Filtered DataFrames using boolean indexing to isolate specific languages for visualization
📈 Plotted time series data with Matplotlib — first a single language, then overlaid two languages on the same chart

The most compelling insight? The chart says it all:
🔵 Java peaked around 2013-2014 and has been declining ever since
🟠 Python has been on a relentless rise — and by 2020, it's not even close

The numbers don't lie. If you're wondering whether to learn Python, the StackOverflow community already voted with their questions.

Onward to Day 74! 💪

#Python #Pandas #Matplotlib #DataVisualization #100DaysOfCode #DataScience #ContinuousLearning #MicrosoftFabric
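The steps above fit in a short, runnable sketch. The column names (DATE, TAG, POSTS) and the inline sample rows are invented stand-ins for the real StackOverflow export, and the Agg backend is set so the plot renders without a display:

```python
from io import StringIO

import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Stand-in for the StackOverflow export (hypothetical layout: month, tag, post count).
csv_data = StringIO("""month,tag,posts
2008-09-01,java,500
2008-09-01,python,100
2020-06-01,java,9000
2020-06-01,python,16000
""")

# names= renames the columns on read; header=0 tells pandas the file's own
# header row should be skipped rather than treated as data.
df = pd.read_csv(csv_data, names=["DATE", "TAG", "POSTS"], header=0)

# Messy date strings -> proper pandas datetime objects.
df["DATE"] = pd.to_datetime(df["DATE"])

# Most popular language of all time: groupby + sum + idxmax.
top = df.groupby("TAG")["POSTS"].sum().idxmax()

# Boolean indexing to isolate each language, then overlay both lines.
for tag in ["java", "python"]:
    subset = df[df["TAG"] == tag]
    plt.plot(subset["DATE"], subset["POSTS"], label=tag)
plt.legend()
plt.savefig("languages.png")
```

With this toy data the same pattern holds: `top` comes out as `"python"`, since its summed post count beats Java's.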
Data Visualization with Matplotlib and StackOverflow Data
More Relevant Posts
-
🚀 New Webinar: Fabric Data Engineering with Python Notebooks

📅 April 2, 2026 | 12:00–1:30 PM EDT | Online

If you're building on Microsoft Fabric and looking to do more with less, this session is going to be a game-changer. Python notebooks are quickly becoming the most cost-efficient and flexible way to engineer data in Fabric — especially for small teams and organizations watching capacity consumption closely.

In this webinar, we'll explore how to design smarter pipelines using modern libraries like Polars, Delta Lake, DuckDB, and MS SQL, and how to evaluate cost tradeoffs using the Capacity Metrics app.

🎤 Speaker: John Miner
Senior Data Architect at Insight Digital Innovation
10x Microsoft MVP | 30+ years of data engineering expertise

John will walk through practical patterns, real-world examples, and cost-optimized design strategies you can apply immediately.

💡 You'll learn:
- Why Spark notebooks and Dataflows Gen2 can be more expensive than Python notebooks
- How to build efficient ETL pipelines using modern Python data libraries
- How to compare engineering designs using Fabric's Capacity Metrics
- How small companies can maximize value with minimal capacity

🔗 Register here: https://lnkd.in/dnm6irSM

FutureDataDriven CloudDataDriven #microsoftfabric #dataengineering #python
-
🔢 Why NumPy Matters in Data Science (More Than I Thought)

Hi everyone! 👋

While learning Python for data work, I came across NumPy — and initially, it just looked like another library. But after spending some time with it, I realized why it's so widely used. At its core, NumPy is about working efficiently with numbers and arrays.

A few things that stood out to me:
✔️ Faster computations compared to regular Python lists
✔️ Ability to perform operations on entire datasets at once (no loops needed)
✔️ Foundation for libraries like Pandas and Scikit-learn

For example, instead of looping through values one by one, NumPy lets you do operations in a single line — which is both cleaner and faster.

This made me think about real-world scenarios: when dealing with large datasets, performance really matters. Even small optimizations can save a lot of time. Coming from SQL and ETL, this feels similar to optimizing queries — but now at a programming level.

Still exploring more, but it's clear that understanding NumPy well can make a big difference in data processing and model performance.

Have you used NumPy in your work? Or do you rely more on Pandas/SQL?

#DataScience #Python #NumPy #MachineLearning #LearningInPublic
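A tiny before/after shows what "operations on entire datasets at once" means in practice. This is a generic illustration, not code from the post:

```python
import numpy as np

# A million sample values as a plain Python list vs. a NumPy array.
values = list(range(1_000_000))
arr = np.array(values)

# Loop version: Python touches every element one by one.
doubled_loop = [v * 2 for v in values]

# Vectorized version: one expression applied to the whole array,
# executed in optimized C under the hood.
doubled_vec = arr * 2

# Same result either way; the vectorized form is shorter and much faster.
assert doubled_vec[-1] == doubled_loop[-1]
```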
-
Just shared my Data Visualization Notes!

I've created structured notes covering charts, graphs, and data storytelling concepts — designed for easy understanding and practical use.

📌 Available in:
🌐 HTML version (interactive): https://lnkd.in/dvhKH9dw
📄 PDF version (downloadable): https://lnkd.in/dh97QjWc

Perfect for students, beginners, and anyone looking to strengthen their data visualization skills.

#DataVisualization #DataScience #Python #Learning #GitHub #StudentProjects
-
🚀 Built a Python Project: Corporate Data Analyzer

Most business users struggle to analyze raw data efficiently without technical tools. So I built a simple desktop application to solve this problem.

💡 What it does:
• Import CSV / Excel data
• Perform GroupBy & aggregations (sum, mean, max, etc.)
• Generate interactive charts (Bar, Line, Pie)
• Export reports (Excel/CSV)
• Export charts as PNG

🛠 Tech Stack: Python | Pandas | Tkinter | NumPy | Matplotlib

📊 This project helped me improve:
✔ Data analysis using Pandas
✔ GUI development using Tkinter
✔ Data visualization using Matplotlib
✔ Building end-to-end real-world tools

🔗 GitHub Repository: https://lnkd.in/giyeMwRd

I'd really appreciate your feedback and suggestions!

#Python #DataAnalytics #Projects #GitHub #Learning #DataScience #Portfolio #OpenToWork
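The GroupBy, chart, and export features described above boil down to a few Pandas/Matplotlib calls. This is a sketch of the underlying mechanics, not the app's actual code; the region/sales data and file names are invented (Excel export would use summary.to_excel, which needs openpyxl):

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render charts without a display
import matplotlib.pyplot as plt

# Hypothetical corporate data standing in for an imported CSV/Excel file.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "sales":  [120.0, 80.0, 150.0, 95.0],
})

# GroupBy + aggregations (sum, mean, max) in one call.
summary = df.groupby("region")["sales"].agg(["sum", "mean", "max"])

# Generate a bar chart and export it as PNG.
summary["sum"].plot(kind="bar", title="Sales by region")
plt.tight_layout()
plt.savefig("sales_by_region.png")

# Export the report as CSV.
summary.to_csv("sales_report.csv")
```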
-
Day 9/120 – Today I learned something most beginners ignore… but pros don't 😳🔥

Yesterday → Lists
Today → CONTROL over data 👇

👉 Tuples & Sets in Python

Here's the problem 🤯
Lists can be changed anytime… but what if your data SHOULD NOT change? ❌
Examples:
Coordinates 📍
Dates 📅
Configurations ⚙️

That's where TUPLES come in 👇
data = (10, 20, 30)
✔ Cannot be modified
✔ Safe & reliable

Now comes something even more powerful 👇
👉 SETS
nums = {1, 2, 2, 3, 3}
Output? 😳 {1, 2, 3}
✔ No duplicates
✔ Clean data

This is HUGE in Data Analytics 📊
Now I can:
✔ Protect data (Tuples)
✔ Clean data (Sets)

This is getting serious now 🔥
Comment "DATA" if you're learning with me 💪

#Day9 #Python #DataAnalytics #LearningInPublic #CodingJourney #Consistency
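Both ideas above, runnable (generic examples, not from the post):

```python
# Tuples: fixed data that should not change, e.g. a coordinate.
point = (10, 20, 30)
try:
    point[0] = 99          # any attempt to modify raises TypeError
except TypeError:
    print("tuples are immutable")

# Sets: duplicates are dropped automatically.
nums = {1, 2, 2, 3, 3}
print(len(nums))           # 3 unique values remain

# Practical cleanup: dedupe a list of survey answers in one line.
answers = ["yes", "no", "yes", "yes"]
unique_answers = set(answers)
print(sorted(unique_answers))  # ['no', 'yes']
```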
-
I Tracked My Expenses Using Python & NumPy — Here's What ₹38,940 Taught Me About My Spending Habits

I built a Personal Finance Tracker using just Python and NumPy — no Pandas, no fancy libraries. Here's what I discovered about my own spending 👇

The project started simple: a CSV file with 50 transactions across 3 months. But when I ran the numbers through NumPy, the insights hit different.

What the data revealed:
• Shopping eats 40% of my budget — with just 6 transactions
• My Top 5 purchases alone = 36% of total spending
• Average spend (₹779) vs Median (₹465) — proof that a few big buys skew everything
• 56% of money goes to just 11 "high-tier" transactions

What I actually built:
→ Read raw CSV data using Python's csv module
→ Converted everything to NumPy arrays for fast computation
→ Used np.sum(), np.mean(), np.max(), np.median(), np.std()
→ Boolean masking to filter by category & month
→ np.argsort() to rank top expenses
→ np.percentile() for distribution analysis
→ A formatted summary report printed right to the console

Key takeaway: you don't need complex tools to get powerful insights. NumPy + a CSV file + curiosity = real, actionable data about your life.

Watch the screen recording below to see the full report output!

This is Week 1 of my Python data journey. Next stop: Pandas & Matplotlib.

#NumPy #DataAnalysis #PersonalFinance #LearningInPublic #PythonProjects #BuildInPublic #Python #DataScience #CodeNewbie #Programming #TechTwitter #DataDriven #100DaysOfCode #FinanceTracker
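A condensed sketch of the same pipeline. The amounts and categories below are invented sample data, not the author's ₹38,940 of real transactions:

```python
import numpy as np

# Hypothetical transactions standing in for the CSV: amounts (₹) and categories.
amounts = np.array([1200, 300, 450, 2500, 150, 800, 950, 200, 3200, 500])
categories = np.array(["shopping", "food", "food", "shopping", "travel",
                       "shopping", "food", "travel", "shopping", "food"])

# Summary statistics in one call each.
total = np.sum(amounts)
mean, median = np.mean(amounts), np.median(amounts)

# Boolean masking: filter every shopping transaction.
shopping = amounts[categories == "shopping"]
shopping_share = shopping.sum() / total

# np.argsort ranks expenses; it sorts ascending, so reverse and take the top 3.
top3 = amounts[np.argsort(amounts)[::-1][:3]]

# np.percentile for distribution analysis: the 90th-percentile spend.
p90 = np.percentile(amounts, 90)

print(f"total={total}, shopping share={shopping_share:.0%}, top3={top3}")
```

The mean-vs-median gap the post highlights shows up even here: a couple of large purchases pull the mean well above the median.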
-
🚀 I just published my first Python library on PyPI!

As a self-taught developer learning Data Science, I faced a simple but annoying problem every day:
❌ print(df) → boring console output
❌ Hard to read 3500+ rows in the terminal
❌ No visual info about nulls or duplicates

So I built the solution myself. 💡

✅ Introducing ViewTable — a beautiful GUI table viewer for Pandas DataFrames!

📦 pip install viewtb

🔥 What it does:
→ Opens a beautiful dark-mode GUI table
→ Shows null cells in Blue
→ Shows duplicate rows in Red
→ Sidebar with dataset info — rows, columns, memory
→ Just ONE line of code!

💻 Usage:
import pandas as pd
from viewtb import ViewTable

df = pd.read_csv('data.csv')
df.dropna(inplace=True)
ViewTable(df, info=True) ✨

Built with: 🐍 Python 🐼 Pandas 🎨 Tkinter Canvas + CustomTkinter

This is Day 1 of my Data Science journey. Small library. Big learning. 🙏

👇 Check it out:
🔗 GitHub: [your link]
📦 PyPI: [your link]

#Python #DataScience #OpenSource #MachineLearning #100DaysOfCode #Programming #buildinpublic
-
Python libraries every data analyst needs.

The only Python libraries you need to start:
📊 pandas: data manipulation
📈 matplotlib + seaborn: visualization
🔢 numpy: numerical computing
📋 openpyxl: Excel automation
🔌 sqlalchemy: database connections

That's it. Master these 5 and you can handle 90% of real-world analytics work. Don't get distracted by ML libraries until the basics are solid.

#Python #DataAnalytics #DataTools #Pandas
-
Most people learn Python for Data Analysis like this:

Watch tutorials. Copy code. Feel productive.

But when it's time to analyze real data… they get stuck. Because tools ≠ understanding.

Two libraries change everything when used right: Pandas and Matplotlib.

Here's the simple way to think about them:

→ Pandas = Thinking in tables
You clean data, filter it, group it, and actually understand what's going on

→ Matplotlib = Seeing patterns
You turn numbers into visuals so insights become obvious

You need to ask: "What question am I trying to answer?"

Because real Data Analysis isn't about code. It's about:
→ Asking better questions
→ Exploring data step by step
→ Communicating insights clearly

Pandas helps you explore. Matplotlib helps you explain. Together, they turn raw data into decisions.

Quick question: what kind of data would you actually like to analyze? 👇
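The explore-then-explain loop above in miniature. The question, the data, and the file name are invented for illustration, and the Agg backend keeps it headless:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

# Start with the question: "Which product is growing?" (invented sample data)
df = pd.DataFrame({
    "month":   [1, 2, 3, 1, 2, 3],
    "product": ["A", "A", "A", "B", "B", "B"],
    "sales":   [100, 120, 150, 200, 190, 170],
})

# Explore with Pandas: group and pivot until the pattern is visible as a table
# (rows = products, columns = months).
trend = df.groupby(["product", "month"])["sales"].sum().unstack()

# Explain with Matplotlib: one line per product makes the answer obvious —
# A is rising while B slips.
trend.T.plot(marker="o", title="Sales by product over time")
plt.xlabel("month")
plt.savefig("trend.png")
```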