Unpopular opinion: Is Jupyter *really* the best tool for *everything* in your data science workflow? 🤔

While notebooks are great for exploration, let's talk about building robust, maintainable projects. I'm advocating for a move towards:

* Modular code (.py files): for better organization and reusability.
* Git versioning: because "final_version_v2_FINAL.ipynb" gives me nightmares.
* Unit testing: catching bugs before they become full-blown crises.

Are we over-relying on notebooks? What are your thoughts on moving towards more structured approaches in data science? Share your experiences in the comments! 👇

#DataScience #MachineLearning #Python #SoftwareEngineering #CodeQuality
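A minimal sketch of what that shift looks like in practice. The module name metrics.py, the normalize function, and the test file are all invented for illustration; in a real project the test would live in its own file and run under pytest:

```python
# metrics.py -- analysis logic pulled out of the notebook into an importable module
def normalize(values):
    """Scale a list of numbers into the 0-1 range."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# test_metrics.py -- a unit test catches regressions before they reach an analysis
def test_normalize():
    assert normalize([0, 5, 10]) == [0.0, 0.5, 1.0]
    assert normalize([3, 3]) == [0.0, 0.0]


test_normalize()
```

Because the logic lives in a plain .py file, both the notebook and the test suite can import it, and Git diffs stay readable.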
Rethinking Jupyter's Role in Data Science Workflows
More Relevant Posts
🔥 Day 4 – Pandas Selection & Production-Style Filtering

Today I focused on strengthening my data selection and filtering skills using Pandas, and on doing it the right way. Instead of just filtering rows, I practiced production-style defensive programming.

Here’s what I worked on:
✅ Column & row selection using .loc and .iloc
✅ Boolean filtering with multiple conditions
✅ Cleaning messy CSV column names
✅ Safe numeric conversion using pd.to_numeric()
✅ Writing a custom function to parse "HH:MM" delay values into proper Timedelta objects
✅ Handling invalid values using pd.NaT
✅ Preventing runtime errors with defensive filtering logic

Built a workflow that:
• Filters orders with Miles ≤ 30
• Converts delay strings into real time objects
• Filters delays ≤ 30 minutes
• Ensures no invalid comparisons occur

Real-world data is messy. Learning how to clean, validate, and safely filter it is what turns simple analysis into production-ready logic.

📂 GitHub Repository: https://lnkd.in/gNWeQ5KE

On to Day 5 🚀

#Python #Pandas #DataEngineering #Analytics #LearningInPublic #100DaysOfCode
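The linked repo has the actual Day 4 code; as a rough sketch of the defensive pattern described, with invented column names (Miles, Delay) and sample values:

```python
import pandas as pd

def parse_delay(value):
    """Parse an 'HH:MM' string into a Timedelta; invalid input becomes pd.NaT."""
    try:
        hours, minutes = str(value).split(":")
        return pd.Timedelta(hours=int(hours), minutes=int(minutes))
    except ValueError:
        return pd.NaT

orders = pd.DataFrame({
    "Miles": ["12", "45", "8", "not recorded"],
    "Delay": ["00:10", "01:05", "00:25", "soon"],
})

# Defensive numeric conversion: bad entries become NaN instead of raising
orders["Miles"] = pd.to_numeric(orders["Miles"], errors="coerce")
orders["Delay"] = orders["Delay"].map(parse_delay)

# Filter Miles <= 30 and delay <= 30 minutes; NaN/NaT rows simply drop out,
# so no invalid comparison ever raises
mask = (orders["Miles"] <= 30) & (orders["Delay"] <= pd.Timedelta(minutes=30))
result = orders[mask]
print(result)
```

The key idea is that `errors="coerce"` and the NaT fallback turn dirty values into missing values, and missing values compare as False in the mask instead of crashing the pipeline.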
DAY 5/30 – Excel for Analytics with TS Academy

Our tutor Ezekiel Aleke strongly emphasized starting with Microsoft Excel before moving into more advanced tools like Python or SQL. At first, it sounded simple. But he explained that it's not about starting with the complex tools; it's about building analytical thinking.

Many professionals still rely on Excel for high-level business decisions. That alone says a lot about its relevance. Do not rush past the foundation.

#30DaysOfTech #LearningWithTS
Registering a Virtual Environment as a Jupyter Kernel

If you’ve ever:
❌ Activated a virtual environment
❌ Opened Jupyter Notebook
❌ Still seen ModuleNotFoundError

This is why 👇

👉 Jupyter doesn’t detect environments; it detects kernels.

In this reel, I show:
✅ Why your venv doesn’t appear
✅ Why installing Anaconda is NOT required
✅ How to properly register your venv as a Jupyter kernel
✅ A lightweight, production-ready workflow

This fix is:
✔ VM-friendly
✔ Low-RAM
✔ Industry standard

If you work with Python, Data Science, or Machine Learning, this will save you hours.

📖 Complete setup guide available on GitHub
🖇 Follow the link for the full walkthrough shown in this video: https://lnkd.in/dxu-nBYD

🔁 Re-share to help other developers
💾 Save this for later
💬 Comment “kernel” if this helped you

#Python #JupyterNotebook #VirtualEnvironment #DataScience #MachineLearning #DeveloperTips #PythonTips #VSCode
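The video and GitHub guide have the full walkthrough; the usual command sequence looks roughly like this (environment-setup commands, and the kernel name my-venv is a placeholder):

```shell
# Inside the project folder: create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate

# Install ipykernel into the venv, then register the venv as a named kernel
pip install ipykernel
python -m ipykernel install --user --name my-venv --display-name "Python (my-venv)"

# The kernel now appears in Jupyter's kernel picker; verify with:
jupyter kernelspec list
```

Once registered, packages installed into `.venv` are importable whenever that kernel is selected, with no Anaconda required.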
🔥 Python Guide for Beginners – Complete crash course, from zero to hero!

-> Basics: syntax, variables, data types (lists/tuples/sets/dicts)
-> Control flow: if/else/while/for loops
-> Functions: args/kwargs/lambda/recursion
-> OOP: classes/inheritance/polymorphism
-> Modules, files, exceptions, JSON/CSV
-> Libs: NumPy/Pandas/Matplotlib/Flask APIs
-> Data Eng gold: ETL, APIs, DBs (SQLite/MySQL)

Perfect PySpark/DE foundation – install, REPL, code ready!

#Python #Beginners #DataEngineering
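As a small taste of the functions section above, a sketch of *args/**kwargs and lambda (function and variable names invented for illustration):

```python
def describe(*args, **kwargs):
    """Collect positional args into a tuple and keyword args into a dict."""
    return f"args={args}, kwargs={kwargs}"

# Positional values land in args, named values in kwargs
print(describe(1, 2, tool="pandas"))

# lambda: a small anonymous function, handy as a sort key
names = ["Flask", "NumPy", "Pandas"]
print(sorted(names, key=lambda s: len(s)))
```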
🚀 Pandas Data Analysis – Part 1 | Python for Data Science

I’ve started building a practical learning series on Data Analysis using Python. In this Part 1, I worked on understanding and implementing core concepts of Pandas with real datasets using Jupyter Notebook.

📊 Topics Covered in this PDF & Notebook:
✔ Introduction to Pandas
✔ Installing and importing Pandas
✔ Series and DataFrame basics
✔ Reading different dataset formats (CSV, Excel, etc.)
✔ Data inspection methods
✔ Selecting columns and rows
✔ Indexing and slicing data
✔ Handling missing values
✔ Data cleaning techniques
✔ Basic data analysis operations
✔ Statistical functions
✔ Sorting and filtering data

All concepts are implemented step-by-step with practical examples and datasets.

🔗 GitHub Repository (Notebook + Dataset): https://lnkd.in/g9v4uXFC

📄 I have also attached a detailed PDF explaining the concepts covered in this project.

This is Part 1 of my Data Analysis learning journey; more advanced parts coming soon! Feedback and suggestions are always welcome 🙌

#Python #Pandas #DataAnalysis #MachineLearning #AIML #DataScience #GitHub #StudentDeveloper #LearningInPublic
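The notebook in the repo has the full examples; a compact sketch of a few of the listed operations, with data and column names invented for illustration (a real run would load a file via pd.read_csv):

```python
import numpy as np
import pandas as pd

# A tiny DataFrame standing in for a loaded dataset
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "score": [85.0, np.nan, 92.0],
})

# Inspection: shape and first rows
print(df.shape)
print(df.head())

# Selecting columns and rows
scores = df["score"]                     # one column -> Series
first_row = df.iloc[0]                   # row by integer position
top = df.loc[df["score"] > 90, "name"]   # boolean indexing by label

# Handling missing values: fill NaN with the column mean
df["score"] = df["score"].fillna(df["score"].mean())
print(df)
```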
🚀 Day 5/100 – Working with Persistent Storage

🧠 “Persistence transforms execution into continuity.” Systems become meaningful when they retain and retrieve information reliably.

Today, I learned how Python interacts with files to store and retrieve persistent data. ⚙️

🔧 Today’s focus areas:
📂 File reading: accessing stored data
📝 File writing: persisting new information
🔄 File modes: managing read and write operations
🎯 Data persistence: ensuring continuity across executions

The objective was to enable programs to maintain state beyond runtime.

✅ Day 5 complete: persistent data handling established.
▶️ Day 6: strengthening reliability through exception handling.

Step by step. The system evolves. 🏗️

#Python #BackendDevelopment #100DaysOfCode #SoftwareEngineering
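A minimal sketch of the read/write/modes ideas above (the file name and contents are placeholders):

```python
from pathlib import Path

path = Path("state_demo.txt")

# Mode "w" creates or truncates the file; "a" would append instead
with open(path, "w") as f:
    f.write("run_count=1\n")

# On a later run, the program reads the stored state back
with open(path, "r") as f:
    content = f.read()

print(content.strip())
path.unlink()  # remove the demo file
```

The `with` blocks guarantee the file is closed even if an error occurs mid-operation, which pairs naturally with the Day 6 topic of exception handling.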
Day 9/100: Mastering Python Dictionaries & Data Nesting!

Today was all about organizing complex data. I moved from simple lists to dictionaries, which allow for much more powerful data management using key-value pairs.

What I covered today:
• Dictionary basics: creating, editing, and wiping dictionaries.
• The “secret” of nesting: placing lists inside dictionaries and even dictionaries inside dictionaries!
• Looping: how to efficiently iterate through keys and values.

Daily Project: Secret Auction (Bidding System)
I built a blind auction program where multiple users can enter their names and bids. The program keeps the bids secret and automatically identifies the highest bidder at the end. This project was a great test of my ability to manage user input within nested data structures.

Check out the “Secret Auction” code here: https://lnkd.in/edbJz2bW

#Python #100DaysOfCode #DataStructures #Dictionaries #CodingJourney #BackendDevelopment
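The linked repo has the actual Secret Auction code; this is just a compressed sketch of the core idea, picking the highest bidder from a dict and nesting dicts inside dicts (names and amounts invented, and user input replaced by fixed data):

```python
# Names map to bids; in the real project these come from input() calls
bids = {"alice": 120, "bob": 95, "cara": 150}

# max() with key=bids.get returns the key whose value is largest
highest_bidder = max(bids, key=bids.get)
print(f"Winner: {highest_bidder} with ${bids[highest_bidder]}")

# Nesting: a dict of dicts holds a richer record per user
records = {name: {"bid": amount, "paid": False} for name, amount in bids.items()}

# Looping over keys and values together with .items()
for name, info in records.items():
    print(name, info["bid"])
```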
Day 1 of learning Pandas and I survived. 🐼

Honestly? I had no idea what I was getting into. But here I am after Day 1, having gone through:
📂 Loading data (yes, I learned what r" " does and why it matters 😅)
🔍 Filtering rows like a detective
🗂️ Indexing (loc vs iloc broke my brain for a bit, ngl)
📊 GroupBy & aggregation: basically Excel PivotTables but cooler
🔗 Merging DataFrames: SQL vibes but in Python
📈 Visualizing data with just one line of code

6 lessons. 60 exercises. 1 day. 0 regrets.

I’m sharing my journey here as I go: the wins, the confusion, and everything in between. If you’re on a similar path, let’s connect! And if you’ve already been through this... any tips for Day 2? 👇

#Python #Pandas #LearningInPublic #DataScience #100DaysOfCode #JustStarted
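A tiny sketch pulling several of those Day 1 topics together; the sample data is invented, not from the course:

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["N", "S", "N", "S"],
    "amount": [100, 200, 150, 80],
})

# loc filters by label/boolean mask; iloc slices by integer position
north = sales.loc[sales["region"] == "N"]
first_two = sales.iloc[:2]

# GroupBy + aggregation: the PivotTable analogue
totals = sales.groupby("region")["amount"].sum().reset_index()

# Merge: an SQL-style join on a shared key
labels = pd.DataFrame({"region": ["N", "S"], "label": ["North", "South"]})
merged = totals.merge(labels, on="region", how="left")
print(merged)
```

And the "one line of code" visualization from the post is typically `merged.plot(x="label", y="amount", kind="bar")`.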
🚀 New Project Released: NumPy Data Explorer

I’m excited to share my latest project, NumPy Data Explorer: a versatile Python tool designed to help users explore and analyze data using NumPy with ease. 🐍📊

🔗 Check it out here: https://lnkd.in/dV9cyAUx

Key features:
• Effortless data analysis using NumPy arrays
• Interactive data summarization
• Simple, clean codebase, perfect for learners and practitioners

Would love your feedback, stars ⭐ and contributions!

#Python #NumPy #DataScience #OpenSource #GitHub
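The repo has the real tool; as a flavor of the kind of NumPy-based summarization such an explorer might build on (array values invented for illustration):

```python
import numpy as np

# A small 2-D array standing in for a loaded dataset
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Summaries along each axis
col_means = data.mean(axis=0)   # one value per column
row_sums = data.sum(axis=1)     # one value per row

# An overall summary dict, the kind of output an explorer tool might print
summary = {"min": float(data.min()), "max": float(data.max()), "mean": float(data.mean())}
print(col_means, row_sums, summary)
```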
I’ve continued my practical learning journey in Data Analysis using Python. In this Part 2, I focused on advanced Pandas operations that are essential for real-world data manipulation, cleaning, and combining multiple datasets using Jupyter Notebook.

📊 Topics Covered in this PDF & Notebook:
✔ Data modification (adding, updating, removing columns)
✔ Detecting and handling missing data
✔ Filling missing values (mean, median, mode)
✔ Interpolation techniques
✔ Sorting data (single & multiple columns)
✔ Aggregation functions (sum, mean, max, min, agg)
✔ GroupBy operations for category-wise analysis
✔ Merging DataFrames (inner, left, right, outer, cross join)
✔ Concatenating DataFrames (vertical & horizontal)
✔ Real-world data cleaning and transformation techniques

All concepts are implemented step-by-step with practical examples and datasets.

🔗 GitHub Repository (Notebook): https://lnkd.in/gYgkth2v

📄 I have also attached a detailed PDF explaining the concepts covered in this project.

This is Part 2 of my Data Analysis learning journey. Feedback and suggestions are always welcome 🙌

#Python #Pandas #DataAnalysis #MachineLearning #AIML #DataScience #GitHub #StudentDeveloper #LearningInPublic
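The notebook covers these topics in depth; a compact sketch of a few of the operations listed, with invented data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "temp": [20.0, np.nan, 30.0, 34.0],
})

# Fill missing values with the column mean (median/mode follow the same pattern)
df["temp_filled"] = df["temp"].fillna(df["temp"].mean())

# GroupBy for category-wise analysis
by_city = df.groupby("city")["temp_filled"].mean()

# Outer merge keeps unmatched keys from both sides
pop = pd.DataFrame({"city": ["A", "C"], "pop": [1.2, 3.4]})
merged = df.merge(pop, on="city", how="outer")
print(merged)
```

Note how the outer join surfaces city "C" with NaN in the data columns, exactly the kind of gap the missing-data techniques above then handle.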