Unpopular opinion: Is Jupyter *really* the best tool for *everything* in your data science workflow? 🤔

While notebooks are great for exploration, let's talk about building robust, maintainable projects. I'm advocating for a move towards:

* Modular code (.py files): for better organization and reusability.
* Git versioning: because "final_version_v2_FINAL.ipynb" gives me nightmares.
* Unit testing: catching bugs before they become full-blown crises.

Are we over-relying on notebooks? What are your thoughts on moving towards more structured approaches in data science? Share your experiences in the comments! 👇

#DataScience #MachineLearning #Python #SoftwareEngineering #CodeQuality
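A minimal sketch of what that shift looks like in practice. The module name metrics.py, the normalize function, and the test file are all invented for illustration; in a real project the test would live in its own file and run under pytest:

```python
# metrics.py -- analysis logic pulled out of the notebook into an importable module
def normalize(values):
    """Scale a list of numbers into the 0-1 range."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# test_metrics.py -- a unit test catches regressions before they reach an analysis
def test_normalize():
    assert normalize([0, 5, 10]) == [0.0, 0.5, 1.0]
    assert normalize([3, 3]) == [0.0, 0.0]


test_normalize()
```

Because the logic lives in a plain .py file, both the notebook and the test suite can import it, and Git diffs stay readable.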
Rethinking Jupyter's Role in Data Science Workflows
More Relevant Posts
🔥 Day 4 – Pandas Selection & Production-Style Filtering

Today I focused on strengthening my data selection and filtering skills using Pandas, and on doing it the right way. Instead of just filtering rows, I practiced production-style defensive programming.

Here’s what I worked on:
✅ Column & row selection using .loc and .iloc
✅ Boolean filtering with multiple conditions
✅ Cleaning messy CSV column names
✅ Safe numeric conversion using pd.to_numeric()
✅ Writing a custom function to parse "HH:MM" delay values into proper Timedelta objects
✅ Handling invalid values using pd.NaT
✅ Preventing runtime errors with defensive filtering logic

Built a workflow that:
• Filters orders with Miles ≤ 30
• Converts delay strings into real time objects
• Filters delays ≤ 30 minutes
• Ensures no invalid comparisons occur

Real-world data is messy. Learning how to clean, validate, and safely filter it is what turns simple analysis into production-ready logic.

📂 GitHub Repository: https://lnkd.in/gNWeQ5KE

On to Day 5 🚀

#Python #Pandas #DataEngineering #Analytics #LearningInPublic #100DaysOfCode
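The linked repo has the actual Day 4 code; as a rough sketch of the defensive pattern described, with invented column names (Miles, Delay) and sample values:

```python
import pandas as pd

def parse_delay(value):
    """Parse an 'HH:MM' string into a Timedelta; invalid input becomes pd.NaT."""
    try:
        hours, minutes = str(value).split(":")
        return pd.Timedelta(hours=int(hours), minutes=int(minutes))
    except ValueError:
        return pd.NaT

orders = pd.DataFrame({
    "Miles": ["12", "45", "8", "not recorded"],
    "Delay": ["00:10", "01:05", "00:25", "soon"],
})

# Defensive numeric conversion: bad entries become NaN instead of raising
orders["Miles"] = pd.to_numeric(orders["Miles"], errors="coerce")
orders["Delay"] = orders["Delay"].map(parse_delay)

# Filter Miles <= 30 and delay <= 30 minutes; NaN/NaT rows simply drop out,
# so no invalid comparison ever raises
mask = (orders["Miles"] <= 30) & (orders["Delay"] <= pd.Timedelta(minutes=30))
result = orders[mask]
print(result)
```

The key idea is that `errors="coerce"` and the NaT fallback turn dirty values into missing values, and missing values compare as False in the mask instead of crashing the pipeline.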
DAY 5/30 – Excel for Analytics with TS Academy

Our tutor Ezekiel Aleke strongly emphasized starting with Microsoft Excel before moving into more advanced tools like Python or SQL. At first, it sounded simple. But he explained that it's not about starting with the complex tools; it's about building analytical thinking.

Many professionals still rely on Excel for high-level business decisions. That alone says a lot about its relevance. Do not rush past the foundation.

#30DaysOfTech #LearningWithTS
Registering a Virtual Environment as a Jupyter Kernel

If you’ve ever:
❌ Activated a virtual environment
❌ Opened Jupyter Notebook
❌ Still seen ModuleNotFoundError

This is why 👇

👉 Jupyter doesn’t detect environments; it detects kernels.

In this reel, I show:
✅ Why your venv doesn’t appear
✅ Why installing Anaconda is NOT required
✅ How to properly register your venv as a Jupyter kernel
✅ A lightweight, production-ready workflow

This fix is:
✔ VM-friendly
✔ Low-RAM
✔ Industry standard

If you work with Python, Data Science, or Machine Learning, this will save you hours.

📖 Complete setup guide available on GitHub
🖇 Follow the link for the full walkthrough shown in this video: https://lnkd.in/dxu-nBYD

🔁 Re-share to help other developers
💾 Save this for later
💬 Comment “kernel” if this helped you

#Python #JupyterNotebook #VirtualEnvironment #DataScience #MachineLearning #DeveloperTips #PythonTips #VSCode
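The video and GitHub guide have the full walkthrough; the usual command sequence looks roughly like this (environment-setup commands, and the kernel name my-venv is a placeholder):

```shell
# Inside the project folder: create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate

# Install ipykernel into the venv, then register the venv as a named kernel
pip install ipykernel
python -m ipykernel install --user --name my-venv --display-name "Python (my-venv)"

# The kernel now appears in Jupyter's kernel picker; verify with:
jupyter kernelspec list
```

Once registered, packages installed into `.venv` are importable whenever that kernel is selected, with no Anaconda required.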
🔥 Python Guide for Beginners – Complete crash course, from zero to hero!

-> Basics: syntax, variables, data types (lists/tuples/sets/dicts)
-> Control flow: if/else/while/for loops
-> Functions: args/kwargs/lambda/recursion
-> OOP: classes/inheritance/polymorphism
-> Modules, files, exceptions, JSON/CSV
-> Libs: NumPy/Pandas/Matplotlib/Flask APIs
-> Data Eng gold: ETL, APIs, DBs (SQLite/MySQL)

Perfect PySpark/DE foundation – install, REPL, code ready!

#Python #Beginners #DataEngineering
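As a small taste of the functions section above, a sketch of *args/**kwargs and lambda (function and variable names invented for illustration):

```python
def describe(*args, **kwargs):
    """Collect positional args into a tuple and keyword args into a dict."""
    return f"args={args}, kwargs={kwargs}"

# Positional values land in args, named values in kwargs
print(describe(1, 2, tool="pandas"))

# lambda: a small anonymous function, handy as a sort key
names = ["Flask", "NumPy", "Pandas"]
print(sorted(names, key=lambda s: len(s)))
```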
🚀 Pandas Data Analysis – Part 1 | Python for Data Science

I’ve started building a practical learning series on Data Analysis using Python. In this Part 1, I worked on understanding and implementing core concepts of Pandas with real datasets using Jupyter Notebook.

📊 Topics Covered in this PDF & Notebook:
✔ Introduction to Pandas
✔ Installing and importing Pandas
✔ Series and DataFrame basics
✔ Reading different dataset formats (CSV, Excel, etc.)
✔ Data inspection methods
✔ Selecting columns and rows
✔ Indexing and slicing data
✔ Handling missing values
✔ Data cleaning techniques
✔ Basic data analysis operations
✔ Statistical functions
✔ Sorting and filtering data

All concepts are implemented step-by-step with practical examples and datasets.

🔗 GitHub Repository (Notebook + Dataset): https://lnkd.in/g9v4uXFC

📄 I have also attached a detailed PDF explaining the concepts covered in this project.

This is Part 1 of my Data Analysis learning journey; more advanced parts coming soon! Feedback and suggestions are always welcome 🙌

#Python #Pandas #DataAnalysis #MachineLearning #AIML #DataScience #GitHub #StudentDeveloper #LearningInPublic
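The notebook in the repo has the full examples; a compact sketch of a few of the listed operations, with data and column names invented for illustration (a real run would load a file via pd.read_csv):

```python
import numpy as np
import pandas as pd

# A tiny DataFrame standing in for a loaded dataset
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "score": [85.0, np.nan, 92.0],
})

# Inspection: shape and first rows
print(df.shape)
print(df.head())

# Selecting columns and rows
scores = df["score"]                     # one column -> Series
first_row = df.iloc[0]                   # row by integer position
top = df.loc[df["score"] > 90, "name"]   # boolean indexing by label

# Handling missing values: fill NaN with the column mean
df["score"] = df["score"].fillna(df["score"].mean())
print(df)
```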
🚀 Day 5/100 – Working with Persistent Storage

🧠 “Persistence transforms execution into continuity.” Systems become meaningful when they retain and retrieve information reliably.

Today, I learned how Python interacts with files to store and retrieve persistent data. ⚙️

🔧 Today’s focus areas:
📂 File reading: accessing stored data
📝 File writing: persisting new information
🔄 File modes: managing read and write operations
🎯 Data persistence: ensuring continuity across executions

The objective was to enable programs to maintain state beyond runtime.

✅ Day 5 complete: persistent data handling established.
▶️ Day 6: strengthening reliability through exception handling.

Step by step. The system evolves. 🏗️

#Python #BackendDevelopment #100DaysOfCode #SoftwareEngineering
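A minimal sketch of the read/write/modes ideas above (the file name and contents are placeholders):

```python
from pathlib import Path

path = Path("state_demo.txt")

# Mode "w" creates or truncates the file; "a" would append instead
with open(path, "w") as f:
    f.write("run_count=1\n")

# On a later run, the program reads the stored state back
with open(path, "r") as f:
    content = f.read()

print(content.strip())
path.unlink()  # remove the demo file
```

The `with` blocks guarantee the file is closed even if an error occurs mid-operation, which pairs naturally with the Day 6 topic of exception handling.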
Day 9/100: Mastering Python Dictionaries & Data Nesting!

Today was all about organizing complex data. I moved from simple lists to dictionaries, which allow for much more powerful data management using key-value pairs.

What I covered today:
• Dictionary basics: creating, editing, and wiping dictionaries.
• The “secret” of nesting: placing lists inside dictionaries and even dictionaries inside dictionaries!
• Looping: how to efficiently iterate through keys and values.

Daily Project: Secret Auction (Bidding System)
I built a blind auction program where multiple users can enter their names and bids. The program keeps the bids secret and automatically identifies the highest bidder at the end. This project was a great test of my ability to manage user input within nested data structures.

Check out the “Secret Auction” code here: https://lnkd.in/edbJz2bW

#Python #100DaysOfCode #DataStructures #Dictionaries #CodingJourney #BackendDevelopment
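The linked repo has the actual Secret Auction code; this is just a compressed sketch of the core idea, picking the highest bidder from a dict and nesting dicts inside dicts (names and amounts invented, and user input replaced by fixed data):

```python
# Names map to bids; in the real project these come from input() calls
bids = {"alice": 120, "bob": 95, "cara": 150}

# max() with key=bids.get returns the key whose value is largest
highest_bidder = max(bids, key=bids.get)
print(f"Winner: {highest_bidder} with ${bids[highest_bidder]}")

# Nesting: a dict of dicts holds a richer record per user
records = {name: {"bid": amount, "paid": False} for name, amount in bids.items()}

# Looping over keys and values together with .items()
for name, info in records.items():
    print(name, info["bid"])
```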
Day 1 of learning Pandas and I survived. 🐼

Honestly? I had no idea what I was getting into. But here I am after Day 1, having gone through:
📂 Loading data (yes, I learned what r" " does and why it matters 😅)
🔍 Filtering rows like a detective
🗂️ Indexing (loc vs iloc broke my brain for a bit, ngl)
📊 GroupBy & aggregation: basically Excel PivotTables but cooler
🔗 Merging DataFrames: SQL vibes but in Python
📈 Visualizing data with just one line of code

6 lessons. 60 exercises. 1 day. 0 regrets.

I’m sharing my journey here as I go: the wins, the confusion, and everything in between. If you’re on a similar path, let’s connect! And if you’ve already been through this... any tips for Day 2? 👇

#Python #Pandas #LearningInPublic #DataScience #100DaysOfCode #JustStarted
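A tiny sketch pulling several of those Day 1 topics together; the sample data is invented, not from the course:

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["N", "S", "N", "S"],
    "amount": [100, 200, 150, 80],
})

# loc filters by label/boolean mask; iloc slices by integer position
north = sales.loc[sales["region"] == "N"]
first_two = sales.iloc[:2]

# GroupBy + aggregation: the PivotTable analogue
totals = sales.groupby("region")["amount"].sum().reset_index()

# Merge: an SQL-style join on a shared key
labels = pd.DataFrame({"region": ["N", "S"], "label": ["North", "South"]})
merged = totals.merge(labels, on="region", how="left")
print(merged)
```

And the "one line of code" visualization from the post is typically `merged.plot(x="label", y="amount", kind="bar")`.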
🚀 New Project Released: NumPy Data Explorer

I’m excited to share my latest project, NumPy Data Explorer: a versatile Python tool designed to help users explore and analyze data using NumPy with ease. 🐍📊

🔗 Check it out here: https://lnkd.in/dV9cyAUx

Key features:
• Effortless data analysis using NumPy arrays
• Interactive data summarization
• Simple, clean codebase, perfect for learners and practitioners

Would love your feedback, stars ⭐ and contributions!

#Python #NumPy #DataScience #OpenSource #GitHub
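The repo has the real tool; as a flavor of the kind of NumPy-based summarization such an explorer might build on (array values invented for illustration):

```python
import numpy as np

# A small 2-D array standing in for a loaded dataset
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Summaries along each axis
col_means = data.mean(axis=0)   # one value per column
row_sums = data.sum(axis=1)     # one value per row

# An overall summary dict, the kind of output an explorer tool might print
summary = {"min": float(data.min()), "max": float(data.max()), "mean": float(data.mean())}
print(col_means, row_sums, summary)
```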
I’ve continued my practical learning journey in Data Analysis using Python. In this Part 2, I focused on advanced Pandas operations that are essential for real-world data manipulation, cleaning, and combining multiple datasets using Jupyter Notebook.

📊 Topics Covered in this PDF & Notebook:
✔ Data modification (adding, updating, removing columns)
✔ Detecting and handling missing data
✔ Filling missing values (mean, median, mode)
✔ Interpolation techniques
✔ Sorting data (single & multiple columns)
✔ Aggregation functions (sum, mean, max, min, agg)
✔ GroupBy operations for category-wise analysis
✔ Merging DataFrames (inner, left, right, outer, cross join)
✔ Concatenating DataFrames (vertical & horizontal)
✔ Real-world data cleaning and transformation techniques

All concepts are implemented step-by-step with practical examples and datasets.

🔗 GitHub Repository (Notebook): https://lnkd.in/gYgkth2v

📄 I have also attached a detailed PDF explaining the concepts covered in this project.

This is Part 2 of my Data Analysis learning journey. Feedback and suggestions are always welcome 🙌

#Python #Pandas #DataAnalysis #MachineLearning #AIML #DataScience #GitHub #StudentDeveloper #LearningInPublic
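The notebook covers these topics in depth; a compact sketch of a few of the operations listed, with invented data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "temp": [20.0, np.nan, 30.0, 34.0],
})

# Fill missing values with the column mean (median/mode follow the same pattern)
df["temp_filled"] = df["temp"].fillna(df["temp"].mean())

# GroupBy for category-wise analysis
by_city = df.groupby("city")["temp_filled"].mean()

# Outer merge keeps unmatched keys from both sides
pop = pd.DataFrame({"city": ["A", "C"], "pop": [1.2, 3.4]})
merged = df.merge(pop, on="city", how="outer")
print(merged)
```

Note how the outer join surfaces city "C" with NaN in the data columns, exactly the kind of gap the missing-data techniques above then handle.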