🚀 Days 41–56 of #100DaysOfCoding: Strengthening Problem-Solving with Arrays & NumPy

Over the past 16 days, I've focused on building a solid foundation in data structures and numerical computing. This phase has been about improving efficiency, mastering algorithmic thinking, and understanding data manipulation at a deeper level.

🔹 Core Array Problems Solved:
1️⃣ Find Min/Max Elements – implemented an O(n) single-pass approach without sorting, optimizing both time and space
2️⃣ In-Place Array Reversal – applied the two-pointer technique to reverse arrays efficiently in O(1) extra space
3️⃣ Element Frequency Counter – designed a function to count occurrences of target elements in linear time
4️⃣ Second Largest Element – solved using two tracking variables in a single traversal
5️⃣ Move Zeros to End – implemented a stable version that preserves element order; currently refining it with a two-pointer optimization

🔹 NumPy Fundamentals:
Explored essential NumPy operations for data analysis, including:
- Mean, median, standard deviation, variance
- Array slicing, broadcasting, and vectorized computations
These are fundamental tools for upcoming machine learning and data science projects.

🔹 Key Learnings:
✅ Optimizing both time and space complexity is critical
✅ In-place algorithms significantly improve memory efficiency
✅ Clean, simple solutions often outperform over-engineered ones

Next steps: optimizing current implementations and diving deeper into advanced data structures and algorithms.

GitHub Repository: https://lnkd.in/gsucUW-F

What's your favorite array-related problem or concept? I'd love to hear your thoughts in the comments 👇

#Python #NumPy #DataStructures #Algorithms #ProblemSolving #CodingChallenge #LearningInPublic #TechJourney #100DaysOfCode #yohancodes #selflearning
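Two of the problems above, sketched minimally in Python (my simplified versions for illustration, not necessarily the exact repo code):

```python
def reverse_in_place(arr):
    """Two-pointer in-place reversal: O(n) time, O(1) extra space."""
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1
    return arr


def move_zeros_to_end(arr):
    """Stable move-zeros pass using a write pointer: O(n) time, O(1) space.

    Non-zero elements keep their relative order; zeros collect at the end.
    """
    write = 0
    for read in range(len(arr)):
        if arr[read] != 0:
            arr[write], arr[read] = arr[read], arr[write]
            write += 1
    return arr
```

Both run in a single pass, which is what makes the in-place versions attractive memory-wise.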
Mastering Arrays and NumPy in 100 Days of Coding
More Relevant Posts
Diving deeper into performance optimization! 🚀

Memory-Mapped Arrays in NumPy: Processing Datasets Larger Than RAM

After our 162TB weather data pipeline, we explored NumPy's memory-mapping capabilities for large-scale data processing. This deep dive shares 7 critical lessons:
- Why dtype mismatches cost us hours of work
- How sequential access was 5-10× faster than random
- Strategic flush() patterns for data integrity
- Real performance gains: 10-20× RAM reduction, multi-core parallelism

Key insight: memory mapping isn't magic. It fails on small datasets and random access patterns. But for large-scale sequential processing? An absolute game changer.

Whether you're working with terabytes of data, building scalable ML pipelines, or hitting RAM limits, these lessons will save you debugging time. Link in comments 👇

What's your biggest challenge with large-scale data processing? Would love to hear your experiences!

#DataEngineering #Python #NumPy #MachineLearning #PerformanceOptimization #BigData
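As a rough illustration of the pattern described (file path, array size, and chunk size are made up for the example; the real pipeline is far larger), a memmap workflow with sequential chunked writes and an explicit flush() might look like:

```python
import os
import tempfile

import numpy as np

# Hypothetical file and size; a real pipeline would use a persistent path.
path = os.path.join(tempfile.mkdtemp(), "large_array.dat")
n = 1_000_000

# mode="w+" creates the backing file on disk; only the pages actually
# touched are resident in RAM at any time.
mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(n,))

# Sequential, chunked writes match the fast access pattern from the lessons.
chunk = 100_000
for start in range(0, n, chunk):
    mm[start:start + chunk] = np.arange(start, start + chunk, dtype=np.float64)

# Flush explicitly so the data is on disk before any reader opens the file.
mm.flush()

# Re-open read-only. dtype and shape must match the writer exactly;
# a mismatch silently reinterprets the bytes (the dtype lesson above).
ro = np.memmap(path, dtype=np.float64, mode="r", shape=(n,))
```

Note that `np.memmap` stores no header, so the reader is entirely responsible for supplying the correct dtype and shape.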
Excited to share the final evolution of my Top IMDb Movies project: from a data analysis deep dive to a deployed machine learning application.

After the initial exploratory analysis, I built a predictive model to answer a more nuanced question: "What attributes truly drive a movie's rating?" The process of building and deploying this model as a live Streamlit app was a challenging and incredibly insightful journey. My biggest takeaways weren't just about code, but about the practical realities of data science:

🔹 The Model's Story: Predicting a subjective outcome like a movie rating is inherently complex. The final XGBoost model achieved a 25% R-squared, which is a respectable result for a social science problem. More importantly, the low error metrics (like a MAPE of ~2%) prove the model's practical accuracy. This taught me that the context of a problem is just as important as the final score.

🔹 The Value of Debugging: I identified and corrected two subtle but critical forms of data leakage in my preprocessing pipeline. This experience was the most valuable lesson of the project, reinforcing the importance of a methodologically sound process.

🔹 Feature Engineering is the Real MVP: The most significant performance gains came from thoughtful feature engineering and selection, not from simply using a complex algorithm. Discovering that a simpler model with better features could outperform a complex one was a key insight.

This project has been a journey from a static CSV file to a functional, interactive application. I would be thrilled for you to try it out and share any feedback.

🚀 Live App Link: https://lnkd.in/gzCY7TJq
📖 Full Project & Code on GitHub: https://lnkd.in/gBKXtVtr

#DataScience #MachineLearning #DataAnalysis #Python #Streamlit #PortfolioProject #XGBoost #ScikitLearn #FeatureEngineering
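To illustrate the data-leakage point in isolation (with synthetic data, not the IMDb pipeline), one standard safeguard is to split before fitting any preprocessing, so a Pipeline fits the scaler on the training fold only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 3 features with a known linear relationship.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Split FIRST; the pipeline then fits StandardScaler on the training fold only.
# Scaling the full dataset before splitting leaks test-set statistics into
# training, which is exactly the kind of subtle bug a pipeline audit catches.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R² on truly held-out data
```

Wrapping the scaler and model in one `Pipeline` also keeps cross-validation honest, since each fold refits the scaler.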
Python + Data Science = Your Next Big Skill. 🚀

Diving in? Start here:
1️⃣ Set up: Anaconda or Jupyter Notebooks
2️⃣ Master: Pandas (data) + Matplotlib (visuals)
3️⃣ Level up: Scikit-Learn for predictive models

🔑 Key? Practice. Tackle real datasets. Join challenges. Build. Soon, you'll turn raw data into powerful stories. 💪

👉 Follow @EdTechInformative for more tech & data tips.
🔗 edtechinformative.uk
📌 25 Algorithms Every Programmer Should Know

🔗 Start mastering algorithms and data structures → https://lnkd.in/dR96YGaA

Searching
→ Linear Search Algorithm
→ Binary Search Algorithm
→ Depth First Search Algorithm
→ Breadth First Search Algorithm

Sorting
→ Insertion Sort
→ Heap Sort
→ Selection Sort
→ Merge Sort
→ Quick Sort
→ Counting Sort

Graphs
→ Kruskal's Algorithm
→ Dijkstra's Algorithm
→ Bellman–Ford Algorithm
→ Floyd–Warshall Algorithm
→ Topological Sort Algorithm
→ Flood Fill Algorithm
→ Lee Algorithm

Arrays
→ Kadane's Algorithm
→ Floyd's Cycle Detection Algorithm
→ Knuth–Morris–Pratt (KMP) Algorithm
→ Quickselect Algorithm
→ Boyer–Moore Majority Vote Algorithm

Basics
→ Huffman Coding Compression Algorithm
→ Euclid's Algorithm
→ Union–Find Algorithm (Disjoint Set Union)

Professional courses to boost your skills:

Python:
→ Meta Data Analyst Professional Certificate → https://lnkd.in/dtcBsxQm
→ Microsoft Python Development Professional Certificate → https://lnkd.in/dtRs5huq
→ Google IT Automation with Python Professional Certificate → https://lnkd.in/ddvJ4y3d

Data Science:
→ IBM Data Science → https://lnkd.in/dCZvDFwF

#ProgrammingValley #Algorithms #DataStructures #CodingInterview #ComputerScience #Programming
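As one concrete example from the Arrays group, a possible Python implementation of Kadane's algorithm for the maximum subarray sum:

```python
def kadane_max_subarray(nums):
    """Kadane's algorithm: maximum sum of a contiguous subarray.

    O(n) time, O(1) extra space. At each element, either extend the
    current run or start a fresh run at that element.
    """
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)  # extend the run or restart at x
        best = max(best, current)
    return best
```

Starting `best` from `nums[0]` rather than 0 makes the function correct even for all-negative inputs.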
🤖 [AI-ML: POST05] How the Regression Line Fits the Data (A Quick Visual Intuition!)

Before we jump into coding Linear Regression, let's take a moment to really grasp the core math, because once you get this, the code will feel effortless.

📊 Imagine this:
Blue dots → actual data points (Cost vs. No. of Features)
Red line → the model's predicted line

The machine tries to find the best-fit line that follows the equation:

Y = mX + c

It keeps adjusting the values of m and c, slightly changing the line's slope and position, until the total distance between the red line and all the blue dots is as small as possible (in practice, the sum of squared vertical distances). That's how your model learns the best-fit line ✅

💬 Why we're revisiting this: because this simple line is the heart of regression, and understanding it deeply makes the jump to code (and even advanced ML) much easier.

👉 If you want to brush up on your basics, this is a great time to quickly revise linear algebra concepts like slope, intercept, and mean.

🚀 Next Post Preview: I'll share my GitHub repo and the Python implementation where we'll actually plot this line and watch the math come alive!

#AIJourneyWithRishabh #ArtificialIntelligence #MachineLearning
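A bare-bones sketch of that adjustment loop, using plain gradient descent on mean squared error with made-up toy data (the "blue dots"):

```python
import numpy as np

# Toy data scattered around y = 2x + 1 (the "blue dots").
rng = np.random.default_rng(42)
X = np.linspace(0, 10, 50)
y = 2 * X + 1 + rng.normal(scale=0.5, size=X.size)

# Start with an arbitrary line and repeatedly nudge m and c downhill
# on the mean squared error: the adjustment loop described above.
m, c = 0.0, 0.0
lr = 0.01  # learning rate: how big each nudge is
for _ in range(5000):
    error = (m * X + c) - y           # vertical gaps, line minus dots
    m -= lr * 2 * np.mean(error * X)  # gradient of MSE w.r.t. m
    c -= lr * 2 * np.mean(error)      # gradient of MSE w.r.t. c
```

After the loop, m and c land close to the true slope 2 and intercept 1, up to the noise in the data.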
🚀 Day 9 of My Data Science Journey: Model Evaluation in Action!

Today I explored one of the most important steps in machine learning: model evaluation. After training a regression model to predict house prices, I learned how to measure how well the model performs using key metrics:

📊 Evaluation Results:
- MAE (Mean Absolute Error)
- MSE (Mean Squared Error)
- RMSE (Root Mean Squared Error)
- R² Score

These metrics helped me understand how close (or far!) my model's predictions are from the actual values. The next step: improving the model with better features and more advanced algorithms like Random Forest or Polynomial Regression.

Every day brings more insights and learning in this journey toward mastering data science and machine learning.

You can check out the code on GitHub: https://lnkd.in/dG-7b2ZJ

If you have ideas to improve my learning, feel free to share; I'm happy to learn from your experience in the data science field.

#DataScience #MachineLearning #Python #Regression #LearningJourney #ModelEvaluation
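For anyone curious, all four metrics can be computed with scikit-learn in a few lines. The prices below are made-up illustrative numbers, not the actual dataset:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Illustrative actual vs. predicted house prices.
y_true = np.array([200_000, 250_000, 310_000, 180_000])
y_pred = np.array([210_000, 240_000, 300_000, 195_000])

mae = mean_absolute_error(y_true, y_pred)   # average absolute miss
mse = mean_squared_error(y_true, y_pred)    # penalizes large misses more
rmse = np.sqrt(mse)                         # back in price units
r2 = r2_score(y_true, y_pred)               # fraction of variance explained
```

MAE and RMSE are in the same units as the target (here, price), which makes them easy to communicate; R² summarizes fit quality on a unitless 0-to-1 scale.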
🚀 DSA Challenge – Day 93
Problem: Find Median from Data Stream 📊⚙️

This was an exciting deep dive into data structures and real-time computation: maintaining the median efficiently while continuously adding numbers from a stream.

🧠 Problem Summary:
You need to design a class MedianFinder that can:
✅ Add numbers dynamically from a data stream.
✅ Return the median at any point in time.
If the total number of elements is even → median = mean of the two middle values. If odd → median = middle value.

⚙️ My Approach:
1️⃣ Use two heaps to maintain balance:
- A max heap (maxHeap) for the smaller half of the numbers.
- A min heap (minHeap) for the larger half.
2️⃣ Whenever a new number arrives:
- Push it into the maxHeap (inverted to simulate max behavior).
- Balance both heaps so that their size difference is never more than 1.
3️⃣ Ensure heap order: the maximum in maxHeap ≤ the minimum in minHeap.
4️⃣ The median is:
- The top of the larger heap (if odd count).
- The average of both tops (if even count).

📈 Complexity:
Time: O(log n) for insertion and heap balancing.
Space: O(n) to store all elements in the heaps.

✨ Key Takeaway: This problem highlights how heaps can turn complex real-time median calculations into a smooth, efficient process: a great example of data structure synergy in action. ⚡

🔖 #DSA #100DaysOfCode #LeetCode #ProblemSolving #Heaps #PriorityQueue #DataStructures #Algorithms #Median #Python #CodingChallenge #InterviewPrep #EfficientCode #DynamicProgramming #TechCommunity #LearningByBuilding #CodeEveryday
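A compact Python sketch of the two-heap approach described above (method names are my own snake_case variants of the LeetCode signatures; Python's heapq is a min heap, so the lower half stores negated values to simulate a max heap):

```python
import heapq


class MedianFinder:
    """Stream median via two heaps: a negated max heap for the lower
    half and a min heap for the upper half."""

    def __init__(self):
        self.lo = []  # max heap via negation: smaller half
        self.hi = []  # min heap: larger half

    def add_num(self, num):
        # Push into the lower half, then move its maximum up so that
        # max(lo) <= min(hi) always holds.
        heapq.heappush(self.lo, -num)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Rebalance: lo keeps the extra element when the count is odd.
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def find_median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]                    # odd count: top of lo
        return (-self.lo[0] + self.hi[0]) / 2     # even count: average of tops
```

Each insertion does a constant number of heap pushes and pops, giving the O(log n) bound per number.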
📘 Deepening My Foundations in Python & NumPy

As part of my continuous learning journey in data science and numerical computing, I recently completed a structured NumPy practice notebook covering a wide range of essential operations. Today, I'm excited to share the fully solved version of that work.

This practice allowed me to strengthen my understanding of:
✨ Array creation (zeros, ones, ranges, random values)
✨ Indexing & slicing techniques
✨ Boolean operations & conditional filtering
✨ Matrix manipulation (reshape, transpose, submatrices)
✨ Elementwise and matrix-level arithmetic
✨ Using np.where() for conditional replacement
✨ Working with identity matrices and reshaping
✨ Extracting patterns and substructures within arrays

Each question helped reinforce not just NumPy syntax but also the thought process behind efficient numerical operations, something extremely important in data science, machine learning, and analytics workflows.

📄 I've attached the complete solved PDF (array.pdf) for anyone who wants to learn or revise these concepts. This file includes step-by-step solutions directly executed in a Jupyter environment.

I'm excited to continue building on this momentum as I study pandas, data visualization, and eventually machine learning pipelines. If you're also learning Python, NumPy, or data science, I'd be happy to connect and share resources! 🤝

🌐 GitHub link: https://lnkd.in/g4Xcznqz

#NumPy #Python #DataScience #MachineLearning #DataAnalytics #WomenInTech #Programming #TechJourney #LearningInPublic #100DaysOfCode #DeveloperJourney #AI #ML #CodingCommunity #JupyterNotebook
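A few of these operations in one small, self-contained snippet (my own examples, not taken from the attached PDF):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)  # 3x4 matrix with values 0..11

# Slicing: a 2x2 submatrix from the first two rows, middle columns.
sub = a[0:2, 1:3]

# Boolean filtering: flatten out just the even values.
evens = a[a % 2 == 0]

# np.where for conditional replacement: keep evens, negate odds.
replaced = np.where(a % 2 == 0, a, -a)

# Transpose and an identity matrix.
t = a.T           # shape becomes (4, 3)
eye = np.eye(3)   # 3x3 identity
```

Slices and transposes are views rather than copies, which is part of what makes these operations cheap and worth practicing deliberately.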
🎯 Proud to share my latest hands-on Data Science & Machine Learning repository!

This project brings together a series of practical Jupyter notebooks designed to cover the core steps of the DS/ML journey, from exploring data to building and evaluating predictive models.

📊 Key Highlights:
- Data Acquisition & EDA using pandas
- Statistical Measures & Insights with NumPy/SciPy
- Data Visualization using matplotlib
- Simple Linear Regression on salary prediction
- Classification Models (Logistic Regression, KNN, SVM, Decision Tree, Random Forest) applied to heart disease data

Each notebook is focused on one core concept, explained clearly with step-by-step logic, metrics, and visual results.

🧠 Tools & Libraries: Python | Jupyter | pandas | NumPy | matplotlib | scikit-learn

If you're starting your Data Science or ML journey, you'll find this repo full of practical, easy-to-understand examples.

🔗 Explore it here: https://lnkd.in/e65bdTcT

Grateful to Ashish Sawant Sir for his constant guidance and mentorship throughout this work 🙏

#python #datascience #machinelearning #practicallearning #jupyternotebook #github #prmceam #projects
Hey PyData Pittsburgh! Join us Tuesday evening, November 4th, at the Swartz Center for Entrepreneurship as Ehsan Totoni, CTO and Co-Founder of bodo.ai, discusses how Bodo DataFrames brings high-performance computing (HPC) techniques like MPI and JIT compilation to the familiar Pandas API, allowing data scientists to scale Python workloads from millions to billions of rows without rewriting their code.

Talk – Bodo DataFrames: A Fast and Scalable HPC-Based Drop-In Replacement for Pandas

Times:
5:30pm – Doors open
6:00pm – Talk

More information and RSVP at the link in the comments!

About the talk: Pandas is a popular library for data scientists, but it struggles with large datasets; programs either become too slow or run out of memory. In this talk, we introduce Bodo DataFrames as a drop-in replacement for the Pandas library that uses HPC techniques such as the Message Passing Interface (MPI) and JIT compilation for acceleration and scaling. We give an overview of its architecture, explain how it avoids the problems of Pandas (while keeping user code the same), go over concrete examples, and finally discuss current limitations.

This talk is for Pandas users who would like to run their code on larger data while avoiding frustrating rewrites to other APIs. Basic knowledge of Pandas and Python is recommended.

#Python #Pandas #HighPerformanceComputing #DataScience