🧠 Hands-on Practical on Missing Value Treatment | Titanic Dataset 🚢

Today, I explored one of the most important preprocessing steps in Machine Learning — Missing Value Treatment — using the Titanic dataset. Handled missing data using various techniques like mean/median imputation, mode replacement, and row/column removal to ensure the dataset is clean and ready for analysis.

This exercise helped me understand how data quality directly impacts model performance and reliability. It was a great experience working on real-world data and applying practical data cleaning techniques using Python (Pandas, NumPy).

📘 GitHub Repository: https://lnkd.in/gsPj_hxs
🎓 Under the guidance of: Ashish Sawant

#DataScience #MachineLearning #Python #Pandas #DataCleaning #TitanicDataset #DataPreprocessing #LearningEveryday #MLJourney #AI
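The three techniques mentioned above can be sketched in pandas. This is a minimal example on a toy frame standing in for Titanic-style columns (a numeric `Age`, a categorical `Embarked`, a mostly-empty `Cabin`), not the repository's actual notebook:

```python
import numpy as np
import pandas as pd

# Toy frame with the kinds of gaps the Titanic data has
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, np.nan],       # numeric, some missing
    "Embarked": ["S", "C", None, "S", "S"],          # categorical, some missing
    "Cabin": [None, None, "C85", None, None],        # mostly missing
})

# Median imputation for numeric columns (robust to outliers vs. the mean)
df["Age"] = df["Age"].fillna(df["Age"].median())

# Mode replacement for categorical columns
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Column removal when a column is too sparse to impute sensibly
df = df.drop(columns=["Cabin"])

# Row removal as a last resort for anything still missing
df = df.dropna()

print(df.isna().sum().sum())  # 0 — no missing values remain
```

The order matters in practice: imputing before dropping rows keeps more data, while dropping a near-empty column like `Cabin` avoids inventing values that were never there.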
📅 Day 11: Hyperparameter Tuning & Cross Validation ⚙️📊

🎯 Learning Goals:
- Learned how to improve model performance using Hyperparameter Tuning
- Explored techniques like Grid Search, Random Search, and Bayesian Optimization
- Understood Cross Validation (K-Fold) to check model stability and avoid overfitting
- Tuned ML models to achieve the best accuracy and generalization

🧠 Key Takeaway: “Training a model is easy — making it perform consistently is the real art.” Hyperparameter tuning helps us find the sweet spot where the model learns effectively without memorizing the data.

📈 Tech Stack: Python | Scikit-learn | GridSearchCV | RandomizedSearchCV | KFold

#MachineLearning #DataScience #HyperparameterTuning #CrossValidation #AI #LearningJourney #ModelOptimization #Python #ScikitLearn
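Grid Search with K-Fold cross validation can be sketched in a few lines of scikit-learn. This is an illustrative example on the Iris dataset with an assumed small parameter grid, not the exact experiment from the post:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate hyperparameters: every combination is tried
param_grid = {"max_depth": [2, 3, 4], "min_samples_split": [2, 5]}

# 5-fold CV: each candidate is trained and scored 5 times on different splits,
# so the reported score reflects stability, not one lucky split
cv = KFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=cv)
search.fit(X, y)

print(search.best_params_)              # the "sweet spot" combination
print(round(search.best_score_, 3))     # mean CV accuracy of that combination
```

`RandomizedSearchCV` has the same interface but samples a fixed number of combinations, which scales better when the grid is large.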
Fake News Detection using Machine Learning

I built a Fake News Detection model that classifies articles as Real or Fake using Python, Scikit-learn, and a TF-IDF Vectorizer.

– Data preprocessing & feature extraction using TF-IDF
– Logistic Regression for classification
– Achieved ~95% accuracy on test data
– Implemented in Google Colab and uploaded to GitHub

Project Link: https://lnkd.in/gEqUfWfc

#MachineLearning #AI #Python #DataScience #FakeNewsDetection #MLProjects #GitHub
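The TF-IDF + Logistic Regression pipeline can be sketched as follows. The four-document corpus here is an invented stand-in for a real news dataset, just to show the shape of the approach:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny toy corpus (stand-in for real/fake news articles)
texts = [
    "government announces new budget policy",
    "scientists publish peer reviewed climate study",
    "shocking miracle cure doctors hate revealed",
    "you will not believe this one weird trick",
]
labels = [1, 1, 0, 0]  # 1 = real, 0 = fake

# TF-IDF turns text into weighted word-frequency vectors;
# Logistic Regression then learns a linear decision boundary over them
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["miracle trick revealed"]))
```

On a real dataset the same pipeline would be fit on a train split and scored on a held-out test split to get a meaningful accuracy figure.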
🎯 Decision Trees & Random Forests — From Concept to Implementation

Today’s session with Monal S. Sir helped me deeply understand how Decision Trees make predictions by splitting data based on the feature that gives the best variance reduction or information gain. 🌳

I learned how overfitting can be controlled using parameters like min_samples_leaf and min_samples_split, and how Ensemble Methods like Bagging and Boosting combine multiple models for stronger performance.

We also explored the Random Forest algorithm, which builds several decision trees using bootstrap datasets and random subsets of features — making it more accurate and less prone to overfitting.

Finally, I implemented everything in Python using the Iris dataset, visualized the tree, checked feature importance, and even saved the model using joblib. It was a great blend of theory and hands-on learning! 💻

#MachineLearning #DataScience #DecisionTree #RandomForest #Python #AI #LearningJourney
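The workflow described above — a pruned tree on Iris, feature importances, and persisting with joblib — can be sketched like this (a minimal version, not the session's exact notebook; the filename `iris_tree.joblib` is an assumption):

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# min_samples_leaf / min_samples_split stop the tree from growing
# leaves that fit individual points, which controls overfitting
clf = DecisionTreeClassifier(min_samples_leaf=5, min_samples_split=10,
                             random_state=42)
clf.fit(X, y)

# Importances sum to 1 and show which features drove the splits
importances = clf.feature_importances_
print(importances)

# Save the trained model, then reload it to confirm it round-trips
joblib.dump(clf, "iris_tree.joblib")
loaded = joblib.load("iris_tree.joblib")
print(round(loaded.score(X, y), 3))
```

`sklearn.tree.plot_tree(clf)` would render the fitted tree for the visualization step.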
Day 9 – Exploring NumPy in Python

Today, I deepened my understanding of NumPy, one of the most powerful Python libraries for numerical and scientific computing. 🧮

Here’s what I explored:
✅ The concept of ndarrays – NumPy’s high-performance multidimensional arrays
✅ Element-wise operations and universal functions (ufuncs) for fast computation
✅ Broadcasting — performing operations on arrays of different shapes
✅ Aggregate functions like sum(), mean(), and std()
✅ Slicing & indexing in 1D and 2D arrays
✅ Reshaping, flattening, and transposing arrays
✅ Using np.newaxis to modify array dimensions

NumPy makes data manipulation incredibly fast and memory-efficient compared to traditional Python lists — an essential skill for data science, AI, and machine learning! ⚙️📘

#100DaysOfCode #Python #NumPy #DataScience #MachineLearning #CodingJourney #LearnEveryday
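Most of the bullet points above fit in one short snippet — a minimal tour, not an exhaustive one:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # ndarray: [[0 1 2], [3 4 5]]

# Aggregate functions
print(a.sum(), a.mean())             # 15 2.5

# Element-wise ufunc applied to every element at once
print(np.sqrt(np.array([1, 4, 9])))  # [1. 2. 3.]

# Broadcasting: (2, 3) + (3,) — the row vector stretches across both rows
row = np.array([10, 20, 30])
print(a + row)

# np.newaxis adds a dimension: (3,) -> (3, 1) column vector
col = row[:, np.newaxis]
print(col.shape)                     # (3, 1)

# Transposing and flattening
print(a.T.shape, a.flatten())        # (3, 2) [0 1 2 3 4 5]
```

Broadcasting `col` (shape `(3, 1)`) against `row` (shape `(3,)`) would instead produce a `(3, 3)` outer-style result — the shapes, not the code, decide what happens.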
🤖 Performed a Practical on the K-Nearest Neighbors (KNN) Algorithm

Implemented and understood the working of KNN, a powerful supervised learning algorithm used for both classification and regression tasks. Learned how distance metrics and the value of k influence model accuracy and performance.

🔗 GitHub Repository: https://lnkd.in/gsPj_hxs
🧑‍🏫 Under the Guidance of: Ashish Sawant

#MachineLearning #DataScience #KNN #Python #AI #MLAlgorithms #LearningJourney #GitHub
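The effect of k and the distance metric can be seen directly by sweeping a few values. A small sketch on the Iris dataset (an assumed stand-in for the repository's data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Small k: flexible but noise-sensitive; large k: smoother but can underfit
for k in (1, 5, 15):
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    knn.fit(X_train, y_train)
    print(k, round(knn.score(X_test, y_test), 3))
```

Swapping `metric="euclidean"` for `"manhattan"` changes which points count as "nearest", which is why feature scaling matters so much for KNN.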
Week 5 of my AI & Data Science journey 🚀

This week, I explored Python Memory Management — a crucial concept for writing efficient and scalable programs.

Key learnings:
- Understanding how Python allocates and manages memory
- Exploring the heap, stack, and reference counting mechanism
- Working with the garbage collector (gc module)
- Analyzing memory leaks and optimization techniques for data-heavy applications

Efficient memory handling is key to ensuring ML models and data pipelines run smoothly — especially when working with large datasets.

📂 Notes & Assignments: https://lnkd.in/gPnQkhGY

#Python #DataScience #AI #MachineLearning #MemoryManagement #LearningJourney #CodeOptimization
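Reference counting and the cycle-detecting garbage collector — the two mechanisms in the list above — can be observed directly from the standard library. A small sketch (CPython-specific behavior):

```python
import gc
import sys
import weakref

data = []
alias = data                     # a second reference to the same list
# getrefcount adds one temporary reference of its own (the argument),
# so a list held by two names reports at least 3
print(sys.getrefcount(data))

# Reference counting alone cannot reclaim cycles:
class Node:
    pass

n = Node()
n.self_ref = n                   # the object references itself
probe = weakref.ref(n)           # weak reference: doesn't keep n alive
del n                            # refcount never reaches zero (the cycle holds it)

gc.collect()                     # the cycle detector breaks and frees it
print(probe() is None)           # True — the cyclic object is gone
```

This is why long-running data pipelines that build self-referencing structures (caches, graphs, callbacks) sometimes need an explicit `gc.collect()` or a redesign around `weakref`.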
Simple Linear Regression Project: Predicting House Prices 🏠

In this project, I built a simple Linear Regression model using Python and Scikit-learn to predict house prices based on the area (in m²).

🔹 Steps included:
* Data visualization using Matplotlib 📊
* Splitting data into training and testing sets
* Training a Linear Regression model
* Predicting and evaluating results
* Visualizing the regression line 📈

The project demonstrates how machine learning can be used to make real-world predictions in a simple and interpretable way.

Taghrida Mohamed ♥️♥️

#MachineLearning #DataScience #Python #LinearRegression #AI #LearningJourney
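The steps above can be sketched end to end on synthetic data. The price-vs-area relationship here is invented (true slope 3000 per m² plus noise) purely to show the fit-predict-evaluate loop:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: price grows roughly linearly with area (m²)
rng = np.random.default_rng(0)
area = rng.uniform(50, 200, size=40).reshape(-1, 1)   # feature matrix (n, 1)
price = 3000 * area.ravel() + rng.normal(0, 20000, size=40)

# Train/test split so evaluation uses unseen data
X_train, X_test, y_train, y_test = train_test_split(
    area, price, random_state=0)

model = LinearRegression().fit(X_train, y_train)

print(round(model.coef_[0]))                  # slope: near the true 3000
print(round(model.score(X_test, y_test), 3))  # R² on the test set
```

Plotting `X_test` against `model.predict(X_test)` with Matplotlib gives the regression-line visualization from the last step.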
📶 Experiment 12: Random Forest Algorithm using Python 🤖

In this lab, I explored the Random Forest Algorithm, a powerful ensemble learning technique that builds multiple decision trees and combines their outputs for more accurate and stable predictions.

🔍 Key learning outcomes:
• Understanding the concept of bagging and ensemble averaging
• Implementing Random Forest using scikit-learn
• Evaluating model performance using metrics like accuracy and feature importance
• Learning how Random Forest reduces overfitting and improves generalization
• Visualizing feature contributions to model decisions

This experiment strengthened my grasp of how ensemble models enhance predictive power and reliability, making Random Forests a go-to choice for many real-world machine learning tasks.

📁 Explore the repository here: 👉 https://lnkd.in/epWys7e7

#DataScience #MachineLearning #Python #ScikitLearn #EnsembleLearning #PredictiveModeling #DataAnalysis #AI #LearningJourney #JupyterNotebook Ashish Sawant sir
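The bagging and feature-importance outcomes above can be sketched in a few lines. This example uses the Iris dataset as an assumed stand-in for the lab's data, and the out-of-bag score as a built-in way to evaluate without a separate test split:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X, y = iris.data, iris.target

# 100 trees, each fit on a bootstrap sample of the rows (bagging);
# oob_score evaluates every tree on the rows it did NOT see
forest = RandomForestClassifier(n_estimators=100, oob_score=True,
                                random_state=42)
forest.fit(X, y)

print(round(forest.oob_score_, 3))   # out-of-bag accuracy estimate

# Feature importances (sum to 1): each feature's contribution to the splits
for name, imp in zip(iris.feature_names, forest.feature_importances_):
    print(name, round(imp, 3))
```

Because each tree sees a different bootstrap sample and a random subset of features at each split, the averaged ensemble is less prone to overfitting than any single tree — which is exactly the generalization point from the lab.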