Building a Recommendation System with Machine Learning

🤖 Machine Learning Project 2 📚 Book Recommendation!

✅ Step 1: Data Loading & Inspection
First, I loaded the three separate datasets: Books.csv, Users.csv, and Ratings.csv. I then checked their shapes and looked for missing values and duplicates.

✅ Step 2: Model 1 - Popularity-Based Recommender
My first goal was to create a general "Top 50" list, suited to new users who have no rating history yet. I merged the ratings and books tables, then grouped by book to get the number of ratings and the average rating for each one. To keep the averages from being skewed by books with only a handful of ratings, I kept only books with 200 or more ratings. Finally, I sorted this filtered list by average rating to get my Top 50.

✅ Step 3: Model 2 - Collaborative Filtering Recommender
This is where the personalization comes in. My process was:
A) Filtering: To build a robust model, I filtered the data to include only users who had rated 150+ books and only books that had 50+ ratings. This is critical for reducing noise.
B) Creating the Pivot Table: I pivoted the filtered data to create a book-by-user rating matrix, with book titles as the index, user IDs as the columns, and the ratings as the values.
C) Filling Null Values: This matrix was very sparse, so I filled all NaN values (where a user hadn't rated a book) with 0.
D) Calculating Similarity: I applied scikit-learn's cosine similarity to this matrix. It computes a similarity score between every pair of books based on how the same users rated them.
E) Building the recommend Function: Finally, I built a function that takes a book title, looks up its row in the similarity matrix, and returns the top 5 most similar books.

This project was a great exercise in:
🔹 Following a clear data pipeline (load, clean, filter, model).
🔹 Building different models for different use cases (new users vs. known users).
🔹 Matrix manipulation as the core of collaborative filtering.
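
The two models described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: the column names (ISBN, Book-Title, User-ID, Book-Rating), the function names, and the tiny in-memory tables standing in for Books.csv and Ratings.csv are all assumptions, and the thresholds are lowered in the example calls so the toy data survives the filters.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-ins for Books.csv and Ratings.csv; column names are assumptions.
books = pd.DataFrame({
    "ISBN": ["1", "2", "3", "4"],
    "Book-Title": ["Dune", "Emma", "Ulysses", "Hamlet"],
})
ratings = pd.DataFrame({
    "User-ID":     [1, 1, 1, 2, 2, 2, 3, 3, 3, 4],
    "ISBN":        ["1", "2", "3", "1", "2", "3", "1", "2", "4", "4"],
    "Book-Rating": [9, 8, 2, 8, 9, 1, 10, 7, 5, 6],
})

merged = ratings.merge(books, on="ISBN")


def top_popular(merged, min_ratings=200, n=50):
    """Popularity model: rank sufficiently rated books by average rating."""
    stats = (
        merged.groupby("Book-Title")["Book-Rating"]
        .agg(num_ratings="count", avg_rating="mean")
        .reset_index()
    )
    stats = stats[stats["num_ratings"] >= min_ratings]
    return stats.sort_values("avg_rating", ascending=False).head(n)


def build_similarity(merged, min_user_ratings=150, min_book_ratings=50):
    """Item-based collaborative filtering: filter, pivot, fill, compare."""
    # A) Keep only active users, then only frequently rated books.
    user_counts = merged.groupby("User-ID")["Book-Rating"].count()
    active_users = user_counts[user_counts >= min_user_ratings].index
    filtered = merged[merged["User-ID"].isin(active_users)]

    book_counts = filtered.groupby("Book-Title")["Book-Rating"].count()
    rated_books = book_counts[book_counts >= min_book_ratings].index
    filtered = filtered[filtered["Book-Title"].isin(rated_books)]

    # B) + C) Titles as rows, users as columns, unrated cells filled with 0.
    pivot = filtered.pivot_table(
        index="Book-Title", columns="User-ID", values="Book-Rating"
    ).fillna(0)

    # D) Pairwise cosine similarity between the book rows.
    return pivot, cosine_similarity(pivot)


def recommend(title, pivot, similarity, k=5):
    """E) Return the k titles most similar to `title`, excluding itself."""
    idx = pivot.index.get_loc(title)
    order = np.argsort(similarity[idx])[::-1]
    return [pivot.index[i] for i in order if i != idx][:k]


# Thresholds are lowered here only because the toy data is tiny.
popular = top_popular(merged, min_ratings=3, n=2)
pivot, similarity = build_similarity(merged, min_user_ratings=3, min_book_ratings=2)
similar_to_dune = recommend("Dune", pivot, similarity, k=2)
```

With the real CSVs, the defaults (200, 150, 50, and k=5) match the thresholds in the steps above. Note that filling with 0 makes "not rated" look the same as a rating of 0 to the similarity calculation, which is a simplification worth keeping in mind.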
GitHub Repo: https://lnkd.in/dMZc-QQJ

#DataScience #MachineLearning #Python #RecommendationSystem #CollaborativeFiltering #Pandas #ScikitLearn #DataAnalysis #Projects
