Exploring practical applications of regression with Python and scikit-learn

Stefano Parmesani

Published Feb 5, 2025

This week I focused on regression analysis using Python and delved into the scikit-learn library. As I mentioned in my previous post, this module is heavily maths-based and can seem a bit dry, but I'll do my best to make it interesting without diving too deep into the technical details.

However, before doing that, I want to share the feedback I got for my module 2 assignments. For the professional practice assignment, where I reflected on my ongoing study of Python, I scored 30/40. For the main module assignment, "Machine Learning using Cloud Computing," where I developed my first machine learning model using NLP to perform sentiment analysis, I scored 73/100, which seems to be one of the top marks in the class again. This result was both surprising and rewarding. I say that because, at the end of the module 2 practical workshops, I felt I had lots of gaps in my understanding. But while I was writing this assignment and developing my artefact, I put in extra time to understand and learn what I needed to make it successful. Having this end goal helped me fill some of those gaps, which is why the feedback was so rewarding: it confirmed that I'd got my ideas and learning right. Reflecting on this, I can say that I learn better when I'm working on something real and practical.

This week, we explored scikit-learn, a machine learning library for Python that offers tools for data mining and data analysis. It's built on NumPy, SciPy, and matplotlib, and is widely used because it provides a range of supervised and unsupervised learning algorithms.

During the session, we looked at different types of regression using scikit-learn. Here are the main points to remember, without going too much into the details:

Linear Regression: This is the simplest form of regression, where the relationship between the dependent and independent variables is modelled as a straight line. We used scikit-learn to fit a linear model to our data, making predictions based on the learned relationship.
Polynomial Regression: When the relationship between variables is not linear, polynomial regression can be used. By transforming the original features into polynomial features (for example, adding the square of a feature), we can fit a curved relationship to the data while still using a linear model underneath.
Ridge Regression: This is a type of linear regression that includes a regularisation term to prevent overfitting. It is particularly useful when dealing with multicollinearity or when the number of predictors is larger than the number of observations.
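
To make the three regression types above more concrete, here is a minimal sketch of how they might look in scikit-learn. The data is synthetic and made up purely for illustration (it is not the workshop dataset), and the polynomial degree and the `alpha` regularisation strength are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): a curved relationship plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + 2 + rng.normal(scale=0.3, size=100)

models = {
    # Straight-line fit
    "linear": LinearRegression(),
    # Polynomial features (degree 2) fed into an ordinary linear model
    "polynomial": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    # Same polynomial features, but with a ridge (L2) penalty on the coefficients
    "ridge": make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1.0)),
}

# Fit each model and record its R^2 score on the training data
scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

Because the synthetic data is curved, the straight-line model should score noticeably lower than the two polynomial ones. In practice you would score the models on held-out data rather than on the data they were trained on.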

In the workshop we also touched on the use of Support Vector Machines (SVM) with scikit-learn. For regression, an SVM tries to fit the best possible line (or hyperplane) while keeping the data points within a certain margin of tolerance; for classification, it looks for the hyperplane that separates the classes with the widest margin. This can make SVM effective for high-dimensional spaces and complex datasets. For example, SVM can be used for detecting spam emails by classifying emails into spam and non-spam categories based on their content.
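
As a rough sketch of the spam example, the snippet below trains a linear SVM classifier on a handful of messages. The messages and labels are invented for illustration, and using `TfidfVectorizer` to turn text into numbers is my own assumption about how this could be done, not necessarily what we did in the workshop.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Tiny made-up dataset (an assumption for illustration only)
messages = [
    "win a free prize now",
    "claim your free cash prize",
    "cheap loans click now",
    "meeting moved to monday morning",
    "please review the attached report",
    "lunch with the project team tomorrow",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TF-IDF converts each message into a numeric vector; the SVM then
# looks for a separating hyperplane between the two classes.
classifier = make_pipeline(TfidfVectorizer(), SVC(kernel="linear", C=10.0))
classifier.fit(messages, labels)

# Predictions on the training messages and on two unseen ones
train_predictions = list(classifier.predict(messages))
new_predictions = list(
    classifier.predict(["free prize click now", "report for the monday meeting"])
)
print(train_predictions)
print(new_predictions)
```

A real spam filter would need far more data and a proper train/test split; six messages are only enough to show the shape of the code.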

The combination of Python and scikit-learn is a very useful toolset for data analysis, and I look forward to applying these skills in future projects. Although the learning material can be challenging to read, especially when delving into the maths behind it, the practical applications make it worthwhile.
