Supervised learning - Classifier

This week, I studied classification algorithms. I won't go into the math behind them because, to be honest, I wouldn't be able to write an interesting article about formulas. My goal is to break down what I learned from this lesson in simple terms.

So, what do I mean by classification algorithms? These are machine learning algorithms that predict which category, or class, a new piece of data belongs to. A classic example is the algorithm that decides whether an email is spam or not. I also learned that these classifiers learn from existing labelled data, which is why this process is called supervised learning.
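To make the spam example concrete, here is a minimal sketch of that supervised workflow, assuming scikit-learn is available. The emails and labels are invented for illustration; a real system would train on thousands of messages.

```python
# Toy spam classifier: learn from labelled examples, then predict a new one.
# The four emails below are a made-up dataset, purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

emails = [
    "win money now",            # spam
    "free prize claim",         # spam
    "meeting at noon",          # not spam
    "project update attached",  # not spam
]
labels = ["spam", "ham", ][0:1] * 2 + ["ham"] * 2  # ["spam", "spam", "ham", "ham"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)  # bag-of-words counts as features

clf = LogisticRegression()
clf.fit(X, labels)  # "supervised": the model learns from labelled data

new_email = vectorizer.transform(["claim your free money"])
print(clf.predict(new_email))  # → ['spam'] on this toy data
```

The classifier never sees a rule like "the word free means spam"; it infers which words correlate with each label from the training examples.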

While studying, I came across some interesting practical examples beyond spam emails. One of the most compelling applications, in my view, is how these algorithms can assist doctors in predicting or diagnosing diseases based on patient data. This field is rapidly expanding, and it's clear that AI will play a crucial role in healthcare's future. For instance, AI can help doctors identify patterns in patient data that might not be immediately evident, leading to more accurate diagnoses and better patient outcomes. I think this is one of the applications where AI can be utilised most effectively, and I hope it will become a major focus for AI development and application over the next few years.

Thinking about my job role, I'm becoming more interested in image recognition. Although I'm not planning to launch my own self-driving cars, I see potential in developing an algorithm that can identify architecture diagrams and convert them into text. This could help generate solution designs in the enterprise architecture field, streamlining the process of interpreting complex diagrams and turning them into actionable solutions.

During my studies, we explored several types of classification algorithms:

  • Decision Trees: These are straightforward and work like a flowchart, asking a series of questions to determine which category the data belongs to.
  • Random Forests: Similar to decision trees, but they use multiple trees together to make more accurate predictions. They're excellent at handling large datasets and reducing errors.
  • Support Vector Machines (SVMs): These algorithms find the best line (or hyperplane in higher dimensions) to separate different categories. They excel at dealing with complex data.
  • Logistic Regression: Often used for binary classification (e.g., spam vs. not spam emails), it calculates the probability of something belonging to a particular category.
  • K-Nearest Neighbours (KNN): This algorithm looks at the nearest neighbours to a new piece of data and decides which category it belongs to based on those neighbours.
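Of the algorithms above, KNN is probably the easiest to implement from scratch, which helped me understand it. Here is a short pure-Python sketch with made-up 2D points: classify a new point by majority vote among its k closest neighbours.

```python
# Minimal k-nearest-neighbours sketch (pure Python, no libraries beyond stdlib).
# The training points and labels below are invented for illustration.
from collections import Counter
import math

def knn_predict(train, labels, point, k=3):
    # Sort all training points by distance to the new point.
    dists = sorted(
        (math.dist(p, point), lbl) for p, lbl in zip(train, labels)
    )
    # Majority vote among the k nearest neighbours.
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]

# Two clearly separated clusters: class "A" near (1, 1), class "B" near (8, 8).
train = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train, labels, (2, 2)))      # → 'A' (close to the first cluster)
print(knn_predict(train, labels, (8.5, 8.5)))  # → 'B'
```

There is no training step at all: KNN just stores the labelled data and does the work at prediction time, which is why it gets slow on large datasets.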

As part of my assignment, I chose to develop an artefact using Random Forests as a fraud detector. I wanted to further expand my understanding of testing and using these algorithms. Already familiar with decision trees, I appreciated their graphical representation, but I felt that exploring Random Forests would take my knowledge to the next level without starting from scratch. I'm also keen to understand the differences and implications of deploying these algorithms in machine learning.
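A rough sketch of what that artefact looks like, assuming scikit-learn and using synthetic data in place of real transactions (a real fraud dataset would have engineered transaction features and far heavier class imbalance):

```python
# Random Forest fraud-detector sketch on synthetic, imbalanced data.
# make_classification stands in for real transaction features here.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic "transactions": roughly 10% labelled as fraud (class 1).
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# An ensemble of 100 trees; each tree trains on a bootstrap sample
# and a random subset of features, and the forest averages their votes.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

preds = forest.predict(X_test)
print(f"test accuracy: {accuracy_score(y_test, preds):.2f}")
```

One thing worth noting for fraud specifically: with 90% legitimate transactions, plain accuracy is a misleading score (always predicting "legitimate" already scores 0.90), so metrics like precision and recall on the fraud class matter more in practice.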



More articles by Stefano Parmesani
