An Introduction to Machine Learning Models using the kNN Algorithm
Recently, I was looking for a Machine Learning model I had not used before, and I came across something called kNN and started learning about it and coding it. kNN stands for k Nearest Neighbours. The algorithm is primarily used for classification problems. Here is everything you need to know about this simple and easy-to-use algorithm.
How do kNNs work?
Let's start off with an easy case. Here is a plane with a set of data points. In this case, the points are either red circles or green squares.
But there is also a blue star in the dataset, and we need to work out which class it belongs to. Is the blue star a red circle, or is it a green square?
This is where the kNN algorithm can help us. The "k" in kNN is the number of nearest neighbours we want to look at in order to classify the blue star. Let's give "k" the value of 3 in this case, and take a look at the 3 closest data points to the blue star.
The 3 closest data points to the blue star are all red circles, so we can safely predict that the blue star is a red circle. Seems easy enough? The truth is, this is all there is to the algorithm.
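To make this concrete, here is a minimal sketch of the idea in Python. The coordinates are made up for illustration (the original figure's positions are not given); the logic is just "measure distances, take the k closest, vote":

```python
import math
from collections import Counter

def knn_classify(points, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled points."""
    # Sort the training points by Euclidean distance to the query point
    # and keep the indices of the k closest ones.
    nearest = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:k]
    # The most common label among those neighbours wins.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data mirroring the red-circle / green-square example.
points = [(1, 1), (2, 1), (1.5, 2), (6, 6), (7, 5), (6.5, 7)]
labels = ["red circle"] * 3 + ["green square"] * 3
print(knn_classify(points, labels, query=(2, 2), k=3))  # the blue star -> "red circle"
```

The blue star at (2, 2) sits among the three red-circle points, so all 3 of its nearest neighbours are red circles and the vote is unanimous.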
Here is another example:
The test sample (green circle) should be classified either as the first class, blue squares, or as the second class, red triangles. If k = 1, it is assigned to the first class because its single nearest neighbour is a blue square. If k = 3, it is assigned to the second class because the 3 nearest neighbours include only one blue square but 2 red triangles.
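The choice of k can flip the result, as this example shows. Here is a small sketch of that effect; the coordinates are assumptions chosen to mirror the example, not taken from the original figure:

```python
import math
from collections import Counter

# Hypothetical coordinates mirroring the example (not the original figure):
# the single closest point is a blue square, but two red triangles sit just behind it.
train = {(0.9, 1.1): "blue square", (1.0, 2.0): "red triangle",
         (1.5, 1.8): "red triangle", (3.0, 3.0): "blue square",
         (3.5, 2.5): "blue square"}
query = (0.8, 1.2)  # the green circle (test sample)

results = {}
for k in (1, 3):
    # Sort training points by distance to the query and keep the k nearest.
    nearest = sorted(train, key=lambda p: math.dist(p, query))[:k]
    votes = Counter(train[p] for p in nearest)
    results[k] = votes.most_common(1)[0][0]

print(results)  # k = 1 follows the lone blue square; k = 3 flips to red triangle
```

This is why k is treated as a tuning parameter: too small and a single nearby point dominates, too large and distant points drown out the local pattern.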
Applications of kNNs
- Healthcare: Who has a tumour like mine? Was it a malignant or benign cancer? Was it an operable or inoperable condition? Do I need a biopsy?
- Banks: Should a bank loan me money? Am I likely to pay my interest? Do people with characteristics like mine pay their interest?
- Handwriting recognition: Whose known handwriting samples are most similar to mine?
Summary:
Pros:
- Simple algorithm
- High accuracy - relatively high for such a simple method, though often not enough to compete with stronger Supervised Learning models such as SVMs or Naive Bayes
- The algorithm is easy to understand and versatile across different industries (see the applications of kNNs above)
Cons:
- High memory requirement
- Stores all of the training data (the algorithm needs it at prediction time)
- Prediction can be slow, since the distance to every stored training point must be computed for each new sample (and a large value of "k" adds further cost)
Steps of a kNN:
- A positive integer value is given to k
- The "k" nearest neighbours of the new sample are found
- The most common classification among those neighbours is found
- That classification is assigned to the sample
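In practice you rarely code these steps by hand; a library such as scikit-learn wraps all four in a few lines. A minimal sketch, with made-up coordinates for the red-circle / green-square example:

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: three red circles and three green squares.
X = [[1, 1], [2, 1], [1.5, 2], [6, 6], [7, 5], [6.5, 7]]
y = ["red circle"] * 3 + ["green square"] * 3

model = KNeighborsClassifier(n_neighbors=3)  # step 1: choose a value for k
model.fit(X, y)                              # kNN "training" just stores the data
print(model.predict([[2, 2]]))               # steps 2-4: find neighbours, vote, assign
```

Note that `fit` does almost no work here, which reflects the cons above: the model simply memorises the training set, and the real cost is paid at prediction time.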
I hope this gave you a good introduction to Machine Learning models as well as what a kNN model is. Please feel free to comment or ask questions below.