Big Data Analytics
K-Means Clustering
The K-means clustering algorithm computes centroids and repeats until the centroids stop changing, which means it has converged (possibly to a local rather than a global optimum). The number of clusters is assumed to be known in advance. It is also known as a flat clustering algorithm. The letter ‘K’ in K-means denotes the number of clusters the method finds in the data.
In this method, data points are assigned to clusters in such a way that the sum of the squared distances between the data points and the centroid of their cluster is as small as possible. Note that lower variation within a cluster means the data points in that cluster are more similar to one another.
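The quantity being minimized can be written out explicitly. The notation below is an assumption introduced for illustration and does not appear in the article: x_i are the data points, C_k is the set of points assigned to cluster k, and \mu_k is the centroid of that cluster.

```latex
J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```

A smaller J means the points sit closer to their own centroids, which is the "low variation within a cluster" described above.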
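Working of the K-Means Algorithm
The K-Means clustering technique works in the following stages: choose the number of clusters K, pick K initial centroids, assign each data point to its nearest centroid, recompute each centroid as the mean of the points assigned to it, and repeat the last two stages until the assignments no longer change.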
Applications:
K-means implements the Expectation-Maximization strategy to solve the problem. The Expectation-step is used to assign data points to the nearest cluster, and the Maximization-step is used to compute the centroid of each cluster.
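The two steps described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names (`assign_points`, `update_centroids`, `k_means`), the use of Euclidean distance, the caller-supplied initial centroids, and the choice to leave an empty cluster's centroid unchanged are all illustrative and not taken from the article.

```python
import math


def assign_points(points, centroids):
    """Expectation-style step: label each point with its nearest centroid."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update_centroids(points, labels, centroids):
    """Maximization-style step: move each centroid to the mean of its points."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            updated.append(old)
            continue
        updated.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return updated


def k_means(points, initial_centroids, max_iter=100):
    """Alternate the two steps until the centroids stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        labels = assign_points(points, centroids)
        updated = update_centroids(points, labels, centroids)
        if updated == centroids:
            return centroids, labels
        centroids = updated
    # Iteration limit reached: label points against the latest centroids.
    return centroids, assign_points(points, centroids)
```

For instance, `k_means([(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)], [(0.0, 0.0), (10.0, 10.0)])` groups the first two points into one cluster and the last two into the other. The result depends on the initial centroids, so in practice the algorithm is often run several times from different starting points.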
The k-means clustering technique is an algorithm that tries to minimize the distance between the points in a cluster and their centroid.
K-means is a centroid-based algorithm, or a distance-based algorithm, where we calculate the distances to assign a point to a cluster. In K-Means, each cluster is associated with a centroid.
The main objective of the K-Means algorithm is to minimize the sum of the squared distances between the points and their respective cluster centroids.
Let’s now take an example to understand how K-Means actually works:
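One small illustration, offered as a sketch rather than as the article's own example: the code below runs K-Means with K = 2 on a hypothetical one-dimensional data set. The data values, the starting centroids, and the function name `run_example` are assumptions chosen only to keep the arithmetic easy to follow by hand.

```python
def run_example():
    # Hypothetical 1-D data and deliberately poor starting centroids.
    data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    centroids = [1.0, 2.0]
    history = [list(centroids)]  # centroids at the start and after each update

    for _ in range(100):  # safety limit on the number of iterations
        # Assign every value to the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in data:
            nearest = min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))
            clusters[nearest].append(x)

        # Recompute each centroid as the mean of its cluster.
        updated = [
            sum(c) / len(c) if c else centroids[k] for k, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # centroids did not move, so the algorithm has converged
        centroids = updated
        history.append(list(centroids))

    return clusters, centroids, history


clusters, centroids, history = run_example()
print(history)    # [[1.0, 2.0], [1.0, 7.6], [2.0, 11.0]]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

After the first pass only the value 1.0 belongs to the first cluster, so the second centroid jumps to 7.6. In the second pass the values 2.0 and 3.0 move to the first cluster, the centroids settle at 2.0 and 11.0, and a third pass confirms that nothing changes.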
Output images:
Thank You.
Chinthala Yeshwanth Reddy.