K-means Clustering
Descriptionβ
The k-means algorithm clusters given data by trying to separate samples in n
groups of equal variance by minimizing the criterion known as
within-the-cluster sum-of-squares.
The algorithm requires the number of clusters to be specified beforehand. In other words, the algorithm doesnβt compute the optimal number of clusters to use, but leaves that task to the user. It scales well to large numbers of samples, and it has been applied in a wide range of fields such as recommendation engines, fraud detection and market research.
Implementationβ
The K-means algorithm is implemented within the MAGE project. Be sure to check it out in the link above.
Use casesβ
Clustering is heavily used in social network analysis to discover user communities, find similarities between them and much more.