By: Esteban Alfaro
Date: May 2023
K-means clustering:
K-means clustering, also known as K-means grouping, is a machine learning method and the most widely used partitional clustering algorithm. A partitional clustering algorithm aims to discover the groups (or clusters) present in the data by optimizing a specific objective function and iteratively improving the quality of the partitions.
K-means clustering is based on the Lloyd algorithm that was proposed by Stuart P. Lloyd of Bell Labs in 1957 as a technique for pulse-code modulation. Lloyd's work became widely circulated but remained unpublished until 1982.
Given a dataset D = {x_1, x_2, …, x_N} consisting of N points, let C = {C_1, C_2, …, C_k, …, C_K} denote the clustering obtained after applying K-means clustering. The sum of squared errors (SSE) for this clustering is defined by

$$\mathrm{SSE}(C) = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - c_k \rVert^2 \tag{1}$$

where c_k is the centroid of cluster C_k, that is, the mean of the points assigned to it. The objective is to find a clustering that minimizes the SSE score. The iterative assignment and update steps of the K-means algorithm aim to minimize the SSE score for the given set of centroids.
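To make the SSE definition concrete, here is a minimal sketch in plain Python. It assumes squared Euclidean distance and takes the centroid of each cluster to be the mean of its points; the helper names (`squared_distance`, `centroid`, `sse`) and the toy data are illustrative and not part of the original text.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def sse(clusters):
    """Sum over clusters of squared distances from each point to its centroid."""
    total = 0.0
    for cluster in clusters:
        c_k = centroid(cluster)
        total += sum(squared_distance(x, c_k) for x in cluster)
    return total


# Toy clustering (made-up data): two clusters of two 2-D points each.
clusters = [
    [(1.0, 1.0), (1.0, 3.0)],  # centroid (1, 2); each point contributes 1
    [(5.0, 5.0), (7.0, 5.0)],  # centroid (6, 5); each point contributes 1
]
print(sse(clusters))  # 4.0
```

A lower SSE means the points sit closer to their cluster centroids, which is why it serves as the quantity K-means tries to reduce.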
K-means clustering algorithm:
- Select K points as initial centroids.
- Repeat:
  - Form K clusters by assigning each point to its closest centroid.
  - Recompute the centroid of each cluster as the mean of the points in that cluster.
- Until the convergence criterion is met.
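The steps above can be sketched in plain Python as follows. This is only one possible reading of the pseudocode, under several assumptions the original does not specify: initial centroids are K distinct data points drawn at random, distance is squared Euclidean, the convergence criterion is "no point changed cluster", and an empty cluster keeps its previous centroid. The function name `kmeans`, its parameters (`max_iter`, `seed`), and the sample points are all illustrative.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd-style K-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: select K points as initial centroids (assumption: random sample).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Convergence criterion (assumption): assignments did not change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels


# Made-up data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Because the result depends on the initial centroids, a common practice is to run the procedure several times with different seeds and keep the clustering with the lowest SSE.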