Unsupervised machine learning

By: Esteban Alfaro

Date: may 2023

Clustering k-means:

As a method of machine learning, K-means grouping also known as K-means clustering is the most widely used partitional clustering algorithm. A partitional clustering aim to create groups (or clusters) present in the data by optimizing a specific objective function and iteratively improving the quality of the partitions.

K-means clustering is based on the Lloyd algorithm that was proposed by Stuart P. Lloyd of Bell Labs in 1957 as a technique for pulse-code modulation. Lloyd's work became widely circulated but remained unpublished until 1982.

Given a dataset D = {x 1 , x 2 , …, x N } consists of N points, let us denote the clustering obtained after applying K -means clustering by C = {C 1 , C 2 , …, C k …, C K }. The SSE for this clustering is defined by (1) where c k is the centroid of cluster C k . The objective is to find a clustering that minimizes the SSE score. The iterative assignment and update steps of the K -means algorithm aim to minimize the SSE score for the given set of centroids.


K-means clustering algorithm:

K -Means algorithm:

  1. Select K points as initial centroids
  2. repeat
  3. Form K clusters by assigning each point to its closest centroid
  4. Recompute the centroid of each cluster as the mean of each
    cluster
  5. until convergence criterion is met












No hay comentarios:

Publicar un comentario

Experimenting with different metrics

Non-convex balls By: Esteban Alfaro Sabogal Date: 06 august 2023 We are going to draw computationally the points that are a distance < r ...