# K-medoids

K-medoids is a clustering algorithm that works by finding $k$ data points (called *medoids*) such that the total distance between each data point and the closest *medoid* is minimal.

`Clustering.kmedoids`

— Function`kmedoids(dist::AbstractMatrix, k::Integer; ...) -> KmedoidsResult`

Perform K-medoids clustering of $n$ points into `k`

clusters, given the `dist`

matrix ($n×n$, `dist[i, j]`

is the distance between the `j`

-th and `i`

-th points).

**Arguments**

`init`

(defaults to`:kmpp`

): how medoids should be initialized, could be one of the following:- a
`Symbol`

indicating the name of a seeding algorithm (see Seeding for a list of supported methods). - an integer vector of length
`k`

that provides the indices of points to use as initial medoids.

- a
`maxiter`

,`tol`

,`display`

: see common options

**Note**

The function implements a *K-means style* algorithm instead of *PAM* (Partitioning Around Medoids). K-means style algorithm converges in fewer iterations, but was shown to produce worse (10-20% higher total costs) results (see e.g. Schubert & Rousseeuw (2019)).

`Clustering.kmedoids!`

— Function```
kmedoids!(dist::AbstractMatrix, medoids::Vector{Int};
[kwargs...]) -> KmedoidsResult
```

Update the current cluster `medoids`

using the `dist`

matrix.

The `medoids`

field of the returned `KmedoidsResult`

points to the same array as `medoids`

argument.

See `kmedoids`

for the description of optional `kwargs`

.

`Clustering.KmedoidsResult`

— Type`KmedoidsResult{T} <: ClusteringResult`

The output of `kmedoids`

function.

**Fields**

`medoids::Vector{Int}`

: the indices of $k$ medoids`assignments::Vector{Int}`

: the indices of clusters the points are assigned to, so that`medoids[assignments[i]]`

is the index of the medoid for the $i$-th point`costs::Vector{T}`

: assignment costs, i.e.`costs[i]`

is the cost of assigning $i$-th point to its medoid`counts::Vector{Int}`

: cluster sizes`totalcost::Float64`

: total assignment cost (the sum of`costs`

)`iterations::Int`

: the number of executed algorithm iterations`converged::Bool`

: whether the procedure converged

## References

- Teitz, M.B. and Bart, P. (1968).
*Heuristic Methods for Estimating the Generalized Vertex Median of a Weighted Graph*. Operations Research, 16(5), 955–961. doi:10.1287/opre.16.5.955 - Schubert, E. and Rousseeuw, P.J. (2019).
*Faster k-medoids clustering: Improving the PAM, CLARA, and CLARANS Algorithms*. SISAP, 171-187. doi:10.1007/978-3-030-32047-8_16