# Basics

The package implements a variety of clustering algorithms:

- K-means
- K-medoids
- Hierarchical Clustering
- MCL (Markov Cluster Algorithm)
- Affinity Propagation
- DBSCAN
- Fuzzy C-means

Most of the clustering functions in the package have a similar interface, making it easy to switch between different clustering algorithms.

## Inputs

A clustering algorithm, depending on its nature, may accept an input matrix in either of the following forms:

- Data matrix $X$ of size $d \times n$, where the $i$-th column of $X$ (`X[:, i]`) is a data point (data *sample*) in $d$-dimensional space.
- Distance matrix $D$ of size $n \times n$, where $D_{ij}$ is the distance between the $i$-th and $j$-th points, or the cost of assigning them to the same cluster.
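To make the two input layouts concrete, here is a minimal sketch that uses only the Julia standard library. The toy data and the choice of Euclidean distance are assumptions made for illustration. An algorithm may expect a different dissimilarity measure.

```julia
using LinearAlgebra  # standard library, provides `norm` and `issymmetric`

# Hypothetical toy data: n = 4 points in d = 2 dimensions, one point per column.
X = [0.0 1.0 5.0 6.0;
     0.0 0.0 5.0 5.0]
d, n = size(X)

# X[:, i] is the i-th data point.
x3 = X[:, 3]

# Pairwise Euclidean distance matrix of size n x n,
# where D[i, j] is the distance between points i and j.
D = [norm(X[:, i] - X[:, j]) for i in 1:n, j in 1:n]
```

Functions that work on raw points would take `X`, while functions that work on dissimilarities would take `D`.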

## Common Options

Many clustering algorithms are iterative procedures. The functions share the basic options for controlling the iterations:

- `maxiter::Integer`: maximum number of iterations.
- `tol::Real`: minimal allowed change of the objective during convergence. The algorithm is considered to be converged when the change of objective value between consecutive iterations drops below `tol`.
- `display::Symbol`: the level of information to be displayed. It may take one of the following values:
  - `:none`: nothing is shown
  - `:final`: only shows a brief summary when the algorithm ends
  - `:iter`: shows the progress at each iteration
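As an illustration of these options, the sketch below passes them to `kmeans` as keyword arguments. The random data, the seed, the cluster count, and the specific option values are arbitrary assumptions. The `iterations` and `converged` field names are those of the k-means result type as I understand it, and other result types may differ.

```julia
using Clustering
using Random

Random.seed!(42)     # make the random toy data reproducible
X = rand(3, 200)     # hypothetical data: 200 points in 3 dimensions, one per column

# Cap the run at 50 iterations, treat a change in the objective below 1e-4
# as convergence, and print only a brief summary when the algorithm ends.
R = kmeans(X, 4; maxiter=50, tol=1e-4, display=:final)

# The result records how the iterations ended.
println("iterations: ", R.iterations, ", converged: ", R.converged)
```

Switching `display` to `:iter` prints progress at every iteration, which can help when choosing `tol`.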

## Results

A clustering function returns an object (typically an instance of some `ClusteringResult` subtype) that contains both the resulting clustering (e.g. the assignments of points to clusters) and information about the run of the algorithm (e.g. the number of iterations and whether it converged).

`Clustering.ClusteringResult` (Type)

`ClusteringResult`

Base type for the output of a clustering algorithm.

The following generic methods are supported by any subtype of `ClusteringResult`:

`Clustering.nclusters` (Method)

`nclusters(R::ClusteringResult) -> Int`

Get the number of clusters.

`StatsBase.counts` (Method)

`counts(R::ClusteringResult) -> Vector{Int}`

Get the vector of cluster sizes.

`counts(R)[k]` is the number of points assigned to the $k$-th cluster.

`Clustering.wcounts` (Method)

```
wcounts(R::ClusteringResult) -> Vector{Float64}
wcounts(R::FuzzyCMeansResult) -> Vector{Float64}
```

Get the weighted cluster sizes as the sum of weights of points assigned to each cluster.

For non-weighted clusterings, the weight of every data point is assumed to be 1.0, so the result is equivalent to `convert(Vector{Float64}, counts(R))`.

`Clustering.assignments` (Method)

`assignments(R::ClusteringResult) -> Vector{Int}`

Get the vector of cluster indices for each point.

`assignments(R)[i]` is the index of the cluster to which the $i$-th point is assigned.
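The generic accessors can be tried on any result object. The sketch below runs `kmeans` on a small hypothetical data set. The data and the choice of k-means are assumptions for illustration, and the same calls should apply to other `ClusteringResult` subtypes. No specific cluster labels are shown because they depend on the random initialization.

```julia
using Clustering

# Hypothetical toy data: six points in 2 dimensions, one point per column.
X = [0.0 0.1 0.2 10.0 10.1 10.2;
     0.0 0.1 0.0 10.0 10.1 10.0]

R = kmeans(X, 2)

k = nclusters(R)      # number of clusters
c = counts(R)         # c[j] is the number of points in cluster j
w = wcounts(R)        # weighted sizes; every point has weight 1.0 here
a = assignments(R)    # a[i] is the cluster index of X[:, i]

# Consistency checks between the accessors.
@assert length(c) == k
@assert length(a) == size(X, 2)
@assert sum(c) == size(X, 2)
```

Because the accessors are defined on the base type, code written against them does not need to change when one algorithm is swapped for another.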