ClusterValidityIndices.ClusterValidityIndices - Module

Main module for ClusterValidityIndices.jl, a Julia package of metrics for unsupervised learning.

This module exports all of the CVI modules, options, and utilities used by the ClusterValidityIndices.jl package. For full usage, see the official guide at https://ap6yc.github.io/ClusterValidityIndices.jl/dev/man/guide/.

Basic Usage

Install and import the package in a script with

using Pkg
Pkg.add("ClusterValidityIndices")
using ClusterValidityIndices

then create a CVI object with an empty argument constructor

my_cvi = DB()

and get the criterion values with get_cvi!, either in batch or incrementally (one sample at a time)

# Load some features and labels from a clustering process
features, labels = get_some_clustering_data()

# Batch criterion value
criterion_value = get_cvi!(my_cvi, features, labels)

# Incremental criterion values; batch and incremental modes cannot be mixed
# on one object, so use a fresh CVI
my_icvi = DB()
criterion_values = zeros(length(labels))
for ix in eachindex(labels)
    criterion_values[ix] = get_cvi!(my_icvi, features[:, ix], labels[ix])
end

Imports

The following names are imported by the package as dependencies:

  • Base
  • Core
  • DocStringExtensions
  • ElasticArrays
  • LinearAlgebra
  • NumericalTypeAliases
  • Pkg

Exports

The following names are exported and available when using the package:

ClusterValidityIndices.CLUSTERVALIDITYINDICES_VERSION - Constant

CLUSTERVALIDITYINDICES_VERSION

Description

A constant that contains the version of the installed ClusterValidityIndices.jl package.

This value is computed at compile time, so it may be used to programmatically verify the version of ClusterValidityIndices that is installed in case a compat entry in your Project.toml is missing or otherwise incorrect.
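
As a hedged illustration of such a check, the sketch below assumes that the constant is a VersionNumber (this page does not state its type). It uses a hypothetical stand-in value so that it runs without the package installed; minimum_required is likewise an illustrative name.

```julia
# Stand-in for CLUSTERVALIDITYINDICES_VERSION (assumed to be a VersionNumber).
# With the package loaded, you would use the real constant instead.
installed_version = v"0.6.1"  # hypothetical value for illustration

# Hypothetical lower bound that your own code depends on.
minimum_required = v"0.6.0"

# VersionNumber values compare component by component, not as strings.
is_supported = installed_version >= minimum_required
is_supported || error("ClusterValidityIndices $(minimum_required) or newer is required")
```

Comparing VersionNumber values rather than strings avoids lexicographic surprises such as "0.10.0" sorting before "0.9.0".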

ClusterValidityIndices.CVI_MODULES - Constant

CVI_MODULES

Description

List of the implemented CVIs, useful for iteration. Each element is the abbreviated struct name of a CVI, which can be instantiated with its empty constructor.

For example:

using ClusterValidityIndices
instantiated_cvis = [local_cvi() for local_cvi in CVI_MODULES]
ClusterValidityIndices.CH - Type
mutable struct CH <: CVI

Summary

The stateful information of the Calinski-Harabasz (CH) Cluster Validity Index.

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. T. Calinski and J. Harabasz, "A dendrite method for cluster analysis," Communications in Statistics, vol. 3, no. 1, pp. 1-27, 1974.
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  4. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.CH - Method
CH() -> CH

Summary

Constructor for the Calinski-Harabasz (CH) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a CH module
my_cvi = CH()

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. T. Calinski and J. Harabasz, "A dendrite method for cluster analysis," Communications in Statistics, vol. 3, no. 1, pp. 1-27, 1974.
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  4. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Method List / Definition Locations

CH()
ClusterValidityIndices.CVI - Type
abstract type CVI

Summary

Abstract supertype for all CVI objects. All index instantiations are subtypes of CVI.

ClusterValidityIndices.CVIElasticParams - Type
CVIElasticParams(

) -> ClusterValidityIndices.CVIElasticParams
CVIElasticParams(
    dim::Integer
) -> ClusterValidityIndices.CVIElasticParams
CVIElasticParams(
    dim::Integer,
    n_clusters::Integer
) -> ClusterValidityIndices.CVIElasticParams

Summary

Constructor for the CVIElasticParams struct, using the dimension to prime the 2-D elastic matrices.

The empty constructor should only be used when initializing empty CVIs before setup. CVI setup should instead create this struct with a specified dimension dim, and batch updates can set both dim and n_clusters immediately.

Arguments

  • dim::Integer: the dimension to use for the first dimension of the 2-D matrices.
  • n_clusters::Integer: optional, the number of clusters if known. Default 0.

Method List / Definition Locations

CVIElasticParams()
CVIElasticParams(dim)
CVIElasticParams(dim, n_clusters)
ClusterValidityIndices.CVIElasticParams - Type
struct CVIElasticParams

Summary

Container for the common elastic parameters of CVIs.

This is defined as an immutable struct because the

Fields

  • n::Vector{Int64}

    Number of samples per cluster, size of cvi.n_clusters.

  • CP::Vector{Float64}

    Compactness of each cluster, size of cvi.n_clusters.

  • v::ElasticArrays.ElasticMatrix{Float64, V} where V<:DenseVector{Float64}

    Prototype/centroid of each cluster, size of (cvi.dim, cvi.n_clusters).

  • G::ElasticArrays.ElasticMatrix{Float64, V} where V<:DenseVector{Float64}

    Compactness parameter G, size of (cvi.dim, cvi.n_clusters).

  • SEP::Vector{Float64}

    The measure of separation, size of cvi.n_clusters.

ClusterValidityIndices.CVIExpandTensor - Type

CVIExpandTensor

Description

The type of tensor used by the ClusterValidityIndices.jl package, used to configure array growth behavior.

Though perhaps an abuse of notation, CVIExpandTensor is defined here as only a 3-D array because 3-dimensional arrays are used frequently in the package, even though the Julia Array type allows arbitrary orders (i.e., 3-D and onwards).

ClusterValidityIndices.DB - Type
mutable struct DB <: CVI

Summary

The stateful information of the Davies-Bouldin (DB) Cluster Validity Index.

References

  1. D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224-227, Feb. 1979.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • D::Matrix{Float64}

  • S::Vector{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.DB - Method
DB() -> DB

Summary

Constructor for the Davies-Bouldin (DB) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a DB module
my_cvi = DB()

References

  1. D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 224-227, Feb. 1979.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Method List / Definition Locations

DB()
ClusterValidityIndices.GD43 - Type
mutable struct GD43 <: CVI

Summary

The stateful information of the Generalized Dunn's Index 43 (GD43) Cluster Validity Index.

References

  1. A. Ibrahim, J. M. Keller, and J. C. Bezdek, "Evaluating Evolving Structure in Streaming Data With Modified Dunn's Indices," IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12, 2019.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.
  4. J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybern., vol. 3, no. 3, pp. 32-57, 1973.
  5. J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Trans. Syst., Man, and Cybern., vol. 28, no. 3, pp. 301-315, Jun. 1998.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • D::Matrix{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.GD43 - Method
GD43() -> GD43

Summary

Constructor for the Generalized Dunn's Index 43 (GD43) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a GD43 module
my_cvi = GD43()

References

  1. A. Ibrahim, J. M. Keller, and J. C. Bezdek, "Evaluating Evolving Structure in Streaming Data With Modified Dunn's Indices," IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12, 2019.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.
  4. J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybern., vol. 3, no. 3, pp. 32-57, 1973.
  5. J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Trans. Syst., Man, and Cybern., vol. 28, no. 3, pp. 301-315, Jun. 1998.

Method List / Definition Locations

GD43()
ClusterValidityIndices.GD53 - Type
mutable struct GD53 <: CVI

Summary

The stateful information of the Generalized Dunn's Index 53 (GD53) Cluster Validity Index.

References

  1. A. Ibrahim, J. M. Keller, and J. C. Bezdek, "Evaluating Evolving Structure in Streaming Data With Modified Dunn's Indices," IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12, 2019.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.
  4. J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybern., vol. 3, no. 3, pp. 32-57, 1973.
  5. J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Trans. Syst., Man, and Cybern., vol. 28, no. 3, pp. 301-315, Jun. 1998.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • D::Matrix{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.GD53 - Method
GD53() -> GD53

Summary

Constructor for the Generalized Dunn's Index 53 (GD53) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a GD53 module
my_cvi = GD53()

References

  1. A. Ibrahim, J. M. Keller, and J. C. Bezdek, "Evaluating Evolving Structure in Streaming Data With Modified Dunn's Indices," IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12, 2019.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.
  4. J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybern., vol. 3, no. 3, pp. 32-57, 1973.
  5. J. C. Bezdek and N. R. Pal, "Some new indexes of cluster validity," IEEE Trans. Syst., Man, and Cybern., vol. 28, no. 3, pp. 301-315, Jun. 1998.

Method List / Definition Locations

GD53()
ClusterValidityIndices.LabelMap - Type

LabelMap

Description

Internal label mapping for incremental CVIs.

Alias for a dictionary mapping of integers to integers as cluster labels.

ClusterValidityIndices.PS - Type
mutable struct PS <: CVI

Summary

The stateful information of the Partition Separation (PS) Cluster Validity Index.

References

  1. Miin-Shen Yang and Kuo-Lung Wu, "A new validity index for fuzzy clustering," 10th IEEE International Conference on Fuzzy Systems. (Cat. No.01CH37297), Melbourne, Victoria, Australia, 2001, pp. 89-92, vol.1.
  2. E. Lughofer, "Extensions of vector quantization for incremental clustering," Pattern Recognit., vol. 41, no. 3, pp. 995-1011, 2008.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • D::Matrix{Float64}

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.PS - Method
PS() -> PS

Summary

Constructor for the Partition Separation (PS) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a PS module
my_cvi = PS()

References

  1. Miin-Shen Yang and Kuo-Lung Wu, "A new validity index for fuzzy clustering," 10th IEEE International Conference on Fuzzy Systems. (Cat. No.01CH37297), Melbourne, Victoria, Australia, 2001, pp. 89-92, vol.1.
  2. E. Lughofer, "Extensions of vector quantization for incremental clustering," Pattern Recognit., vol. 41, no. 3, pp. 995-1011, 2008.

Method List / Definition Locations

PS()
ClusterValidityIndices.WB - Type
mutable struct WB <: CVI

Summary

The stateful information of the WB-Index (WB) Cluster Validity Index.

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. Q. Zhao, M. Xu, and P. Franti, "Sum-of-Squares Based Cluster Validity Index and Significance Analysis," in Adaptive and Natural Computing Algorithms, M. Kolehmainen, P. Toivanen, and B. Beliczynski, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 313-322.
  3. Q. Zhao and P. Franti, "WB-index: A sum-of-squares based index for cluster validity," Data & Knowledge Engineering, vol. 92, pp. 77-89, 2014.
  4. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  5. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.WB - Method
WB() -> WB

Summary

Constructor for the WB-Index (WB) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a WB module
my_cvi = WB()

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. Q. Zhao, M. Xu, and P. Franti, "Sum-of-Squares Based Cluster Validity Index and Significance Analysis," in Adaptive and Natural Computing Algorithms, M. Kolehmainen, P. Toivanen, and B. Beliczynski, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 313-322.
  3. Q. Zhao and P. Franti, "WB-index: A sum-of-squares based index for cluster validity," Data & Knowledge Engineering, vol. 92, pp. 77-89, 2014.
  4. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  5. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Method List / Definition Locations

WB()
ClusterValidityIndices.XB - Type
mutable struct XB <: CVI

Summary

The stateful information of the Xie-Beni (XB) Cluster Validity Index.

References

  1. X. L. Xie and G. Beni, "A Validity Measure for Fuzzy Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841-847, 1991.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • D::Matrix{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.XB - Method
XB() -> XB

Summary

Constructor for the Xie-Beni (XB) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a XB module
my_cvi = XB()

References

  1. X. L. Xie and G. Beni, "A Validity Measure for Fuzzy Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841-847, 1991.
  2. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, and J. Bailey, "Online Cluster Validity Indices for Streaming Data," ArXiv e-prints, 2018, arXiv:1801.02937v1 [stat.ML].
  3. M. Moshtaghi, J. C. Bezdek, S. M. Erfani, C. Leckie, J. Bailey, "Online cluster validity indices for performance monitoring of streaming data clustering," Int. J. Intell. Syst., pp. 1-23, 2018.

Method List / Definition Locations

XB()
ClusterValidityIndices.cSIL - Type
mutable struct cSIL <: CVI

Summary

The stateful information of the Centroid-based Silhouette (cSIL) Cluster Validity Index.

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. P. J. Rousseeuw, "Silhouettes: A graphical aid to the interpretation and validation of cluster analysis," Journal of Computational and Applied Mathematics, vol. 20, pp. 53-65, 1987.
  3. M. Rawashdeh and A. Ralescu, "Center-wise intra-inter silhouettes," in Scalable Uncertainty Management, E. Hüllermeier, S. Link, T. Fober et al., Eds. Berlin, Heidelberg: Springer, 2012, pp. 406-419.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • S::Matrix{Float64}

  • sil_coefs::Vector{Float64}

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.cSIL - Method
cSIL() -> cSIL

Summary

Constructor for the Centroid-based Silhouette (cSIL) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a cSIL module
my_cvi = cSIL()

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. P. J. Rousseeuw, "Silhouettes: A graphical aid to the interpretation and validation of cluster analysis," Journal of Computational and Applied Mathematics, vol. 20, pp. 53-65, 1987.
  3. M. Rawashdeh and A. Ralescu, "Center-wise intra-inter silhouettes," in Scalable Uncertainty Management, E. Hüllermeier, S. Link, T. Fober et al., Eds. Berlin, Heidelberg: Springer, 2012, pp. 406-419.

Method List / Definition Locations

cSIL()
ClusterValidityIndices.rCIP - Type
mutable struct rCIP <: CVI

Summary

The stateful information of the (Renyi's) representative Cross Information Potential (rCIP) Cluster Validity Index.

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. E. Gokcay and J. C. Principe, "A new clustering evaluation function using Renyi's information potential," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), vol. 6. Jun. 2000, pp. 3490-3493.
  3. E. Gokcay and J. C. Principe, "Information theoretic clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 2, pp. 158-171, Feb. 2002.
  4. D. Araújo, A. D. Neto, and A. Martins, "Representative cross information potential clustering," Pattern Recognit. Lett., vol. 34, no. 16, pp. 2181-2191, Dec. 2013.
  5. D. Araújo, A. D. Neto, and A. Martins, "Information-theoretic clustering: A representative and evolutionary approach," Expert Syst. Appl., vol. 40, no. 10, pp. 4190-4205, Aug. 2013.
  6. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. John Wiley & Sons, 2000.

Fields

  • label_map::Dict{Int64, Int64}

  • dim::Int64

  • n_samples::Int64

  • mu::Vector{Float64}

  • D::Matrix{Float64}

  • delta_term::Matrix{Float64}

  • params::ClusterValidityIndices.CVIElasticParams

  • sigma::ElasticArrays.ElasticArray{Float64, 3, M, V} where {M, V<:DenseVector{Float64}}

  • constant::Float64

  • n_clusters::Int64

  • criterion_value::Float64

ClusterValidityIndices.rCIP - Method
rCIP() -> rCIP

Summary

Constructor for the (Renyi's) representative Cross Information Potential (rCIP) Cluster Validity Index.

Examples

# Import the package
using ClusterValidityIndices
# Construct a rCIP module
my_cvi = rCIP()

References

  1. L. E. Brito da Silva, N. M. Melton, and D. C. Wunsch II, "Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study," ArXiv e-prints, Feb 2019, arXiv:1902.06711v1 [cs.LG].
  2. E. Gokcay and J. C. Principe, "A new clustering evaluation function using Renyi's information potential," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), vol. 6. Jun. 2000, pp. 3490-3493.
  3. E. Gokcay and J. C. Principe, "Information theoretic clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 2, pp. 158-171, Feb. 2002.
  4. D. Araújo, A. D. Neto, and A. Martins, "Representative cross information potential clustering," Pattern Recognit. Lett., vol. 34, no. 16, pp. 2181-2191, Dec. 2013.
  5. D. Araújo, A. D. Neto, and A. Martins, "Information-theoretic clustering: A representative and evolutionary approach," Expert Syst. Appl., vol. 40, no. 10, pp. 4190-4205, Aug. 2013.
  6. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. John Wiley & Sons, 2000.

Method List / Definition Locations

rCIP()
ClusterValidityIndices.add_cluster! - Method
add_cluster!(
    cvi::CVI,
    sample::AbstractVector{T} where T<:Real;
    alt_CP
)

Summary

Adds a cluster to the CVI, updating the count and elastic parameters accordingly.

Arguments

  • cvi::CVI: the CVI to add a cluster to.
  • sample::RealVector: the feature sample to base the new cluster off of.
  • alt_CP::Bool: optional, alternate compactness definition.

Method List / Definition Locations

add_cluster!(cvi, sample; alt_CP)
ClusterValidityIndices.evaluate! - Method

Summary

Compute the criterion value of the CVI.

After computation, the resulting criterion value can be extracted from cvi.criterion_value. The criterion value is a function of the CVI/ICVI internal parameters, so at least two classes (i.e., unique labels) must be presented to the CVI in param_inc! or param_batch! before a non-zero value is returned.

Arguments

  • cvi::CVI: the stateful information of the CVI/ICVI to use for computing the criterion value.

Examples

julia> my_cvi = CH()
julia> data = load_some_data()
julia> labels = my_cluster_algorithm(data)
julia> param_batch!(my_cvi, data, labels)
julia> evaluate!(my_cvi)
julia> my_criterion_value = my_cvi.criterion_value

Method List / Definition Locations

evaluate!(cvi) (nine methods listed, one for each CVI type)
ClusterValidityIndices.expand_params! - Method
expand_params!(
    params::ClusterValidityIndices.CVIElasticParams,
    n::Integer,
    CP::Float64,
    v::AbstractVector{T} where T<:Real,
    G::AbstractVector{T} where T<:Real
)

Summary

Expands the CVIElasticParams struct with the provided CVI parameters.

Arguments

  • params::CVIElasticParams: the CVI elastic parameters to expand.
  • n::Integer: the sample count of the new cluster.
  • CP::Float: the compactness of the new cluster.
  • v::RealVector: the prototype of the new cluster.
  • G::RealVector: the compactness parameter G of the new cluster.

Method List / Definition Locations

expand_params!(params, n, CP, v, G)
ClusterValidityIndices.expand_strategy_1d! - Method
expand_strategy_1d!(cvi_vec::Vector, n_new::Real) -> Vector

Summary

Implements the strategy for expanding a 1-D CVIExpandVector with an arbitrary number.

Arguments

  • cvi_vec::CVIExpandVector: the 1-D vector to append a number to.
  • n_new::Real: a floating point or integer number to append to the vector.

Method List / Definition Locations

expand_strategy_1d!(cvi_vec, n_new)
ClusterValidityIndices.expand_strategy_2d! - Method
expand_strategy_2d!(
    cvi_mat::ElasticArrays.ElasticMatrix{T} where T,
    v_new::AbstractVector{T} where T<:Real
) -> ElasticArrays.ElasticMatrix{T} where T

Summary

Implements the strategy for expanding a 2-D CVIExpandMatrix with a vector on the last dimension.

Arguments

  • cvi_mat::CVIExpandMatrix: the 2-D matrix to append a vector to along its last dimension.
  • v_new::RealVector: the 1-D vector to append to the matrix.
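
The idea of growing a matrix along its last dimension can be sketched without ElasticArrays. The following is only an illustration in base Julia, not the package's implementation: it assumes column-major storage in a flat vector, and append_column! is a hypothetical helper name.

```julia
# Flat, column-major storage for a matrix with `dim` rows (hypothetical layout).
dim = 3
storage = Float64[]

# Append one column in place; earlier columns are left untouched.
function append_column!(storage::Vector{Float64}, dim::Integer, v_new::AbstractVector{<:Real})
    length(v_new) == dim || throw(DimensionMismatch("expected a vector of length $dim"))
    append!(storage, v_new)
    return storage
end

append_column!(storage, dim, [1.0, 2.0, 3.0])
append_column!(storage, dim, [4.0, 5.0, 6.0])

# View the flat storage as a dim-by-n_columns matrix.
# reshape shares memory with `storage`, so build this view only after
# all columns have been appended.
mat = reshape(storage, dim, :)
```

Because Julia arrays are column-major, appending a whole column only extends the end of the underlying storage, which is why growth along the last dimension is the inexpensive direction.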

Method List / Definition Locations

expand_strategy_2d!(cvi_mat, v_new)
ClusterValidityIndices.expand_strategy_3d! - Method
expand_strategy_3d!(
    cvi_mat::ElasticArrays.ElasticArray{T, 3, M, V} where {T<:Real, M, V<:DenseVector{T}},
    mat_new::AbstractMatrix{T} where T<:Real
) -> ElasticArrays.ElasticArray{T, 3, M, V} where {T<:Real, M, V<:DenseVector{T}}

Summary

Implements the strategy for expanding a 3-D CVI array with a 2-D matrix.

Arguments

  • cvi_mat::CVIExpandTensor: the 3-D CVI array to append to.
  • mat_new::RealMatrix: the 2-D matrix to append to the CVI array.

Method List / Definition Locations

expand_strategy_3d!(cvi_mat, mat_new)
ClusterValidityIndices.get_cvi! - Method
get_cvi!(
    cvi::CVI,
    data::AbstractMatrix{T} where T<:Real,
    labels::AbstractVector{T} where T<:Integer
) -> Any

Summary

Compute and return the criterion value in batch mode.

This method takes the CVI object, a batch of samples as a matrix of floats, and a vector of integers that represent the labels prescribed to the data by your clustering algorithm.

Note

You cannot switch to incremental mode after evaluating a CVI in batch mode. To evaluate incrementally, you must create a new CVI object.

Arguments

  • cvi::CVI: the stateful information of the CVI providing the criterion value.
  • data::RealMatrix: a matrix of data, columns as samples and rows as features, used in the external clustering process.
  • labels::IntegerVector: a vector of integers representing labels prescribed to the data by the external clustering algorithm.
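
Many data sets are stored with one sample per row, which is the transpose of the layout described above. A minimal sketch of the conversion in base Julia follows; row_major and its values are made up for illustration.

```julia
# Hypothetical data set with 4 samples (rows) and 2 features (columns).
row_major = [1.0 2.0;
             1.1 2.1;
             5.0 6.0;
             5.1 6.1]

# Rearrange so that rows are features and columns are samples.
data = permutedims(row_major)

# One integer label per column of `data`.
labels = [1, 1, 2, 2]

n_features, n_samples = size(data)
```

After this step, data and labels have the shapes that the batch method expects: a features-by-samples matrix and one label per column.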

Examples

# Create a new CVI object
my_cvi = CH()

# Load in random data as an example; 10 samples with feature dimension 3
dim = 3
n_samples = 10
data = rand(dim, n_samples)
# Two clusters of five samples each, so that there is one label per sample
labels = repeat(1:2, inner=n_samples ÷ 2)

# Compute the final criterion value in batch mode
criterion_value = get_cvi!(my_cvi, data, labels)

Method List / Definition Locations

get_cvi!(cvi, data, labels)
ClusterValidityIndices.get_cvi! - Method
get_cvi!(
    cvi::CVI,
    sample::AbstractVector{T} where T<:Real,
    label::Integer
) -> Any

Summary

Compute and return the criterion value incrementally.

This method takes the CVI object, a single sample as a vector of floats, and a single integer that represents the label prescribed to the sample by your clustering algorithm.

Note

You cannot switch to batch mode after incrementally evaluating a CVI. To evaluate in batch, you must create a new CVI object.

Arguments

  • cvi::CVI: the stateful information of the ICVI providing the criterion value.
  • sample::RealVector: a vector of features used in clustering the sample.
  • label::Integer: the cluster label prescribed to the sample by the clustering algorithm.

Examples

# Create a new CVI object
my_cvi = CH()

# Load in random data as an example; 10 samples with feature dimension 3
dim = 3
n_samples = 10
data = rand(dim, n_samples)
# Two clusters of five samples each, so that there is one label per sample
labels = repeat(1:2, inner=n_samples ÷ 2)

# Iteratively compute and extract the criterion value at every step
criterion_values = zeros(n_samples)
for ix = 1:n_samples
    sample = data[:, ix]
    label = labels[ix]
    criterion_values[ix] = get_cvi!(my_cvi, sample, label)
end

Method List / Definition Locations

get_cvi!(cvi, sample, label)
ClusterValidityIndices.get_internal_label! - Method
get_internal_label!(
    label_map::Dict{Int64, Int64},
    label::Integer
) -> Int64

Summary

Get the internal label and update the label map if the label is new.

Arguments

  • label_map::LabelMap: label map to extract the internal label from.
  • label::Integer: the external label that corresponds to an internal label.
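
One way such a mapping could work is sketched below. This is an assumption about the behavior rather than the package's code: it supposes that internal labels are assigned consecutively from 1 in order of first appearance, and sketch_internal_label! is a hypothetical name.

```julia
# Map an arbitrary external label to a consecutive internal label.
# Assumption: a new label receives the next unused integer, starting at 1.
function sketch_internal_label!(label_map::Dict{Int, Int}, label::Integer)
    # get! returns the stored value, or stores and returns the computed default.
    return get!(label_map, label) do
        length(label_map) + 1
    end
end

label_map = Dict{Int, Int}()
external_labels = [7, 7, 42, 7, 3]
internal = [sketch_internal_label!(label_map, l) for l in external_labels]
```

With a scheme like this, external labels do not need to be contiguous or start at 1, while the internal labels can index directly into per-cluster arrays.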

Method List / Definition Locations

get_internal_label!(label_map, label)
ClusterValidityIndices.init_cvi_update! - Method
init_cvi_update!(
    cvi::CVI,
    data::AbstractMatrix{T} where T<:Real,
    labels::AbstractVector{T} where T<:Integer
) -> Any

Summary

Initializes batch CVI updates.

Arguments

  • cvi::CVI: the CVI to prime for batch update.
  • data::RealMatrix: the data to use for batch initialization.
  • labels::IntegerVector: the labels corresponding to the provided data.

Method List / Definition Locations

init_cvi_update!(cvi, data, labels)
ClusterValidityIndices.init_cvi_update! - Method
init_cvi_update!(
    cvi::CVI,
    sample::AbstractVector{T} where T<:Real,
    label::Integer
) -> Int64

Summary

Initializes incremental CVI updates.

Arguments

  • cvi::CVI: the CVI to initialize incremental evaluation for.
  • sample::RealVector: the sample used for the incremental update.
  • label::Integer: the label provided with the sample for the incremental update.

Method List / Definition Locations

init_cvi_update!(cvi, sample, label)
ClusterValidityIndices.param_batch! - Method

Summary

Compute the CVI parameters in batch.

This method updates only the internal parameters of the CVI algorithm in batch. When the criterion value itself is needed, use evaluate! and extract it from cvi.criterion_value.

Arguments

  • cvi::CVI: the stateful information of the CVI/ICVI algorithm.
  • data::RealMatrix: a matrix of data where rows are features and columns are samples, used in the external clustering algorithm.
  • labels::IntegerVector: a vector of labels that the external clustering algorithm prescribed to each column in data.

Examples

julia> my_cvi = CH()
julia> data = load_some_data()
julia> labels = my_cluster_algorithm(data)
julia> param_batch!(my_cvi, data, labels)

Method List / Definition Locations

param_batch!(cvi, data, labels) (nine methods listed, one for each CVI type)
ClusterValidityIndices.param_inc! - Method

Summary

Compute the CVI parameters incrementally.

This method updates only internal parameters of the ICVI algorithm incrementally. When the criterion value itself is needed, use evaluate! and extract it from cvi.criterion_value.

Arguments

  • cvi::CVI: the stateful information of the CVI/ICVI algorithm.
  • sample::RealVector: a vector of features used in the external clustering algorithm.
  • label::Integer: the label that the external clustering algorithm prescribed to the sample.

Examples

julia> my_cvi = CH()
julia> data = load_some_data()
julia> labels = my_cluster_algorithm(data)
julia> param_inc!(my_cvi, data[:, 1], labels[1])

Method List / Definition Locations

param_inc!(cvi, sample, label) (nine methods listed, one for each CVI type)
ClusterValidityIndices.setup! - Method
setup!(
    cvi::CVI,
    data::AbstractMatrix{T} where T<:Real
) -> ClusterValidityIndices.CVIElasticParams

Summary

Internal method, sets up the CVI based upon a batch of data.

Arguments

  • cvi::CVI: the CVI to set up in batch mode.
  • data::RealMatrix: the data used for batch setup.

Method List / Definition Locations

setup!(cvi, data)
ClusterValidityIndices.setup! - Method
setup!(
    cvi::CVI,
    sample::AbstractVector{T} where T<:Real
) -> ClusterValidityIndices.CVIElasticParams

Summary

Internal method, sets up the CVI based upon the type of the provided sample.

Arguments

  • cvi::CVI: the CVI to set up with the correct dimensions.
  • sample::RealVector: The sample to use as a basis for setting up the CVI.

Method List / Definition Locations

setup!(cvi, sample)
ClusterValidityIndices.update_mean - Method
update_mean(
    old_mean::AbstractVector{T} where T<:Real,
    sample::AbstractVector{T} where T<:Real,
    n_new::Integer
) -> Any

Summary

Returns an updated mean vector with a new vector and adjusted count of samples.

Arguments

  • old_mean::RealVector: the old mean to update incrementally.
  • sample::RealVector: the sample to use for updating the mean.
  • n_new::Integer: the new sample count to use for the updated average.
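
The standard running-mean recurrence gives one way to perform such an update: new_mean = old_mean + (sample - old_mean) / n_new. The sketch below assumes this recurrence (the package's implementation may differ in detail) and uses the hypothetical name sketch_update_mean.

```julia
# Running-mean update; n_new is the sample count *after* including `sample`.
function sketch_update_mean(old_mean::AbstractVector{<:Real},
                            sample::AbstractVector{<:Real},
                            n_new::Integer)
    return old_mean .+ (sample .- old_mean) ./ n_new
end

# Feed three samples one at a time, starting from a zero vector.
samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
running = foldl(enumerate(samples); init=zeros(2)) do acc, (n, s)
    sketch_update_mean(acc, s, n)
end

# Batch mean of the same samples, for comparison.
batch = sum(samples) ./ length(samples)
```

The incremental result matches the batch mean up to floating-point rounding, without storing past samples.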

Method List / Definition Locations

update_mean(old_mean, sample, n_new)
ClusterValidityIndices.update_params! - Method
update_params!(
    params::ClusterValidityIndices.CVIElasticParams,
    index::Integer,
    n::Integer,
    CP::Float64,
    v::AbstractVector{T} where T<:Real,
    G::AbstractVector{T} where T<:Real
)

Summary

Updates the elastic CVI parameters in place at index.

Arguments

  • params::CVIElasticParams: the CVI elastic parameters to update in place.
  • index::Integer: the cluster index to update for all elastic parameters.
  • n::Integer: the new sample count of the cluster at the index.
  • CP::Float: the new compactness at the index.
  • v::RealVector: the new prototype at the index.
  • G::RealVector: the new compactness parameter G at the index.

Method List / Definition Locations

update_params!(params, index, n, CP, v, G)