# Reference

## Module

`CluGen` - Module

`CluGen`

A Julia package for generating multidimensional clusters. Provides the `clugen` function for this purpose, as well as a number of auxiliary functions, used internally and modularly by `clugen`. Users can swap these auxiliary functions for their own customized versions, fine-tuning their cluster generation strategies, or even use them as the basis for their own generation algorithms.

## Main functions

`CluGen.clugen` - Function

```
clugen(
num_dims::Integer,
num_clusters::Integer,
num_points::Integer,
direction::AbstractArray{<:Real},
angle_disp::Real,
cluster_sep::AbstractArray{<:Real,1},
llength::Real,
llength_disp::Real,
lateral_disp::Real;
# Keyword arguments
allow_empty::Bool = false,
cluster_offset::Union{AbstractArray{<:Real,1},Nothing} = nothing,
proj_dist_fn::Union{String,<:Function} = "norm",
point_dist_fn::Union{String,<:Function} = "n-1",
clusizes_fn::Union{<:Function,AbstractArray{<:Real,1}} = CluGen.clusizes,
clucenters_fn::Union{<:Function,AbstractArray{<:Real}} = CluGen.clucenters,
llengths_fn::Union{<:Function,AbstractArray{<:Real,1}} = CluGen.llengths,
angle_deltas_fn::Union{<:Function,AbstractArray{<:Real,1}} = CluGen.angle_deltas,
rng::Union{Integer,AbstractRNG}=Random.GLOBAL_RNG
) -> NamedTuple{(
:points, # Array{<:Real,2}
:clusters, # Array{<:Integer,1}
:projections, # Array{<:Real,2}
:sizes, # Array{<:Integer,1}
:centers, # Array{<:Real,2}
:directions, # Array{<:Real,2}
:angles, # Array{<:Real,1}
:lengths # Array{<:Real,1}
)}
```

Generate multidimensional clusters.

This is the main function of the CluGen package, and possibly the only function most users will need.

**Arguments (mandatory)**

- `num_dims`: Number of dimensions.
- `num_clusters`: Number of clusters to generate.
- `num_points`: Total number of points to generate.
- `direction`: Average direction of the cluster-supporting lines. Can be a vector of length `num_dims` (same direction for all clusters) or a matrix of size `num_clusters` x `num_dims` (one direction per cluster).
- `angle_disp`: Angle dispersion of cluster-supporting lines (radians).
- `cluster_sep`: Average cluster separation in each dimension (`num_dims` x 1).
- `llength`: Average length of cluster-supporting lines.
- `llength_disp`: Length dispersion of cluster-supporting lines.
- `lateral_disp`: Cluster lateral dispersion, i.e., dispersion of points from their projection on the cluster-supporting line.

Note that the terms "average" and "dispersion" refer to measures of central tendency and statistical dispersion, respectively. Their exact meaning depends on the optional arguments, described next.

**Arguments (optional)**

- `allow_empty`: Allow empty clusters? `false` by default.
- `cluster_offset`: Offset to add to all cluster centers. If set to `nothing` (the default), the offset will be equal to `zeros(num_dims)`.
- `proj_dist_fn`: Distribution of point projections along cluster-supporting lines, with three possible values:
  - `"norm"` (default): Distribute point projections along lines using a normal distribution (μ=*line center*, σ=`llength/6`).
  - `"unif"`: Distribute points uniformly along the line.
  - User-defined function, which accepts three parameters, line length (float), number of points (integer), and a random number generator, and returns an array containing the distance of each point projection to the center of the line. For example, the `"norm"` option roughly corresponds to `(len, n, rng) -> (1.0 / 6.0) * len .* randn(rng, n)`.
- `point_dist_fn`: Controls how the final points are created from their projections on the cluster-supporting lines, with three possible values:
  - `"n-1"` (default): Final points are placed on a hyperplane orthogonal to the cluster-supporting line, centered at each point's projection, using the normal distribution (μ=0, σ=`lateral_disp`). This is done by the `CluGen.clupoints_n_1()` function.
  - `"n"`: Final points are placed around their projection on the cluster-supporting line using the normal distribution (μ=0, σ=`lateral_disp`). This is done by the `CluGen.clupoints_n()` function.
  - User-defined function: The user can specify a custom point placement strategy by passing a function with the same signature as `CluGen.clupoints_n_1()` and `CluGen.clupoints_n()`.
- `clusizes_fn`: Distribution of cluster sizes. By default, cluster sizes are determined by the `CluGen.clusizes()` function, which uses the normal distribution (μ=`num_points`/`num_clusters`, σ=μ/3), and ensures that the final cluster sizes add up to `num_points`. This parameter allows the user to specify a custom function for this purpose, which must follow `CluGen.clusizes()`'s signature. Note that custom functions are not required to strictly obey the `num_points` parameter. Alternatively, the user can specify an array of cluster sizes directly.
- `clucenters_fn`: Distribution of cluster centers. By default, cluster centers are determined by the `CluGen.clucenters()` function, which uses the uniform distribution, and takes into account the `num_clusters` and `cluster_sep` parameters for generating well-distributed cluster centers. This parameter allows the user to specify a custom function for this purpose, which must follow `CluGen.clucenters()`'s signature. Alternatively, the user can specify a matrix of size `num_clusters` x `num_dims` with the exact cluster centers.
- `llengths_fn`: Distribution of line lengths. By default, the lengths of cluster-supporting lines are determined by the `CluGen.llengths()` function, which uses the folded normal distribution (μ=`llength`, σ=`llength_disp`). This parameter allows the user to specify a custom function for this purpose, which must follow `CluGen.llengths()`'s signature. Alternatively, the user can specify an array of line lengths directly.
- `angle_deltas_fn`: Distribution of line angle differences with respect to `direction`. By default, the angles between the main `direction` of each cluster and the final directions of their cluster-supporting lines are determined by the `CluGen.angle_deltas()` function, which uses the wrapped normal distribution (μ=0, σ=`angle_disp`) with support in the interval $\left[-\pi/2,\pi/2\right]$. This parameter allows the user to specify a custom function for this purpose, which must follow `CluGen.angle_deltas()`'s signature. Alternatively, the user can specify an array of angle deltas directly.
- `rng`: The seed for the random number generator or an instance of `AbstractRNG` for reproducible runs. Alternatively, the user can set the global RNG seed with `Random.seed!()` before invoking `clugen()`.
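As an illustration of the user-defined `proj_dist_fn` option, the sketch below defines two functions with the three-parameter form described above (line length, number of points, random number generator). The names `proj_norm` and `proj_unif` are hypothetical, and the uniform variant is only an assumption about what a `"unif"`-like strategy could look like, not the package's own code.

```julia
using Random

# Hypothetical custom projection distributions (names are not part of CluGen).
# Each takes the line length, the number of points and an RNG, and returns the
# signed distance of each point projection to the center of the line.

# Roughly corresponds to the "norm" option (σ = len / 6), as stated above
proj_norm(len, n, rng) = (1.0 / 6.0) * len .* randn(rng, n)

# Assumed uniform alternative: distances spread over the full line length
proj_unif(len, n, rng) = len .* rand(rng, n) .- len / 2

rng = MersenneTwister(42)
d_norm = proj_norm(12.0, 1000, rng)
d_unif = proj_unif(12.0, 1000, rng)

# Such a function would then be passed as a keyword argument, e.g.
# clugen(...; proj_dist_fn=proj_unif)
```

With a line of length 12, every value in `d_unif` lies within 6 units of the line center, whereas `d_norm` concentrates most projections near the center.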

**Return values**

The function returns a `NamedTuple` with the following fields:

- `points`: A `num_points` x `num_dims` matrix with the generated points for all clusters.
- `clusters`: A `num_points` x 1 vector indicating which cluster each point in `points` belongs to.
- `projections`: A `num_points` x `num_dims` matrix with the point projections on the cluster-supporting lines.
- `sizes`: A `num_clusters` x 1 vector with the number of points in each cluster.
- `centers`: A `num_clusters` x `num_dims` matrix with the coordinates of the cluster centers.
- `directions`: A `num_clusters` x `num_dims` matrix with the final direction of each cluster-supporting line.
- `angles`: A `num_clusters` x 1 vector with the angles between the cluster-supporting lines and the main direction.
- `lengths`: A `num_clusters` x 1 vector with the lengths of the cluster-supporting lines.

Note that if a custom function was given in the `clusizes_fn` parameter, the `num_points` dimension of the returned fields may differ from the value specified in `clugen`'s `num_points` parameter.

**Examples**

```
julia> # Create 5 clusters in 3D space with a total of 10000 points...
julia> out = clugen(3, 5, 10000, [0.5, 0.5, 0.5], pi / 16, [10, 10, 10], 10, 1, 2);
julia> out.centers # What are the cluster centers?
5×3 Matrix{Float64}:
8.12774 -16.8167 -1.80764
4.30111 -1.34916 -11.209
-22.3933 18.2706 -2.6716
-11.568 5.87459 4.11589
-19.5565 -10.7151 -12.2009
```

The following instruction displays a scatter plot of the clusters in 3D space:

`julia> plot(out.points[:, 1], out.points[:, 2], out.points[:, 3], seriestype=:scatter, group=out.clusters)`

Check the Examples section for a number of illustrative examples on how to use the `clugen()` function. The Theory section provides more information on how the function works and the impact each parameter has on the final result.

`CluGen.clumerge` - Function

```
clumerge(
data::Union{NamedTuple,Dict}...;
fields::Tuple{Vararg{Symbol}}=(:points, :clusters),
clusters_field::Union{Symbol,Nothing}=:clusters,
output_type::Symbol=:NamedTuple
) -> Union{NamedTuple,Dict}
```

Merges the fields (specified in `fields`) of two or more `data` sets (named tuples or dictionaries). The fields to be merged need to have the same number of columns. The corresponding merged field will contain the rows of the fields to be merged, and will have a common supertype.

The `clusters_field` parameter specifies a field containing integers that identify the cluster to which each point belongs. If `clusters_field` is specified (by default it is set to `:clusters`), cluster assignments in individual data sets will be updated in the merged data set so that clusters are considered separate. This parameter can be set to `nothing`, in which case no field will be treated as a special cluster assignments field.

This function can be used to merge data sets generated with the `clugen()` function, by default merging the `:points` and `:clusters` fields in those data sets. It also works with arbitrary data by specifying alternative fields in the `fields` parameter. It can be used, for example, to merge third-party data with `clugen()`-generated data.

The function returns a `NamedTuple` by default, but can return a dictionary by setting the `output_type` parameter to `:Dict`.
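To make the idea of keeping clusters separate more concrete, here is a minimal sketch of one way such a merge could work. The `merge_sketch` helper is hypothetical, handles only the `:points` and `:clusters` fields of named tuples, and assumes cluster labels are positive integers; the package's actual relabeling rule may differ.

```julia
# Hypothetical helper (not the package's code): merge the :points and :clusters
# fields of several data sets, shifting cluster labels so they never collide.
function merge_sketch(datasets::NamedTuple...)
    points = vcat((d.points for d in datasets)...)   # stack the rows of all data sets
    clusters = Int[]
    offset = 0                                        # largest label used so far
    for d in datasets
        append!(clusters, d.clusters .+ offset)
        offset = maximum(clusters)
    end
    return (points=points, clusters=clusters)
end

a = (points=[0.0 0.0; 1.0 1.0; 5.0 5.0], clusters=[1, 1, 2])
b = (points=[9.0 9.0; 8.0 8.0], clusters=Int32[1, 1])
merged = merge_sketch(a, b)
```

Here cluster 1 of `b` becomes cluster 3 in `merged`, so it is not confused with cluster 1 of `a`.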

**Examples**

```
julia> # Generate data with clugen()
julia> clu_data = clugen(2, 5, 1000, [1, 1], 0.01, [20, 20], 14, 1.2, 1.5);
julia> # Generate 500 points of random uniform noise
julia> noise = (points=120 * rand(500, 2) .- 60, clusters=ones(Int32, 500));
julia> # Create a new data set with the clugen()-generated data plus the noise
julia> clu_data_with_noise = clumerge(noise, clu_data);
```

The Examples section contains several illustrative examples on how to use the `clumerge()` function.

## Core functions

Core functions perform a number of useful operations during several steps of the algorithm. These functions may be useful in other contexts, and are thus exported by the package.

`CluGen.points_on_line` - Function

```
points_on_line(
center::AbstractArray{<:Real,1},
direction::AbstractArray{<:Real,1},
dist_center::AbstractArray{<:Real,1}
) -> AbstractArray{<:Real,2}
```

Determine coordinates of points on a line with `center` and `direction`, based on the distances from the center given in `dist_center`.

This works by using the vector formulation of the line equation assuming `direction` is an $n$-dimensional unit vector. In other words, considering $\mathbf{d}=$ `direction` ($n \times 1$), $\mathbf{c}=$ `center` ($n \times 1$), and $\mathbf{w}=$ `dist_center` ($p \times 1$), the coordinates of points on the line are given by:

\[\mathbf{P}=\mathbf{1}\,\mathbf{c}^T + \mathbf{w}\mathbf{d}^T\]

where $\mathbf{P}$ is the $p \times n$ matrix of point coordinates on the line, and $\mathbf{1}$ is a $p \times 1$ vector with all entries equal to 1.
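The equation above can be written almost literally with Julia broadcasting. The following is a sketch only: `line_points` is a hypothetical name, and the code is not claimed to match the package's implementation.

```julia
# Sketch of P = 1 cᵀ + w dᵀ using broadcasting; `line_points` is a hypothetical
# name, and `d` is assumed to be a unit vector.
function line_points(
    c::AbstractVector{<:Real}, d::AbstractVector{<:Real}, w::AbstractVector{<:Real}
)
    # w .* d' is the p×n outer product w dᵀ; adding c' repeats cᵀ on every row
    return c' .+ w .* d'
end

P = line_points([5.0, 5.0], [1.0, 0.0], -4:2:4)
```

For this input, `P` is the 5×2 matrix with rows (1, 5), (3, 5), (5, 5), (7, 5) and (9, 5).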

**Examples**

```
julia> points_on_line([5.0, 5.0], [1.0, 0.0], -4:2:4) # 2D, 5 points
5×2 Matrix{Float64}:
1.0 5.0
3.0 5.0
5.0 5.0
7.0 5.0
9.0 5.0
julia> points_on_line([-2.0, 0, 0, 2.0], [0, 0, -1.0, 0], [10, -10]) # 4D, 2 points
2×4 Matrix{Float64}:
-2.0 0.0 -10.0 2.0
-2.0 0.0 10.0 2.0
```

`CluGen.rand_ortho_vector` - Function

```
rand_ortho_vector(
u::AbstractArray{<:Real,1};
rng::AbstractRNG=Random.GLOBAL_RNG
) -> AbstractArray{<:Real,1}
```

Get a random unit vector orthogonal to `u`.

Note that `u` is expected to be a unit vector itself.
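One common way to obtain such a vector is a single Gram-Schmidt step applied to a random vector. The sketch below shows that technique under the assumption that `u` is a unit vector with at least two dimensions; `ortho_sketch` is a hypothetical name and this is not necessarily how the package does it.

```julia
using LinearAlgebra, Random

# Hypothetical sketch: remove from a random vector its component along `u`
# (one Gram-Schmidt step) and normalize what is left.
function ortho_sketch(u::AbstractVector{<:Real}; rng::AbstractRNG=Random.default_rng())
    r = randn(rng, length(u))
    v = r .- dot(r, u) .* u     # dot(v, u) is now zero up to rounding
    return normalize(v)
end

u = normalize([1, 2, 5.0, -3, -0.2])
v = ortho_sketch(u; rng=MersenneTwister(1))
```

This also shows why `u` must be a unit vector: the projection `dot(r, u) .* u` only removes the full component along `u` when `norm(u) == 1`.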

**Examples**

```
julia> u = normalize([1, 2, 5.0, -3, -0.2]); # Define a 5D unit vector
julia> v = rand_ortho_vector(u);
julia> ≈(dot(u, v), 0; atol=1e-15) # Vectors orthogonal? (needs LinearAlgebra package)
true
julia> rand_ortho_vector([1, 0, 0]; rng=MersenneTwister(567)) # 3D, reproducible
3-element Vector{Float64}:
0.0
-0.717797705156548
0.6962517177515569
```

`CluGen.rand_unit_vector` - Function

```
rand_unit_vector(
num_dims::Integer;
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real,1}
```

Get a random unit vector with `num_dims` dimensions.

**Examples**

```
julia> v = rand_unit_vector(4) # 4D
4-element Vector{Float64}:
-0.24033021128704707
-0.032103799230189585
0.04223910709972599
-0.9692402145232775
julia> norm(v) # Check vector magnitude is 1 (needs LinearAlgebra package)
1.0
julia> rand_unit_vector(2; rng=MersenneTwister(33)) # 2D, reproducible
2-element Vector{Float64}:
0.8429232717309576
-0.5380337888779647
```

`CluGen.rand_vector_at_angle` - Function

```
rand_vector_at_angle(
u::AbstractArray{<:Real,1},
angle::Real;
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real,1}
```

Get a random unit vector at an angle of `angle` radians from vector `u`.

Note that `u` is expected to be a unit vector itself.
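Geometrically, if `o` is a unit vector orthogonal to the unit vector `u`, then `cos(θ) * u + sin(θ) * o` is a unit vector at angle `|θ|` from `u`. The sketch below relies on that identity; `at_angle_sketch` is a hypothetical name, `u` is assumed to have at least two dimensions, and the package may construct the vector differently.

```julia
using LinearAlgebra, Random

# Hypothetical sketch: combine `u` with a random unit vector orthogonal to it.
function at_angle_sketch(
    u::AbstractVector{<:Real}, angle::Real; rng::AbstractRNG=Random.default_rng()
)
    r = randn(rng, length(u))
    o = normalize(r .- dot(r, u) .* u)   # random unit vector orthogonal to u
    return cos(angle) .* u .+ sin(angle) .* o
end

u = normalize([1, 0.5, 0.3, -0.1])
v = at_angle_sketch(u, pi / 4; rng=MersenneTwister(2))
```

Because `u` and `o` are orthonormal, `dot(u, v)` equals `cos(angle)` and `v` has unit length, up to floating-point rounding.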

**Examples**

```
julia> u = normalize([1,0.5,0.3,-0.1]); # Define a 4D unit vector
julia> v = rand_vector_at_angle(u, pi/4); # pi/4 = 0.7853981... radians = 45 degrees
julia> a = acos(dot(u, v) / (norm(u) * norm(v))) # Angle (radians) between u and v?
0.7853981633974483
julia> rand_vector_at_angle([0, 1], pi/6; rng=MersenneTwister(456)) # 2D, reproducible
2-element Vector{Float64}:
-0.4999999999999999
0.8660254037844387
```

## Algorithm module functions

The module functions perform a complete step of the cluster generation algorithm, providing the package's out-of-the-box functionality. Users can swap one or more of these when invoking `clugen()` in order to customize the algorithm to their needs.

Since these functions are specific to the cluster generation algorithm, they are not exported by the package.

`CluGen.angle_deltas` - Function

```
angle_deltas(
num_clusters::Integer,
angle_disp::Real;
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real,1}
```

Determine the angles between the average cluster direction and the cluster-supporting lines. These angles are obtained from a wrapped normal distribution (μ=0, σ=`angle_disp`) with support in the interval $\left[-\pi/2,\pi/2\right]$. Note this is different from the standard wrapped normal distribution, the support of which is given by the interval $\left[-\pi,\pi\right]$.

The `angle_disp` parameter must be specified in radians and results are given in radians in the interval $\left[-\pi/2,\pi/2\right]$.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.
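The following sketch shows one way to wrap normally distributed angles into an interval of width π centered at zero. The wrapping rule used here (shift, reduce modulo π, shift back) is an assumption chosen for illustration, and `angle_deltas_sketch` is a hypothetical name; the package's own wrapping code may be written differently.

```julia
using Random

# Hypothetical sketch: draw normal angles and wrap them with period π,
# instead of the 2π period of the standard wrapped normal distribution.
function angle_deltas_sketch(
    num_clusters::Integer, angle_disp::Real; rng::AbstractRNG=Random.default_rng()
)
    angles = angle_disp .* randn(rng, num_clusters)
    # Shift by π/2, reduce modulo π, shift back
    return mod.(angles .+ π / 2, π) .- π / 2
end

# A deliberately large dispersion, so that many samples need wrapping
deltas = angle_deltas_sketch(1000, 2.0; rng=MersenneTwister(3))
```

Angles that already lie inside the interval are left unchanged by the wrapping step, so for small `angle_disp` the result is practically a plain normal sample.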

**Examples**

```
julia> CluGen.angle_deltas(4, pi / 128)
4-element Vector{Float64}:
0.01888791855096079
-0.027851298321307266
0.03274154825228485
-0.004475798744567242
julia> CluGen.angle_deltas(3, pi / 32; rng=MersenneTwister(987)) # Reproducible
3-element Vector{Float64}:
0.08834204306583336
0.014678748091943444
-0.15202559427536264
```

`CluGen.clucenters` - Function

```
clucenters(
num_clusters::Integer,
clu_sep::AbstractArray{<:Real,1},
clu_offset::AbstractArray{<:Real,1};
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real}
```

Determine cluster centers using the uniform distribution, taking into account the number of clusters (`num_clusters`) and the average cluster separation (`clu_sep`).

More specifically, let $c=$ `num_clusters`, $\mathbf{s}=$ `clu_sep`, $\mathbf{o}=$ `clu_offset`, $n=$ `length(clu_sep)` (i.e., number of dimensions). Cluster centers are obtained according to the following equation:

\[\mathbf{C}=c\mathbf{U} \cdot \operatorname{diag}(\mathbf{s}) + \mathbf{1}\,\mathbf{o}^T\]

where $\mathbf{C}$ is the $c \times n$ matrix of cluster centers, $\mathbf{U}$ is a $c \times n$ matrix of random values drawn from the uniform distribution between -0.5 and 0.5, and $\mathbf{1}$ is a $c \times 1$ vector with all entries equal to 1.
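The equation translates directly into a few lines of Julia. This is a sketch of the formula as stated, with the hypothetical name `centers_sketch`; it is not claimed to reproduce the package's random number stream.

```julia
using LinearAlgebra, Random

# Sketch of C = c U diag(s) + 1 oᵀ; `centers_sketch` is a hypothetical name.
function centers_sketch(
    num_clusters::Integer,
    clu_sep::AbstractVector{<:Real},
    clu_offset::AbstractVector{<:Real};
    rng::AbstractRNG=Random.default_rng(),
)
    n = length(clu_sep)
    U = rand(rng, num_clusters, n) .- 0.5     # uniform values in [-0.5, 0.5)
    # Scale column j by c * s[j], then add the offset oᵀ to every row
    return (num_clusters .* U) * Diagonal(clu_sep) .+ clu_offset'
end

C = centers_sketch(5, [20, 10, 30], [10, 10, -10]; rng=MersenneTwister(4))
```

A consequence of the formula is that, in dimension `j`, every center lies within `num_clusters * clu_sep[j] / 2` of `clu_offset[j]`.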

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.

**Examples**

```
julia> CluGen.clucenters(4, [10, 50], [0, 0]) # 2D
4×2 Matrix{Float64}:
10.7379 -37.3512
17.6206 32.511
6.95835 17.2044
-4.18188 -89.5734
julia> CluGen.clucenters(5, [20, 10, 30], [10, 10, -10]) # 3D
5×3 Matrix{Float64}:
-13.136 15.8746 2.34767
-29.1129 -0.715105 -46.6028
-23.6334 8.19236 20.879
7.30168 -1.20904 -41.2033
46.5412 7.3284 -42.8401
julia> CluGen.clucenters(3, [100], [0]; rng=MersenneTwister(121)) # 1D, reproducible
3×1 Matrix{Float64}:
-91.3675026663759
140.98964768714384
-124.90981996579862
```

`CluGen.clupoints_n_1` - Function

```
CluGen.clupoints_n_1(
projs::AbstractArray{<:Real,2},
lat_disp::Real,
line_len::Real,
clu_dir::AbstractArray{<:Real,1},
clu_ctr::AbstractArray{<:Real,1};
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real}
```

Generate points from their $n$-dimensional projections on a cluster-supporting line, placing each point on a hyperplane orthogonal to that line and centered at the point's projection, using the normal distribution (μ=0, σ=`lat_disp`).

This function's main intended use is by the `clugen()` function, generating the final points when the `point_dist_fn` parameter is set to `"n-1"`.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.

**Arguments**

- `projs`: Point projections on the cluster-supporting line.
- `lat_disp`: Standard deviation for the normal distribution, i.e., cluster lateral dispersion.
- `line_len`: Length of cluster-supporting line (ignored).
- `clu_dir`: Direction of the cluster-supporting line (unit vector).
- `clu_ctr`: Center position of the cluster-supporting line (ignored).
- `rng`: An optional pseudo-random number generator for reproducible executions.
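The placement strategy described above can be sketched as follows: each projection is moved along a random direction orthogonal to the line, by a normally distributed distance. The name `points_n_1_sketch` is hypothetical, the ignored `line_len` and `clu_ctr` arguments are left out, and `clu_dir` is assumed to be a unit vector with at least two dimensions. This is an illustration of the idea, not the package's code.

```julia
using LinearAlgebra, Random

# Hypothetical sketch of the "n-1" strategy
function points_n_1_sketch(
    projs::AbstractMatrix{<:Real},
    lat_disp::Real,
    clu_dir::AbstractVector{<:Real};
    rng::AbstractRNG=Random.default_rng(),
)
    points = float.(projs)                               # new matrix, one row per point
    for i in axes(points, 1)
        r = randn(rng, length(clu_dir))
        o = normalize(r .- dot(r, clu_dir) .* clu_dir)   # unit vector orthogonal to clu_dir
        points[i, :] .+= lat_disp * randn(rng) .* o      # normal distance, σ = lat_disp
    end
    return points
end

projs = [1.0 5.0; 3.0 5.0; 5.0 5.0; 7.0 5.0; 9.0 5.0]
pts = points_n_1_sketch(projs, 0.5, [1.0, 0.0]; rng=MersenneTwister(5))
```

Since every displacement is orthogonal to `clu_dir`, projecting `pts` back onto the line recovers `projs`; in this 2D case only the second coordinate changes.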

**Examples**

```
julia> projs = points_on_line([5.0, 5.0], [1.0, 0.0], -4:2:4) # Get 5 point projections on a 2D line
5×2 Matrix{Float64}:
1.0 5.0
3.0 5.0
5.0 5.0
7.0 5.0
9.0 5.0
julia> CluGen.clupoints_n_1(projs, 0.5, 1.0, [1, 0], [0, 0]; rng=MersenneTwister(123))
5×2 Matrix{Float64}:
1.0 5.59513
3.0 3.97591
5.0 4.42867
7.0 5.22971
9.0 4.80166
```

`CluGen.clupoints_n` - Function

```
CluGen.clupoints_n(
projs::AbstractArray{<:Real,2},
lat_disp::Real,
line_len::Real,
clu_dir::AbstractArray{<:Real,1},
clu_ctr::AbstractArray{<:Real,1};
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real}
```

Generate points from their $n$-dimensional projections on a cluster-supporting line, placing each point around its projection using the normal distribution (μ=0, σ=`lat_disp`).

This function's main intended use is by the `clugen()` function, generating the final points when the `point_dist_fn` parameter is set to `"n"`.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.

**Arguments**

- `projs`: Point projections on the cluster-supporting line.
- `lat_disp`: Standard deviation for the normal distribution, i.e., cluster lateral dispersion.
- `line_len`: Length of cluster-supporting line (ignored).
- `clu_dir`: Direction of the cluster-supporting line.
- `clu_ctr`: Center position of the cluster-supporting line (ignored).
- `rng`: An optional pseudo-random number generator for reproducible executions.

**Examples**

```
julia> projs = points_on_line([5.0, 5.0], [1.0, 0.0], -4:2:4) # Get 5 point projections on a 2D line
5×2 Matrix{Float64}:
1.0 5.0
3.0 5.0
5.0 5.0
7.0 5.0
9.0 5.0
julia> CluGen.clupoints_n(projs, 0.5, 1.0, [1, 0], [0, 0]; rng=MersenneTwister(123))
5×2 Matrix{Float64}:
1.59513 4.66764
4.02409 5.49048
5.57133 4.96226
7.22971 5.13691
8.80166 4.90289
```

`CluGen.clusizes` - Function

```
clusizes(
num_clusters::Integer,
num_points::Integer,
allow_empty::Bool;
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Integer,1}
```

Determine cluster sizes, i.e., the number of points in each cluster, using the normal distribution (μ=`num_points`/`num_clusters`, σ=μ/3), and then ensuring that the final cluster sizes add up to `num_points` via the `CluGen.fix_num_points!()` function.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.

**Examples**

```
julia> CluGen.clusizes(4, 6, true)
4-element Vector{Int64}:
1
0
3
2
julia> CluGen.clusizes(4, 100, false)
4-element Vector{Int64}:
29
26
24
21
julia> CluGen.clusizes(5, 500, true; rng=MersenneTwister(123)) # Reproducible
5-element Vector{Int64}:
108
129
107
89
67
```

`CluGen.llengths` - Function

```
llengths(
num_clusters::Integer,
llength::Real,
llength_disp::Real;
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real,1}
```

Determine the lengths of cluster-supporting lines using the folded normal distribution (μ=`llength`, σ=`llength_disp`).

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.
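A folded normal sample is simply the absolute value of a normal sample, which guarantees non-negative lengths. The sketch below illustrates this; `llengths_sketch` is a hypothetical name and the code is not claimed to be the package's implementation.

```julia
using Random

# Sketch: fold a normal sample (μ = llength, σ = llength_disp) at zero
function llengths_sketch(
    num_clusters::Integer,
    llength::Real,
    llength_disp::Real;
    rng::AbstractRNG=Random.default_rng(),
)
    return abs.(llength .+ llength_disp .* randn(rng, num_clusters))
end

# Dispersion larger than the mean, so plain normal samples would often be negative
lens = llengths_sketch(1000, 10, 30; rng=MersenneTwister(6))
```

When `llength_disp` is small relative to `llength`, folding rarely has any effect and the lengths are practically normally distributed around `llength`.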

**Examples**

```
julia> CluGen.llengths(5, 10, 3)
5-element Vector{Float64}:
13.57080364295883
16.14453912336772
13.427952708601596
11.37824686122124
8.809962762114331
julia> CluGen.llengths(3, 100, 60; rng=MersenneTwister(111)) # Reproducible
3-element Vector{Float64}:
146.1737820482947
31.914161161783426
180.04064126207396
```

## Helper functions

The helper functions provide useful or reusable functionality, mainly to the module functions described in the previous section. This reusable functionality may be useful for users implementing their own customized module functions.

Except for `angle_btw()`, these functions are not exported by the package since their use is limited to advanced algorithm customization scenarios.

`CluGen.angle_btw` - Function

`angle_btw(v1::AbstractArray{<:Real,1}, v2::AbstractArray{<:Real,1}) -> Real`

Angle between two $n$-dimensional vectors.

Typically, the angle between two vectors `v1` and `v2` can be obtained with:

`acos(dot(v1, v2) / (norm(v1) * norm(v2)))`

However, this approach is numerically unstable. The version provided here is numerically stable and based on the AngleBetweenVectors.jl package by Jeffrey Sarnoff (MIT license), implementing an algorithm provided by Prof. W. Kahan in these notes (see page 15).
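For intuition, here is a sketch of a half-angle formulation of the kind referred to above: for unit vectors, the norm of their difference is `2 sin(θ/2)` and the norm of their sum is `2 cos(θ/2)`, so the angle follows from a two-argument `atan`. The name `angle_sketch` is hypothetical, nonzero input vectors are assumed, and the code is not claimed to be identical to the package's or to AngleBetweenVectors.jl.

```julia
using LinearAlgebra

# Hypothetical sketch: normalize both vectors, then use the norms of their
# difference and sum to recover half the angle.
function angle_sketch(v1::AbstractVector{<:Real}, v2::AbstractVector{<:Real})
    u1 = normalize(float.(v1))
    u2 = normalize(float.(v2))
    return 2 * atan(norm(u1 .- u2), norm(u1 .+ u2))
end

θ = angle_sketch([1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0])
```

Unlike the `acos` form, this expression stays well conditioned when the vectors are nearly parallel or nearly opposite, since it never evaluates `acos` close to ±1.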

**Examples**

```
julia> rad2deg(angle_btw([1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0]))
60.00000000000001
```

`CluGen.clupoints_n_1_template` - Function

```
CluGen.clupoints_n_1_template(
projs::AbstractArray{<:Real,2},
lat_disp::Real,
clu_dir::AbstractArray{<:Real,1},
dist_fn::Function;
rng::AbstractRNG = Random.GLOBAL_RNG
) -> AbstractArray{<:Real}
```

Generate points from their $n$-dimensional projections on a cluster-supporting line, placing each point on a hyperplane orthogonal to that line and centered at the point's projection. The function specified in `dist_fn` is used to perform the actual placement.

This function is used internally by `CluGen.clupoints_n_1()` and may be useful for constructing user-defined final point placement strategies for the `point_dist_fn` parameter of the main `clugen()` function.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.

**Arguments**

- `projs`: Point projections on the cluster-supporting line.
- `lat_disp`: Dispersion of points from their projection.
- `clu_dir`: Direction of the cluster-supporting line (unit vector).
- `dist_fn`: Function to place points on a second line, orthogonal to the first. The function accepts as parameters the number of points in the current cluster, the `lateral_disp` parameter (the same passed to the `clugen()` function), and a random number generator, returning a vector containing the distance of each point to its projection on the cluster-supporting line.
- `rng`: An optional pseudo-random number generator for reproducible executions.

`CluGen.fix_empty!` - Function

```
fix_empty!(
clu_num_points::AbstractArray{<:Integer,1},
allow_empty::Bool=false
) -> AbstractArray{<:Integer,1}
```

Ensures that, given enough points, no clusters are left empty. This is done by removing a point from the largest cluster and adding it to an empty cluster while there are empty clusters. If the total number of points is smaller than the number of clusters (or if the `allow_empty` parameter is set to `true`), this function does nothing.
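The procedure just described can be sketched in a few lines. The name `fix_empty_sketch!` is hypothetical and the code follows the description above rather than the package's source.

```julia
# Hypothetical sketch: move one point from the largest cluster to each empty one
function fix_empty_sketch!(sizes::AbstractVector{<:Integer}, allow_empty::Bool=false)
    # Do nothing if empty clusters are allowed or there are too few points
    if !allow_empty && sum(sizes) >= length(sizes)
        for i in eachindex(sizes)
            if sizes[i] == 0
                sizes[argmax(sizes)] -= 1   # take a point from the largest cluster
                sizes[i] += 1               # give it to the empty cluster
            end
        end
    end
    return sizes
end

fixed = fix_empty_sketch!([0, 5, 0, 3])
untouched = fix_empty_sketch!([0, 1, 0])    # 1 point cannot fill 3 clusters
```

The total number of points is preserved: in the first call, two points leave the largest cluster and the sizes still add up to 8.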

This function is used internally by `CluGen.clusizes()` and might be useful for custom cluster sizing implementations given as the `clusizes_fn` parameter of the main `clugen()` function.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.

`CluGen.fix_num_points!` - Function

```
fix_num_points!(
clu_num_points::AbstractArray{<:Integer,1},
num_points::Integer
) -> AbstractArray{<:Integer,1}
```

Ensures that the values in the `clu_num_points` array, i.e., the number of points in each cluster, add up to `num_points`. If this is not the case, the `clu_num_points` array is modified in-place, incrementing the value corresponding to the smallest cluster while `sum(clu_num_points) < num_points`, or decrementing the value corresponding to the largest cluster while `sum(clu_num_points) > num_points`.
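The two loops described above map directly onto `argmin` and `argmax`. The following sketch uses the hypothetical name `fix_num_points_sketch!` and follows the description, not the package's source.

```julia
# Hypothetical sketch: adjust cluster sizes in-place until they add up to num_points
function fix_num_points_sketch!(sizes::AbstractVector{<:Integer}, num_points::Integer)
    while sum(sizes) < num_points
        sizes[argmin(sizes)] += 1   # grow the smallest cluster
    end
    while sum(sizes) > num_points
        sizes[argmax(sizes)] -= 1   # shrink the largest cluster
    end
    return sizes
end

grown = fix_num_points_sketch!([2, 3, 1], 10)
shrunk = fix_num_points_sketch!([5, 5], 7)
```

Always adjusting the current smallest or largest cluster tends to even out the sizes, which is why `[2, 3, 1]` ends up as `[4, 3, 3]` rather than, say, `[6, 3, 1]`.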

This function is used internally by `CluGen.clusizes()` and might be useful for custom cluster sizing implementations given as the `clusizes_fn` parameter of the main `clugen()` function.

This function is not exported by the package and must be prefixed with `CluGen` if invoked by user code.