

Install this package with Pkg.add("GroupNumbers")


  • groupby2YYYZZZ(xs; keyfunc=ident, compare=isequal)
  • groupby_numbersYYYZZZ(xs; keyfunc=ident, compare=isapprox, kwargs)

Here, "YYY" = "" or "_dict", and "ZZZ" = "" or "_indices".

A family of iterators that group adjacent elements of the given iterator xs.

The keyfunc function is applied to each element of xs to compute the key used for comparison. By default, keyfunc is ident, so each element is its own key.

The compare function is applied to adjacent keys. The groupby2YYYZZZ family uses isequal as the default compare function, while the groupby_numbersYYYZZZ family uses isapprox; any additional kwargs are passed on to this default isapprox as keyword arguments, which lets you control the tolerance.
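The roles of keyfunc and compare can be illustrated with a minimal sketch in plain Julia. The helper group_adjacent below is hypothetical and is not the package's implementation. It assumes each key is compared with the key of the previous element; whether the package compares against the previous key or against the first key of the current group is not stated here.

```julia
# Hypothetical helper, not part of GroupNumbers: collects adjacent elements
# whose keys satisfy `compare` into the same group.
function group_adjacent(xs; keyfunc=identity, compare=isequal)
    groups = Vector{Vector{eltype(xs)}}()
    prevkey = nothing
    for x in xs
        key = keyfunc(x)
        if isempty(groups) || !compare(prevkey, key)
            push!(groups, [x])      # key differs from the previous one: new group
        else
            push!(groups[end], x)   # key matches: extend the current group
        end
        prevkey = key               # assumption: compare against the previous key
    end
    return groups
end

by_char  = group_adjacent("AAAABBBCCD")
by_upper = group_adjacent("AaAABbBcCD"; keyfunc=uppercase)
by_approx = group_adjacent([1.0, 1.0 + 1e-10, 2.0]; compare=isapprox)
```

Passing compare=isapprox in the last call mimics what the groupby_numbers family does by default, so the first two numbers land in one group.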

The plain iterators ("ZZZ" = "") emit the grouped elements, while the _indices variants ("ZZZ" = "_indices") emit the indices of the grouped elements.

The plain iterators ("YYY" = "") emit only the grouped elements or their indices, while the _dict variants ("YYY" = "_dict") also emit the first key of each group.

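| compare function | emits the grouped elements | emits the grouped indices | note |
| isequal | groupby2 | groupby2_indices | |
| isequal | groupby2_dict | groupby2_dict_indices | also emits key |
| isapprox | groupby_numbers | groupby_numbers_indices | |
| isapprox | groupby_numbers_dict | groupby_numbers_dict_indices | also emits key |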
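As a rough illustration of how the variants relate, the element groups and the key/group pairs can both be derived from the index groups. This is a sketch in plain Julia with hand-written index groups; the variable names are illustrative only and nothing here calls the package.

```julia
# Illustration only: derive the other output shapes from the index groups.
xs = collect("AaAABbBcCD")
keyfunc = uppercase

# What an `_indices` variant emits for this input (one vector per group).
index_groups = [[1, 2, 3, 4], [5, 6, 7], [8, 9], [10]]

# What the plain variant emits: the elements at those indices.
element_groups = [xs[idx] for idx in index_groups]

# What a `_dict` variant emits: the first key of each group, then the group.
keyed_groups = [(keyfunc(xs[first(idx)]), xs[idx]) for idx in index_groups]
```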


Example 1: Groups characters in a string

Simple case

julia> collect(groupby2("AAAABBBCCD"))
4-element Vector{Vector{Char}}:
 ['A', 'A', 'A', 'A']
 ['B', 'B', 'B']
 ['C', 'C']
 ['D']

Emits keys

julia> collect(groupby2_dict("AAAABBBCCD"))
4-element Vector{Tuple{Any, Vector{Char}}}:
 ('A', ['A', 'A', 'A', 'A'])
 ('B', ['B', 'B', 'B'])
 ('C', ['C', 'C'])
 ('D', ['D'])

Groups case-insensitively

julia> collect(groupby2_dict("AaAABbBcCD", keyfunc=uppercase))
4-element Vector{Tuple{Any, Vector{Char}}}:
 ('A', ['A', 'a', 'A', 'A'])
 ('B', ['B', 'b', 'B'])
 ('C', ['c', 'C'])
 ('D', ['D'])

Groups case-insensitively. Emits the grouped indices rather than the grouped elements.

julia> collect(groupby2_dict_indices("AaAABbBcCD", keyfunc=uppercase))
4-element Vector{Tuple{Any, Vector{Int64}}}:
 ('A', [1, 2, 3, 4])
 ('B', [5, 6, 7])
 ('C', [8, 9])
 ('D', [10])

Example 2: Groups integer numbers

Simple case

julia> collect(groupby2([10,20,20,30]))
3-element Vector{Vector{Int64}}:
 [10]
 [20, 20]
 [30]

julia> collect(groupby_numbers([10,20,20,30])); # => same result

Emits keys

julia> collect(groupby2_dict([10,20,20,30]))
3-element Vector{Tuple{Any, Vector{Int64}}}:
 (10, [10])
 (20, [20, 20])
 (30, [30])

julia> collect(groupby_numbers_dict([10,20,20,30])); # => same result

Groups by absolute values

julia> collect(groupby2_dict([10,-20,20,30]; keyfunc=abs))
3-element Vector{Tuple{Any, Vector{Int64}}}:
 (10, [10])
 (20, [-20, 20])
 (30, [30])

julia> collect(groupby_numbers_dict([10,-20,20,30]; keyfunc=abs)); # => same result

Groups by absolute values. Emits the grouped indices rather than the grouped elements.

julia> collect(groupby2_dict_indices([10,-20,20,30]; keyfunc=abs))
3-element Vector{Tuple{Any, Vector{Int64}}}:
 (10, [1])
 (20, [2, 3])
 (30, [4])

julia> collect(groupby_numbers_dict_indices([10,-20,20,30]; keyfunc=abs)); # => same result

Example 3: Groups floating point numbers

Use groupby_numbersYYYZZZ rather than groupby2YYYZZZ to make groups of floating point numbers.
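The reason is that isequal demands exact equality, which rounding errors routinely break, while isapprox tolerates small differences. A short check using only Base functions:

```julia
a = 0.1 + 0.2      # stored as 0.30000000000000004
b = 0.3

exact = isequal(a, b)     # false: the two values differ in the last bit
close = isapprox(a, b)    # true: the difference is far below the default tolerance
```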

Simple case. Compare floating point numbers by isapprox function with default parameters.

julia> collect(groupby_numbers([ 2e-10, 2e-9, 2e-8, 2e-7 ] .+ 1))
3-element Vector{Vector{Float64}}:
 [1.0000000002, 1.000000002]
 [1.00000002]
 [1.0000002]

Adjusts tolerance with atol and rtol parameters.

See the documentation of Base.isapprox for its keyword parameters, such as atol and rtol.

julia> collect(groupby_numbers([ 2e-8, 2e-7, 2e-6, 2e-5 ] .+ 1; atol=1e-6))
3-element Vector{Vector{Float64}}:
 [1.00000002, 1.0000002]
 [1.000002]
 [1.00002]

julia> collect(groupby_numbers([ 2e-6, 2e-5, 2e-4, 2e-3 ] .+ 1; rtol=1e-4))
3-element Vector{Vector{Float64}}:
 [1.000002, 1.00002]
 [1.0002]
 [1.002]
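The groupings above follow from how Base.isapprox combines atol and rtol. The sketch below checks one pair of values under three settings and compares the results with a hand-written version of the criterion. The helper close_enough is hypothetical and is a simplification that covers finite real numbers only.

```julia
x, y = 1 + 2e-8, 1 + 2e-7     # the two values differ by about 1.8e-7

default_close = isapprox(x, y)              # default rtol is sqrt(eps(Float64)), about 1.5e-8
atol_close    = isapprox(x, y; atol=1e-6)   # absolute tolerance of 1e-6
rtol_close    = isapprox(x, y; rtol=1e-4)   # relative tolerance of 1e-4

# Hypothetical helper: the criterion for finite reals, written out by hand.
# When atol > 0 is given, the default rtol drops to zero.
close_enough(x, y; atol=0.0, rtol=(atol > 0 ? 0.0 : sqrt(eps(Float64)))) =
    abs(x - y) <= max(atol, rtol * max(abs(x), abs(y)))
```

With the default tolerance the pair is not considered close, while either atol=1e-6 or rtol=1e-4 is loose enough to put both values in the same group.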

Groups by their absolute values

julia> collect(groupby_numbers([ 1+2e-6, -1+2e-5, 1+2e-4, 1-2e-3 ]; 
        keyfunc=abs, rtol=1e-4))
3-element Vector{Vector{Float64}}:
 [1.000002, -0.99998]
 [1.0002]
 [0.998]

Emits the grouped indices rather than the grouped elements.

julia> collect(groupby_numbers_indices([ 1+2e-6, -1+2e-5, 1+2e-4, 1-2e-3 ]; 
        keyfunc=abs, rtol=1e-4))
3-element Vector{Vector{Int64}}:
 [1, 2]
 [3]
 [4]

Example 4: Groups noisy vectors

Groups an array of vectors

Rotation preserves norm.

julia> using LinearAlgebra
julia> # Rotation matrix
julia> t=15; r15 = [ cosd(t) -sind(t); sind(t) cosd(t)]
2×2 Matrix{Float64}:
 0.965926  -0.258819
 0.258819   0.965926

julia> using IterTools
julia> vs1 = collect( Iterators.take(
                iterated(v -> (1+rand()*1e-8)*r15*v, [1,0]), 5) )
5-element Vector{Vector}:
 [1, 0]
 [0.9659258323666292, 0.25881904673099826]
 [0.8660254177031013, 0.5000000080359436]
 [0.7071067969544697, 0.7071067969544694]
 [0.5000000112991584, 0.8660254233551546]

julia> collect( groupby_numbers_indices(vs1;keyfunc=norm, atol=1e-6))
1-element Vector{Vector{Int64}}:
 [1, 2, 3, 4, 5]
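All five vectors fall into one group because a rotation leaves the norm unchanged and the random factor perturbs it by at most about 1e-8 per step, well below atol=1e-6. A deterministic sketch of the same idea, with a fixed factor in place of rand():

```julia
using LinearAlgebra

t = 15                                     # rotation angle in degrees
r = [cosd(t) -sind(t); sind(t) cosd(t)]    # 2×2 rotation matrix

v = [1.0, 0.0]
w = (1 + 1e-8) * r * v                     # rotate, then scale by a fixed small factor

rotation_keeps_norm = norm(r * v) ≈ norm(v)
within_atol = abs(norm(w) - norm(v)) < 1e-6
```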

Groups an array of tuples, each consisting of a vector and its norm

Compute each vector together with its norm so that the norm does not have to be recomputed.

julia> using LinearAlgebra

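julia> vs1=vec( [ begin 
            v= [i1,i2] *(1+(rand()-0.5)*1e-8);
            (v=v, n=norm(v))  # assumed missing line: pair the vector with its norm (field n is used below; field name v is a guess)
        end for i1 in -2:2, i2 in -2:2] );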

julia> vs2=sort(vs1; by=x->x.n);

julia> collect(groupby_numbers_dict_indices(vs2; keyfunc=x->x.n))
6-element Vector{Tuple{Any, Vector{Int64}}}:
 (0.0, [1])
 (0.9999999976242439, [2, 3, 4, 5])
 (1.4142135561654923, [6, 7, 8, 9])
 (1.999999991951223, [10, 11, 12, 13])
 (2.2360679691661827, [14, 15, 16, 17, 18, 19, 20, 21])
 (2.828427114159456, [22, 23, 24, 25])

