GroupNumbers
Installation
Install this package with Pkg.add("GroupNumbers")
Description
groupby2YYYZZZ(xs; keyfunc=ident, compare=isequal)
groupby_numbersYYYZZZ(xs; keyfunc=ident, compare=isapprox, kwargs)
Here, "YYY" = "" or "_dict", and "ZZZ" = "" or "_indices".
A family of iterators for grouping adjacent elements of the given iterator xs.
The keyfunc function is applied to each element of xs to compute the key used for comparison. By default, keyfunc is ident, so the key is the element itself.
Adjacent keys are compared with the compare function. The groupby2YYYZZZ family uses isequal as the default compare function, whereas the groupby_numbersYYYZZZ family uses isapprox. For the latter, the accompanying kwargs are passed to the optional keyword parameters of this default isapprox function, which allows control of the tolerance.
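As a rough illustration of the grouping rule described above, the following sketch uses only Base Julia. The function adjacent_groups is a hypothetical helper and not part of GroupNumbers. It builds the groups eagerly instead of returning an iterator, uses Base's identity as the default key function, and assumes that each key is compared with the key of the immediately preceding element.

```julia
# Hypothetical helper, NOT the GroupNumbers implementation.
# Groups adjacent elements of xs whose keys compare as equal.
function adjacent_groups(xs; keyfunc=identity, compare=isequal)
    groups = Vector{Vector{eltype(xs)}}()
    prevkey = nothing
    for x in xs
        k = keyfunc(x)
        if isempty(groups) || !compare(prevkey, k)
            push!(groups, [x])       # key changed: start a new group
        else
            push!(groups[end], x)    # same key: extend the current group
        end
        prevkey = k                  # assumption: compare against the previous key
    end
    return groups
end

chars = adjacent_groups("AAAABBBCCD")
byabs = adjacent_groups([10, -20, 20, 30]; keyfunc=abs)
close = adjacent_groups([1.0, 1.0 + 1e-10, 2.0];
                        compare=(a, b) -> isapprox(a, b; atol=1e-6))
```

Passing a closure around isapprox as compare stands in for the way the groupby_numbersYYYZZZ family forwards its kwargs to isapprox.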
The plain iterators ("ZZZ" = "") emit the grouped elements, whereas the _indices variants ("ZZZ" = "_indices") emit the indices of the grouped elements.
The plain iterators ("YYY" = "") emit only the grouped elements or their indices, whereas the _dict variants ("YYY" = "_dict") also emit the first key of each group.
compare function | emits the grouped elements | emits the grouped indices | also emits key
---|---|---|---
isequal | groupby2 | groupby2_indices | no
isequal | groupby2_dict | groupby2_dict_indices | yes
isapprox | groupby_numbers | groupby_numbers_indices | no
isapprox | groupby_numbers_dict | groupby_numbers_dict_indices | yes
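To make the difference between the variants concrete, here is a minimal Base-only sketch of the kind of output the _dict_indices variants produce: pairs of the first key of a group and the indices of its members. The name adjacent_index_groups is hypothetical and not part of GroupNumbers. The sketch is eager, not lazy, and assumes that adjacent keys are compared pairwise.

```julia
# Hypothetical helper, NOT the GroupNumbers implementation.
# Returns (first key, indices) pairs for runs of adjacent elements with equal keys.
function adjacent_index_groups(xs; keyfunc=identity, compare=isequal)
    out = Vector{Tuple{Any, Vector{Int}}}()
    prevkey = nothing
    for (i, x) in enumerate(xs)
        k = keyfunc(x)
        if isempty(out) || !compare(prevkey, k)
            push!(out, (k, [i]))     # new group: remember its first key
        else
            push!(out[end][2], i)    # same group: record the position only
        end
        prevkey = k
    end
    return out
end

pairs_ci = adjacent_index_groups("AaAABbBcCD"; keyfunc=uppercase)
```

Dropping the key from each pair corresponds to the _indices variants, and collecting the elements instead of their positions corresponds to the _dict variants.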
Examples
Example 1: Groups characters in a string
Simple case
julia> collect(groupby2("AAAABBBCCD"))
4-element Vector{Vector{Char}}:
['A', 'A', 'A', 'A']
['B', 'B', 'B']
['C', 'C']
['D']
Emits keys
julia> collect(groupby2_dict("AAAABBBCCD"))
4-element Vector{Tuple{Any, Vector{Char}}}:
('A', ['A', 'A', 'A', 'A'])
('B', ['B', 'B', 'B'])
('C', ['C', 'C'])
('D', ['D'])
Groups case-insensitively
julia> collect(groupby2_dict("AaAABbBcCD", keyfunc=uppercase))
4-element Vector{Tuple{Any, Vector{Char}}}:
('A', ['A', 'a', 'A', 'A'])
('B', ['B', 'b', 'B'])
('C', ['c', 'C'])
('D', ['D'])
Groups case-insensitively. Emits the grouped indices rather than the grouped elements.
julia> collect(groupby2_dict_indices("AaAABbBcCD", keyfunc=uppercase))
4-element Vector{Tuple{Any, Vector{Int64}}}:
('A', [1, 2, 3, 4])
('B', [5, 6, 7])
('C', [8, 9])
('D', [10])
Example 2: Groups integer numbers
Simple case
julia> collect(groupby2([10,20,20,30]))
3-element Vector{Vector{Int64}}:
[10]
[20, 20]
[30]
julia> collect(groupby_numbers([10,20,20,30])); # => same result
Emits keys
julia> collect(groupby2_dict([10,20,20,30]))
3-element Vector{Tuple{Any, Vector{Int64}}}:
(10, [10])
(20, [20, 20])
(30, [30])
julia> collect(groupby_numbers_dict([10,20,20,30])); # => same result
Groups by absolute values
julia> collect(groupby2_dict([10,-20,20,30]; keyfunc=abs))
3-element Vector{Tuple{Any, Vector{Int64}}}:
(10, [10])
(20, [-20, 20])
(30, [30])
julia> collect(groupby_numbers_dict([10,-20,20,30]; keyfunc=abs)); # => same result
Groups by absolute values. Emits the grouped indices rather than the grouped elements.
julia> collect(groupby2_dict_indices([10,-20,20,30]; keyfunc=abs))
3-element Vector{Tuple{Any, Vector{Int64}}}:
(10, [1])
(20, [2, 3])
(30, [4])
julia> collect(groupby_numbers_dict_indices([10,-20,20,30]; keyfunc=abs)); # => same result
Example 3: Groups floating point numbers
Use groupby_numbersYYYZZZ rather than groupby2YYYZZZ to make groups of floating point numbers.
Simple case. Compares floating point numbers with the isapprox function and its default parameters.
julia> collect(groupby_numbers([ 2e-10, 2e-9, 2e-8, 2e-7 ] .+ 1))
3-element Vector{Vector{Float64}}:
[1.0000000002, 1.000000002]
[1.00000002]
[1.0000002]
Adjusts the tolerance with the atol and rtol parameters.
Consult the manual of Base.isapprox for its keyword parameters, such as atol and rtol.
julia> collect(groupby_numbers([ 2e-8, 2e-7, 2e-6, 2e-5 ] .+ 1; atol=1e-6))
3-element Vector{Vector{Float64}}:
[1.00000002, 1.0000002]
[1.000002]
[1.00002]
julia> collect(groupby_numbers([ 2e-6, 2e-5, 2e-4, 2e-3 ] .+ 1; rtol=1e-4))
3-element Vector{Vector{Float64}}:
[1.000002, 1.00002]
[1.0002]
[1.002]
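The effect of the tolerance keywords can be checked directly with Base.isapprox, without GroupNumbers. The values a and b below are illustrative and do not come from the package. For Float64 arguments, the default rtol is sqrt(eps(Float64)), roughly 1.5e-8, when no atol is given.

```julia
# Two values that differ by about 2e-7.
a, b = 1.0, 1.0 + 2e-7

default_match = isapprox(a, b)              # false: 2e-7 exceeds the default relative tolerance
atol_match    = isapprox(a, b; atol=1e-6)   # true: the difference is below the absolute tolerance
rtol_match    = isapprox(a, b; rtol=1e-6)   # true: the difference is below 1e-6 * max(|a|, |b|)
```

Loosening either tolerance makes neighbouring values more likely to compare as equal, which merges them into larger groups.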
Groups by their absolute values
julia> collect(groupby_numbers([ 1+2e-6, -1+2e-5, 1+2e-4, 1-2e-3 ];
keyfunc=abs, rtol=1e-4))
3-element Vector{Vector{Float64}}:
[1.000002, -0.99998]
[1.0002]
[0.998]
Emits the grouped indices rather than the grouped elements.
julia> collect(groupby_numbers_indices([ 1+2e-6, -1+2e-5, 1+2e-4, 1-2e-3 ];
keyfunc=abs, rtol=1e-4))
3-element Vector{Vector{Int64}}:
[1, 2]
[3]
[4]
Example 4: Groups noisy vectors
Groups an array of vectors
Rotation preserves the norm, so the rotated vectors below fall into a single group when keyed by norm.
julia> using LinearAlgebra
julia> # Rotation matrix
julia> t=15; r15 = [ cosd(t) -sind(t); sind(t) cosd(t)]
2×2 Matrix{Float64}:
0.965926 -0.258819
0.258819 0.965926
julia> using IterTools
julia> vs1 = collect( Iterators.take(
iterated(v -> (1+rand()*1e-8)*r15*v, [1,0]), 5) )
5-element Vector{Vector}:
[1, 0]
[0.9659258323666292, 0.25881904673099826]
[0.8660254177031013, 0.5000000080359436]
[0.7071067969544697, 0.7071067969544694]
[0.5000000112991584, 0.8660254233551546]
julia> collect( groupby_numbers_indices(vs1;keyfunc=norm, atol=1e-6))
1-element Vector{Vector{Int64}}:
[1, 2, 3, 4, 5]
Groups an array of named tuples, each consisting of a vector and its norm
Compute each norm once and store it alongside its vector, to avoid recalculating it.
julia> using LinearAlgebra
julia> vs1=vec( [ begin
v= [i1,i2] *(1+(rand()-0.5)*1e-8);
(v=v,n=norm(v))
end for i1 in -2:2, i2 in -2:2] );
julia> vs2=sort(vs1; by=x->x.n);
julia> collect(groupby_numbers_dict_indices(vs2; keyfunc=x->x.n))
6-element Vector{Tuple{Any, Vector{Int64}}}:
(0.0, [1])
(0.9999999976242439, [2, 3, 4, 5])
(1.4142135561654923, [6, 7, 8, 9])
(1.999999991951223, [10, 11, 12, 13])
(2.2360679691661827, [14, 15, 16, 17, 18, 19, 20, 21])
(2.828427114159456, [22, 23, 24, 25])