Performance of not

Optimizations for not with special index types

For faster indexing, this package provides optimizations for not with some special index types, which means not(x) is not equivalent to FI(!in(x)) for x belonging to those index types, and will be converted to different index types by to_indices function.

There are two list of those index types, both of which are enabled for not, but for customized NotIndex types, optimizations in second list is enabled when indextype returns Vector{Int}.

There optimizations are enabled for any AbstractNotIndex types by default:

  • x::Colon will be converted to an empty Int array: Int[];
  • x::AbstractArray{Bool} will be converted to a LogicalIndex with mask mappedarray(!, x);
  • x::AbstractArray will be converted like FI(!in(x′)), while x′ is a Set like array converted from x with faster in;
  • For I′, J′ = to_indices(A, (not(I), not(J))), not(I′) and not(I′) will revert to I, J.

There optimizations are enabled only if indextype is defined as Vector{Int}:

  • x::Integer or x::OrdinalRange{<:Integer} will be converted to an Int array where x is removed from given axis.
  • x::Base.Slice will be converted to an empty Int array, when the slice represents the given axis, Otherwise, it will be treated as a normal AbstractUnitRange.

Performance tips for not

For small array, the optimized not(x) might be slower in some case, because of the overhead for creating a Set, see below.

There are some tips for better performance:

  • Use FI(!in(x)) instead of not(x).
  • Create your own "Not" type, see below example for details.
  • For a small array of indices, not(1, 2, 3) will faster than not([1, 2, 3]).

Performance comparing

This is a performance comparing of "Inverted Indexing" for different methods. There are five methods:

  • bynot: by not of this package;
  • byfi: by function index FI(!in(I));
  • bymap: by logical indices which test map(!in(I), axis);
  • byfilter: by removing I from axes by filter;
  • byNot: by InvertedIndices.Not.

The minimum time and allocation of each method for different type of index were compared:

using BenchmarkTools
using InvertedIndices
using FunctionIndices
using Latexify

# Linear
bynot(A, I) = A[not(I)]
byfi(A, I) = A[FI(!in(I))]
bymap(A, I) = A[map(!in(I), begin:end)]
byfilter(A, I) = A[filter(!in(I), begin:end)]
byNot(A, I) = A[Not(I)]

# Cartesian
bynot(A, Is...) = A[ntuple(i -> not(Is[i]), Val(length(Is)))...]
byfi(A, Is...) = A[ntuple(i -> FI(!in(Is[i])), Val(length(Is)))...]
bymap(A, Is...) = A[ntuple(i -> map(!in(Is[i]), axes(A, i)), Val(length(Is)))...]
byfilter(A, Is...) = A[ntuple(i -> filter(!in(Is[i]), axes(A, i)), Val(length(Is)))...]
byNot(A, Is...) = A[ntuple(i -> Not(Is[i]), Val(length(Is)))...]

const As = (rand(10), rand(10, 10), rand(10, 10, 10))
const fs = (bynot, byfi, bymap, byfilter, byNot)

# convert tuple of tuple to matrix
tt2mt(trs) = hcat(map(tr -> vcat(tr...), trs)...)

maketable(bench, As::Tuple=As, fs::Tuple=fs) = mdtable(
    # use map of map instead of for loop for typle stable
    map(
        f -> map(
            A -> begin
                trial = bench(f, A)
                trialmin = minimum(trial)
                trialallocs = allocs(trialmin)
                string(
                    BenchmarkTools.prettytime(time(trialmin)),
                    " (", trialallocs , " alloc",
                    trialallocs == 1 ? "" : "s", ": ",
                    BenchmarkTools.prettymemory(memory(trialmin)), ")"
                )
            end,
            As
        ),
        fs
    ) |> tt2mt;
    head=fs, side=[:Size; size.(As)...], latex=false,
);
maketable (generic function with 3 methods)

Indexing with Int

A random inbounds Int index.

Linear Indexing

maketable() do f, A
    ind = rand(firstindex(A):lastindex(A))
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)174.605 ns (2 allocs: 256 bytes)335.090 ns (1 alloc: 128 bytes)385.764 ns (2 allocs: 192 bytes)396.527 ns (3 allocs: 400 bytes)1.700 μs (49 allocs: 1.59 KiB)
(10, 10)1.281 μs (2 allocs: 1.75 KiB)2.995 μs (1 alloc: 896 bytes)3.305 μs (2 allocs: 1.03 KiB)3.165 μs (3 allocs: 2.62 KiB)19.534 μs (599 allocs: 19.55 KiB)
(10, 10, 10)12.107 μs (2 allocs: 15.88 KiB)29.325 μs (1 alloc: 7.94 KiB)32.139 μs (2 allocs: 9.00 KiB)30.789 μs (3 allocs: 23.81 KiB)182.693 μs (6427 allocs: 193.44 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(i -> rand(axes(A, i)), Val(ndims(A)))
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)176.703 ns (2 allocs: 256 bytes)334.455 ns (1 alloc: 128 bytes)382.307 ns (2 allocs: 192 bytes)399.000 ns (3 allocs: 400 bytes)1.843 μs (52 allocs: 1.69 KiB)
(10, 10)809.337 ns (3 allocs: 992 bytes)1.791 μs (1 alloc: 736 bytes)2.287 μs (3 allocs: 864 bytes)1.215 μs (5 allocs: 1.25 KiB)16.451 μs (472 allocs: 15.12 KiB)
(10, 10, 10)6.503 μs (11 allocs: 6.41 KiB)15.551 μs (8 allocs: 6.03 KiB)18.916 μs (8 allocs: 6.19 KiB)7.057 μs (11 allocs: 6.80 KiB)163.959 μs (4802 allocs: 153.20 KiB)

Indexing with UnitRange

A random inbound UnitRange with half-length of axe.

Linear Indexing

maketable() do f, A
    axe = firstindex(A):lastindex(A)
    half = length(axe) ÷ 2
    b = rand(axe[begin:(end - half)])
    e = b + half
    @benchmark $f($A, $(b:e))
end
SizebynotbyfibymapbyfilterbyNot
(10,)127.008 ns (2 allocs: 192 bytes)263.350 ns (1 alloc: 96 bytes)356.157 ns (2 allocs: 160 bytes)297.233 ns (3 allocs: 336 bytes)1.048 μs (25 allocs: 1008 bytes)
(10, 10)640.226 ns (2 allocs: 896 bytes)2.986 μs (1 alloc: 448 bytes)2.963 μs (2 allocs: 608 bytes)2.084 μs (3 allocs: 1.75 KiB)11.348 μs (292 allocs: 11.64 KiB)
(10, 10, 10)6.120 μs (2 allocs: 8.12 KiB)28.841 μs (1 alloc: 4.06 KiB)28.781 μs (2 allocs: 5.12 KiB)20.245 μs (3 allocs: 16.06 KiB)109.371 μs (3397 allocs: 102.25 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            half = length(axe) ÷ 2
            b = rand(axe[begin:(end - half)])
            e = b + half
            b:e
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)126.175 ns (2 allocs: 192 bytes)333.680 ns (1 alloc: 96 bytes)354.427 ns (2 allocs: 160 bytes)296.123 ns (3 allocs: 336 bytes)1.001 μs (22 allocs: 768 bytes)
(10, 10)298.614 ns (3 allocs: 384 bytes)1.046 μs (1 alloc: 192 bytes)1.119 μs (3 allocs: 320 bytes)622.227 ns (5 allocs: 672 bytes)4.791 μs (113 allocs: 3.89 KiB)
(10, 10, 10)1.330 μs (14 allocs: 1.23 KiB)3.948 μs (11 allocs: 976 bytes)4.675 μs (11 allocs: 1.02 KiB)1.780 μs (14 allocs: 1.53 KiB)21.266 μs (505 allocs: 18.42 KiB)

Indexing with StepRange

A StepRange with step 2.

Linear Indexing

maketable() do f, A
    ind = firstindex(A):2:lastindex(A)
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)220.190 ns (2 allocs: 192 bytes)667.873 ns (1 alloc: 96 bytes)517.518 ns (2 allocs: 160 bytes)561.768 ns (3 allocs: 336 bytes)1.191 μs (31 allocs: 1.31 KiB)
(10, 10)1.360 μs (2 allocs: 992 bytes)6.398 μs (1 alloc: 496 bytes)4.500 μs (2 allocs: 656 bytes)4.740 μs (3 allocs: 1.84 KiB)11.047 μs (301 allocs: 12.95 KiB)
(10, 10, 10)12.924 μs (2 allocs: 8.12 KiB)63.576 μs (1 alloc: 4.06 KiB)44.218 μs (2 allocs: 5.12 KiB)46.829 μs (3 allocs: 16.06 KiB)110.593 μs (3491 allocs: 136.69 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            axe[begin:2:end]
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)216.643 ns (2 allocs: 192 bytes)665.627 ns (1 alloc: 96 bytes)515.312 ns (2 allocs: 160 bytes)559.757 ns (3 allocs: 336 bytes)1.213 μs (31 allocs: 1.31 KiB)
(10, 10)540.995 ns (3 allocs: 448 bytes)2.541 μs (1 alloc: 256 bytes)1.721 μs (3 allocs: 384 bytes)1.226 μs (5 allocs: 736 bytes)6.862 μs (181 allocs: 7.56 KiB)
(10, 10, 10)2.111 μs (14 allocs: 1.78 KiB)11.183 μs (11 allocs: 1.50 KiB)7.096 μs (11 allocs: 1.56 KiB)3.100 μs (14 allocs: 2.08 KiB)35.594 μs (950 allocs: 39.73 KiB)

Indexing with Vector{Int}

A Vector{Int} which is a collected StepRange.

Linear Indexing

maketable() do f, A
    ind = collect(firstindex(A):2:lastindex(A))
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)571.088 ns (5 allocs: 496 bytes)615.329 ns (1 alloc: 96 bytes)519.953 ns (2 allocs: 160 bytes)456.036 ns (3 allocs: 336 bytes)1.239 μs (31 allocs: 1.22 KiB)
(10, 10)4.561 μs (8 allocs: 2.09 KiB)42.285 μs (1 alloc: 496 bytes)22.534 μs (2 allocs: 656 bytes)21.897 μs (3 allocs: 1.84 KiB)11.546 μs (301 allocs: 12.16 KiB)
(10, 10, 10)46.122 μs (8 allocs: 13.59 KiB)3.934 ms (1 alloc: 4.06 KiB)1.979 ms (2 allocs: 5.12 KiB)1.972 ms (3 allocs: 16.06 KiB)116.570 μs (3491 allocs: 128.86 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            collect(axe[begin:2:end])
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)573.082 ns (5 allocs: 496 bytes)611.282 ns (1 alloc: 96 bytes)504.497 ns (2 allocs: 160 bytes)459.756 ns (3 allocs: 336 bytes)1.227 μs (31 allocs: 1.22 KiB)
(10, 10)1.836 μs (9 allocs: 1.03 KiB)2.479 μs (1 alloc: 256 bytes)1.720 μs (3 allocs: 384 bytes)1.027 μs (5 allocs: 736 bytes)7.003 μs (181 allocs: 7.00 KiB)
(10, 10, 10)7.354 μs (20 allocs: 2.39 KiB)11.054 μs (8 allocs: 1.22 KiB)7.030 μs (8 allocs: 1.38 KiB)2.765 μs (11 allocs: 1.89 KiB)35.969 μs (950 allocs: 36.53 KiB)