Sparse Array Utilities

Sparse Constructors

In addition to the Tensor constructor, Finch provides a number of convenience constructors for common tensor types. For example, the spzeros and sprand functions have fspzeros and fsprand counterparts that return Finch tensors. We can also construct a sparse COO Tensor from a list of indices and values using the fsparse function.

Finch.fsparse — Function

fsparse(I::Tuple, V,[ M::Tuple, combine]; fill_value=zero(eltype(V)))

Create a sparse COO tensor S such that size(S) == M and S[(i[q] for i = I)...] = V[q]. The combine function is used to combine duplicates. If M is not specified, it is set to map(maximum, I). If the combine function is not supplied, combine defaults to + unless the elements of V are Booleans in which case combine defaults to |. All elements of I must satisfy 1 <= I[n][q] <= M[n]. Numerical zeros are retained as structural nonzeros; to drop numerical zeros, use dropzeros!.

See also: sparse

Examples

julia> I = ( [1, 2, 3], [1, 2, 3], [1, 2, 3]);

julia> V = [1.0; 2.0; 3.0];

julia> fsparse(I, V) SparseCOO (0.0) [1:3×1:3×1:3] │ │ │ └─└─└─[1, 1, 1] [2, 2, 2] [3, 3, 3] 1.0 2.0 3.0

Finch.fsparse! — Function

fsparse!(I..., V,[ M::Tuple])

Like fsparse, but the coordinates must be sorted and unique, and memory is reused.

Finch.fsprand — Function

fsprand([rng],[type], M..., p, [rfn])

Create a random sparse tensor of size m in COO format. There are two cases: - If p is floating point, the probability of any element being nonzero is independently given by p (and hence the expected density of nonzeros is also p). - If p is an integer, exactly p nonzeros are distributed uniformly at random throughout the tensor (and hence the density of nonzeros is exactly p / prod(M)). Nonzero values are sampled from the distribution specified by rfn and have the type type. The uniform distribution is used in case rfn is not specified. The optional rng argument specifies a random number generator.

See also: (sprand)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.sprand)

Examples

julia> fsprand(Bool, 3, 3, 0.5)
SparseCOO (false) [1:3,1:3]
├─├─[1, 1]: true
├─├─[3, 1]: true
├─├─[2, 2]: true
├─├─[3, 2]: true
├─├─[3, 3]: true

julia> fsprand(Float64, 2, 2, 2, 0.5)
SparseCOO (0.0) [1:2,1:2,1:2]
├─├─├─[2, 2, 1]: 0.6478553157718558
├─├─├─[1, 1, 2]: 0.996665291437684
├─├─├─[2, 1, 2]: 0.7491940599574348

Finch.fspzeros — Function

fspzeros([type], M...)

Create a random zero tensor of size M, with elements of type type. The tensor is in COO format.

See also: (spzeros)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.spzeros)

Examples

julia> fspzeros(Bool, 3, 3)
3×3-Tensor
└─ SparseCOO{2} (false) [:,1:3]

julia> fspzeros(Float64, 2, 2, 2)
2×2×2-Tensor
└─ SparseCOO{3} (0.0) [:,:,1:2]

Finch.ffindnz — Function

ffindnz(arr)

Return the nonzero elements of arr, as Finch understands arr. Returns (I..., V), where I are the coordinate vectors, one for each mode of arr, and V is a vector of corresponding nonzero values, which can be passed to fsparse.

See also: (findnz)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.findnz)

Fill Values

Finch tensors support an arbitrary "background" value for sparse arrays. While most arrays use 0 as the background value, this is not always the case. For example, a sparse array of Int might use typemin(Int) as the background value. The default function returns the background value of a tensor. If you ever want to change the background value of an existing array, you can use the set_fill_value! function. The countstored function returns the number of stored elements in a tensor, and calling pattern! on a tensor returns tensor which is true whereever the original tensor stores a value. Note that countstored doesn't always return the number of non-zero elements in a tensor, as it counts the number of stored elements, and stored elements may include the background value. You can call dropfills! to remove explicitly stored background values from a tensor.

julia> A = fsparse([1, 1, 2, 3], [2, 4, 5, 6], [1.0, 2.0, 3.0])
3×6-Tensor
└─ SparseCOO{2} (0.0) [:,1:6]
   ├─ [1, 2]: 1.0
   ├─ [1, 4]: 2.0
   └─ [2, 5]: 3.0

julia> min.(A, -1)
3×6-Tensor
└─ Dense [:,1:6]
   ├─ [:, 1]: Dense [1:3]
   │  ├─ [1]: -1.0
   │  ├─ [2]: -1.0
   │  └─ [3]: -1.0
   ├─ [:, 2]: Dense [1:3]
   │  ├─ [1]: -1.0
   │  ├─ [2]: -1.0
   │  └─ [3]: -1.0
   ├─ ⋮
   ├─ [:, 5]: Dense [1:3]
   │  ├─ [1]: -1.0
   │  ├─ [2]: -1.0
   │  └─ [3]: -1.0
   └─ [:, 6]: Dense [1:3]
      ├─ [1]: -1.0
      ├─ [2]: -1.0
      └─ [3]: -1.0

julia> fill_value(A)
0.0

julia> B = set_fill_value!(A, -Inf)
3×6-Tensor
└─ SparseCOO{2} (-Inf) [:,1:6]
   ├─ [1, 2]: 1.0
   ├─ [1, 4]: 2.0
   └─ [2, 5]: 3.0

julia> min.(B, -1)
3×6-Tensor
└─ SparseDict (-Inf) [:,1:6]
   ├─ [:, 2]: SparseDict (-Inf) [1:3]
   │  └─ [1]: -1.0
   ├─ [:, 4]: SparseDict (-Inf) [1:3]
   │  └─ [1]: -1.0
   └─ [:, 5]: SparseDict (-Inf) [1:3]
      └─ [2]: -1.0

julia> countstored(A)
3

julia> pattern!(A)
3×6-Tensor
└─ SparseCOO{2} (false) [:,1:6]
   ├─ [1, 2]: true
   ├─ [1, 4]: true
   └─ [2, 5]: true

Finch.set_fill_value! — Function

set_fill_value!(fbr, init)

Return a tensor which is equal to fbr, but with the fill (implicit) value set to init. May reuse memory and render the original tensor unusable when modified.

julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10-Tensor
└─ SparseList (0.0) [1:10]
   ├─ [1]: 2.0
   ├─ [3]: 3.0
   ├─ ⋮
   ├─ [7]: 5.0
   └─ [9]: 6.0

julia> set_fill_value!(A, Inf)
10-Tensor
└─ SparseList (Inf) [1:10]
   ├─ [1]: 2.0
   ├─ [3]: 3.0
   ├─ ⋮
   ├─ [7]: 5.0
   └─ [9]: 6.0

Finch.pattern! — Function

pattern!(fbr)

Return the pattern of fbr. That is, return a tensor which is true wherever fbr is structurally unequal to its fill_value. May reuse memory and render the original tensor unusable when modified.

julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10-Tensor
└─ SparseList (0.0) [1:10]
   ├─ [1]: 2.0
   ├─ [3]: 3.0
   ├─ ⋮
   ├─ [7]: 5.0
   └─ [9]: 6.0

julia> pattern!(A)
10-Tensor
└─ SparseList (false) [1:10]
   ├─ [1]: true
   ├─ [3]: true
   ├─ ⋮
   ├─ [7]: true
   └─ [9]: true

Finch.countstored — Function

countstored(arr)

Return the number of stored elements in arr. If there are explicitly stored fill elements, they are counted too.

See also: (SparseArrays.nnz)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.nnz) and (Base.summarysize)(https://docs.julialang.org/en/v1/base/base/#Base.summarysize)

Finch.dropfills — Function

dropfills(src)

Drop the fill values from src and return a new tensor with the same shape and format.

Finch.dropfills! — Function

dropfills!(dst, src)

Copy only the non- fill values from src into dst. The shape and format of dst must match src

How to tell whether an entry is "fill"

In the sparse world, a semantic distinction is sometimes made between "explicitly stored" values and "implicit" or "fill" values (usually zero). However, the formats in the Finch compiler represent a diverse set of structures beyond sparsity, and it is often unclear whether any of the values in the tensor are "explicit" (consider a mask matrix, which can be represented with a constant number of bits). Thus, Finch makes no semantic distinction between values which are stored explicitly or not. If users wish to make this distinction, they should instead store a tensor of tuples of the form (value, is_fill). For example,

julia> A = fsparse([1, 1, 2, 3], [2, 4, 5, 6], [(1.0, false), (0.0, true), (3.0, false)]; fill_value=(0.0, true))
3×6-Tensor
└─ SparseCOO{2} ((0.0, true)) [:,1:6]
   ├─ [1, 2]: (1.0, false)
   ├─ [1, 4]: (0.0, true)
   └─ [2, 5]: (3.0, false)

julia> B = Tensor(Dense(SparseList(Element((0.0, true)))), A)
3×6-Tensor
└─ Dense [:,1:6]
   ├─ [:, 1]: SparseList ((0.0, true)) [1:3]
   ├─ [:, 2]: SparseList ((0.0, true)) [1:3]
   │  └─ [1]: (1.0, false)
   ├─ ⋮
   ├─ [:, 5]: SparseList ((0.0, true)) [1:3]
   │  └─ [2]: (3.0, false)
   └─ [:, 6]: SparseList ((0.0, true)) [1:3]

julia> sum(map(last, B))
16

julia> sum(map(first, B))
4.0