Constructing Tensors

You can build a finch tensor with the Tensor constructor. In general, the Tensor constructor mirrors Julia's Array constructor, but with an additional prefixed argument which specifies the formatted storage for the tensor.

For example, to construct an empty sparse matrix:

julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), 4, 3)
Dense [:,1:3]
├─ [:, 1]: SparseList (0.0) [1:4]
├─ [:, 2]: SparseList (0.0) [1:4]
└─ [:, 3]: SparseList (0.0) [1:4]

To initialize a sparse matrix with some values:

julia> A = [0.0 0.0 4.4; 1.1 0.0 0.0; 2.2 0.0 5.5; 3.3 0.0 0.0]
4×3 Matrix{Float64}:
 0.0  0.0  4.4
 1.1  0.0  0.0
 2.2  0.0  5.5
 3.3  0.0  0.0

julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), A)
Dense [:,1:3]
├─ [:, 1]: SparseList (0.0) [1:4]
│  ├─ [2]: 1.1
│  ├─ [3]: 2.2
│  └─ [4]: 3.3
├─ [:, 2]: SparseList (0.0) [1:4]
└─ [:, 3]: SparseList (0.0) [1:4]
   ├─ [1]: 4.4
   └─ [3]: 5.5

Storage Tree Level Formats

This section describes the formatted storage for Finch tensors, the first argument to the Tensor constructor. Level storage types holds all of the tensor data, and can be nested hierarchichally.

Finch represents tensors hierarchically in a tree, where each node in the tree is a vector of subtensors and the leaves are the elements. Thus, a matrix is analogous to a vector of vectors, and a 3-tensor is analogous to a vector of vectors of vectors. The vectors at each level of the tensor all have the same structure, which can be selected by the user.

In a Finch tensor tree, the child of each node is selected by an array index. All of the children at the same level will use the same format and share the same storage. Finch is column major, so in an expression A[i_1, ..., i_N], the rightmost dimension i_N corresponds to the root level of the tree, and the leftmost dimension i_1 corresponds to the leaf level.

Our example could be visualized as follows:

CSC Format Index Tree

Types of Level Storage

Finch supports a variety of storage formats for each level of the tensor tree, each with advantages and disadvantages. Some storage formats support in-order access, while others support random access. Some storage formats must be written to in column-major order, while others support out-of-order writes. The capabilities of each level are summarized in the following tables along with some general descriptions.

Level Format NameGroupData CharacteristicColumn-Major ReadsRandom ReadsColumn-Major Bulk UpdateRandom Bulk UpdateRandom UpdatesStatus
DenseCoreDense
SparseTreeCoreSparse⚙️
SparseRunTreeCoreSparse Runs⚙️
ElementCoreLeaf
PatternCoreLeaf
SparseListAdvancedSparse
SparseRunListAdvancedSparse Runs
SparseVBLAdvancedSparse Blocks
RepeatedListAdvancedDense Runs
SingleSparseAdvancedSparse
SingleSparseRunAdvancedSparse Runs
SingleBlockAdvancedSparse Blocks⚙️
SparseBytemapAdvancedSparse
SparseDictAdvancedSparse✅️
AtomicLevelModifierNo Data⚙️
SeperationLevelModifierNo Data⚙️
SparseCOOLegacySparse✅️
SparseHashLegacySparse🕸️

The "Level Format Name" is the name of the level datatype. Other columns have descriptions below.

Status

SymbolMeaning
Indicates the level is ready for serious use.
⚙️Indicates the level is experimental and under development.
🕸️Indicates the level is deprecated, and may be removed in a future release.

Groups

Core Group

Contains the basic, minimal set of levels one should use to build and manipulate tensors. These levels can be efficiently read and written to in any order.

Advanced Group

Contains levels which are more specialized, and geared towards bulk updates. These levels may be more efficient in certain cases, but are also more restrictive about access orders and intended for more advanced usage.

Modifier Group

Contains levels which are also more specialized, but not towards a sparsity pattern. These levels modify other levels in a variety of ways, but don't store novel sparsity patterns. Typically, they modify how levels are stored or attach data to levels to support the utilization of various hardware features.

Legacy Group

Contains levels which are not recommended for new code, but are included for compatibility with older code.

Data Characteristics

Level TypeDescription
DenseLevels which store every subtensor.
LeafLevels which store only scalars, used for the leaf level of the tree.
SparseLevels which store only non-fill values, used for levels with few nonzeros.
Sparse RunsLevels which store runs of repeated non-fill values.
Sparse BlocksLevels which store Blocks of repeated non-fill values.
Dense RunsLevels which store runs of repeated values, and no compile-time zero annihilation.
No DataLevels which don't store data but which alter the storage pattern or attach additional meta-data.

Note that the Single sparse levels store a single instance of each nonzero, run, or block. These are useful with a parent level to represent IDs.

Access Characteristics

Operation TypeDescription
Column-Major ReadsIndicates efficient reading of data in column-major order.
Random ReadsIndicates efficient reading of data in random-access order.
Column-Major Bulk UpdateIndicates efficient writing of data in column-major order, the total time roughly linear to the size of the tensor.
Column-Major Random UpdateIndicates efficient writing of data in random-access order, the total time roughly linear to the size of the tensor.
Random UpdateIndicates efficient writing of data in random-access order, the total time roughly linear to the number of updates.

Examples of Popular Formats in Finch

Finch levels can be used to construct a variety of popular sparse formats. A few examples follow:

Format TypeSyntax
Sparse VectorTensor(SparseList(Element(0.0)), args...)
CSC MatrixTensor(Dense(SparseList(Element(0.0))), args...)
CSF 3-TensorTensor(Dense(SparseList(SparseList(Element(0.0)))), args...)
DCSC (Hypersparse) MatrixTensor(SparseList(SparseList(Element(0.0))), args...)
COO MatrixTensor(SparseCOO{2}(Element(0.0)), args...)
COO 3-TensorTensor(SparseCOO{3}(Element(0.0)), args...)
Dictionary-Of-KeysTensor(SparseHash{2}(Element(0.0)), args...)
Run-Length-Encoded ImageTensor(Dense(RepeatedRLE(Element(0.0))), args...)

Tensor Constructors

Finch.TensorType
Tensor{Lvl} <: AbstractFiber{Lvl}

The multidimensional array type used by Finch. Tensor is a thin wrapper around the hierarchical level storage of type Lvl.

Finch.TensorMethod
Tensor(lvl)

Construct a Tensor using the tensor level storage lvl. No initialization of storage is performed, it is assumed that position 1 of lvl corresponds to a valid tensor, and lvl will be wrapped as-is. Call a different constructor to initialize the storage.

Finch.TensorMethod
Tensor(lvl, arr)

Construct a Tensor and initialize it to the contents of arr. To explicitly copy into a tensor, use @ref[copyto!]

Finch.TensorMethod
Tensor(lvl, arr)

Construct a Tensor and initialize it to the contents of arr. To explicitly copy into a tensor, use @ref[copyto!]

Finch.TensorMethod
Tensor(arr, [init = zero(eltype(arr))])

Copy an array-like object arr into a corresponding, similar Tensor datastructure. Uses init as an initial value. May reuse memory when possible. To explicitly copy into a tensor, use @ref[copyto!].

Examples

julia> println(summary(Tensor(sparse([1 0; 0 1]))))
2×2 Tensor(Dense(SparseList(Element(0))))

julia> println(summary(Tensor(ones(3, 2, 4))))
3×2×4 Tensor(Dense(Dense(Dense(Element(0.0)))))

Level Constructors

Core Levels

Finch.DenseLevelType
DenseLevel{[Ti=Int]}(lvl, [dim])

A subfiber of a dense level is an array which stores every slice A[:, ..., :, i] as a distinct subfiber in lvl. Optionally, dim is the size of the last dimension. Ti is the type of the indices used to index the level.

julia> ndims(Tensor(Dense(Element(0.0))))
1

julia> ndims(Tensor(Dense(Dense(Element(0.0)))))
2

julia> Tensor(Dense(Dense(Element(0.0))), [1 2; 3 4])
Dense [:,1:2]
├─ [:, 1]: Dense [1:2]
│  ├─ [1]: 1.0
│  └─ [2]: 3.0
└─ [:, 2]: Dense [1:2]
   ├─ [1]: 2.0
   └─ [2]: 4.0
Finch.ElementLevelType
ElementLevel{D, [Tv=typeof(D)], [Tp=Int], [Val]}()

A subfiber of an element level is a scalar of type Tv, initialized to D. D may optionally be given as the first argument.

The data is stored in a vector of type Val with eltype(Val) = Tv. The type Ti is the index type used to access Val.

julia> Tensor(Dense(Element(0.0)), [1, 2, 3])
Dense [1:3]
├─ [1]: 1.0
├─ [2]: 2.0
└─ [3]: 3.0
Finch.PatternLevelType
PatternLevel{[Tp=Int]}()

A subfiber of a pattern level is the Boolean value true, but it's default is false. PatternLevels are used to create tensors that represent which values are stored by other fibers. See pattern! for usage examples.

julia> Tensor(Dense(Pattern()), 3)
Dense [1:3]
├─ [1]: true
├─ [2]: true
└─ [3]: true

Advanced Levels

Finch.SparseListLevelType
SparseListLevel{[Ti=Int], [Ptr, Idx]}(lvl, [dim])

A subfiber of a sparse level does not need to represent slices A[:, ..., :, i] which are entirely default. Instead, only potentially non-default slices are stored as subfibers in lvl. A sorted list is used to record which slices are stored. Optionally, dim is the size of the last dimension.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr and Idx are the types of the arrays used to store positions and indicies.

julia> Tensor(Dense(SparseList(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─ [:, 1]: SparseList (0.0) [1:3]
│  ├─ [1]: 10.0
│  └─ [2]: 30.0
├─ [:, 2]: SparseList (0.0) [1:3]
└─ [:, 3]: SparseList (0.0) [1:3]
   ├─ [1]: 20.0
   └─ [3]: 40.0

julia> Tensor(SparseList(SparseList(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
SparseList (0.0) [:,1:3]
├─ [:, 1]: SparseList (0.0) [1:3]
│  ├─ [1]: 10.0
│  └─ [2]: 30.0
└─ [:, 3]: SparseList (0.0) [1:3]
   ├─ [1]: 20.0
   └─ [3]: 40.0
Finch.SparseByteMapLevelType
SparseByteMapLevel{[Ti=Int], [Ptr, Tbl]}(lvl, [dims])

Like the SparseListLevel, but a dense bitmap is used to encode which slices are stored. This allows the ByteMap level to support random access.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level.

julia> Tensor(Dense(SparseByteMap(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─ [:, 1]: SparseByteMap (0.0) [1:3]
│  ├─ [1]: 10.0
│  └─ [2]: 30.0
├─ [:, 2]: SparseByteMap (0.0) [1:3]
└─ [:, 3]: SparseByteMap (0.0) [1:3]
   ├─ [1]: 0.0
   └─ [3]: 0.0

julia> Tensor(SparseByteMap(SparseByteMap(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
SparseByteMap (0.0) [:,1:3]
├─ [:, 1]: SparseByteMap (0.0) [1:3]
│  ├─ [1]: 10.0
│  └─ [2]: 30.0
└─ [:, 3]: SparseByteMap (0.0) [1:3]
Finch.RepeatRLELevelType
RepeatRLELevel{[D], [Ti=Int], [Tp=Int], [Tv=typeof(D)]}([dim])

A subfiber of a repeat level is a vector that only stores contiguous repeated values once. The RepeatRLELevel records locations of repeats using a sorted list. Optionally, dim is the size of the vectors.

The fibers have type Tv, initialized to D. D may optionally be given as the first argument. Ti is the type of the last tensor index, and Tp is the type used for positions in the level.

julia> Tensor(RepeatRLE(0.0), [11, 11, 22, 22, 00, 00, 00, 33, 33])
RepeatRLE [1:9]
├─ [1:2]: 11.0
├─ [3:4]: 22.0
├─ [5:7]: 0.0
├─ [8:9]: 33.0
└─ [10:9]: 0.0
Finch.SparseRLELevelType
SparseRLELevel{[Ti=Int], [Ptr, Left, Right]}(lvl, [dim])

The sparse RLE level represent runs of equivalent slices A[:, ..., :, i] which are not entirely default. A sorted list is used to record the left and right endpoints of each run. Optionally, dim is the size of the last dimension.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr, Left, and Right are the types of the arrays used to store positions and endpoints.

julia> Tensor(Dense(SparseRLELevel(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─ [:, 1]: SparseRLE (0.0) [1:3]
│  ├─ [1:1]: 10.0
│  └─ [2:2]: 30.0
├─ [:, 2]: SparseRLE (0.0) [1:3]
└─ [:, 3]: SparseRLE (0.0) [1:3]
   ├─ [1:1]: 20.0
   └─ [3:3]: 40.0
Finch.SparseVBLLevelType

SparseVBLLevel{[Ti=Int], [Ptr, Idx, Ofs]}(lvl, [dim])

Like the SparseListLevel, but contiguous subfibers are stored together in blocks.

```jldoctest julia> Tensor(Dense(SparseVBL(Element(0.0))), [10 0 20; 30 0 0; 0 0 40]) Dense [:,1:3] ├─[:,1]: SparseList (0.0) [1:3] │ ├─[1]: 10.0 │ ├─[2]: 30.0 ├─[:,2]: SparseList (0.0) [1:3] ├─[:,3]: SparseList (0.0) [1:3] │ ├─[1]: 20.0 │ ├─[3]: 40.0

julia> Tensor(SparseVBL(SparseVBL(Element(0.0))), [10 0 20; 30 0 0; 0 0 40]) SparseList (0.0) [:,1:3] ├─[:,1]: SparseList (0.0) [1:3] │ ├─[1]: 10.0 │ ├─[2]: 30.0 ├─[:,3]: SparseList (0.0) [1:3] │ ├─[1]: 20.0 │ ├─[3]: 40.0

Modifier Levels

Finch.SeparateLevelType
SeparateLevel{Lvl, [Val]}()

A subfiber of a Separate level is a separate tensor of type Lvl, in it's own memory space.

Each sublevel is stored in a vector of type Val with eltype(Val) = Lvl.

julia> Tensor(Dense(Separate(Element(0.0))), [1, 2, 3])
Dense [1:3]
├─ [1]: Pointer ->
│  └─ 1.0
├─ [2]: Pointer ->
│  └─ 2.0
└─ [3]: Pointer ->
   └─ 3.0

Legacy Levels

Finch.SparseCOOLevelType
SparseCOOLevel{[N], [TI=Tuple{Int...}], [Ptr, Tbl]}(lvl, [dims])

A subfiber of a sparse level does not need to represent slices which are entirely default. Instead, only potentially non-default slices are stored as subfibers in lvl. The sparse coo level corresponds to N indices in the subfiber, so fibers in the sublevel are the slices A[:, ..., :, i_1, ..., i_n]. A set of N lists (one for each index) are used to record which slices are stored. The coordinates (sets of N indices) are sorted in column major order. Optionally, dims are the sizes of the last dimensions.

TI is the type of the last N tensor indices, and Tp is the type used for positions in the level.

The type Tbl is an NTuple type where each entry k is a subtype AbstractVector{TI[k]}.

The type Ptr is the type for the pointer array.

julia> Tensor(Dense(SparseCOO{1}(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─ [:, 1]: SparseCOO{1} (0.0) [1:3]
│  ├─ [1]: 10.0
│  └─ [2]: 30.0
├─ [:, 2]: SparseCOO{1} (0.0) [1:3]
└─ [:, 3]: SparseCOO{1} (0.0) [1:3]
   ├─ [1]: 20.0
   └─ [3]: 40.0

julia> Tensor(SparseCOO{2}(Element(0.0)), [10 0 20; 30 0 0; 0 0 40])
SparseCOO{2} (0.0) [:,1:3]
├─ [1, 1]: 10.0
├─ [2, 1]: 30.0
├─ [1, 3]: 20.0
└─ [3, 3]: 40.0
Finch.SparseHashLevelType
SparseHashLevel{[N], [TI=Tuple{Int...}], [Ptr, Tbl, Srt]}(lvl, [dims])

A subfiber of a sparse level does not need to represent slices which are entirely default. Instead, only potentially non-default slices are stored as subfibers in lvl. The sparse hash level corresponds to N indices in the subfiber, so fibers in the sublevel are the slices A[:, ..., :, i_1, ..., i_n]. A hash table is used to record which slices are stored. Optionally, dims are the sizes of the last dimensions.

TI is the type of the last N tensor indices, and Tp is the type used for positions in the level. Tbl is the type of the dictionary used to do hashing, Ptr stores the positions of subfibers, and Srt stores the sorted key/value pairs in the hash table.

julia> Tensor(Dense(SparseHash{1}(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─ [:, 1]: SparseHash{1} (0.0) [1:3]
│  ├─ [1]: 10.0
│  └─ [2]: 30.0
├─ [:, 2]: SparseHash{1} (0.0) [1:3]
└─ [:, 3]: SparseHash{1} (0.0) [1:3]
   ├─ [1]: 20.0
   └─ [3]: 40.0

julia> Tensor(SparseHash{2}(Element(0.0)), [10 0 20; 30 0 0; 0 0 40])
SparseHash{2} (0.0) [:,1:3]
├─ [1, 1]: 10.0
├─ [2, 1]: 30.0
├─ [1, 3]: 20.0
└─ [3, 3]: 40.0