Constructing Tensors

You can build a Finch tensor with the Tensor constructor. In general, the Tensor constructor mirrors Julia's Array constructor, but with an additional prefixed argument which specifies the formatted storage for the tensor.

For example, to construct an empty sparse matrix:

julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), 4, 3)
4×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:4]
   ├─ [:, 2]: SparseList (0.0) [1:4]
   └─ [:, 3]: SparseList (0.0) [1:4]

To initialize a sparse matrix with some values:

julia> A = [0.0 0.0 4.4; 1.1 0.0 0.0; 2.2 0.0 5.5; 3.3 0.0 0.0]
4×3 Matrix{Float64}:
 0.0  0.0  4.4
 1.1  0.0  0.0
 2.2  0.0  5.5
 3.3  0.0  0.0

julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), A)
4×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:4]
   │  ├─ [2]: 1.1
   │  ├─ [3]: 2.2
   │  └─ [4]: 3.3
   ├─ [:, 2]: SparseList (0.0) [1:4]
   └─ [:, 3]: SparseList (0.0) [1:4]
      ├─ [1]: 4.4
      └─ [3]: 5.5

Storage Tree Level Formats

This section describes the formatted storage for Finch tensors, the first argument to the Tensor constructor. Level storage types hold all of the tensor data and can be nested hierarchically.

Finch represents tensors hierarchically in a tree, where each node in the tree is a vector of subtensors and the leaves are the elements. Thus, a matrix is analogous to a vector of vectors, and a 3-tensor is analogous to a vector of vectors of vectors. The vectors at each level of the tensor all have the same structure, which can be selected by the user.

In a Finch tensor tree, the child of each node is selected by an array index. All of the children at the same level will use the same format and share the same storage. Finch is column major, so in an expression A[i_1, ..., i_N], the rightmost dimension i_N corresponds to the root level of the tree, and the leftmost dimension i_1 corresponds to the leaf level.

Our example could be visualized as follows:

(Figure: CSC Format Index Tree)
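
To make the tree structure concrete, scalar indexing into the tensor built above follows the usual Julia column-major convention. A minimal sketch (assuming ordinary scalar getindex support on Tensor, as for other array types):

julia> A_fbr[2, 1]  # the column index 1 selects a SparseList child of the root Dense level
1.1

julia> A_fbr[1, 3]  # the row index 1 selects a stored element within column 3's SparseList
4.4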

Types of Level Storage

Finch supports a variety of storage formats for each level of the tensor tree, each with advantages and disadvantages. Some storage formats support in-order access, while others support random access. Some storage formats must be written to in column-major order, while others support out-of-order writes. The capabilities of each level are summarized in the following tables along with some general descriptions.

| Level Format Name | Group    | Data Characteristic  | Status |
|-------------------|----------|----------------------|--------|
| Dense             | Core     | Dense                | ✅     |
| SparseTree        | Core     | Sparse               | ⚙️     |
| SparseRLETree     | Core     | Sparse Runs          | ⚙️     |
| Element           | Core     | Leaf                 | ✅     |
| Pattern           | Core     | Leaf                 | ✅     |
| SparseList        | Advanced | Sparse               | ✅     |
| SparseRLE         | Advanced | Sparse Runs          | ✅     |
| SparseVBL         | Advanced | Sparse Blocks        | ✅     |
| SparsePoint       | Advanced | Single Sparse        | ✅     |
| SparseInterval    | Advanced | Single Sparse Run    | ✅     |
| SparseBand        | Advanced | Single Sparse Block  | ⚙️     |
| DenseRLE          | Advanced | Dense Runs           | ✅     |
| SparseByteMap     | Advanced | Sparse               | ✅     |
| SparseDict        | Advanced | Sparse               | ✅     |
| AtomicLevel       | Modifier | No Data              | ⚙️     |
| SeparateLevel     | Modifier | No Data              | ⚙️     |
| SparseCOO         | Legacy   | Sparse               | ✅     |

The "Level Format Name" is the name of the level datatype. Other columns have descriptions below.

Status

| Symbol | Meaning                                                                 |
|--------|-------------------------------------------------------------------------|
| ✅     | Indicates the level is ready for serious use.                           |
| ⚙️     | Indicates the level is experimental and under development.             |
| 🕸️     | Indicates the level is deprecated and may be removed in a future release. |

Groups

Core Group

Contains the basic, minimal set of levels one should use to build and manipulate tensors. These levels can be efficiently read and written to in any order.

Advanced Group

Contains levels which are more specialized, and geared towards bulk updates. These levels may be more efficient in certain cases, but are also more restrictive about access orders and intended for more advanced usage.
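
As a minimal sketch of the intended access pattern (assuming the @finch loop syntax demonstrated later in this document, with _ standing for an inferred dimension extent), an advanced level like SparseList is best filled in a single column-major pass:

julia> A = [0.0 1.0; 2.0 0.0];

julia> B = Tensor(Dense(SparseList(Element(0.0))), 2, 2);

julia> @finch begin
           B .= 0
           for j = _, i = _
               B[i, j] = A[i, j]
           end
       end;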

Modifier Group

Contains levels which are also specialized, but not towards a particular sparsity pattern. These levels modify other levels in a variety of ways without storing novel sparsity patterns themselves. Typically, they change how levels are stored, or attach data to levels to support various hardware features.

Legacy Group

Contains levels which are not recommended for new code, but are included for compatibility with older code.

Data Characteristics

| Level Type    | Description                                                                                    |
|---------------|------------------------------------------------------------------------------------------------|
| Dense         | Levels which store every subtensor.                                                            |
| Leaf          | Levels which store only scalars, used for the leaf level of the tree.                          |
| Sparse        | Levels which store only non-fill values, used for levels with few nonzeros.                    |
| Sparse Runs   | Levels which store runs of repeated non-fill values.                                           |
| Sparse Blocks | Levels which store contiguous blocks of non-fill values.                                       |
| Dense Runs    | Levels which store runs of repeated values, without compile-time zero annihilation.            |
| No Data       | Levels which don't store data themselves, but alter the storage pattern or attach additional metadata. |

Note that the "Single" sparse levels (SparsePoint, SparseInterval, and SparseBand) store just a single instance of a nonzero, run, or block. These are useful with a parent level to represent IDs.

Access Characteristics

| Operation Type           | Description                                                                                              |
|--------------------------|----------------------------------------------------------------------------------------------------------|
| Column-Major Reads       | Indicates efficient reading of data in column-major order.                                               |
| Random Reads             | Indicates efficient reading of data in random-access order.                                              |
| Column-Major Bulk Update | Indicates efficient writing of data in column-major order, with total time roughly linear in the size of the tensor. |
| Random Bulk Update       | Indicates efficient writing of data in random-access order, with total time roughly linear in the size of the tensor. |
| Random Updates           | Indicates efficient writing of data in random-access order, with total time roughly linear in the number of updates. |

Examples of Popular Formats in Finch

Finch levels can be used to construct a variety of popular sparse formats. A few examples follow:

| Format Type               | Syntax                                                       |
|---------------------------|--------------------------------------------------------------|
| Sparse Vector             | Tensor(SparseList(Element(0.0)), args...)                    |
| CSC Matrix                | Tensor(Dense(SparseList(Element(0.0))), args...)             |
| CSF 3-Tensor              | Tensor(Dense(SparseList(SparseList(Element(0.0)))), args...) |
| DCSC (Hypersparse) Matrix | Tensor(SparseList(SparseList(Element(0.0))), args...)        |
| COO Matrix                | Tensor(SparseCOO{2}(Element(0.0)), args...)                  |
| COO 3-Tensor              | Tensor(SparseCOO{3}(Element(0.0)), args...)                  |
| Run-Length-Encoded Image  | Tensor(Dense(DenseRLE(Element(0.0))), args...)               |
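
For example, the CSF format from the table can be constructed from any 3-dimensional array. A small sketch with a hand-built array:

julia> A = zeros(2, 2, 2); A[1, 2, 1] = 1.0; A[2, 2, 2] = 2.0;

julia> A_csf = Tensor(Dense(SparseList(SparseList(Element(0.0)))), A);

julia> ndims(A_csf)
3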

Tensor Constructors

Finch.TensorType
Tensor{Lvl} <: AbstractFiber{Lvl}

The multidimensional array type used by Finch. Tensor is a thin wrapper around the hierarchical level storage of type Lvl.

Finch.TensorMethod
Tensor(lvl)

Construct a Tensor using the tensor level storage lvl. No initialization of storage is performed; it is assumed that position 1 of lvl corresponds to a valid tensor, and lvl will be wrapped as-is. Call a different constructor to initialize the storage.
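
For example, wrapping a freshly constructed level directly (as in the ndims examples later in this document):

julia> t = Tensor(Dense(Element(0.0)));

julia> ndims(t)
1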

Finch.TensorMethod
Tensor(lvl, [undef], dims...)

Construct a Tensor of size dims, and initialize to undef, potentially allocating memory. Here undef is the singleton instance of UndefInitializer. dims... may be a variable number of dimensions or a tuple of dimensions, but it must correspond to the number of dimensions in lvl.
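
For example, the empty 4×3 CSC tensor from the introduction could equivalently be written with an explicit undef (a sketch; the dimensions could also be passed as a tuple):

julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), undef, 4, 3);

julia> size(A_fbr)
(4, 3)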

Finch.TensorMethod
Tensor(lvl, arr)

Construct a Tensor and initialize it to the contents of arr. To explicitly copy into a tensor, use copyto!.

Finch.TensorMethod
Tensor(arr, [init = zero(eltype(arr))])

Copy an array-like object arr into a corresponding, similar Tensor data structure. Uses init as an initial value. May reuse memory when possible. To explicitly copy into a tensor, use copyto!.

Examples

julia> println(summary(Tensor(sparse([1 0; 0 1]))))
2×2 Tensor(Dense(SparseList(Element(0))))

julia> println(summary(Tensor(ones(3, 2, 4))))
3×2×4 Tensor(Dense(Dense(Dense(Element(0.0)))))

Level Constructors

Core Levels

Finch.DenseLevelType
DenseLevel{[Ti=Int]}(lvl, [dim])

A subfiber of a dense level is an array which stores every slice A[:, ..., :, i] as a distinct subfiber in lvl. Optionally, dim is the size of the last dimension. Ti is the type of the indices used to index the level.

julia> ndims(Tensor(Dense(Element(0.0))))
1

julia> ndims(Tensor(Dense(Dense(Element(0.0)))))
2

julia> Tensor(Dense(Dense(Element(0.0))), [1 2; 3 4])
2×2-Tensor
└─ Dense [:,1:2]
   ├─ [:, 1]: Dense [1:2]
   │  ├─ [1]: 1.0
   │  └─ [2]: 3.0
   └─ [:, 2]: Dense [1:2]
      ├─ [1]: 2.0
      └─ [2]: 4.0
Finch.ElementLevelType
ElementLevel{Vf, [Tv=typeof(Vf)], [Tp=Int], [Val]}()

A subfiber of an element level is a scalar of type Tv, initialized to Vf. Vf may optionally be given as the first argument.

The data is stored in a vector of type Val with eltype(Val) = Tv. The type Tp is the index type used to access Val.

julia> Tensor(Dense(Element(0.0)), [1, 2, 3])
3-Tensor
└─ Dense [1:3]
   ├─ [1]: 1.0
   ├─ [2]: 2.0
   └─ [3]: 3.0
Finch.PatternLevelType
PatternLevel{[Tp=Int]}()

A subfiber of a pattern level is the Boolean value true, but its fill_value is false. PatternLevels are used to create tensors that represent which values are stored by other fibers. See pattern! for usage examples.

julia> Tensor(Dense(Pattern()), 3)
3-Tensor
└─ Dense [1:3]
   ├─ [1]: true
   ├─ [2]: true
   └─ [3]: true
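
As a sketch of the pattern! usage referenced above (output display omitted; this assumes pattern! replaces the Element leaves of its argument with Pattern leaves, so the result records only which entries are stored):

julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), [0.0 4.4; 1.1 0.0]);

julia> A_pat = pattern!(A_fbr);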

Advanced Levels

Finch.SparseListLevelType
SparseListLevel{[Ti=Int], [Ptr, Idx]}(lvl, [dim])

A subfiber of a sparse level does not need to represent slices A[:, ..., :, i] which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. A sorted list is used to record which slices are stored. Optionally, dim is the size of the last dimension.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr and Idx are the types of the arrays used to store positions and indices.

julia> Tensor(Dense(SparseList(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseList (0.0) [1:3]
   └─ [:, 3]: SparseList (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0

julia> Tensor(SparseList(SparseList(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseList (0.0) [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   └─ [:, 3]: SparseList (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0
Finch.DenseRLELevelType
DenseRLELevel{[Ti=Int], [Ptr, Right]}(lvl, [dim], [merge = true])

The dense RLE level represents runs of equivalent slices A[:, ..., :, i]. A sorted list is used to record the right endpoint of each run. Optionally, dim is the size of the last dimension.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr and Right are the types of the arrays used to store positions and endpoints.

The merge keyword argument is used to specify whether the level should merge duplicate consecutive runs.

julia> Tensor(Dense(DenseRLELevel(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: DenseRLE (0.0) [1:3]
   │  ├─ [1:1]: 10.0
   │  ├─ [2:2]: 30.0
   │  └─ [3:3]: 0.0
   ├─ [:, 2]: DenseRLE (0.0) [1:3]
   │  └─ [1:3]: 0.0
   └─ [:, 3]: DenseRLE (0.0) [1:3]
      ├─ [1:1]: 20.0
      ├─ [2:2]: 0.0
      └─ [3:3]: 40.0
Finch.SparseRLELevelType
SparseRLELevel{[Ti=Int], [Ptr, Left, Right]}(lvl, [dim]; [merge = true])

The sparse RLE level represents runs of equivalent slices A[:, ..., :, i] which are not entirely fill_value. A sorted list is used to record the left and right endpoints of each run. Optionally, dim is the size of the last dimension.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr, Left, and Right are the types of the arrays used to store positions and endpoints.

The merge keyword argument is used to specify whether the level should merge duplicate consecutive runs.

julia> Tensor(Dense(SparseRLELevel(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseRLE (0.0) [1:3]
   │  ├─ [1:1]: 10.0
   │  └─ [2:2]: 30.0
   ├─ [:, 2]: SparseRLE (0.0) [1:3]
   └─ [:, 3]: SparseRLE (0.0) [1:3]
      ├─ [1:1]: 20.0
      └─ [3:3]: 40.0
Finch.SparseVBLLevelType
SparseVBLLevel{[Ti=Int], [Ptr, Idx, Ofs]}(lvl, [dim])

Like the SparseListLevel, but contiguous subfibers are stored together in blocks.

julia> Tensor(Dense(SparseVBL(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─[:,1]: SparseList (0.0) [1:3]
│ ├─[1]: 10.0
│ ├─[2]: 30.0
├─[:,2]: SparseList (0.0) [1:3]
├─[:,3]: SparseList (0.0) [1:3]
│ ├─[1]: 20.0
│ ├─[3]: 40.0

julia> Tensor(SparseVBL(SparseVBL(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
SparseList (0.0) [:,1:3]
├─[:,1]: SparseList (0.0) [1:3]
│ ├─[1]: 10.0
│ ├─[2]: 30.0
├─[:,3]: SparseList (0.0) [1:3]
│ ├─[1]: 20.0
│ ├─[3]: 40.0

Finch.SparseBandLevelType
SparseBandLevel{[Ti=Int], [Ptr, Idx, Ofs]}(lvl, [dim])

Like the SparseVBLLevel, but stores only a single block, and fills in zeros.

julia> Tensor(Dense(SparseBand(Element(0.0))), [10 0 20; 30 40 0; 0 0 50])
Dense [:,1:3]
├─[:,1]: SparseList (0.0) [1:3]
│ ├─[1]: 10.0
│ ├─[2]: 30.0
├─[:,2]: SparseList (0.0) [1:3]
│ ├─[2]: 40.0
├─[:,3]: SparseList (0.0) [1:3]
│ ├─[1]: 20.0
│ ├─[3]: 50.0

Finch.SparsePointLevelType
SparsePointLevel{[Ti=Int], [Ptr, Idx]}(lvl, [dim])

A subfiber of a SparsePoint level does not need to represent slices A[:, ..., :, i] which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. The main difference from the SparseList level is that a SparsePoint level stores only a single non-fill slice; it emits an error if the program tries to write two or more coordinates into the SparsePoint.

Ti is the type of the last tensor index. The types Ptr and Idx are the types of the arrays used to store positions and indices.

julia> Tensor(Dense(SparsePoint(Element(0.0))), [10 0 0; 0 20 0; 0 0 30])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparsePoint (0.0) [1:3]
   │  └─ 10.0
   ├─ [:, 2]: SparsePoint (0.0) [1:3]
   │  └─ 20.0
   └─ [:, 3]: SparsePoint (0.0) [1:3]
      └─ 30.0

julia> Tensor(SparsePoint(Dense(Element(0.0))), [0 0 0; 0 0 30; 0 0 30])
3×3-Tensor
└─ SparsePoint (0.0) [:,1:3]
   └─ Dense [1:3]
      ├─ [1]: 0.0
      ├─ [2]: 30.0
      └─ [3]: 30.0
Finch.SparseIntervalLevelType
SparseIntervalLevel{[Ti=Int], [Ptr, Left, Right]}(lvl, [dim])

The SparseInterval level represents a single run of equivalent slices A[:, ..., :, i] which are not entirely fill_value. The main difference from the SparseRLE level is that a SparseInterval level stores only a single non-fill run; it emits an error if the program tries to write two or more runs into the SparseInterval.

Ti is the type of the last tensor index. The types Ptr, Left, and Right are the types of the arrays used to store positions and endpoints.

julia> Tensor(SparseInterval(Element(0)), [0, 10, 0])
3-Tensor
└─ SparseInterval (0) [1:3]
   └─ [2:2]: 10

julia> x = Tensor(SparseInterval(Element(0)), 10);

julia> @finch begin for i = extent(3,6); x[~i] = 1 end end;

julia> x
10-Tensor
└─ SparseInterval (0) [1:10]
   └─ [3:6]: 1
Finch.SparseByteMapLevelType
SparseByteMapLevel{[Ti=Int], [Ptr, Tbl]}(lvl, [dims])

Like the SparseListLevel, but a dense bitmap is used to encode which slices are stored. This allows the ByteMap level to support random access.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level.

julia> Tensor(Dense(SparseByteMap(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseByteMap (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseByteMap (0.0) [1:3]
   └─ [:, 3]: SparseByteMap (0.0) [1:3]
      ├─ [1]: 0.0
      └─ [3]: 0.0

julia> Tensor(SparseByteMap(SparseByteMap(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseByteMap (0.0) [:,1:3]
   ├─ [:, 1]: SparseByteMap (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   └─ [:, 3]: SparseByteMap (0.0) [1:3]
Finch.SparseDictLevelType
SparseDictLevel{[Ti=Int], [Tp=Int], [Ptr, Idx, Val, Tbl, Pool=Dict]}(lvl, [dim])

A subfiber of a sparse level does not need to represent slices A[:, ..., :, i] which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. A datastructure specified by Tbl is used to record which slices are stored. Optionally, dim is the size of the last dimension.

Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr and Idx are the types of the arrays used to store positions and indices.

julia> Tensor(Dense(SparseDict(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseDict (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseDict (0.0) [1:3]
   └─ [:, 3]: SparseDict (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0

julia> Tensor(SparseDict(SparseDict(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseDict (0.0) [:,1:3]
   ├─ [:, 1]: SparseDict (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   └─ [:, 3]: SparseDict (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0

Modifier Levels

Finch.SeparateLevelType
SeparateLevel{Lvl, [Val]}()

A subfiber of a Separate level is a separate tensor of type Lvl, in its own memory space.

Each sublevel is stored in a vector of type Val with eltype(Val) = Lvl.

julia> Tensor(Dense(Separate(Element(0.0))), [1, 2, 3])
3-Tensor
└─ Dense [1:3]
   ├─ [1]: Pointer ->
   │  └─ 1.0
   ├─ [2]: Pointer ->
   │  └─ 2.0
   └─ [3]: Pointer ->
      └─ 3.0

Legacy Levels

Finch.SparseCOOLevelType
SparseCOOLevel{[N], [TI=Tuple{Int...}], [Ptr, Tbl]}(lvl, [dims])

A subfiber of a sparse level does not need to represent slices which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. The SparseCOO level corresponds to N indices in the subfiber, so fibers in the sublevel are the slices A[:, ..., :, i_1, ..., i_N]. A set of N lists (one for each index) is used to record which slices are stored. The coordinates (sets of N indices) are sorted in column-major order. Optionally, dims are the sizes of the last dimensions.

TI is the type of the last N tensor indices, and Tp is the type used for positions in the level.

The type Tbl is an NTuple type where each entry k is a subtype of AbstractVector{TI[k]}.

The type Ptr is the type for the pointer array.

julia> Tensor(Dense(SparseCOO{1}(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseCOO{1} (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseCOO{1} (0.0) [1:3]
   └─ [:, 3]: SparseCOO{1} (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0

julia> Tensor(SparseCOO{2}(Element(0.0)), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseCOO{2} (0.0) [:,1:3]
   ├─ [1, 1]: 10.0
   ├─ [2, 1]: 30.0
   ├─ [1, 3]: 20.0
   └─ [3, 3]: 40.0