Constructing Tensors
You can build a Finch tensor with the Tensor constructor. In general, the Tensor constructor mirrors Julia's Array constructor, but with an additional prefixed argument which specifies the storage format of the tensor. For example, to construct an empty sparse matrix:
julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), 4, 3)
4×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:4]
   ├─ [:, 2]: SparseList (0.0) [1:4]
   └─ [:, 3]: SparseList (0.0) [1:4]
To initialize a sparse matrix with some values:
julia> A = [0.0 0.0 4.4; 1.1 0.0 0.0; 2.2 0.0 5.5; 3.3 0.0 0.0]
4×3 Matrix{Float64}:
 0.0  0.0  4.4
 1.1  0.0  0.0
 2.2  0.0  5.5
 3.3  0.0  0.0
julia> A_fbr = Tensor(Dense(SparseList(Element(0.0))), A)
4×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:4]
   │  ├─ [2]: 1.1
   │  ├─ [3]: 2.2
   │  └─ [4]: 3.3
   ├─ [:, 2]: SparseList (0.0) [1:4]
   └─ [:, 3]: SparseList (0.0) [1:4]
      ├─ [1]: 4.4
      └─ [3]: 5.5
Storage Tree Level Formats
This section describes the formatted storage for Finch tensors, the first argument to the Tensor constructor. Level storage types hold all of the tensor data and can be nested hierarchically.
Finch represents tensors hierarchically in a tree, where each node in the tree is a vector of subtensors and the leaves are the elements. Thus, a matrix is analogous to a vector of vectors, and a 3-tensor is analogous to a vector of vectors of vectors. The vectors at each level of the tensor all have the same structure, which can be selected by the user.
In a Finch tensor tree, the child of each node is selected by an array index. All of the children at the same level will use the same format and share the same storage. Finch is column major, so in an expression A[i_1, ..., i_N], the rightmost dimension i_N corresponds to the root level of the tree, and the leftmost dimension i_1 corresponds to the leaf level.
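For example, in a Dense(SparseList(Element(0.0))) matrix, the root Dense level is selected by the column index (the last index), and each SparseList child is selected by the row index (the first index). A minimal sketch of this correspondence:

```julia
using Finch

# Dense(SparseList(Element(0.0))): the root Dense level holds one
# SparseList per column j (the last index); each SparseList stores
# the nonzero rows i (the first index) of that column.
A = Tensor(Dense(SparseList(Element(0.0))), [0.0 1.0; 2.0 0.0])

# Reading A[i, j] descends the tree: j selects a child of the root
# Dense level, then i selects an entry in that column's SparseList.
A[2, 1]  # 2.0
```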
Types of Level Storage
Finch supports a variety of storage formats for each level of the tensor tree, each with advantages and disadvantages. Some storage formats support in-order access, while others support random access. Some storage formats must be written to in column-major order, while others support out-of-order writes. The capabilities of each level are summarized in the following tables along with some general descriptions.
Level Format Name | Group | Data Characteristic | Column-Major Reads | Random Reads | Column-Major Bulk Update | Column-Major Random Update | Random Update | Status |
---|---|---|---|---|---|---|---|---|
Dense | Core | Dense | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
SparseTree | Core | Sparse | ✅ | ✅ | ✅ | ✅ | ✅ | ⚙️ |
SparseRLETree | Core | Sparse Runs | ✅ | ✅ | ✅ | ✅ | ✅ | ⚙️ |
Element | Core | Leaf | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Pattern | Core | Leaf | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
SparseList | Advanced | Sparse | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ |
SparseRLE | Advanced | Sparse Runs | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ |
SparseVBL | Advanced | Sparse Blocks | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ |
SparsePoint | Advanced | Single Sparse | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
SparseInterval | Advanced | Single Sparse Run | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
SparseBand | Advanced | Single Sparse Block | ✅ | ✅ | ✅ | ❌ | ❌ | ⚙️ |
DenseRLE | Advanced | Dense Runs | ✅ | ❌ | ✅ | ❌ | ❌ | ⚙️ |
SparseByteMap | Advanced | Sparse | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
SparseDict | Advanced | Sparse | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
AtomicLevel | Modifier | No Data | ✅ | ✅ | ✅ | ✅ | ✅ | ⚙️ |
SeparateLevel | Modifier | No Data | ✅ | ✅ | ✅ | ✅ | ✅ | ⚙️ |
SparseCOO | Legacy | Sparse | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
The "Level Format Name" is the name of the level datatype. Other columns have descriptions below.
Status
Symbol | Meaning |
---|---|
✅ | Indicates the level is ready for serious use. |
⚙️ | Indicates the level is experimental and under development. |
🕸️ | Indicates the level is deprecated, and may be removed in a future release. |
Groups
Core Group
Contains the basic, minimal set of levels one should use to build and manipulate tensors. These levels can be efficiently read and written to in any order.
Advanced Group
Contains levels which are more specialized, and geared towards bulk updates. These levels may be more efficient in certain cases, but are also more restrictive about access orders and intended for more advanced usage.
Modifier Group
Contains levels which are also specialized, but not toward a particular sparsity pattern. These levels modify other levels in a variety of ways, but do not store novel sparsity patterns themselves. Typically, they change how levels are stored or attach data to levels to support various hardware features.
Legacy Group
Contains levels which are not recommended for new code, but are included for compatibility with older code.
Data Characteristics
Level Type | Description |
---|---|
Dense | Levels which store every subtensor. |
Leaf | Levels which store only scalars, used for the leaf level of the tree. |
Sparse | Levels which store only non-fill values, used for levels with few nonzeros. |
Sparse Runs | Levels which store runs of repeated non-fill values. |
Sparse Blocks | Levels which store Blocks of repeated non-fill values. |
Dense Runs | Levels which store runs of repeated values, with no compile-time zero annihilation. |
No Data | Levels which don't store data but which alter the storage pattern or attach additional meta-data. |
Note that the Single sparse levels store a single instance of each nonzero, run, or block. These are useful with a parent level to represent IDs.
Access Characteristics
Operation Type | Description |
---|---|
Column-Major Reads | Indicates efficient reading of data in column-major order. |
Random Reads | Indicates efficient reading of data in random-access order. |
Column-Major Bulk Update | Indicates efficient writing of data in column-major order, with total time roughly linear in the size of the tensor. |
Column-Major Random Update | Indicates efficient writing of data in random-access order, with total time roughly linear in the size of the tensor. |
Random Update | Indicates efficient writing of data in random-access order, with total time roughly linear in the number of updates. |
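To make the update columns concrete, here is a sketch using Finch's @finch macro: a column-major bulk copy into a SparseList-based tensor (supported, since each column's nonzeros are written in sorted order), and the same copy with the loops swapped, which SparseList would reject but a random-bulk-update format such as SparseDict accepts.

```julia
using Finch

B = Tensor(Dense(Dense(Element(0.0))), rand(4, 3))

# Column-major bulk update: the root index j is the outer loop, so each
# column's SparseList is filled with row indices in increasing order.
A = Tensor(Dense(SparseList(Element(0.0))), 4, 3)
@finch begin
    A .= 0
    for j = _, i = _
        A[i, j] = B[i, j]
    end
end

# Swapping the loops writes each column out of order; a format with
# random bulk update support, such as SparseDict, handles this instead.
C = Tensor(Dense(SparseDict(Element(0.0))), 4, 3)
@finch begin
    C .= 0
    for i = _, j = _
        C[i, j] = B[i, j]
    end
end
```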
Examples of Popular Formats in Finch
Finch levels can be used to construct a variety of popular sparse formats. A few examples follow:
Format Type | Syntax |
---|---|
Sparse Vector | Tensor(SparseList(Element(0.0)), args...) |
CSC Matrix | Tensor(Dense(SparseList(Element(0.0))), args...) |
CSF 3-Tensor | Tensor(Dense(SparseList(SparseList(Element(0.0)))), args...) |
DCSC (Hypersparse) Matrix | Tensor(SparseList(SparseList(Element(0.0))), args...) |
COO Matrix | Tensor(SparseCOO{2}(Element(0.0)), args...) |
COO 3-Tensor | Tensor(SparseCOO{3}(Element(0.0)), args...) |
Run-Length-Encoded Image | Tensor(Dense(DenseRLE(Element(0.0))), args...) |
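As a concrete sketch (assuming the SparseArrays standard library for the input), the CSC and DCSC rows above could be exercised as follows:

```julia
using Finch, SparseArrays

S = sparse([1, 3], [1, 2], [1.1, 2.2], 3, 3)

# CSC: dense columns, each storing a sorted list of nonzero rows
A_csc = Tensor(Dense(SparseList(Element(0.0))), S)

# DCSC (hypersparse): only nonempty columns are stored at the root
A_dcsc = Tensor(SparseList(SparseList(Element(0.0))), S)
```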
Tensor Constructors
Finch.Tensor — Type
Tensor{Lvl} <: AbstractFiber{Lvl}
The multidimensional array type used by Finch. Tensor is a thin wrapper around the hierarchical level storage of type Lvl.
Finch.Tensor — Method
Tensor(lvl)
Construct a Tensor using the tensor level storage lvl. No initialization of storage is performed; it is assumed that position 1 of lvl corresponds to a valid tensor, and lvl will be wrapped as-is. Call a different constructor to initialize the storage.
Finch.Tensor — Method
Tensor(lvl, [undef], dims...)
Construct a Tensor of size dims, and initialize to undef, potentially allocating memory. Here undef is the UndefInitializer singleton type. dims... may be a variable number of dimensions or a tuple of dimensions, but it must correspond to the number of dimensions in lvl.
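For instance, a minimal sketch of this method, showing the two equivalent ways to pass dims:

```julia
using Finch

# dims may be given variadically or as a tuple; these are equivalent:
A = Tensor(Dense(SparseList(Element(0.0))), undef, 4, 3)
B = Tensor(Dense(SparseList(Element(0.0))), undef, (4, 3))
```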
Finch.Tensor — Method
Tensor(lvl, arr)
Construct a Tensor and initialize it to the contents of arr. To explicitly copy into a tensor, use copyto!.
Finch.Tensor — Method
Tensor(arr, [init = zero(eltype(arr))])
Copy an array-like object arr into a corresponding, similar Tensor datastructure. Uses init as an initial value. May reuse memory when possible. To explicitly copy into a tensor, use copyto!.
Examples
julia> println(summary(Tensor(sparse([1 0; 0 1]))))
2×2 Tensor(Dense(SparseList(Element(0))))

julia> println(summary(Tensor(ones(3, 2, 4))))
3×2×4 Tensor(Dense(Dense(Dense(Element(0.0)))))
Level Constructors
Core Levels
Finch.DenseLevel — Type
DenseLevel{[Ti=Int]}(lvl, [dim])
A subfiber of a dense level is an array which stores every slice A[:, ..., :, i] as a distinct subfiber in lvl. Optionally, dim is the size of the last dimension. Ti is the type of the indices used to index the level.
julia> ndims(Tensor(Dense(Element(0.0))))
1

julia> ndims(Tensor(Dense(Dense(Element(0.0)))))
2

julia> Tensor(Dense(Dense(Element(0.0))), [1 2; 3 4])
2×2-Tensor
└─ Dense [:,1:2]
   ├─ [:, 1]: Dense [1:2]
   │  ├─ [1]: 1.0
   │  └─ [2]: 3.0
   └─ [:, 2]: Dense [1:2]
      ├─ [1]: 2.0
      └─ [2]: 4.0
Finch.ElementLevel — Type
ElementLevel{Vf, [Tv=typeof(Vf)], [Tp=Int], [Val]}()
A subfiber of an element level is a scalar of type Tv, initialized to Vf. Vf may optionally be given as the first argument.
The data is stored in a vector of type Val with eltype(Val) = Tv. The type Tp is the index type used to access Val.
julia> Tensor(Dense(Element(0.0)), [1, 2, 3])
3-Tensor
└─ Dense [1:3]
   ├─ [1]: 1.0
   ├─ [2]: 2.0
   └─ [3]: 3.0
Finch.PatternLevel — Type
PatternLevel{[Tp=Int]}()
A subfiber of a pattern level is the Boolean value true, but its fill_value is false. PatternLevels are used to create tensors that represent which values are stored by other fibers. See pattern! for usage examples.
julia> Tensor(Dense(Pattern()), 3)
3-Tensor
└─ Dense [1:3]
   ├─ [1]: true
   ├─ [2]: true
   └─ [3]: true
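A small sketch of the pattern! function referenced above, which reinterprets a tensor so that its leaf level is a Pattern level:

```julia
using Finch

A = Tensor(Dense(SparseList(Element(0.0))), [0.0 1.1; 0.0 2.2])

# Stored entries read as true, fill entries read as false.
P = pattern!(A)
```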
Advanced Levels
Finch.SparseListLevel — Type
SparseListLevel{[Ti=Int], [Ptr, Idx]}(lvl, [dim])
A subfiber of a sparse level does not need to represent slices A[:, ..., :, i] which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. A sorted list is used to record which slices are stored. Optionally, dim is the size of the last dimension.
Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr and Idx are the types of the arrays used to store positions and indices.
julia> Tensor(Dense(SparseList(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseList (0.0) [1:3]
   └─ [:, 3]: SparseList (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0
julia> Tensor(SparseList(SparseList(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseList (0.0) [:,1:3]
   ├─ [:, 1]: SparseList (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   └─ [:, 3]: SparseList (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0
Finch.DenseRLELevel — Type
DenseRLELevel{[Ti=Int], [Ptr, Right]}(lvl, [dim], [merge = true])
The dense RLE level represents runs of equivalent slices A[:, ..., :, i]. A sorted list is used to record the right endpoint of each run. Optionally, dim is the size of the last dimension.
Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr and Right are the types of the arrays used to store positions and endpoints.
The merge keyword argument is used to specify whether the level should merge duplicate consecutive runs.
julia> Tensor(Dense(DenseRLELevel(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: DenseRLE (0.0) [1:3]
   │  ├─ [1:1]: 10.0
   │  ├─ [2:2]: 30.0
   │  └─ [3:3]: 0.0
   ├─ [:, 2]: DenseRLE (0.0) [1:3]
   │  └─ [1:3]: 0.0
   └─ [:, 3]: DenseRLE (0.0) [1:3]
      ├─ [1:1]: 20.0
      ├─ [2:2]: 0.0
      └─ [3:3]: 40.0
Finch.SparseRLELevel — Type
SparseRLELevel{[Ti=Int], [Ptr, Left, Right]}(lvl, [dim]; [merge = true])
The sparse RLE level represents runs of equivalent slices A[:, ..., :, i] which are not entirely fill_value. A sorted list is used to record the left and right endpoints of each run. Optionally, dim is the size of the last dimension.
Ti is the type of the last tensor index, and Tp is the type used for positions in the level. The types Ptr, Left, and Right are the types of the arrays used to store positions and endpoints.
The merge keyword argument is used to specify whether the level should merge duplicate consecutive runs.
julia> Tensor(Dense(SparseRLELevel(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseRLE (0.0) [1:3]
   │  ├─ [1:1]: 10.0
   │  └─ [2:2]: 30.0
   ├─ [:, 2]: SparseRLE (0.0) [1:3]
   └─ [:, 3]: SparseRLE (0.0) [1:3]
      ├─ [1:1]: 20.0
      └─ [3:3]: 40.0
Finch.SparseVBLLevel — Type
SparseVBLLevel{[Ti=Int], [Ptr, Idx, Ofs]}(lvl, [dim])
Like the SparseListLevel, but contiguous subfibers are stored together in blocks.
julia> Tensor(Dense(SparseVBL(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
Dense [:,1:3]
├─[:,1]: SparseList (0.0) [1:3]
│ ├─[1]: 10.0
│ ├─[2]: 30.0
├─[:,2]: SparseList (0.0) [1:3]
├─[:,3]: SparseList (0.0) [1:3]
│ ├─[1]: 20.0
│ ├─[3]: 40.0

julia> Tensor(SparseVBL(SparseVBL(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
SparseList (0.0) [:,1:3]
├─[:,1]: SparseList (0.0) [1:3]
│ ├─[1]: 10.0
│ ├─[2]: 30.0
├─[:,3]: SparseList (0.0) [1:3]
│ ├─[1]: 20.0
│ ├─[3]: 40.0
Finch.SparseBandLevel — Type
SparseBandLevel{[Ti=Int], [Ptr, Idx, Ofs]}(lvl, [dim])
Like the SparseVBLLevel, but stores only a single block, and fills in zeros.
julia> Tensor(Dense(SparseBand(Element(0.0))), [10 0 20; 30 40 0; 0 0 50])
Dense [:,1:3]
├─[:,1]: SparseList (0.0) [1:3]
│ ├─[1]: 10.0
│ ├─[2]: 30.0
├─[:,2]: SparseList (0.0) [1:3]
├─[:,3]: SparseList (0.0) [1:3]
│ ├─[1]: 20.0
│ ├─[3]: 40.0
Finch.SparsePointLevel — Type
SparsePointLevel{[Ti=Int], [Ptr, Idx]}(lvl, [dim])
A subfiber of a SparsePoint level does not need to represent slices A[:, ..., :, i] which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. The main difference compared to the SparseList level is that a SparsePoint level stores only a single non-fill slice; it emits an error if the program tries to write two or more coordinates into the SparsePoint.
Ti is the type of the last tensor index. The types Ptr and Idx are the types of the arrays used to store positions and indices.
julia> Tensor(Dense(SparsePoint(Element(0.0))), [10 0 0; 0 20 0; 0 0 30])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparsePoint (0.0) [1:3]
   │  └─ 10.0
   ├─ [:, 2]: SparsePoint (0.0) [1:3]
   │  └─ 20.0
   └─ [:, 3]: SparsePoint (0.0) [1:3]
      └─ 30.0
julia> Tensor(SparsePoint(Dense(Element(0.0))), [0 0 0; 0 0 30; 0 0 30])
3×3-Tensor
└─ SparsePoint (0.0) [:,1:3]
   └─ Dense [1:3]
      ├─ [1]: 0.0
      ├─ [2]: 30.0
      └─ [3]: 30.0
Finch.SparseIntervalLevel — Type
SparseIntervalLevel{[Ti=Int], [Ptr, Left, Right]}(lvl, [dim])
The single RLE level represents a run of equivalent slices A[:, ..., :, i] which are not entirely fill_value. The main difference compared to the SparseRLE level is that a SparseInterval level stores only a single non-fill run; it emits an error if the program tries to write two or more runs into the SparseInterval.
Ti is the type of the last tensor index. The types Ptr, Left, and Right are the types of the arrays used to store positions and endpoints.
julia> Tensor(SparseInterval(Element(0)), [0, 10, 0])
3-Tensor
└─ SparseInterval (0) [1:3]
   └─ [2:2]: 10

julia> x = Tensor(SparseInterval(Element(0)), 10);

julia> @finch begin for i = extent(3,6); x[~i] = 1 end end;

julia> x
10-Tensor
└─ SparseInterval (0) [1:10]
   └─ [3:6]: 1
Finch.SparseByteMapLevel — Type
SparseByteMapLevel{[Ti=Int], [Ptr, Tbl]}(lvl, [dims])
Like the SparseListLevel, but a dense bitmap is used to encode which slices are stored. This allows the ByteMap level to support random access.
Ti is the type of the last tensor index, and Tp is the type used for positions in the level.
julia> Tensor(Dense(SparseByteMap(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseByteMap (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseByteMap (0.0) [1:3]
   └─ [:, 3]: SparseByteMap (0.0) [1:3]
      ├─ [1]: 0.0
      └─ [3]: 0.0
julia> Tensor(SparseByteMap(SparseByteMap(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseByteMap (0.0) [:,1:3]
   ├─ [:, 1]: SparseByteMap (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   └─ [:, 3]: SparseByteMap (0.0) [1:3]
Finch.SparseDictLevel — Type
SparseDictLevel{[Ti=Int], [Tp=Int], [Ptr, Idx, Val, Tbl, Pool=Dict]}(lvl, [dim])
A subfiber of a sparse level does not need to represent slices A[:, ..., :, i] which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. A datastructure specified by Tbl is used to record which slices are stored. Optionally, dim is the size of the last dimension.
Ti is the type of the last fiber index, and Tp is the type used for positions in the level. The types Ptr and Idx are the types of the arrays used to store positions and indices.
julia> Tensor(Dense(SparseDict(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseDict (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseDict (0.0) [1:3]
   └─ [:, 3]: SparseDict (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0
julia> Tensor(SparseDict(SparseDict(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseDict (0.0) [:,1:3]
   ├─ [:, 1]: SparseDict (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   └─ [:, 3]: SparseDict (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0
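Since SparseDict supports random reads, scattered lookups into a stored tensor are efficient; a brief sketch:

```julia
using Finch

A = Tensor(Dense(SparseDict(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])

A[3, 3]  # 40.0, found by dictionary lookup
A[2, 2]  # 0.0, the fill value (not stored)
```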
Modifier Levels
Finch.SeparateLevel — Type
SeparateLevel{Lvl, [Val]}()
A subfiber of a Separate level is a separate tensor of type Lvl, in its own memory space.
Each sublevel is stored in a vector of type Val with eltype(Val) = Lvl.
julia> Tensor(Dense(Separate(Element(0.0))), [1, 2, 3])
3-Tensor
└─ Dense [1:3]
   ├─ [1]: Pointer ->
   │  └─ 1.0
   ├─ [2]: Pointer ->
   │  └─ 2.0
   └─ [3]: Pointer ->
      └─ 3.0
Legacy Levels
Finch.SparseCOOLevel — Type
SparseCOOLevel{[N], [TI=Tuple{Int...}], [Ptr, Tbl]}(lvl, [dims])
A subfiber of a sparse level does not need to represent slices which are entirely fill_value. Instead, only potentially non-fill slices are stored as subfibers in lvl. The sparse coo level corresponds to N indices in the subfiber, so fibers in the sublevel are the slices A[:, ..., :, i_1, ..., i_n]. A set of N lists (one for each index) is used to record which slices are stored. The coordinates (sets of N indices) are sorted in column major order. Optionally, dims are the sizes of the last dimensions.
TI is the type of the last N tensor indices, and Tp is the type used for positions in the level.
The type Tbl is an NTuple type where each entry k is a subtype AbstractVector{TI[k]}.
The type Ptr is the type for the pointer array.
julia> Tensor(Dense(SparseCOO{1}(Element(0.0))), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ Dense [:,1:3]
   ├─ [:, 1]: SparseCOO{1} (0.0) [1:3]
   │  ├─ [1]: 10.0
   │  └─ [2]: 30.0
   ├─ [:, 2]: SparseCOO{1} (0.0) [1:3]
   └─ [:, 3]: SparseCOO{1} (0.0) [1:3]
      ├─ [1]: 20.0
      └─ [3]: 40.0
julia> Tensor(SparseCOO{2}(Element(0.0)), [10 0 20; 30 0 0; 0 0 40])
3×3-Tensor
└─ SparseCOO{2} (0.0) [:,1:3]
   ├─ [1, 1]: 10.0
   ├─ [2, 1]: 30.0
   ├─ [1, 3]: 20.0
   └─ [3, 3]: 40.0
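COO tensors are often built directly from coordinate lists. Finch provides the fsparse constructor for this; the sketch below assumes its coordinate-list form, analogous to SparseArrays.sparse:

```julia
using Finch

# Build a COO-format matrix from row indices, column indices, and values.
I = [1, 2, 3]
J = [1, 1, 3]
V = [10.0, 30.0, 40.0]
A = fsparse(I, J, V)
```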