Einsum.jl

Package Build	Package Status

	- help wanted!

This package exports a single macro @einsum, which implements similar notation to the Einstein summation convention to flexibly specify operations on Julia Arrays, similar to numpy's einsum function (but more flexible!).

For example, basic matrix multiplication can be implemented as:

@einsum A[i, j] := B[i, k] * C[k, j]

To install: Pkg.add("Einsum"), or else pkg> add Einsum after pressing ] on Julia 0.7 and later.

Documentation

Basics

If the destination array is preallocated, then use =:

A = ones(5, 6, 7) # preallocated space
X = randn(5, 2)
Y = randn(6, 2)
Z = randn(7, 2)

# Store the result in A, overwriting as necessary:
@einsum A[i, j, k] = X[i, r] * Y[j, r] * Z[k, r]

If destination is not preallocated, then use := to automatically create a new array for the result:

X = randn(5, 2)
Y = randn(6, 2)
Z = randn(7, 2)

# Create new array B with appropriate dimensions:
@einsum B[i, j, k] := X[i, r] * Y[j, r] * Z[k, r]

What happens under the hood

To execute an expression, @einsum uses Julia's metaprogramming capabilities to generate and execute a series of nested for loops. To see exactly what is generated, use @macroexpand (or @expand from MacroTools.jl):

@macroexpand @einsum A[i, j] := B[i, k] * C[k, j]

The output will look much like the following (note that we "sum out" over the index k, since it only occurs multiple times on the right hand side of the equation):

# determine type
T = promote_type(eltype(B), eltype(C))

# allocate new array
A = Array{T}(undef, size(B))

# check dimensions
@assert size(B, 2) == size(C, 2)

# main loop
@inbounds begin # skip bounds-checking for speed
    for i = 1:size(B, 1), j = 1:size(C, 2)
        s = zero(T)
        for k = 1:size(B,2)
            s += B[i, k] * C[k, j]
        end
        A[i, j] = s
    end
end

The actual generated code is a bit more verbose (and not neatly commented/formatted), and will take care to use the right types and keep hygienic.

You can also use updating assignment operators for preallocated arrays. E.g., @einsum A[i, j, k] *= X[i, r] * Y[j, r] * Z[k, r] will produce something like

for k = 1:size(A, 3)
    for j = 1:size(A, 2)
        for i = 1:size(A, 1)
            s = 0.0
            for r = 1:size(X, 2)
                s += X[i, r] * Y[j, r] * Z[k, r]
            end
            # Difference: here, the updating form is used:
            A[i, j, k] *= s
        end
    end
end

Rules for indexing variables

Indices that show up on the right-hand-side but not the left-hand-side are summed over
Arrays which share an index must be of the same size, in those dimensions

@einsum iterates over the extent of the right-hand-side indices. For example, the following code allocates an array A that is the same size as B and copies its data into A:

@einsum A[i,  j] := B[i, j]  # same as A = copy(B)

If an index appears on the right-hand-side, but does not appear on the left-hand-side, then this variable is summed over. For example, the following code allocates A to be size(B, 1) and sums over the rows of B:

@einsum A[i] := B[i, j]  # same as A = dropdims(sum(B, dims=2), dims=2)

If an index variable appears multiple times on the right-hand-side, then it is asserted that the sizes of these dimensions match. For example,

@einsum A[i] := B[i, j] * C[j]

will check that the second dimension of B matches the first dimension of C in length. In particular it is equivalent to the following code:

A = zeros(size(B, 1))
@assert size(B, 2) == size(C, 1)

for i = 1:size(B, 1), j = 1:size(B, 2)
    A[i] += B[i, j] * C[j]
end

So an error will be thrown if the specified dimensions of B and C don't match.

Offset indexing

@einsum also allows offsets on the right-hand-side, the range over which i is summed is then restricted:

@einsum A[i] = B[i - 5]

`@vielsum`

This variant of @einsum will run multi-threaded on the outermost loop. For this to be fast, the code must not introduce temporaries like s = 0 in the example above. Thus for example @expand @vielsum A[i,j,k] = X[i,r]*Y[j,r]*Z[k,r] results in something equivalent to @expand-ing the following:

Threads.@threads for k = 1:size(A,3)
    for j = 1:size(A,2)
        for i = 1:size(A,1)
            A[i,j,k] = 0.0
            for r = 1:size(X,2)
                A[i,j,k] += X[i,r] * Y[j,r] * Z[k,r]
            end
        end
    end
end

For this to be useful, you will need to set an environment variable before starting Julia, such as export JULIA_NUM_THREADS=4. See the manual for details, and note that this is somewhat experimental. This will not always be faster, especially for small arrays, as there is some overhead to dividing up the work.

At present you cannot use updating assignment operators like += with this macro (only = or :=) and you cannot assign to a scalar left-hand-side (only an array). These limitations prevent different threads from writing to the same memory at the same time.

`@einsimd`

This is a variant of @einsum which will put @simd in front of the innermost loop, encouraging Julia's compiler vectorize this loop (although it may do so already). For example @einsimd A[i,j,k] = X[i,r]*Y[j,r]*Z[k,r] will result in approximately

@inbounds for k = 1:size(A,3)
    for j = 1:size(A,2)
        for i = 1:size(A,1)
            s = 0.0
            @simd for r = 1:size(X,2)
                s += X[i,r] * Y[j,r] * Z[k,r]
            end
            A[i,j,k] = s
        end
    end
end

Whether this is a good idea or not you have to decide and benchmark for yourself in every specific case. @simd makes sense for certain kinds of operations; make yourself familiar with its documentation and the inner workings of it in general.

Other functionality

The @einsum macro can implement function calls within the nested for loop structure. For example, consider transposing a block matrix:

z = Any[rand(2,2) for i=1:2, j=1:2]
@einsum t[i,j] := transpose(z[j,i])

This produces a for loop structure with a transpose function call in the middle. Approximately:

for j = 1:size(z,1)
    for i = 1:size(z,2)
        t[i,j] = transpose(z[j,i])
    end
end

This will work as long the function calls are outside the array names. Again, you can use @macroexpand to see the exact code that is generated.

The output need not be an array. But note that on Julia 0.7 and 1.0, the rules for evaluating in global scope (for example at the REPL prompt) are a little different -- see this package for instance (which is loaded in IJulia notebooks). To get the same behavior as you would have inside a function, you use a let block:

p = rand(5); p .= p ./ sum(p)
let
    global S
    @einsum S := - p[i] * log(p[i])
end

Or perhaps clearer, explicitly define a function:

m(pq, p, q) = @einsum I := pq[i,j] * log(pq[i,j] / p[i] / q[j])

m(rand(5,10), rand(5), rand(10))

Related Packages:

TensorOperations.jl has less flexible syntax (only allowing strict Einstein convention contractions), but can produce much more efficient code. Instead of generating “naive” loops, it transforms the expressions into optimized contraction functions and takes care to use a good (cache-friendly) order for the looping.
ArrayMeta.jl aims to produce cache-friendly operations for more general loops (but is Julia 0.6 only).