Batched operations in Julia.

Batched Arrays

  • BatchedArray{T, NI, N}: a general container that treats the last N - NI dimensions as batch dimensions
  • BatchedMatrix, BatchedVector
  • BatchedTranspose, BatchedAdjoint, BatchedUniformScaling: batched versions of Transpose, Adjoint, and UniformScaling from the LinearAlgebra stdlib
  • for CUDA, the type aliases CuBatchedArray, CuBatchedMatrix, and CuBatchedVector are defined
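
The dimension convention can be sketched with a small wrapper type. This is only an illustration under assumed names: SketchBatchedArray, SketchBatchedMatrix, inner_size, and batch_size are hypothetical and are not the package's actual definitions.

```julia
# Hypothetical sketch: T is the element type, NI the number of inner
# (non-batch) dimensions, and N the total number of dimensions.
struct SketchBatchedArray{T, NI, N}
    data::Array{T, N}
end

# Wrap `data`, treating its first `ni` dimensions as inner dimensions.
function SketchBatchedArray(data::Array{T, N}, ni::Int) where {T, N}
    0 <= ni <= N || throw(ArgumentError("ni must be between 0 and $N"))
    return SketchBatchedArray{T, ni, N}(data)
end

# A batched matrix is the special case with two inner dimensions.
const SketchBatchedMatrix{T, N} = SketchBatchedArray{T, 2, N}

# Sizes of the first NI (inner) dimensions.
inner_size(A::SketchBatchedArray{T, NI, N}) where {T, NI, N} = size(A.data)[1:NI]

# Sizes of the last N - NI (batch) dimensions.
batch_size(A::SketchBatchedArray{T, NI, N}) where {T, NI, N} = size(A.data)[NI+1:N]

A = SketchBatchedArray(rand(2, 3, 4, 5), 2)  # 2x3 matrices arranged in a 4x5 batch
```

Here inner_size(A) gives the shape of each matrix and batch_size(A) gives the shape of the batch.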

Supported routines

(CPU): CPU implementations are simple wrappers around for-loops.

  • batched gemm: batched_gemm
  • batched tr: batched_tr
  • batched transpose: transpose(::AbstractArray{T, 3})
  • batched adjoint
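
As a rough idea of what a for-loop wrapper looks like, here is a minimal sketch that operates on plain 3-D arrays. The names sketch_batched_gemm and sketch_batched_tr are hypothetical, and the package's own batched_gemm and batched_tr may differ in signature and behavior.

```julia
using LinearAlgebra

# Hypothetical for-loop batched gemm: C[:, :, k] = A[:, :, k] * B[:, :, k].
function sketch_batched_gemm(A::AbstractArray{T, 3}, B::AbstractArray{T, 3}) where {T}
    size(A, 2) == size(B, 1) || throw(DimensionMismatch("inner matrix dimensions differ"))
    size(A, 3) == size(B, 3) || throw(DimensionMismatch("batch sizes differ"))
    C = similar(A, size(A, 1), size(B, 2), size(A, 3))
    for k in axes(A, 3)
        # Views avoid copying each matrix slice out of the batch.
        mul!(view(C, :, :, k), view(A, :, :, k), view(B, :, :, k))
    end
    return C
end

# Hypothetical for-loop batched trace: one trace per matrix in the batch.
function sketch_batched_tr(A::AbstractArray{T, 3}) where {T}
    return [tr(view(A, :, :, k)) for k in axes(A, 3)]
end

A = rand(2, 3, 4)
B = rand(3, 2, 4)
C = sketch_batched_gemm(A, B)   # four 2x2 products
t = sketch_batched_tr(C)        # four traces
```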

(GPU): GPU implementations will use CUBLAS routines.

  • batched gemm: batched_gemm_strided (a BatchedArray can be assumed to be strided)
  • batched tr
  • batched transpose (same as CPU)
  • batched adjoint (same as CPU)
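
The strided assumption means that consecutive matrices in a batch sit a fixed number of elements apart in memory, which is the layout that strided-batched cuBLAS routines describe through their stride arguments. The following is a CPU-only sketch of that layout on a plain dense array. It involves no GPU and no package types.

```julia
A = reshape(collect(1.0:24.0), 2, 3, 4)  # four 2x3 matrices stored contiguously

# For a dense column-major array the strides are (1, rows, rows * cols).
s = strides(A)

# The batch stride is the distance, in elements, between consecutive matrices.
batch_stride = s[3]

# Matrix k starts `offset` elements into the underlying memory.
k = 3
offset = (k - 1) * batch_stride
slice_from_offset = reshape(vec(A)[offset+1:offset+batch_stride], 2, 3)
```

Reading batch_stride elements from that offset recovers the same matrix as A[:, :, k], so a single pointer plus one stride is enough to address the whole batch.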


For routines (e.g. gemm), we add the prefix batched_ to the name of the corresponding BLAS or LAPACK routine, and they should only be defined for AbstractArray{T, 3} (batched matrix) or AbstractArray{T, 2} (batched vector).
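
To illustrate the convention for the batched vector case, here is a sketch of a routine restricted to 2-D arrays. The function sketch_batched_dot is hypothetical and carries a sketch_ prefix to make clear that it is not part of the package. It assumes that each column of the 2-D array is one vector of the batch.

```julia
using LinearAlgebra

# Hypothetical routine modelled on the BLAS dot routine.
# A batched vector is a 2-D array whose columns are the individual vectors.
function sketch_batched_dot(x::AbstractArray{T, 2}, y::AbstractArray{T, 2}) where {T}
    size(x) == size(y) || throw(DimensionMismatch("batched vectors must have the same size"))
    return [dot(view(x, :, k), view(y, :, k)) for k in axes(x, 2)]
end

x = [1.0 2.0; 3.0 4.0]   # columns are the vectors [1, 3] and [2, 4]
y = [1.0 1.0; 1.0 1.0]
d = sketch_batched_dot(x, y)   # one dot product per column
```

Because the signature only accepts AbstractArray{T, 2}, passing 3-D arrays raises a MethodError instead of silently computing something else.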

For methods, we simply overload them for a batched array type (e.g. BatchedArray).
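
A minimal sketch of this kind of overloading follows. SketchBatch is a hypothetical stand-in type rather than the package's BatchedArray, and the choice of * and tr as the overloaded methods is an assumption made for illustration.

```julia
using LinearAlgebra

# Hypothetical stand-in for a batched matrix type.
struct SketchBatch{T}
    data::Array{T, 3}   # data[:, :, k] is the k-th matrix
end

# Overload `*` so that it multiplies the two batches matrix by matrix.
function Base.:*(A::SketchBatch{T}, B::SketchBatch{T}) where {T}
    size(A.data, 3) == size(B.data, 3) || throw(DimensionMismatch("batch sizes differ"))
    C = Array{T}(undef, size(A.data, 1), size(B.data, 2), size(A.data, 3))
    for k in axes(C, 3)
        C[:, :, k] = A.data[:, :, k] * B.data[:, :, k]
    end
    return SketchBatch(C)
end

# Overload `tr` so that it returns one trace per matrix in the batch.
LinearAlgebra.tr(A::SketchBatch) = [tr(A.data[:, :, k]) for k in axes(A.data, 3)]

A = SketchBatch(rand(2, 2, 3))
B = SketchBatch(rand(2, 2, 3))
C = A * B        # dispatches to the batched method
traces = tr(C)   # three traces
```

With the overloads in place, calling code can use the same generic function names for single matrices and for batches, and dispatch selects the batched method.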


Apache License Version 2.0