LLᵀ and LLᴴ
LinearAlgebra.cholesky — Method
solver = cholesky(A::CuSparseMatrixCSR{T,Cint}; view::Char='F')
Compute the LLᴴ factorization of a sparse matrix A on an NVIDIA GPU. The type T can be Float32, Float64, ComplexF32 or ComplexF64.
Input argument
* A: a sparse Hermitian positive definite matrix stored in the CuSparseMatrixCSR format.
Keyword argument
* view: a character that specifies which triangle of the sparse matrix is provided. Possible options are 'L' for the lower triangle, 'U' for the upper triangle, and 'F' for the full matrix.
Output argument
* solver: an opaque structure CudssSolver that stores the factors of the LLᴴ decomposition.
LinearAlgebra.cholesky! — Method
solver = cholesky!(solver::CudssSolver{T}, A::CuSparseMatrixCSR{T,Cint})
Compute the LLᴴ factorization of a sparse matrix A on an NVIDIA GPU, reusing the symbolic factorization stored in solver. The type T can be Float32, Float64, ComplexF32 or ComplexF64.
using CUDA, CUDA.CUSPARSE
using CUDSS
using LinearAlgebra
using SparseArrays
T = ComplexF64
R = real(T)
n = 100
p = 5
A_cpu = sprand(T, n, n, 0.01)
A_cpu = A_cpu * A_cpu' + I
B_cpu = rand(T, n, p)
A_gpu = CuSparseMatrixCSR(A_cpu |> triu)
B_gpu = CuMatrix(B_cpu)
X_gpu = similar(B_gpu)
F = cholesky(A_gpu, view='U')
X_gpu = F \ B_gpu
R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu) * X_gpu
norm(R_gpu)
# In-place LLᴴ refactorization after updating the matrix
d_gpu = rand(R, n) |> CuVector
A_gpu = A_gpu + Diagonal(d_gpu)
cholesky!(F, A_gpu)
C_cpu = rand(T, n, p)
C_gpu = CuMatrix(C_cpu)
ldiv!(X_gpu, F, C_gpu)
R_gpu = C_gpu - ( CuSparseMatrixCSR(A_cpu) + Diagonal(d_gpu) ) * X_gpu
norm(R_gpu)
If we only store one triangle of A_gpu, we can also use the wrappers Symmetric and Hermitian instead of the keyword argument view in cholesky. For real matrices both wrappers are allowed, but only Hermitian can be used for complex matrices.
H_gpu = Hermitian(A_gpu, :U)
F = cholesky(H_gpu)
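The factorization obtained from the Hermitian wrapper is used exactly like the one returned by cholesky with the view keyword. As a minimal sketch, assuming B_gpu and X_gpu from the example above are still in scope:

```julia
# Solve A X = B with the factorization of the wrapped matrix
X_gpu = F \ B_gpu

# Check the residual against the full matrix on the CPU side
R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu + Diagonal(Vector(d_gpu))) * X_gpu
norm(R_gpu)
```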
LDLᵀ and LDLᴴ
LinearAlgebra.ldlt — Method
solver = ldlt(A::CuSparseMatrixCSR{T,Cint}; view::Char='F')
Compute the LDLᴴ factorization of a sparse matrix A on an NVIDIA GPU. The type T can be Float32, Float64, ComplexF32 or ComplexF64.
Input argument
* A: a sparse Hermitian matrix stored in the CuSparseMatrixCSR format.
Keyword argument
* view: a character that specifies which triangle of the sparse matrix is provided. Possible options are 'L' for the lower triangle, 'U' for the upper triangle, and 'F' for the full matrix.
Output argument
* solver: an opaque structure CudssSolver that stores the factors of the LDLᴴ decomposition.
LinearAlgebra.ldlt! — Method
solver = ldlt!(solver::CudssSolver{T}, A::CuSparseMatrixCSR{T,Cint})
Compute the LDLᴴ factorization of a sparse matrix A on an NVIDIA GPU, reusing the symbolic factorization stored in solver. The type T can be Float32, Float64, ComplexF32 or ComplexF64.
using CUDA, CUDA.CUSPARSE
using CUDSS
using LinearAlgebra
using SparseArrays
T = Float64
R = real(T)
n = 100
p = 5
A_cpu = sprand(T, n, n, 0.05) + I
A_cpu = A_cpu + A_cpu'
B_cpu = rand(T, n, p)
A_gpu = CuSparseMatrixCSR(A_cpu |> tril)
B_gpu = CuMatrix(B_cpu)
X_gpu = similar(B_gpu)
F = ldlt(A_gpu, view='L')
X_gpu = F \ B_gpu
R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu) * X_gpu
norm(R_gpu)
# In-place LDLᵀ refactorization after updating the matrix
d_gpu = rand(R, n) |> CuVector
A_gpu = A_gpu + Diagonal(d_gpu)
ldlt!(F, A_gpu)
C_cpu = rand(T, n, p)
C_gpu = CuMatrix(C_cpu)
ldiv!(X_gpu, F, C_gpu)
R_gpu = C_gpu - ( CuSparseMatrixCSR(A_cpu) + Diagonal(d_gpu) ) * X_gpu
norm(R_gpu)
If we only store one triangle of A_gpu, we can also use the wrappers Symmetric and Hermitian instead of the keyword argument view in ldlt. For real matrices both wrappers are allowed, but only Hermitian can be used for complex matrices.
S_gpu = Symmetric(A_gpu, :L)
F = ldlt(S_gpu)
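As with the Hermitian wrapper in the cholesky example, the resulting factorization supports the same solve interface. A short sketch, assuming X_gpu, B_gpu, A_cpu and d_gpu from the example above are still in scope:

```julia
# In-place solve with the factorization of the wrapped matrix
ldiv!(X_gpu, F, B_gpu)

# Residual against the full (updated) matrix
R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu + Diagonal(Vector(d_gpu))) * X_gpu
norm(R_gpu)
```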
LU
LinearAlgebra.lu — Method
solver = lu(A::CuSparseMatrixCSR{T,Cint})
Compute the LU factorization of a sparse matrix A on an NVIDIA GPU. The type T can be Float32, Float64, ComplexF32 or ComplexF64.
Input argument
* A: a sparse square matrix stored in the CuSparseMatrixCSR format.
Output argument
* solver: an opaque structure CudssSolver that stores the factors of the LU decomposition.
LinearAlgebra.lu! — Method
solver = lu!(solver::CudssSolver{T}, A::CuSparseMatrixCSR{T,Cint})
Compute the LU factorization of a sparse matrix A on an NVIDIA GPU, reusing the symbolic factorization stored in solver. The type T can be Float32, Float64, ComplexF32 or ComplexF64.
using CUDA, CUDA.CUSPARSE
using CUDSS
using LinearAlgebra
using SparseArrays
T = Float64
n = 100
A_cpu = sprand(T, n, n, 0.05) + I
b_cpu = rand(T, n)
A_gpu = CuSparseMatrixCSR(A_cpu)
b_gpu = CuVector(b_cpu)
F = lu(A_gpu)
x_gpu = F \ b_gpu
r_gpu = b_gpu - A_gpu * x_gpu
norm(r_gpu)
# In-place LU refactorization after updating the matrix
d_gpu = rand(T, n) |> CuVector
A_gpu = A_gpu + Diagonal(d_gpu)
lu!(F, A_gpu)
c_cpu = rand(T, n)
c_gpu = CuVector(c_cpu)
ldiv!(x_gpu, F, c_gpu)
r_gpu = c_gpu - A_gpu * x_gpu
norm(r_gpu)