PaStiX Handbook  6.2.1

PastixFloat SVD low-rank kernels

pastix_fixdbl_t core_sge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void *Avoid, pastix_int_t lda, pastix_lrblock_t *Alr)
 Convert a full rank matrix in a low rank matrix, using SVD. More...
 
pastix_fixdbl_t core_srradd_svd (const pastix_lr_t *lowrank, pastix_trans_t transA1, const void *alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t *A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t *B, pastix_int_t offx, pastix_int_t offy)
 Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...
 

PastixDouble SVD low-rank kernels

pastix_fixdbl_t core_dge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void *Avoid, pastix_int_t lda, pastix_lrblock_t *Alr)
 Convert a full rank matrix in a low rank matrix, using SVD. More...
 
pastix_fixdbl_t core_drradd_svd (const pastix_lr_t *lowrank, pastix_trans_t transA1, const void *alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t *A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t *B, pastix_int_t offx, pastix_int_t offy)
 Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...
 

PastixComplex32 SVD low-rank kernels

pastix_fixdbl_t core_cge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void *Avoid, pastix_int_t lda, pastix_lrblock_t *Alr)
 Convert a full rank matrix in a low rank matrix, using SVD. More...
 
pastix_fixdbl_t core_crradd_svd (const pastix_lr_t *lowrank, pastix_trans_t transA1, const void *alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t *A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t *B, pastix_int_t offx, pastix_int_t offy)
 Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...
 

PastixComplex64 SVD low-rank kernels

pastix_fixdbl_t core_zge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void *Avoid, pastix_int_t lda, pastix_lrblock_t *Alr)
 Convert a full rank matrix in a low rank matrix, using SVD. More...
 
pastix_fixdbl_t core_zrradd_svd (const pastix_lr_t *lowrank, pastix_trans_t transA1, const void *alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t *A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t *B, pastix_int_t offx, pastix_int_t offy)
 Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...
 

Detailed Description

This is the SVD implementation of the low-rank kernels based on the LAPACK GESVD function.

Function Documentation

◆ core_sge2lr_svd()

pastix_fixdbl_t core_sge2lr_svd ( int  use_reltol,
pastix_fixdbl_t  tol,
pastix_int_t  rklimit,
pastix_int_t  m,
pastix_int_t  n,
const void *  Avoid,
pastix_int_t  lda,
pastix_lrblock_t Alr 
)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters
[in]tolThe tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]rklimitThe maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]mNumber of rows of the matrix A, and of the low rank matrix Alr.
[in]nNumber of columns of the matrix A, and of the low rank matrix Alr.
[in]AThe matrix of dimension lda-by-n that needs to be compressed
[in]ldaThe leading dimension of the matrix A. lda >= max(1, m)
[out]AlrThe low rank matrix structure that will store the low rank representation of A

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 116 of file core_sgelrops_svd.c.

References core_get_rklimit, core_slralloc(), core_slrsze(), pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

◆ core_srradd_svd()

pastix_fixdbl_t core_srradd_svd ( const pastix_lr_t lowrank,
pastix_trans_t  transA1,
const void *  alphaptr,
pastix_int_t  M1,
pastix_int_t  N1,
const pastix_lrblock_t A,
pastix_int_t  M2,
pastix_int_t  N2,
pastix_lrblock_t B,
pastix_int_t  offx,
pastix_int_t  offy 
)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters
[in]lowrankThe structure with low-rank parameters.
[in]transA1
  • PastixNoTrans: No transpose, op( A ) = A;
  • PastixTrans: Transpose, op( A ) = A';
[in]alphaalpha * A is add to B
[in]M1The number of rows of the matrix A.
[in]N1The number of columns of the matrix A.
[in]AThe low-rank representation of the matrix A.
[in]M2The number of rows of the matrix B.
[in]N2The number of columns of the matrix B.
[in]BThe low-rank representation of the matrix B.
[in]offxThe horizontal offset of A with respect to B.
[in]offyThe vertical offset of A with respect to B.
Returns
The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix [ \alpha A + B ]. In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

float norm = LAPACKE_slange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion. || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b||

Definition at line 331 of file core_sgelrops_svd.c.

References core_get_rklimit, core_sgeadd(), core_slralloc(), core_slrconcatenate_u(), core_slrconcatenate_v(), core_slrcpy(), core_slrfree(), core_slrnrm(), core_slrsze(), PastixFrobeniusNorm, PastixLeft, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, pastix_lrblock_s::u, pastix_lr_s::use_reltol, and pastix_lrblock_s::v.

◆ core_dge2lr_svd()

pastix_fixdbl_t core_dge2lr_svd ( int  use_reltol,
pastix_fixdbl_t  tol,
pastix_int_t  rklimit,
pastix_int_t  m,
pastix_int_t  n,
const void *  Avoid,
pastix_int_t  lda,
pastix_lrblock_t Alr 
)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters
[in]tolThe tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]rklimitThe maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]mNumber of rows of the matrix A, and of the low rank matrix Alr.
[in]nNumber of columns of the matrix A, and of the low rank matrix Alr.
[in]AThe matrix of dimension lda-by-n that needs to be compressed
[in]ldaThe leading dimension of the matrix A. lda >= max(1, m)
[out]AlrThe low rank matrix structure that will store the low rank representation of A

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 116 of file core_dgelrops_svd.c.

References core_dlralloc(), core_dlrsze(), core_get_rklimit, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

◆ core_drradd_svd()

pastix_fixdbl_t core_drradd_svd ( const pastix_lr_t lowrank,
pastix_trans_t  transA1,
const void *  alphaptr,
pastix_int_t  M1,
pastix_int_t  N1,
const pastix_lrblock_t A,
pastix_int_t  M2,
pastix_int_t  N2,
pastix_lrblock_t B,
pastix_int_t  offx,
pastix_int_t  offy 
)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters
[in]lowrankThe structure with low-rank parameters.
[in]transA1
  • PastixNoTrans: No transpose, op( A ) = A;
  • PastixTrans: Transpose, op( A ) = A';
[in]alphaalpha * A is add to B
[in]M1The number of rows of the matrix A.
[in]N1The number of columns of the matrix A.
[in]AThe low-rank representation of the matrix A.
[in]M2The number of rows of the matrix B.
[in]N2The number of columns of the matrix B.
[in]BThe low-rank representation of the matrix B.
[in]offxThe horizontal offset of A with respect to B.
[in]offyThe vertical offset of A with respect to B.
Returns
The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix [ \alpha A + B ]. In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

double norm = LAPACKE_dlange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion. || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b||

Definition at line 331 of file core_dgelrops_svd.c.

References core_dgeadd(), core_dlralloc(), core_dlrconcatenate_u(), core_dlrconcatenate_v(), core_dlrcpy(), core_dlrfree(), core_dlrnrm(), core_dlrsze(), core_get_rklimit, PastixFrobeniusNorm, PastixLeft, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, pastix_lrblock_s::u, pastix_lr_s::use_reltol, and pastix_lrblock_s::v.

◆ core_cge2lr_svd()

pastix_fixdbl_t core_cge2lr_svd ( int  use_reltol,
pastix_fixdbl_t  tol,
pastix_int_t  rklimit,
pastix_int_t  m,
pastix_int_t  n,
const void *  Avoid,
pastix_int_t  lda,
pastix_lrblock_t Alr 
)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters
[in]tolThe tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]rklimitThe maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]mNumber of rows of the matrix A, and of the low rank matrix Alr.
[in]nNumber of columns of the matrix A, and of the low rank matrix Alr.
[in]AThe matrix of dimension lda-by-n that needs to be compressed
[in]ldaThe leading dimension of the matrix A. lda >= max(1, m)
[out]AlrThe low rank matrix structure that will store the low rank representation of A

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 116 of file core_cgelrops_svd.c.

References core_clralloc(), core_clrsze(), core_get_rklimit, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

◆ core_crradd_svd()

pastix_fixdbl_t core_crradd_svd ( const pastix_lr_t lowrank,
pastix_trans_t  transA1,
const void *  alphaptr,
pastix_int_t  M1,
pastix_int_t  N1,
const pastix_lrblock_t A,
pastix_int_t  M2,
pastix_int_t  N2,
pastix_lrblock_t B,
pastix_int_t  offx,
pastix_int_t  offy 
)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters
[in]lowrankThe structure with low-rank parameters.
[in]transA1
  • PastixNoTrans: No transpose, op( A ) = A;
  • PastixTrans: Transpose, op( A ) = A';
[in]alphaalpha * A is add to B
[in]M1The number of rows of the matrix A.
[in]N1The number of columns of the matrix A.
[in]AThe low-rank representation of the matrix A.
[in]M2The number of rows of the matrix B.
[in]N2The number of columns of the matrix B.
[in]BThe low-rank representation of the matrix B.
[in]offxThe horizontal offset of A with respect to B.
[in]offyThe vertical offset of A with respect to B.
Returns
The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix [ \alpha A + B ]. In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

float norm = LAPACKE_clange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion. || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b||

Definition at line 331 of file core_cgelrops_svd.c.

References core_cgeadd(), core_clralloc(), core_clrconcatenate_u(), core_clrconcatenate_v(), core_clrcpy(), core_clrfree(), core_clrnrm(), core_clrsze(), core_get_rklimit, PastixFrobeniusNorm, PastixLeft, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, pastix_lrblock_s::u, pastix_lr_s::use_reltol, and pastix_lrblock_s::v.

◆ core_zge2lr_svd()

pastix_fixdbl_t core_zge2lr_svd ( int  use_reltol,
pastix_fixdbl_t  tol,
pastix_int_t  rklimit,
pastix_int_t  m,
pastix_int_t  n,
const void *  Avoid,
pastix_int_t  lda,
pastix_lrblock_t Alr 
)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters
[in]tolThe tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]rklimitThe maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]mNumber of rows of the matrix A, and of the low rank matrix Alr.
[in]nNumber of columns of the matrix A, and of the low rank matrix Alr.
[in]AThe matrix of dimension lda-by-n that needs to be compressed
[in]ldaThe leading dimension of the matrix A. lda >= max(1, m)
[out]AlrThe low rank matrix structure that will store the low rank representation of A

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 116 of file core_zgelrops_svd.c.

References core_get_rklimit, core_zlralloc(), core_zlrsze(), pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

◆ core_zrradd_svd()

pastix_fixdbl_t core_zrradd_svd ( const pastix_lr_t lowrank,
pastix_trans_t  transA1,
const void *  alphaptr,
pastix_int_t  M1,
pastix_int_t  N1,
const pastix_lrblock_t A,
pastix_int_t  M2,
pastix_int_t  N2,
pastix_lrblock_t B,
pastix_int_t  offx,
pastix_int_t  offy 
)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters
[in]lowrankThe structure with low-rank parameters.
[in]transA1
  • PastixNoTrans: No transpose, op( A ) = A;
  • PastixTrans: Transpose, op( A ) = A';
[in]alphaalpha * A is add to B
[in]M1The number of rows of the matrix A.
[in]N1The number of columns of the matrix A.
[in]AThe low-rank representation of the matrix A.
[in]M2The number of rows of the matrix B.
[in]N2The number of columns of the matrix B.
[in]BThe low-rank representation of the matrix B.
[in]offxThe horizontal offset of A with respect to B.
[in]offyThe vertical offset of A with respect to B.
Returns
The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix [ \alpha A + B ]. In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

double norm = LAPACKE_zlange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion. || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b||

Definition at line 331 of file core_zgelrops_svd.c.

References core_get_rklimit, core_zgeadd(), core_zlralloc(), core_zlrconcatenate_u(), core_zlrconcatenate_v(), core_zlrcpy(), core_zlrfree(), core_zlrnrm(), core_zlrsze(), PastixFrobeniusNorm, PastixLeft, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, pastix_lrblock_s::u, pastix_lr_s::use_reltol, and pastix_lrblock_s::v.