PaStiX Handbook: Low-rank SVD kernels

pastix_fixdbl_t core_cge2lr_svd	(	int	use_reltol,
		pastix_fixdbl_t	tol,
		pastix_int_t	rklimit,
		pastix_int_t	m,
		pastix_int_t	n,
		const void *	Avoid,
		pastix_int_t	lda,
		pastix_lrblock_t *	Alr
	)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters

[in]	use_reltol	TODO
[in]	tol	The tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]	rklimit	The maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]	m	Number of rows of the matrix A, and of the low rank matrix Alr.
[in]	n	Number of columns of the matrix A, and of the low rank matrix Alr.
[in]	Avoid	The matrix of dimension lda-by-n that needs to be compressed
[in]	lda	The leading dimension of the matrix A. lda >= max(1, m)
[out]	Alr	The low rank matrix structure that will store the low rank representation of A

Returns: TODO

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 128 of file core_cgelrops_svd.c.

References core_clralloc(), core_clrsze(), core_get_rklimit, core_slassq(), pastix_int_t, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

pastix_fixdbl_t core_crradd_svd	(	const pastix_lr_t *	lowrank,
		pastix_trans_t	transA1,
		const void *	alphaptr,
		pastix_int_t	M1,
		pastix_int_t	N1,
		const pastix_lrblock_t *	A,
		pastix_int_t	M2,
		pastix_int_t	N2,
		pastix_lrblock_t *	B,
		pastix_int_t	offx,
		pastix_int_t	offy
	)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters

[in]	lowrank	The structure with low-rank parameters.
[in]	transA1	PastixNoTrans: No transpose, op( A ) = A; PastixTrans: Transpose, op( A ) = A';
[in]	alphaptr	alpha * A is add to B
[in]	M1	The number of rows of the matrix A.
[in]	N1	The number of columns of the matrix A.
[in]	A	The low-rank representation of the matrix A.
[in]	M2	The number of rows of the matrix B.
[in]	N2	The number of columns of the matrix B.
[in]	B	The low-rank representation of the matrix B.
[in]	offx	The horizontal offset of A with respect to B.
[in]	offy	The vertical offset of A with respect to B.

Returns: The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix \( \alpha A + B \). In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

float norm = LAPACKE_clange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion.

\[ || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b|| \]

Definition at line 356 of file core_cgelrops_svd.c.

References core_clrconcatenate_u(), core_clrconcatenate_v(), core_clrcpy(), core_clrnrm(), pastix_int_t, PastixFrobeniusNorm, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, and pastix_lr_s::use_reltol.

pastix_fixdbl_t core_sge2lr_svd	(	int	use_reltol,
		pastix_fixdbl_t	tol,
		pastix_int_t	rklimit,
		pastix_int_t	m,
		pastix_int_t	n,
		const void *	Avoid,
		pastix_int_t	lda,
		pastix_lrblock_t *	Alr
	)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters

[in]	use_reltol	TODO
[in]	tol	The tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]	rklimit	The maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]	m	Number of rows of the matrix A, and of the low rank matrix Alr.
[in]	n	Number of columns of the matrix A, and of the low rank matrix Alr.
[in]	Avoid	The matrix of dimension lda-by-n that needs to be compressed
[in]	lda	The leading dimension of the matrix A. lda >= max(1, m)
[out]	Alr	The low rank matrix structure that will store the low rank representation of A

Returns: TODO

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 128 of file core_sgelrops_svd.c.

References core_get_rklimit, core_slassq(), core_slralloc(), core_slrsze(), pastix_int_t, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

pastix_fixdbl_t core_srradd_svd	(	const pastix_lr_t *	lowrank,
		pastix_trans_t	transA1,
		const void *	alphaptr,
		pastix_int_t	M1,
		pastix_int_t	N1,
		const pastix_lrblock_t *	A,
		pastix_int_t	M2,
		pastix_int_t	N2,
		pastix_lrblock_t *	B,
		pastix_int_t	offx,
		pastix_int_t	offy
	)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters

[in]	lowrank	The structure with low-rank parameters.
[in]	transA1	PastixNoTrans: No transpose, op( A ) = A; PastixTrans: Transpose, op( A ) = A';
[in]	alphaptr	alpha * A is add to B
[in]	M1	The number of rows of the matrix A.
[in]	N1	The number of columns of the matrix A.
[in]	A	The low-rank representation of the matrix A.
[in]	M2	The number of rows of the matrix B.
[in]	N2	The number of columns of the matrix B.
[in]	B	The low-rank representation of the matrix B.
[in]	offx	The horizontal offset of A with respect to B.
[in]	offy	The vertical offset of A with respect to B.

Returns: The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix \( \alpha A + B \). In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

float norm = LAPACKE_slange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion.

\[ || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b|| \]

Definition at line 356 of file core_sgelrops_svd.c.

References core_slrconcatenate_u(), core_slrconcatenate_v(), core_slrcpy(), core_slrnrm(), pastix_int_t, PastixFrobeniusNorm, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, and pastix_lr_s::use_reltol.

pastix_fixdbl_t core_dge2lr_svd	(	int	use_reltol,
		pastix_fixdbl_t	tol,
		pastix_int_t	rklimit,
		pastix_int_t	m,
		pastix_int_t	n,
		const void *	Avoid,
		pastix_int_t	lda,
		pastix_lrblock_t *	Alr
	)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters

[in]	use_reltol	TODO
[in]	tol	The tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]	rklimit	The maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]	m	Number of rows of the matrix A, and of the low rank matrix Alr.
[in]	n	Number of columns of the matrix A, and of the low rank matrix Alr.
[in]	Avoid	The matrix of dimension lda-by-n that needs to be compressed
[in]	lda	The leading dimension of the matrix A. lda >= max(1, m)
[out]	Alr	The low rank matrix structure that will store the low rank representation of A

Returns: TODO

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 128 of file core_dgelrops_svd.c.

References core_dlassq(), core_dlralloc(), core_dlrsze(), core_get_rklimit, pastix_int_t, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

pastix_fixdbl_t core_drradd_svd	(	const pastix_lr_t *	lowrank,
		pastix_trans_t	transA1,
		const void *	alphaptr,
		pastix_int_t	M1,
		pastix_int_t	N1,
		const pastix_lrblock_t *	A,
		pastix_int_t	M2,
		pastix_int_t	N2,
		pastix_lrblock_t *	B,
		pastix_int_t	offx,
		pastix_int_t	offy
	)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters

[in]	lowrank	The structure with low-rank parameters.
[in]	transA1	PastixNoTrans: No transpose, op( A ) = A; PastixTrans: Transpose, op( A ) = A';
[in]	alphaptr	alpha * A is add to B
[in]	M1	The number of rows of the matrix A.
[in]	N1	The number of columns of the matrix A.
[in]	A	The low-rank representation of the matrix A.
[in]	M2	The number of rows of the matrix B.
[in]	N2	The number of columns of the matrix B.
[in]	B	The low-rank representation of the matrix B.
[in]	offx	The horizontal offset of A with respect to B.
[in]	offy	The vertical offset of A with respect to B.

Returns: The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix \( \alpha A + B \). In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

double norm = LAPACKE_dlange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion.

\[ || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b|| \]

Definition at line 356 of file core_dgelrops_svd.c.

References core_dlrconcatenate_u(), core_dlrconcatenate_v(), core_dlrcpy(), core_dlrnrm(), pastix_int_t, PastixFrobeniusNorm, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, and pastix_lr_s::use_reltol.

pastix_fixdbl_t core_zge2lr_svd	(	int	use_reltol,
		pastix_fixdbl_t	tol,
		pastix_int_t	rklimit,
		pastix_int_t	m,
		pastix_int_t	n,
		const void *	Avoid,
		pastix_int_t	lda,
		pastix_lrblock_t *	Alr
	)

Convert a full rank matrix in a low rank matrix, using SVD.

Parameters

[in]	use_reltol	TODO
[in]	tol	The tolerance used as a criterion to eliminate information from the full rank matrix. If tol < 0, then we compress up to rklimit. So if rklimit is set to min(m,n), and tol < 0., we get a full representation of the matrix under the form U * V^t.
[in]	rklimit	The maximum rank to store the matrix in low-rank format. If -1, set to min(M, N) / PASTIX_LR_MINRATIO.
[in]	m	Number of rows of the matrix A, and of the low rank matrix Alr.
[in]	n	Number of columns of the matrix A, and of the low rank matrix Alr.
[in]	Avoid	The matrix of dimension lda-by-n that needs to be compressed
[in]	lda	The leading dimension of the matrix A. lda >= max(1, m)
[out]	Alr	The low rank matrix structure that will store the low rank representation of A

Returns: TODO

If maximum rank is 0, then either the matrix norm is below the tolerance, and we can return a null rank matrix, or it is not and we need to return a full rank matrix.

Definition at line 128 of file core_zgelrops_svd.c.

References core_dlassq(), core_get_rklimit, core_zlralloc(), core_zlrsze(), pastix_int_t, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lrblock_s::u, and pastix_lrblock_s::v.

pastix_fixdbl_t core_zrradd_svd	(	const pastix_lr_t *	lowrank,
		pastix_trans_t	transA1,
		const void *	alphaptr,
		pastix_int_t	M1,
		pastix_int_t	N1,
		const pastix_lrblock_t *	A,
		pastix_int_t	M2,
		pastix_int_t	N2,
		pastix_lrblock_t *	B,
		pastix_int_t	offx,
		pastix_int_t	offy
	)

Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T.

u2v2^T - u1v1^T = (u2 u1) (v2 v1)^T Compute QR decomposition of (u2 u1) = Q1 R1 Compute QR decomposition of (v2 v1) = Q2 R2 Compute SVD of R1 R2^T = u sigma v^T Final solution is (Q1 u sigma^[1/2]) (Q2 v sigma^[1/2])^T

Parameters

[in]	lowrank	The structure with low-rank parameters.
[in]	transA1	PastixNoTrans: No transpose, op( A ) = A; PastixTrans: Transpose, op( A ) = A';
[in]	alphaptr	alpha * A is add to B
[in]	M1	The number of rows of the matrix A.
[in]	N1	The number of columns of the matrix A.
[in]	A	The low-rank representation of the matrix A.
[in]	M2	The number of rows of the matrix B.
[in]	N2	The number of columns of the matrix B.
[in]	B	The low-rank representation of the matrix B.
[in]	offx	The horizontal offset of A with respect to B.
[in]	offy	The vertical offset of A with respect to B.

Returns: The new rank of u2 v2^T or -1 if ranks are too large for recompression

In relative tolerance, we can choose two solutions: 1) The first one, more conservative, is to compress relatively to the norm of the final matrix \( \alpha A + B \). In this kernel, we exploit the fact that the previous computed product contains all the information of the final matrix to do it as follow:

double norm = LAPACKE_zlange_work( LAPACK_COL_MAJOR, 'f', rank, rank, R, rank, NULL ); tol = tol * norm;

2) The second solution, less conservative, will allow to reduce the rank more efficiently. Since A and B have been compressed relatively to their respective norms, there is no reason to compress the sum relatively to its own norm, but it is more reasonable to compress it relatively to the norm of A and B. For example, A-B would be full with the first criterion, and rank null with the second. Note that here, we can only have an estimation that once again reduces the conservation of the criterion.

\[ || \alpha A + B || <= |\alpha| ||A|| + ||B|| <= |\alpha| ||U_aV_a|| + ||U_bV_b|| \]

Definition at line 356 of file core_zgelrops_svd.c.

References core_zlrconcatenate_u(), core_zlrconcatenate_v(), core_zlrcpy(), core_zlrnrm(), pastix_int_t, PastixFrobeniusNorm, PastixNoTrans, PastixRight, pastix_lrblock_s::rk, pastix_lrblock_s::rkmax, pastix_lr_s::tolerance, and pastix_lr_s::use_reltol.

PastixComplex32 SVD low-rank kernels
pastix_fixdbl_t	core_cge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void Avoid, pastix_int_t lda, pastix_lrblock_t Alr)
	Convert a full rank matrix in a low rank matrix, using SVD. More...

pastix_fixdbl_t	core_crradd_svd (const pastix_lr_t lowrank, pastix_trans_t transA1, const void alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t B, pastix_int_t offx, pastix_int_t offy)
	Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...

PastixFloat SVD low-rank kernels
pastix_fixdbl_t	core_sge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void Avoid, pastix_int_t lda, pastix_lrblock_t Alr)
	Convert a full rank matrix in a low rank matrix, using SVD. More...

pastix_fixdbl_t	core_srradd_svd (const pastix_lr_t lowrank, pastix_trans_t transA1, const void alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t B, pastix_int_t offx, pastix_int_t offy)
	Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...

PastixDouble SVD low-rank kernels
pastix_fixdbl_t	core_dge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void Avoid, pastix_int_t lda, pastix_lrblock_t Alr)
	Convert a full rank matrix in a low rank matrix, using SVD. More...

pastix_fixdbl_t	core_drradd_svd (const pastix_lr_t lowrank, pastix_trans_t transA1, const void alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t B, pastix_int_t offx, pastix_int_t offy)
	Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...

PastixComplex64 SVD low-rank kernels
pastix_fixdbl_t	core_zge2lr_svd (int use_reltol, pastix_fixdbl_t tol, pastix_int_t rklimit, pastix_int_t m, pastix_int_t n, const void Avoid, pastix_int_t lda, pastix_lrblock_t Alr)
	Convert a full rank matrix in a low rank matrix, using SVD. More...

pastix_fixdbl_t	core_zrradd_svd (const pastix_lr_t lowrank, pastix_trans_t transA1, const void alphaptr, pastix_int_t M1, pastix_int_t N1, const pastix_lrblock_t A, pastix_int_t M2, pastix_int_t N2, pastix_lrblock_t B, pastix_int_t offx, pastix_int_t offy)
	Add two LR structures A=(-u1) v1^T and B=u2 v2^T into u2 v2^T. More...

PastixComplex32 SVD low-rank kernels

PastixFloat SVD low-rank kernels

PastixDouble SVD low-rank kernels

PastixComplex64 SVD low-rank kernels

Detailed Description

Function Documentation

◆ core_cge2lr_svd()

◆ core_crradd_svd()

◆ core_sge2lr_svd()

◆ core_srradd_svd()

◆ core_dge2lr_svd()

◆ core_drradd_svd()

◆ core_zge2lr_svd()

◆ core_zrradd_svd()