PaStiX Handbook  6.2.1
CSR structure

## Data Structures

struct  fax_csr_s
Fax blocked csr structure. More...

## Macros

#define RAT_CBLK   0.02
Minimal memory increase accepted by the amalgamation algorithm (2%)

#define INFINI   1e9
Define an infinite time.

## Typedefs

typedef struct fax_csr_s fax_csr_t
Fax blocked csr structure.

## Functions

void faxCSRInit (pastix_int_t n, fax_csr_t *csr)
Initialize the data structure by doing the first allocations within the structure and initializing the fields. More...

void faxCSRClean (fax_csr_t *csr)
Free the data store in the structure. More...

pastix_int_t faxCSRGetNNZ (const fax_csr_t *csr)
Computes the number of non zero entries in the graph. More...

int faxCSRGenPA (const pastix_graph_t *graphA, const pastix_int_t *perm, fax_csr_t *graphPA)
Generate the graph of P*A from the graph of A and the permutation vector. More...

void faxCSRCompact (fax_csr_t *csr)
Compact a compressed graph. More...

void faxCSRCblkCompress (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL, pastix_int_t *work)
Compact a element wise graph of a matrix A, according to the associated partition. More...

pastix_int_t faxCSRFactDirect (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL)
Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated. More...

pastix_int_t faxCSRFactILUk (const fax_csr_t *graphA, const pastix_order_t *order, pastix_int_t level, fax_csr_t *graphL)
Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure. More...

void faxCSRAmalgamate (int ilu, double rat_cblk, double rat_blas, fax_csr_t *graphL, pastix_order_t *order, PASTIX_Comm pastix_comm)
Amalgamate the given graph up to a given tolerance. More...

static double cblk_time_fact (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr)
Compute a time estimation cost of the factorization. More...

static double cblk_time_solve (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr)
Compute a time estimation cost of the solve step. More...

static pastix_int_t amalgamate_get_sonslist (pastix_int_t node, const pastix_int_t *sonindex, const pastix_int_t *sontab, const pastix_int_t *colweight, pastix_int_t *list)
Return the list of sons of a given node. More...

static pastix_int_t amalgamate_merge_cost (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight)
Compute the cost of merging two nodes a and b together. The additional cost is returned. More...

static double amalgamate_merge_gain (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight, pastix_int_t *tmp, double(*cblktime)(pastix_int_t, const pastix_int_t *, pastix_int_t))
Computes the computational gain of merging two nodes a and b together. More...

static void amalgamate_merge_col (pastix_int_t a, pastix_int_t b, fax_csr_t *P, pastix_int_t *tmp)
Merge two nodes together withing the given graph. More...

## ◆ fax_csr_s

 struct fax_csr_s

Fax blocked csr structure.

Definition at line 25 of file fax_csr.h.

Data Fields
pastix_int_t n
pastix_int_t total_nnz
pastix_int_t * nnz
pastix_int_t ** rows

## ◆ faxCSRInit()

 void faxCSRInit ( pastix_int_t n, fax_csr_t * csr )

Initialize the data structure by doing the first allocations within the structure and initializing the fields.

Parameters
 [in] n The size of the graph that needs to be initialized. [out] csr The graph to initialize.

Definition at line 40 of file fax_csr.c.

Referenced by faxCSRCblkCompress(), faxCSRFactILUk(), faxCSRGenPA(), and faxCSRPatchSymbol().

## ◆ faxCSRClean()

 void faxCSRClean ( fax_csr_t * csr )

Free the data store in the structure.

Parameters
 [in,out] csr The graph to clean.

Definition at line 65 of file fax_csr.c.

Referenced by faxCSRFactILUk(), faxCSRPatchSymbol(), orderAmalgamate(), and pastixSymbolFaxILUk().

## ◆ faxCSRGetNNZ()

 pastix_int_t faxCSRGetNNZ ( const fax_csr_t * csr )

Computes the number of non zero entries in the graph.

It is using the following formula: nnz = sum( i=0..n, nnz[n] ) The formula must be post computed to adapt to presence of diagonal elements or not, and to the symmetry of the graph.

Parameters
 [in] csr The graph on which the number of non zero entries is computed.
Return values
 The number of non zero entries.

Definition at line 99 of file fax_csr.c.

Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().

## ◆ faxCSRGenPA()

 int faxCSRGenPA ( const pastix_graph_t * graphA, const pastix_int_t * perm, fax_csr_t * graphPA )

Generate the graph of P*A from the graph of A and the permutation vector.

Parameters
 [in] graphA The original graph Aon which the permutation will be applied. [in] perm Integer array of size graphA->n. Contains the permutation to apply to A. [in,out] graphPA On entry, the initialized graph with size graphA->n. On exit, contains the graph of P A.
Return values
 PASTIX_SUCCESS on success. PASTIX_ERR_ALLOC if allocation went wrong. PASTIX_ERR_BADPARAMETER if incorrect parameters are given.

Definition at line 183 of file fax_csr.c.

References faxCSRInit(), and PASTIX_SUCCESS.

Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().

## ◆ faxCSRCompact()

 void faxCSRCompact ( fax_csr_t * csr )

Compact a compressed graph.

All nodes with no non zero entries are removed from the graph, the allocated space is not adjusted.

Parameters
 [in,out] csr The graph to compact. On entry, graph which might contain nodes with no non zero entries. On exit, all those nodes are suppressed and the compressed graph is returned.

Definition at line 129 of file fax_csr.c.

Referenced by faxCSRAmalgamate().

## ◆ faxCSRCblkCompress()

 void faxCSRCblkCompress ( const fax_csr_t * graphA, const pastix_order_t * order, fax_csr_t * graphL, pastix_int_t * work )

Compact a element wise graph of a matrix A, according to the associated partition.

Parameters
 [in] graphA The original graph of A element wise to compress. [in] order The ordering structure that holds the partition associated to graphAo. [out] graphL On entry, the block wise initialized graph of A with size order->cblknbr. On exit, contains the compressed graph of A. [in,out] work Workspace of size >= max( degree(L_i) ), so >= grapA->n.

Definition at line 249 of file fax_csr.c.

Referenced by faxCSRFactDirect(), and faxCSRFactILUk().

## ◆ faxCSRFactDirect()

 pastix_int_t faxCSRFactDirect ( const fax_csr_t * graphA, const pastix_order_t * order, fax_csr_t * graphL )

Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated.

Parameters
 [in] graphA The graph structure of the original matrix A to factorize. [in] order The order structure holding the number of cblk and the partition [out] graphL The graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph.
Return values
 >=0,the number of non zero entries in the generated graph. -i,if the i^th parameter is incorrect

Definition at line 49 of file fax_csr_direct.c.

Referenced by orderAmalgamate().

## ◆ faxCSRFactILUk()

 pastix_int_t faxCSRFactILUk ( const fax_csr_t * graphA, const pastix_order_t * order, pastix_int_t level, fax_csr_t * graphL )

Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure.

Remarks
This algorithm has been implemented according to the paper of David Hysom and Alex Pothen : Level-based Incomplete LU factorization: Graph Model and Algorithm
Parameters
 [in] graphA The graph structure of the original matrix A to factorize. [in] order The order structure holding the number of cblk and the partition [in] level It is the level desired for the ilu(k) factorization. (level >= 0) [in,out] graphL The graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph.
Return values
 >=0,the number of non zero entries in the generated graph. -i,if the i^th parameter is incorrect

Allocated the working array

Initialized visited

Apply GS_Urow for each row

Reset the stack number of elements

Put the diagonal term

BFS phase

Definition at line 57 of file fax_csr_iluk.c.

References faxCSRCblkCompress(), faxCSRClean(), and faxCSRInit().

Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().

## ◆ faxCSRAmalgamate()

 void faxCSRAmalgamate ( int ilu, double rat_cblk, double rat_blas, fax_csr_t * graphL, pastix_order_t * order, PASTIX_Comm pastix_comm )

Amalgamate the given graph up to a given tolerance.

This function takes a supernode graph and performs amalgamation of the columns until a fill tolerance is reached. Two level of fill tolerance are available, the first one, rat_cblk, defines a fill tolerance based on the graph structure only, and the second one, rat_blas, allows to keep amalgamate the structure until the higher ratio while it improves the BLAS performance based on the internal cost model.

Parameters
 [in] ilu 1: amalgamation for incomplete factorization will be performed. 0: amalgamation for direct factorization will be performed. [in] rat_cblk Percentage of fill-in allowed using structure only amalgamation. Must be >= 0. [in] rat_blas Percentage of fill-in allowed using structure BLAS performance model. rat_blas >= rat_cblk. The ratio is always on the number of non zero entries of the pattern, but the criterion to merge columns is based only on reducing the cost of computations. The cost function used is a model for Cholesky factorization or solve in double precision, other factorization and/or precision are not implemented. They are considered to be similar up to a factor, and so doesn't affect the amalgamation. The solve cost is used for ILU(k) factorization and the factorization cost for direct factorization. [in,out] graphL The graph of the matrix L to amalgamate. On exit, the compressed graph is returned (not the quotient graph !!) [in,out] order On entry the initial ordering structure associated to L. On exit, the update ordering structure with the amalgamation of close nodes. [in] pastix_comm MPI communicator. Used only for printf in this function.

Number of unknowns

IF THIS SNODE IS NOT A ROOT

IF THIS SNODE IS NOT A ROOT

Merge all the columns so it doesn't add fill-in

Definition at line 408 of file fax_csr_amalgamate.c.

Referenced by orderAmalgamate().

## ◆ cblk_time_fact()

 static double cblk_time_fact ( pastix_int_t n, const pastix_int_t * ja, pastix_int_t colnbr )
inlinestatic

Compute a time estimation cost of the factorization.

This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost

Parameters
 [in] n The number of rows in the column-block [in] ja Array of size n that holds the indexes of the rows in the given column block [in] colnbr The number of columns in the column-block
Returns
The estimated time in second to factorize the column-block and apply the resulting updates.

Contributions

Definition at line 64 of file fax_csr_amalgamate.c.

References cost().

Referenced by faxCSRAmalgamate().

## ◆ cblk_time_solve()

 static double cblk_time_solve ( pastix_int_t n, const pastix_int_t * ja, pastix_int_t colnbr )
inlinestatic

Compute a time estimation cost of the solve step.

This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost when doing incomplete factorization.

Parameters
 [in] n The number of rows in the column-block [in] ja Array of size n that holds the indexes of the rows in the given column block [in] colnbr The number of columns in the column-block
Returns
The estimated time in second to apply the solve step issued from this column-block

Definition at line 122 of file fax_csr_amalgamate.c.

References cost().

Referenced by faxCSRAmalgamate().

## ◆ amalgamate_get_sonslist()

 static pastix_int_t amalgamate_get_sonslist ( pastix_int_t node, const pastix_int_t * sonindex, const pastix_int_t * sontab, const pastix_int_t * colweight, pastix_int_t * list )
inlinestatic

Return the list of sons of a given node.

Parameters
 [in] node Node from which the list is wanted. [in] sonindex Integer array of the indexes of the first son of each node in sontab array. [in] sontab Integer array which contains the list of sons for each node. [in] colweight Integer array of the weight of each node. [out] list Integer array that can contains the list of sons, should be of size the number of unknowns to prevent overflow. On exit, contains the list of sons of node.
Returns
The number of sons of node.

Definition at line 165 of file fax_csr_amalgamate.c.

Referenced by faxCSRAmalgamate().

## ◆ amalgamate_merge_cost()

 static pastix_int_t amalgamate_merge_cost ( pastix_int_t a, pastix_int_t b, const fax_csr_t * P, const pastix_int_t * colweight )
inlinestatic

Compute the cost of merging two nodes a and b together. The additional cost is returned.

Parameters
 [in] a First node to merge. a >= 0. [in] b Second node to merge. b >= 0. [in] P The graph that contains a and b nodes with their associated rows. [in] colweight Integer array of size P->n with the weight of each node.
Returns
The additional non zero entries required if both nodes are merged together.

Definition at line 214 of file fax_csr_amalgamate.c.

References cost().

Referenced by faxCSRAmalgamate().

## ◆ amalgamate_merge_gain()

 static double amalgamate_merge_gain ( pastix_int_t a, pastix_int_t b, const fax_csr_t * P, const pastix_int_t * colweight, pastix_int_t * tmp, double(*)(pastix_int_t, const pastix_int_t *, pastix_int_t) cblktime )
inlinestatic

Computes the computational gain of merging two nodes a and b together.

It uses the given cost function cblktime to estimate the cost with and without merge and returns the cost variation.

Parameters
 [in] a First node to merge. a >= 0. [in] b Second node to merge. b >= 0. [in] P The graph that contains a and b nodes with their associated rows. [in] colweight Integer array of size P->n with the weight of each node. [in,out] tmp Workspace array to compute the union of the rows associated to both nodes. [in] cblktime Function that returns the computational cost of a cblk.
Returns
The cost variation between merging the two nodes together vs keeping them separated according to the cblktime function.

Definition at line 296 of file fax_csr_amalgamate.c.

Referenced by faxCSRAmalgamate().

## ◆ amalgamate_merge_col()

 static void amalgamate_merge_col ( pastix_int_t a, pastix_int_t b, fax_csr_t * P, pastix_int_t * tmp )
inlinestatic

Merge two nodes together withing the given graph.

Parameters
 [in] a First node to merge. a >= 0. [in] b Second node to merge. b >= 0. [in,out] P The graph that contains a and b nodes with their associated rows. On exit, a is merged into b node. [in,out] tmp Workspace array to compute the union of the rows associated to both nodes.

Definition at line 339 of file fax_csr_amalgamate.c.

Referenced by faxCSRAmalgamate().