PaStiX Handbook  6.2.1

Data Structures

struct  fax_csr_s
 Fax blocked csr structure. More...
 

Macros

#define RAT_CBLK   0.02
 Minimal memory increase accepted by the amalgamation algorithm (2%)
 
#define INFINI   1e9
 Define an infinite time.
 

Typedefs

typedef struct fax_csr_s fax_csr_t
 Fax blocked csr structure.
 

Functions

void faxCSRInit (pastix_int_t n, fax_csr_t *csr)
 Initialize the data structure by doing the first allocations within the structure and initializing the fields. More...
 
void faxCSRClean (fax_csr_t *csr)
 Free the data store in the structure. More...
 
pastix_int_t faxCSRGetNNZ (const fax_csr_t *csr)
 Computes the number of non zero entries in the graph. More...
 
int faxCSRGenPA (const pastix_graph_t *graphA, const pastix_int_t *perm, fax_csr_t *graphPA)
 Generate the graph of P*A from the graph of A and the permutation vector. More...
 
void faxCSRCompact (fax_csr_t *csr)
 Compact a compressed graph. More...
 
void faxCSRCblkCompress (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL, pastix_int_t *work)
 Compact a element wise graph of a matrix A, according to the associated partition. More...
 
pastix_int_t faxCSRFactDirect (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL)
 Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated. More...
 
pastix_int_t faxCSRFactILUk (const fax_csr_t *graphA, const pastix_order_t *order, pastix_int_t level, fax_csr_t *graphL)
 Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure. More...
 
void faxCSRAmalgamate (int ilu, double rat_cblk, double rat_blas, fax_csr_t *graphL, pastix_order_t *order, PASTIX_Comm pastix_comm)
 Amalgamate the given graph up to a given tolerance. More...
 
static double cblk_time_fact (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr)
 Compute a time estimation cost of the factorization. More...
 
static double cblk_time_solve (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr)
 Compute a time estimation cost of the solve step. More...
 
static pastix_int_t amalgamate_get_sonslist (pastix_int_t node, const pastix_int_t *sonindex, const pastix_int_t *sontab, const pastix_int_t *colweight, pastix_int_t *list)
 Return the list of sons of a given node. More...
 
static pastix_int_t amalgamate_merge_cost (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight)
 Compute the cost of merging two nodes a and b together. The additional cost is returned. More...
 
static double amalgamate_merge_gain (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight, pastix_int_t *tmp, double(*cblktime)(pastix_int_t, const pastix_int_t *, pastix_int_t))
 Computes the computational gain of merging two nodes a and b together. More...
 
static void amalgamate_merge_col (pastix_int_t a, pastix_int_t b, fax_csr_t *P, pastix_int_t *tmp)
 Merge two nodes together withing the given graph. More...
 

Detailed Description


Data Type Documentation

◆ fax_csr_s

struct fax_csr_s

Fax blocked csr structure.

Definition at line 25 of file fax_csr.h.

Data Fields
pastix_int_t n
pastix_int_t total_nnz
pastix_int_t * nnz
pastix_int_t ** rows

Function Documentation

◆ faxCSRInit()

void faxCSRInit ( pastix_int_t  n,
fax_csr_t csr 
)

Initialize the data structure by doing the first allocations within the structure and initializing the fields.

Parameters
[in]nThe size of the graph that needs to be initialized.
[out]csrThe graph to initialize.

Definition at line 40 of file fax_csr.c.

Referenced by faxCSRCblkCompress(), faxCSRFactILUk(), faxCSRGenPA(), and faxCSRPatchSymbol().

◆ faxCSRClean()

void faxCSRClean ( fax_csr_t csr)

Free the data store in the structure.

Parameters
[in,out]csrThe graph to clean.

Definition at line 65 of file fax_csr.c.

Referenced by faxCSRFactILUk(), faxCSRPatchSymbol(), orderAmalgamate(), and pastixSymbolFaxILUk().

◆ faxCSRGetNNZ()

pastix_int_t faxCSRGetNNZ ( const fax_csr_t csr)

Computes the number of non zero entries in the graph.

It is using the following formula: nnz = sum( i=0..n, nnz[n] ) The formula must be post computed to adapt to presence of diagonal elements or not, and to the symmetry of the graph.

Parameters
[in]csrThe graph on which the number of non zero entries is computed.
Return values
Thenumber of non zero entries.

Definition at line 99 of file fax_csr.c.

Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().

◆ faxCSRGenPA()

int faxCSRGenPA ( const pastix_graph_t *  graphA,
const pastix_int_t *  perm,
fax_csr_t graphPA 
)

Generate the graph of P*A from the graph of A and the permutation vector.

Parameters
[in]graphAThe original graph Aon which the permutation will be applied.
[in]permInteger array of size graphA->n. Contains the permutation to apply to A.
[in,out]graphPAOn entry, the initialized graph with size graphA->n. On exit, contains the graph of P A.
Return values
PASTIX_SUCCESSon success.
PASTIX_ERR_ALLOCif allocation went wrong.
PASTIX_ERR_BADPARAMETERif incorrect parameters are given.

Definition at line 183 of file fax_csr.c.

References faxCSRInit(), and PASTIX_SUCCESS.

Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().

◆ faxCSRCompact()

void faxCSRCompact ( fax_csr_t csr)

Compact a compressed graph.

All nodes with no non zero entries are removed from the graph, the allocated space is not adjusted.

Parameters
[in,out]csrThe graph to compact. On entry, graph which might contain nodes with no non zero entries. On exit, all those nodes are suppressed and the compressed graph is returned.

Definition at line 129 of file fax_csr.c.

Referenced by faxCSRAmalgamate().

◆ faxCSRCblkCompress()

void faxCSRCblkCompress ( const fax_csr_t graphA,
const pastix_order_t order,
fax_csr_t graphL,
pastix_int_t *  work 
)

Compact a element wise graph of a matrix A, according to the associated partition.

Parameters
[in]graphAThe original graph of A element wise to compress.
[in]orderThe ordering structure that holds the partition associated to graphAo.
[out]graphLOn entry, the block wise initialized graph of A with size order->cblknbr. On exit, contains the compressed graph of A.
[in,out]workWorkspace of size >= max( degree(L_i) ), so >= grapA->n.

Definition at line 249 of file fax_csr.c.

References pastix_order_s::baseval, pastix_order_s::cblknbr, faxCSRInit(), and pastix_order_s::rangtab.

Referenced by faxCSRFactDirect(), and faxCSRFactILUk().

◆ faxCSRFactDirect()

pastix_int_t faxCSRFactDirect ( const fax_csr_t graphA,
const pastix_order_t order,
fax_csr_t graphL 
)

Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated.

Parameters
[in]graphAThe graph structure of the original matrix A to factorize.
[in]orderThe order structure holding the number of cblk and the partition
[out]graphLThe graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph.
Return values
>=0,thenumber of non zero entries in the generated graph.
-i,ifthe i^th parameter is incorrect

Definition at line 49 of file fax_csr_direct.c.

References pastix_order_s::cblknbr, faxCSRCblkCompress(), pastix_order_s::rangtab, and pastix_order_s::treetab.

Referenced by orderAmalgamate().

◆ faxCSRFactILUk()

pastix_int_t faxCSRFactILUk ( const fax_csr_t graphA,
const pastix_order_t order,
pastix_int_t  level,
fax_csr_t graphL 
)

Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure.

Remarks
This algorithm has been implemented according to the paper of David Hysom and Alex Pothen : Level-based Incomplete LU factorization: Graph Model and Algorithm
Parameters
[in]graphAThe graph structure of the original matrix A to factorize.
[in]orderThe order structure holding the number of cblk and the partition
[in]levelIt is the level desired for the ilu(k) factorization. (level >= 0)
[in,out]graphLThe graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph.
Return values
>=0,thenumber of non zero entries in the generated graph.
-i,ifthe i^th parameter is incorrect

Allocated the working array

Initialized visited

Apply GS_Urow for each row

Reset the stack number of elements

Put the diagonal term

BFS phase

Definition at line 57 of file fax_csr_iluk.c.

References faxCSRCblkCompress(), faxCSRClean(), and faxCSRInit().

Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().

◆ faxCSRAmalgamate()

void faxCSRAmalgamate ( int  ilu,
double  rat_cblk,
double  rat_blas,
fax_csr_t graphL,
pastix_order_t order,
PASTIX_Comm  pastix_comm 
)

Amalgamate the given graph up to a given tolerance.

This function takes a supernode graph and performs amalgamation of the columns until a fill tolerance is reached. Two level of fill tolerance are available, the first one, rat_cblk, defines a fill tolerance based on the graph structure only, and the second one, rat_blas, allows to keep amalgamate the structure until the higher ratio while it improves the BLAS performance based on the internal cost model.

Parameters
[in]ilu
  • 1: amalgamation for incomplete factorization will be performed.
  • 0: amalgamation for direct factorization will be performed.
[in]rat_cblkPercentage of fill-in allowed using structure only amalgamation. Must be >= 0.
[in]rat_blasPercentage of fill-in allowed using structure BLAS performance model. rat_blas >= rat_cblk. The ratio is always on the number of non zero entries of the pattern, but the criterion to merge columns is based only on reducing the cost of computations. The cost function used is a model for Cholesky factorization or solve in double precision, other factorization and/or precision are not implemented. They are considered to be similar up to a factor, and so doesn't affect the amalgamation. The solve cost is used for ILU(k) factorization and the factorization cost for direct factorization.
[in,out]graphLThe graph of the matrix L to amalgamate. On exit, the compressed graph is returned (not the quotient graph !!)
[in,out]orderOn entry the initial ordering structure associated to L. On exit, the update ordering structure with the amalgamation of close nodes.
[in]pastix_commMPI communicator. Used only for printf in this function.

Number of unknowns

IF THIS SNODE IS NOT A ROOT

IF THIS SNODE IS NOT A ROOT

Merge all the columns so it doesn't add fill-in

Definition at line 408 of file fax_csr_amalgamate.c.

References amalgamate_get_sonslist(), amalgamate_merge_col(), amalgamate_merge_cost(), amalgamate_merge_gain(), cblk_time_fact(), cblk_time_solve(), pastix_order_s::cblknbr, faxCSRCompact(), INFINI, PASTIX_SUCCESS, pastixOrderCheck(), pastix_order_s::peritab, pastix_order_s::permtab, pqueueExit(), pqueueInit(), pqueuePop1(), pqueuePush1(), pqueueSize(), pastix_order_s::rangtab, RAT_CBLK, pastix_order_s::treetab, and pastix_order_s::vertnbr.

Referenced by orderAmalgamate().

◆ cblk_time_fact()

static double cblk_time_fact ( pastix_int_t  n,
const pastix_int_t *  ja,
pastix_int_t  colnbr 
)
inlinestatic

Compute a time estimation cost of the factorization.

This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost

Parameters
[in]nThe number of rows in the column-block
[in]jaArray of size n that holds the indexes of the rows in the given column block
[in]colnbrThe number of columns in the column-block
Returns
The estimated time in second to factorize the column-block and apply the resulting updates.

Contributions

Definition at line 64 of file fax_csr_amalgamate.c.

References cost().

Referenced by faxCSRAmalgamate().

◆ cblk_time_solve()

static double cblk_time_solve ( pastix_int_t  n,
const pastix_int_t *  ja,
pastix_int_t  colnbr 
)
inlinestatic

Compute a time estimation cost of the solve step.

This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost when doing incomplete factorization.

Parameters
[in]nThe number of rows in the column-block
[in]jaArray of size n that holds the indexes of the rows in the given column block
[in]colnbrThe number of columns in the column-block
Returns
The estimated time in second to apply the solve step issued from this column-block

Definition at line 122 of file fax_csr_amalgamate.c.

References cost().

Referenced by faxCSRAmalgamate().

◆ amalgamate_get_sonslist()

static pastix_int_t amalgamate_get_sonslist ( pastix_int_t  node,
const pastix_int_t *  sonindex,
const pastix_int_t *  sontab,
const pastix_int_t *  colweight,
pastix_int_t *  list 
)
inlinestatic

Return the list of sons of a given node.

Parameters
[in]nodeNode from which the list is wanted.
[in]sonindexInteger array of the indexes of the first son of each node in sontab array.
[in]sontabInteger array which contains the list of sons for each node.
[in]colweightInteger array of the weight of each node.
[out]listInteger array that can contains the list of sons, should be of size the number of unknowns to prevent overflow. On exit, contains the list of sons of node.
Returns
The number of sons of node.

Definition at line 165 of file fax_csr_amalgamate.c.

Referenced by faxCSRAmalgamate().

◆ amalgamate_merge_cost()

static pastix_int_t amalgamate_merge_cost ( pastix_int_t  a,
pastix_int_t  b,
const fax_csr_t P,
const pastix_int_t *  colweight 
)
inlinestatic

Compute the cost of merging two nodes a and b together. The additional cost is returned.

Parameters
[in]aFirst node to merge. a >= 0.
[in]bSecond node to merge. b >= 0.
[in]PThe graph that contains a and b nodes with their associated rows.
[in]colweightInteger array of size P->n with the weight of each node.
Returns
The additional non zero entries required if both nodes are merged together.

Definition at line 214 of file fax_csr_amalgamate.c.

References cost().

Referenced by faxCSRAmalgamate().

◆ amalgamate_merge_gain()

static double amalgamate_merge_gain ( pastix_int_t  a,
pastix_int_t  b,
const fax_csr_t P,
const pastix_int_t *  colweight,
pastix_int_t *  tmp,
double(*)(pastix_int_t, const pastix_int_t *, pastix_int_t)  cblktime 
)
inlinestatic

Computes the computational gain of merging two nodes a and b together.

It uses the given cost function cblktime to estimate the cost with and without merge and returns the cost variation.

Parameters
[in]aFirst node to merge. a >= 0.
[in]bSecond node to merge. b >= 0.
[in]PThe graph that contains a and b nodes with their associated rows.
[in]colweightInteger array of size P->n with the weight of each node.
[in,out]tmpWorkspace array to compute the union of the rows associated to both nodes.
[in]cblktimeFunction that returns the computational cost of a cblk.
Returns
The cost variation between merging the two nodes together vs keeping them separated according to the cblktime function.

Definition at line 296 of file fax_csr_amalgamate.c.

Referenced by faxCSRAmalgamate().

◆ amalgamate_merge_col()

static void amalgamate_merge_col ( pastix_int_t  a,
pastix_int_t  b,
fax_csr_t P,
pastix_int_t *  tmp 
)
inlinestatic

Merge two nodes together withing the given graph.

Parameters
[in]aFirst node to merge. a >= 0.
[in]bSecond node to merge. b >= 0.
[in,out]PThe graph that contains a and b nodes with their associated rows. On exit, a is merged into b node.
[in,out]tmpWorkspace array to compute the union of the rows associated to both nodes.

Definition at line 339 of file fax_csr_amalgamate.c.

Referenced by faxCSRAmalgamate().