PaStiX Handbook
6.4.0

Data Structures  
struct  fax_csr_s 
Fax blocked csr structure. More...  
Macros  
#define  RAT_CBLK 0.02 
Minimal memory increase accepted by the amalgamation algorithm (2%)  
#define  INFINI 1e9 
Define an infinite time.  
Typedefs  
typedef struct fax_csr_s  fax_csr_t 
Fax blocked csr structure.  
Functions  
void  faxCSRInit (pastix_int_t n, fax_csr_t *csr) 
Initialize the data structure by doing the first allocations within the structure and initializing the fields. More...  
void  faxCSRClean (fax_csr_t *csr) 
Free the data store in the structure. More...  
pastix_int_t  faxCSRGetNNZ (const fax_csr_t *csr) 
Computes the number of non zero entries in the graph. More...  
int  faxCSRGenPA (const pastix_graph_t *graphA, const pastix_int_t *perm, fax_csr_t *graphPA) 
Generate the graph of P*A from the graph of A and the permutation vector. More...  
void  faxCSRCompact (fax_csr_t *csr) 
Compact a compressed graph. More...  
void  faxCSRCblkCompress (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL, pastix_int_t *work) 
Compact a element wise graph of a matrix A, according to the associated partition. More...  
pastix_int_t  faxCSRFactDirect (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL) 
Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated. More...  
pastix_int_t  faxCSRFactILUk (const fax_csr_t *graphA, const pastix_order_t *order, pastix_int_t level, fax_csr_t *graphL) 
Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure. More...  
void  faxCSRAmalgamate (int ilu, double rat_cblk, double rat_blas, fax_csr_t *graphL, pastix_order_t *order, PASTIX_Comm pastix_comm) 
Amalgamate the given graph up to a given tolerance. More...  
static double  cblk_time_fact (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr) 
Compute a time estimation cost of the factorization. More...  
static double  cblk_time_solve (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr) 
Compute a time estimation cost of the solve step. More...  
static pastix_int_t  amalgamate_get_sonslist (pastix_int_t node, const pastix_int_t *sonindex, const pastix_int_t *sontab, const pastix_int_t *colweight, pastix_int_t *list) 
Return the list of sons of a given node. More...  
static pastix_int_t  amalgamate_merge_cost (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight) 
Compute the cost of merging two nodes a and b together. The additional cost is returned. More...  
static double  amalgamate_merge_gain (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight, pastix_int_t *tmp, double(*cblktime)(pastix_int_t, const pastix_int_t *, pastix_int_t)) 
Computes the computational gain of merging two nodes a and b together. More...  
static void  amalgamate_merge_col (pastix_int_t a, pastix_int_t b, fax_csr_t *P, pastix_int_t *tmp) 
Merge two nodes together withing the given graph. More...  
struct fax_csr_s 
Data Fields  

pastix_int_t  n  
pastix_int_t  total_nnz  
pastix_int_t *  nnz  
pastix_int_t **  rows 
void faxCSRInit  (  pastix_int_t  n, 
fax_csr_t *  csr  
) 
Initialize the data structure by doing the first allocations within the structure and initializing the fields.
[in]  n  The size of the graph that needs to be initialized. 
[out]  csr  The graph to initialize. 
Definition at line 40 of file fax_csr.c.
References pastix_int_t.
Referenced by faxCSRCblkCompress(), faxCSRFactILUk(), faxCSRGenPA(), and faxCSRPatchSymbol().
void faxCSRClean  (  fax_csr_t *  csr  ) 
Free the data store in the structure.
[in,out]  csr  The graph to clean. 
Definition at line 65 of file fax_csr.c.
References pastix_int_t.
Referenced by faxCSRFactILUk(), faxCSRPatchSymbol(), orderAmalgamate(), and pastixSymbolFaxILUk().
pastix_int_t faxCSRGetNNZ  (  const fax_csr_t *  csr  ) 
Computes the number of non zero entries in the graph.
It is using the following formula: nnz = sum( i=0..n, nnz[n] ) The formula must be post computed to adapt to presence of diagonal elements or not, and to the symmetry of the graph.
[in]  csr  The graph on which the number of non zero entries is computed. 
The  number of non zero entries. 
Definition at line 99 of file fax_csr.c.
References pastix_int_t.
Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().
int faxCSRGenPA  (  const pastix_graph_t *  graphA, 
const pastix_int_t *  perm,  
fax_csr_t *  graphPA  
) 
Generate the graph of P*A from the graph of A and the permutation vector.
[in]  graphA  The original graph Aon which the permutation will be applied. 
[in]  perm  Integer array of size graphA>n. Contains the permutation to apply to A. 
[in,out]  graphPA  On entry, the initialized graph with size graphA>n. On exit, contains the graph of P A. 
PASTIX_SUCCESS  on success. 
PASTIX_ERR_ALLOC  if allocation went wrong. 
PASTIX_ERR_BADPARAMETER  if incorrect parameters are given. 
Definition at line 183 of file fax_csr.c.
References faxCSRInit(), pastix_int_t, and PASTIX_SUCCESS.
Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().
void faxCSRCompact  (  fax_csr_t *  csr  ) 
Compact a compressed graph.
All nodes with no non zero entries are removed from the graph, the allocated space is not adjusted.
[in,out]  csr  The graph to compact. On entry, graph which might contain nodes with no non zero entries. On exit, all those nodes are suppressed and the compressed graph is returned. 
Definition at line 129 of file fax_csr.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().
void faxCSRCblkCompress  (  const fax_csr_t *  graphA, 
const pastix_order_t *  order,  
fax_csr_t *  graphL,  
pastix_int_t *  work  
) 
Compact a element wise graph of a matrix A, according to the associated partition.
[in]  graphA  The original graph of A element wise to compress. 
[in]  order  The ordering structure that holds the partition associated to graphAo. 
[out]  graphL  On entry, the block wise initialized graph of A with size order>cblknbr. On exit, contains the compressed graph of A. 
[in,out]  work  Workspace of size >= max( degree(L_i) ), so >= grapA>n. 
Definition at line 249 of file fax_csr.c.
References pastix_order_s::baseval, pastix_order_s::cblknbr, faxCSRInit(), pastix_int_t, and pastix_order_s::rangtab.
Referenced by faxCSRFactDirect(), and faxCSRFactILUk().
pastix_int_t faxCSRFactDirect  (  const fax_csr_t *  graphA, 
const pastix_order_t *  order,  
fax_csr_t *  graphL  
) 
Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated.
[in]  graphA  The graph structure of the original matrix A to factorize. 
[in]  order  The order structure holding the number of cblk and the partition 
[out]  graphL  The graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph. 
>=0,the  number of non zero entries in the generated graph. 
i,if  the i^th parameter is incorrect 
Definition at line 49 of file fax_csr_direct.c.
References pastix_order_s::cblknbr, faxCSRCblkCompress(), pastix_int_t, pastix_order_s::rangtab, and pastix_order_s::treetab.
Referenced by orderAmalgamate().
pastix_int_t faxCSRFactILUk  (  const fax_csr_t *  graphA, 
const pastix_order_t *  order,  
pastix_int_t  level,  
fax_csr_t *  graphL  
) 
Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure.
[in]  graphA  The graph structure of the original matrix A to factorize. 
[in]  order  The order structure holding the number of cblk and the partition 
[in]  level  It is the level desired for the ilu(k) factorization. (level >= 0) 
[in,out]  graphL  The graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph. 
>=0,the  number of non zero entries in the generated graph. 
i,if  the i^th parameter is incorrect 
Allocated the working array
Initialized visited
Apply GS_Urow for each row
Reset the stack number of elements
Put the diagonal term
BFS phase
Definition at line 57 of file fax_csr_iluk.c.
References faxCSRCblkCompress(), faxCSRClean(), faxCSRInit(), and pastix_int_t.
Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().
void faxCSRAmalgamate  (  int  ilu, 
double  rat_cblk,  
double  rat_blas,  
fax_csr_t *  graphL,  
pastix_order_t *  order,  
PASTIX_Comm  pastix_comm  
) 
Amalgamate the given graph up to a given tolerance.
This function takes a supernode graph and performs amalgamation of the columns until a fill tolerance is reached. Two level of fill tolerance are available, the first one, rat_cblk, defines a fill tolerance based on the graph structure only, and the second one, rat_blas, allows to keep amalgamate the structure until the higher ratio while it improves the BLAS performance based on the internal cost model.
[in]  ilu 

[in]  rat_cblk  Percentage of fillin allowed using structure only amalgamation. Must be >= 0. 
[in]  rat_blas  Percentage of fillin allowed using structure BLAS performance model. rat_blas >= rat_cblk. The ratio is always on the number of non zero entries of the pattern, but the criterion to merge columns is based only on reducing the cost of computations. The cost function used is a model for Cholesky factorization or solve in double precision, other factorization and/or precision are not implemented. They are considered to be similar up to a factor, and so doesn't affect the amalgamation. The solve cost is used for ILU(k) factorization and the factorization cost for direct factorization. 
[in,out]  graphL  The graph of the matrix L to amalgamate. On exit, the compressed graph is returned (not the quotient graph !!) 
[in,out]  order  On entry the initial ordering structure associated to L. On exit, the update ordering structure with the amalgamation of close nodes. 
[in]  pastix_comm  MPI communicator. Used only for printf in this function. 
Number of unknowns
IF THIS SNODE IS NOT A ROOT
IF THIS SNODE IS NOT A ROOT
Merge all the columns so it doesn't add fillin
Definition at line 408 of file fax_csr_amalgamate.c.
References amalgamate_get_sonslist(), amalgamate_merge_col(), amalgamate_merge_cost(), amalgamate_merge_gain(), cblk_time_fact(), cblk_time_solve(), pastix_order_s::cblknbr, faxCSRCompact(), INFINI, pastix_int_t, PASTIX_SUCCESS, pastixOrderCheck(), pastix_order_s::peritab, pastix_order_s::permtab, pqueueExit(), pqueueInit(), pqueuePop1(), pqueuePush1(), pqueueSize(), pastix_order_s::rangtab, RAT_CBLK, pastix_order_s::treetab, and pastix_order_s::vertnbr.
Referenced by orderAmalgamate().

inlinestatic 
Compute a time estimation cost of the factorization.
This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost
[in]  n  The number of rows in the columnblock 
[in]  ja  Array of size n that holds the indexes of the rows in the given column block 
[in]  colnbr  The number of columns in the columnblock 
Contributions
Definition at line 64 of file fax_csr_amalgamate.c.
References cost(), and pastix_int_t.
Referenced by faxCSRAmalgamate().

inlinestatic 
Compute a time estimation cost of the solve step.
This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost when doing incomplete factorization.
[in]  n  The number of rows in the columnblock 
[in]  ja  Array of size n that holds the indexes of the rows in the given column block 
[in]  colnbr  The number of columns in the columnblock 
Definition at line 122 of file fax_csr_amalgamate.c.
References cost(), and pastix_int_t.
Referenced by faxCSRAmalgamate().

inlinestatic 
Return the list of sons of a given node.
[in]  node  Node from which the list is wanted. 
[in]  sonindex  Integer array of the indexes of the first son of each node in sontab array. 
[in]  sontab  Integer array which contains the list of sons for each node. 
[in]  colweight  Integer array of the weight of each node. 
[out]  list  Integer array that can contains the list of sons, should be of size the number of unknowns to prevent overflow. On exit, contains the list of sons of node. 
Definition at line 165 of file fax_csr_amalgamate.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().

inlinestatic 
Compute the cost of merging two nodes a and b together. The additional cost is returned.
[in]  a  First node to merge. a >= 0. 
[in]  b  Second node to merge. b >= 0. 
[in]  P  The graph that contains a and b nodes with their associated rows. 
[in]  colweight  Integer array of size P>n with the weight of each node. 
Definition at line 214 of file fax_csr_amalgamate.c.
References cost(), and pastix_int_t.
Referenced by faxCSRAmalgamate().

inlinestatic 
Computes the computational gain of merging two nodes a and b together.
It uses the given cost function cblktime to estimate the cost with and without merge and returns the cost variation.
[in]  a  First node to merge. a >= 0. 
[in]  b  Second node to merge. b >= 0. 
[in]  P  The graph that contains a and b nodes with their associated rows. 
[in]  colweight  Integer array of size P>n with the weight of each node. 
[in,out]  tmp  Workspace array to compute the union of the rows associated to both nodes. 
[in]  cblktime  Function that returns the computational cost of a cblk. 
Definition at line 296 of file fax_csr_amalgamate.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().

inlinestatic 
Merge two nodes together withing the given graph.
[in]  a  First node to merge. a >= 0. 
[in]  b  Second node to merge. b >= 0. 
[in,out]  P  The graph that contains a and b nodes with their associated rows. On exit, a is merged into b node. 
[in,out]  tmp  Workspace array to compute the union of the rows associated to both nodes. 
Definition at line 339 of file fax_csr_amalgamate.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().