PaStiX Handbook 6.4.0
|
Data Structures | |
struct | fax_csr_s |
Fax blocked csr structure. More... | |
Macros | |
#define | RAT_CBLK 0.02 |
Minimal memory increase accepted by the amalgamation algorithm (2%) | |
#define | INFINI 1e9 |
Define an infinite time. | |
Typedefs | |
typedef struct fax_csr_s | fax_csr_t |
Fax blocked csr structure. | |
Functions | |
void | faxCSRInit (pastix_int_t n, fax_csr_t *csr) |
Initialize the data structure by doing the first allocations within the structure and initializing the fields. | |
void | faxCSRClean (fax_csr_t *csr) |
Free the data store in the structure. | |
pastix_int_t | faxCSRGetNNZ (const fax_csr_t *csr) |
Computes the number of non zero entries in the graph. | |
int | faxCSRGenPA (const pastix_graph_t *graphA, const pastix_int_t *perm, fax_csr_t *graphPA) |
Generate the graph of P*A from the graph of A and the permutation vector. | |
void | faxCSRCompact (fax_csr_t *csr) |
Compact a compressed graph. | |
void | faxCSRCblkCompress (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL, pastix_int_t *work) |
Compact a element wise graph of a matrix A, according to the associated partition. | |
pastix_int_t | faxCSRFactDirect (const fax_csr_t *graphA, const pastix_order_t *order, fax_csr_t *graphL) |
Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated. | |
pastix_int_t | faxCSRFactILUk (const fax_csr_t *graphA, const pastix_order_t *order, pastix_int_t level, fax_csr_t *graphL) |
Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure. | |
void | faxCSRAmalgamate (int ilu, double rat_cblk, double rat_blas, fax_csr_t *graphL, pastix_order_t *order, PASTIX_Comm pastix_comm) |
Amalgamate the given graph up to a given tolerance. | |
static double | cblk_time_fact (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr) |
Compute a time estimation cost of the factorization. | |
static double | cblk_time_solve (pastix_int_t n, const pastix_int_t *ja, pastix_int_t colnbr) |
Compute a time estimation cost of the solve step. | |
static pastix_int_t | amalgamate_get_sonslist (pastix_int_t node, const pastix_int_t *sonindex, const pastix_int_t *sontab, const pastix_int_t *colweight, pastix_int_t *list) |
Return the list of sons of a given node. | |
static pastix_int_t | amalgamate_merge_cost (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight) |
Compute the cost of merging two nodes a and b together. The additional cost is returned. | |
static double | amalgamate_merge_gain (pastix_int_t a, pastix_int_t b, const fax_csr_t *P, const pastix_int_t *colweight, pastix_int_t *tmp, double(*cblktime)(pastix_int_t, const pastix_int_t *, pastix_int_t)) |
Computes the computational gain of merging two nodes a and b together. | |
static void | amalgamate_merge_col (pastix_int_t a, pastix_int_t b, fax_csr_t *P, pastix_int_t *tmp) |
Merge two nodes together withing the given graph. | |
struct fax_csr_s |
Data Fields | ||
---|---|---|
pastix_int_t | n | |
pastix_int_t | total_nnz | |
pastix_int_t * | nnz | |
pastix_int_t ** | rows |
#define RAT_CBLK 0.02 |
Minimal memory increase accepted by the amalgamation algorithm (2%)
Definition at line 29 of file fax_csr_amalgamate.c.
#define INFINI 1e9 |
Define an infinite time.
Definition at line 34 of file fax_csr_amalgamate.c.
void faxCSRInit | ( | pastix_int_t | n, |
fax_csr_t * | csr | ||
) |
Initialize the data structure by doing the first allocations within the structure and initializing the fields.
[in] | n | The size of the graph that needs to be initialized. |
[out] | csr | The graph to initialize. |
Definition at line 40 of file fax_csr.c.
References pastix_int_t.
Referenced by faxCSRCblkCompress(), faxCSRFactILUk(), faxCSRGenPA(), and faxCSRPatchSymbol().
void faxCSRClean | ( | fax_csr_t * | csr | ) |
Free the data store in the structure.
[in,out] | csr | The graph to clean. |
Definition at line 65 of file fax_csr.c.
References pastix_int_t.
Referenced by faxCSRFactILUk(), faxCSRPatchSymbol(), orderAmalgamate(), and pastixSymbolFaxILUk().
pastix_int_t faxCSRGetNNZ | ( | const fax_csr_t * | csr | ) |
Computes the number of non zero entries in the graph.
It is using the following formula: nnz = sum( i=0..n, nnz[n] ) The formula must be post computed to adapt to presence of diagonal elements or not, and to the symmetry of the graph.
[in] | csr | The graph on which the number of non zero entries is computed. |
The | number of non zero entries. |
Definition at line 99 of file fax_csr.c.
References pastix_int_t.
Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().
int faxCSRGenPA | ( | const pastix_graph_t * | graphA, |
const pastix_int_t * | perm, | ||
fax_csr_t * | graphPA | ||
) |
Generate the graph of P*A from the graph of A and the permutation vector.
[in] | graphA | The original graph Aon which the permutation will be applied. |
[in] | perm | Integer array of size graphA->n. Contains the permutation to apply to A. |
[in,out] | graphPA | On entry, the initialized graph with size graphA->n. On exit, contains the graph of P A. |
PASTIX_SUCCESS | on success. |
PASTIX_ERR_ALLOC | if allocation went wrong. |
PASTIX_ERR_BADPARAMETER | if incorrect parameters are given. |
Definition at line 183 of file fax_csr.c.
References faxCSRInit(), pastix_int_t, and PASTIX_SUCCESS.
Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().
void faxCSRCompact | ( | fax_csr_t * | csr | ) |
Compact a compressed graph.
All nodes with no non zero entries are removed from the graph, the allocated space is not adjusted.
[in,out] | csr | The graph to compact. On entry, graph which might contain nodes with no non zero entries. On exit, all those nodes are suppressed and the compressed graph is returned. |
Definition at line 129 of file fax_csr.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().
void faxCSRCblkCompress | ( | const fax_csr_t * | graphA, |
const pastix_order_t * | order, | ||
fax_csr_t * | graphL, | ||
pastix_int_t * | work | ||
) |
Compact a element wise graph of a matrix A, according to the associated partition.
[in] | graphA | The original graph of A element wise to compress. |
[in] | order | The ordering structure that holds the partition associated to graphAo. |
[out] | graphL | On entry, the block wise initialized graph of A with size order->cblknbr. On exit, contains the compressed graph of A. |
[in,out] | work | Workspace of size >= max( degree(L_i) ), so >= grapA->n. |
Definition at line 249 of file fax_csr.c.
References pastix_order_s::baseval, pastix_order_s::cblknbr, faxCSRInit(), pastix_int_t, and pastix_order_s::rangtab.
Referenced by faxCSRFactDirect(), and faxCSRFactILUk().
pastix_int_t faxCSRFactDirect | ( | const fax_csr_t * | graphA, |
const pastix_order_t * | order, | ||
fax_csr_t * | graphL | ||
) |
Compute the non zero pattern of the direct factorization of a matrix A, given the supernode partition associated.
[in] | graphA | The graph structure of the original matrix A to factorize. |
[in] | order | The order structure holding the number of cblk and the partition |
[out] | graphL | The graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph. |
>=0,the | number of non zero entries in the generated graph. |
-i,if | the i^th parameter is incorrect |
Definition at line 49 of file fax_csr_direct.c.
References pastix_order_s::cblknbr, faxCSRCblkCompress(), pastix_int_t, pastix_order_s::rangtab, and pastix_order_s::treetab.
Referenced by orderAmalgamate().
pastix_int_t faxCSRFactILUk | ( | const fax_csr_t * | graphA, |
const pastix_order_t * | order, | ||
pastix_int_t | level, | ||
fax_csr_t * | graphL | ||
) |
Compute the non zero pattern of the levelized incomplete factor for a sparse lower triangular matrix in CSC format. This pattern is exact iff the matrix has a SYMMETRIC non zero structure.
[in] | graphA | The graph structure of the original matrix A to factorize. |
[in] | order | The order structure holding the number of cblk and the partition |
[in] | level | It is the level desired for the ilu(k) factorization. (level >= 0) |
[in,out] | graphL | The graph the structure of the non zero pattern of the factorized matrix. On entry, a pointer to a graph structure. No need for initialization. On exit, the structure contains the computed graph. |
>=0,the | number of non zero entries in the generated graph. |
-i,if | the i^th parameter is incorrect |
Allocated the working array
Initialized visited
Apply GS_Urow for each row
Reset the stack number of elements
Put the diagonal term
BFS phase
Definition at line 57 of file fax_csr_iluk.c.
References faxCSRCblkCompress(), faxCSRClean(), faxCSRInit(), and pastix_int_t.
Referenced by orderAmalgamate(), and pastixSymbolFaxILUk().
void faxCSRAmalgamate | ( | int | ilu, |
double | rat_cblk, | ||
double | rat_blas, | ||
fax_csr_t * | graphL, | ||
pastix_order_t * | order, | ||
PASTIX_Comm | pastix_comm | ||
) |
Amalgamate the given graph up to a given tolerance.
This function takes a supernode graph and performs amalgamation of the columns until a fill tolerance is reached. Two level of fill tolerance are available, the first one, rat_cblk, defines a fill tolerance based on the graph structure only, and the second one, rat_blas, allows to keep amalgamate the structure until the higher ratio while it improves the BLAS performance based on the internal cost model.
[in] | ilu |
|
[in] | rat_cblk | Percentage of fill-in allowed using structure only amalgamation. Must be >= 0. |
[in] | rat_blas | Percentage of fill-in allowed using structure BLAS performance model. rat_blas >= rat_cblk. The ratio is always on the number of non zero entries of the pattern, but the criterion to merge columns is based only on reducing the cost of computations. The cost function used is a model for Cholesky factorization or solve in double precision, other factorization and/or precision are not implemented. They are considered to be similar up to a factor, and so doesn't affect the amalgamation. The solve cost is used for ILU(k) factorization and the factorization cost for direct factorization. |
[in,out] | graphL | The graph of the matrix L to amalgamate. On exit, the compressed graph is returned (not the quotient graph !!) |
[in,out] | order | On entry the initial ordering structure associated to L. On exit, the update ordering structure with the amalgamation of close nodes. |
[in] | pastix_comm | MPI communicator. Used only for printf in this function. |
Number of unknowns
IF THIS SNODE IS NOT A ROOT
IF THIS SNODE IS NOT A ROOT
Merge all the columns so it doesn't add fill-in
Definition at line 408 of file fax_csr_amalgamate.c.
References amalgamate_get_sonslist(), amalgamate_merge_col(), amalgamate_merge_cost(), amalgamate_merge_gain(), cblk_time_fact(), cblk_time_solve(), pastix_order_s::cblknbr, faxCSRCompact(), INFINI, pastix_int_t, PASTIX_SUCCESS, pastixOrderCheck(), pastix_order_s::peritab, pastix_order_s::permtab, pqueueExit(), pqueueInit(), pqueuePop1(), pqueuePush1(), pqueueSize(), pastix_order_s::rangtab, RAT_CBLK, pastix_order_s::treetab, and pastix_order_s::vertnbr.
Referenced by orderAmalgamate().
|
inlinestatic |
Compute a time estimation cost of the factorization.
This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost
[in] | n | The number of rows in the column-block |
[in] | ja | Array of size n that holds the indexes of the rows in the given column block |
[in] | colnbr | The number of columns in the column-block |
Contributions
Definition at line 64 of file fax_csr_amalgamate.c.
References cost(), and pastix_int_t.
Referenced by faxCSRAmalgamate().
|
inlinestatic |
Compute a time estimation cost of the solve step.
This function is used during the amalgamation algorithm to decide whether it is more interesting to amalgamate the blocks or to keep them separated in terms of performance cost when doing incomplete factorization.
[in] | n | The number of rows in the column-block |
[in] | ja | Array of size n that holds the indexes of the rows in the given column block |
[in] | colnbr | The number of columns in the column-block |
Definition at line 122 of file fax_csr_amalgamate.c.
References cost(), and pastix_int_t.
Referenced by faxCSRAmalgamate().
|
inlinestatic |
Return the list of sons of a given node.
[in] | node | Node from which the list is wanted. |
[in] | sonindex | Integer array of the indexes of the first son of each node in sontab array. |
[in] | sontab | Integer array which contains the list of sons for each node. |
[in] | colweight | Integer array of the weight of each node. |
[out] | list | Integer array that can contains the list of sons, should be of size the number of unknowns to prevent overflow. On exit, contains the list of sons of node. |
Definition at line 165 of file fax_csr_amalgamate.c.
References amalgamate_get_sonslist(), and pastix_int_t.
Referenced by amalgamate_get_sonslist(), and faxCSRAmalgamate().
|
inlinestatic |
Compute the cost of merging two nodes a and b together. The additional cost is returned.
[in] | a | First node to merge. a >= 0. |
[in] | b | Second node to merge. b >= 0. |
[in] | P | The graph that contains a and b nodes with their associated rows. |
[in] | colweight | Integer array of size P->n with the weight of each node. |
Definition at line 214 of file fax_csr_amalgamate.c.
References cost(), and pastix_int_t.
Referenced by faxCSRAmalgamate().
|
inlinestatic |
Computes the computational gain of merging two nodes a and b together.
It uses the given cost function cblktime to estimate the cost with and without merge and returns the cost variation.
[in] | a | First node to merge. a >= 0. |
[in] | b | Second node to merge. b >= 0. |
[in] | P | The graph that contains a and b nodes with their associated rows. |
[in] | colweight | Integer array of size P->n with the weight of each node. |
[in,out] | tmp | Workspace array to compute the union of the rows associated to both nodes. |
[in] | cblktime | Function that returns the computational cost of a cblk. |
Definition at line 296 of file fax_csr_amalgamate.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().
|
inlinestatic |
Merge two nodes together withing the given graph.
[in] | a | First node to merge. a >= 0. |
[in] | b | Second node to merge. b >= 0. |
[in,out] | P | The graph that contains a and b nodes with their associated rows. On exit, a is merged into b node. |
[in,out] | tmp | Workspace array to compute the union of the rows associated to both nodes. |
Definition at line 339 of file fax_csr_amalgamate.c.
References pastix_int_t.
Referenced by faxCSRAmalgamate().