PaStiX Handbook
6.4.0
Functions
void propMappTree (Cand *candtab, const EliminTree *etree, pastix_int_t candnbr, int nocrossproc, int allcand)
    Apply the proportional mapping algorithm.
void splitSymbol (BlendCtrl *ctrl, symbol_matrix_t *symbmtx)
    Split the column blocks of the symbol matrix to generate parallelism.
void simuRun (SimuCtrl *simuctrl, const BlendCtrl *ctrl, const symbol_matrix_t *symbptr)
    Run the simulation to map the data on the nodes.
int solverMatrixGen (SolverMatrix *solvmtx, const symbol_matrix_t *symbmtx, const pastix_order_t *ordeptr, const SimuCtrl *simuctrl, const BlendCtrl *ctrl, PASTIX_Comm comm, isched_t *isched)
    Initialize the solver matrix structure.
int solverMatrixGenSeq (SolverMatrix *solvmtx, const symbol_matrix_t *symbmtx, const pastix_order_t *ordeptr, const SimuCtrl *simuctrl, const BlendCtrl *ctrl, PASTIX_Comm comm, isched_t *isched, pastix_int_t is_dbg)
    Initialize the solver matrix structure in sequential.
This module contains all the subroutines and structures needed to perform the analyze step and to prepare the numerical factorization and solve. It is composed of four main steps. The first is the computation of the proportional mapping, based on the elimination tree, which assigns worker candidates to every node of the tree. The second is the splitting of the symbol matrix to generate more parallelism. Third, the simulation predicts the best mapping among the candidates and returns the associated static scheduling of the tasks. Finally, the local solver structure is created to control the numerical factorization and solve, and to store the problem data.
void propMappTree(Cand *candtab,
                  const EliminTree *etree,
                  pastix_int_t candnbr,
                  int nocrossproc,
                  int allcand)
Apply the proportional mapping algorithm.
This function computes the proportional mapping of the elimination tree. The result is a set of potential candidates to compute each node of the elimination tree. The actual candidate is assigned later, during the simulation performed by simuRun(). It is therefore important to reduce the number of candidates per node as much as possible, while keeping enough freedom for the scheduling to achieve a good load balance and little idle time in the final static decision.
[in,out] | candtab | On entry, the candtab array must contain the cost of each node of the elimination tree, and their depth in the tree as computed by candBuild(). On exit, the fields fcandnum and lcandnum are computed with the proportional mapping algorithm that tries to balance the load between the candidates and distribute the branches to everyone according to their cost. |
[in] | etree | The elimination tree to map on the resources. |
[in] | candnbr | The total number of candidates to distribute over the elimination tree. |
[in] | nocrossproc | If nocrossproc is enabled, a candidate can NOT be part of two subbranches with different co-workers in each branch. If nocrossproc is disabled, a candidate can be shared between two subbranches if the amount of extra work exceeds 10%. |
[in] | allcand | If enabled, no proportional mapping is performed and everyone is a candidate for everything. This will have a large performance impact on the simulation. |
Definition at line 422 of file propmap.c.
References eTreeRoot(), etree_s::nodetab, pastix_int_t, propMappSubtree(), propMappSubtreeOn1P(), and etree_node_s::subcost.
Referenced by pastix_subtask_blend().
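As an illustration of the idea behind proportional mapping, here is a minimal sketch that distributes a contiguous range of candidates over the children of each node in proportion to their subtree cost. It is not the PaStiX implementation: toy_node_t and toy_propmap() are hypothetical names, only the field names subcost, fcandnum and lcandnum echo the documentation above, and the rounding rule (which lets neighbouring children share a boundary candidate, without any 10% threshold) is an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical toy node: the children of a node are stored contiguously. */
typedef struct {
    long subcost;   /* cost of the subtree rooted at this node (>= 0) */
    int  sonsnbr;   /* number of children                             */
    int  fsonnum;   /* index of the first child in the node array     */
    int  fcandnum;  /* output: first candidate of this node           */
    int  lcandnum;  /* output: last candidate of this node            */
} toy_node_t;

/* Assign candidates [fcand, lcand] to node id, then split that range
 * among its children proportionally to their subtree cost. */
void
toy_propmap(toy_node_t *nodes, int id, int fcand, int lcand)
{
    toy_node_t *node  = &nodes[id];
    long        ncand = (long)lcand - fcand + 1;
    long        total = 0, before = 0;
    int         i;

    assert(fcand <= lcand);
    node->fcandnum = fcand;
    node->lcandnum = lcand;

    /* Total cost of the children subtrees. */
    for (i = 0; i < node->sonsnbr; i++) {
        total += nodes[node->fsonnum + i].subcost;
    }

    for (i = 0; i < node->sonsnbr; i++) {
        int  son   = node->fsonnum + i;
        long after = before + nodes[son].subcost;
        int  first = fcand, last = lcand; /* whole range if total == 0 */

        if (total > 0) {
            /* Round the start down and the end up (integer arithmetic). */
            first = fcand + (int)((ncand * before) / total);
            last  = fcand + (int)((ncand * after + total - 1) / total) - 1;
            if (first > lcand) { first = lcand; }
            if (last < first)  { last  = first; }
        }
        toy_propmap(nodes, son, first, last);
        before = after;
    }
}
```

For example, with four candidates (0 to 3) and a root whose two leaf children have subtree costs 30 and 10, the heavier leaf receives candidates 0 to 2 and the lighter leaf receives candidate 3.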
void splitSymbol(BlendCtrl *ctrl,
                 symbol_matrix_t *symbmtx)
Split the column blocks of the symbol matrix to generate parallelism.
This is the main function that cuts the column blocks of the symbol matrix and returns the new symbol matrix. The cost matrix, elimination tree, and candidate array are updated on exit to reflect the newly created column blocks and blocks. See splitSmart() for the cutting algorithm.
[in,out] | ctrl | The blend control structure. On entry, candtab must be initialized. On exit, costmtx, etree, and candtab are updated according to the extended symbol matrix, if new cblk are generated. |
[in,out] | symbmtx | On entry, the symbol matrix structure to split. On exit, the new symbol matrix with the new cblk and blocks. |
Definition at line 522 of file splitsymbol.c.
References extracblk_s::addcblk, blendctrl_s::candtab, candUpdate(), symbol_matrix_s::cblknbr, blendctrl_s::clustnum, costMatrixBuild(), costMatrixExit(), blendctrl_s::costmtx, blendctrl_s::debug, blendctrl_s::etree, eTreeBuild(), eTreeExit(), extraCblkExit(), extraCblkInit(), extraCblkMerge(), blendctrl_s::iparm, IPARM_FACTORIZATION, IPARM_FLOAT, IPARM_VERBOSE, pastixSymbolCheck(), pastixSymbolPrintStats(), PastixVerboseNo, splitSmart(), and blendctrl_s::up_after_split.
Referenced by pastix_subtask_blend().
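To give a feel for what splitting a column block means, the sketch below cuts a range of columns into nearly equal pieces no wider than a given limit. This is a plain even split chosen for illustration, not the splitSmart() algorithm, whose details are not described here. The function toy_split_cblk() and all of its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: cut the columns [fcolnum, lcolnum] into pieces of
 * at most maxwidth columns, as evenly as possible. The first and last
 * column of each piece are stored in fcols[] and lcols[].
 * Returns the number of pieces. */
int
toy_split_cblk(int fcolnum, int lcolnum, int maxwidth,
               int *fcols, int *lcols, int maxparts)
{
    int ncols, nparts, base, rem, col, i;

    assert(fcolnum <= lcolnum && maxwidth > 0);
    ncols  = lcolnum - fcolnum + 1;
    nparts = (ncols + maxwidth - 1) / maxwidth; /* ceil(ncols / maxwidth) */
    assert(nparts <= maxparts);

    base = ncols / nparts; /* minimal width of a piece                 */
    rem  = ncols % nparts; /* the first rem pieces get one more column */

    col = fcolnum;
    for (i = 0; i < nparts; i++) {
        int width = base + ((i < rem) ? 1 : 0);
        fcols[i] = col;
        lcols[i] = col + width - 1;
        col += width;
    }
    return nparts;
}
```

Splitting columns 0 to 9 with a maximum width of 4 yields three pieces: 0 to 3, 4 to 6, and 7 to 9. Each piece can then be handled as an independent task, which is where the extra parallelism comes from.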
void simuRun(SimuCtrl *simuctrl,
             const BlendCtrl *ctrl,
             const symbol_matrix_t *symbptr)
Run the simulation to map the data on the nodes.
This routine simulates the numerical factorization to generate the static scheduling and the final mapping of the column blocks onto the PaStiX processes.
[in,out] | simuctrl | The pointer to the simulation structure initialized by simuInit(). |
[in] | ctrl | The pointer to the blend control structure which contains the required data, such as the worker distribution among the processes, the candidates array for each column block, and the cost of the computations. |
[in] | symbptr | The block symbol structure of the problem. |
Definition at line 986 of file simu_run.c.
References symbol_matrix_s::bloknbr, simu_task_s::bloknum, symbol_cblk_s::bloknum, simuctrl_s::bloktab, blendctrl_s::candtab, simuctrl_s::cblknbr, symbol_matrix_s::cblknbr, simu_task_s::cblknum, simuctrl_s::cblktab, symbol_matrix_s::cblktab, simuctrl_s::clustab, cand_s::cluster, blendctrl_s::clustnbr, blendctrl_s::clustnum, simu_ftgt_s::clustnum, blendctrl_s::core2clust, blendctrl_s::costlevel, cand_s::costlevel, simu_cblk_s::ctrbcnt, blendctrl_s::dirname, blendctrl_s::dparm, DPARM_PRED_FACT_TIME, extendint_Add(), cand_s::fccandnum, FTGT_BLOKDST, FTGT_CTRBNBR, FTGT_PRIONUM, FTGT_PROCDST, FTGT_TASKDST, simu_task_s::ftgtcnt, simuctrl_s::ftgtcnt, simu_blok_s::ftgtnum, simuctrl_s::ftgtprio, simuctrl_s::ftgttab, simuctrl_s::ftgttimetab, simu_ftgt_s::infotab, blendctrl_s::iparm, IPARM_VERBOSE, cand_s::lccandnum, blendctrl_s::local_nbthrds, simu_cblk_s::owned, simu_blok_s::ownerclust, simuctrl_s::ownetab, pastix_int_t, pqueueSize(), simu_cluster_s::prionum, simu_task_s::prionum, simu_proc_s::procalias, simuctrl_s::proctab, simu_proc_s::readytask, blendctrl_s::ricar, simu_computeBlockCtrbNbr(), simu_computeTask(), simu_getNextTaskNextProc(), simu_printBlockCtrbNbr(), simu_pushToReadyHeap(), simu_putInAllReadyQueues(), simu_blok_s::tasknum, simu_proc_s::tasktab, simuctrl_s::tasktab, simu_task_s::time, timerSetMax(), timerVal(), blendctrl_s::total_nbcores, blendctrl_s::total_nbthrds, and cand_s::treelevel.
Referenced by pastix_subtask_blend().
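The principle of choosing a final owner among the candidates can be sketched with a greedy simulation: tasks are visited in priority order, and each one is given to the candidate that becomes free first. This is a deliberately simplified model that ignores task dependencies and communication costs, so it should not be read as what simuRun() does internally. The type toy_task_t and the function toy_simulate() are hypothetical.

```c
#include <assert.h>

/* Hypothetical toy task: a cost and a contiguous range of candidates. */
typedef struct {
    long cost;      /* predicted execution time of the task */
    int  fcandnum;  /* first candidate allowed to run it    */
    int  lcandnum;  /* last candidate allowed to run it     */
} toy_task_t;

/* Visit the tasks in the given (priority) order and assign each one to
 * the candidate with the smallest ready time. ready[] (size procnbr) is
 * used as workspace, owner[] (size tasknbr) receives the mapping.
 * Returns the simulated completion time of the last task. */
long
toy_simulate(const toy_task_t *tasks, int tasknbr,
             long *ready, int procnbr, int *owner)
{
    long makespan = 0;
    int  t, p;

    for (p = 0; p < procnbr; p++) {
        ready[p] = 0;
    }

    for (t = 0; t < tasknbr; t++) {
        int best = tasks[t].fcandnum;

        assert(0 <= tasks[t].fcandnum);
        assert(tasks[t].fcandnum <= tasks[t].lcandnum);
        assert(tasks[t].lcandnum < procnbr);

        /* Pick the earliest available candidate, lowest index on ties. */
        for (p = best + 1; p <= tasks[t].lcandnum; p++) {
            if (ready[p] < ready[best]) {
                best = p;
            }
        }
        owner[t]     = best;
        ready[best] += tasks[t].cost;
        if (ready[best] > makespan) {
            makespan = ready[best];
        }
    }
    return makespan;
}
```

The owner[] array plays the role of the final mapping, and the order in which tasks accumulate on each candidate gives a static schedule. A narrower candidate range leaves fewer choices in the inner loop, which is why the number of candidates per node affects both the cost of the simulation and the quality of the load balance.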
int solverMatrixGen(SolverMatrix *solvmtx,
                    const symbol_matrix_t *symbmtx,
                    const pastix_order_t *ordeptr,
                    const SimuCtrl *simuctrl,
                    const BlendCtrl *ctrl,
                    PASTIX_Comm comm,
                    isched_t *isched)
Initialize the solver matrix structure.
This function takes the results of the global preprocessing steps, namely the symbol matrix and the result of the simulation step, and generates the solver matrix that holds only the local information of each PaStiX process.
[in,out] | solvmtx | On entry, the allocated pointer to a solver matrix structure. On exit, this structure holds all the local information required to perform the numerical factorization. |
[in] | symbmtx | The global symbol matrix structure. |
[in] | ordeptr | The ordering structure. |
[in] | simuctrl | The information resulting from the simulation that will provide the data mapping, and the order of the task execution for the static scheduling. |
[in] | ctrl | The blend control structure that contains extra information computed during the analyze steps and the parameters of the analyze step. |
[in] | comm | TODO |
[in] | isched | TODO |
PASTIX_SUCCESS | if success. |
PASTIX_ERR_OUTOFMEMORY | if one of the memory allocations failed. |
Definition at line 88 of file solver_matrix_gen.c.
References symbol_matrix_s::baseval, solver_cblk_s::bcscnum, solver_matrix_s::bloknbr, symbol_matrix_s::bloknbr, solver_matrix_s::bloktab, solver_blok_s::browind, symbol_matrix_s::browmax, solver_cblk_s::brown2d, solver_matrix_s::brownbr, solver_cblk_s::brownum, solver_matrix_s::browtab, blendctrl_s::candtab, cblk_colnbr(), solver_matrix_s::cblkmax1d, solver_matrix_s::cblkmin2d, solver_matrix_s::cblknbr, symbol_matrix_s::cblknbr, solver_matrix_s::cblkschur, simuctrl_s::cblktab, solver_matrix_s::cblktab, symbol_matrix_s::cblktab, cand_s::cblktype, solver_cblk_s::cblktype, blendctrl_s::clustnbr, blendctrl_s::clustnum, solver_matrix_s::coefnbr, solver_matrix_s::fanincnt, solver_matrix_s::faninnbr, solver_cblk_s::fblokptr, solver_cblk_s::fcolnum, solver_matrix_s::gbloknbr, solver_matrix_s::gcbl2loc, solver_matrix_s::gcblknbr, solver_cblk_s::gfaninnum, solver_cblk_s::lcolidx, solver_cblk_s::lcolnum, blendctrl_s::local_nbctxts, blendctrl_s::local_nbthrds, solver_matrix_s::maxrecv, solver_matrix_s::nb2dblok, solver_matrix_s::nb2dcblk, solver_matrix_s::nodenbr, simu_cblk_s::owned, pastix_int_t, PASTIX_SUCCESS, solver_cblk_s::priority, solver_matrix_s::recvcnt, solver_matrix_s::recvnbr, solver_cblk_s::sndeidx, solverInit(), solvMatGen_cblkIs2D(), solvMatGen_fill_localnums(), solvMatGen_fill_tasktab(), solvMatGen_init_blok(), solvMatGen_init_cblk(), solvMatGen_max_buffers(), solvMatGen_register_local_cblk(), solvMatGen_register_remote_cblk(), solvMatGen_reorder_browtab(), solvMatGen_stats_last(), solvMatGen_supernode_index(), solver_cblk_s::stride, symbol_cblk_get_colnum(), simuctrl_s::tasknbr, and blendctrl_s::total_nbcores.
Referenced by pastix_subtask_blend().
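One ingredient of going from global to local information is a table that maps each global column block index to a local index on the current process. The sketch below builds such a table from an owner array. The names ownerclust, clustnum, gcblknbr and gcbl2loc echo fields listed in the references above, but the function toy_build_gcbl2loc() is hypothetical, and marking remote column blocks with -1 is an assumption of this example rather than the convention used by the solver matrix.

```c
#include <assert.h>

/* Hypothetical helper: number the column blocks owned by process
 * clustnum consecutively, and mark all the others as remote (-1).
 * Returns the number of local column blocks. */
int
toy_build_gcbl2loc(const int *ownerclust, int gcblknbr,
                   int clustnum, int *gcbl2loc)
{
    int g, cblknbr = 0;

    assert(gcblknbr >= 0);
    for (g = 0; g < gcblknbr; g++) {
        if (ownerclust[g] == clustnum) {
            gcbl2loc[g] = cblknbr; /* next free local index */
            cblknbr++;
        }
        else {
            gcbl2loc[g] = -1;      /* owned by another process */
        }
    }
    return cblknbr;
}
```

With owners {0, 1, 0, 1, 1} and clustnum equal to 1, the table becomes {-1, 0, -1, 1, 2} and three column blocks are local.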
int solverMatrixGenSeq(SolverMatrix *solvmtx,
                       const symbol_matrix_t *symbmtx,
                       const pastix_order_t *ordeptr,
                       const SimuCtrl *simuctrl,
                       const BlendCtrl *ctrl,
                       PASTIX_Comm comm,
                       isched_t *isched,
                       pastix_int_t is_dbg)
Initialize the solver matrix structure in sequential.
This function takes the results of the global preprocessing steps, namely the symbol matrix and the result of the simulation step, and generates the solver matrix for one PaStiX process.
[in,out] | solvmtx | On entry, the allocated pointer to a solver matrix structure. On exit, this structure holds all the local information required to perform the numerical factorization. |
[in] | symbmtx | The global symbol matrix structure. |
[in] | ordeptr | The ordering structure. |
[in] | simuctrl | The information resulting from the simulation that will provide the data mapping, and the order of the task execution for the static scheduling. |
[in] | ctrl | The blend control structure that contains extra information computed during the analyze steps and the parameters of the analyze step. |
[in] | comm | TODO |
[in] | isched | TODO |
[in] | is_dbg | TODO |
PASTIX_SUCCESS | if success. |
PASTIX_ERR_OUTOFMEMORY | if one of the memory allocations failed. |
Definition at line 488 of file solver_matrix_gen.c.
References symbol_matrix_s::baseval, solver_cblk_s::bcscnum, solver_matrix_s::bloknbr, symbol_matrix_s::bloknbr, symbol_cblk_s::bloknum, simuctrl_s::bloktab, solver_matrix_s::bloktab, solver_blok_s::browind, symbol_matrix_s::browmax, solver_cblk_s::brown2d, solver_matrix_s::brownbr, solver_cblk_s::brownum, symbol_cblk_s::brownum, solver_matrix_s::browtab, blendctrl_s::candtab, cblk_colnbr(), solver_matrix_s::cblkmax1d, solver_matrix_s::cblkmin2d, solver_matrix_s::cblknbr, symbol_matrix_s::cblknbr, solver_matrix_s::cblkschur, solver_matrix_s::cblktab, symbol_matrix_s::cblktab, cand_s::cblktype, solver_cblk_s::cblktype, blendctrl_s::clustnbr, blendctrl_s::clustnum, solver_matrix_s::coefnbr, solver_cblk_s::fblokptr, solver_cblk_s::fcolnum, solver_matrix_s::gcblknbr, solver_cblk_s::lcolidx, solver_cblk_s::lcolnum, blendctrl_s::local_nbctxts, blendctrl_s::local_nbthrds, solver_matrix_s::nb2dblok, solver_matrix_s::nb2dcblk, solver_matrix_s::nodenbr, simu_blok_s::ownerclust, pastix_int_t, PASTIX_SUCCESS, solverInit(), solvMatGen_cblkIs2D(), solvMatGen_fill_tasktab(), solvMatGen_init_blok(), solvMatGen_init_cblk(), solvMatGen_max_buffers(), solvMatGen_register_local_cblk(), solvMatGen_reorder_browtab(), solvMatGen_stats_last(), solvMatGen_supernode_index(), solver_cblk_s::stride, simuctrl_s::tasknbr, and blendctrl_s::total_nbcores.
Referenced by pastix_subtask_blend().