Fabulous installation procedure
Table of Contents
Go back to README.html.
The tutorial file can be obtain tutorial.cpp .
1. Support
1.1. Includes
Usual include
#include <fabulous.hpp> #include <utility>
To help us, we will be using the matrix market file reader and a random matrix generator. You can find them here https://gitlab.inria.fr/solverstack/fabulous/-/tree/master/src/common
#include "../common/MatrixMarketLoader.hpp" #include "../common/RandomMatrixLoader.hpp"
To simplify the syntax we will use the following aliases
namespace fa = ::fabulous; using S = std::complex<double>; using Block = fa::Block<S>;
1.2. Matrix class
To use the C++
interface of Fabulous
we need to define a matrix class.
1.2.1. Attributes
The matrix class will be simple, we will use a wrapper of the dense matrices of fabulous
class Matrix { private: Block _data;
1.2.2. Load from a file
To simplify, we add a function to load a matrix from a file
void load_matrix(const std::string &filename) { fa::MatrixMarketLoader mml; mml.LoadMatrix(_data, filename); }
1.2.3. Constructor
In our case we only have one constructor using a file.
public: using value_type = std::complex<double>; using primary_type = double; explicit Matrix(const std::string &filename) { load_matrix(filename); }
1.2.4. User interface
The size of your matrix (for display mostly).
int size() const { return _data.get_nb_row(); }
The matrix vector product (required by fabulous
), in our case it is the wrapper to the lapack
gemm
.
// B := \alpha * A * X + \beta * B int64_t operator()(const Block &X, Block &B, S alpha=S{1.0}, S beta=S{0.0}) const { int M = B.get_nb_row(); int N = B.get_nb_col(); int K = X.get_nb_row(); fa::lapacke::gemm( M, N, K, _data.get_ptr(), _data.get_leading_dim(), X.get_ptr(), X.get_leading_dim(), B.get_ptr(), B.get_leading_dim(), alpha, beta ); return fa::lapacke::flops::gemm<S>(M, N, K); }
1.2.5. Footer
}; // end class Matrix
2. Main
int main() {
2.1. Parameters of the solvers
2.1.1. Accuracy
Fist let's set the solver accuracy, for instance, how small do we want \(\frac{||B-AX||}{||B||}\) ?
To do so in fabulous
you need to specify an array with either, \(1\) element if the accuracy is the same for all right-hand-sides,
or a specific one for each right-hand-side.
In our let's take the same one for all of the right-hand-sides.
std::vector<double> epsilon{1e-2};
2.1.2. Fail-safe
Iterative methods in fabulous
(but also in general) may no converge and running indefinitely. In that regard we need a fail-safe.
Fabulous
counts the number of matrix vector product (mvp) and the number of iterations separately.
The mvp accounts for the number of right-hand-sides of \(X\) when you do a \(Y = AX\)
The iteration describe the total number of step the algorithm needed to do to converge.
If the number of right-hand-side is \(1\) then the mvp is basically the number of iterations.
The fail-safe over the mvp is a better indicator for multiple right-hand-side algorithm.
const int max_mvp = 200;
2.1.3. Krylov space method
Fabulous
implement Krylov space methods, those methods need a maximum space size for searching the solution.
This maximum basically represent the memory cost of the algorithm, it needs to be big enough to be able to find the solution.
After the Krylov space is at it's maximum size a restart
occur, that is we compute the partial solution and empty the Krylov space.
const int max_krylov_space = 50;
Some algorithm have a feature that is called deflated restart
(-DR
variant of algorithm).
The difference with a usual restart (also called cold restart) is that some spectral information of the Krylov space is keep.
This is translated by choosing the number of eigen vector you want to kept.
Note: with real matrices the eigen values may be complex, thus the number of eigen vector might be implicitly increase by \(1\)
const int nb_eigen_pair = 5;
Krylov space methods also require a way to orthogonalize a base of the Krylov space.
There are \(2\) types of orthogonality in fabulous
, BLOCK
for orthogonalizing all the right-hand-side at the same time
and RUHE
for orthogonalizing one right-hand-side at a time.
Also there are \(4\) schemes of orthogonality in fabulous
, Classical Gram-Schmidt (CGS
), Modified Gram-Schmidt (MGS
)
and their iterative extension (ICGS
, IMGS
).
Let's pick BLOCK
and CGS
.
auto ortho = fa::OrthoType::BLOCK + fa::OrthoScheme::CGS;
2.2. Setting the problem
First get a matrix, young1c
matrix from matrix market will do.
/* get the young1c matrix for this example at ftp://math.nist.gov/pub/MatrixMarket2/Harwell-Boeing/acoust/young1c.mtx.gz */ Matrix A{"young1c.mtx"};
Now let's setup the right-hand-side to be a random dense matrix and the initial guess to be zero.
const int dim = A.size(); const int nrhs = 6; Block X{dim, nrhs}; Block B{dim, nrhs}; fa::RandomMatrixLoader rnl; rnl.LoadMatrix(B);
2.2.1. Equation
Now that we have our matrices we want to solve the equation \(AX = B\).
To do so we need first to declare a placeholder, like
fa::noop_placeholder _;
This placeholder allows to declare an equation without needing to specify all the parameters.
auto eq = fa::equation(A, _, _, X, B, epsilon);
To be more precise, here are all the parameters
fa::equation(A, M, CB, X, B, epsilon)
M
: the preconditionner of the formAMy = b
withMy = x
.CB
: a callback function of prototypevoid callback(int, Logger&, Vector, Vector)
called after each iteration.
2.2.2. Parameter aggregation
All solver share the same parameters structure to pass all parameters at ones
auto param = fa::parameters(max_mvp, max_krylov_space); //param.set_quiet(true); // turn off the solvers' verbose
2.3. Choosing the algorithm
Now that we have the equation and the parameters it is time to call our solver.
2.3.1. BGMRES
At first let's choose the usual BGMRES
algorithm (Block-GMRES)
auto gmres = fa::bgmres::std(); auto log = fa::bgmres::solve(eq, gmres, param, ortho); //log.print("gmres"); // this prints comma-separated columns with information about convergence
Now let's change the orthogonalization method.
ortho = fa::OrthoType::RUHE + fa::OrthoScheme::MGS;
Let's do a new run to see
X.zero(); // Reset X log = fa::bgmres::solve(eq, gmres, param, ortho); //log.print("gmres_ruhe_mgs");
2.3.2. IB-BGMRES
The IB
prefix signify that, in the case where some right-hand-sides have already converged, they will be detected and treated as such.
X.zero(); // Reset X auto ib_gmres = fa::bgmres::ib(); log = fa::bgmres::solve(eq, ib_gmres, param, ortho); //log.print("ib_gmres");
2.3.3. IB-BGMRES-DR
As explain in Section Krylov space method, IB-BGMRES-DR
is the IB-BGMRES
with deflated restart
X.zero(); // Reset X auto restart = fa::deflated_restart(nb_eigen_pair, S{0.0}); auto ib_gmres_dr = fa::bgmres::ibdr(); log = fa::bgmres::solve(eq, ib_gmres_dr, param, ortho, restart); //log.print("ib_gmres_dr");
2.3.4. IB-BGCRO-DR
IB-BGCRO-DR
is an algorithm that goes faster after each solve by saving spectral information in a block vector, \(U_k\) in our case
X.zero(); // Reset X Block Uk{}; // delation space storage, /!\ start empty /!\ auto ib_gcro_dr = fa::bgcro::ibdr(); log = fa::bgcro::solve(eq, ib_gcro_dr, param, Uk, ortho, restart); //log.print("ib_gcro_dr_1"); X.zero(); // Reset X rnl.LoadMatrix(B); // new right-hand-side log = fa::bgcro::solve(eq, ib_gcro_dr, param, Uk, ortho, restart); //log.print("ib_gcro_dr_2");
3. Footer
return 0;
}
4. Compiling
Put the tutorial.cpp file in the src/examples
folder
g++ -o tutorial.o -c tutorial.cpp -I../../include -DHAVE_LAPACK_CONFIG_H -DLAPACK_COMPLEX_CPP # lapack flags for required complex numbers g++ -o tutorial tutorial.o -lopenblas cp ../../data/young1c.mtx . ./tutorial