This document describes Chameleon's API, its internal functions, and the organization of its source code.

For information about the Chameleon project, the installation guide, and usage examples, please refer to the user's guide.

Chameleon's user API is mostly composed of linear algebra routines of the form CHAMELEON_name[_Tile[_Async]], where name follows the LAPACK naming scheme and designates a routine available in the Chameleon library. These routines are described individually in the section Linear algebra routines.

In addition to the algorithmic routines, Chameleon provides a set of functions to control the overall process; see Auxiliary and control routines.

Linear algebra routines

LAPACK matrix layout : the simplest Chameleon interface, equivalent to CBLAS/LAPACKE
Tile matrix layout : interface to be used with Descriptors, the Chameleon-specific structure that handles the data as a set of tiles, see Descriptor and Tile
Tile matrix layout, asynchronous interface : same as the tile interface but without waiting for the termination of tasks, see Sequences
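As a small illustration of the CHAMELEON_name[_Tile[_Async]] pattern, the sketch below builds the three routine names for a given LAPACK name. The enum iface_t and the helper routine_name are hypothetical and are not part of Chameleon; they only spell out the naming rule described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical enum (not part of Chameleon): the three interface variants. */
typedef enum {
    IFACE_LAPACK,     /* CHAMELEON_name            : LAPACK matrix layout      */
    IFACE_TILE,       /* CHAMELEON_name_Tile       : tile matrix layout        */
    IFACE_TILE_ASYNC  /* CHAMELEON_name_Tile_Async : tile layout, asynchronous */
} iface_t;

/* Hypothetical helper: write the routine name for `lapack_name`
 * (for example "dpotrf") into `out`.
 * Returns the length of the name, or -1 if `out` is too small. */
int routine_name(char *out, size_t outlen, const char *lapack_name, iface_t iface)
{
    static const char *const suffix[] = { "", "_Tile", "_Tile_Async" };
    int n;

    assert(out != NULL && lapack_name != NULL);
    n = snprintf(out, outlen, "CHAMELEON_%s%s", lapack_name, suffix[iface]);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

With "dpotrf" as the LAPACK name, the three variants yield CHAMELEON_dpotrf, CHAMELEON_dpotrf_Tile and CHAMELEON_dpotrf_Tile_Async.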

Auxiliary and control routines

Control : to initialize/finalize Chameleon (these functions must be called in order to use any other routine), pause/resume task execution, etc
Options : to set/get some parameters such as the number of CPUs and GPUs to use
Descriptor : to handle the data structure used with the Tile interface
Tile : to convert (copy) LAPACK-layout data into Descriptor data and back
Sequences : to handle asynchronous task execution, to be used with the Tile_Async interface
Workspace : to handle the specific workspaces required by some of the algorithms
Auxiliary : to get some extra information such as the version, the data byte size, the MPI rank
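To make the difference between the LAPACK layout and a tile layout concrete, here is a minimal, self-contained sketch of the kind of copy the Tile entry refers to. It is not Chameleon's implementation: the function names, the column-major ordering of tiles, and the padding of edge tiles to full mb x nb blocks are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of tiles of size nb needed to cover n entries. */
size_t tile_count(size_t n, size_t nb)
{
    assert(nb > 0);
    return (n + nb - 1) / nb;
}

/* Offset of element (i, j) in the tile storage. Assumed layout: tiles are
 * stored one after another in column-major tile order, and each tile is a
 * full mb x nb column-major block (edge tiles are padded). */
size_t tile_offset(size_t i, size_t j, size_t m, size_t mb, size_t nb)
{
    size_t mt = tile_count(m, mb);  /* number of tile rows   */
    size_t p = i / mb;              /* tile row index        */
    size_t q = j / nb;              /* tile column index     */
    return (q * mt + p) * mb * nb + (j % nb) * mb + (i % mb);
}

/* Copy an m x n column-major matrix A (leading dimension lda) into tiles.
 * T must hold tile_count(m, mb) * tile_count(n, nb) * mb * nb doubles. */
void lapack_to_tiles(const double *A, size_t lda, size_t m, size_t n,
                     size_t mb, size_t nb, double *T)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            T[tile_offset(i, j, m, mb, nb)] = A[i + j * lda];
}

/* Inverse copy: gather the tiles back into a column-major matrix A. */
void tiles_to_lapack(const double *T, size_t m, size_t n,
                     size_t mb, size_t nb, double *A, size_t lda)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            A[i + j * lda] = T[tile_offset(i, j, m, mb, nb)];
}
```

Storing each tile contiguously is what lets a task operate on one tile as a unit. In Chameleon itself the tile data is managed through a Descriptor rather than a raw array as in this sketch.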

Libraries and source code organization

The Chameleon project consists of several C libraries, plus executables (examples and testing) whose compilation is optional.

The libraries are organized as follows :

chameleon : user's API and task based algorithms, depends on "chameleon_quark|openmp|parsec|starpu", "coreblas", "hqr"
chameleon_quark|openmp|parsec|starpu : interface to the different runtimes, depends on "coreblas" and optionally on "gpucublas" or "gpuhipblas" and on a runtime system library
coreblas and gpucublas or gpuhipblas : interfaces to the CPU and GPU kernels
hqr : HQR is a C library providing tools to generate hierarchical trees adapted to 2D block-cyclic data distribution and algorithms based on tiled QR algorithms

Let's have a look at how the source code is organized into directories.

cmake_modules : CMake scripts for setting variables. morse_cmake is used for system introspection.
compute : the task based algorithms as well as the different interfaces (LAPACK, Tile, Async) to call them
control : general Chameleon routines to control the library: initialization, finalization, options, descriptor and sequence handling, etc
coreblas : the Chameleon interface to CPU linear algebra kernels
distrib : some hints to install Chameleon's dependencies
doc : user and developer documentation
example : a couple of C files showing how to use Chameleon
gpucublas : the Chameleon interface to GPU linear algebra kernels (cuBLAS)
gpuhipblas : the Chameleon interface to GPU linear algebra kernels (hipBLAS)
hqr : HQR is a C library providing tools to generate hierarchical trees adapted to 2D block-cyclic data distribution and algorithms based on tiled QR algorithms
include : Chameleon's header files needed by users
lapack_api : the CBLAS/LAPACKE-like Chameleon interface
lib : material related to the distribution
plasma-conversion : scripts to convert PLASMA task based algorithms into Chameleon ones
runtime : interfaces to runtime systems
simucore : data needed to simulate Chameleon executions through StarPU+SimGrid
testing : source files for testing executables (timing and numerical checks)
tools : other scripts for testing (software development quality)